Journal Articles – HLT

  • Jingru Lin, Meng Ge, Wupeng Wang, Haizhou Li, Mengling Feng, "Selective HuBERT: Self-Supervised Pre-Training for Target Speaker in Clean and Mixture Speech", in IEEE Signal Processing Letters 2024, DOI: 10.1109/LSP.2024.3383794
  • Qu Yang*, Malu Zhang*, Jibin Wu, Kay Chen Tan, Haizhou Li, "LC-TTFS: Towards Lossless Network Conversion for Spiking Neural Networks with TTFS Coding", IEEE Transactions on Cognitive and Developmental Systems 2023, DOI: 10.1109/TCDS.2023.3334010
  • Siqi Cai, Hongxu Zhu, Tanja Schultz, Haizhou Li, "EEG-based Auditory Attention Detection in Cocktail Party Environment", in APSIPA Transactions on Signal and Information Processing 2023, Vol. 12: No. 3, e22. http://dx.doi.org/10.1561/116.00000128, October 2023.
  • Qinyi Wang, Xinyuan Zhou, Haizhou Li, "Speech-and-Text Transformer: Exploiting Unpaired Text for End-to-End Speech Recognition", APSIPA Transactions on Signal and Information Processing: Vol. 12: No. 1, e27. May 2023, http://dx.doi.org/10.1561/116.00000001
  • Xiaoxue Gao, Chitralekha Gupta, Haizhou Li, "PoLyScriber: Integrated Fine-Tuning of Extractor and Lyrics Transcriber for Polyphonic Music," in IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 31, pp. 1968-1981, May 2023, DOI: 10.1109/TASLP.2023.3275036.
  • Yi Zhou, Zhizheng Wu, Mingyang Zhang, Xiaohai Tian, Haizhou Li, "TTS-Guided Training for Accent Conversion Without Parallel Data", in IEEE Signal Processing Letters, vol. 30, pp. 533-537, April 2023, DOI: 10.1109/LSP.2023.3270079.
  • Yi Zhou, Zhizheng Wu, Xiaohai Tian, Haizhou Li, "Optimization of Cross-Lingual Voice Conversion With Linguistics Losses to Reduce Foreign Accents," in IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 31, pp. 1916-1926, April 2023, DOI: 10.1109/TASLP.2023.3271107.
  • Ruijie Tao, Kong Aik Lee, Rohan Kumar Das, Ville Hautamaki, Haizhou Li, "Self-Supervised Training of Speaker Encoder with Multi-Modal Diverse Positive Pairs", in IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 31, pp. 1706-1719, April 2023, DOI: 10.1109/TASLP.2023.3268568.
  • Chen Zhang, Luis Fernando D'Haro, Qiquan Zhang, Thomas Friedrichs, Haizhou Li, "PoE: A Panel of Experts for Generalized Automatic Dialogue Assessment," in IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 31, pp. 1234-1250, March 2023, DOI: 10.1109/TASLP.2023.3250825

  • Kun Zhou, Berrak Sisman, Rajib Rana, B.W. Schuller, Haizhou Li, “Emotion Intensity and its Control for Emotional Voice Conversion”, in IEEE Transactions on Affective Computing, vol. 14, no. 1, pp. 31-48, 1 Jan.-March 2023, DOI: 10.1109/TAFFC.2022.3175578
  • Kun Zhou, Berrak Sisman, Rajib Rana, Bjorn Schuller and Haizhou Li, “Speech Synthesis with Mixed Emotions”, IEEE Transactions on Affective Computing, 2023.
  • Qiquan Zhang, Xinyuan Qian, Zhaoheng Ni, Aaron Nicolson, Eliathamby Ambikairajah, Haizhou Li, “A Time-Frequency Attention Module for Neural Speech Enhancement”, in IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 31, pp. 462-475, 2023, DOI: 10.1109/TASLP.2022.3225649.
  • Xinyuan Qian, Zhengdong Wang, Jiadong Wang, Guohui Guan, Haizhou Li, “Audio-Visual Cross-Attention Network for Robotic Speaker Tracking”, in IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 31, pp. 550-562, 2023, DOI: 10.1109/TASLP.2022.3226330.
  • Zexu Pan, Meng Ge, Haizhou Li, “USEV: Universal Speaker Extraction with Visual Cue”, in IEEE/ACM Transactions on Audio, Speech and Language Processing, vol. 30, pp. 3032-3045, 2022, DOI: 10.1109/TASLP.2022.3205759.
  • Siqi Cai, Peiwen Li, Enze Su, Qi Liu, and Longhan Xie, “A Neural-Inspired Architecture for EEG-Based Auditory Attention Detection,” in IEEE Transactions on Human-Machine Systems, vol. 52, no. 4, pp. 668-676, Aug. 2022, DOI: 10.1109/THMS.2022.3176212
  • Xiaoxue Gao, Chitralekha Gupta, Haizhou Li, “Automatic Lyrics Transcription of Polyphonic Music with Lyrics-Chords Multi-Task Learning”, in IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 30, pp. 2280-2294, 2022, DOI: 10.1109/TASLP.2022.3190742.
  • Xianghu Yue, Jingru Lin, Fabian Ritter Gutierrez, Haizhou Li, “Self-Supervised Learning with Segmental Masking for Speech Representation”, in IEEE Journal of Selected Topics in Signal Processing, vol. 16, no. 6, pp. 1367-1379, Oct. 2022, DOI: 10.1109/JSTSP.2022.3191845.
  • Chitralekha Gupta, Haizhou Li, Masataka Goto, “Deep Learning Approaches in Topics of Singing Information Processing”, in IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 30, pp. 2422-2451, 2022, DOI: 10.1109/TASLP.2022.3190732.
  • Z. Pan, X. Qian and H. Li, “Speaker Extraction with Co-Speech Gestures Cue”, in IEEE Signal Processing Letters, Vol. 29, pp. 1467-1471, 2022, DOI: 10.1109/LSP.2022.3175130.
  • Anssi Kanervisto, Ville Hautamäki, Tomi Kinnunen, and Junichi Yamagishi, “Optimizing Tandem Speaker Verification and Anti-Spoofing Systems”, IEEE Transactions on Audio, Speech and Language Processing Vol. 30, pp. 477-488, January 2022. DOI: https://doi.org/10.1109/TASLP.2021.3138681
  • Kun Zhou, Berrak Sisman, Rui Liu, and Haizhou Li, “Emotional voice conversion: theory, databases, and ESD”, Speech Communication. Volume 137, February 2022, Pages 1-18. DOI: https://doi.org/10.1016/j.specom.2021.11.006
  • Tianchi Liu, Rohan Kumar Das, Kong Aik Lee, Haizhou Li, “Neural Acoustic-Phonetic Approach for Speaker Verification with Phonetic Attention Mask”, in IEEE Signal Processing Letters, vol. 29, pp. 782-786, 2022, DOI: 10.1109/LSP.2022.3143036.
  • Rui Liu, Berrak Sisman, Guanglai Gao, Haizhou Li, “Decoding Knowledge Transfer for Neural Text-to-Speech Training”, IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 30, pp. 1789-1802, 2022, DOI: 10.1109/TASLP.2022.3171974.
  • Z. Pan, R. Tao, C. Xu and H. Li, “Selective Listening by Synchronizing Speech With Lips,” in IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 30, pp. 1650-1664, 2022, DOI: 10.1109/TASLP.2022.3153258.
  • S. Cai, E. Su, L. Xie and H. Li, "EEG-Based Auditory Attention Detection via Frequency and Channel Neural Attention," in IEEE Transactions on Human-Machine Systems, vol. 52, no. 2, pp. 256-266, April 2022, DOI: 10.1109/THMS.2021.3125283.
  • Jibin Wu, Qi Liu, Malu Zhang, Zihan Pan, Haizhou Li, Kay Chen Tan, “HuRAI: A brain-inspired computational model for human-robot auditory interface”, Neurocomputing, 2021, 465, pp. 103-113.
  • Jibin Wu, Malu Zhang, Yansong Chua, Guoqi Li, Haizhou Li, “A Tandem Learning Rule for Effective Training and Rapid Inference of Deep Spiking Neural Networks”, IEEE Transactions On Neural Networks and Learning Systems, 2021, DOI: 10.1109/TNNLS.2021.3095724
  • Laxmi R Iyer, Yam Song (Yansong) Chua and Haizhou Li, “Is Neuromorphic MNIST neuromorphic? Analyzing the discriminative power of neuromorphic datasets in the time domain”, Frontiers in Neuroscience-Neuromorphic Engineering [Article In-Process].
  • Enze Su, Siqi Cai, Longhan Xie, Haizhou Li, and Tanja Schultz, “STAnet: A Spatiotemporal Attention Network for Decoding Auditory Spatial Attention from EEG”, IEEE Transactions on Biomedical Engineering [Article In-Process].
  • Jibin Wu, Chenglin Xu, Xiao Han, Daquan Zhou, Malu Zhang, Haizhou Li and Kay Chen Tan, “Progressive Tandem Learning for Pattern Recognition with Deep Spiking Neural Networks”, IEEE Transactions on Pattern Analysis and Machine Intelligence, DOI: 10.1109/TPAMI.2021.3114196 [Article In-Process].
  • Bidisha Sharma, Xiaoxue Gao, Karthika Vijayan, Xiaohai Tian, and Haizhou Li, “NHSS: A speech and singing parallel database”, Speech Communication, 133, July 2021, pp. 9-22.
  • Chenglin Xu, Wei Rao, Jibin Wu, and Haizhou Li, “Target Speaker Verification with Selective Auditory Attention for Single and Multi-talker Speech”, IEEE ACM Trans. Audio Speech Lang. Process. 29: 2696-2709 (2021).
  • Xinyuan Qian, Qi Liu, Jiadong Wang, and Haizhou Li, “Three-dimensional Speaker Localization: Audio-refined Visual Scaling Factor Estimation”, IEEE Signal Processing Letters, July 2021.
  • Chen Zhang, Grandee Lee, Luis Fernando D’Haro, and Haizhou Li, “D-score: Holistic Dialogue Evaluation without Reference”, IEEE/ACM Transactions on Audio, Speech and Language Processing, April 2021.
  • Rui Liu, Berrak Sisman, Guanglai Gao and Haizhou Li, “Expressive TTS Training with Frame and Style Reconstruction Loss”, IEEE/ACM Transactions on Audio, Speech, and Language Processing, April 2021, pp. 1-13.
  • Rui Liu, Berrak Sisman, Yixing Lin and Haizhou Li, “FastTalker: A Neural Text-to-Speech Architecture with Shallow and Group Autoregression”, Neural Networks, April 2021.
  • Mingyang Zhang, Yi Zhou, Li Zhao, and Haizhou Li, “Transfer learning from speech synthesis to voice conversion with non-parallel training data,” in IEEE/ACM Transactions on Audio, Speech, and Language Processing, 29, March 2021, pp. 1290-1302.
  • Jichen Yang, Hongji Wang, Rohan Kumar Das, and Yanmin Qian, “Modified Magnitude-phase Spectrum Information for Spoofing Detection”, IEEE/ACM Transactions on Audio, Speech and Language Processing, 29, February 2021, pp. 1065-1078.
  • Zhixuan Zhang and Qi Liu, “Spike-event-driven deep spiking neural network with temporal encoding”, IEEE Signal Processing Letters, 28, 2021, pp. 484-488.
  • Qi Liu and Jibin Wu, “Parameter tuning-free missing-feature reconstruction for robust sound recognition”, IEEE Journal of Selected Topics in Signal Processing, 15(1), January 2021, pp. 78-89.
  • Berrak Sisman, Junichi Yamagishi, Simon King, and Haizhou Li, “An Overview of Voice Conversion and its Challenges: From Statistical Modeling to Deep Learning”, IEEE/ACM Transactions on Audio, Speech, and Language Processing, 29, 2021, pp. 132-157.
  • Rui Liu, Berrak Sisman, Feilong Bao, Jichen Yang, Guanglai Gao and Haizhou Li, “Exploiting morphological and phonological features to improve prosodic phrasing for Mongolian speech synthesis”, IEEE/ACM Transactions on Audio, Speech, and Language Processing, 29, 2021, pp. 274-285.
  • Rui Liu, Berrak Sisman, Feilong Bao, Guanglai Gao and Haizhou Li, “Modeling Prosodic Phrasing with Multi-Task Learning in Tacotron-based TTS”, IEEE Signal Processing Letters, 27, 2020, pp. 1470-1474.
  • Yi Zhou, Xiaohai Tian and Haizhou Li, “Multi-Task WaveRNN with an Integrated Architecture for Cross-lingual Voice Conversion”, IEEE Signal Processing Letters, 27, 2020, pp. 1310-1314.
  • Changhuai You and Jichen Yang, “Device Feature Extraction Based on Parallel Neural network training for replay spoofing detection”, IEEE/ACM Transactions on Audio, Speech and Language Processing, 28, 2020, pp. 2308-2318.
  • Mingyang Zhang, Berrak Sisman, Li Zhao and Haizhou Li, “DeepConversion: Voice conversion with limited parallel training data”, Speech Communication, 122, 2020, pp. 31-43.
  • Chenglin Xu, Wei Rao, Eng Siong Chng and Haizhou Li, “SpEx: Multi-Scale Time Domain Speaker Extraction Network”, IEEE/ACM Transactions on Audio, Speech, and Language Processing, 28, 2020, pp. 1370-1384.
  • Malu Zhang, Xiaoling Luo, Jibin Wu, Yi Chen, Ammar Belatreche, Zihan Pan, Hong Qu, and Haizhou Li, “An Efficient Threshold-Driven Aggregate-Label Learning Algorithm for Multimodal Information Processing,” IEEE Journal of Selected Topics in Signal Processing, 14(3), March 2020, pp. 592-602.
  • Malu Zhang, Jibin Wu, Ammar Belatreche, Zihan Pan, Xiurui Xie, Yansong Chua, Guoqi Li, Hong Qu and Haizhou Li, “Supervised Learning in Spiking Neural Networks with Synaptic Delay-Weight Plasticity,” Neurocomputing, 409, October 2020, pp. 103-118.
  • Jibin Wu, Emre Yılmaz, Malu Zhang, Haizhou Li and Kay Chen Tan, “Deep Spiking Neural Networks for Large Vocabulary Automatic Speech Recognition,” Frontiers in Neuroscience, 14(199), March 2020, pp. 1-14.
  • Zihan Pan, Yansong Chua, Jibin Wu, Malu Zhang, Haizhou Li and Eliathamby Ambikairajah, “An Efficient and Perceptually Motivated Auditory Neural Encoding and Decoding Algorithm for Spiking Neural Networks”, Frontiers in Neuroscience, 13(1420), January 2020, pp. 1-17.
  • Jichen Yang and Rohan Kumar Das, “Improving anti-spoofing with octave spectrum and short-term spectral statistics information”, Applied Acoustics, 157(107017), 2020, pp. 1-10.
  • Jichen Yang, Rohan Kumar Das and Haizhou Li, “Significance of Subband Features for Synthetic Speech Detection”, IEEE Transactions on Information Forensics and Security, 15, 2020, pp. 2160-2170.
  • Chitralekha Gupta, Haizhou Li and Ye Wang, “Automatic Leaderboard: Evaluation of Singing Quality Without a Standard Reference,” IEEE/ACM Transactions on Audio, Speech, and Language Processing, 28, 2020, pp. 13-26.
  • Paul Yaozhu Chan, Minghui Dong and Haizhou Li, “The Science of Harmony: A Psychophysical Basis for Perceptual Tensions and Resolutions in Music,” Research, 2019 (2369041), September 2019, pp. 1-22.
  • Qiang Yu, Haizhou Li and Kay Chen Tan, “Spike Timing or Rate? Neurons Learn to Make Decisions for Both Through Threshold-Driven Plasticity”, IEEE Trans. Cybernetics, 49(6), June 2019, pp. 2178-2189.
  • Berrak Sisman, Mingyang Zhang and Haizhou Li, “Group Sparse Representation with WaveNet Vocoder Adaptation for Spectrum and Prosody Conversion”, IEEE/ACM Trans. Audio, Speech & Language Processing, 27(6), June 2019, pp. 1085-1097.
  • Jichen Yang* and Rohan Kumar Das*, “Low Frequency Frame-wise Normalization over Constant-Q Transform for Playback Speech Detection”, Digital Signal Processing, 89, June 2019, pp. 30-39. (*equal contribution)
  • Emre Yılmaz, Vikramjit Mitra, Ganesh Sivaraman and Horacio Franco, “Articulatory and Bottleneck Features for Speaker-Independent ASR of Dysarthric Speech,” Computer Speech & Language, 58, May 2019, pp. 319-334.
  • Karthika Vijayan, Haizhou Li and Tomoki Toda, “Speech-to-Singing Voice Conversion: The Challenges and Strategies for Improving Vocal Conversion Processes,” IEEE Signal Processing Magazine, 36(1), January 2019, pp. 95-102.
  • Luis Fernando D’Haro, Rafael E. Banchs, Chiori Hori and Haizhou Li, “Automatic evaluation of end-to-end dialog systems with adequacy-fluency metrics,” Computer Speech & Language, 55, March 2019, pp. 200-215.
  • Chong Zhang, Kay Chen Tan, Haizhou Li and Geok Soon Hong, “A Cost-Sensitive Deep Belief Network for Imbalanced Classification”, IEEE Transactions on Neural Networks and Learning Systems, 30(1), January 2019, pp. 109-122.
  • Maulik Madhavi and Hemant Patil, “Vocal Tract Length Normalization using a Gaussian mixture model framework for query-by-example spoken term detection,” Computer Speech & Language, 58, November 2019, pp. 175-202.
  • Rohan Kumar Das, Sarfaraz Jelil and S. R. M. Prasanna, “Exploring Text-constraint Models and Source Information for Long-enrollment with Short-test Speaker Verification”, in Circuits, Systems and Signal Processing, Springer, 38(4), April 2019, pp. 1175-1792.
  • Rohan Kumar Das and S. R. M. Prasanna, “Investigating Text-independent Speaker Verification Systems Under Varied Data Conditions”, in Circuits, Systems and Signal Processing, Springer, 38(8), August 2019, pp. 3778-3801.
  • Malu Zhang, Hong Qu, Ammar Belatreche, Yi Chen, and Zhang Yi, “A Highly Effective and Robust Membrane Potential-Driven Supervised Learning Method for Spiking Neurons,” IEEE Transactions on Neural Networks and Learning Systems, 30(1), January 2019, pp. 123-137.
  • Chitralekha Gupta, Haizhou Li, and Ye Wang, “A Technical Framework for Automatic Perceptual Evaluation of Singing Quality”, APSIPA Transactions on Signal and Information Processing, 7(E10), September 2018, pp. 1-11.
  • Van Tung Pham, Haihua Xu, Xiong Xiao, Nancy F. Chen, Eng Siong Chng and Haizhou Li, “Re-ranking spoken term detection with acoustic exemplars of keywords”, Speech Communication, 104, November 2018, pp. 12-23.
  • L. Xu, Kong-Aik Lee, Haizhou Li and Zhen Yang, “Generalizing I-Vector Estimation for Rapid Speaker Recognition”, IEEE/ACM Trans. Audio, Speech & Language Processing, 26(4), April 2018, pp. 749-759.
  • Saad Irtza, Vidhyasaharan Sethu, Eliathamby Ambikairajah and Haizhou Li, “Using language cluster models in hierarchical language identification”, Speech Communication, 100, June 2018, pp. 30-40.
  • Jibin Wu, Yansong Chua, Malu Zhang, Haizhou Li, and Kay Chen Tan, “A Spiking Neural Network Framework for Robust Sound Classification,” Frontiers in Neuroscience, 12(836), November 2018, pp. 1-17.
  • Kaavya Sriskandaraja, Vidhyasaharan Sethu, Eliathamby Ambikairajah and Haizhou Li, “Front-End for Antispoofing Countermeasures in Speaker Verification: Scattering Spectral Decomposition”, IEEE Journal of Selected Topics in Signal Processing, 11(4), June 2017, pp. 632-643.
  • Hongjie Chen, Cheung-Chi Leung, Lei Xie, Bin Ma and Haizhou Li, “Multitask Feature Learning for Low-Resource Query-by-Example Spoken Term Detection”, IEEE Journal of Selected Topics in Signal Processing, 11(8), December 2017, pp. 1329-1339.
  • Xiaohai Tian, Siu Wa Lee, Zhizheng Wu, Eng Siong Chng and Haizhou Li, “An Exemplar-Based Approach to Frequency Warping for Voice Conversion”, IEEE/ACM Trans. Audio, Speech & Language Processing, 25(10), October 2017, pp. 1863-1876.
  • Hongjie Chen, Lei Xie, Cheung-Chi Leung, Xiaoming Lu, Bin Ma and Haizhou Li, “Modeling Latent Topics and Temporal Distance for Story Segmentation of Broadcast News”, IEEE/ACM Trans. Audio, Speech & Language Processing, 25(1), January 2017, pp. 112-123.
  • Xiong Xiao, Shengkui Zhao, Duc Hoang Ha Nguyen, Xionghu Zhong, Douglas L. Jones, Eng Siong Chng and Haizhou Li, “Speech dereverberation for enhancement and recognition using dynamic features constrained deep neural networks and feature adaptation”, Eurasip Journal on Advances in Signal Processing, 1(4), December 2016, pp. 1-18.
  • Zhizheng Wu and Haizhou Li, “On the study of replay and voice conversion attacks to text-dependent speaker verification”, Multimedia Tools Applications, 75(9), May 2016, pp. 5311-5327.
  • Nancy F. Chen, Darren Wee, Rong Tong, Bin Ma and Haizhou Li, “Large-scale characterization of non-native Mandarin Chinese spoken by speakers of European origin: Analysis on iCALL”, Speech Communication, 84, November 2016, pp. 46-56.
  • Sven Ewan Shepstone, Kong-Aik Lee, Haizhou Li, Zheng-Hua Tan and Søren Holdt Jensen, “Total Variability Modeling Using Source-Specific Priors”, IEEE/ACM Trans. Audio, Speech & Language Processing, 24(3), March 2016, pp. 504-517.
  • Duc Hoang Ha Nguyen, Xiong Xiao, Eng Siong Chng and Haizhou Li, “Feature Adaptation Using Linear Spectro-Temporal Transform for Robust Speech Recognition”, IEEE/ACM Trans. Audio, Speech & Language Processing, 24(6), June 2016, pp. 1006-1019.
  • Qiang Yu, Rui Yan, Huajin Tang, Kay Chen Tan and Haizhou Li, “A Spiking Neural Network System for Robust Sequence Recognition”, IEEE Trans. Neural Networks and Learning Systems, 27(3), March 2016, pp. 621-635.
  • Liping Chen, Kong-Aik Lee, Bin Ma, Wu Guo, Haizhou Li and Li-Rong Dai, “Exploration of Local Variability in Text-Independent Speaker Verification”, Journal of Signal Processing Systems, 82(2), February 2016, pp. 217-228.
  • Jun Hu, Huajin Tang, Kay Chen Tan and Haizhou Li, “How the Brain Formulates Memory: A Spatio-Temporal Model Research Frontier”, IEEE Computational Intelligence Magazine, 11(2), May 2016, pp. 56-68.
  • Jonathan Dennis, Huy Dat Tran and Haizhou Li, “Generalized Hough Transform for Speech Pattern Classification”, IEEE/ACM Transactions on Audio, Speech and Language Processing, 23(11), November 2015, pp. 1963-1972.
  • Chang Huai You, Haizhou Li, and Kong-Aik Lee, “Relevance factor of maximum a posteriori adaptation for GMM-NAP-SVM in speaker and language recognition”, Journal of Computer Speech and Language, 30(1), March 2015, pp. 116-134.
  • Dau-Cheng Lyu, Tien Ping Tan, Eng Chng and Haizhou Li, “Mandarin-English code-switching speech corpus in South-East Asia”, Language Resources and Evaluation, 49(3), September 2015, pp. 581-600.
  • Haipeng Wang, Tan Lee, Cheung-Chi Leung, Bin Ma, and Haizhou Li, “Acoustic Segment Modeling with Spectral Clustering Methods”, IEEE/ACM Transactions on Audio, Speech and Language Processing, 23(2), February 2015, pp. 264-277.
  • Van Hai Do, Xiong Xiao, Eng Siong Chng, and Haizhou Li, “Context-dependent Phone Mapping for Acoustic Modeling of Under-resourced Languages”, International Journal of Asian Language Processing, 23(1), 2015, pp. 21-33.
  • Haizhou Li, Marcello Federico, Xiaodong He, Helen M. Meng, and Isabel Trancoso, “Introduction to the Special Section on Continuous Space and Related Methods in Natural Language Processing”, IEEE/ACM Transactions on Audio, Speech and Language Processing, 23(3), March 2015, pp. 427-430.
  • Tze Yuang Chong, Rafael E. Banchs, Eng Chng and Haizhou Li, “Decoupling Word-Pair Distance and Co-occurrence Information for Effective Long History Context Language Modeling”, IEEE/ACM Transactions on Audio, Speech and Language Processing, 23(7), July 2015, pp. 1221-1232.
  • Rafael E. Banchs, Luis F. D’Haro, and Haizhou Li, “Adequacy-Fluency Metrics: Evaluating MT in the Continuous Space Model Framework”, IEEE/ACM Transactions on Audio, Speech and Language Processing, 23(3), March 2015, pp. 472-482.
  • Zhizheng Wu, Nicholas Evans, Tomi Kinnunen, Junichi Yamagishi, Federico Alegre, and Haizhou Li, “Spoofing and countermeasures for speaker verification: a survey”, Speech Communication, 66(c), February 2015, pp. 130-153.
  • Haizhou Li, “Inaugural editorial: Embracing Opportunities for Growth”, IEEE/ACM Transactions on Audio, Speech and Language Processing, 23(1), January 2015, pp. 5-6.
  • Zhizheng Wu, Eng Siong Chng, and Haizhou Li, “Exemplar-based voice conversion using joint nonnegative matrix factorization”, Multimedia Tools and Applications, Springer, 74(22), November 2015, pp. 9943-9958.
  • Yuma Ueda, Longbiao Wang, Atsuhiko Kai, Xiong Xiao, Engsiong Chng and Haizhou Li, “Single-channel Dereverberation for Distant-Talking Speech Recognition by Combining Denoising Autoencoder and Temporal Structure Normalization”, The 9th International Symposium on Chinese Spoken Language Processing, Singapore, October 2014, pp. 379-383.
  • Van Hai Do, Xiong Xiao, Eng Siong Chng, and Haizhou Li, “Cross-lingual phone mapping for large vocabulary speech recognition of under-resourced languages”, IEICE Transactions on Information and Systems, 97-D(2), February 2014, pp. 285-295.
  • Miaolong Yuan, Huajin Tang, and Haizhou Li, “Real-Time Keypoint Recognition Using Restricted Boltzmann Machine,” IEEE Transactions on Neural Networks and Learning Systems, 25(11), November 2014, pp. 2119-2126.
  • Zhizheng Wu and Haizhou Li, “Voice conversion versus speaker verification: an overview”, APSIPA Transactions on Signal and Information Processing, 3(e17), December 2014, pp. 1-16.
  • Zhizheng Wu, Tuomas Virtanen, Eng Siong Chng, and Haizhou Li, “Exemplar-based sparse representation with residual compensation for voice conversion”, IEEE/ACM Transactions on Audio, Speech and Language Processing, 22(10), October 2014, pp. 1506-1521.
  • Anthony Larcher, Kong Aik Lee, Bin Ma, and Haizhou Li, “Text-dependent speaker verification: Classifiers, databases and RSR2015”, Speech Communication, 60, May 2014, pp. 56-77.
  • S. J. Wright, D. Kanevsky, Li Deng, Xiaodong He, G. Heigold, and Haizhou Li, “Optimization Algorithm and Applications for Speech and Language Processing”, IEEE Transactions on Audio, Speech and Language Processing, 21(11), November 2013, pp. 2231-2243.
  • Raymond W. M. Ng, Tan Lee, Cheung-Chi Leung, Bin Ma, and Haizhou Li, “Spoken Language Recognition With Prosodic Features”, IEEE Transactions on Audio, Speech and Language Processing, 21(9), September 2013, pp. 1841-1853.
  • Ville Hautamäki, Tomi Kinnunen, Filip Sedlak, Kong Aik Lee, Bin Ma, and Haizhou Li, “Sparse Classifier Fusion for Speaker Verification”, IEEE Transactions on Audio, Speech and Language Processing, 21(8), August 2013, pp. 1622-1631.
  • Qiang Yu, Huajin Tang, Kay Chen Tan, and Haizhou Li, “Precise-Spike-Driven Synaptic Plasticity: Learning Hetero-Association of Spatiotemporal Spike Patterns”, PLoS ONE, 8(11), November 2013, pp. 1-16.
  • Haizhou Li, Kong Aik Lee, and Bin Ma, “Spoken Language Recognition: From Fundamentals to Practice”, Proceedings of the IEEE, 101(5), May 2013, pp. 1136-1159.
  • Douglas D. O’Shaughnessy, Li Deng, and Haizhou Li, “Speech Information Processing: Theory and Applications”, Proceedings of the IEEE, 101(5), May 2013, pp. 1034-1037.
  • Jiali Yu, Huajin Tang, and Haizhou Li, “Dynamics Analysis of a Population Decoding Model”, IEEE Transactions on Neural Networks and Learning Systems, 24(3), March 2013, pp. 498-503.
  • Qiang Yu, Huajin Tang, Kay Chen Tan, and Haizhou Li, “Rapid Feedforward Computation by Temporal Encoding and Learning With Spiking Neurons”, IEEE Transactions on Neural Networks and Learning Systems, 24(10), October 2013, pp. 1539-1552.
  • Haipeng Wang, Cheung-Chi Leung, Tan Lee, Bin Ma, and Haizhou Li, “Shifted-Delta MLP Features for Spoken Language Recognition”, IEEE Signal Processing Letters, 20(1), January 2013, pp. 15-18.
  • Andreea Niculescu, Betsy van Dijk, Anton Nijholt, Haizhou Li, and See Swee Lan, “Making Social Robots More Attractive: The Effects of Voice Pitch, Humor and Empathy”, International Journal of Social Robotics, 5(2), April 2013, pp. 171-191.
  • Jiali Yu, Huajin Tang, and Haizhou Li, “Continuous attractors of discrete-time recurrent neural networks”, Neural Computing and Applications, 23(1), July 2013, pp. 89-96.
  • Jiali Yu, Huajin Tang, Haizhou Li, and Luping Shi, “Dynamical properties of continuous attractor neural network with background tuning”, Neurocomputing, 99(1), January 2013, pp. 439-447.
  • Jun Hu, Huajin Tang, Kay Chen Tan, Haizhou Li, and Luping Shi, “A Spike-Timing-Based Integrated Model for Pattern Recognition”, Neural Computation, 25(2), February 2013, pp. 450-472.
  • Sakriani Sakti, Michael Paul, Andrew Finch, Shinsuke Sakai, Thang Tat Vu, Noriyuki Kimura, Chiori Hori, Eiichiro Sumita, Satoshi Nakamura, Jun Park, Chai Wutiwiwatchai, Bo Xu, Hammam Riza, Karunesh Arora, Chi Mai Luong, and Haizhou Li, “A-STAR: Toward Translating Asian Spoken Languages”, Computer Speech and Language, 27(2), February 2013, pp. 509-527.
  • Zhizheng Wu, Tomi Kinnunen, Eng Siong Chng, and Haizhou Li, “Mixture of factor analyzers using priors from non-parallel speech for voice conversion”, IEEE Signal Processing Letters, 19(12), December 2012, pp. 914-917.
  • Omid Dehzangi, Bin Ma, Eng-Siong Chng, and Haizhou Li, “Discriminative Feature Extraction for Speech Recognition Using Continuous Output Codes”, Pattern Recognition Letters, 33(13), October 2012, pp. 1703-1709.
  • Liyuan Li, Shuicheng Yan, Xinguo Yu, Yeow Kee Tan, and Haizhou Li, “Robust Multiperson Detection and Tracking for Mobile Service and Social Robots”, IEEE Transactions on Systems, Man, and Cybernetics – part B: Cybernetics, 42(5), October 2012, pp. 1398-1412.
  • Tomi Kinnunen, Rahim Saeidi, Filip Sedlak, Kong Aik Lee, Johan Sandberg, Maria Hansson-Sandsten, and Haizhou Li, “Low-Variance Multitaper MFCC Features: a Case Study in Robust Speaker Verification”, IEEE Transactions on Audio, Speech and Language Processing, 20(7), September 2012, pp. 1990-2001.
  • Wenliang Chen, Jun’ichi Kazama, Min Zhang, Yoshimasa Tsuruoka, Yujie Zhang, Yiou Wang, Kentaro Torisawa, and Haizhou Li, “Bitext dependency parsing with auto-generated bilingual treebank”, IEEE Transactions on Audio, Speech and Language Processing, 20(5), July 2012, pp. 1461-1472.
  • Xiaoxuan Wang, Lei Xie, Mimi Lu, Bin Ma, Engsiong Chng, and Haizhou Li, “Broadcast news story segmentation using conditional random fields and multimodal features”, IEICE Transactions on Information and Systems, E95-D(5), May 2012, pp. 1206-1215.
  • Yi Ren Leng, Tran Huy Dat, Norihide Kitaoka, and Haizhou Li, “Selective gammatone envelope feature for robust sound event recognition”, IEICE Transactions, 95-D(5), May 2012, pp. 1229-1237.
  • Rui Yan, Keng Peng Tee, Yuanwei Chua, Haizhou Li, and Huajin Tang, “Gesture Recognition Based on Localist Attractor Networks with Application to Robot Control”, IEEE Computational Intelligence Magazine, 7(1), February 2012, pp. 64-74.
  • Keng Peng Tee, Rui Yan, Yuanwei Chua, Zhiyong Huang, and Haizhou Li, “Modular IK: a Robust Inverse Kinematic Algorithm for Gesture Imitation in an Upper-Body Humanoid Robot”, International Journal of Humanoid Robotics, 9(2), June 2012.
  • Jin-Shea Kuo and Haizhou Li, “Learning regional transliteration variants”, Information Processing and Management, 48(1), January 2012, pp. 154-169.
  • Tin Lay Nwe, Hanwu Sun, Bin Ma, and Haizhou Li, “Speaker Clustering and Cluster Purification Methods for RT07 and RT09 Evaluation Meeting Data”, IEEE Transactions on Audio, Speech and Language Processing, 20(2), February 2012, pp. 461-473.
  • Haizhou Li, “FOREWORD – Special Section on Recent Advances in Multimedia Signal Processing Techniques and Applications”, IEICE Transactions on Information and Systems, 95-D(5), May 2012, pp. 1181-1181.
  • Haizhou Li, John-John Cabibihan, and Yeow Kee Tan, “Towards an Effective Design of Social Robots”, International Journal of Social Robotics, vol. 3, no. 4, November 2011, pp. 333-335.
  • Huajin Tang and Haizhou Li, “Book Review: Information Theoretic Learning: Renyi’s Entropy and Kernel Perspectives”, IEEE Computational Intelligence Magazine, vol. 6, no. 3, August 2011, pp. 60-62.
  • Eliathamby Ambikairajah, Haizhou Li, Liang Wang, Bo Yin, and Vidhyasaharan Sethu, “Language Identification: A Tutorial”, IEEE Circuits and Systems Magazine, vol. 11, no. 2, June 2011, pp. 82-108.
  • Huajin Tang, Haizhou Li, and Zhang Yi, “Online learning and stimulus-driven responses of neurons in visual cortex”, Cognitive Neurodynamics, vol. 5, no. 1, March 2011, pp. 77-85.
  • Omid Dehzangi, Bin Ma, Eng-Siong Chng, and Haizhou Li, “Error Corrective Fusion of Classifier Scores for Spoken Language”, IEICE Transactions on Information and Systems, vol. E94-D, no. 12, December 2011, pp. 1994-1997.
  • Deyi Xiong, Min Zhang, and Haizhou Li, “A Maximum Entropy Segmentation Model for Statistical Machine Translation”, IEEE Transactions on Audio, Speech and Language Processing, vol. 19, no. 8, November 2011, pp. 2494-2505.
  • Huy Dat Tran and Haizhou Li, “Sound Event Recognition with Probabilistic Distance SVMs”, IEEE Transactions on Audio, Speech and Language Processing, vol. 19, no. 6, August 2011, pp. 1556-1568.
  • Jonathan Dennis, Huy Dat Tran, and Haizhou Li, “Spectrogram Image Feature for Sound Event Classification in Mismatched Conditions”, IEEE Signal Processing Letters, vol. 18, no. 2, February 2011, pp. 130-133.
  • Kong Aik Lee, Chang Huai You, Haizhou Li, Tomi Kinnunen, and Khe Chai Sim, “Using Discrete Probabilities with Bhattacharyya Measure for SVM-based Speaker Verification”, IEEE Transactions on Audio, Speech and Language Processing, vol. 19, no. 4, May 2011, pp. 861-870.
  • Donglai Zhu, Bin Ma, and Haizhou Li, “Speaker Verification with Feature-Space MAPLR Parameters”, IEEE Transactions on Audio, Speech and Language Processing, vol. 19, no. 3, March 2011, pp. 505-515.
  • Namunu C. Maddage and Haizhou Li, “Beat Space Segmentation and Octave Scale Cepstral Feature for Sung Language Recognition in Pop Music”, ACM Transactions on Multimedia Computing, Communications and Applications (TOMCCAP), vol. 7, no. 4, Article 37, November 2011, pp. 1-20.
  • Haizhou Li and Bin Ma, “TechWare: Speaker and Spoken Language Recognition Resources”, IEEE Signal Processing Magazine, vol. 27, no. 6, November 2010, pp. 139-142.
  • Deyi Xiong, Min Zhang, Aiti Aw, and Haizhou Li, “Linguistically Annotated Reordering Evaluation and Analysis”, Computational Linguistics, vol. 36, no. 3, September 2010, pp. 535-568.
  • Huajin Tang, Haizhou Li, and Zhang Yi, “A Discrete-Time Neural Network for Optimization Problems with Hybrid Constraints”, IEEE Transactions on Neural Networks, vol. 21, no. 7, July 2010, pp. 1184-1189.
  • Lei Wang, Eng Siong Chng, and Haizhou Li, “A Tree-Construction Search Approach for Multivariate Time Series Motifs Discovery”, Pattern Recognition Letters, vol. 31, no. 9, July 2010, pp. 869-875.
  • Huajin Tang, Haizhou Li, and Rui Yan, “Memory Dynamics in Attractor Networks with Saliency Weights”, Neural Computation, vol. 22, no. 7, July 2010, pp. 1899-1926.
  • Chang Huai You, Kong Aik Lee, and Haizhou Li, “GMM-SVM Kernel with a Bhattacharyya-Based Distance for Speaker Recognition”, IEEE Transactions on Audio, Speech and Language Processing, vol. 18, no. 6, August 2010, pp. 1300-1312.
  • Tomi Kinnunen and Haizhou Li, “An Overview of Text-Independent Speaker Recognition: from Features to Supervectors”, Speech Communication, vol. 52, no. 1, January 2010, pp. 12-40. (Speech Communication Most Cited Article since 2007)
  • Xiong Xiao, Jinyu Li, Eng Siong Chng, Haizhou Li, and Chin-Hui Lee, “A Study on the Generalization Capability of Acoustic Models for Robust Speech Recognition”, IEEE Transactions on Audio, Speech and Language Processing, vol. 18, no. 6, August 2010, pp. 1158-1169.
  • Namunu C. Maddage, Khe Chai Sim, and Haizhou Li, “Word Level Automatic Alignment of Music and Lyrics using Vocal Synthesis”, ACM Transactions on Multimedia Computing, Communications, and Applications (TOMCCAP), vol. 6, no. 3, Article 19, August 2010, pp. 1-16.
  • Tee Kiah Chia, Khe Chai Sim, Haizhou Li, and Hwee Tou Ng, “Statistical Lattice-Based Spoken Document Retrieval”, ACM Transactions on Information Systems, vol. 28, no. 1, Article 2, January 2010, pp. 1-30.
  • Huy Dat Tran and Haizhou Li, “Jump Function Kolmogorov for Audio Classification in Noise-mismatch Conditions”, IEEE Transactions on Signal Processing, vol. 57, no. 8, August 2009, pp. 2908-2918.
  • Rong Tong, Bin Ma, Haizhou Li, and Eng Siong Chng, “A Target-Oriented Phonotactic Front-end for Spoken Language Recognition”, IEEE Transactions on Audio, Speech and Language Processing, vol. 17, no. 7, September 2009, pp. 1335-1347.
  • Chang Huai You, Kong Aik Lee, and Haizhou Li, “An SVM Kernel with GMM-Supervector Based on the Bhattacharyya Distance for Speaker Recognition”, IEEE Signal Processing Letters, vol. 16, no. 1, January 2009, pp. 49-52.
  • Donglai Zhu, Haizhou Li, Bin Ma, and Chin-Hui Lee, “Optimizing the Performance of Spoken Language Recognition with Discriminative Training”, IEEE Transactions on Audio, Speech and Language Processing, vol. 16, no. 8, November 2008, pp. 1642-165.
  • Xiong Xiao, Eng Siong Chng, and Haizhou Li, “Normalization of the Speech Modulation Spectra for Robust Speech Recognition”, IEEE Transactions on Audio, Speech and Language Processing, vol. 16, no. 8, November 2008, pp. 1662-1674.
  • Haizhou Li, Jin-Shea Kuo, Jian Su, and Chih-Lung Lin, “Mining Live Transliterations using Incremental Learning Algorithms”, International Journal of Computer Processing of Languages, vol. 21, no. 2, June 2008, pp. 183-203.
  • Khe Chai Sim and Haizhou Li, “On Acoustic Diversification Front-end for Spoken Language Identification”, IEEE Transactions on Audio, Speech and Language Processing, vol. 16, no. 5, July 2008, pp. 1029-1037.
  • Jin-Shea Kuo, Haizhou Li, and Ying-Kuei Yang, “Active Learning for Constructing Transliteration Lexicons from the Web”, Journal of the American Society for Information Science and Technology, vol. 59, no. 1, January 2008, pp. 126-135.
  • Bin Ma, Haizhou Li, and Rong Tong, “Spoken Language Recognition with Ensemble Classifiers”, IEEE Transactions on Audio, Speech and Language Processing, vol. 15, no. 7, September 2007, pp. 2053-2062.
  • Xiong Xiao, Eng Siong Chng, and Haizhou Li, “Temporal Structure Normalization of Speech Feature for Robust Speech Recognition”, IEEE Signal Processing Letters, vol. 14, no. 7, July 2007, pp. 500-503.
  • Jin-Shea Kuo, Haizhou Li, and Ying-Kuei Yang, “A Phonetic Similarity Model for Automatic Extraction of Transliteration Pairs”, ACM Transactions on Asian Language Information Processing, vol. 6, no. 2, Article 6, September 2007, pp. 1-24.
  • Tin Lay Nwe and Haizhou Li, “Exploring Vibrato-Motivated Acoustic Features for Singer Identification”, IEEE Transactions on Audio, Speech and Language Processing, vol. 15, no. 2, February 2007, pp. 519-530.
  • Haizhou Li, Bin Ma, and Chin-Hui Lee, “A Vector Space Modeling Approach to Spoken Language Identification”, IEEE Transactions on Audio, Speech and Language Processing, vol. 15, no. 1, January 2007, pp. 271-284.
  • Minghui Dong, Kim-Teng Lua, and Haizhou Li, “A Unit Selection-based Speech Synthesis Approach for Mandarin Chinese”, Journal of Chinese Language and Computing, vol. 16, no. 1, March 2006, pp. 1-10.
  • Bin Ma and Haizhou Li, “A Comparative Study of Four Language Identification Systems”, Computational Linguistics and Chinese Language Processing, vol. 11, no. 2, June 2006, pp. 159-182.
  • Jian Su, K. T. Ng, Haizhou Li, and Jean-Paul Haton, “Nonparametric Distance Measures of Speaker Verification”, IET Electronics Letters, vol. 31, no. 9, April 1995, pp. 700-701.
  • Haizhou Li, Jian Su, Jean-Paul Haton, “Short-Timed Speech Dynamics for Speaker Recognition”, IET Electronics Letters, vol. 31, no. 17, August 1995, pp. 1416-1418.