Conference Papers – HLT

 

NeurIPS

  • Tianchi Liu, Kong Aik Lee, Qiongqiong Wang, Haizhou Li, “Disentangling Voice and Content with Self-Supervision for Speaker Recognition”, Thirty-seventh Conference on Neural Information Processing Systems (NeurIPS 2023), December 10, 2023 – December 16, 2023, New Orleans, Louisiana, U.S.A.

 

Engineering in Medicine and Biology Society (EMBC)

  • Siqi Cai, Jia Li, Hongmeng Yang, and Haizhou Li, "RGCnet: An Efficient Recursive Gated Convolutional Network for EEG-based Auditory Attention Detection", in 2023 45th Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC), Sydney, Australia, July 24 to 27, 2023.

INTERSPEECH

  • Yidi Jiang, Ruijie Tao, Zexu Pan, Haizhou Li, "Target Active Speaker Detection with Audio-visual Cues", in Proc. Interspeech 2023, Convention Centre Dublin, Ireland, August 20 to 24, 2023.
  • Jingru Lin, Xianghu Yue, Junyi Ao, Haizhou Li, "Self-Supervised Acoustic Word Embedding Learning via Correspondence Transformer Encoder", in Proc. Interspeech 2023, Convention Centre Dublin, Ireland, August 20 to 24, 2023.
  • Ke Zhang, Marvin Borsdorf, Zexu Pan, Haizhou Li, Yangjie Wei, Yi Wang, "Speaker Extraction with Detection of Presence and Absence of Target Speakers", in Proc. Interspeech 2023, Convention Centre Dublin, Ireland, August 20 to 24, 2023.
  • Ruicong Wang, Siqi Cai and Haizhou Li, "EEG-based Auditory Attention Detection with Spatiotemporal Graph and Graph Convolutional Network", in Proc. Interspeech 2023, Convention Centre Dublin, Ireland, August 20 to 24, 2023.
  • Rui Liu, Haolin Zuo, De Hu, Guanglai Gao, Haizhou Li, "Explicit Intensity Control for Accented Text-to-speech", in Proc. Interspeech 2023, Convention Centre Dublin, Ireland, August 20 to 24, 2023.
  • Rui Liu, Jinhua Zhang, Guanglai Gao, Haizhou Li, "Betray Oneself: A Novel Audio DeepFake Detection Model via Mono-to-Stereo Conversion", in Proc. Interspeech 2023, Convention Centre Dublin, Ireland, August 20 to 24, 2023.
  • Lu Junchen, Berrak Sisman, Mingyang Zhang, Haizhou Li, "High-Quality Automatic Voice Over with Accurate Alignment: Supervision through Self-Supervised Discrete Speech Units", in Proc. Interspeech 2023, Convention Centre Dublin, Ireland, August 20 to 24, 2023.

IJCAI

  • Shuang Lian, Jiangrong Shen, Qianhui Liu, Ziming Wang, Rui Yan, Huajin Tang, "Learnable Surrogate Gradient for Direct Training Spiking Neural Networks", International Joint Conference on Artificial Intelligence (IJCAI) in Macau, August 19 - 25, 2023.

ACL 

  • Yiming Chen, Simin Chen, Zexin Li, Wei Yang, Cong Liu, Robby T. Tan, Haizhou Li, "Dynamic Transformers Provide a False Sense of Efficiency", Annual Meeting of the Association for Computational Linguistics (ACL’23) in Toronto, Canada, July 9 to 14, 2023.

CVPR

  • Jiadong Wang, Xinyuan Qian, Malu Zhang, Robby T. Tan, Haizhou Li, "Seeing What You Said: Talking Face Generation Guided by a Lip Reading Expert", Computer Vision and Pattern Recognition Conference (CVPR) in Vancouver, Canada, June 18 to 22, 2023.
  • Jiawei Du*, Yidi Jiang*, Vincent TF Tan, Joey Tianyi Zhou, Haizhou Li (*equal contribution), "Minimizing the Accumulated Trajectory Error to Improve Dataset Distillation", Computer Vision and Pattern Recognition Conference (CVPR) in Vancouver, Canada, June 18 to 22, 2023.

ICASSP

  • Marvin Borsdorf, Saurav Pahuja, Gabriel Ivucic, Siqi Cai, Haizhou Li, and Tanja Schultz, "Multi-Head Attention and GRU for Improved Match-Mismatch Classification of Speech Stimulus and EEG Response", in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Rhodes Island, Greece, June 4 - 10, 2023.
  • Ruijie Tao, Kong Aik Lee, Zhan Shi, Haizhou Li, "Speaker recognition with two-step multi-modal deep cleansing", in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Rhodes Island, Greece, June 4 - 10, 2023.
  • Xianghu Yue, Junyi Ao, Xiaoxue Gao, Haizhou Li, "Token2vec: A Joint Self-Supervised Pre-training Framework Using Unpaired Speech and Text", in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Rhodes Island, Greece, June 4 - 10, 2023.
  • Qiquan Zhang, Hongxu Zhu, Qi Song, Xinyuan Qian, Zhaoheng Ni, Haizhou Li, "Ripple Sparse Self-Attention for Monaural Speech Enhancement", in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Rhodes Island, Greece, June 4 - 10, 2023.
  • Xiaoxue Gao, Xianghu Yue and Haizhou Li, "Self-Transcriber: Few-shot Lyrics Transcription with Self-training", in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Rhodes Island, Greece, June 4 - 10, 2023.
  • Zexu Pan, Wupeng Wang, Marvin Borsdorf, Haizhou Li, "ImagineNET: Target Speaker Extraction with Intermittent Visual Cue through Embedding Inpainting", in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Rhodes Island, Greece, June 4 - 10, 2023.
  • Haolin Zuo, Rui Liu, Jinming Zhao, Guanglai Gao and Haizhou Li, "Exploiting modality-invariant feature for robust multimodal emotion recognition with missing modalities", in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Rhodes Island, Greece, June 4 - 10, 2023.

NER

  • Saurav Pahuja, Siqi Cai, Tanja Schultz, and Haizhou Li, "XAnet: Cross-Attention Between EEG of Left and Right Brain for Auditory Attention Decoding", International IEEE EMBS Conference on Neural Engineering, Baltimore, MD, USA, April 25 - 27, 2023
  • Peiwen Li, Enze Su, Jia Li, Siqi Cai, Longhan Xie, and Haizhou Li, “ESAA: An EEG-Speech Auditory Attention Detection Database,” 2022 25th Conference of the Oriental COCOSDA International Committee for the Co-ordination and Standardisation of Speech Databases and Assessment Techniques (O-COCOSDA), Hanoi, Vietnam, November 24-26, 2022, pp. 1-6, DOI: 10.1109/O-COCOSDA202257103.2022.9997944
  • Xiaoxue Gao, Chitralekha Gupta and Haizhou Li, “Music-robust Automatic Lyrics Transcription of Polyphonic Music”, Music Technology and Design, June 5-12, 2022, Saint-Etienne (France)

EMNLP

  • Bin Wang, Chen Zhang, Yan Zhang, Yiming Chen, Haizhou Li, "Analyzing and Evaluating Faithfulness in Dialogue Summarization", In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP), December 7–11, 2022, pages 4897–4908, Abu Dhabi, United Arab Emirates
  • Chen Zhang, Luis Fernando D'Haro, Qiquan Zhang, Thomas Friedrichs, Haizhou Li, "FineD-Eval: Fine-grained Automatic Dialogue-Level Evaluation", In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP), December 7–11, 2022, pages 3336–3355, Abu Dhabi, United Arab Emirates
  • Yiming Chen, Yan Zhang, Bin Wang, Zuozhu Liu, Haizhou Li, "Generate, Discriminate and Contrast: A Semi-Supervised Sentence Representation Learning Framework", In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP), December 7–11, 2022, pages 8150–8161, Abu Dhabi, United Arab Emirates

 

NeurIPS

  • Qu Yang, Jibin Wu, Malu Zhang, Yansong Chua, Xinchao Wang, Haizhou Li, “Training Spiking Neural Networks with Local Tandem Learning”, Thirty-Sixth Conference on Neural Information Processing Systems (NeurIPS 2022), November 27, 2022 – December 3, 2022, New Orleans, Louisiana, U.S.A.

 

INTERSPEECH

  • Zexu Pan, Meng Ge, Haizhou Li, “A Hybrid Continuity Loss to Reduce Over-Suppression for Time-domain Target Speaker Extraction”, in Proc. Interspeech 2022, Songdo ConvensiA, in Incheon, Korea, September 18 to 22, 2022.
  • Zeyang Song, Qi Liu, Qu Yang and Haizhou Li, “Knowledge distillation for In-memory keyword spotting model”, in Proc. Interspeech 2022, Songdo ConvensiA, in Incheon, Korea, September 18 to 22, 2022.
  • Qu Yang, Qi Liu, Haizhou Li, "Deep Residual Spiking Neural Network for Keyword Spotting in Low-Resource Settings", in Proc. Interspeech 2022, Songdo ConvensiA, in Incheon, Korea, September 18 to 22, 2022.
  • Zongyang Du, Berrak Sisman, Kun Zhou and Haizhou Li, “Disentanglement of Emotional Style and Speaker Identity for Expressive Voice Conversion”, in Proc. Interspeech 2022, Songdo ConvensiA, in Incheon, Korea, September 18 to 22, 2022.
  • Rui Liu, Berrak Sisman, Björn W. Schuller, Guanglai Gao, Haizhou Li, “Accurate Emotion Strength Assessment for Seen and Unseen Speech Based on Data-Driven Deep Learning”, in Proc. Interspeech 2022, Songdo ConvensiA, in Incheon, Korea, September 18 to 22, 2022.
  • Marvin Borsdorf, Kevin Scheck, Haizhou Li and Tanja Schultz, “Blind Language Separation: Disentangling Multilingual Cocktail Party Voices by Language”, in Proc. Interspeech 2022, Songdo ConvensiA, in Incheon, Korea, September 18 to 22, 2022.

 

ACL

  • Bin Wang, C.-C. Jay Kuo, and Haizhou Li, "Rethinking Evaluation with Word and Sentence Similarities", In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (ACL), 22nd - 27th May 2022 (Volume 1: Long Papers), pages 6060–6077, Dublin, Ireland, DOI: 10.18653/v1/2022.acl-long.419

 

ICASSP

  • Marvin Borsdorf, Kevin Scheck, Haizhou Li, Tanja Schultz, "Experts Versus All-Rounders: Target Language Extraction for Multiple Target Languages," ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Singapore, Singapore, 22 May – 27 May 2022, pp. 846-850, DOI: 10.1109/ICASSP43922.2022.9746130
  • Jiadong Wang, Jibin Wu, Malu Zhang, Qi Liu, Haizhou Li, "A Hybrid Learning Framework for Deep Spiking Neural Networks with One-Spike Temporal Coding," ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Singapore, Singapore, 22 May – 27 May 2022, pp. 8942-8946, DOI: 10.1109/ICASSP43922.2022.9746792 
  • Meng Ge, Chenglin Xu, Longbiao Wang, Eng Siong Chng, Jianwu Dang, Haizhou Li, "L-SpEx: Localized Target Speaker Extraction," ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Singapore, Singapore, 22 May – 27 May 2022, pp. 7287-7291, DOI: 10.1109/ICASSP43922.2022.9746221 
  • Xiaoxue Gao, Chitralekha Gupta and Haizhou Li, "Genre-Conditioned Acoustic Models for Automatic Lyrics Transcription of Polyphonic Music," ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Singapore, Singapore, 22 May – 27 May 2022, pp. 791-795, DOI: 10.1109/ICASSP43922.2022.9747684
  • Jinming Zhao, Ruichen Li, Qin Jin, Xinchao Wang, Haizhou Li, "Memobert: Pre-Training Model with Prompt-Based Learning for Multimodal Emotion Recognition," ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Singapore, Singapore, 22 May – 27 May 2022, pp. 4703-4707, DOI: 10.1109/ICASSP43922.2022.9746910 
  • Qiquan Zhang, Qi Song, Zhaoheng Ni, Aaron Nicolson, Haizhou Li, "Time-Frequency Attention for Monaural Speech Enhancement," ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Singapore, Singapore, 22 May – 27 May 2022, pp. 7852-7856, DOI: 10.1109/ICASSP43922.2022.9746454.  
  • Junchen Lu, Berrak Sisman, Rui Liu, Mingyang Zhang, Haizhou Li, "Visualtts: TTS with Accurate Lip-Speech Synchronization for Automatic Voice Over," ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Singapore, Singapore, 22 May – 27 May 2022, pp. 8032-8036, DOI: 10.1109/ICASSP43922.2022.9746421
  • Tianchi Liu, Rohan Kumar Das, Kong Aik Lee, Haizhou Li, "MFA: TDNN with Multi-Scale Frequency-Channel Attention for Text-Independent Speaker Verification with Short Utterances," ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Singapore, Singapore, 22 May – 27 May 2022, pp. 7517-7521, DOI: 10.1109/ICASSP43922.2022.9747021
  • Ruijie Tao, Kong Aik Lee, Rohan Kumar Das, Ville Hautamäki, Haizhou Li, "Self-Supervised Speaker Recognition with Loss-Gated Learning," ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Singapore, Singapore, 22 May – 27 May 2022, pp. 6142-6146, DOI: 10.1109/ICASSP43922.2022.9747162
  • Chen Zhang, Luis Fernando D’Haro, Thomas Friedrichs and Haizhou Li, “MDD-Eval: Self-Training on Augmented Data for Multi-Domain Dialogue Evaluation”, in Proc. Thirty-Sixth AAAI Conference on Artificial Intelligence (AAAI-22), Virtual Event, 2022.
  • Xinyuan Qian, Bidisha Sharma, Amine El Abridi and Haizhou Li, “SLoClas: A Database for Joint Sound Localization and Classification”, in Proc. O-COCOSDA 2021, 18-20 November 2021, Singapore.
  • Yan Zhang, Ruidan He, Zuozhu Liu, Lidong Bing, and Haizhou Li, “Bootstrapped Unsupervised Sentence Representation Learning”, ACL, August 2021, pp. 5168–5180.
  • Qu Yang, Jibin Wu, and Haizhou Li, “Rethinking Benchmarks for Neuromorphic Learning Algorithms”, The International Joint Conference on Neural Networks (IJCNN), Virtual Event, July 2021.
  • Ruijie Tao, Zexu Pan, Rohan Kumar Das, Xinyuan Qian, Mike Zheng Shou, Haizhou Li, “Is Someone Speaking? Exploring Long-term Temporal Features for Audio-visual Active Speaker Detection”, ACM Multimedia (MM), Chengdu, China, Oct 2021.
  • Jiadong Wang, Xinyuan Qian, Zihan Pan, Malu Zhang, and Haizhou Li, “GCC-PHAT with Speech-oriented Attention for Robotic Sound Source Localization”, in Proc. IEEE International Conference on Robotics and Automation (ICRA), Xi'an, China, 2021.
  • Chen Zhang, Yiming Chen, Luis Fernando D’Haro, Yan Zhang, Thomas Friedrichs, Grandee Lee and Haizhou Li, “DynaEval: Unifying Turn and Dialogue Level Evaluation”, in Proc. Joint Conference of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (ACL-IJCNLP), August 2021.
  • Huiping Zhuang, Zhenyu Weng, Fulin Luo, Kar-Ann Toh, Haizhou Li, and Zhiping Lin, “Accumulated Decoupled Learning with Gradient Staleness Mitigation for Convolutional Neural Networks”, Thirty-eighth International Conference on Machine Learning (ICML), July 2021.
  • Chitralekha Gupta, Purnima Kamath, and Lonce Wyse, “Signal Representations for Synthesizing Audio Textures with Generative Adversarial Networks”, Sound and Music Computing Conference, May 2021.
  • Meidan Ouyang, Rohan Kumar Das, Jichen Yang, and Haizhou Li, “Capsule Network-based End-to-end System for Detection of Replay Attacks”, in Proc. International Symposium on Chinese Spoken Language Processing (ISCSLP), Hong Kong, January 2021, pp. 1-5.
  • Kun Zhou, Berrak Sisman, and Haizhou Li, “VAW-GAN for Disentanglement and Recomposition of Emotional Elements in Speech,” in Proc. IEEE Spoken Language Technology (SLT), Shenzhen, China, January 2021.
  • Hongqiang Du, Xiaohai Tian, Lei Xie, and Haizhou Li, “Optimizing Voice Conversion Network with Cycle Consistency loss of Speaker Identity” in Proc. IEEE Spoken Language Technology (SLT), Shenzhen, China, January 2021.
  • Protima Nomo Sudro, Rohan Kumar Das, Rohit Sinha, and S. R. M. Prasanna, “Enhancing the Intelligibility of Cleft Lip and Palate Speech using Cycle-consistent Adversarial Networks” in Proc. IEEE Spoken Language Technology (SLT), Shenzhen, China, January 2021, pp. 720-727.

ASRU

  • Marvin Borsdorf, Haizhou Li, and Tanja Schultz, “Target Language Extraction at Multilingual Cocktail Parties”, in Proc. IEEE Automatic Speech Recognition and Understanding (ASRU) Workshop, Cartagena, Colombia, September 2021.
  • Zongyang Du, Berrak Sisman, Kun Zhou, and Haizhou Li, “Expressive Voice Conversion: A Joint Framework for Speaker Identity and Emotional Style Transfer”, in Proc. IEEE Automatic Speech Recognition and Understanding (ASRU) Workshop, Cartagena, Colombia, September 2021.
  • Yi Ma, Kong Aik Lee, Ville Hautamaki, and Haizhou Li, “PL-EESR: Perceptual Loss Based End-to-End Robust Speaker Representation Extraction”, in Proc. IEEE Automatic Speech Recognition and Understanding (ASRU) Workshop, Cartagena, Colombia, September 2021.
  • Bidisha Sharma, Maulik Madhavi, Xuehao Zhou, and Haizhou Li, “Exploring Teacher-Student Learning Approach for Multi-lingual Speech-to-Intent Classification”, in Proc. IEEE Automatic Speech Recognition and Understanding (ASRU) Workshop, Cartagena, Colombia, September 2021.
  • Sergey Nikonorov, Berrak Sisman, Mingyang Zhang, Haizhou Li, “Deepa: A Deep Neural Analyzer for Speech and Singing Vocoding”, in Proc. IEEE Automatic Speech Recognition and Understanding (ASRU) Workshop, Cartagena, Colombia, September 2021.

INTERSPEECH

  • Qiquan Zhang, Qi Song, Aaron Nicolson, Tian Lan, and Haizhou Li, “Temporal Convolutional Network with Frequency Dimension Adaptive Attention for Speech Enhancement” in Proc. Interspeech 2021, Brno, Czech Republic, August 2021.
  • Yidi Jiang, Bidisha Sharma, Maulik Madhavi, and Haizhou Li, “Knowledge Distillation from BERT Transformer to Speech Transformer for Intent Classification” in Proc. Interspeech 2021, Brno, Czech Republic, August 2021.
  • Rohan Kumar Das, Maulik Madhavi and Haizhou Li, “Diagnosis of COVID-19 using Auditory Acoustic Cues”, in Proc. Interspeech, Brno, Czech Republic, August 2021.
  • Yi Zhou, Xiaohai Tian, Zhizheng Wu, and Haizhou Li, “Cross-Lingual Voice Conversion with a Cycle Consistency Loss on Linguistic Representation” in Proc. Interspeech 2021, Brno, Czech Republic, August 2021.
  • Kun Zhou, Berrak Sisman, and Haizhou Li, “Limited Data Emotional Voice Conversion Leveraging Text-to-Speech: Two-stage Sequence-to-Sequence Training” in Proc. Interspeech 2021, Brno, Czech Republic, August 2021.
  • Xianghu Yue and Haizhou Li, “Phonetically Motivated Self-Supervised Speech Representation Learning”, in Proc. Interspeech 2021, Brno, Czech Republic, August 2021.
  • Marvin Borsdorf, Chenglin Xu, Haizhou Li, and Tanja Schultz, “GlobalPhone Mix-to-Separate out of 2: A Multilingual 2000 Speakers Mixtures Database for Speech Separation”, in Proc. Interspeech 2021, Brno, Czech Republic, August 2021.
  • Marvin Borsdorf, Chenglin Xu, Haizhou Li, and Tanja Schultz, “Universal Speaker Extraction in the Presence and Absence of Target Speakers for Speech of One and Two Talkers” in Proc. Interspeech 2021, Brno, Czech Republic, August 2021.
  • Rui Liu, Berrak Sisman, and Haizhou Li, “Reinforcement Learning for Emotional Text-to-Speech Synthesis with Improved Emotion Discriminability”, in Proc. Interspeech 2021, Brno, Czech Republic, August 2021.
  • Hongning Zhu, Kong Aik Lee and Haizhou Li, “Serialized Multi-Layer Multi-Head Attention for Neural Speaker Embedding”, in Proc. Interspeech 2021, Brno, Czech Republic, August 2021.
  • Wang Wupeng, Xu Chenglin, Ge Meng and Haizhou Li, “Neural Speaker Extraction with Speaker-Speech Cross-Attention Network”, in Proc. Interspeech 2021, Brno, Czech Republic, August 2021.

ICASSP

  • Rohan Kumar Das, Jichen Yang, and Haizhou Li, “Data Augmentation with Signal Companding for Detection of Logical Access Attacks” in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Toronto, Ontario, Canada, June 2021.
  • Xinyuan Qian, Maulik Madhavi, Zexu Pan, Jiadong Wang, and Haizhou Li, “Multi-target DoA estimation with an audio-visual fusion mechanism”, in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Toronto, Ontario, Canada, June 2021.
  • Zexu Pan, Ruijie Tao, Chenglin Xu, and Haizhou Li, “Multi-modal target speaker extraction with visual cues”, in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Toronto, Ontario, Canada, June 2021.
  • Nana Hou, Chenglin Xu, Eng Siong Chng, and Haizhou Li, “Learning disentangled feature representations for speech enhancement via adversarial training”, in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Toronto, Ontario, Canada, June 2021.
  • Lili Guo, Longbiao Wang, Chenglin Xu, Jianwu Dang, Eng Siong Chng, and Haizhou Li, “Representation learning with spectro-temporal-channel attention for speech emotion recognition”, in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Toronto, Ontario, Canada, June 2021.
  • Meng Ge, Chenglin Xu, Longbiao Wang, Eng Siong Chng, Jianwu Dang, and Haizhou Li, “Multi-stage Speaker Extraction with Utterance and Frame-Level Reference Signals”, in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Toronto, Ontario, Canada, June 2021.
  • Bidisha Sharma, Maulik Madhavi, and Haizhou Li, “Leveraging acoustic and linguistic embeddings from pre-trained speech and language models for intent classification”, in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Toronto, Ontario, Canada, June 2021.
  • Kun Zhou, Berrak Sisman, Rui Liu, and Haizhou Li, “Seen and unseen emotional style transfer for voice conversion with a new emotional speech dataset”, in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Toronto, Ontario, Canada, June 2021.
  • Rui Liu, Berrak Sisman, and Haizhou Li, “Graphspeech: Syntax-aware graph attention network for neural speech synthesis”, in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Toronto, Ontario, Canada, June 2021.
  • Yi Zhou, Xiaohai Tian, Xuehao Zhou, Mingyang Zhang, Grandee Lee, Rui Liu, Berrak Sisman, and Haizhou Li, “NUS-HLT System for Blizzard Challenge 2020”, in Proc. Joint Workshop for the Blizzard Challenge and Voice Conversion Challenge, Shanghai, China, October 2020, pp. 44-48.
  • Xiaohai Tian, Zhichao Wang, Shan Yang, Xinyong Zhou, Hongqiang Du, Yi Zhou, Mingyang Zhang, Kun Zhou, Berrak Sisman, Lei Xie, and Haizhou Li, “The NUS & NWPU system for Voice Conversion Challenge 2020”, in Proc. Joint Workshop for the Blizzard Challenge and Voice Conversion Challenge 2020, Shanghai, China, October 2020, pp. 170-174.
  • Zhao Yi, Wen-Chin Huang, Xiaohai Tian, Junichi Yamagishi, Rohan Kumar Das, Tomi Kinnunen, Zhenhua Ling, and Tomoki Toda, “Voice Conversion Challenge 2020 – Intra-lingual semi-parallel and cross-lingual voice conversion –”, in Proc. Joint Workshop for the Blizzard Challenge and Voice Conversion Challenge, Shanghai, China, October 2020, pp. 80-98.
  • Rohan Kumar Das, Tomi Kinnunen, Wen-Chin Huang, Zhenhua Ling, Junichi Yamagishi, Yi Zhao, Xiaohai Tian, and Tomoki Toda, “Predictions of Subjective Ratings and Spoofing Assessments of Voice Conversion Challenge 2020 Submissions”, in Proc. Joint Workshop for the Blizzard Challenge and Voice Conversion Challenge, Shanghai, China, October 2020, pp. 99-120.
  • Wanqiu Lin, Maulik Madhavi, Rohan Kumar Das and Haizhou Li, “Transformer-based Arabic Dialect Identification,” in Proc. International Conference on Asian Language Processing (IALP), Kuala Lumpur, Malaysia, December 2020, pp. 192-196.
  • Chitralekha Gupta, Lin Huang, and Haizhou Li, “Automatic Rank-Ordering of Singing Vocals with Twin-Neural Network”, in Proc. International Society for Music Information Retrieval Conference (ISMIR), Montreal, Canada, October 2020, pp. 416-423.
  • Grandee Lee and Haizhou Li, “Modeling Code-Switch Languages Using Bilingual Parallel Corpus”, in Proc. Association for Computational Linguistics, July 2020, pp. 860-870.
  • Astik Biswas, Emre Yilmaz, Febe De Wet, Ewald Van der Westhuizen and Thomas Niesler, “Semi-supervised Development of ASR Systems for Multilingual Code-switched Speech in Under-resourced Languages,” in Proc. 12th Conference on Language Resources and Evaluation (LREC), Marseille, France, May 2020, pp. 3468-3474.
  • Chen Zhang, Luis Fernando D’Haro, Rafael E. Banchs, Thomas Friedrichs and Haizhou Li, “Deep AM-FM: Toolkit for Automatic Dialogue Evaluation,” in Proc. 11th International Workshop on Spoken Dialog System (IWSDS) Technology, Barcelona, Spain, September 2020, pp. 53-69.

APSIPA-ASC

  • Rohan Kumar Das and Haizhou Li, “Classification of Speech with and without Face Mask using Acoustic Features” in Proc. Asia-Pacific Signal and Information Processing Association (APSIPA) Annual Summit and Conference (ASC), Auckland, New Zealand, December 2020, pp. 747-752.
  • Rohan Kumar Das, Ruijie Tao, Jichen Yang, Wei Rao, Cheng Yu, and Haizhou Li, “HLT-NUS Submission for NIST 2019 Multimedia Speaker Recognition Evaluation”, in Proc. Asia-Pacific Signal and Information Processing Association (APSIPA) Annual Summit and Conference (ASC), Auckland, New Zealand, December 2020, pp. 605-609.
  • Biswajit Dev Sarma and Rohan Kumar Das, “Emotion Invariant Speaker Embeddings for Speaker Identification with Emotional Speech” in Proc. Asia-Pacific Signal and Information Processing Association (APSIPA) Annual Summit and Conference (ASC), Auckland, New Zealand, December 2020, pp. 610-615.
  • Neil Shah, Sreeraj R, Maulik Madhavi, Nirmesh Shah, and Hemant Patil, “Query-by-Example Spoken Term Detection using Generative Adversarial Network”, in Proc. Asia-Pacific Signal and Information Processing Association (APSIPA) Annual Summit and Conference (ASC), Auckland, New Zealand, December 2020, pp. 644-648.
  • Zongyang Du, Kun Zhou, Berrak Sisman, and Haizhou Li, “Spectrum And Prosody Conversion for Cross-Lingual Voice Conversion with Cyclegan”, in Proc. Asia-Pacific Signal and Information Processing Association (APSIPA) Annual Summit and Conference (ASC), Auckland, New Zealand, December 2020, pp. 507-513.
  • Yi Fan Ong, Maulik Madhavi, and Ken Chan, “OPENNLU: Open-Source Web-Interface NLU Toolkit for Development of Conversational Agent”, in Proc. Asia-Pacific Signal and Information Processing Association (APSIPA) Annual Summit and Conference (ASC), Auckland, New Zealand, December 2020, pp. 381-385.
  • Junchen Lu, Kun Zhou, Berrak Sisman, and Haizhou Li, “VAW-GAN for Singing Voice Conversion with Non-parallel Training Data”, in Proc. Asia-Pacific Signal and Information Processing Association (APSIPA) Annual Summit and Conference (ASC), Auckland, New Zealand, December 2020, pp. 514-519.
  • Lin Huang, Chitralekha Gupta, and Haizhou Li, “Spectral Features and Pitch Histogram for Automatic Singing Quality Evaluation with CRNN”, in Proc. Asia-Pacific Signal and Information Processing Association (APSIPA) Annual Summit and Conference (ASC), Auckland, New Zealand, December 2020, pp. 492-499.

SPEAKER ODYSSEY

  • Xiaohai Tian, Rohan Kumar Das and Haizhou Li, “Black-box Attacks on Automatic Speaker Verification using Feedback-controlled Voice Conversion” in Proc. Speaker Odyssey, Tokyo, Japan, November 2020, pp. 159-164.
  • Xiaoxue Gao, Xiaohai Tian, Yi Zhou, Rohan Kumar Das and Haizhou Li, “Personalized Singing Voice Generation Using WaveRNN” in Proc. Speaker Odyssey, Tokyo, Japan, November 2020, pp. 252-258.
  • Kun Zhou, Berrak Sisman and Haizhou Li, “Transforming Spectrum and Prosody for Emotional Voice Conversion with Non-Parallel Training Data” in Proc. Speaker Odyssey, Tokyo, Japan, November 2020, pp. 230-237.
  • Berrak Sisman and Haizhou Li, “Generative Adversarial Networks for Singing Voice Conversion with and without Parallel Data” in Proc. Speaker Odyssey, Tokyo, Japan, November 2020, pp. 238-244.
  • Rui Liu, Berrak Sisman, Feilong Bao, Guanglai Gao and Haizhou Li, “WaveTTS: Tacotron-based TTS with Joint Time-Frequency Domain Loss” in Proc. Speaker Odyssey, Tokyo, Japan, November 2020, pp. 245-251.

INTERSPEECH

  • Rohan Kumar Das, Xiaohai Tian, Tomi Kinnunen and Haizhou Li, “The Attacker’s Perspective on Automatic Speaker Verification: An Overview” in Proc. INTERSPEECH, Shanghai, China, October 2020, pp. 4213-4217.
  • Emre Yılmaz, Ozgur Bora Gevrek, Jibin Wu, Yuxiang Chen, Xuanbo Meng and Haizhou Li, “Deep Convolutional Spiking Neural Networks for Keyword Spotting”, in Proc. INTERSPEECH, Shanghai, China, October 2020, pp. 2557-2561.
  • Xiaoyi Qin, Ming Li, Hui Bu, Wei Rao, Rohan Kumar Das, Shrikanth Narayanan and Haizhou Li, “The INTERSPEECH 2020 Far-Field Speaker Verification Challenge”, in Proc. INTERSPEECH, Shanghai, China, October 2020, pp. 3456-3460.
  • Meng Ge, Chenglin Xu, Longbiao Wang, Eng Siong Chng, Jianwu Dang and Haizhou Li, “SpEx+: A Complete Time Domain Speaker Extraction Network”, in Proc. INTERSPEECH, Shanghai, China, October 2020, pp. 1406-1410.
  • Zexu Pan, Zhaojie Luo, Jichen Yang and Haizhou Li, “Multi-modal Attention for Speech Emotion Recognition”, in Proc. INTERSPEECH, Shanghai, China, October 2020, pp. 364-368.
  • Zhenzong Wu, Rohan Kumar Das, Jichen Yang and Haizhou Li, “Light Convolutional Neural Network with Feature Genuinization for Detection of Synthetic Speech Attacks”, in Proc. INTERSPEECH, Shanghai, China, October 2020, pp. 1101-1105.
  • Ruijie Tao, Rohan Kumar Das and Haizhou Li, “Audio-visual Speaker Recognition with a Cross-modal Discriminative Network”, in Proc. INTERSPEECH, Shanghai, China, October 2020, pp. 2242-2246.
  • Tianchi Liu, Rohan Kumar Das, Maulik Madhavi, Shengmei Shen and Haizhou Li, “Speaker-Utterance Dual Attention for Speaker and Utterance Verification”, in Proc. INTERSPEECH, Shanghai, China, October 2020, pp. 4293-4297.
  • Shoufeng Lin and Xinyuan Qian, “Audio-Visual Multi-Speaker Tracking Based On the GLMB Framework”, in Proc. INTERSPEECH, Shanghai, China, October 2020, pp. 3082-3086.
  • Nana Hou, Chenglin Xu, Van Tung Pham, Joey Tianyi Zhou, Eng Siong Chng and Haizhou Li, “Speaker and Phoneme-Aware Speech Bandwidth Extension with Residual Dual-Path Network”, in Proc. INTERSPEECH, Shanghai, China, October 2020, pp. 4064-4068.
  • Kun Zhou, Berrak Sisman, Mingyang Zhang and Haizhou Li, “Converting Anyone’s Emotion: Towards Speaker-Independent Emotional Voice Conversion”, in Proc. INTERSPEECH, Shanghai, China, October 2020, pp. 3416-3420.
  • Nana Hou, Chenglin Xu, Joey Tianyi Zhou, Eng Siong Chng and Haizhou Li, “Multi-task Learning for End-to-end Noise-robust Bandwidth Extension”, in Proc. INTERSPEECH, Shanghai, China, October 2020, pp. 4069-4073.
  • Xinyuan Zhou, Emre Yılmaz, Yanhua Long, Yijie Li and Haizhou Li, “Multi-Encoder-Decoder Transformer for Code-Switching Speech Recognition”, in Proc. INTERSPEECH, Shanghai, China, October 2020, pp. 1042-1046.
  • Siqi Cai, Enze Su, Yonghao Song, Longhan Xie and Haizhou Li, “Low Latency Auditory Attention Detection with Common Spatial Pattern Analysis of EEG Signals”, in Proc. INTERSPEECH, Shanghai, China, October 2020, pp. 2772-2776.
  • Xinyuan Zhou, Grandee Lee, Emre Yılmaz, Yanhua Long, Jiaen Liang and Haizhou Li, “Self-and-Mixed Attention Decoder with Deep Acoustic Structure for Transformer-based LVCSR,” in Proc. INTERSPEECH, Shanghai, China, October 2020, pp. 5016-5020.

ICASSP

  • Chitralekha Gupta, Emre Yilmaz and Haizhou Li, “Automatic Lyrics Alignment and Transcription in Polyphonic Music: Does Background music help?”, in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Barcelona, Spain, May 2020, pp. 496-500.
  • Rohan Kumar Das and Haizhou Li, “On The Importance of Vocal Tract Constriction for Speaker Characterization: The Whispered Speech Study”, in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Barcelona, Spain, May 2020, pp. 7119-7123.
  • Rohan Kumar Das, Jichen Yang and Haizhou Li, “Assessing the Scope of Generalized Countermeasures for Anti-spoofing”, in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Barcelona, Spain, May 2020, pp. 6589-6593.
  • Rui Liu, Berrak Sisman, Jingdong Li, Feilong Bao, Guanglai Gao and Haizhou Li, “Teacher-Student Training for Robust Tacotron-based TTS”, in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Barcelona, Spain, May 2020, pp. 6274-6278.
  • Xuehao Zhou, Xiaohai Tian, Grandee Lee, Rohan Kumar Das and Haizhou Li, “End-to-end Code-switching TTS with Cross-lingual Language Model”, in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Barcelona, Spain, May 2020, pp. 7614-7618.
  • Hongqiang Du, Xiaohai Tian, Lei Xie and Haizhou Li, “Effective WaveNet Adaptation for Voice Conversion with Limited Data”, in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Barcelona, Spain, May 2020, pp. 7779-7783.
  • Van Tung Pham, Haihua Xu, Yerbolat Khassanov, Zhiping Zeng, Eng Siong Chng, Chongjia Ni, Bin Ma and Haizhou Li, “Independent Language Modeling Architecture for End-To-End ASR”, in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Barcelona, Spain, May 2020, pp. 7059-7063.
  • Xiang Hao, Chenglin Xu, Nana Hou, Lei Xie, Eng Siong Chng and Haizhou Li, “Time-Domain Neural Network Approach for Speech Bandwidth Extension”, in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Barcelona, Spain, May 2020, pp. 866-870.
  • Rohan Sheelvant, Bidisha Sharma, Maulik Madhavi, Rohan Kumar Das, S.R.M. Prasanna and Haizhou Li, “RSL2019: A Realistic Speech Localization Corpus”, in Proc. International Committee for the Co-ordination and Standardisation of Speech Databases and Assessment Techniques (COCOSDA), Cebu City, Philippines, October 2019, pp. 1-6.
  • Jibin Wu, Yansong Chua, Malu Zhang, Qu Yang, Guoqi Li and Haizhou Li, “Deep Spiking Neural Network with Novel Spike Count based Learning Rule”, in Proc. International Joint Conference on Neural Networks (IJCNN), Budapest, Hungary, July 2019, pp. 1-6.
  • Jibin Wu, Yansong Chua, Malu Zhang and Haizhou Li, “Competitive STDP-based Feature Representation Learning for Sound Event Classification”, in Proc. International Joint Conference on Neural Networks (IJCNN), Budapest, Hungary, July 2019, pp. 1-8.
  • Zihan Pan, Jibin Wu, Yansong Chua, Malu Zhang and Haizhou Li, “Neural Population Coding for Effective Temporal Classification”, in Proc. International Joint Conference on Neural Networks (IJCNN), Budapest, Hungary, July 2019, pp. 1-8.
  • Maulik Madhavi, Tong Zhan, Haizhou Li and Min Yuan, “First Leap Towards Development of Dialogue System for Autonomous Bus”, in Proc. International Workshop on Spoken Dialogue Systems Technology (IWSDS), Sicily, Italy, April 2019, pp. 1-6.
  • Malu Zhang, Jibin Wu, Yansong Chua, Xiaolin Luo, Zihan Pan, Dan Liu, and Haizhou Li, “MPD-AL: An Efficient Membrane Potential Driven Aggregate-Label Learning Algorithm for Spiking Neurons”, in Proc. Thirty-Third AAAI Conference on Artificial Intelligence (AAAI-19), Hawaii, USA, 2019, pp. 1327-1334.

ASRU

  • Berrak Sisman, Mingyang Zhang, Minghui Dong, and Haizhou Li, “On the Study of Generative Adversarial Networks for Cross-Lingual Voice Conversion”, in Proc. IEEE Automatic Speech Recognition and Understanding (ASRU) Workshop, Sentosa Island, Singapore, December 2019, pp. 144-151.
  • Hongqiang Du, Xiaohai Tian, Lei Xie and Haizhou Li, “Wavenet Factorization with Singular Value Decomposition for Voice Conversion”, in Proc. IEEE Automatic Speech Recognition and Understanding (ASRU) Workshop, Sentosa Island, Singapore, December 2019, pp. 152-159.
  • Yi Zhou, Xiaohai Tian, Emre Yılmaz, Rohan Kumar Das and Haizhou Li, “A Modularized Neural Network with Language-Specific Output Layers for Cross-Lingual Voice Conversion”, in Proc. IEEE Automatic Speech Recognition and Understanding (ASRU) Workshop, Sentosa Island, Singapore, December 2019, pp. 160-167.
  • Chenglin Xu, Wei Rao, Eng Siong Chng and Haizhou Li, “Time-Domain Speaker Extraction Network”, in Proc. IEEE Automatic Speech Recognition and Understanding (ASRU) Workshop, Sentosa Island, Singapore, December 2019, pp. 327-334.
  • Rohan Kumar Das, Jichen Yang and Haizhou Li, “Long Range Acoustic and Deep Features Perspective on ASVspoof 2019”, in Proc. IEEE Automatic Speech Recognition and Understanding (ASRU) Workshop, Sentosa Island, Singapore, December 2019, pp. 1018-1025.
  • Xianghu Yue, Grandee Lee, Emre Yılmaz, Fang Deng and Haizhou Li, “End-to-End Code-Switching ASR for Low-Resourced Language Pairs”, in Proc. IEEE Automatic Speech Recognition and Understanding (ASRU) Workshop, Sentosa Island, Singapore, December 2019, pp. 972-979.

APSIPA-ASC

  • Yitong Liu, Rohan Kumar Das and Haizhou Li, “Multi-band Spectral Entropy Information for Detection of Replay Attacks”, in Proc. Asia-Pacific Signal and Information Processing Association (APSIPA) Annual Summit and Conference (ASC), Lanzhou, China, November 2019, pp. 838-843.
  • Rohan Kumar Das, Jichen Yang and Haizhou Li, “Speaker Clustering with Penalty Distance for Speaker Verification with Multi-Speaker Speech”, in Proc. Asia-Pacific Signal and Information Processing Association (APSIPA) Annual Summit and Conference (ASC), Lanzhou, China, November 2019, pp. 1630-1635.
  • Xiaoxue Gao, Xiaohai Tian, Rohan Kumar Das, Yi Zhou and Haizhou Li, “Speaker-Independent Spectral Mapping for Speech-to-Singing Conversion”, in Proc. Asia-Pacific Signal and Information Processing Association (APSIPA) Annual Summit and Conference (ASC), Lanzhou, China, November 2019, pp. 159-164.
  • Karthika Vijayan, Kodukula Sri Rama Murty and Haizhou Li, “Allpass Modeling of Phase Spectrum of Speech Signals for Formant Tracking”, in Proc. Asia-Pacific Signal and Information Processing Association (APSIPA) Annual Summit and Conference (ASC), Lanzhou, China, November 2019, pp. 1190-1196.
  • Yi Zhou, Xiaohai Tian, Rohan Kumar Das and Haizhou Li, “Many-to-many Cross-lingual Voice Conversion with a Jointly Trained Speaker Embedding Network”, in Proc. Asia-Pacific Signal and Information Processing Association (APSIPA) Annual Summit and Conference (ASC), Lanzhou, China, November 2019, pp. 1282-1287.

INTERSPEECH

  • Emre Yılmaz, Samuel Cohen, Xianghu Yue, David van Leeuwen and Haizhou Li, “Multi-Graph Decoding for Code-Switching ASR”, in Proc. INTERSPEECH, Graz, Austria, September 2019, pp. 3750-3754.
  • Qinyi Wang, Emre Yılmaz, Adem Derinel and Haizhou Li, “Code-Switching Detection Using ASR-Generated Language Posteriors”, in Proc. INTERSPEECH, Graz, Austria, September 2019, pp. 3740-3744.
  • Astik Biswas, Emre Yılmaz, Febe De Wet, Ewald Van der Westhuizen and Thomas Niesler, “Semi-supervised Acoustic Model Training for Five-Lingual Code-Switched ASR”, in Proc. INTERSPEECH, Graz, Austria, September 2019, pp. 3745-3749.
  • Mingyang Zhang, Xin Wang, Fuming Fang, Haizhou Li and Junichi Yamagishi, “Joint Training Framework for Text-to-Speech and Voice Conversion Using Multi-Source Tacotron and Wavenet”, in Proc. INTERSPEECH, Graz, Austria, September 2019, pp. 1298-1302.
  • Grandee Lee, Xianghu Yue, Haizhou Li, “Linguistically Motivated Parallel Data Augmentation for Code-switch Language Modeling”, in Proc. INTERSPEECH, Graz, Austria, September 2019, pp. 3730-3734.
  • Emre Yılmaz, Adem Derinel, Zhou Kun, Henk van den Heuvel, Niko Brummer, Haizhou Li and David van Leeuwen, “Large-Scale Speaker Diarization of Radio Broadcast Archives”, in Proc. INTERSPEECH, Graz, Austria, September 2019, pp. 411-415.
  • Wei Rao, Chenglin Xu, Eng Siong Chng and Haizhou Li, “Target Speaker Extraction for Multi-Talker Speaker Verification”, in Proc. INTERSPEECH, Graz, Austria, September 2019, pp. 1273-1277.
  • Zhiping Zeng, Yerbolat Khassanov, Van Tung Pham, Haihua Xu, Eng Siong Chng and Haizhou Li, “On the End-to-End Solution to Mandarin-English Code-switching Speech Recognition”, in Proc. INTERSPEECH, Graz, Austria, September 2019, pp. 2165-2169.
  • Xiaohai Tian, Eng Siong Chng and Haizhou Li, “A Speaker-Dependent WaveNet for Voice Conversion with Non-Parallel Data”, in Proc. INTERSPEECH, Graz, Austria, September 2019, pp. 201-205.
  • Chitralekha Gupta, Emre Yılmaz and Haizhou Li, “Acoustic Modeling for Automatic Lyrics-to-Audio Alignment”, in Proc. INTERSPEECH, Graz, Austria, September 2019, pp. 2040-2044.
  • Kong Aik Lee, Ville Hautamaki, Tomi Kinnunen, Hitoshi Yamamoto, Koji Okabe, Ville Vestman, Jing Huang, Guohong Ding, Hanwu Sun, Anthony Larcher, Rohan Kumar Das, Haizhou Li, Mickael Rouvier, Pierre-Michel Bousquet, Wei Rao, Qing Wang, Chunlei Zhang, Fahimeh Bahmaninezhad, Héctor Delgado and Massimiliano Todisco, “I4U Submission to NIST SRE 2018: Leveraging from a Decade of Shared Experiences”, in Proc. INTERSPEECH, Graz, Austria, September 2019, pp. 1497-1501.
  • Rohan Kumar Das, Jichen Yang and Haizhou Li, “Long Range Acoustic Features for Spoofed Speech Detection”, in Proc. INTERSPEECH, Graz, Austria, September 2019, pp. 1058-1062.
  • Rohan Kumar Das and Haizhou Li, “Instantaneous Phase and Long-term Acoustic Cues for Orca Activity Detection”, in Proc. INTERSPEECH, Graz, Austria, September 2019, pp. 2418-2422.
  • Bidisha Sharma, Rohan Kumar Das and Haizhou Li, “On the Importance of Audio-source Separation for Singer Identification in Polyphonic Music”, in Proc. INTERSPEECH, Graz, Austria, September 2019, pp. 2020-2024.
  • Bidisha Sharma, Rohan Kumar Das and Haizhou Li, “Multi-level Adaptive Speech Activity Detector for Speech in Naturalistic Environments”, in Proc. INTERSPEECH, Graz, Austria, September 2019, pp. 2015-2019.
  • Bidisha Sharma and Haizhou Li, “A Combination of Model-based and Feature-based Strategy for Speech-to-Singing Alignment”, in Proc. INTERSPEECH, Graz, Austria, September 2019, pp. 624-628.
  • Tianchi Liu, Maulik Madhavi, Rohan Kumar Das and Haizhou Li, “A Unified Framework for Speaker and Utterance Verification”, in Proc. INTERSPEECH, Graz, Austria, September 2019, pp. 4320-4324.
  • Changhuai You, Jichen Yang and Tran Huy Dat, “Device Feature Extractor for Replay Spoofing Detection”, in Proc. INTERSPEECH, Graz, Austria, September 2019, pp. 2933-2937.
  • Tharshini Gunendradasan, Eliathamby Ambikairajah, Julien Epps and Haizhou Li, “An Adaptive-Q Cochlear Model for Replay Spoofing Detection”, in Proc. INTERSPEECH, Graz, Austria, September 2019, pp. 2918-2922.
  • Andros Tjandra, Berrak Sisman, Mingyang Zhang, Sakriani Sakti, Haizhou Li and Satoshi Nakamura, “VQVAE Unsupervised Unit Discovery and Multi-scale Code2Spec Inverter for Zerospeech Challenge 2019”, in Proc. INTERSPEECH, Graz, Austria, September 2019, pp. 1118-1122.
  • Chitralekha Gupta, Karthika Vijayan, Bidisha Sharma, Xiaoxue Gao and Haizhou Li, “NUS Speak-to-Sing: A Web Platform for Personalized Speech-to-Singing Conversion”, in Proc. INTERSPEECH, Graz, Austria, September 2019, pp. 2376-2377.
  • Jibin Wu, Zihan Pan, Malu Zhang, Rohan Kumar Das, Yansong Chua and Haizhou Li, “Robust Sound Recognition: A Neuromorphic Approach”, in Proc. INTERSPEECH, Graz, Austria, September 2019, pp. 3667-3668.
  • Sarfaraz Jelil, Abhishek Shrivastava, Rohan Kumar Das, S. R. M. Prasanna and Rohit Sinha, “SpeechMarker: A Voice-based Multi-Level Attendance Application”, in Proc. INTERSPEECH, Graz, Austria, September 2019, pp. 3665-3666.

ICASSP

  • Bidisha Sharma, Chitralekha Gupta, Haizhou Li, and Ye Wang, “Automatic Lyrics-to-Audio Alignment on Polyphonic Music using Singing-Adapted Acoustic Models”, in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Brighton, United Kingdom, May 2019, pp. 396-400.
  • Yi Zhou, Xiaohai Tian, Haihua Xu, Rohan Kumar Das and Haizhou Li, “Cross-Lingual Voice Conversion with Bilingual Phonetic Posteriorgram and Average Modeling”, in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Brighton, United Kingdom, May 2019, pp. 6790-6794.
  • Grandee Lee and Haizhou Li, “Word and Class Common Space Embedding for Code-switch Language Modeling”, in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Brighton, United Kingdom, May 2019, pp. 6086-6090.
  • Chenglin Xu, Wei Rao, Eng Siong Chng and Haizhou Li, “Optimization of Speaker Extraction Neural network with Magnitude and Temporal Spectrum Approximation Loss”, in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Brighton, United Kingdom, May 2019, pp. 6990-6994.
  • Longting Xu, Rohan Kumar Das, Emre Yılmaz, Jichen Yang and Haizhou Li “Generative x-vectors for text-independent speaker verification,” in Proc. IEEE Spoken Language Technology (SLT), Athens, Greece, December 2018, pp. 1014-1020.
  • Berrak Sisman, Mingyang Zhang, Sakriani Sakti, Haizhou Li and Satoshi Nakamura, “Adaptive Wavenet Vocoder for Residual Compensation in GAN-Based Voice Conversion,” in Proc. IEEE Spoken Language Technology (SLT), Athens, Greece, December 2018, pp. 282-289.
  • Kantheti Srinivas, Rohan Kumar Das and Hemant A. Patil, “Combining Phase-based Features for Replay Spoof Detection” in Proc. International Symposium on Chinese Spoken Language Processing (ISCSLP), Taipei, Taiwan, November 2018, pp. 151-155.
  • Chitralekha Gupta, Rong Tong, Haizhou Li, and Ye Wang, “Semi-supervised lyrics and solo-singing alignment”, In Proc. International Society for Music Information Retrieval Conference (ISMIR), Paris, France, September 2018, pp. 1-8.
  • Xiaoxue Gao, Berrak Sisman, Rohan Kumar Das and Karthika Vijayan, “NUS-HLT Spoken Lyrics and Singing (SLS) Corpus”, in Proc. International Conference on Orange Technologies (ICOT), Bali, Indonesia, October 2018, pp. 1-6.
  • Emre Yılmaz, Henk van den Heuvel and David A. van Leeuwen, “Code-Switching Detection with Data-Augmented Acoustic and Language Models,” in Proc. 6th Workshop on Spoken Language Technologies for Under-Resourced Languages (SLTU), Gurugram, India, September 2018, pp. 127-131.
  • Raghav Menon, Herman Kamper, Emre Yılmaz, John Quinn and Thomas Niesler, “ASR-free CNN-DTW Keyword Spotting Using Multilingual Bottleneck Features for Almost Zero-Resource Languages,” in Proc. 6th Workshop on Spoken Language Technologies for Under-Resourced Languages (SLTU), Gurugram, India, September 2018, pp. 20-24.
  • Berrak Sisman, Grandee Lee and Haizhou Li, “Phonetically Aware Exemplar-Based Prosody Transformation,” in Proc. The Speaker and Language Recognition Workshop-Odyssey, Les Sables D’olonne, France, June 2018, pp. 267-274.
  • Zihan Pan, Haizhou Li, Jibin Wu and Yansong Chua, “An Event-Based Cochlear Filter Temporal Encoding Scheme for Speech Signals,” in Proc. International Joint Conference on Neural Networks (IJCNN), Rio de Janeiro, Brazil, July 2018, pp. 1-8.
  • Jibin Wu, Yansong Chua and Haizhou Li, “A Biologically Plausible Speech Recognition Framework Based on Spiking Neural Networks,” in Proc. International Joint Conference on Neural Networks (IJCNN), Rio de Janeiro, Brazil, July 2018, pp. 1-8.
  • Xiaohai Tian, Juchao Wang, Haihua Xu, Eng Siong Chng and Haizhou Li, “Average Modeling Approach to Voice Conversion with Non-Parallel Data,” in Proc. Odyssey 2018: The Speaker and Language Recognition Workshop, Les Sables d’Olonne, France, June 2018, pp. 227-232.

ICASSP

  • Chenglin Xu, Wei Rao, Xiong Xiao, Eng Siong Chng and Haizhou Li, “Single Channel Speech Separation with Constrained Utterance Level Permutation Invariant Training Using Grid LSTM,” in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Calgary, Alberta, Canada, April 2018, pp. 6-10.
  • Qing Wang, Wei Rao, Sining Sun, Lei Xie, Eng Siong Chng and Haizhou Li, “Unsupervised Domain Adaptation via Domain Adversarial Training for Speaker Recognition,” in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Calgary, Alberta, Canada, April 2018, pp. 4889-4893.
  • Karthika Vijayan, Haizhou Li, Hanwu Sun and Kong-Aik Lee, “On the Importance of Analytic Phase of Speech Signals in Spoken Language Recognition,” in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Calgary, Alberta, Canada, April 2018, pp. 5194-5198.
  • Saad Irtza, Vidhyasaharan Sethu, Eliathamby Ambikairajah and Haizhou Li, “End-to-End Hierarchical Language Identification System,” in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Calgary, Alberta, Canada, April 2018, pp. 5199-5203.

APSIPA-ASC

  • Mingyang Zhang, Berrak Sisman, Sai Sirisha Rallabandi, Haizhou Li and Li Zhao, “Error Reduction Network for DBLSTM-based Voice Conversion,” in Proc. Asia-Pacific Signal and Information Processing Association (APSIPA) Annual Summit and Conference (ASC), Honolulu, Hawaii, USA, November 2018, pp. 823-828.
  • Yanping Li, Kong-Aik Lee, Yougen Yuan, Haizhou Li and Zhen Yang, “Many-to-Many Voice Conversion based on Bottleneck Features with Variational Autoencoder for Non-parallel Training Data,” in Proc. Asia-Pacific Signal and Information Processing Association (APSIPA) Annual Summit and Conference (ASC), Honolulu, Hawaii, USA, November 2018, pp. 829-833.
  • Chitralekha Gupta, Haizhou Li and Ye Wang, “Automatic Evaluation of Singing Quality without a Reference,” in Proc. Asia-Pacific Signal and Information Processing Association (APSIPA) Annual Summit and Conference (ASC), Honolulu, Hawaii, USA, November 2018, pp. 990-997.
  • Jichen Yang, Rohan Kumar Das and Haizhou Li, “Extended Constant-Q Cepstral Coefficients for Detection of Spoofing Attacks,” in Proc. Asia-Pacific Signal and Information Processing Association (APSIPA) Annual Summit and Conference (ASC), Honolulu, Hawaii, USA, November 2018, pp. 1024-1029.
  • Rohan Kumar Das and Haizhou Li, “Instantaneous Phase and Excitation Source Features for Detection of Replay Attacks,” in Proc. Asia-Pacific Signal and Information Processing Association (APSIPA) Annual Summit and Conference (ASC), Honolulu, Hawaii, USA, November 2018, pp. 1030-1037.
  • Gajan Suthokumar, Kaavya Sriskandaraja, Vidhyasaharan Sethu, Chamith Wijenayake, Eliathamby Ambikairajah and Haizhou Li, “Use of Claimed Speaker Models for Replay Detection,” in Proc. Asia-Pacific Signal and Information Processing Association (APSIPA) Annual Summit and Conference (ASC), Honolulu, Hawaii, USA, November 2018, pp. 1038-1046.
  • Sarith Fernando, Vidhyasaharan Sethu, Eliathamby Ambikairajah and Haizhou Li, “Second Order Factorized Model Adaptation for Short Duration Language Identification,” in Proc. Asia-Pacific Signal and Information Processing Association (APSIPA) Annual Summit and Conference (ASC), Honolulu, Hawaii, USA, November 2018, pp. 1440-1447.
  • Rohan Kumar Das, Maulik C. Madhavi and Haizhou Li, “Compensating Utterance Information in Fixed Phrase Speaker Verification,” in Proc. Asia-Pacific Signal and Information Processing Association (APSIPA) Annual Summit and Conference (ASC), Honolulu, Hawaii, USA, November 2018, pp. 1708-1712.
  • Karthika Vijayan, Xiaoxue Gao and Haizhou Li, “Analysis of Speech and Singing Signals for Temporal Alignment,” in Proc. Asia-Pacific Signal and Information Processing Association (APSIPA) Annual Summit and Conference (ASC), Honolulu, Hawaii, USA, November 2018, pp. 1893-1898.
  • Rohan Kumar Das and S. R. M. Prasanna, “Investigating Text-independent Speaker Verification from Practically Realizable System Perspective,” in Proc. Asia-Pacific Signal and Information Processing Association (APSIPA) Annual Summit and Conference (ASC) 2018, Honolulu, Hawaii, USA, November 2018, pp. 12-15.

INTERSPEECH

  • Berrak Sisman and Haizhou Li, “Wavelet Analysis of Speaker Dependent and Independent Prosody for Voice Conversion,” in Proc. INTERSPEECH, Hyderabad, India, September 2018, pp. 52-56.
  • Haihua Xu, Van Tung Pham, Zin Tun Kyaw, Zhi Hao Lim, Eng Siong Chng and Haizhou Li, “Mandarin-English Code-switching Speech Recognition,” in Proc. INTERSPEECH, Hyderabad, India, September 2018, pp. 554-555.
  • Yougen Yuan, Cheung-Chi Leung, Lei Xie, Hongjie Chen, Bin Ma and Haizhou Li, “Learning Acoustic Word Embeddings with Temporal Context for Query-by-Example Speech Search,” in Proc. INTERSPEECH, Hyderabad, India, September 2018, pp. 97-101.
  • Longting Xu, Kong-Aik Lee, Haizhou Li and Zhen Yang, “Co-whitening of I-vectors for Short and Long Duration Speaker Verification,” in Proc. INTERSPEECH, Hyderabad, India, September 2018, pp. 1066-1070.
  • Chitralekha Gupta, Haizhou Li, and Ye Wang, “Automatic Pronunciation Evaluation of Singing,” in Proc. INTERSPEECH, Hyderabad, India, September 2018, pp. 1507-1511.
  • Emre Yılmaz, Astik Biswas, Ewald Van der Westhuizen, Febe De Wet and Thomas Niesler, “Building a Unified Code-Switching ASR System for South African Languages,” in Proc. INTERSPEECH, Hyderabad, India, September 2018, pp. 1923-1927.
  • Emre Yılmaz, Henk van den Heuvel and David van Leeuwen, “Acoustic and Textual Data Augmentation for Improved ASR of Code-Switching Speech,” in Proc. INTERSPEECH, Hyderabad, India, September 2018, pp. 1933-1937.
  • Berrak Sisman, Mingyang Zhang and Haizhou Li, “A Voice Conversion Framework with Tandem Feature Sparse Representation and Speaker-Adapted WaveNet Vocoder,” in Proc. INTERSPEECH, Hyderabad, India, September 2018, pp. 1978-1982.
  • Emre Yılmaz, Vikramjit Mitra, Chris Bartels and Horacio Franco, “Articulatory Features for ASR of Pathological Speech,” in Proc. INTERSPEECH, Hyderabad, India, September 2018, pp. 2958-2962.
  • Astik Biswas, Febe De Wet, Ewald Van der Westhuizen, Emre Yılmaz and Thomas Niesler, “Multilingual Neural Network Acoustic Modelling for ASR of Under-Resourced English-isiZulu Code-Switched Speech,” in Proc. INTERSPEECH, Hyderabad, India, September 2018, pp. 2603-2607.
  • Chenglin Xu, Wei Rao, Eng Siong Chng and Haizhou Li, “A Shifted Delta Coefficient Objective for Monaural Speech Separation Using Multi-Task Learning,” in Proc. INTERSPEECH, Hyderabad, India, September 2018, pp. 3479-3483.
  • Jinba Xiao, Shan Yang, Mingyang Zhang, Berrak Sisman, Dongyan Huang, Lei Xie, Minghui Dong and Haizhou Li, “The I2R-NWPU-NUS Text-to-Speech System for Blizzard Challenge 2018,” in Proc. INTERSPEECH Blizzard Challenge Workshop, Microsoft India, Hyderabad, India, September 2018.
  • Biswajit Dev Sarma, Rohan Kumar Das, Abhishek Dey and Risto Haukioja, “Analysis of Speech Emotions in Realistic Environments,” in Proc. Speech, Music and Mind (SMM) 2018, a satellite event of INTERSPEECH 2018, Hyderabad, India, September 2018, pp. 11-15.
  • Berrak Sisman, Grandee Lee, Haizhou Li and Kay Chen Tan, “On the Analysis and Evaluation of Prosody Conversion Techniques,” International Conference on Asian Language Processing (IALP), Singapore, December 2017, pp. 44-47.
  • Grandee Lee, Thi-Nga Ho, Eng-Siong Chng and Haizhou Li, “A Review of the Mandarin-English Code-Switching Corpus: SEAME,” International Conference on Asian Language Processing (IALP), Singapore, December 2017, pp. 210-213.

APSIPA-ASC

  • Luis Fernando D’Haro, Andreea I. Niculescu, Caixia Cai, Suraj Nair, Rafael E. Banchs, Alois Knoll and Haizhou Li, “An Integrated Framework for Multimodal Human-Robot Interaction,” in Proc. Asia-Pacific Signal and Information Processing Association (APSIPA) Annual Summit and Conference (ASC), Kuala Lumpur, Malaysia, December 2017, pp. 76-82.
  • Nancy F. Chen, Boon Pang Lim, Van Hai Do, Van Tung Pham, Chongjia Ni, Haihua Xu, Mark Hasegawa-Johnson, Wenda Chen, Xiong Xiao, Sunil Sivadas, Eng Siong Chng, Bin Ma and Haizhou Li, “Low-Resource Spoken Keyword Search Strategies in Georgian Inspired by Distinctive Feature Theory,” in Proc. Asia-Pacific Signal and Information Processing Association (APSIPA) Annual Summit and Conference (ASC), Kuala Lumpur, Malaysia, December 2017, pp. 1322-1327.
  • Karthika Vijayan, Minghui Dong and Haizhou Li, “A Dual Alignment Scheme for Improved Speech-to-Singing Voice Conversion,” in Proc. Asia-Pacific Signal and Information Processing Association (APSIPA) Annual Summit and Conference (ASC), Kuala Lumpur, Malaysia, December 2017, pp. 1547-1555.
  • Hanwu Sun, Kong-Aik Lee, Trung Hieu Nguyen, Bin Ma and Haizhou Li, “I2R-NUS Submission to Oriental Language Recognition AP16-OL7 Challenge,” in Proc. Asia-Pacific Signal and Information Processing Association (APSIPA) Annual Summit and Conference (ASC), Kuala Lumpur, Malaysia, December 2017, pp. 1574-1578.
  • Zhiping Zeng, Haihua Xu, Tze Yuang Chong, Eng Siong Chng and Haizhou Li, “Improving N-gram Language Modeling for Code-Switching Speech Recognition,” in Proc. Asia-Pacific Signal and Information Processing Association (APSIPA) Annual Summit and Conference (ASC), Kuala Lumpur, Malaysia, December 2017, pp. 1596-1601.
  • Xiong Xiao, Shengkui Zhao, Douglas L. Jones, Eng Siong Chng and Haizhou Li, “Time-Frequency Mask Estimation for MVDR Beamforming with Application in Robust Speech Recognition”, Asia-Pacific Signal and Information Processing Association (APSIPA) Annual Summit and Conference (ASC), Kuala Lumpur, Malaysia, December 2017, pp. 3246-3250.
  • Liping Chen, Kong-Aik Lee, Bin Ma, Long Ma, Haizhou Li and Li-Rong Dai, “Adaptation of PLDA for Multi-Source Text-Independent Speaker Verification”, Asia-Pacific Signal and Information Processing Association (APSIPA) Annual Summit and Conference (ASC), Kuala Lumpur, Malaysia, December 2017, pp. 5380-5384.
  • Yougen Yuan, Cheung-Chi Leung, Lei Xie, Hongjie Chen, Bin Ma and Haizhou Li, “Pairwise Learning Using Multi-Lingual Bottleneck Features for Low-Resource Query-by-Example Spoken Term Detection”, Asia-Pacific Signal and Information Processing Association (APSIPA) Annual Summit and Conference (ASC), Kuala Lumpur, Malaysia, December 2017, pp. 5645-5649.
  • Berrak Sisman, Haizhou Li and Kay Chen Tan, “Transformation of Prosody in Voice Conversion”, in Proc. Asia-Pacific Signal and Information Processing Association (APSIPA) Annual Summit and Conference (ASC), Kuala Lumpur, Malaysia, December 2017, pp. 1537-1546.
  • Chitralekha Gupta, Haizhou Li and Ye Wang, “Perceptual Evaluation of Singing Quality”, in Proc. Asia-Pacific Signal and Information Processing Association (APSIPA) Annual Summit and Conference (ASC), Kuala Lumpur, Malaysia, December 2017, pp. 577-586.

INTERSPEECH

  • D.Y. Huang, Wan Ding, Mingyu Xu, Huaiping Ming, Minghui Dong, Xinguo Yu and Haizhou Li, “Multimodal Prediction of Affective Dimensions via Fusing Multiple Regression Techniques”, in Proc. INTERSPEECH, Stockholm, Sweden, August 2017, pp. 162-165.
  • Kong Aik Lee and Haizhou Li, “Gain Compensation for Fast i-Vector Extraction Over Short Duration”, in Proc. INTERSPEECH, Stockholm, Sweden, August 2017, pp. 1527-1531.
  • Chenglin Xu, Xiong Xiao, Sining Sun, Wei Rao, Eng Siong Chng and Haizhou Li, “Weighted Spatial Covariance Matrix Estimation for MUSIC Based TDOA Estimation of Speech Source”, in Proc. INTERSPEECH, Stockholm, Sweden, August 2017, pp. 1894-1898.
  • Saad Irtza, Vidhyasaharan Sethu, Eliathamby Ambikairajah and Haizhou Li, “Investigating Scalability in Hierarchical Language Identification System”, in Proc. INTERSPEECH, Stockholm, Sweden, August 2017, pp. 2581-2585.
  • Jie Wu, D.-Y. Huang, Lei Xie and Haizhou Li, “Denoising Recurrent Neural Network for Deep Bidirectional LSTM Based Voice Conversion”, in Proc. INTERSPEECH, Stockholm, Sweden, August 2017, pp. 3379-3383.

ASRU

  • Berrak Sisman, Haizhou Li and Kay Chen Tan, “Sparse Representation of Phonetic Features for Voice Conversion with and Without Parallel Data”, IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), Okinawa, Japan, December 2017, pp. 677-684.
  • Shan Yang, Lei Xie, Xiao Chen, Xiaoyan Lou, Xuan Zhu, Dongyan Huang and Haizhou Li, “Statistical Parametric Speech Synthesis using Generative Adversarial Networks Under a Multi-Task Learning Framework”, IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), Okinawa, Japan, December 2017, pp. 685-691.
  • Hongjie Chen, Cheung-Chi Leung, Lei Xie, Bin Ma and Haizhou Li, “Multilingual bottle-neck feature learning from Untranscribed Speech”, IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), Okinawa, Japan, December 2017, pp. 727-733.
  • Yougen Yuan, Cheung-Chi Leung, Lei Xie, Hongjie Chen, Bin Ma and Haizhou Li, “Extracting Bottleneck Features and Word-like Pairs from Untranscribed Speech for Feature Representation”, IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), Okinawa, Japan, December 2017, pp. 734-739.
  • Seokhwan Kim, Rafael E. Banchs and Haizhou Li, “Exploring Convolutional and Recurrent Neural Networks in Sequential Labelling for Dialogue Topic Tracking”, in Proc. 54th Annual Meeting of the Association for Computational Linguistics (ACL), Berlin, Germany, August 2016, pp. 963-973.
  • Wan Ding, Mingyu Xu, Dong-Yan Huang, Weisi Lin, Minghui Dong, Xinguo Yu and Haizhou Li, “Audio and Face Video Emotion Recognition in the Wild Using Deep Neural Networks and Small Datasets”, in Proc. 18th International Conference on Multimodal Interaction (ICMI), Tokyo, Japan, November 2016, pp. 506-513.

APSIPA-ASC

  • Nancy F. Chen and Haizhou Li, “Computer-Assisted Pronunciation Training: From Pronunciation Scoring Towards Spoken Language Learning”, in Proc. Asia-Pacific Signal and Information Processing Association (APSIPA) Annual Summit and Conference (ASC), Jeju, Korea, December 2016, pp. 1-7.
  • Xiaohai Tian, Xiong Xiao, Eng Siong Chng and Haizhou Li, “Spoofing Speech Detection using Temporal Convolutional Neural Network”, in Proc. Asia-Pacific Signal and Information Processing Association (APSIPA) Annual Summit and Conference (ASC), Jeju, Korea, December 2016, pp. 1-6.
  • Xiong Xiao, Shinji Watanabe, Eng Siong Chng and Haizhou Li, “Beamforming Networks using Spatial Covariance Features for Far-Field Speech Recognition”, in Proc. Asia-Pacific Signal and Information Processing Association (APSIPA) Annual Summit and Conference (ASC), Jeju, Korea, December 2016, pp. 1-6.
  • Haihua Xu, Wei Rao, Xiong Xiao, Hao Huang, Eng Siong Chng and Haizhou Li, “I-Vector Based Deep Neural Network Acoustic Model Adaptation using Multilingual Language Resource”, in Proc. Asia-Pacific Signal and Information Processing Association (APSIPA) Annual Summit and Conference (ASC), Jeju, Korea, December 2016, pp. 1-5.

ICASSP

  • Xiaohai Tian, Zhizheng Wu, Xiong Xiao, Eng Siong Chng and Haizhou Li, “Spoofing Detection from a Feature Representation Perspective”, in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Shanghai, China, March 2016, pp. 2119-2123.
  • Huaiping Ming, Dong-Yan Huang, Lei Xie, Shaofei Zhang, Minghui Dong and Haizhou Li, “Exemplar-Based Sparse Representation of Timbre and Prosody for Voice Conversion”, in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Shanghai, China, March 2016, pp. 5175-5179.
  • Liping Chen, Kong-Aik Lee, Eng Siong Chng, Bin Ma, Haizhou Li and Li-Rong Dai, “Content-Aware Local Variability Vector for Speaker Verification with Short Utterance”, in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Shanghai, China, March 2016, pp. 5485-5489.
  • Saad Irtza, Vidhyasaharan Sethu, Haris Bavattichalil, Eliathamby Ambikairajah and Haizhou Li, “A Hierarchical Framework for Language Identification”, in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Shanghai, China, March 2016, pp. 5820-5824.
  • Chongjia Ni, Cheung-Chi Leung, Lei Wang, Haibo Liu, Feng Rao, Li Lu, Nancy F. Chen, Bin Ma and Haizhou Li, “Cross-Lingual Deep Neural Network-Based Submodular Unbiased Data Selection for Low-Resource Keyword Search”, in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Shanghai, China, March 2016, pp. 6015-6019.
  • Haihua Xu, Jingyong Hou, Xiong Xiao, Van Tung Pham, Cheung-Chi Leung, Lei Wang, Van Hai Do, Hang Lv, Lei Xie, Bin Ma, Eng Siong Chng and Haizhou Li, “Approximate Search of Audio Queries by Using DTW with Phone Time Boundary and Data Augmentation”, in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Shanghai, China, March 2016, pp. 6030-6034.
  • Van Tung Pham, Haihua Xu, Xiong Xiao, Nancy F. Chen, Eng Siong Chng and Haizhou Li, “Keyword Search Using Query Expansion for Graph-Based Rescoring of Hypothesized Detections”, in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Shanghai, China, March 2016, pp. 6035-6039.
  • Nancy F. Chen, Van Tung Pham, Haihua Xu, Xiong Xiao, Van Hai Do, Chongjia Ni, I-Fan Chen, Sunil Sivadas, Chin-Hui Lee, Eng Siong Chng, Bin Ma and Haizhou Li, “Exemplar-Inspired Strategies for Low-Resource Spoken Keyword Search in Swahili”, in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Shanghai, China, March 2016, pp. 6040-6044.
  • Xiong Xiao, Shengkui Zhao, Thi Ngoc Tho Nguyen, Douglas L. Jones, Eng Siong Chng and Haizhou Li, “An Expectation-Maximization Eigenvector Clustering Approach to Direction of Arrival Estimation of Multiple Speech Sources”, in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Shanghai, China, March 2016, pp. 6330-6334.
  • Dong-Yan Huang, Minghui Dong and Haizhou Li, “Combining Multiple Kernel Models for Automatic Intelligibility Detection of Pathological Speech”, in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Shanghai, China, March 2016, pp. 6485-6489.

INTERSPEECH

  • Yougen Yuan, Cheung-Chi Leung, Lei Xie, Bin Ma and Haizhou Li, “Learning Neural Network Representations Using Cross-Lingual Bottleneck Features with Word-Pair Information”, in Proc. INTERSPEECH, San Francisco, California, USA, September 2016, pp. 788-792.
  • Hongjie Chen, Cheung-Chi Leung, Lei Xie, Bin Ma and Haizhou Li, “Unsupervised Bottleneck Features for Low-Resource Query-by-Example Spoken Term Detection”, in Proc. INTERSPEECH, San Francisco, California, USA, September 2016, pp. 923-927.
  • Van Tung Pham, Haihua Xu, Xiong Xiao, Nancy F. Chen, Eng Siong Chng and Haizhou Li, “Rescoring Hypothesized Detections of Out-of-Vocabulary Keywords Using Subword Samples”, in Proc. INTERSPEECH, San Francisco, California, USA, September 2016, pp. 933-937.
  • Paul Yaozhu Chan, Minghui Dong, Grace Xue Hui Ho and Haizhou Li, “SERAPHIM: A Wavetable Synthesis System with 3D Lip Animation for Real-Time Speech and Singing Applications on Mobile Platforms”, in Proc. INTERSPEECH, San Francisco, California, USA, September 2016, pp. 1225-1229.
  • Haihua Xu, Hang Su, Chongjia Ni, Xiong Xiao, Hao Huang, Eng Siong Chng and Haizhou Li, “Semi-Supervised and Cross-Lingual Knowledge Transfer Learnings for DNN Hybrid Acoustic Models Under Low-Resource Conditions”, in Proc. INTERSPEECH, San Francisco, California, USA, September 2016, pp. 1315-1319.
  • Jia Yu, Xiong Xiao, Lei Xie, Eng Siong Chng and Haizhou Li, “A DNN-HMM Approach to Story Segmentation”, in Proc. INTERSPEECH, San Francisco, California, USA, September 2016, pp. 1527-1531.
  • Nancy F. Chen, Rong Tong, Darren Wee, Pei Xuan Lee, Bin Ma and Haizhou Li, “SingaKids-Mandarin: Speech Corpus of Singaporean Children Speaking Mandarin Chinese”, in Proc. INTERSPEECH, San Francisco, California, USA, September 2016, pp. 1545-1549.
  • Xiaohai Tian, Zhizheng Wu, Xiong Xiao, Eng Siong Chng and Haizhou Li, “An Investigation of Spoofing Speech Detection Under Additive Noise and Reverberant Conditions”, in Proc. INTERSPEECH, San Francisco, California, USA, September 2016, pp. 1715-1719.
  • Paul Yaozhu Chan, Minghui Dong, Grace Xue Hui Ho and Haizhou Li, “SERAPHIM Live! – Singing Synthesis for the Performer, the Composer, and the 3D Game Developer”, in Proc. INTERSPEECH, San Francisco, California, USA, September 2016, pp. 1966-1967.
  • Huaiping Ming, Dong-Yan Huang, Lei Xie, Jie Wu, Minghui Dong and Haizhou Li, “Deep Bidirectional LSTM Modeling of Timbre and Prosody for Emotional Voice Conversion”, in Proc. INTERSPEECH, San Francisco, California, USA, September 2016, pp. 2453-2457.
  • Rong Tong, Nancy F. Chen, Bin Ma and Haizhou Li, “Context Aware Mispronunciation Detection for Mandarin Pronunciation Training”, in Proc. INTERSPEECH, San Francisco, California, USA, September 2016, pp. 3112-3116.
  • Kong-Aik Lee, Haizhou Li, Li Deng, Ville Hautamäki, Wei Rao, Xiong Xiao, Anthony Larcher, Hanwu Sun, Trung Hieu Nguyen, Guangsen Wang, Aleksandr Sizov, Jianshu Chen, Ivan Kukanov, Amir Hossein Poorjam, Trung Ngo Trong, Chenglin Xu, Haihua Xu, Bin Ma, Eng Siong Chng and Sylvain Meignier, “The 2015 NIST Language Recognition Evaluation: The Shared View of I2R, Fantastic4 and SingaMS”, in Proc. INTERSPEECH, San Francisco, California, USA, September 2016, pp. 3211-3215.
  • Saad Irtza, Vidhyasaharan Sethu, Sarith Fernando, Eliathamby Ambikairajah and Haizhou Li, “Out of Set Language Modelling in Hierarchical Language Identification”, in Proc. INTERSPEECH, San Francisco, California, USA, September 2016, pp. 3270-3274.
  • Chongjia Ni, Lei Wang, Cheung-Chi Leung, Feng Rao, Li Lu, Bin Ma and Haizhou Li, “Rapid Update of Multilingual Deep Neural Network for Low-Resource Keyword Search”, in Proc. INTERSPEECH, San Francisco, California, USA, September 2016, pp. 3698-3702.
  • Cheung-Chi Leung, Lei Wang, Haihua Xu, Jingyong Hou, Van Tung Pham, Hang Lv, Lei Xie, Xiong Xiao, Chongjia Ni, Bin Ma, Eng Siong Chng and Haizhou Li, “Toward High-Performance Language-Independent Query-by-Example Spoken Term Detection for MediaEval 2015: Post-Evaluation Analysis”, in Proc. INTERSPEECH, San Francisco, California, USA, September 2016, pp. 3703-3707.
  • Huaiping Ming, Dong-Yan Huang, Minghui Dong, Haizhou Li, Lei Xie and Shaofei Zhang, “Fundamental Frequency Modeling Using Wavelets for Emotional Voice Conversion”, in Proc. International Conference on Affective Computing and Intelligent Interaction (ACII), Xi'an, China, September 2015, pp. 804-809.
  • Haihua Xu, Xiong Xiao, Eng Siong Chng and Haizhou Li, “On Statistical Machine Translation Method for Lexicon Refinement in Speech Recognition”, in Proc. Third IEEE China Summit and International Conference on Signal and Information Processing (ChinaSIP), Chengdu, China, July 2015, pp. 25-29.
  • Xiaohai Tian, Steven Du, Xiong Xiao, Haihua Xu, Eng Siong Chng and Haizhou Li, “Detecting Synthetic Speech Using Long Term Magnitude and Phase Information”, in Proc. Third IEEE China Summit and International Conference on Signal and Information Processing (ChinaSIP), Chengdu, China, July 2015, pp. 611-615.
  • Seokhwan Kim, Rafael E. Banchs and Haizhou Li, “Wikification of Concept Mentions within Spoken Dialogues Using Domain Constraints from Wikipedia”, in Proc. Empirical Methods in Natural Language Processing (EMNLP), Lisbon, Portugal, September 2015, pp. 2225-2229.
  • Kui Wu, Xuancong Wang, Nina Zhou, AiTi Aw and Haizhou Li, “Joint Chinese Word Segmentation and Punctuation Prediction Using Deep Recurrent Neural Network for Social Media Data”, in Proc. International Conference on Asian Language Processing (IALP), Suzhou, China, October 2015, pp. 41-44.
  • Gillian Chua, Qian Ci Chang, Ye Won Park, Paul Yaozhu Chan, Minghui Dong and Haizhou Li, “The Expression of Singing Emotion – Contradicting the Constraints of Song”, in Proc. International Conference on Asian Language Processing (IALP), Suzhou, China, October 2015, pp. 98-102.
  • Yang Yu, Weisi Lin, Dong-Yan Huang, Minghui Dong and Haizhou Li, “Performance Scoring of Singing Voice”, in Proc. International Conference on Asian Language Processing (IALP), Suzhou, China, October 2015, pp. 119-122.
  • Ridong Jiang, Seokhwan Kim, Rafael E. Banchs and Haizhou Li, “Towards Improving the Performance of Vector Space Model for Chinese Frequently Asked Question Answering”, in Proc. International Conference on Asian Language Processing (IALP), Suzhou, China, October 2015, pp. 136-139.
  • Miaolong Yuan, Bo Tian, Vui Ann Shim, Huajin Tang and Haizhou Li, “An Entorhinal-Hippocampal Model for Simultaneous Cognitive Map Building”, in Proc. Twenty-Ninth AAAI Conference on Artificial Intelligence (AAAI-15), Austin, Texas, USA, January 2015, pp. 586-592.
  • Sheng Gao and Haizhou Li, “Popular Song Summarization Using Chorus Section Detection from Audio Signal”, in Proc. IEEE International Workshop on Multimedia Signal Processing (MMSP), Xiamen, China, October 2015, pp. 1-6.
  • Seokhwan Kim, Rafael E. Banchs and Haizhou Li, “Towards Improving Dialogue Topic Tracking Performances with Wikification of Concept Mentions”, in Proc. 16th Annual SIGdial Meeting on Discourse and Dialogue (SIGDIAL), Prague, Czech Republic, September 2015, pp. 124-128.

APSIPA-ASC

  • Van Hai Do, Xiong Xiao, Eng Siong Chng and Haizhou Li, “Distance Metric Learning for Kernel Density-Based Acoustic Model Under Limited Training Data Conditions”, in Proc. Asia-Pacific Signal and Information Processing Association (APSIPA) Annual Summit and Conference (ASC), Hong Kong, December 2015, pp. 54-58.
  • Jia Yu, Lei Xie, Xiong Xiao, Eng Siong Chng and Haizhou Li, “A Density Peak Clustering Approach to Unsupervised Acoustic Subword Units Discovery”, in Proc. Asia-Pacific Signal and Information Processing Association (APSIPA) Annual Summit and Conference (ASC), Hong Kong, December 2015, pp. 178-183.
  • Shaofei Zhang, Dong-Yan Huang, Lei Xie, Eng Siong Chng, Haizhou Li and Minghui Dong, “Non-Negative Matrix Factorization Using Stable Alternating Direction Method of Multipliers for Source Separation”, in Proc. Asia-Pacific Signal and Information Processing Association (APSIPA) Annual Summit and Conference (ASC), Hong Kong, December 2015, pp. 222-228.
  • Van Tung Pham, Haihua Xu, Van Hai Do, Tze Yuang Chong, Xiong Xiao, Eng Siong Chng and Haizhou Li, “On the Study of Very Low-Resource Language Keyword Search”, in Proc. Asia-Pacific Signal and Information Processing Association (APSIPA) Annual Summit and Conference (ASC), Hong Kong, December 2015, pp. 358-364.
  • Minghui Dong, Chenyu Yang, Yanfeng Lu, Jochen Walter Ehnes, Dong-Yan Huang, Huaiping Ming, Rong Tong, Siu Wa Lee and Haizhou Li, “Mapping Frames with DNN-HMM Recognizer for Non-Parallel Voice Conversion”, in Proc. Asia-Pacific Signal and Information Processing Association (APSIPA) Annual Summit and Conference (ASC), Hong Kong, December 2015, pp. 488-494.
  • Van Hai Do, Xiong Xiao, Haihua Xu, Eng Siong Chng and Haizhou Li, “Multilingual Exemplar-Based Acoustic Model for the NIST Open KWS 2015 Evaluation”, in Proc. Asia-Pacific Signal and Information Processing Association (APSIPA) Annual Summit and Conference (ASC), Hong Kong, December 2015, pp. 594-598.

ASRU

  • Shengkui Zhao, Xiong Xiao, Zhaofeng Zhang, Thi Ngoc Tho Nguyen, Xionghu Zhong, Bo Ren, Longbiao Wang, Douglas L. Jones, Eng Siong Chng and Haizhou Li, “Robust Speech Recognition Using Beamforming with Adaptive Microphone Gains and Multichannel Noise Reduction”, in Proc. IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), Scottsdale, AZ, USA, December 2015, pp. 460-467.

ICASSP

  • Jonathan Dennis, Tran Huy Dat, and Haizhou Li, “Combining Robust Spike Coding with Spiking Neural Networks for Sound Event Classification”, in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), South Brisbane, Queensland, Australia, April 2015, pp. 176-180.
  • Xiong Xiao, Shengkui Zhao, Xionghu Zhong, Douglas L. Jones, Eng Siong Chng, and Haizhou Li, “A Learning-based Approach to Direction of Arrival Estimation in Noisy and Reverberant Environments”, in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), South Brisbane, Queensland, Australia, April 2015, pp. 2814-2818.
  • Sven Ewan Shepstone, Kong Aik Lee, Haizhou Li, Zheng-Hua Tan, and Søren Holdt Jensen, “Source-Specific Informative Prior for i-Vector Extraction”, in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), South Brisbane, Queensland, Australia, April 2015, pp. 4185-4189.
  • Haihua Xu, Peng Yang, Xiong Xiao, Lei Xie, Cheung-Chi Leung, Hongjie Chen, Jia Yu, Hang Lv, Lei Wang, Su Jun Leow, Bin Ma, Eng Siong Chng, and Haizhou Li, “Language Independent Query-by-Example Spoken Term Detection using N-Best Phone Sequences and Partial Matching”, in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), South Brisbane, Queensland, Australia, April 2015, pp. 5191-5195.
  • Liping Chen, Kong Aik Lee, Bin Ma, Wu Guo, Haizhou Li, and Li Rong Dai, “Channel Adaptation of PLDA for Text-Independent Speaker Verification”, in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), South Brisbane, Queensland, Australia, April 2015, pp. 5251-5255.
  • Rong Tong, Nancy F. Chen, Boon Pang Lim, Bin Ma, and Haizhou Li, “Tokenizing Fundamental Frequency Variation for Mandarin Tone Error Detection”, in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), South Brisbane, Queensland, Australia, April 2015, pp. 5361-5365.
  • Nancy F. Chen, Chongjia Ni, I-Fan Chen, Sunil Sivadas, Van Tung Pham, Haihua Xu, Xiong Xiao, Tze Siong Lau, Su Jun Leow, Boon Pang Lim, Cheung-Chi Leung, Lei Wang, Chin-Hui Lee, Alvina Goh, Eng Siong Chng, Bin Ma, and Haizhou Li, “Low-Resource Keyword Search Strategies for Tamil”, in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), South Brisbane, Queensland, Australia, April 2015, pp. 5366-5370.

INTERSPEECH

  • Liping Chen, Kong-Aik Lee, Bin Ma, Wu Guo, Haizhou Li and Li-Rong Dai, “Phone-Centric Local Variability Vector for Text-Constrained Speaker Verification”, in Proc. INTERSPEECH, Dresden, Germany, September 2015, pp. 229-233.
  • Nancy F. Chen, Rong Tong, Darren Wee, Pei Xuan Lee, Bin Ma and Haizhou Li, “iCALL corpus: Mandarin Chinese Spoken by Non-Native Speakers of European Descent”, in Proc. INTERSPEECH, Dresden, Germany, September 2015, pp. 324-328.
  • Rong Tong, Nancy F. Chen, Bin Ma and Haizhou Li, “Goodness of Tone (GOT) for Non-Native Mandarin Tone Recognition”, in Proc. INTERSPEECH, Dresden, Germany, September 2015, pp. 801-805.
  • Saad Irtza, Vidhyasaharan Sethu, Phu Ngoc Le, Eliathamby Ambikairajah and Haizhou Li, “Phonemes Frequency-Based PLLR Dimensionality Reduction for Language Recognition”, in Proc. INTERSPEECH, Dresden, Germany, September 2015, pp. 997-1001.
  • Longting Xu, Kong-Aik Lee, Haizhou Li and Zhen Yang, “Sparse Coding of Total Variability Matrix”, in Proc. INTERSPEECH, Dresden, Germany, September 2015, pp. 1022-1026.
  • Tze Yuang Chong, Rafael E. Banchs, Eng Siong Chng and Haizhou Li, “TDTO Language Modeling with Feedforward Neural Networks”, in Proc. INTERSPEECH, Dresden, Germany, September 2015, pp. 1458-1462.
  • Shaofei Zhang, Dong-Yan Huang, Lei Xie, Eng Siong Chng, Haizhou Li and Minghui Dong, “Regularized Non-Negative Matrix Factorization Using Alternating Direction Method of Multipliers and its Application to Source Separation”, in Proc. INTERSPEECH, Dresden, Germany, September 2015, pp. 1498-1502.
  • Jonathan William Dennis, Tran Huy Dat and Haizhou Li, “Spiking neural networks and the Generalised Hough Transform for Speech Pattern Detection”, in Proc. INTERSPEECH, Dresden, Germany, September 2015, pp. 1997-2001.
  • Xiong Xiao, Xiaohai Tian, Steven Du, Haihua Xu, Eng Siong Chng and Haizhou Li, “Spoofing Speech Detection Using High Dimensional Magnitude and Phase Features: The NTU Approach for ASVspoof 2015 Challenge”, in Proc. INTERSPEECH, Dresden, Germany, September 2015, pp. 2052-2056.
  • Kong-Aik Lee, Guangsen Wang, Kam Pheng Ng, Hanwu Sun, Trung Hieu Nguyen, Ngoc Thuy Huong Thai, Bin Ma and Haizhou Li, “The Reddots Platform for Mobile Crowd-Sourcing of Speech Data”, in Proc. INTERSPEECH, Dresden, Germany, September 2015, pp. 2603-2604.
  • Dong-Yan Huang, Minghui Dong and Haizhou Li, “A Real-Time Variable-Q Non-Stationary Gabor Transform for Pitch Shifting”, in Proc. INTERSPEECH, Dresden, Germany, September 2015, pp. 2744-2748.
  • Kong-Aik Lee, Anthony Larcher, Guangsen Wang, Patrick Kenny, Niko Brümmer, David A. van Leeuwen, Hagai Aronowitz, Marcel Kockmann, Carlos Vaquero, Bin Ma, Haizhou Li, Themos Stafylakis, Md. Jahangir Alam, Albert Swart and Javier Perez, “The Reddots Data Collection for Speaker Recognition”, in Proc. INTERSPEECH, Dresden, Germany, September 2015, pp. 2996-3000.
  • Hongjie Chen, Cheung-Chi Leung, Lei Xie, Bin Ma and Haizhou Li, “Parallel Inference of Dirichlet Process Gaussian Mixture Models for Unsupervised Acoustic Modeling: a Feasibility Study”, in Proc. INTERSPEECH, Dresden, Germany, September 2015, pp. 3189-3193.
  • Huaiping Ming, Dong-Yan Huang, Lei Xie, Haizhou Li and Minghui Dong, “An Alternating Optimization Approach for Phase Retrieval”, in Proc. INTERSPEECH, Dresden, Germany, September 2015, pp. 3426-3430.
  • Xiong Xiao, Shengkui Zhao, Xionghu Zhong, Douglas L. Jones, Eng Siong Chng and Haizhou Li, “Learning to Estimate Reverberation Time in Noisy and Reverberant Rooms”, in Proc. INTERSPEECH, Dresden, Germany, September 2015, pp. 3431-3435.
  • VCSR”, in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Florence, Italy, May 2014, pp. 4883-4887.
  • Rong Tong, Boon Pang Lim, Nancy F. Chen, Bin Ma and Haizhou Li, “Subspace Gaussian Mixture Model for Computer-Assisted Language Learning”, in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Florence, Italy, May 2014, pp. 5347-5351.
  • Van Tung Pham, Haihua Xu, Nancy F. Chen, Sunil Sivadas, Boon Pang Lim, Eng Siong Chng and Haizhou Li, “Discriminative Score Normalization for Keyword Search Decision”, in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Florence, Italy, May 2014, pp. 7078-7082.

INTERSPEECH

  • Van Hai Do, Xiong Xiao, Eng Siong Chng and Haizhou Li, “Kernel Density-Based Acoustic Model with Cross-Lingual Bottleneck Features for Resource-Limited LVCSR”, in Proc. INTERSPEECH, Singapore, September 2014, pp. 6-10.
  • Haipeng Wang, Tan Lee, Cheung-Chi Leung, Bin Ma and Haizhou Li, “A Graph-Based Gaussian Component Clustering Approach to Unsupervised Acoustic Modeling”, in Proc. INTERSPEECH, Singapore, September 2014, pp. 875-879.
  • Anthony Larcher, Kong-Aik Lee, Pablo Luis Sordo Martinez, Trung Hieu Nguyen, Bin Ma and Haizhou Li, “Extended RSR 2015 for Text-Dependent Speaker Verification over VHF Channel”, in Proc. INTERSPEECH, Singapore, September 2014, pp. 1322-1326.
  • Hoang Gia Ngo, Nancy F. Chen, Sunil Sivadas, Bin Ma and Haizhou Li, “A Minimal-Resource Transliteration Framework for Vietnamese”, in Proc. INTERSPEECH, Singapore, September 2014, pp. 1410-1414.
  • Peng Yang, Cheung-Chi Leung, Lei Xie, Bin Ma and Haizhou Li, “Intrinsic Spectral Analysis Based on Temporal Context Features for Query-by-example Spoken Term Detection”, in Proc. INTERSPEECH, Singapore, September 2014, pp. 1722-1726.
  • Haihua Xu, Hang Su, Eng Siong Chng and Haizhou Li, “Semi-Supervised Training for Bottle-Neck Feature Based DNN-HMM Hybrid Systems”, in Proc. INTERSPEECH, Singapore, September 2014, pp. 2078-2082.
  • Minghui Dong, Siu Wa Lee, Haizhou Li, Paul Y. Chan, Xuejian Peng, Jochen Walter Ehnes, and Dong-Yan Huang, “I2R Speech2singing Perfects Everyone’s Singing”, in Proc. INTERSPEECH, Singapore, September 2014, pp. 2148-2149.
  • Siu Wa Lee, Zhizheng Wu, Minghui Dong, Xiaohai Tian and Haizhou Li, “A Comparative Study of Spectral Transformation Techniques for Singing Voice Synthesis”, in Proc. INTERSPEECH, Singapore, September 2014, pp. 2499-2503.
  • Zhizheng Wu, Eng Siong Chng and Haizhou Li, “Joint Nonnegative Matrix Factorization for Exemplar-based Voice Conversion”, in Proc. INTERSPEECH, Singapore, September 2014, pp. 2509-2513.
  • Chenglin Xu, Lei Xie, Guangpu Huang, Xiong Xiao, Eng Siong Chng and Haizhou Li, “A Deep Neural Network Approach for Sentence Boundary Detection in Broadcast News”, in Proc. INTERSPEECH, Singapore, September 2014, pp. 2887-2891.
  • Rong Tong, Bin Ma and Haizhou Li, “Virtual Example for Phonotactic Language Recognition”, in Proc. INTERSPEECH, Singapore, September 2014, pp. 3017-3021.
  • Tze Yuang Chong, Rafael E. Banchs, Eng Siong Chng and Haizhou Li, “Modeling of Term-Distance and Term-Occurrence Information for Improving n-Gram Language model performance”, in Proc. Annual Meeting of the Association for Computational Linguistics (ACL), Sofia, Bulgaria, August 2013, pp. 233-237.
  • Xiaoming Lu, Lei Xie, Cheung-Chi Leung, Bin Ma and Haizhou Li, “Broadcast News Story Segmentation Using Manifold Learning on Latent Topic Distributions”, in Proc. Annual Meeting of the Association for Computational Linguistics (ACL), Sofia, Bulgaria, August 2013, pp. 190-195.
  • Zhizheng Wu, Eng Siong Chng and Haizhou Li, “Conditional Restricted Boltzmann Machine for Voice Conversion”, in Proc. IEEE China Summit and International Conference on Signal and Information Processing (ChinaSIP), Beijing, China, July 2013, pp. 104-108.
  • Yanan Li, Keng Peng Tee, Shuzhi Sam Ge and Haizhou Li, “Building Companionship through Human-Robot Collaboration”, in Proc. International Conference of Social Robotics (ICSR), Bristol, UK, October 2013.

APSIPA-ASC

  • Zhizheng Wu and Haizhou Li, “Voice conversion and spoofing attack on speaker verification systems”, in Proc. Asia-Pacific Signal and Information Processing Association (APSIPA) Annual Summit and Conference (ASC), Kaohsiung, Taiwan, November 2013, pp. 1-9. (Invited paper)
  • Duc Hoang Ha Nguyen, Aleem Mushtaq, Xiong Xiao, Eng Siong Chng, Haizhou Li and Chin-Hui Lee, “A Particle Filter Compensation Approach to Robust LVCSR”, in Proc. Asia-Pacific Signal and Information Processing Association (APSIPA) Annual Summit and Conference (ASC), Kaohsiung, Taiwan, November 2013, pp. 1-7.

INTERSPEECH

  • Vidhyasaharan Sethu, Julien Epps, Eliathamby Ambikairajah and Haizhou Li, “GMM Based Speaker Variability Compensated System for Interspeech 2013 ComParE Emotion Challenge”, in Proc. INTERSPEECH, Lyon, France, August 2013, pp. 205-209.
  • Van Hai Do, Xiong Xiao, Eng Siong Chng and Haizhou Li, “Context-Dependent Phone Mapping for LVCSR of Under-Resourced Languages”, in Proc. INTERSPEECH, Lyon, France, August 2013, pp. 500-504.
  • Xiong Xiao, Eng Siong Chng and Haizhou Li, “Attribute-Based Histogram Equalization (HEQ) and its Adaptation for Robust Speech Recognition”, in Proc. INTERSPEECH, Lyon, France, August 2013, pp. 876-880.
  • Zhizheng Wu, Anthony Larcher, Kong Aik Lee, Eng Siong Chng, Tomi Kinnunen and Haizhou Li, “Vulnerability Evaluation of Speaker Verification Under Voice Conversion Spoofing: The Effect of Text Constraints”, in Proc. INTERSPEECH, Lyon, France, August 2013, pp. 950-954.
  • R. Saeidi, Kong Aik Lee, Tomi Kinnunen, Taufiq Hasan, Benoit Fauve, P.-M. Bousquet, Elie Khoury, P.L. Sordo Martinez, J. M. K. Kua, Chang Huai You, Hanwu Sun, Anthony Larcher, Padmanabhan Rajan, Ville Hautamäki, Cemal Hanilçi, B. Braithwaite, Rosa González Hautamäki, Seyed Omid Sadjadi, Gang Liu, Hynek Boril, N. Shokouhi, D. Matrouf, L. El Shafey, Pejman Mowlaee, Julien Epps, T. Thiruvaran, David A. van Leeuwen, Bin Ma, Haizhou Li, John H.L. Hansen, and Jean-Francois Bonastre, “I4U Submission to NIST SRE 2012: A Large-Scale Collaborative Effort for Noise-Robust Speaker Verification”, in Proc. INTERSPEECH, Lyon, France, August 2013, pp. 1986-1990.
  • Haipeng Wang, Tan Lee, Cheung-Chi Leung, Bin Ma and Haizhou Li, “Unsupervised Mining of Acoustic Subword Units with Segment-Level Gaussian Posteriorgrams”, in Proc. INTERSPEECH, Lyon, France, August 2013, pp. 2297-2301.
  • Nancy F. Chen, Shivakumar, Mahesh Harikumar, Bin Ma and Haizhou Li, “Large-Scale Characterization of Mandarin Pronunciation Errors Made by Native Speakers of European Languages”, in Proc. INTERSPEECH, Lyon, France, August 2013, pp. 2370-2374.
  • Anthony Larcher, Jean-Francois Bonastre, Benoit Fauve, Kong Aik Lee, Christophe Lévy, Haizhou Li, John S. D. Mason and Jean-Yves Parfait, “ALIZE 3.0 — Open Source Toolkit for State-of-the-Art Speaker Recognition”, in Proc. INTERSPEECH, Lyon, France, August 2013, pp. 2768-2772.
  • Zhizheng Wu, Tuomas Virtanen, Tomi Kinnunen, Eng Siong Chng and Haizhou Li, “Exemplar-Based Unit Selection for Voice Conversion Utilizing Temporal Information”, in Proc. INTERSPEECH, Lyon, France, August 2013, pp. 3057-3061.
  • Kong Aik Lee, Anthony Larcher, Chang Huai You, Bin Ma and Haizhou Li, “Multi-Session PLDA Scoring of i-Vector for Partially Open-Set Speaker Detection”, in Proc. INTERSPEECH, Lyon, France, August 2013, pp. 3651-3655.

ICASSP

  • Zhizheng Wu, Xiong Xiao, Eng Siong Chng and Haizhou Li, “Synthetic Speech Detection using Temporal Modulation Feature”, in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Vancouver, Canada, May 2013, pp. 7234-7238.
  • Dau-Cheng Lyu, Eng-Siong Chng and Haizhou Li, “Language Diarization for Code-Switch Conversational Speech”, in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Vancouver, Canada, May 2013, pp. 7314-7318.
  • Nancy F. Chen, Bin Ma and Haizhou Li, “Minimal-Resource Phonetic Language Models to Summarize Untranscribed Speech”, in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Vancouver, Canada, May 2013, pp. 8357-8361.
  • Anthony Larcher, Kong Aik Lee, Bin Ma and Haizhou Li, “Phonetically-Constrained PLDA Modeling for Text-Dependent Speaker Verification with Multiple Short Utterances”, in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Vancouver, Canada, May 2013, pp. 7673-7677.
  • Chang Huai You, Haizhou Li, Bin Ma and Kong Aik Lee, “A Study on GMM-SVM with Adaptive Relevance Factor and Its Comparison with i-Vector and JFA for Speaker Recognition”, in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Vancouver, Canada, May 2013, pp. 7683-7687.
  • Heike Adel, Ngoc Thang Vu, Franziska Kraus, Tim Schlippe, Haizhou Li and Tanja Schultz, “Recurrent neural network language modeling for code switching conversational speech”, in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Vancouver, Canada, May 2013, pp. 8411-8415.
  • Xiaoming Lu, Cheung-Chi Leung, Lei Xie, Bin Ma and Haizhou Li, “Broadcast News Story Segmentation using Latent Topics on Data Manifold”, in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Vancouver, Canada, May 2013, pp. 8465-8469.
  • Jonathan Dennis, Yu Qiang, Tang Huajin, Tran Huy Dat and Li Haizhou, “Temporal Coding of Local Spectrogram Features for Robust Sound Recognition”, in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Vancouver, Canada, May 2013, pp. 803-807.
  • Xiong Xiao, Eng Siong Chng and Haizhou Li, “Temporal Filter Design by Minimum KL Divergence Criterion for Robust Speech Recognition”, in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Vancouver, Canada, May 2013, pp. 7908-7912.
  • Haipeng Wang, Tan Lee, Cheung-Chi Leung, Bin Ma and Haizhou Li, “Using Parallel Tokenizers with DTW Matrix Combination for Low-Resource Spoken Term Detection”, in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Vancouver, Canada, May 2013, pp. 8545-8549.
  • Tze Yuang Chong, Xiong Xiao, Tien-Ping Tan, Eng Siong Chng, and Haizhou Li, “Collection and annotation of Malay Conversational Speech Corpus”, in Proc. The International Committee for the Co-ordination and Standardization of Speech Databases and Assessment Techniques (COCOSDA), Macau, China, December 2012, pp. 30-35.
  • Deyi Xiong, Min Zhang, and Haizhou Li, “Modeling the Translation of Predicate-Argument Structure for SMT”, in Proc. Annual Meeting of the Association for Computational Linguistics (ACL), Jeju, Korea, July 2012, pp. 902-911.
  • Wenliang Chen, Min Zhang, and Haizhou Li, “Utilizing Dependency Language Models for Graph-based Dependency Parsing Models”, in Proc. Annual Meeting of the Association for Computational Linguistics (ACL), Jeju, Korea, July 2012, pp. 213-222.
  • Rafael E. Banchs and Haizhou Li, “IRIS: a Chat-oriented Dialogue System based on the Vector Space Model”, in Proc. Annual Meeting of the Association for Computational Linguistics (ACL), (System Demonstrations), Jeju, Korea, July 2012, pp. 37-42.
  • Van Hai Do, Xiong Xiao, Eng Siong Chng and Haizhou Li, “A Phone Mapping Technique for Acoustic Modeling of Under-Resourced Languages”, in Proc. International Conference on Asian Language Processing (IALP), Hanoi, Vietnam, November 2012, pp. 233-236.
  • Liyuan Li, Xinguo Yu, Jun Li, Gang Wang, Ji Yu Shi, Yeow Kee Tan and Haizhou Li, “Vision-based attention estimation and selection for social robot to perform natural interaction in the open world”, in Proc. Seventh Annual Conference on Human-Robot Interaction (HRI), Boston, Massachusetts, USA, March 2012, pp. 183-184.
  • Keng Peng Tee, Shuzhi Sam Ge, Rui Yan and Haizhou Li, “Adaptive control for robot manipulators under ellipsoidal task space constraints”, in Proc. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Vilamoura, Algarve, Portugal, October 2012, pp. 1167-1172.

APSIPA-ASC

  • Zhizheng Wu, Tomi Kinnunen, Eng Siong Chng, Haizhou Li and Eliathamby Ambikairajah, “A Study on Spoofing Attack in State-of-the-Art Speaker Verification: the Telephone Speech Case”, in Proc. Asia-Pacific Signal and Information Processing Association (APSIPA) Annual Summit and Conference (ASC), California, USA, December 2012. (Best Paper Award)

ICASSP

  • Xiong Xiao, Jinyu Li, Eng Siong Chng and Haizhou Li, “Lasso Environment Model Combination for Robust Speech Recognition”, in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Kyoto, Japan, March 2012, pp. 4305-4308.
  • Xiong Xiao, Eng Siong Chng and Haizhou Li, “Joint Spectral and Temporal Normalization of Features for Robust Recognition of Noisy and Reverberated Speech”, in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Kyoto, Japan, March 2012, pp. 4325-4328.
  • Siu Wa Lee, Shen Ting Ang, Minghui Dong, and Haizhou Li, “Generalized F0 Modelling with Absolute and Relative Pitch Features for Singing Voice Synthesis”, in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Kyoto, Japan, March 2012, pp. 429-432.
  • Lilei Zheng, Cheung-Chi Leung, Lei Xie, Bin Ma and Haizhou Li, “Acoustic TextTiling for Story Segmentation of Spoken Documents”, in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Kyoto, Japan, March 2012, pp. 5121-5124.
  • Haipeng Wang, Cheung-Chi Leung, Tan Lee, Bin Ma and Haizhou Li, “An Acoustic Segment Modeling Approach to Query-by-Example Spoken Term Detection”, in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Kyoto, Japan, March 2012, pp. 5157-5160.
  • Anthony Larcher, Pierre-Michel Bousquet, Kong Aik Lee, Driss Matrouf, Haizhou Li and Jean-Francois Bonastre, “I-Vectors in the Context of Phonetically-Constrained Short Utterances for Speaker Verification”, in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Kyoto, Japan, March 2012, pp. 4773-4776.
  • Tomi Kinnunen, Zhi-Zheng Wu, Kong Aik Lee, Filip Sedlak, Eng Siong Chng and Haizhou Li, “Vulnerability of Speaker Verification Systems Against Voice Conversion Spoofing Attacks: the case of Telephone Speech”, in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Kyoto, Japan, March 2012, pp. 4401-4404.

INTERSPEECH

  • Ye Jiang, Kong Aik Lee, Zhenmin Tang, Bin Ma, Anthony Larcher and Haizhou Li, “PLDA Modeling in I-Vector and Supervector Space for Speaker Verification”, in Proc. INTERSPEECH, Portland, Oregon, September 2012, pp. 1680-1683.
  • Anthony Larcher, Kong Aik Lee, Bin Ma and Haizhou Li, “RSR2015: Database for Text-Dependent Speaker Verification using Multiple Pass-Phrases”, in Proc. INTERSPEECH, Portland, Oregon, September 2012, pp. 1580-1583.
  • You Changhuai, Li Haizhou, Ma Bin and Lee Kong Aik, “Effect of Relevance Factor of Maximum a posteriori Adaptation for GMM-SVM in Speaker and Language Recognition”, in Proc. INTERSPEECH, Portland, Oregon, September 2012, pp. 2065-2068.

ISCSLP

  • Van Hai Do, Xiong Xiao, Eng Siong Chng and Haizhou Li, “Context dependent phone mapping for cross-lingual acoustic modelling”, in Proc. 8th International Symposium on Chinese Spoken Language Processing (ISCSLP), Hong Kong, December 2012, pp. 16-20.
  • Cheung-Chi Leung, Bin Ma, and Haizhou Li, “Phonotactic spoken language recognition: Using diversely adapted acoustic models in parallel phone recognizers”, in Proc. 8th International Symposium on Chinese Spoken Language Processing (ISCSLP), Hong Kong, December 2012, pp. 108-111.
  • Duc Hoang Ha Nguyen, Xiong Xiao, Chng Eng Siong and Haizhou Li, “An analysis of vector Taylor series model compensation for non-stationary noise in speech recognition”, in Proc. 8th International Symposium on Chinese Spoken Language Processing (ISCSLP), Hong Kong, December 2012, pp. 131-135.
  • Siu Wa Lee, Minghui Dong and Haizhou Li, “A study of F0 modelling and generation with lyrics and shape characterization for singing voice synthesis”, in Proc. 8th International Symposium on Chinese Spoken Language Processing (ISCSLP), Hong Kong, December 2012, pp. 150-154.
  • Deyi Xiong, Min Zhang and Haizhou Li, “Enhancing Language Models in Statistical Machine Translation with Backward N-grams and Mutual Information Triggers”, in Proc. Annual Meeting of the Association for Computational Linguistics: Human Language Technologies (ACL-HLT), Portland, Oregon, June 2011, pp. 1288-1297.
  • Rafael E. Banchs and Haizhou Li, “AM-FM: A Semantic Framework for Translation Quality Assessment”, in Proc. Annual Meeting of the Association for Computational Linguistics: Human Language Technologies (ACL-HLT), Portland, Oregon, June 2011, pp. 153-158.
  • Wenliang Chen, Junichi Kazama, Min Zhang, Yoshimasa Tsuruoka, Yujie Zhang, Yiou Wang, Kentaro Torisawa, and Haizhou Li, “SMT Helps Bitext Dependency Parsing”, in Proc. Conference on Empirical Methods in Natural Language Processing (EMNLP), Edinburgh, UK, July 2011, pp. 73–83.
  • Zhenghua Li, Min Zhang, Wanxiang Che, Ting Liu, Wenliang Chen and Haizhou Li, “Joint Models for Chinese POS Tagging and Dependency Parsing”, in Proc. Conference on Empirical Methods in Natural Language Processing (EMNLP), Edinburgh, UK, July 2011, pp. 1180-1191.
  • Min Zhang, Xiangyu Duan, Ming Liu, Yunqing Xia and Haizhou Li, “Joint Alignment and Artificial Data Generation: An Empirical Study of Pivot-based Machine Transliteration”, in Proc. Fifth International Joint Conference on Natural Language Processing (IJCNLP), Chiang Mai, Thailand, November 2011, pp. 1207-1215.
  • Guoyu Tang, Yunqing Xia, Min Zhang, Haizhou Li and Fang Zhang, “CLGVSM: Adapting Generalized Vector Space Model to Cross-lingual Document Clustering”, in Proc. Fifth International Joint Conference on Natural Language Processing (IJCNLP), Chiang Mai, Thailand, November 2011, pp. 580–588.

ICASSP

  • Huy Dat Tran and Haizhou Li, “Probabilistic Distance SVM with Hellinger-Exponential Kernel for Sound Event Classification”, in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Prague, Czech Republic, May 2011, pp. 2272-2275.
  • Huy Dat Tran and Haizhou Li, “Jump Function Kolmogorov for Overlapping Audio Event Classification”, in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Prague, Czech Republic, May 2011, pp. 3696-3699.
  • Raymond W. M. Ng, Cheung-Chi Leung, Tan Lee, Bin Ma and Haizhou Li, “Score Fusion and Calibration in Multiple Language Detectors with Large Performance Variation”, in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Prague, Czech Republic, May 2011, pp. 4404-4407.
  • Filip Sedlak, Tomi Kinnunen, Ville Hautamäki, Kong Aik Lee and Haizhou Li, “Classifier Subset Selection and Fusion for Speaker Verification”, in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Prague, Czech Republic, May 2011, pp. 4544-4547.
  • Eryu Wang, Kong Aik Lee, Bin Ma, Haizhou Li, Wu Guo and Li-Rong Dai, “Factored Covariance Modeling for Text-Independent Speaker Verification”, in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Prague, Czech Republic, May 2011, pp. 4856-4859.
  • Xiong Xiao, Jinyu Li, Eng Siong Chng and Haizhou Li, “Maximum Likelihood Adaptation of Histogram Equalization with Constraint for Robust Speech Recognition”, in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Prague, Czech Republic, May 2011, pp. 5480-5483.

INTERSPEECH

  • Kong Aik Lee, Chang Huai You, Ville Hautamäki, Anthony Larcher and Haizhou Li, “Spoken Language Recognition in the Latent Topic Simplex”, in Proc. INTERSPEECH, Florence, Italy, August 2011, pp. 2933-2936.
  • Chang Huai You, Haizhou Li and Kong Aik Lee, “Study on the Relevance Factor of Maximum a Posteriori with GMM for Language Recognition”, in Proc. INTERSPEECH, Florence, Italy, August 2011, pp. 2893-2896.
  • Rong Tong, Bin Ma, Haizhou Li and Eng Siong Chng, “Target-aware Lattice Rescoring for Dialect Recognition”, in Proc. INTERSPEECH, Florence, Italy, August 2011, pp. 733-736.
  • Yiren Leng, Huy Dat Tran, Norihide Kitaoka and Haizhou Li, “Alternative Frequency Scale Cepstral Coefficient for Robust Sound Event Recognition”, in Proc. INTERSPEECH, Florence, Italy, August 2011, pp. 297-300.
  • Kong Aik Lee, Anthony Larcher, Helen Thai, Bin Ma and Haizhou Li, “Joint Application of Speech and Speaker Recognition for Automation and Security in Smart Home”, in Proc. INTERSPEECH, Florence, Italy, August 2011, pp. 3317-3318.
  • Chien-Lin Huang, Bin Ma, Haizhou Li and Chung-Hsien Wu, “Speech Indexing Using Semantic Context Inference”, in Proc. INTERSPEECH, Florence, Italy, August 2011, pp. 717-720.
  • Xiong Xiao, Jinyu Li, Eng Siong Chng and Haizhou Li, “Feature Normalization Using Structured Full Transforms for Robust Speech Recognition”, in Proc. INTERSPEECH, Florence, Italy, August 2011, pp. 693-696.
  • Sethserey Sam, Xiong Xiao, Laurent Besacier, Eric Castelli, Haizhou Li, and Eng Siong Chng, “Speech Modulation Features for Robust Nonnative Speech Accent Detection”, in Proc. INTERSPEECH, Florence, Italy, August 2011, pp. 2417-2420.
  • Jonathan William Dennis, Huy Dat Tran and Haizhou Li, “Image Representation of the Subband Power Distribution for Robust Sound Classification”, in Proc. INTERSPEECH, Florence, Italy, August 2011, pp. 2437-2440.
  • Mimi Lu, Cheung-Chi Leung, Lei Xie, Bin Ma and Haizhou Li, “Probabilistic Latent Semantic Analysis for Broadcast News Story Segmentation”, in Proc. INTERSPEECH,
  • Min Zhang, Hui Zhang, and Haizhou Li, “Convolution Kernel over Packed Parse Forest”, in Proc. Association for Computational Linguistics (ACL), Uppsala, Sweden, July 2010, pp. 875-885.
  • Deyi Xiong, Min Zhang and Haizhou Li, “Error Detection for Statistical Machine Translation Using Linguistic Features”, in Proc. Association for Computational Linguistics (ACL), Uppsala, Sweden, July 2010, pp. 604-611.
  • Xiangyu Duan, Min Zhang and Haizhou Li, “Pseudo-word for Phrase-based Machine Translation”, in Proc. Association for Computational Linguistics (ACL), Uppsala, Sweden, July 2010, pp. 148-156.
  • Deyi Xiong, Min Zhang and Haizhou Li, “Learning Translation Boundaries for Phrase-Based Decoding”, in Proc. North American Chapter of the Association for Computational Linguistics – Human Language Technologies (NAACL-HLT), Los Angeles, CA, June 2010, pp. 136-144.
  • Lianhau Lee, Aiti Aw, Min Zhang and Haizhou Li, “EM-based Hybrid Model for Bilingual Terminology Extraction from Comparable Corpora”, in Proc. International Conference on Computational Linguistics (COLING), Beijing, China, August 2010, pp. 639–646.
  • Vladimir Pervouchine, Min Zhang, Ming Liu and Haizhou Li, “Improving Name Origin Recognition with Context Features and Unlabelled Data”, in Proc. International Conference on Computational Linguistics (COLING), Beijing, China, August 2010, pp. 972–978.
  • Min Zhang, Xiangyu Duan, Vladimir Pervouchine and Haizhou Li, “Machine Transliteration: Leveraging on Third Languages”, in Proc. International Conference on Computational Linguistics (COLING), Beijing, China, August 2010, pp. 1444–1452.

INTERSPEECH

  • Raymond W. M. Ng, Cheung-Chi Leung, Ville Hautamaki, Tan Lee, Bin Ma and Haizhou Li, “Towards Long-Range Prosodic Attribute Modeling for Language Recognition”, in Proc. INTERSPEECH, Makuhari, Japan, September 2010, pp. 1792-1795.
  • Tin Lay Nwe, Hanwu Sun, Bin Ma and Haizhou Li, “Speaker Diarization in Meeting Audio for Single Distant Microphone”, in Proc. INTERSPEECH, Makuhari, Japan, September 2010, pp. 1505-1508.
  • Rong Tong, Bin Ma, Haizhou Li and Eng Siong Chng, “Selecting Phonotactic Features for Language Recognition”, in Proc. INTERSPEECH, Makuhari, Japan, September 2010, pp. 737-740.
  • Omid Dehzangi, Bin Ma, Eng Siong Chng and Haizhou Li, “A Discriminative Performance Metric for GMM-UBM Speaker Identification”, in Proc. INTERSPEECH, Makuhari, Japan, September 2010, pp. 2114-2117.
  • Cheung-Chi Leung, Donglai Zhu, Kong-Aik Lee, Bin Ma and Haizhou Li, “Incorporating MAP Estimation and Covariance Transform for SVM based Speaker Recognition”, in Proc. INTERSPEECH, Makuhari, Japan, September 2010, pp. 2318-2321.
  • Chien-Lin Huang, Hanwu Sun, Bin Ma and Haizhou Li, “Speaker Characterization Using Long-Term and Temporal Information”, in Proc. INTERSPEECH, Makuhari, Japan, September 2010, pp. 370-373.
  • Xiaoxuan Wang, Lei Xie, Bin Ma, Eng Siong Chng and Haizhou Li, “Phoneme Lattice based TextTiling towards Multilingual Story Segmentation”, in Proc. INTERSPEECH, Makuhari, Japan, September 2010, pp. 1305-1308.
  • Eryu Wang, Kong-Aik Lee, Bin Ma, Haizhou Li, Wu Guo and Lirong Dai, “The Estimation and Kernel Metric of Spectral Correlation for Text-Independent Speaker Verification”, in Proc. INTERSPEECH, Makuhari, Japan, September 2010, pp. 1065-1068.
  • Donglai Zhu, Bin Ma, Kong-Aik Lee, Cheung-Chi Leung and Haizhou Li, “MAP Estimation of Subspace Transform for Speaker Recognition”, in Proc. INTERSPEECH, Makuhari, Japan, September 2010, pp. 1465-1468.
  • Hanwu Sun, Bin Ma, Chien-Lin Huang, Trung Hieu Nguyen and Haizhou Li, “The IIR NIST SRE 2008 and 2010 Summed Channel Speaker Recognition Systems”, in Proc. INTERSPEECH, Makuhari, Japan, September 2010, pp. 366-369.
  • Ville Hautamaki, Tomi Kinnunen, Mohaddeseh Nosratighods, Kong-Aik Lee, Bin Ma and Haizhou Li, “Approaching Human Listener Accuracy with Modern Speaker Verification”, in Proc. INTERSPEECH, Makuhari, Japan, September 2010, pp. 1473-1476.
  • Minghui Dong, Paul Chan, Ling Cen, Haizhou Li, Jason Teo and Ping Jen Kua, “Phonetic Segmentation of Singing Voice using MIDI and Parallel Speech”, in Proc. INTERSPEECH, Makuhari, Japan, September 2010, pp. 2890-2893.
  • You Changhuai, Li Haizhou and Kong-Aik Lee, “A Hybrid Modeling Strategy for GMM-SVM Speaker Recognition System with Adaptive Relevance Factor”, in Proc. INTERSPEECH, Makuhari, Japan, September 2010, pp. 2746-2749.
  • Leng Yi Ren, Tran Huy Dat, Norihide Kitaoka and Li Haizhou, “Selective Gammatone Filterbank Feature for Robust Sound Event Recognition”, in Proc. INTERSPEECH, Makuhari, Japan, September 2010, pp. 2246-2249.
  • Zhi-Zheng Wu, Tomi Kinnunen, Eng Siong Chng and Haizhou Li, “Text-Independent F0 Transformation with Non-Parallel Data for Voice Conversion”, in Proc. INTERSPEECH, Makuhari, Japan, September 2010, pp. 1732-1735.
  • Dau-Cheng Lyu, Tien-Ping Tan, Eng-Siong Chng and Haizhou Li, “SEAME: a Mandarin-English Code-switching Speech Corpus in South-East Asia”, in Proc. INTERSPEECH, Makuhari, Japan, September 2010, pp. 1986-1989.

ICASSP

  • Dat Tran Huy, Yi Ren Leng, and Haizhou Li, “Feature Integration for Heart Sound Biometrics”, in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Dallas, USA, March 2010, pp. 1714-1717.
  • Omid Dehzangi, Bin Ma, Eng Siong Chng, and Haizhou Li, “Error Corrective Classifier Fusion for Spoken Language Recognition”, in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Dallas, USA, March 2010, pp. 1994-1997.
  • C. P. Santhosh Kumar, Haizhou Li, Rong Tong, Pavel Matejka, Lukas Burget, and Jan Cernocky, “Tuning Phone Decoders for Language Identification”, in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Dallas, USA, March 2010, pp. 5010-5013.
  • Hanwu Sun, Bin Ma, Swe Zin Kalayar Khine, and Haizhou Li, “Speaker Diarization System for RT07 and RT09 Meeting Room Audio”, in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Dallas, USA, March 2010, pp. 4982-4985.
  • Yu Tsao, Hanwu Sun, Haizhou Li, and Chin-Hui Lee, “An Acoustic Segment Model Approach to Incorporating Temporal Information into Speaker Modeling for Text-Independent Speaker Recognition”, in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Dallas, USA, March 2010, pp. 4422-4425.
  • Donglai Zhu, Bin Ma, and Haizhou Li, “Soft Margin Estimation of Gaussian Mixture Model Parameters for Spoken Language Recognition”, in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Dallas, USA, March 2010, pp. 4990-4993.
  • Shuanhu Bai, Chien-Lin Huang, Bin Ma, and Haizhou Li, “Semi-Supervised Learning of Language Model using Unsupervised Topic Model”, in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Dallas, USA, March 2010, pp. 5386-5389.
  • Raymond W. M. Ng, Cheung-Chi Leung, Tan Lee, Bin Ma, and Haizhou Li, “Prosodic Attribute Model for Spoken Language Identification”, in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Dallas, USA, March 2010, pp. 5022-5025.
  • Vladimir Pervouchine, Haizhou Li, and Bo Lin, “Transliteration Alignment”, in Proc. 47th Annual Meeting of Association for Computational Linguistics and the 4th International Joint Conference of Natural Language Processing (ACL-IJCNLP), Singapore, August 2009, pp. 136–144.
  • Deyi Xiong, Min Zhang, Aiti Aw and Haizhou Li, “A Syntax-Driven Bracketing Model for Phrase-Based Translation”, in Proc. 47th Annual Meeting of Association for Computational Linguistics and the 4th International Joint Conference of Natural Language Processing (ACL-IJCNLP), Singapore, August 2009, pp. 315–323.
  • Hendra Setiawan, Min Yen Kan, Haizhou Li, and Philip Resnik, “Topological Ordering of Function Words in Hierarchical Phrase-based Translation”, in Proc. 47th Annual Meeting of Association for Computational Linguistics and the 4th International Joint Conference of Natural Language Processing (ACL-IJCNLP), Singapore, August 2009, pp. 324–332.
  • Hui Zhang, Min Zhang, Haizhou Li, Aiti Aw, and Chew Lim Tan, “Forest-based Tree Sequence to String Translation Model”, in Proc. 47th Annual Meeting of Association for Computational Linguistics and the 4th International Joint Conference of Natural Language Processing (ACL-IJCNLP), Singapore, August 2009, pp. 172–180.
  • Boxing Chen, Min Zhang, Haizhou Li, and Aiti Aw, “A Comparative Study of Hypothesis Alignment and its Improvement for Machine Translation System Combination”, in Proc. 47th Annual Meeting of Association for Computational Linguistics and the 4th International Joint Conference of Natural Language Processing (ACL-IJCNLP), Singapore, August 2009, pp. 941–948.
  • Min Zhang and Haizhou Li, “Tree Kernel-based SVM with Structured Syntactic Knowledge for BTG-based Phrase Reordering”, in Proc. Empirical Methods in Natural Language Processing (EMNLP), Singapore, August 2009, pp. 698–707.
  • Hui Zhang, Min Zhang, Haizhou Li, and Chew Lim Tan, “Fast Translation Rule Matching for Syntax-based Statistical Machine Translation”, in Proc. Empirical Methods in Natural Language Processing (EMNLP), Singapore, August 2009, pp. 1037–1045.
  • Hui Zhang, Min Zhang, Chew Lim Tan, and Haizhou Li, “K-Best Combination of Syntactic Parsers”, in Proc. Empirical Methods in Natural Language Processing (EMNLP), Singapore, August 2009, pp. 1552–1560.

INTERSPEECH

  • Rong Tong, Bin Ma, Haizhou Li, Eng Siong Chng, and Kong-Aik Lee, “Target-Aware Language Models for Spoken Language Recognition”, in Proc. INTERSPEECH, Brighton, UK, September 2009, pp. 200-203.
  • Hanwu Sun, Tin Lay Nwe, Bin Ma, and Haizhou Li, “Speaker Diarization for Meeting Room Audio”, in Proc. INTERSPEECH, Brighton, UK, September 2009, pp. 900-903.
  • Ling Cen, Minghui Dong, Paul Chan, and Haizhou Li, “Unit Selection Based Speech Synthesis for Poor Channel Condition”, in Proc. INTERSPEECH, Brighton, UK, September 2009, pp. 2075-2078.
  • Donglai Zhu, Bin Ma, and Haizhou Li, “Large Margin Estimation of Gaussian Mixture Model Parameters with Extended Baum-Welch for Spoken Language Recognition”, in Proc. INTERSPEECH, Brighton, UK, September 2009, pp. 2179-2182.
  • Omid Dehzangi, Bin Ma, Eng Siong Chng, and Haizhou Li, “Discriminative Feature Transformation Using Output Coding for Speech Recognition”, in Proc. INTERSPEECH, Brighton, UK, September 2009, pp. 2979-2982.
  • Khe Chai Sim and Haizhou Li, “Stream-Based Context-Sensitive Phone Mapping for Cross-Lingual Speech Recognition”, in Proc. INTERSPEECH, Brighton, UK, September 2009, pp. 3019-3022.

ICASSP

  • Yanhua Long, Bin Ma, Haizhou Li, Wu Guo, Eng Siong Chng, and Lirong Dai, “Exploiting Prosodic Information for Speaker Recognition”, in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Taipei, Taiwan, April 2009, pp. 4225-4228.
  • Chang Huai You, Kong Aik Lee, and Haizhou Li, “A GMM Supervector Kernel with the Bhattacharyya Distance for SVM based Speaker Recognition”, in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Taipei, Taiwan, April 2009, pp. 4221-4224.
  • Mohaddeseh Nosratighods, Tharmarajah Thiruvaran, Julien Epps, Eliathamby Ambikairajah, Bin Ma, and Haizhou Li, “Evaluation of a Fused FM and Cepstral-Based Speaker Recognition System on the NIST 2008 SRE”, in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Taipei, Taiwan, April 2009, pp. 4233-4236.
  • Hanwu Sun, Bin Ma, and Haizhou Li, “Cross-Validation of Multiple Language Recognition Systems using Pseudo Keys”, in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Taipei, Taiwan, April 2009, pp. 4353-4356.
  • Haizhou Li, Bin Ma, Kong-Aik Lee, Hanwu Sun, Donglai Zhu, Khe Chai Sim, Changhuai You, Rong Tong, Ismo Karkkainen, Chien-Lin Huang, Vladimir Pervouchine, Wu Guo, Yijie Li, Lirong Dai, Mohaddeseh Nosratighods, Thiruvaran Tharmarajah, Julien Epps, Eliathamby Ambikairajah, Eng-Siong Chng, Tanja Schultz, and Qin Jin, “The I4U System in NIST 2008 Speaker Recognition Evaluation”, in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Taipei, Taiwan, April 2009, pp. 4201-4204.
  • Donglai Zhu, Bin Ma, and Haizhou Li, “Joint MAP Adaptation of Feature Transformation and Gaussian Mixture Model for Speaker Recognition”, in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Taipei, Taiwan, April 2009, pp. 4045-4048.
  • Tran Huy Dat and Haizhou Li, “Sound Event Classification based on Feature Integration, Recursive Feature Elimination and Structured Classification”, in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Taipei, Taiwan, April 2009, pp. 177-180.
  • Trung Hieu Nguyen, Eng Siong Chng, and Haizhou Li, “Clustering Criterion Functions in Spectral Subspace and Their Application in Speaker Clustering”, in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Taipei, Taiwan, April 2009, pp. 4085-4088.
  • Tin Lay Nwe, Hanwu Sun, Haizhou Li, and Susanto Rahardja, “Speaker Diarization in Meeting Audio”, in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Taipei, Taiwan, April 2009, pp. 4073-4076.
  • Min Zhang, Hongfei Jiang, Aiti Aw, Haizhou Li, Chew Lim Tan, and Sheng Li, “A Tree Sequence Alignment-based Tree-to-Tree Translation Model”, in Proc. Annual Meeting of the Association for Computational Linguistics with the Human Language Technology Conference (ACL-HLT), Columbus, Ohio, June 2008, pp. 559–567.
  • Deyi Xiong, Min Zhang, Aiti Aw, and Haizhou Li, “A Linguistically Annotated Reordering Model for BTG-based Statistical Machine Translation”, in Proc. Annual Meeting of the Association for Computational Linguistics with the Human Language Technology Conference (ACL-HLT), Columbus, Ohio, June 2008, pp. 149–152.
  • Boxing Chen, Min Zhang, Aiti Aw, and Haizhou Li, “Exploiting N-best Hypotheses for SMT Self-Enhancement”, in Proc. Annual Meeting of the Association for Computational Linguistics with the Human Language Technology Conference (ACL-HLT), Columbus, Ohio, June 2008, pp. 157–160.
  • Jin-Shea Kuo and Haizhou Li, “Multi-View Co-Training of Transliteration Model”, in Proc. International Joint Conference on Natural Language Processing (IJCNLP), Hyderabad, India, January 2008, pp. 373-380.
  • Min Zhang, Chengjie Sun, Haizhou Li, Aiti Aw, and Chew Lim Tan, “Name Origin Recognition Using Maximum Entropy Model and Diverse Features”, in Proc. International Joint Conference on Natural Language Processing (IJCNLP), Hyderabad, India, January 2008, pp. 56-63.
  • Jin-Shea Kuo, Haizhou Li, and Chih-Lung Lin, “Mining Transliterations from Web Query Results: An Incremental Approach”, in Proc. Sixth SIGHAN Workshop on Chinese Language Processing (SIGHAN), Hyderabad, India, January 2008, pp. 16-23.
  • Min Zhang, Hongfei Jiang, Haizhou Li, Aiti Aw, and Sheng Li, “Grammar Comparison Study for Translational Equivalence Modeling and Statistical Machine Translation”, in Proc. COLING 2008, Manchester, UK, August 2008.
  • Boxing Chen, Min Zhang, Aiti Aw, and Haizhou Li, “Regenerating Hypotheses for Statistical Machine Translation”, in Proc. COLING 2008, Manchester, UK, August 2008.
  • Deyi Xiong, Min Zhang, Aiti Aw and Haizhou Li, “Linguistically Annotated BTG for Statistical Machine Translation”, in Proc. International Conference on Computational Linguistics (COLING), Manchester, UK, August 2008, pp. 1009–1016.
  • Tee Kiah Chia, Khe Chai Sim, Haizhou Li and Hwee Tou Ng, “A Lattice-Based Approach to Query-by-Example Spoken Document Retrieval”, in Proc. 31st Annual International ACM SIGIR Conference on Research & Development on Information Retrieval, Singapore, July 2008, pp. 363-370.
  • Chien-Lin Huang, Chung-Hsien Wu, Chia-Hsin Hsieh, Haizhou Li, and Bin Ma, “Unsupervised Pronunciation Grammar Growing using Knowledge-based and Data-Driven Approaches”, in Proc. IEEE International Conference on Multimedia & Expo (ICME), Hannover, Germany, June 2008, pp. 1097-1100.
  • Chang Huai You, Susanto Rahardja, and Haizhou Li, “Speech Enhancement for Telephony Name Speech Recognition”, in Proc. IEEE International Conference on Multimedia & Expo (ICME), Hannover, Germany, June 2008, pp. 973-976.
  • Boxing Chen, Deyi Xiong, Min Zhang, Aiti Aw, and Haizhou Li, “I2R Multi-Pass Machine Translation System for IWSLT 2008”, in Proc. International Workshop on Spoken Language Translation (IWSLT), Hawaii, USA, 2008, pp. 46-51.
  • Omid Dehzangi, Bin Ma, Eng Siong Chng, and Haizhou Li, “Fuzzy Rule Selection using Iterative Rule Learning for Speech Data Classification”, in Proc. International Conference on Pattern Recognition (ICPR), Tampa, Florida, December 2008, pp. 1-4.
  • Eugene Chin Wei Koh, Hanwu Sun, Tin Lay Nwe, Trung Hieu Nguyen, Bin Ma, Eng-Siong Chng, Haizhou Li, and Susanto Rahardja, “Speaker Diarization Using Direction of Arrival Estimate and Acoustic Feature Information: The I2R-NTU Submission for the NIST RT 2007 Evaluation”, in Lecture Notes in Computer Science, Vol. 4625, Multimodal Technologies for Perception of Humans, Springer 2008, pp. 484-496.

INTERSPEECH

  • Rong Tong, Bin Ma, Haizhou Li, and Eng-Siong Chng, “Target-Oriented Phone Selection from Universal Phone Set for Spoken Language Recognition”, in Proc. INTERSPEECH, Brisbane, Australia, September 2008, pp. 715-718.
  • Donglai Zhu, Bin Ma, and Haizhou Li, “Using MAP Estimation of Feature Transformation For Speaker Recognition”, in Proc. INTERSPEECH, Brisbane, Australia, September 2008, pp. 849-852.
  • Chien-Lin Huang, Bin Ma, Chung-Hsien Wu, Brian Mak, and Haizhou Li, “Robust Speaker Verification Using Short-Time Frequency with Long-Time Window and Fusion of Multi-Resolutions”, in Proc. INTERSPEECH, Brisbane, Australia, September 2008, pp. 1897-1900.
  • Tin Lay Nwe, Minghui Dong, Swe Zin Kalayar Khine, and Haizhou Li, “Multi-Speaker Meeting Audio Segmentation”, in Proc. INTERSPEECH, Brisbane, Australia, September 2008, pp. 2522-2525.
  • Swe Zin Kalayar Khine, Tin Lay Nwe, and Haizhou Li, “Speech/Laughter Classification in Meeting Audio”, in Proc. INTERSPEECH, Brisbane, Australia, September 2008, pp. 793-796.
  • Tran Huy Dat and Haizhou Li, “Speaker Identification in Noise Mismatch Conditions based on Jump Function Kolmogorov Analysis in Wavelet Domain”, in Proc. INTERSPEECH, Brisbane, Australia, September 2008, pp. 1469-1472.
  • Kong-Aik Lee, Changhuai You, Haizhou Li, Tomi Kinnunen, and Donglai Zhu, “Characterizing Speech Utterances for Speaker Verification with Sequence Kernel SVM”, in Proc. INTERSPEECH, Brisbane, Australia, September 2008, pp. 1397-1400.
  • Namunu Maddage and Haizhou Li, “Rhythm Based Music Segmentation and Octave Scale Cepstral Features for Sung Language Recognition”, in Proc. INTERSPEECH, Brisbane, Australia, September 2008, pp. 2526-2529.
  • Trung Hieu Nguyen, Eng Siong Chng, and Haizhou Li, “T-Test Distance and Clustering Criterion for Speaker Diarization”, in Proc. INTERSPEECH, Brisbane, Australia, September 2008, pp. 36-39.
  • Khe Chai Sim and Haizhou Li, “Context-sensitive Probabilistic Phone Mapping Model for Cross-lingual Speech Recognition”, in Proc. INTERSPEECH, Brisbane, Australia, September 2008, pp. 2715-2718.

ICASSP

  • Rong Tong, Bin Ma, Haizhou Li, and Eng Siong Chng, “Target-Oriented Phone Tokenizers for Spoken Language Recognition”, in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Las Vegas, Nevada, March-April 2008, pp. 4221-4224.
  • Donglai Zhu, Haizhou Li, Bin Ma, and Chin-Hui Lee, “Discriminative Learning for Optimizing Detection Performance in Spoken Language Recognition”, in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Las Vegas, Nevada, March-April 2008, pp. 4161-4164.
  • Tin Lay Nwe and Haizhou Li, “On Fusion of Timbre-Motivated Features for Singing Voice Detection And Singer Identification”, in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Las Vegas, Nevada, March-April 2008, pp. 2225-2228.
  • Swe Zin Kalayar Khine, Tin Lay Nwe, and Haizhou Li, “Singing Voice Detection In Pop Songs Using Co-Training Algorithm”, in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Las Vegas, Nevada, March-April 2008, pp. 1629-1632.
  • Khe Chai Sim and Haizhou Li, “Robust Phone Set Mapping Using Decision Tree Clustering for Cross-Lingual Phone Recognition”, in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Las Vegas, Nevada, March-April 2008, pp. 4309-4312.
  • Kong-Aik Lee, Changhuai You, and Haizhou Li, “Spoken Language Recognition Using Support Vector Machines with Generative Front-End”, in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Las Vegas, Nevada, March-April 2008, pp. 4153-4156.
  • Tran Huy Dat and Haizhou Li, “Jump Function Kolmogorov and Its Application for Audio Stream”, in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Las Vegas, Nevada, March-April 2008, pp. 3353-3356.
  • Haizhou Li, Khe Chai Sim, Jin-Shea Kuo, and Minghui Dong, “Semantic Transliteration of Personal Names”, in Proc. Association for Computational Linguistics (ACL), Prague, Czech Republic, June 2007, pp. 120-127.
  • Hendra Setiawan, Min-Yen Kan, and Haizhou Li, “Ordering Phrases with Function Words”, in Proc. Association for Computational Linguistics (ACL), Prague, Czech Republic, June 2007, pp. 712-719.
  • Tee Kiah Chia, Haizhou Li, and Hwee Tou Ng, “A Statistical Language Modeling Approach to Lattice-based Spoken Document Retrieval”, in Proc. Joint Meeting Conference on Empirical Methods in Natural Language Processing, and Conference on Computational Natural Language Learning (EMNLP-CoNLL), Prague, Czech Republic, June 2007, pp. 810–818.
  • Tin Lay Nwe and Haizhou Li, “Singing Voice Detection using Perceptually-Motivated Features”, in Proc. ACM Annual Conference on Multimedia (ACM), Augsburg, Germany, September 2007, pp. 309-312.
  • Lei Wang, Eng Siong Chng, and Haizhou Li, “A vector-based approach to broadcast audio database indexing and retrieval”, in Proc. IEEE International Conference on Multimedia and Expo (ICME), Beijing, China, July 2007, pp. 512-515.

ICASSP

  • Bin Ma, Rong Tong, and Haizhou Li, “Discriminative Vector for Spoken Language Recognition”, in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Hawaii, USA, April 2007, pp. 1001-1004.
  • Rong Tong, Haizhou Li, Bin Ma, and Eng Siong Chng, “Spoken Language Recognition with Relevance Feedback”, in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Hawaii, USA, April 2007, pp. 861-864.
  • Donglai Zhu, Bin Ma, Haizhou Li, and Qiang Huo, “A Generalized Feature Transformation Approach for Channel Robust Speaker Verification”, in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Hawaii, USA, April 2007, pp. 61-64.
  • Xiong Xiao, Eng Siong Chng, and Haizhou Li, “Normalizing the Speech Modulation Spectrum for Robust Speech Recognition”, in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Hawaii, USA, April 2007, pp. 1021-1024.

INTERSPEECH

  • Kong Aik Lee, Changhuai You, Haizhou Li, and Tomi Kinnunen, “A GMM-based Probabilistic Sequence Kernel for Speaker Verification”, in Proc. INTERSPEECH, Antwerp, Belgium, August 2007, pp. 294-297.
  • Eugene Chin Wei Koh, Hanwu Sun, Tin Lay Nwe, Trung Hieu Nguyen, Bin Ma, Eng-Siong Chng, Haizhou Li, and Susanto Rahardja, “Using Direction of Arrival Estimate and Acoustic Feature Information in Speaker Diarization”, in Proc. INTERSPEECH, Antwerp, Belgium, August 2007, pp. 2149-2152.
  • Khe Chai Sim and Haizhou Li, “Fusion of Contrastive Acoustic Models for Parallel Phonotactic Spoken Language Identification”, in Proc. INTERSPEECH, Antwerp, Belgium, August 2007, pp. 170-173.
  • Xiong Xiao, Eng Siong Chng, and Haizhou Li, “Evaluating the Temporal Structure Normalisation Technique on the Aurora-4 Task”, in Proc. INTERSPEECH, Antwerp, Belgium, August 2007, pp. 1070-1073.
  • Jin-Shea Kuo, Haizhou Li, and Ying-Kuei Yang, “Learning Transliteration Lexicons from the Web”, in Proc. Association for Computational Linguistics (COLING-ACL), Sydney, Australia, July 2006, pp. 1129–1136.
  • Namunu Maddage, Haizhou Li, and Mohan Kankanhalli, “Music Structure-based Vector Space Retrieval”, in Proc. Annual International ACM SIGIR Conference on Research & Development on Information Retrieval (SIGIR), Seattle, Washington, August 2006, pp. 67-74.
  • Denny Iskandar, Ye Wang, Min-Yen Kan, and Haizhou Li, “Syllabic Level Automatic Synchronization of Music Signals and Text Lyrics”, in Proc. ACM Multimedia Conference, Santa Barbara, USA, October 2006, pp. 659-662.
  • Namunu C Maddage, Mohan S. Kankanhalli, and Haizhou Li, “A Hierarchical Approach for Music Chord Modeling based on the Analysis of Tonal Characteristics”, in Proc. IEEE International Conference on Multimedia and Expo (ICME), Toronto, Canada, July 2006.
  • Jinyu Li, Sibel Yaman, Chin-Hui Lee, Bin Ma, Rong Tong, Donglai Zhu, and Haizhou Li, “Language Recognition Based on Score Distribution Feature Vectors and Discriminative Classifier Fusion”, in Proc. IEEE Odyssey 2006 – The Speaker and Language Recognition Workshop, San Juan, Puerto Rico, June 2006, pp. 1-5.

ICASSP

  • Shuanhu Bai and Haizhou Li, “Bayesian Learning of N-gram Statistical Language Modeling”, in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Toulouse, France, May 2006, pp. I-I.
  • Haizhou Li and Tin Lay Nwe, “Vibrato-Motivated Acoustic Features for Singer Identification”, in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Toulouse, France, May 2006, pp. V-V.
  • Rong Tong, Bin Ma, Donglai Zhu, Haizhou Li, and Eng Siong Chng, “Integrating Acoustic, Prosodic and Phonotactic Features for Spoken Language Identification”, in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Toulouse, France, May 2006, pp. I-I.

INTERSPEECH

  • Tin Lay Nwe, Haizhou Li, and Minghui Dong, “Analysis and Detection of Speech under Sleep Deprivation”, in Proc. INTERSPEECH, Pittsburgh, USA, September 2006, pp. 1846-1849.
  • Haizhou Li, Bin Ma, and Rong Tong, “Vector-Based Spoken Language Recognition using Output Coding”, in Proc. INTERSPEECH, Pittsburgh, USA, September 2006.
  • Minghui Dong, Haizhou Li, and Tin Lay Nwe, “Evaluating Prosody of Mandarin Speech for Language Learning”, in Proc. INTERSPEECH, Pittsburgh, USA, September 2006, pp. 429-432.
  • Bin Ma, Donglai Zhu, Rong Tong, and Haizhou Li, “Speaker Cluster-based GMM Tokenization for Speaker Recognition”, in Proc. INTERSPEECH, Pittsburgh, USA, September 2006, pp. 505-508.
  • Min Zhang, Haizhou Li, Jian Su, and Hendra Setiawan, “A Phrase-based Context-dependent Joint Probability”, in Proc. International Joint Conference on Natural Language Processing (IJCNLP), Jeju, South Korea, October 2005, pp. 600-611.
  • Hendra Setiawan, Haizhou Li, Min Zhang, and Beng Chin Ooi, “Phrase-based Statistical Machine Translation: A Level of Detail Approach”, in Proc. International Joint Conference on Natural Language Processing (IJCNLP), Jeju, South Korea, October 2005, pp. 576-587.
  • Haizhou Li and Bin Ma, “A Phonotactic Language Model for Spoken Language Identification”, in Proc. Association for Computational Linguistics (ACL), Ann Arbor, USA, June 2005, pp. 515-522.
  • Bin Ma and Haizhou Li, “A Phonotactic-Semantic Paradigm for Automatic Spoken Document Classification”, in Proc. International ACM SIGIR Conference (SIGIR), Salvador, Brazil, August 2005, pp. 369-376.
  • Minghui Dong, Kim Teng Lua, and Haizhou Li, “A Unit Selection based Speech Synthesis Approach for Chinese Mandarin Text-to-Speech”, in Proc. of the International Conference on Chinese Computing (ICCC), Singapore, March 2005, pp. 135-144.
  • Bin Ma and Haizhou Li, “Spoken Language Identification Using Bag-of-Sounds”, in Proc. International Conference on Chinese Computing 2005 (ICCC 2005), Singapore, March 2005.
  • Manickam K and Haizhou Li, “Complexity Analysis of Normal and Deaf Infant Cry Acoustic Waves”, in Proc. International Workshop on Model and Analysis of Vocal Emission for Biomedical Applications (MAVEBA 2005), Florence, Italy, 2005.
  • Boon Pang Lim, Bin Ma, and Haizhou Li, “Using Semantic Context to Improve Voice Keyword Mining”, in Proc. International Conference on Chinese Computing 2005 (ICCC 2005), Singapore, March 2005.

ICASSP

  • Tin Lay Nwe and Haizhou Li, “Broadcast News Segmentation by Audio Type Analysis”, in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Philadelphia, PA, March 2005, pp. 1065-1068.
  • Boon Pang Lim, Haizhou Li, and Bin Ma, “Using Local and Global Phonotactical Features in Chinese Dialect Identification”, in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Philadelphia, PA, March 2005, pp. 577-580.

INTERSPEECH

  • Santhosh C. Kumar, V.P. Mohandas, and Haizhou Li, “Multilingual Speech Recognition: A Unified Approach”, in Proc. INTERSPEECH, Lisboa, Portugal, September 2005, pp. 3357-3360.
  • Tin Lay Nwe and Haizhou Li, “Identifying Singers of Popular Songs”, in Proc. INTERSPEECH, Lisboa, Portugal, September 2005, pp. 129-132.
  • Minghui Dong, Kim-Teng Lua, and Haizhou Li, “A Probabilistic Approach to Prosodic Word Prediction for Mandarin Chinese TTS”, in Proc. INTERSPEECH, Lisboa, Portugal, September 2005, pp. 3245-3248.
  • Sheng Gao, Bin Ma, Haizhou Li, and Chin-Hui Lee, “A Text Categorization Approach to Automatic Language Identification”, in Proc. INTERSPEECH, Lisboa, Portugal, September 2005, pp. 2837-2840.
  • Bin Ma, Haizhou Li, and Chin-Hui Lee, “An Acoustic Segment Modeling Approach to Automatic Language Identification”, in Proc. INTERSPEECH, Lisboa, Portugal, September 2005, pp. 2829-2832.
  • Haizhou Li, Min Zhang, and Jian Su, “A Joint Source-Channel Model for Machine Transliteration”, in Proc. Association for Computational Linguistics (ACL), Barcelona, Spain, July 2004, pp. 160-167.
  • Min Zhang, Haizhou Li, and Jian Su, “Direct Orthographical Mapping for Machine Transliteration”, in Proc. International Conference on Computational Linguistics (COLING), Geneva, Switzerland, August 2004.
  • Boon Pang Lim, Haizhou Li, and Yu Chen, “Language Identification through Large Vocabulary Continuous Speech Recognition”, in Proc. International Symposium on Chinese Spoken Language Processing (ISCSLP), Hong Kong, December 2004.
  • Yeow Kee Tan, Boon Seong Teoh, and Haizhou Li, “A Grapheme to Phoneme Conversion for Standard Malay”, in Proc. International Conference on Speech and Language System for Human Communication and Workshop on Oriental COCOSDA (ICSLT-OCOCOSDA), New Delhi, India, November 2004.
  • C. S. Kumar and Haizhou Li, “Language Identification System for Multilingual Speech Recognition Systems”, in Proc. International Conference Speech and Computer (SPECOM), St. Petersburg, Russia, September 2004.

INTERSPEECH

  • Jun Xu, Guohong Fu, and Haizhou Li, “Grapheme-to-Phoneme Conversion for Chinese Text-to-Speech”, in Proc. INTERSPEECH, Jeju Island, Korea, October 2004.