Neuromorphic Computing

For many real-world pattern learning and classification tasks, today’s digital computers cannot compete with the human brain in terms of energy and computational efficiency. One reason is that conventional computer architectures differ fundamentally from our neural systems, in which vast numbers of nerve cells communicate in parallel through action potentials, which we call ‘spikes’. Spiking neural networks (SNNs) are biologically inspired and grounded in a solid scientific framework, and they aim to capture some of the advantages of biological systems; it has been a general belief that SNNs are the computational route to brain-like performance. Research on the theory and implementation of SNNs therefore forms the foundation of neuromorphic computing.
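To make the notion of spike-based computation concrete, the sketch below simulates a leaky integrate-and-fire (LIF) neuron, one of the simplest spiking neuron models: the membrane potential integrates input current, leaks toward rest, and emits a spike whenever it crosses a threshold. This is a minimal illustrative sketch in Python, not the specific neuron model developed in this project; all parameter values and names are assumptions for the example.

```python
import numpy as np

def lif_neuron(input_current, dt=1.0, tau_m=20.0, v_rest=0.0,
               v_thresh=1.0, v_reset=0.0):
    """Minimal leaky integrate-and-fire neuron (illustrative parameters).

    Euler-integrates dv/dt = (-(v - v_rest) + I(t)) / tau_m and emits a
    spike (1.0) whenever v crosses v_thresh, after which v is reset.
    """
    v = v_rest
    spikes = np.zeros(len(input_current))
    for t, i_in in enumerate(input_current):
        v += dt * (-(v - v_rest) + i_in) / tau_m  # leaky integration
        if v >= v_thresh:   # threshold crossing: fire and reset
            spikes[t] = 1.0
            v = v_reset
    return spikes

# A constant suprathreshold current yields a regular spike train.
spikes = lif_neuron(np.full(200, 1.5))
print(int(spikes.sum()), "spikes in 200 time steps")
```

In a network, many such neurons run in parallel, and information is carried by the timing of these discrete spikes rather than by continuous activations.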

Building on research outcomes from the past decades, this project focuses on developing a realistic, hardware-implementable spiking neural network framework by modeling the following parts of the brain: synapses (plasticity and learning), the visual cortex (encoding and recognition), and the hippocampus (memory and synthesis). The framework will include neural encoding, precise-spike-driven learning for synaptic plasticity, neuron models that capture analog memory element characteristics, and the classification and synthesis of spatiotemporal patterns.
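As one concrete illustration of two of these components, the sketch below pairs a simple time-to-first-spike (latency) encoder with a Widrow-Hoff-style, spike-driven weight update in the spirit of the precise-spike-driven (PSD) family of rules: the error between desired and actual output spike trains is correlated with an exponentially filtered trace of the presynaptic spikes. The kernel, constants, and function names here are assumptions for illustration, not the project's exact formulation.

```python
import numpy as np

def latency_encode(x, t_max=100):
    """Time-to-first-spike encoding: stronger inputs in [0, 1] fire
    earlier (illustrative linear latency code)."""
    return np.rint(t_max * (1.0 - np.clip(x, 0.0, 1.0))).astype(int)

def psd_like_update(pre_spikes, post_actual, post_desired,
                    lr=0.01, tau_s=5.0, dt=1.0):
    """One pass of a spike-driven, Widrow-Hoff-style update (assumed form).

    pre_spikes:   (n_steps, n_syn) binary presynaptic spike trains
    post_actual:  (n_steps,) actual output spike train
    post_desired: (n_steps,) desired output spike train
    """
    n_steps, n_syn = pre_spikes.shape
    trace = np.zeros(n_syn)      # exponentially filtered presynaptic trace
    dw = np.zeros(n_syn)
    decay = np.exp(-dt / tau_s)
    for t in range(n_steps):
        trace = trace * decay + pre_spikes[t]
        err = post_desired[t] - post_actual[t]  # +1: missing spike, -1: spurious
        dw += lr * err * trace   # adjust recently active synapses
    return dw

# Encode two analog features as spike times, then build spike trains.
times = latency_encode(np.array([0.9, 0.2]))  # strong input spikes early
pre = np.zeros((100, 2))
pre[times[0], 0] = 1
pre[times[1], 1] = 1
dw = psd_like_update(pre, post_actual=np.zeros(100),
                     post_desired=(np.arange(100) == 30).astype(float))
print(dw)  # only the synapse active before the desired spike at t=30 is potentiated
```

Under such a rule, a missing desired spike potentiates synapses that were recently active, while a spurious output spike depresses them; this is how precise output spike timing can be learned.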

Project Duration: 31 March 2017 – 29 September 2022

Funding Source: RIE2020 Advanced Manufacturing and Engineering Programmatic Grant A1687b0033

PUBLICATIONS

Journal Articles

  • Qu Yang*, Malu Zhang*, Jibin Wu, Kay Chen Tan, Haizhou Li, "LC-TTFS: Towards Lossless Network Conversion for Spiking Neural Networks with TTFS Coding", IEEE Transactions on Cognitive and Developmental Systems 2023, DOI: 10.1109/TCDS.2023.3334010.
  • Siqi Cai, Hongxu Zhu, Tanja Schultz and Haizhou Li, "EEG-based Auditory Attention Detection in Cocktail Party Environment", APSIPA Transactions on Signal and Information Processing, Vol. 12, No. 3, e22, October 2023. http://dx.doi.org/10.1561/116.00000128
  • Xiaoxue Gao, Chitralekha Gupta, Haizhou Li, "Automatic Lyrics Transcription of Polyphonic Music with Lyrics-Chords Multi-Task Learning", IEEE/ACM Transactions on Audio, Speech and Language Processing, Vol. 30, pp. 2280-2294, June 2022.
  • Kun Zhou, Berrak Sisman, Rajib Rana, B.W. Schuller, Haizhou Li, “Emotion Intensity and its Control for Emotional Voice Conversion”, IEEE Transactions on Affective Computing, 2022, DOI: 10.1109/TAFFC.2022.3175578. [link]
  • Jibin Wu, Chenglin Xu, Xiao Han, Daquan Zhou, Malu Zhang, Haizhou Li and Kay Chen Tan, “Progressive Tandem Learning for Pattern Recognition with Deep Spiking Neural Networks”, IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021, DOI: 10.1109/TPAMI.2021.3114196. [link]
  • Chenglin Xu, Wei Rao, Jibin Wu, and Haizhou Li, “Target Speaker Verification with Selective Auditory Attention for Single and Multi-talker Speech”, IEEE / ACM Transactions on Audio, Speech, and Language Processing, July 2021. [link]
  • Xinyuan Qian, Qi Liu, Jiadong Wang, and Haizhou Li, “Three-dimensional Speaker Localization: Audio-refined Visual Scaling Factor Estimation”, IEEE Signal Processing Letters, July 2021. [link]
  • Chen Zhang, Grandee Lee, Luis Fernando D’Haro, and Haizhou Li, “D-score: Holistic Dialogue Evaluation without Reference”, in IEEE/ACM Transactions on Audio, Speech and Language Processing, April 2021. [link]
  • Rui Liu, Berrak Sisman, Guanglai Gao and Haizhou Li, “Expressive TTS Training with Frame and Style Reconstruction Loss”, IEEE/ACM Transactions on Audio, Speech, and Language Processing, April 2021, pp. 1-13. [link]
  • Rui Liu, Berrak Sisman, Yixing Lin and Haizhou Li, “FastTalker: A Neural Text-to-Speech Architecture with Shallow and Group Autoregression”, Neural Networks, April 2021. [link]
  • Mingyang Zhang, Yi Zhou, Li Zhao, and Haizhou Li, “Transfer learning from speech synthesis to voice conversion with non-parallel training data,” in IEEE/ACM Transactions on Audio, Speech, and Language Processing, 29, March 2021, pp. 1290-1302. [link]
  • Jichen Yang, Hongji Wang, Rohan Kumar Das, and Yanmin Qian, “Modified Magnitude-phase Spectrum Information for Spoofing Detection”, in IEEE/ACM Transactions on Audio, Speech and Language Processing, 29, February 2021, pp. 1065-1078. [link]
  • Zhixuan Zhang and Qi Liu, “Spike-event-driven deep spiking neural network with temporal encoding”, IEEE Signal Processing Letters, 28, 2021, pp. 484-488. [link]
  • Berrak Sisman, Junichi Yamagishi, Simon King, and Haizhou Li, “An Overview of Voice Conversion and its Challenges: From Statistical Modeling to Deep Learning”, IEEE/ACM Transactions on Audio, Speech, and Language Processing, 29, 2021, pp. 132-157. [link]
  • Rui Liu, Berrak Sisman, Feilong Bao, Jichen Yang, Guanglai Gao and Haizhou Li, “Exploiting morphological and phonological features to improve prosodic phrasing for Mongolian speech synthesis”, IEEE/ACM Transactions on Audio, Speech, and Language Processing, 29, 2021, pp. 274-285. [link]
  • Qi Liu and Jibin Wu, “Parameter tuning-free missing-feature reconstruction for robust sound recognition”, IEEE Journal of Selected Topics in Signal Processing, 15(1), January 2021, pp. 78-89. [link]
  • Yi Zhou, Xiaohai Tian and Haizhou Li, “Multi-Task WaveRNN with an Integrated Architecture for Cross-lingual Voice Conversion”, IEEE Signal Processing Letters, 27, 2020, pp 1310-1314. [link]
  • Mingyang Zhang, Berrak Sisman, Li Zhao and Haizhou Li, “DeepConversion: Voice conversion with limited parallel training data”, Speech Communication, 122, 2020, pp. 31-43. [link]

Conference Articles

  • Siqi Cai, Jia Li, Hongmeng Yang, and Haizhou Li, "RGCnet: An Efficient Recursive Gated Convolutional Network for EEG-based Auditory Attention Detection", in Proc. Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), Sydney, Australia, July 2023.
  • Qu Yang, Qi Liu, Haizhou Li, "Deep Residual Spiking Neural Network for Keyword Spotting in Low-Resource Settings", in Proc. INTERSPEECH, Incheon, Korea, September 2022.
  • Rui Liu, Berrak Sisman, Björn W. Schuller, Guanglai Gao, Haizhou Li, "Accurate Emotion Strength Assessment for Seen and Unseen Speech Based on Data-Driven Deep Learning", in Proc. INTERSPEECH, Incheon, Korea, September 2022.
  • Jiadong Wang, Xinyuan Qian, Zihan Pan, Malu Zhang, and Haizhou Li, “GCC-PHAT with Speech-oriented Attention for Robotic Sound Source Localization”, in Proc. IEEE International Conference on Robotics and Automation (ICRA), Xi’an, China, 2021.
  • Chen Zhang, Yiming Chen, Luis Fernando D’Haro, Yan Zhang, Thomas Friedrichs, Grandee Lee and Haizhou Li, “DynaEval: Unifying Turn and Dialogue Level Evaluation”, in Proc. Joint Conference of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (ACL-IJCNLP), August 2021. [link]
  • Qu Yang, Jibin Wu, and Haizhou Li, “Rethinking Benchmarks for Neuromorphic Learning Algorithms”, in Proc. International Joint Conference on Neural Networks (IJCNN), Virtual Event, July 2021. [link]
  • Rohan Kumar Das, Jichen Yang, and Haizhou Li, “Data Augmentation with Signal Companding for Detection of Logical Access Attacks”, in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Toronto, Ontario, Canada, June 2021. [link]
  • Kun Zhou, Berrak Sisman, and Haizhou Li, “VAW-GAN for disentanglement and recomposition of emotional elements in speech,” in Proc. IEEE Spoken Language Technology (SLT), Shenzhen, China, January 2021. [link]
  • Hongqiang Du, Xiaohai Tian, Lei Xie, and Haizhou Li, “Optimizing voice conversion network with cycle consistency loss of speaker identity”, in Proc. IEEE Spoken Language Technology (SLT), Shenzhen, China, January 2021. [link]
  • Meidan Ouyang, Rohan Kumar Das, Jichen Yang and Haizhou Li, “Capsule Network based End-to-end System for Detection of Replay Attacks”, in Proc. International Symposium on Chinese Spoken Language Processing (ISCSLP) 2021, Hong Kong, January 2021, pp. 1-5. [link]
  • Rohan Kumar Das and Haizhou Li, “Classification of Speech with and without Face Mask using Acoustic Features”, in Proc. Asia-Pacific Signal and Information Processing Association (APSIPA) Annual Summit and Conference (ASC), Auckland, New Zealand, December 2020, pp. 747-752. [link]
  • Rohan Kumar Das, Ruijie Tao, Jichen Yang, Wei Rao, Cheng Yu, and Haizhou Li, “HLT-NUS Submission for NIST 2019 Multimedia Speaker Recognition Evaluation”, in Proc. Asia-Pacific Signal and Information Processing Association (APSIPA) Annual Summit and Conference (ASC), Auckland, New Zealand, December 2020, pp. 605-609. [link]
  • Junchen Lu, Kun Zhou, Berrak Sisman, and Haizhou Li, “VAW-GAN for Singing Voice Conversion with Non-parallel Training Data”, in Proc. Asia-Pacific Signal and Information Processing Association (APSIPA) Annual Summit and Conference (ASC), Auckland, New Zealand, December 2020, pp. 514-519. [link]
  • Zongyang Du, Kun Zhou, Berrak Sisman, and Haizhou Li, “Spectrum And Prosody Conversion for Cross-Lingual Voice Conversion with Cyclegan”, in Proc. Asia-Pacific Signal and Information Processing Association (APSIPA) Annual Summit and Conference (ASC), Auckland, New Zealand, December 2020, pp. 507-513. [link]
  • Biswajit Dev Sarma and Rohan Kumar Das, “Emotion Invariant Speaker Embeddings for Speaker Identification with Emotional Speech” in Proc. Asia-Pacific Signal and Information Processing Association (APSIPA) Annual Summit and Conference (ASC), Auckland, New Zealand, December 2020, pp. 610-615. [link]
  • Yi Zhou, Xiaohai Tian, Xuehao Zhou, Mingyang Zhang, Grandee Lee, Rui Liu, Berrak Sisman, and Haizhou Li, “NUS-HLT System for Blizzard Challenge 2020”, in Proc. Joint Workshop for the Blizzard Challenge and Voice Conversion Challenge, Shanghai, China, October 2020, pp. 44-48. [link]
  • Xiaohai Tian, Zhichao Wang, Shan Yang, Xinyong Zhou, Hongqiang Du, Yi Zhou, Mingyang Zhang, Kun Zhou, Berrak Sisman, Lei Xie, and Haizhou Li, “The NUS & NWPU system for Voice Conversion Challenge 2020”, in Proc. Joint Workshop for the Blizzard Challenge and Voice Conversion Challenge 2020, Shanghai, China, October 2020, pp. 170-174. [link]
  • Zhao Yi, Wen-Chin Huang, Xiaohai Tian, Junichi Yamagishi, Rohan Kumar Das, Tomi Kinnunen, Zhenhua Ling, and Tomoki Toda, “Voice Conversion Challenge 2020 – Intra-lingual semi-parallel and cross-lingual voice conversion –”, in Proc. Joint Workshop for the Blizzard Challenge and Voice Conversion Challenge, Shanghai, China, October 2020, pp. 80-98. [link]
  • Rohan Kumar Das, Tomi Kinnunen, Wen-Chin Huang, Zhenhua Ling, Junichi Yamagishi, Yi Zhao, Xiaohai Tian, and Tomoki Toda, “Predictions of Subjective Ratings and Spoofing Assessments of Voice Conversion Challenge 2020 Submissions”, in Proc. Joint Workshop for the Blizzard Challenge and Voice Conversion Challenge, Shanghai, China, October 2020, pp. 99-120. [link]
  • Xinyuan Zhou, Emre Yılmaz, Yanhua Long, Yijie Li and Haizhou Li, “Multi-Encoder-Decoder Transformer for Code-Switching Speech Recognition,” in Proc. INTERSPEECH, Shanghai, China, October 2020, pp. 1042-1046. [link]
  • Xinyuan Zhou, Grandee Lee, Emre Yılmaz, Yanhua Long, Jiaen Liang and Haizhou Li, “Self-and-Mixed Attention Decoder with Deep Acoustic Structure for Transformer-based LVCSR,” in Proc. INTERSPEECH, Shanghai, China, October 2020, pp. 5016-5020. [link]
  • Kun Zhou, Berrak Sisman, Mingyang Zhang and Haizhou Li, “Converting Anyone’s Emotion: Towards Speaker-Independent Emotional Voice Conversion”, in Proc. INTERSPEECH, Shanghai, China, October 2020, pp. 3416-3420. [link]
  • Shoufeng Lin and Xinyuan Qian, “Audio-Visual Multi-Speaker Tracking Based On the GLMB Framework”, in Proc. INTERSPEECH, Shanghai, China, October 2020, pp. 3082-3086. [link]
  • Xiaoyi Qin, Ming Li, Hui Bu, Wei Rao, Rohan Kumar Das, Shrikanth Narayanan and Haizhou Li, “The INTERSPEECH 2020 Far-Field Speaker Verification Challenge”, in Proc. INTERSPEECH, Shanghai, China, October 2020, pp. 3456-3460. [link]
  • Zhenzong Wu, Rohan Kumar Das, Jichen Yang and Haizhou Li, “Light Convolutional Neural Network with Feature Genuinization for Detection of Synthetic Speech Attacks”, in Proc. INTERSPEECH, Shanghai, China, October 2020, pp. 1101-1105. [link]
  • Ruijie Tao, Rohan Kumar Das and Haizhou Li, “Audio-visual Speaker Recognition with a Cross-modal Discriminative Network”, in Proc. INTERSPEECH, Shanghai, China, October 2020, pp. 2242-2246. [link]
  • Tianchi Liu, Rohan Kumar Das, Maulik Madhavi, Shengmei Shen and Haizhou Li, “Speaker-Utterance Dual Attention for Speaker and Utterance Verification”, in Proc. INTERSPEECH, Shanghai, China, October 2020, pp. 4293-4297. [link]
  • Nana Hou, Chenglin Xu, Joey Tianyi Zhou, Eng Siong Chng and Haizhou Li, “Multi-task Learning for End-to-end Noise-robust Bandwidth Extension”, in Proc. INTERSPEECH, Shanghai, China, October 2020, pp. 4069-4073. [link]
  • Nana Hou, Chenglin Xu, Van Tung Pham, Joey Tianyi Zhou, Eng Siong Chng and Haizhou Li, “Speaker and Phoneme-Aware Speech Bandwidth Extension with Residual Dual-Path Network”, in Proc. INTERSPEECH, Shanghai, China, October 2020, pp. 4064-4068. [link]
  • Grandee Lee and Haizhou Li, “Modeling Code-Switch Languages Using Bilingual Parallel Corpus”, in Proc. Annual Meeting of the Association for Computational Linguistics (ACL), July 2020, pp. 860-870. [link]
  • Berrak Sisman and Haizhou Li, “Generative Adversarial Networks for Singing Voice Conversion with and without Parallel Data”, in Proc. Speaker Odyssey, Tokyo, Japan, November 2020, pp. 238-244. [link]
  • Kun Zhou, Berrak Sisman and Haizhou Li, “Transforming Spectrum and Prosody for Emotional Voice Conversion with Non-Parallel Training Data”, in Proc. Speaker Odyssey, Tokyo, Japan, November 2020, pp. 230-237. [link]
  • Rui Liu, Berrak Sisman, Feilong Bao, Guanglai Gao and Haizhou Li, “WaveTTS: Tacotron-based TTS with Joint Time-Frequency Domain Loss”, in Proc. Speaker Odyssey, Tokyo, Japan, November 2020, pp. 245-251. [link]
  • Xiaohai Tian, Rohan Kumar Das and Haizhou Li, “Black-box Attacks on Automatic Speaker Verification using Feedback-controlled Voice Conversion”, in Proc. Speaker Odyssey, Tokyo, Japan, November 2020, pp. 159-164. [link]
  • Xiaoxue Gao, Xiaohai Tian, Yi Zhou, Rohan Kumar Das and Haizhou Li, “Personalized Singing Voice Generation Using WaveRNN”, in Proc. Speaker Odyssey, Tokyo, Japan, November 2020, pp. 252-258. [link]
  • Rui Liu, Berrak Sisman, Jingdong Li, Feilong Bao, Guanglai Gao and Haizhou Li, “Teacher-Student Training for Robust Tacotron-based TTS”, in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Barcelona, Spain, May 2020, pp. 6274-6278. [link]