Completed Project 3 – Electrical and Computer Engineering

Explainable AI as a Service for Community Healthcare (1 May 2019 – 30 April 2021)

We advance AI with prototype devices built for deployment and testing in a community setting. We apply rigour in data science and, through AI as a service, enable AI results to be used in precision medicine, preventive advice, and automatic lifestyle coaching such as food logging. We also help design a community-deployable device for patients with chronic diseases.

Project Duration: 1 May 2019 – 30 April 2021

Funding Source: AI Singapore: AI in Health Grand Challenge
Acknowledgment: This research/project is supported by the National Research Foundation, Singapore, under its AI Singapore Programme (AISG Award No: AISG-GC-2019-002).

PUBLICATIONS

Journal Articles

Rui Liu, Berrak Sisman, Guanglai Gao and Haizhou Li, “Expressive TTS Training with Frame and Style Reconstruction Loss”, IEEE/ACM Transactions on Audio, Speech, and Language Processing, April 2021, pp. 1-13. [link] [Article In-process]
Rui Liu, Berrak Sisman, Yixing Lin and Haizhou Li, “FastTalker: A Neural Text-to-Speech Architecture with Shallow and Group Autoregression”, Neural Networks, April 2021. [link] [Article In-process]
Mingyang Zhang, Yi Zhou, Li Zhao and Haizhou Li, “Transfer learning from speech synthesis to voice conversion with non-parallel training data”, IEEE/ACM Transactions on Audio, Speech, and Language Processing, 29, March 2021, pp. 1290-1302. [link] [Article In-process]
Berrak Sisman, Junichi Yamagishi, Simon King, and Haizhou Li, “An Overview of Voice Conversion and its Challenges: From Statistical Modeling to Deep Learning”, IEEE/ACM Transactions on Audio, Speech, and Language Processing, 29, 2021, pp. 132-157. [link] [Article In-process]
Rui Liu, Berrak Sisman, Feilong Bao, Jichen Yang, Guanglai Gao and Haizhou Li, “Exploiting morphological and phonological features to improve prosodic phrasing for Mongolian speech synthesis”, IEEE/ACM Transactions on Audio, Speech, and Language Processing, 29, 2021, pp. 274-285. [link] [Article In-process]
Rui Liu, Berrak Sisman, Feilong Bao, Guanglai Gao and Haizhou Li, “Modeling Prosodic Phrasing with Multi-Task Learning in Tacotron-based TTS”, IEEE Signal Processing Letters, 27, 2020, pp. 1470-1474. [link] [Article In-process]
Mingyang Zhang, Berrak Sisman, Li Zhao and Haizhou Li, “DeepConversion: Voice conversion with limited parallel training data”, Speech Communication, 122, 2020, pp. 31-43. [link] [Article In-process]

Conference Articles

Kun Zhou, Berrak Sisman, and Haizhou Li, “VAW-GAN for disentanglement and recomposition of emotional elements in speech,” in Proc. IEEE Spoken Language Technology (SLT), Shenzhen, China, January 2021. [link]
Hongqiang Du, Xiaohai Tian, Lei Xie, and Haizhou Li, “Optimizing voice conversion network with cycle consistency loss of speaker identity”, in Proc. IEEE Spoken Language Technology (SLT), Shenzhen, China, January 2021. [link]
Junchen Lu, Kun Zhou, Berrak Sisman, and Haizhou Li, “VAW-GAN for Singing Voice Conversion with Non-parallel Training Data”, in Proc. Asia-Pacific Signal and Information Processing Association (APSIPA) Annual Summit and Conference (ASC), Auckland, New Zealand, December 2020, pp. 514-519. [link]
Zongyang Du, Kun Zhou, Berrak Sisman, and Haizhou Li, “Spectrum and Prosody Conversion for Cross-Lingual Voice Conversion with CycleGAN”, in Proc. Asia-Pacific Signal and Information Processing Association (APSIPA) Annual Summit and Conference (ASC), Auckland, New Zealand, December 2020, pp. 507-513. [link]
Grandee Lee and Haizhou Li, “Modeling Code-Switch Languages Using Bilingual Parallel Corpus”, in Proc. Annual Meeting of the Association for Computational Linguistics (ACL), July 2020, pp. 860-870. [link]
Chen Zhang, Luis Fernando D’Haro, Rafael E. Banchs, Thomas Friedrichs and Haizhou Li, “Deep AM-FM: Toolkit for Automatic Dialogue Evaluation”, in Proc. 11th International Workshop on Spoken Dialogue Systems Technology (IWSDS), Barcelona, Spain, September 2020, pp. 53-69. [link]