Tutorial

Berrak Sisman (SUTD & NUS, Singapore), Yu Tsao (Academia Sinica, Taiwan), and Haizhou Li (NUS, Singapore) gave a tutorial on voice conversion at the Asia-Pacific Signal and Information Processing Association (APSIPA) Annual Summit and Conference 2020, held in New Zealand. The recording of their tutorial can be found here.

Workshop

Voice Conversion Challenge 2020 Workshop, organized by Tomoki Toda, Wen-Chin Huang, Junichi Yamagishi, Yi Zhao, Tomi Kinnunen, Zhenhua Ling, Rohan Kumar Das, and Xiaohai Tian. The recording of the workshop can be found here.

Code

  1. D-score: Holistic Dialogue Evaluation without Reference: [https://github.com/e0397123/D-score]

  2. DynaEval: Unifying Turn and Dialogue Level Evaluation: [https://github.com/e0397123/DynaEval]

  3. Unified framework for speaker and utterance verification: [https://github.com/sn1ff1918/SUV]

  4. Multi-level adaptive speech activity detector: [https://github.com/bidishasharma/MultiSAD/]

  5. PESnQ: Perceptual evaluation of singing quality: [https://github.com/chitralekha18/PESnQ_APSIPA2017] [Paper]

  6. Automatic sung-lyrics data annotation: [https://github.com/chitralekha18/AutomaticSungLyricsAnnotation_ISMIR2018.git] [Paper]

  7. NUS AutoLyrixAlign: [https://github.com/chitralekha18/AutoLyrixAlign.git]

  8. Emotional voice conversion and/or speaker identity conversion with non-parallel training data: [https://github.com/KunZhou9646/emotional-voice-conversion-with-CycleGAN-and-CWT-for-Spectrum-and-F0]

  9. Speaker-independent emotional voice conversion based on conditional VAW-GAN and CWT: [https://github.com/KunZhou9646/Speaker-independent-emotional-voice-conversion-based-on-conditional-VAW-GAN-and-CWT] (a short CWT illustration follows this list)

  10. Transformer-based dialect identification: [https://github.com/LIN-WANQIU/ADI17]

  11. Multi-modal target speaker extraction with visual cues: [https://github.com/zexupan/MuSE]
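
Items 8 and 9 above model the F0 (pitch) contour with a continuous wavelet transform (CWT) as part of emotional voice conversion. As a rough, self-contained illustration of that idea (not the repositories' own code), the sketch below applies PyWavelets' CWT to a synthetic pitch contour; the wavelet choice, scale range, and log/z-normalization are assumptions for demonstration only.

```python
import numpy as np
import pywt

# Synthetic F0 contour (Hz) standing in for a contour from a real pitch
# tracker; purely illustrative, not the repositories' data pipeline.
t = np.linspace(0.0, 1.0, 200)
f0 = 120.0 + 40.0 * np.sin(np.pi * t)

# Log-scale and z-normalize the contour before wavelet analysis
# (an assumed preprocessing step, common in prosody modelling).
lf0 = np.log(f0)
lf0 = (lf0 - lf0.mean()) / lf0.std()

# Continuous wavelet transform over a range of scales, giving a
# multi-resolution decomposition of the pitch contour.
scales = np.arange(1, 32)
coeffs, _ = pywt.cwt(lf0, scales, "mexh")
print(coeffs.shape)  # (31, 200): one coefficient row per scale
```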

Data Set

  1. NHSS: A speech and singing parallel database: [https://hltnus.github.io/NHSSDatabase/index.html]

  2. Solo singing DAMP dataset with aligned lyrics: [https://github.com/chitralekha18/lyrics-aligned-solo-singing-dataset]

  3. Pronunciation evaluation in singing: [https://github.com/chitralekha18/Dataset-for-pronunciation-evaluation-in-singing]

  4. RSL2019: A realistic speech localization corpus: [https://bidishasharma.github.io/RSL2019/]

  5. Voice Conversion Challenge (VCC) 2020 database: [https://github.com/nii-yamagishilab/VCC2020-database]

  6. Emotional Speech Dataset (ESD) for speech synthesis and voice conversion: [https://github.com/HLTSingapore/Emotional-Speech-Data]
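
For anyone who downloads ESD, here is a minimal indexing sketch. It assumes the release is organized as one folder per speaker with one subfolder per emotion; the root path, folder names, and nesting depth are assumptions to verify against the repository README rather than guarantees about the distribution.

```python
import collections
import pathlib

# Assumed local path: point this at your unpacked copy of the ESD download.
ESD_ROOT = pathlib.Path("ESD")

# Index every WAV file by (speaker, emotion), assuming an
# <speaker>/<emotion>/... layout (e.g. 0001/Angry/...).
index = collections.defaultdict(list)
for wav in ESD_ROOT.rglob("*.wav"):
    parts = wav.relative_to(ESD_ROOT).parts
    if len(parts) < 3:  # skip files not nested under speaker/emotion
        continue
    speaker, emotion = parts[0], parts[1]
    index[(speaker, emotion)].append(wav)

# Report utterance counts per speaker and emotion.
for (speaker, emotion), files in sorted(index.items()):
    print(f"{speaker}/{emotion}: {len(files)} utterances")
```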

Demo

  1. Robust sound recognition: A neuromorphic approach: [https://youtu.be/MIVvNb0sWOM]

  2. Speak-to-Sing: [Poster]

  3. MuSigPro: Automatic leaderboard generation of singers using reference-independent singing quality evaluation methods: [https://youtu.be/IAlsECqd9IE]

  4. AutoLyrixAlign: Automatic lyrics-to-audio alignment system for polyphonic music audio: Demo video: [https://drive.google.com/file/d/1oGdXQ9d3SfecPu8R3TBhY8kufFfXsd8_/view] Webpage: [https://autolyrixalign.hltnus.org]

  5. MuSigPro demo video: [https://www.youtube.com/watch?v=E0wwwpxaUOM] Webpage: [https://musigpro.com/] Google Play Store: [https://play.google.com/store/apps/details?id=com.musigpro.app]

  6. Multi-modal target speaker extraction with visual cues: [https://github.com/zexupan/MuSE]

Poster

  1. HLT Lab research areas [PDF – download] [PNG – download]

  2. Automatic speech recognition for code-mixed Singaporean languages [PDF – download] [PNG – download]

  3. Neuromorphic computing [PDF – download] [PNG – download]

  4. Let’s perfect everyone’s singing [PDF – download] [PNG – download]

  5. Recognize speakers from their voice [PDF – download] [PNG – download]

  6. Voice conversion [PDF – download] [PNG – download]

HLT Logo

  1. HLT Logo [PDF – download]

  2. HLT Logo [PNG – download]