Completed Project 6 – Electrical and Computer Engineering

Perfect Singing Vocals (13 September 2018 – 12 September 2020)

Everyone likes to sing. However, not all of us can sing like a trained singer. This project aims to convert imperfect amateur singing, or even the read-aloud lyrics of a song, into good-quality singing.

Speech-to-singing (STS) conversion is the task of converting the lyrics of a song, read in a natural speaking manner, into proper singing. It has applications in many innovative services, such as beautifying renditions by amateur singers, automatically generating reference singing for vocal learners, and personalizing singing synthesis systems. The most important aspect of STS conversion is changing the prosody of natural speech to match that of proper singing while retaining the linguistic content and the speaker’s identity. This is a challenging task because speaking and singing differ in many ways.

STS conversion is currently implemented in two ways: with model-based and with template-based techniques. The two techniques are very similar, differing only in how the reference prosody characteristics are set up for conversion. The output quality of current STS systems is limited by the accuracy of temporal alignment, spectral conversion, and vocoder analysis-synthesis.

Research on STS conversion in the HLT lab focuses primarily on the template-based conversion technique. We have developed efficient frameworks for temporal alignment between speech and singing signals and for spectral conversion in STS. We have also collected a parallel speak-sing database to facilitate research. Further investigation is ongoing, aimed at improving vocoding and the other frameworks that influence the quality of the output signals from an STS conversion system.
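To give a feel for what temporal alignment between a spoken and a sung rendition involves, here is a minimal sketch using classic dynamic time warping (DTW). This is an illustration only: the text above does not say which alignment method the lab's framework uses, and the function name dtw_path, the Euclidean frame distance, and the list-of-floats frame format are all assumptions made for this example.

```python
from math import inf


def dtw_path(speech, singing):
    """Align two sequences of feature frames with classic DTW.

    Hypothetical helper for illustration; not the lab's alignment framework.
    Each frame is a sequence of floats (for example, spectral features).
    Returns (total_cost, path), where path is a list of
    (speech_index, singing_index) pairs from start to end.
    """
    def dist(a, b):
        # Euclidean distance between two frames (an assumed choice).
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    n, m = len(speech), len(singing)
    # cost[i][j] = best cost of aligning the first i speech frames
    # with the first j singing frames.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(speech[i - 1], singing[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # speech frame advances
                                 cost[i][j - 1],      # singing frame advances
                                 cost[i - 1][j - 1])  # both advance

    # Backtrack from the end to recover the frame-to-frame mapping.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min((cost[i - 1][j - 1], i - 1, j - 1),
                      (cost[i - 1][j], i - 1, j),
                      (cost[i][j - 1], i, j - 1))
    path.reverse()
    return cost[n][m], path
```

Given the spoken frames and the frames of a reference singing template, the returned path indicates which spoken frames would have to be stretched or compressed to follow the timing of the singing. A real system would operate on acoustic features extracted from audio rather than on toy numbers.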

We have developed an online interactive web platform where users can try out our speech-to-singing conversion system. A user can create an account, choose their favorite songs from our list, read out the lyrics, and enjoy the synthesized singing in their own voice. This provides an easy-to-use personalized singing synthesizer. Through this system, we advocate that ‘everyone can sing their favorite songs as they desire’. Please find the web platform for “Speak-to-Sing” here.

Project Duration: 13 September 2018 – 12 September 2020