ML Scientist

Connecting Scholars with the Latest Academic News and Career Paths


New Opportunity in Multilingual Speech Generation with Neural Codecs

A new challenge has been announced on developing multilingual streaming Text-to-Speech (TTS) systems using neural codecs for Indian languages. The challenge provides TTS data for Indian English, Kannada, Gujarati, and Bhojpuri, with one male and one female speaker per language. The initiative is part of the SYSPIN project at the SPIRE lab, Indian Institute of Science (IISc), Bangalore, India.

This is an excellent opportunity for researchers to contribute to the advancement of real-time, adaptable, and high-quality speech generation systems. Recent developments in conversational AI models have increased the demand for low-latency, multilingual, and controllable TTS systems. Streaming TTS is essential for applications built on Large Language Models (LLMs), where speech should begin playing while the text response is still being generated.

Neural codec-based TTS systems have achieved state-of-the-art performance and offer compact representations of speech that enable efficient transmission and storage. Additionally, various speech attributes can be encoded in neural codecs, allowing for high-quality and controllable speech synthesis.

More details on the challenge are available on the website.
