Published on: May 4, 2023

Text-to-speech (TTS) synthesis system

Why in news? Researchers at the Indian Institute of Science (IISc) plan to release, by the end of 2023, a text-to-speech (TTS) synthesis system aimed at making India’s digital ambitions more inclusive.

Highlights:

  • SYSPIN (Synthesising Speech in Indian Languages) converts text in nine Indian languages into corresponding speech and publishes the underlying datasets. The datasets double as an open resource for innovators, encouraging them to develop Artificial Intelligence-driven, voice-based services in finance and agriculture.
  • Implemented in partnership with the IISc-promoted ARTPARK, SYSPIN could be a game changer, bringing technology-enabled services closer to the roughly 602 million people who speak these languages.
  • SYSPIN could enable TTS in multiple Indian languages, covering more sectors including education, e-mobility, and healthcare.
  • The datasets and models in Bengali, Hindi, Kannada, Marathi and Telugu have been compiled, and the project is set for completion by the end of the year.
  • About 80 hours of TTS data per language, in a male and a female voice, are being developed along with the relevant computer programmes.
  • The team designs text relevant to the two domains (finance and agriculture), identifies the subjects and speakers, collects and validates the data, and then open-sources it.
  • Text normalisation, which involves converting symbols, numbers, and abbreviations into context-specific speech, is critical to TTS systems.
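
Text normalisation, the last highlight above, can be illustrated with a small sketch. The example below is a toy, English-only rule set written purely for illustration; it is not SYSPIN's code. The tables `ABBREVIATIONS`, `UNITS` and the functions `number_to_words` and `normalise` are hypothetical names, and a real system for Indian languages would need language-specific rules and much richer context handling.

```python
import re

# Hypothetical lookup tables for illustration only (not taken from SYSPIN).
ABBREVIATIONS = {"Dr.": "Doctor"}
UNITS = {"kg": "kilograms", "km": "kilometres"}

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty",
        "sixty", "seventy", "eighty", "ninety"]


def number_to_words(n: int) -> str:
    """Spell out an integer from 0 to 999,999 in English words."""
    if not 0 <= n < 1_000_000:
        raise ValueError("this sketch only handles 0..999999")
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, rest = divmod(n, 10)
        return TENS[tens] + ("" if rest == 0 else " " + ONES[rest])
    if n < 1000:
        hundreds, rest = divmod(n, 100)
        tail = "" if rest == 0 else " " + number_to_words(rest)
        return ONES[hundreds] + " hundred" + tail
    thousands, rest = divmod(n, 1000)
    tail = "" if rest == 0 else " " + number_to_words(rest)
    return number_to_words(thousands) + " thousand" + tail


def normalise(text: str) -> str:
    """Rewrite symbols, numbers and abbreviations as speakable words."""
    # Currency: the symbol is written before the amount but spoken after it.
    text = re.sub(r"₹\s*(\d+)",
                  lambda m: number_to_words(int(m.group(1))) + " rupees", text)
    # Percentages: "8%" is read as "eight percent".
    text = re.sub(r"(\d+)\s*%",
                  lambda m: number_to_words(int(m.group(1))) + " percent", text)
    # Units are expanded only when they directly follow a number.
    for unit, spoken in UNITS.items():
        text = re.sub(rf"(\d+)\s*{re.escape(unit)}\b",
                      lambda m, s=spoken: number_to_words(int(m.group(1))) + " " + s,
                      text)
    # Plain abbreviations are replaced wherever they occur.
    for abbr, spoken in ABBREVIATIONS.items():
        text = text.replace(abbr, spoken)
    # Any digits still left are read as ordinary cardinal numbers.
    return re.sub(r"\d+", lambda m: number_to_words(int(m.group())), text)
```

For instance, `normalise("Dr. Rao sold 25 kg of rice for ₹1200, up 8%.")` returns "Doctor Rao sold twenty five kilograms of rice for one thousand two hundred rupees, up eight percent." The order of the rules matters: currency, percentage and unit patterns run before the generic number rule so that the surrounding context decides how each number is spoken.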

The need for such technology

  • The lack of technological expertise and open voice data has hindered AI-powered speech technologies in low-resource Indian languages. Besides the five languages whose datasets have already been compiled, the programme covers Bhojpuri, Chhattisgarhi, Magadhi, and Maithili.
  • Socio-economic backwardness and low literacy have left large sections of India’s population with limited access to digital innovations.

The goal

  • To enable voice-based interfaces that also serve people who cannot read. The size of the TTS corpus is expected to be “several times larger” than any existing corpus in Indian languages.
  • Open-sourcing the TTS data lets researchers, technology innovators, social entrepreneurs and startups use it to develop application-specific models.