Meta’s speech model can recognise over 4,000 spoken languages: What this means
Language has been a sensitive issue around the world. Many countries, including Sri Lanka and India, have seen sustained efforts from different linguistic groups to protect and preserve their languages. However, because most technology development is centred on English, people have felt compelled to learn and speak it, and in the process many of the world’s languages face a threat of extinction. The limited language coverage of current speech recognition and generation technology is accelerating this trend.
Meta claims to be countering this trend. The company has announced a series of artificial intelligence models intended to help people access information and use their devices in their native languages. According to the company’s statement, its Massively Multilingual Speech (MMS) models expand text-to-speech and speech-to-text technology to more than 1,100 languages.
Text-to-speech and speech-to-text were previously limited to around 100 languages. The new models can also identify more than 4,000 spoken languages, 40 times more than existing systems.
What this means for users
Speech technology has many use cases, ranging from virtual and augmented reality to messaging services. Language translators such as Google Translate and Microsoft Translator are also built on this technology.
With the expanded capabilities of speech technology, many more users from different linguistic groups will be able to access services in their own languages, and their devices will be able to understand them.
In addition, Meta has announced that it will keep its models and code open-source so that the research community can build on its work.
“We’re open-sourcing our models and code so that others in the research community can build on our work and help preserve the world’s languages and bring the world closer together,” Meta said in a blog post.
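For researchers, building on that release can be as simple as loading one of the published checkpoints. The following is a minimal sketch of MMS speech-to-text via the Hugging Face transformers integration; the checkpoint name facebook/mms-1b-all and the language code “hin” (Hindi) are illustrative choices drawn from that integration, not details from Meta’s announcement.

```python
# A minimal sketch: speech-to-text with Meta's open-sourced MMS checkpoints,
# assuming the Hugging Face `transformers` integration (facebook/mms-1b-all).
import torch
from transformers import Wav2Vec2ForCTC, AutoProcessor

model_id = "facebook/mms-1b-all"  # multilingual ASR checkpoint
processor = AutoProcessor.from_pretrained(model_id)
model = Wav2Vec2ForCTC.from_pretrained(model_id)

# MMS uses small per-language adapters on top of one shared base model;
# switch both the tokenizer and the model to the target language.
# "hin" (Hindi) is an illustrative choice.
processor.tokenizer.set_target_lang("hin")
model.load_adapter("hin")

def transcribe(audio):
    # `audio`: a 1-D float array of 16 kHz mono speech
    # (e.g. loaded with soundfile or librosa).
    inputs = processor(audio, sampling_rate=16_000, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    ids = torch.argmax(logits, dim=-1)[0]
    return processor.decode(ids)
```

The per-language adapter design is what lets a single base model cover so many languages: switching languages swaps in a small set of weights rather than a whole new model.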
How Meta developed Massively Multilingual Speech
Meta used religious texts such as the Bible to collect audio data for thousands of languages. The company chose religious texts because they have been translated into many languages, and those translations often come with publicly available audio recordings, as per Meta’s statement.
Meta has claimed that although the data comes from a specific domain and the recordings were often of male voices, its models do not show gender bias and do not produce disproportionately religious language. Meta also said that, going forward, MMS coverage will expand to more languages and that it will work on handling dialects.