Speechmatics doubles down on aim to understand every voice with the addition of 14 new languages to its offering

In its largest single language increase to date, Speechmatics empowers 340 million extra voices to use speech-to-text technology in their native language

Speechmatics, the leading speech-to-text API scaleup, has added 14 new languages to its accurate speech-to-text engine*. This increase to 50 languages means that Speechmatics’ technology now supports over half of the world’s population, enabling more people to access speech-to-text technology that outperforms Microsoft and Google.

The 14 new languages available are: Bashkir, Basque, Belarusian, Esperanto, Estonian, Galician, Interlingua, Marathi, Mongolian, Tamil, Thai, Uyghur, Vietnamese, and Welsh. Together, they are the native languages of over a quarter of a billion people. While Thai and Marathi were highly requested by Speechmatics customers, other languages, like Vietnamese, were chosen because they are widely spoken across the world.

Speechmatics also included a selection of lesser-spoken languages such as Welsh and Basque (with 883,300 and 900,000 native speakers respectively) and constructed languages like Interlingua, which has an estimated 1,500 native speakers. By including these languages, Speechmatics is playing a key role in preserving those communities and cultures for future generations. Speechmatics plans to continue adding languages to its engine, and aims for it to be usable by 70% of the world’s population within the next three years.

Using the freely available data on Common Voice (audio) and OSCAR (web-scraped text), Speechmatics’ Enhanced model has been trained to a higher standard of accuracy than Google’s across the shared languages**. For example, Speechmatics’ accuracy for Estonian is 85.95%, compared to Google’s 67.21%***. Speechmatics’ speech-to-text engine uses self-supervised learning, which requires little to no human input and draws on the extensive amount of unlabelled data that is already freely available online. These enhanced capabilities have resulted in an overall higher success rate in speech-to-text for all voices.

John Hughes, Accuracy Team Lead at Speechmatics, said, “This expansion to our language offering is the biggest in Speechmatics’ history. While we’re responding to customer demand by adding highly requested languages to our speech-to-text API, we’re also identifying other popular languages with fewer speakers that have yet to be included. Our aim has always been to understand every voice and so it’s vital that we also capture languages that may not be as well recognised. This has allowed us to provide the most comprehensive offering on the market compared to others in our industry. In addition, we have a new pipeline for releasing new languages rapidly so will continue to increase our language coverage going forward.”

*Launched in October 2021, Speechmatics’ speech-to-text engine is the most accurate and inclusive engine of its kind available. It is trained on representative, unlabelled voice data.

**Speechmatics and Google have eight of the new languages in common: Basque, Estonian, Galician, Marathi, Mongolian, Tamil, Vietnamese, and Thai.

***Accuracy is measured as the percentage value of 1−WER (Word Error Rate, the rate at which words are incorrectly inserted, deleted or substituted).
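
To make the accuracy footnote concrete, here is a minimal Python sketch of how a 1−WER figure can be computed for a single transcript. It is an illustration only, not Speechmatics’ evaluation code: the function names `word_error_rate` and `accuracy_percent`, the whitespace tokenisation, and the example sentences are all assumptions made for this sketch. Real benchmarks typically normalise text (case, punctuation, numerals) before scoring and aggregate over a whole test set.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / number of reference words.

    Computed as the word-level edit (Levenshtein) distance between the two
    transcripts, divided by the length of the reference. Tokenisation by
    whitespace is an assumption of this sketch.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # reference word missing from hypothesis (deletion)
                curr[j - 1] + 1,     # extra word in hypothesis (insertion)
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


def accuracy_percent(reference: str, hypothesis: str) -> float:
    """Accuracy as defined in the footnote: (1 - WER) expressed as a percentage."""
    return (1.0 - word_error_rate(reference, hypothesis)) * 100.0


if __name__ == "__main__":
    ref = "the cat sat on the mat"
    hyp = "the cat sit on mat"  # one substitution (sat -> sit), one deletion (the)
    print(round(accuracy_percent(ref, hyp), 2))  # prints 66.67
```

Under this definition, the Estonian accuracy figures of 85.95% and 67.21% correspond to word error rates of 14.05% and 32.79% respectively. Note that WER can exceed 100% when a hypothesis contains many inserted words, in which case 1−WER becomes negative.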

Sep 14, 2022
