Google’s newest machine translation system converts speech directly into the text of another language, greatly speeding the process by removing the intermediate transcription step.
Google’s offering isn’t the first real-time speech-translation option. Skype, for example, rolled out a live translation feature in 2014. The difference, though, is that Skype and similar systems translate from a transcribed version of the audio, so errors in speech recognition can carry through to the transcription and, therefore, to the translation.
Google’s deep-learning research team, Google Brain, is essentially cutting out that middle step, which has the potential to yield quicker, more accurate translations. The system was trained by analyzing hundreds of hours of Spanish audio along with the corresponding English text. Using several layers of neural networks, computer algorithms loosely modeled on the human brain, it learned to link waveforms of the spoken Spanish to the corresponding chunks of written English. It’s the computer equivalent of your ears hearing Spanish while your brain understands the words as English.
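The end-to-end idea described above can be sketched in a few lines of code. To be clear, this is a toy illustration, not Google Brain’s actual architecture: the encoder and decoder shapes, the vocabulary size, and the random weights are all assumptions made for demonstration. The only point it shows is the structure of the approach, with audio features going in and target-language tokens coming out, with no intermediate transcript.

```python
# Toy sketch (NOT Google's actual model): map audio features straight to
# target-language token ids, skipping the transcription step entirely.
# All shapes, names, and weights here are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

def encode(audio_frames):
    """Collapse a sequence of audio feature frames into one context vector,
    a stand-in for the encoder layers of a sequence-to-sequence network."""
    W_enc = rng.standard_normal((audio_frames.shape[1], 8))
    return np.tanh(audio_frames @ W_enc).mean(axis=0)  # shape: (8,)

def decode(context, vocab_size=5, max_len=4):
    """Emit target-language token ids directly from the audio context."""
    W_dec = rng.standard_normal((8, vocab_size))
    W_state = rng.standard_normal((8, 8))
    out, state = [], context
    for _ in range(max_len):
        out.append(int(np.argmax(state @ W_dec)))  # pick most likely token
        state = np.tanh(state @ W_state)           # simple recurrent update
    return out

# 20 frames of 13-dim features (e.g. MFCCs) standing in for Spanish audio
frames = rng.standard_normal((20, 13))
tokens = decode(encode(frames))  # token ids produced straight from audio
print(tokens)
```

A pipeline system would instead insert a speech-recognition step between `encode` and `decode`, which is exactly where transcription errors creep in.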
Here’s Matt Reynolds, reporting for New Scientist:
After a learning period, Google’s system produced a better-quality English translation of Spanish speech than one that transcribed the speech into written Spanish first. It was evaluated using the BLEU score, which judges machine translations by how close they are to those produced by a professional human translator.
The system could be particularly useful for translating speech in languages that are spoken by very few people, says Sharon Goldwater at the University of Edinburgh in the UK.
International disaster relief teams, for instance, could use it to quickly put together a translation system to communicate with people they are trying to assist. When an earthquake hit Haiti in 2010, says Goldwater, there was no translation software available for Haitian Creole.
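The BLEU score mentioned in the excerpt works by counting how many short word sequences (n-grams) a machine translation shares with a human reference. The sketch below is a stripped-down variant for illustration, using only unigram and bigram precision against a single reference; the full metric uses up to 4-grams, multiple references, and smoothing.

```python
# Simplified BLEU-style score: n-gram overlap with a human reference,
# times a brevity penalty for candidates shorter than the reference.
from collections import Counter
import math

def ngrams(tokens, n):
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def simple_bleu(candidate, reference, max_n=2):
    precisions = []
    for n in range(1, max_n + 1):
        cand, ref = ngrams(candidate, n), ngrams(reference, n)
        overlap = sum((cand & ref).values())       # clipped n-gram matches
        total = max(sum(cand.values()), 1)
        precisions.append(max(overlap, 1e-9) / total)
    # brevity penalty: punish candidates shorter than the reference
    bp = min(1.0, math.exp(1 - len(reference) / len(candidate)))
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)

ref = "the cat is on the mat".split()
good = "the cat is on the mat".split()
poor = "mat the on cat".split()
print(simple_bleu(good, ref))  # 1.0: identical to the reference
print(simple_bleu(poor, ref))  # near zero: words present, order scrambled
```

The scrambled candidate scores near zero because it shares no bigrams with the reference, which is how BLEU rewards fluent word order and not just vocabulary overlap.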
Not only could this system come in handy for international disaster relief teams, but it could also be used to translate rare languages that are seldom written down. For example, Goldwater is currently using similar methods to translate Arapaho, a Native American language spoken only by about 1,000 people of the Arapaho tribe. She is also working on translating Ainu, which is spoken by a small portion of the Japanese population.
The new approach isn’t ready for prime time, but with additional training on bigger data sets, it could set a new standard for machine translation.