At the end of its I/O presentation last week, Google pulled another ace out of its sleeve by showcasing a pair of augmented reality (AR) glasses with a single purpose – displaying translations of spoken language right in front of your eyes. In the video, Google product manager Max Spear called the prototype’s capability “subtitles for the world”, and we see family members communicating with each other for the first time.
Most of us have used Google Translate before and generally think of it as a powerful tool that occasionally makes embarrassing errors. While we might trust it to get us directions to the bus stop, that’s nowhere near the same as trusting it to accurately understand and communicate our parents’ childhood stories. And hasn’t Google previously stated that it was finally breaking down the language barrier?
Real-time translation was promoted as a feature of Google’s original Pixel Buds in 2017. Sean O’Kane described the experience as a “laudable idea with a lamentable execution” and reported that some of the people he tried it with said he sounded like a five-year-old. That’s not quite in line with what Google showed off in its video.
We also shouldn’t overlook Google’s promise that this translation will take place “inside a pair of AR glasses” – even though the reality of augmented reality hasn’t come close to the concept video Google showed a decade ago.
Google’s AR translation glasses appear to be much more focused than what Glass was trying to accomplish. From what Google showed, they’re only supposed to do one thing – display translated text – rather than act as an ambient computing experience that would replace a smartphone. Even so, creating AR glasses isn’t easy. Even a moderate amount of ambient light can make text on see-through screens hard to read. It’s challenging enough to read subtitles on a TV with some light streaming through a window; now imagine that experience strapped to your face (and with the extra burden of holding a conversation with someone speaking a language you don’t understand on your own).
But technology advances at a breakneck pace, and Google may be able to overcome a roadblock that has hampered its competitors – which wouldn’t change the fact that Google Translate is hardly a silver bullet for cross-language communication. If you’ve ever tried having an actual conversation through a translation app, you know how important it is to speak slowly. And with precision. And clearly. Unless you want to risk a muddled translation. One slip of the tongue can derail the entire exchange.
People don’t converse in a vacuum or like machines. We know we have to use much simpler language when dealing with machine translation, just as we code-switch when speaking to voice assistants like Alexa, Siri, or the Google Assistant. And even when we do speak correctly, the translation can still come out awkward or be misinterpreted. The Verge’s authors mention the example of a Korean-speaking colleague, who pointed out that Google renders “Welcome” in Korean with an honorific form that no one actually uses.
That mildly embarrassing gaffe, however, pales in comparison to the fact that, according to tweets from Rami Ismail and Sam Ettinger, Google displayed several instances of backwards or broken Arabic script on a slide during its Translate presentation. To be fair, nobody expects perfection, but Google is trying to convince us that it is on the verge of nailing real-time translation, and mistakes like these make that seem seriously unlikely.
Google is attempting to solve an exceptionally intricate problem. Translating words is easy; figuring out grammar is harder but not impossible. Yet language and communication are far more complicated than just that. Antonio’s mother, for instance, speaks three languages: Italian, Spanish, and English. Every so often, she’ll switch between these languages mid-sentence, including her regional Italian dialect (which functions like a fourth language). A human can digest that kind of mixture reasonably easily, but could Google’s prototype glasses handle it? Never mind the messier parts of conversation, such as unclear references, incomplete thoughts, or innuendo.

Google’s goal is praiseworthy, to be sure. We all want to live in a world where everyone can have the same experience as the research participants in the video, watching in awe as their loved ones’ words appear in front of them. Breaking down language barriers and understanding each other in ways we previously couldn’t is something the world needs more of; it’s just that there’s a long way to go before we reach that future. Machine translation has existed for quite some time, but, despite the plenitude of languages it can handle, it does not yet speak human.