Over the past few months, a waterfall of details about how companies handle voice recordings has poured into the news. First, there was news that Amazon hired thousands of contractors to review Alexa commands. Apple and Google were doing the same thing with their voice assistants. Then, more recently, we learned that Facebook and Microsoft had humans transcribing their users’ messages and phone calls. All of this has been a real nightmare for people who care about their privacy.
None of these companies were forthcoming about the fact that they had hired humans who might listen to and transcribe what users said to their computers. Unless you’re versed in the basics of machine learning and natural language processing, you probably didn’t realize that human review is still an essential part of developing voice software. It is. Amazon, Apple, Facebook, Google, and Microsoft have all justified this eavesdropping on their users’ voice recordings by explaining that it’s essential to improve the artificially intelligent software that makes their products work. It’s very frustrating that these companies have obscured the fact that humans are listening. We can only wonder what else is happening behind the scenes with these data-hoarding gadgets and services.
While the whole situation feels like a big bummer, consider the bright side: If humans are necessary to keep these invasive machines working, at least we know the robot overlords have not yet taken over. We even have a better understanding of what we can do to keep ourselves safe, thanks to the human contractors who blew the whistle and revealed their existence to the world. And there’s still time to rein in the technology with both policy and pressure before things really get out of hand.
This summer’s torrent of revelations about voice assistants and voice user interfaces is shining some much-needed sunlight onto how this technology works. When Amazon released the Echo in 2014, it was initially unclear what the device’s array of always-on microphones was actually doing. People feared that Echos were recording everything, and privacy advocates have long argued that these so-called smart speakers are simply futuristic forms of wiretapping devices. So far, they haven’t exactly been proven wrong, which is why learning more about the inner workings of the technology is so important.
We are getting a clearer picture of how all these voice-powered products work, typically because some scandal forces companies to reveal details. For example, there was the time an Echo misinterpreted some speech, recorded a couple’s entire conversation, and sent the recording to one of their friends. At the very least, Amazon had to explain how wake words worked and alert the public to the fact that its technology doesn’t always work as intended. Google had a similar scuffle with the reality of faulty software, when a Google Home Mini accidentally recorded the life of a tech blogger. Those examples don’t even include the crazy hacks and attacks that security researchers have come up with.
Now, thanks to this new series of scandals, we know that humans are involved behind the scenes of voice assistants and voice-powered software. We also know a bit more about how imperfect the technology still is. In some cases, it appears that humans are doing a lot more work in this arena than tech companies and their marketing teams would have you believe. When Google showed off Duplex, a new service that used bots mimicking human voices to make restaurant reservations, it didn’t take long for users to realize that actual humans were sometimes making these calls while the software slowly improved. Google wanted the world to believe that it invented a robot that could talk on the phone and seem human, but that dream hadn’t yet come true, at least not entirely.
Knowing that humans are at work transcribing and reviewing people’s speech reveals a similarly failed aspiration. It would have been incredible, five years ago, if Amazon had invented software for a speaker that could listen to you only when you wanted it to listen and could interpret your commands into actions. That’s what the Echo and other smart speakers have promised to do all along. Anyone who’s owned one of these things knows that they’re imperfect and often struggle to understand human speech. If you review your voice commands—something that’s easier to do thanks to some scandals—you’ll see that voice assistants accidentally record stuff all the time. Humans have been behind the scenes trying to make things work better, and companies like Amazon probably hid this fact from the public, because it would have revealed how imperfect the machines still are.
It still sucks for privacy. It’s creepy that a human might have heard anything you’ve said to a voice assistant. For now, some of the companies in question have paused the human review process, presumably until the bad press from all these recent reports blows over. The human review process will most likely start back up again, since human reviewers are needed to improve the voice software. At least more people will realize that they’re not necessarily just talking to a computer when they talk to a voice assistant. If the privacy invasion seems too scary, don’t use a voice assistant.
What will be scarier, though, is when the machines can train themselves. The thought of it must have Isaac Asimov rolling over in his grave. For now, if you think about it from the right point of view, it’s encouraging that humans are still part of the equation. It’s enlightening that some of those humans are bold enough to alert the press to their existence and let the world know how the tech companies that employ them are hiding the truth from users. The computers would never do a thing like that.