The Year Alexa Grew Up

Amazon's voice assistant made considerable gains in 2018 through the continued refinement of machine learning techniques.
Amazon senior vice president of devices and services Dave Limp at an Alexa-focused event in September. Photograph: Grant Hindsley/AFP/Getty Images

It’s fair to say that when Amazon introduced the first Echo speaker in the fall of 2014, most people weren’t quite sure what to make of it. In the intervening years, Echo and the broader universe of Alexa-powered devices have transitioned from curiosity to ubiquity. But while you can find Alexa in just about everything—including, yes, a microwave—the real progress Amazon’s voice assistant made in 2018 came less from breadth than from depth.

That’s not to say it hasn’t made gains in scale. Amazon’s voice assistant has doubled the number of countries where it’s available, for starters, learning how to speak French and Spanish along the way. More than 28,000 smart home devices work with Alexa now, six times as many as at the beginning of the year. And more than 100 distinct products have Alexa built in. If you’re looking for some sort of tipping point, consider that, as of last month, you can buy an Alexa-compatible Big Mouth Billy Bass.

It’s how Alexa evolves under the hood, though, that has defined this year—and how it will continue to inch toward its full potential in those to come. Alexa has gotten smarter, in ways so subtle you might not yet have even noticed.

Machine Head

Because many voice assistant improvements aim to reduce friction, they’re almost invisible by design. Over the past year, Alexa has learned how to carry over context from one query to the next, and to register follow-up questions without having to repeat the wake word. You can ask Alexa to do more than one thing in the same request, and summon a skill—Alexa’s version of apps—without having to know its exact name.

Those may sound like small tweaks, but cumulatively they represent major progress toward a more conversational voice assistant, one that solves problems rather than introducing new frustrations. You can talk to Alexa in a far more natural way than you could a year ago, with a reasonable expectation that it will understand what you’re saying.

Those gains have come, unsurprisingly, through the continued introduction and refinement of machine learning techniques. So-called active learning, in which the system identifies areas in which it needs help from a human expert, has helped substantially cut down on Alexa’s error rates. “That’s fed into every part of our pipeline, including speech recognition and natural language understanding,” says Rohit Prasad, vice president and chief scientist of Amazon Alexa. “That makes all of our machine learning models look better.”
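
To make the idea concrete, here is a minimal sketch of uncertainty-based active learning in Python. The model, the toy data, and the least-confidence selection rule are illustrative assumptions, not Amazon’s actual pipeline, which is far larger and not public.

```python
# Minimal active-learning sketch: pick the utterances the classifier is least
# confident about and route them to a human annotator. Illustrative only.
import numpy as np
from sklearn.linear_model import LogisticRegression

def select_for_annotation(model, unlabeled_pool, batch_size=50):
    """Return indices of the pool examples with the lowest top-class confidence."""
    probs = model.predict_proba(unlabeled_pool)     # (n_samples, n_classes)
    confidence = probs.max(axis=1)                  # top-class probability
    return np.argsort(confidence)[:batch_size]      # least confident first

# Toy data standing in for encoded utterances.
rng = np.random.default_rng(0)
X_labeled = rng.normal(size=(200, 16))
y_labeled = rng.integers(0, 2, size=200)
X_pool = rng.normal(size=(5000, 16))

model = LogisticRegression(max_iter=1000).fit(X_labeled, y_labeled)
to_label = select_for_annotation(model, X_pool)
# A human expert labels X_pool[to_label]; those examples join the training set
# and the model is refit, shrinking error where the system was weakest.
```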

More recently, Amazon introduced what’s known as transfer learning to Alexa. Prasad gives the example of trying to build a recipe skill from scratch—which anyone can do, thanks to Amazon’s recently introduced skill “blueprints.” Developers could potentially harness everything Alexa knows about restaurants, say, or grocery items to help cut down on the grunt work they’d otherwise face. “Essentially, with deep learning we’re able to model a large number of domains and transfer that learning to a new domain or skill,” Prasad says.
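
As a rough illustration of the transfer-learning idea, the sketch below (in PyTorch, with made-up dimensions and random stand-in data) freezes an encoder that was notionally trained on existing domains and fits only a small new head for a hypothetical recipe skill. It is not Amazon’s model or API.

```python
# Transfer-learning sketch: reuse a frozen broad-domain encoder, train only a
# small classification head for the new skill. All sizes and data are toy values.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Pretend this encoder already learned from many domains (restaurants, groceries, ...).
encoder = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 32))
for p in encoder.parameters():
    p.requires_grad = False              # keep the transferred knowledge fixed

recipe_head = nn.Linear(32, 5)           # 5 hypothetical recipe-skill intents

features = torch.randn(40, 64)           # stand-in for a few encoded utterances
labels = torch.randint(0, 5, (40,))

optimizer = torch.optim.Adam(recipe_head.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for _ in range(200):                     # fine-tune only the new head
    logits = recipe_head(encoder(features))
    loss = loss_fn(logits, labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```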

The benefits of the machine learning improvements manifest themselves across all aspects of Alexa, but the simplest argument for its impact is that the system has seen a 25 percent reduction in its error rate over the last year. That’s a significant number of headaches Echo owners no longer have to deal with.

And more advances are incoming. Just this month, Alexa launched self-learning, which lets the system automatically make corrections based on context clues. Prasad again provides an example: Say you ask your Echo to “play XM Chill,” and the request fails because Alexa doesn’t catalog the station that way. If you follow up by saying “play Sirius channel 53,” and continue listening, Alexa will learn that XM Chill and Sirius channel 53 are the same, all on its own. “That’s a big deal for AI systems,” says Prasad. “This is where it’s learning from implicit feedback.”
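
A toy version of that feedback loop might look like the Python sketch below. The session format, the listening threshold, and the alias table are hypothetical, meant only to show how a failed request and a successful reformulation could be linked from usage alone.

```python
# Learning an alias from implicit feedback: if a failed request is immediately
# reformulated and the user keeps listening, remember the failed phrase as an
# alias for the successful one. Purely illustrative logic.
learned_aliases = {}

def update_aliases(session, min_listen_seconds=120):
    """session: list of (utterance, resolved_ok, seconds_listened) tuples."""
    for (utt, ok, _), (next_utt, next_ok, listened) in zip(session, session[1:]):
        if not ok and next_ok and listened >= min_listen_seconds:
            learned_aliases[utt.lower()] = next_utt.lower()

update_aliases([
    ("play XM Chill", False, 0),
    ("play Sirius channel 53", True, 600),
])
print(learned_aliases)   # {'play xm chill': 'play sirius channel 53'}
```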

The next frontier, though, gets a little trickier. Amazon wants Alexa to get smarter, obviously, and better at anticipating your needs at any given time. It also, though, wants Alexa to better understand not just what you’re saying but how you say it.

“When two humans are talking, they’re actually pretty good at understanding sentiment. But these systems are essentially clueless about it,” says Alex Rudnicky, a speech recognition expert at Carnegie Mellon University. “People are trying to develop capabilities that make them a little more sophisticated, more humanlike in their ability to understand how a conversation is going.”

Amazon already made headlines this fall over a patent that described technology allowing Alexa to recognize the emotions of users and respond accordingly. Those headlines were not glowing. A device that always listens to you is already a step too far for many; one that infers how you’re feeling escalates that discomfort dramatically.

Prasad says the ultimate goal for Alexa is long-range conversation capabilities. As part of that, it might respond differently to a given question based on how you asked it. And while it’s important to have these conversations now, it’s worth noting that a voice assistant that truly understands the subtleties of your intonations remains, for the most part, a ways off.

“If you look at the big five emotions,” Rudnicky says, “the one thing people have been successful in detecting is anger.”

Skill Set

As the number of Alexa devices has exploded, so too have the skills. Amazon now counts 70,000 of them in its stable, from quizzes to games to meditation and more. That’s seven times the number it had just under two years ago.

It’s here, though, that Alexa’s room for improvement begins to show. The assistant has gotten better at anticipating what skills people might want to use, but discovery remains a real problem. Not only do Alexa owners miss out on potential uses for their devices beyond a fancy kitchen timer, but developers also have less incentive to invest time in a platform where they may well remain invisible.

The answer can’t come entirely from deep learning, either. That can surface the most relevant skill at any given moment, but voice assistants have so much potential beyond immediate, functional needs. Think of skills like The Magic Door, an interactive fantasy game on Alexa that launched in 2016. If all you’ve used Alexa for is listening to NPR and checking the weather, it’s hard to see how the algorithm would alert you to its existence. And even more straightforward suggestions aren’t necessarily always welcome.

“It can be an engaging experience if we introduce customers to new skills and new capabilities, if it’s highly relevant to what they’re doing,” says Toni Reid, vice president of Alexa experience and Echo devices. “But you have to be really careful in those use cases, because it may be overload. It’s sort of the right time at the right moment, the right amount of content.”

Amazon will also need to figure out how to fend off Google, whose Google Assistant has closed the voice-control gap considerably despite a late start. Canalys Research estimates that 6.3 million Echo smart speakers shipped in the third quarter of this year, just ahead of Google’s 5.9 million smart speakers.

The race isn’t quite as close as those numbers make it seem; they don’t include third-party devices, an arena where Alexa dominates, and a three-month snapshot belies the huge installed base Amazon has built up over the past four years. Still, Google has advantages that Amazon can’t ignore.

“They had years of experience with AI, whereas Alexa was built from the ground up,” says Canalys analyst Vincent Thielke. “Because Google’s AI was so advanced, it was very easy to catch up.” Similarly, by virtue of Android, Android Auto, and WearOS, Google has more places it can seed Google Assistant. With the spectacular failure of the Fire Phone—also launched in 2014—Amazon’s mobile options are limited. The company is faring better in cars, but still lags behind Google and Apple in native integrations, which has led to the introduction of hardware add-ons like Echo Auto.

Still, Alexa has shown no signs of slowing down. There’s now Alexa Guard to watch over your home when you’re gone. There’s Alexa Answers, a sort of voice-assistant hybrid of Quora and Wikipedia. There’s Alexa Donations and Alexa Captions and Alexa Hunches and Alexa Routines.

It’s a lot. But if you want to know where Alexa is headed next, well, you know who to ask.

