The Real Story of How Amazon Built the Echo

The talking speaker started as part of a secret augmented-reality project and ended up as a surprise hit.

Telling Jeff Bezos he’s wrong is always a frightening proposition. In the fall of 2014, though, a small group of the men and women building Amazon’s new voice-controlled speaker felt they needed to confront the CEO. The release of the speaker was looming, and for the most part, things were falling into place. The device looked good, its voice recognition software was improving quickly, and even the boxes it would ship in had been designed and assembled. But there was a lingering issue with the name printed on those boxes: the Amazon Flash.

Many people who worked at Lab126, Amazon’s hardware division, hated the name, according to two former employees. Bezos, on the other hand, was strongly in favor. And there was another worry. A core feature of the device is a “wake word” that cues it to begin taking voice commands when spoken. One of the two words being considered was “Alexa.” Bezos thought the best word would be “Amazon.” This presented a challenge, because people say that word a lot. A common opinion within Lab126 was that the project was hurtling toward a potential disaster: The speakers would wake upon hearing Amazon ads on television and commence buying random stuff from the Internet.

Generally, the engineers and product managers at Lab126 quelled their own dissent before it reached Bezos, instead concentrating on giving the boss what they thought he wanted. “We spent so much time trying to anticipate what Jeff would do or say, and read into little words he would say in meetings,” said one former employee. “It would lead to so much additional work.”

Making matters worse was the overall atmosphere at Lab126 that summer. Amazon had launched the Fire Phone, its competitor to the iPhone, in July. In the home stretch of the speaker’s development, the Fire Phone was bombing, and the lab was in a period of reckoning. People were moving on to new projects or quitting altogether. It felt like Lab126 was hitting bottom.

Weeks before the speaker was set to ship, the dissidents confronted Bezos. He was amenable to changes: The device would be called the Echo, and its wake word would be “Alexa.” Users can now choose to change it to “Amazon” or “Echo” if they want. The Amazon Flash boxes were destroyed, and the first round of speakers was shipped off in November.

In a gadget landscape dominated by rectangular touchscreens, the Echo is something different. The speaker is a screenless cylinder, just over 9 inches tall and 3.25 inches in diameter. It can play music, and also answer basic household questions like how many teaspoons there are in a cup. The only way to interact with the Echo is to talk to it. It’s always listening for its wake word.

When the Echo launched, Amazon’s critics jumped to mock the company. Some called the speaker a useless gimmick; others pointed to it as evidence of Amazon’s Orwellian tendencies. Then something weird happened: People decided they loved it. Amazon never releases data about how its products are selling, but Consumer Intelligence Research Partners issued a report this month saying that Amazon had sold more than 3 million devices, with 1 million of those sales happening during the 2015 holiday season. About 35,000 people have reviewed the speaker on Amazon.com, with an average rating of 4.5 stars out of 5.

Perhaps even more important to Amazon is how dozens of independent developers are writing apps that work with the speaker’s voice controls. You can use Alexa to turn off the lights, ask it how much gas is left in your car, or order a pizza. This is doubly surprising given how far behind Apple and Google the company was in the area of voice control when it started. The Echo may have seemed like a superfluous toy at first, but it now looks like a way for Amazon to become the default choice in a whole new era in the way people interact with computers and the Internet.

“We want to be a large company that’s also an invention machine,” Bezos wrote in a letter to investors this month. The Echo shows what happens when Amazon achieves that goal. Bezos declined an interview request to discuss the speaker’s development, but 10 current and former Amazon employees agreed to talk, mostly on the condition of anonymity because they hadn’t been authorized to do so by the company. This is the story of what they built.

Amazon created Lab126 in 2004 to build the Kindle e-reader. The lab’s name is a reference to the alphabet, with 1 representing the letter A and 26 representing Z. People in the lab sometimes refer to the Kindle as Project A. Project B was the Fire Phone. Work on the Echo—Project D—began in 2011. At the project’s peak, there were several hundred employees in Seattle, the San Francisco Bay Area, and Cambridge, Mass., who worked on D.

The idea for the Echo was an offshoot of Project C, and many of the early employees on the Echo moved over from C. Amazon remains particularly eager to keep this project a secret, even though work on it has stopped. But a sense of the focus and scope of the idea can be gleaned from patent applications filed by engineers at Lab126.

The first activity showed up on Dec. 21 and Dec. 23, 2010, when Lab126 employees applied for five patents whose titles all included the phrase “augmented reality.” Augmented reality—hologram-like displays projected into the physical world—was already a buzzword at the time. An e-commerce company wouldn’t seem like an obvious leader in the field. But Amazon’s patent applications show it was pursuing a vision that goes far beyond anything that exists as a commercial product even today, almost six years after the first patent applications were filed.

One of the initial patent applications described a device that would display augmented-reality images that people could interact with; another proposed tracking people’s movements and responding when they clapped, whistled, sang, or spoke. Taken together, Amazon’s patents during this period point toward a vision of a home where virtual displays follow people around as they wander from room to room, offering a range of services in response to voice commands and physical gestures. Bezos himself is listed as an inventor on two patents related to voice control or augmented reality from this period.

Amazon kept its fingerprints off the original patent applications. Instead, Rawles LLC was named the assignee—the organization that would own the patent. Rawles had been incorporated in Delaware just two weeks before Amazon started filing patent applications related to augmented reality. In the years since, Lab126 employees have applied for dozens of patents listing Rawles as the assignee, all relevant to augmented reality or voice control. No one has ever claimed Rawles as an employer on LinkedIn, and its correspondence with the U.S. Patent and Trademark Office has been handled by lawyers based in Washington state, where Amazon is headquartered.

“We certainly tried to keep things quiet,” said Dave Limp, Amazon’s senior vice president for devices. “Until the product is ready for customers, the only people that are going to be advantaged by it are competitors and possibly the press.”

Stowing the patents with Rawles didn’t make them completely secret, but it made them harder to find. The ploy seems to have worked. While speculation around Amazon’s progress on a smartphone and a set-top box built in the years before those projects launched, its ambitions for a living room powered by augmented reality remained a secret. On a single day last November, Rawles transferred 106 patents to Amazon. A month later, the U.S. Patent and Trademark Office approved one of the patents, inspiring a small round of coverage in the press. By that time, the augmented-reality project had been put on ice and the Echo was out in the world.

Some employees who worked on Project C lament its demise as a sign of Amazon’s downgraded ambitions; others say the company just realized it was time to abandon an idea that was too wacky for prime time. By one person’s account, the project wasn’t cut loose for good until the failure of the Fire Phone led Lab126’s leadership to question its ability to pull off the biggest projects. But the Echo was broken off as a standalone project well before then, out of a desire to build a commercial product that was a little less far-reaching.

As originally conceived, the Echo was simpler and cheaper than the speaker in its current form. One person who worked on the project remembers that the company expected to be able to manufacture the devices for about $17 and sell them for $50. It now costs $180, and Amazon is believed to take a loss on each sale, once packaging, shipping, and marketing are factored in. The company declined to comment.

At the time, it was still unclear what the speaker’s main purpose would be. Of course it’d play music, but why else would someone want a speaker you could talk to? Bezos had lots of ideas. “There was almost an irrational expectation around the functionality of the device,” said one person who was at Lab126 at the time. “Jeff had a vision of full integration into every part of the shopping experience.”

Amazon hired a handful of people who had worked at the speech recognition company Nuance, and bought two startups, Yap and Evi, that were also in the voice response business. Engineers in Cambridge dove into building a speech recognition system that could match Google’s or Apple’s, a daunting task considering the head start those companies had from building services for their smartphone software.

Once Amazon’s engineers started building the speaker, they realized it would need more processing power than they’d anticipated. They swapped out the microcontroller, the kind of simple computer used to control devices like remote controls, with a microprocessor, which could handle more complex tasks. Even as these fundamental changes went on, the lab’s leadership was convinced the speaker was almost ready. For three consecutive years, the product was expected to ship within six months. The $50 target price seemed more and more far-fetched.

People working on different projects at Lab126 aren’t usually clued in to what other projects are under way, so for several years the Echo team had no idea that other people at the lab were building a phone, and vice versa. When Bezos unveiled the Fire Phone in June 2014, the speaker project was progressing nicely. Then the phone’s flop threw everything at Lab126 off center.

Amazon’s official line on the Fire Phone is that the occasional face-plant is part of the job. In his recent letter to shareholders, Bezos referred to failure as the “inseparable twin” of invention. Limp said the team took solace in the popularity of the Kindle and Fire TV. “Being able to see products that are resonating unbelievably well with customers is always a good contrast when you have one that doesn’t work as well,” he said.

People who worked at Lab126 at the time described the period as acutely painful and damaging to the division’s collective self-confidence. Amazon didn’t immediately lay off employees who worked on Fire Phone; instead, a handful of new managers joined the Echo team, with different ideas and varying levels of enthusiasm about the speaker. This grated on some people who had been on the project since the beginning. And it didn’t help that the stakes were now raised, since the speaker had to redeem Amazon’s reputation. To top it off, all of this took place amid creeping doubts within the lab: Maybe Amazon couldn’t make desirable high-end consumer gadgets after all.

The Echo went through several key changes in the eleventh hour. The speaker had to emit sound and listen for it simultaneously, a challenge that had preoccupied engineers throughout the development process. What if the music was so loud it drowned out people’s voices? Early in the process, engineers created prototypes for smaller devices that looked like hockey pucks that could be placed around the house to listen for commands when people strayed too far from the main speaker. The lab’s leadership pushed that idea aside to focus on the development of the main device, but it recently reemerged as the Echo Dot, which Amazon introduced last month and is currently selling only on a limited basis.

In the fall of 2014, there was still disagreement over whether the Echo’s hearing was good enough on its own. Bezos and his top deputies were adamantly opposed to relying on any form of input other than voice control within the speaker itself. They saw it as cheating. Some engineers disagreed, and pushed for a remote that people could speak into from anywhere in the house. Luckily, the company had just made such a remote for the Fire TV. The two sides reached a compromise, agreeing to send the first batch of speakers with a remote included. They’d gather information about how often people used it and tweak the product accordingly. Apparently the fears were overblown. The people using the Echo in the wild almost never used the remote, and it was quietly removed from the box in later shipments.

Connecting the Echo to Internet-enabled lightbulbs and thermostats made by other companies hadn’t been a focus within Lab126 until late 2014. On a lark, an engineer had rigged the speaker to work as a voice controller for a streaming TV device. It was a forehead-slapping moment for Bezos, according to one employee who worked with him directly. “It was something he grew to embrace, aggressively,” the person said. Amazon’s vision for the Echo now relies heavily on the speaker serving as a hub for the so-called smart home. Limp jokes that it’s only a matter of time before some enterprising developer writes a program to use the Echo’s voice controls to flush the toilet.

Many of the people who helped create the Echo no longer work for Amazon. They gave various reasons for leaving: a sense of closure after finishing a big project; a lucrative job offer from a competitor or the temptation to start something on their own; burnout after the long workdays; bitterness after years of the blood sport of internal politics. None of the former employees interviewed for this story quibbled with Amazon’s reputation as a brutal workplace. When asked whether it was inherently “fun” to work on a product like the Echo, one former employee scoffed that, to describe Amazon, no one had ever used that word with a straight face.

The success of the Echo is luring in their replacements. In February, Amazon held an open recruitment event at one of its buildings in downtown Seattle. Hundreds of programmers and engineers showed up—many from Microsoft. They listened to Amazon executives give speeches about the company’s ambitious plans to use its voice-controlled speaker as a link for all the Internet-connected appliances coming to market. “Now is the time for the smart home to be real,” Charlie Kindel, Alexa’s smart home director, told the crowd.

With the Echo, Amazon has figured out a way to insert itself into customer interactions with other devices and services. Part of this is just good timing. The tech industry has been searching for the next big computing platform after mobile. Investing in some combination of voice control and artificial intelligence was prudent, especially given that no one else has quite figured it out yet. Apple, Google, and Microsoft all have their own virtual assistants, but they’ve designed them as a way to make their smartphones work better. The Echo is a bigger departure from the past.

In a way, its success is a result of the Fire Phone failure. Since Amazon had already killed off its smartphone, its voice-control efforts were bound to be focused elsewhere. And while smartphones are touted as the pinnacle of convenience, taking one out and clicking an app to find out the weather while buttoning your shirt actually seems rather work-intensive compared with just yelling the question across the room.

Alexa has more than 500 skills—Amazon’s word for software programs that create voice controls that have the speaker check your bank balance, start playing a Pandora station, or make your child’s favorite animal noises. The company keeps an internal list of customer suggestions for new controls, ranking each one according to popularity to decide the order in which it will pursue them.

The next big task for Amazon is to begin tying services together in new ways, says Julie Ask, an analyst at Forrester Research. She says being able to tell the Echo to call an Uber is fun, but incremental. “In five years, my Echo will say, hey, it’s about time to go to the airport. Should I get you a car? And I’ll just say yes,” she said. “That’s the difference between where we are today and where we want to be.”

As a company, Amazon would rather look forward to those challenges than back at the ones that plagued the creation of the Echo. Limp seemed most comfortable describing the Echo’s development in general terms, downplaying any epiphanies along the way. For him, the most salient part of the process was reducing latency—the time between when you ask the Echo a question and when it responds—from about 9 seconds to 1.5 seconds. He professes ignorance about the details of any last-minute angst over what to call the thing, remembering only that in the end everyone ended up in agreement.

“I can assure you,” he said, “Jeff loves the Echo name.”

—With Spencer Soper