Roku Is Doubling Down on Voice, May Be Building Smart Speaker (EXCLUSIVE)

speaker grille
Shutterstock / KAE CH

Streaming video device maker Roku has been working on advancing voice control while also ramping up its own audio efforts, according to a number of new job offers posted by the company as well as other public documents reviewed by Variety. Together, these documents suggest that the company may be building an Amazon Echo-like smart speaker or similar voice-centric products.

A Roku spokesperson declined to comment when contacted for this story.

Roku recently posted multiple job offers for audio-centric roles, including a “Sr. Software Engineer, Audio” as well as a “Sr. Software Engineer, New Products, Audio (Expert).” The latter is being tasked with “building a center of audio excellence” at the company. “We are looking for a senior software engineer with extensive experience in audio, and audio software development,” the job posting reads, continuing: “You must have a proven track record of developing and porting software for new hardware platforms from prototype to mass production.”

At the same time, the company has been looking to hire multiple team members to build out its voice control capabilities, including a job offer for a “Sr. Interaction Designer, Voice,” as well as a “Voice User Interface Designer” who is supposed to become Roku’s “expert on all things voice related.”

On the surface, none of these job offers alone is conclusive proof that Roku is branching out into the smart speaker space. Some of Roku’s current video streaming products do have some rudimentary voice integration. Users of higher-end Roku streaming devices can query the company’s universal search with a microphone-equipped remote control. Likewise, some streaming boxes have dedicated optical audio outputs, and video streaming devices in general are often connected to AV equipment, which could explain attempts to improve audio quality.

However, the job offers do include some peculiar clues. “Voice user interface design” is an industry term that is commonly used for screen-less interaction design that relies entirely on a back-and-forth of voice commands and spoken feedback. And the audio expert is supposed to know about audio concepts including total dynamic distortion, which is commonly used to describe the distortion of a loudspeaker, as well as de-reverberation, which is an important factor in making voice control work with far-field microphones, meaning microphones designed to pick up voice commands from across the room.

But it’s not just new job offers that strongly suggest a new focus on voice and audio products. Roku has also already hired a number of experts in these fields over the past couple of months, including Tyler Bell, whose LinkedIn profile states that he is leading product management for voice, natural language understanding, automated speech recognition and artificial intelligence at the company, all key to making a voice-controlled device.

Then there is Roku director of engineering Hari Ramakrishnan, who is working on “far field voice and audio engineering,” according to his LinkedIn profile. And finally, there is Jim Cortez, who is “working on voice interfaces” at Roku, according to his LinkedIn profile.

What is interesting about Cortez is not just his field of work, but also his background: In 2015, he co-founded and became the VP of engineering of Ivee, a startup that made a device called the Ivee Voice, which he describes on his LinkedIn page as “a hardware device home voice assistant with a voice-centric interface. The device could set alarms, answer general knowledge questions, play music, and much more.” In other words: It worked just like an Amazon Echo.

The Ivee smart speaker was one of the first Amazon Echo competitors. Roku has since hired its co-founder and VP of engineering to work on voice interfaces.
Courtesy of Ivee

There are a few reasons why it would make sense for Roku to branch out into the territory of voice-controlled devices. Roku’s current voice efforts are hampered by the company’s business model, which increasingly relies on partnerships with TV manufacturers to build TV sets powered by Roku’s smart TV operating system. These TV sets are a great way to grow Roku’s audience, but they’re generally a bad vehicle for product innovation.

Roku has teamed up with a number of companies aiming for the lower end of the market. Companies like TCL produce cheap TVs to compete with big players like Samsung and LG on price. With razor-thin margins, these companies don’t have any incentive to ship their TV sets with more expensive remote controls that have integrated microphones, so their TVs can’t be easily controlled with voice commands even if Roku were to add this functionality to its software in the future. Adding a cheap voice-equipped speaker to the mix that could also be used to control a Roku TV could solve this issue.

At the same time, there is some urgency for Roku to show a commitment to voice. The growth of Amazon’s Echo, and to an extent Google’s competing Home speaker, has turned voice into a key component of the smart home. Roku is set to go public later this year, and investors may want some assurance that the company isn’t oblivious to this trend.

This also wouldn’t be the first time for the company to dabble in audio products. About a decade ago, Roku sold a dedicated audio streaming device dubbed the “Roku SoundBridge” that was primarily designed to listen to online radio stations. But without popular online music services, the device was simply too early. Roku discontinued its SoundBridge product line in 2008, focusing on video streaming devices instead.

Roku already made an audio-only product, the Roku SoundBridge player, about a decade ago.
Courtesy of Roku

So what will Roku’s second generation of audio products look like? It’s quite possible that the company is building its own version of the Amazon Echo — a standalone speaker that’s entirely controlled via voice.

It’s also conceivable that the company may be building something more akin to Amazon’s latest Fire TV iteration, which is reportedly a streaming video box that doubles as an audio streaming device when the TV is turned off, and packs microphones for voice control. Roku may even decide to put its technology straight into a soundbar, and do away with the need to add another box or device to your living room altogether.

Whatever the product may be, we are likely to see the first signs of it sooner rather than later, as Roku will want to get the word out before it goes public in the coming months.