“Voice-First” Intelligent Assistance: A Closer Look

By Dan Miller on February 3, 2017

John Wooden, the UCLA basketball coach called “The Wizard of Westwood”, is credited with saying, “Never mistake activity for achievement.” It is an aphorism that should speak for itself but, in the world of speech-enabled or voice-first Intelligent Assistance, it has taken on new meaning.

Whether you are a digital commerce specialist, customer care professional, marketing maven, contact center manager or an analyst like me, you are awash in evidence of activity surrounding chatbots. Facebook Messenger, with its base of “active monthly users”, has transformed itself into a “Bot Magnet”: the flame that attracts tens of thousands of ambitious developers to add their interactive offerings to a roster of 40,000 automated assistants.

Messenger is only one of a dozen or so platforms for conversational commerce. It started as Facebook’s answer to the highly popular WeChat, owned by Chinese e-commerce specialist Tencent and claiming over 850 million users of its own. They are joined by a plethora of other popular messaging platforms, including Snapchat, Kik, Viber and Telegram, as well as well-established business-oriented collaboration platforms like Skype (Microsoft), Spark (Cisco) and Slack (independent).

On the “voice-first” side of the ledger, Amazon’s Alexa is the Messenger equivalent. It has generated a formidable set of statistics surrounding its expanding installed base and its set of “skills” (the spoken equivalent of bots or apps). In its “2017 Voice Report”, VoiceLabs, which specializes in “Voice Experience Analytics”, estimates that the number of homes with an Amazon Echo, Dot or Tap, combined with owners of Google Home, has reached roughly 8.5 million. It then estimates that an additional 24.5 million devices will be sold in 2017, bringing the total number in service to 33 million.

That population may pale when compared with the billion or so people that regularly use FB Messenger, but it is proving sufficient to attract the attention of thousands of developers as well as purveyors of branded, conversational service offerings. Amazon has proudly reported that the ranks of approved skills for Amazon Alexa swelled from 136 to 7,000 in 2016.

VoiceLabs deflated (but did not burst) the voice application bubble by providing data indicating that both awareness and repeated use of the applications invoked by talking with Amazon’s Alexa or Google’s Assistant are, in a word, dismal. Nearly 70% of Alexa’s skills had generated one or fewer reviews, indicating a general lack of awareness. Then, as Jason Del Rey notes in Recode, a major drop-off in active users occurs by the end of week two. Skills have only a 3% chance of retaining interest. According to Del Rey, mobile apps on Android- and iOS-based smartphones retain active users at a rate in excess of 11%.

Meanwhile, Echo owners follow usage patterns and preferences that are reminiscent of the early days of Audiotex, the phone-based audio information services. Leading the way in usage are News (primarily through a skill called “Flash Briefing”), Games & Trivia, Education, Lifestyle and Weather (which actually looks to be in a tie with Novelty & Humor). If you threw in Sports Scores, Lottery Numbers and Daily Horoscope, you’d swear you were dealing with the early voice portals like Tellme, BeVocal, HeyAnita or Quack.

But What do People Really Like?
When VoiceLabs asked people what they “really like about their Echo or Google Home,” a totally different picture appeared. Nearly half said they like streaming music and audiobooks. Controlling smart home devices tied with playing games and entertainment, with 29.1% saying that’s what they really liked. Listening to news and podcasts, which is a frequently invoked function, resonated with only 26.1% of the respondents.

Responses to these questions deliver a harsh truth to enterprises and brands as they look to incorporate speech-enabled devices into their collaboration or marketing plans. Only 2.7% said they really like “brand content” through their assistant, while 1.1% liked “business services” or what they think of as business services.

As I wrote back in March, Amazon deserves a lot of credit for growing the population of end-users who are comfortable talking to a device and using their own words to describe their intent. The Echo’s array of microphones does a tremendous job of picking up voices from across a room, and its speech recognition has proven to be highly accurate. Amazon has also provided a sufficient base of users to enable developers and product planners to refine their offerings based on what we learn from today’s early adopters.

Alexa and Google Assistant share some of the challenges first confronted by Apple when it introduced Siri. The “strangeness” is gone. Speech recognition is more accurate. Yet it remains hard to make users aware of all the things they can do with their assistants and then popularize them through promotion, advertising and constant refinement.

Encouraging repeated use has always been a challenge because the apps or skills don’t work consistently and, as many people have pointed out, they don’t seem tailored, personal or particularly trustworthy.

Enter Google Assistant in the Home and on the Phone
Just as Amazon’s Alexa could learn much from Siri, the product planners for Google Assistant could go to school on Alexa, as well as on the myriad data and metadata that come from running the world’s starting point for search, digital and real-world activities. Sunil Vemuri, Product Manager for Google Assistant, drove this point home during his keynote at the Conversational Interaction Conference (January 30-31) in San Jose.

In a shout-out to technological advancements, Vemuri made it clear that vast improvements in speech recognition and the ability to understand or derive meaning from what people say gave Google the confidence to introduce new capabilities. Ubiquity and personalization are also a very big deal. As Vemuri put it, “It will be everywhere you are,” and “We made it just for you.”

Characterizing the Assistant as “playful and a little geeky”, he outlined four pillars of product and service design: it is Conversational, Contextual, Personal and Fun. “Actions” are the Google Assistant equivalent of Skills. Where Amazon and Alexa appear to have an endgame built around support of commerce and transactions, Google with Assistant harks back to Siri’s original intent, to be an “action engine” rather than just a search engine, with the initial spoken queries culminating in a purchase or other transaction.

Vemuri’s live demos were impressive. Google Assistant used personal information like location and the content of an individual’s calendar “with permission” to offer relevant suggestions, news stories, search results and appointment making in ways that felt like an omniscient easy-listening radio station.

As we enter the second month of 2017, the demos were great and the technology, in my opinion, no longer gives rise to first-order problems. Now you’ll hear talk that focuses on “the ecosystem” and “monetization” as if creating a conversational medium to support commercial activity and communications is just a business. Rest assured, it’s going to create a number of businesses.

Harking back to Wooden’s quote, we’ve fomented activity on many levels of a layered ecosystem. We’ll know we’ve achieved something when – as happened with the Internet and World Wide Web – software developers, equipment makers and service providers introduce features and functions that naturally and seamlessly improve everyday activities…at scale.
