This year’s Voice Global Conference went virtual, like many conferences and events this year, because of the global COVID-19 pandemic. The beauty of an event of this magnitude going virtual is that it’s free to all! The proliferation of free information allows practitioners, users, and brands to come together and have a dialogue about a technology that’s taken off: Voice.

I heard many talks across several verticals, and one in particular stood out: the state of voice presentation delivered by Bret Kinsella. His data-rich discussion covered a lot of ground in 25 minutes. Here are some of the key points that jumped out:

1 Adoption of Voice Is Staggering

Adoption is key for any new technology. Distribution is often a key inhibitor, but this has not been the case with voice. Apple, Google, and Baidu have reported hundreds of millions of devices using voice, and Amazon has 200 million users. Amazon has a slightly more difficult job since it is not in the smartphone market, which gives Apple and Google greater voice assistant distribution.

But are people using devices? Yes! Google said recently there are 500 million monthly active users of Google Assistant. And not far behind are active Apple users with 375 million. Large numbers of people are using assistants, not just owning them. That’s a sign of technology gaining momentum — the technology is at a price point and within digital and personal ecosystems that make it right for user adoption.

2 Adoption Happens Unevenly

When we look at the adoption cycle, voice is evolving in different stages. Measured by monthly active users, we are still in the early stages of voice on devices such as smartwatches. But voice use on smartphones has reached half the U.S. population. Voice search is mature, with two-thirds of the U.S. population using it because they’re comfortable with it. As with most technologies, change happens unevenly. “Voice first” doesn’t mean everyone is using Voice the same way; people use it in a breadth of ways, which speaks to its applicability across contexts.

3 Voice Use Differs by Surface

Voice also differs by how people access it. Smart speakers get most of the attention in the news media, but not most of the use by consumers. About 90 million U.S. adults have access to smart speakers, but more than twice that number have tried a voice assistant on smartphones.

And people use surfaces differently. The most common uses for smart speakers are streaming music, followed by asking questions and looking up the weather. For smartphones, the most popular use is asking a question, followed by answering a phone call and finding directions. In cars, it’s making a phone call, followed by finding directions, while asking questions drops to sixth. In addition, drivers may be using in-car voice assistants like “Hey Mercedes” and not just smartphone voice assistants, Alexa Auto, or Android Auto. For more insight: Juergen Schmerder discusses some of the advancements in the Mercedes Car Voice Assistant.

Voice is not amorphous. Context and use case matter!

As we think about the different surfaces and devices voice is being adopted on, we must also now think about contactless interaction amid the global COVID-19 pandemic, and about how “new normal” behavior will evolve now that health and safety influence how people pay for goods and services. Mark Jamison, Global Head of Innovation at Visa, talked in his fireside chat about voice and its impact on the new normal as it relates to payments. Contactless interaction will become more pervasive, and this continues to push businesses to expedite their digital transformation efforts, as the former ways of doing business (read: operating models and business models) need to evolve with societal and consumer behavior.

4 Voice is Global

It’s all too easy to think of voice in the context of the U.S. market, but in fact, voice is a global phenomenon. China accounts for 30–40% of smart speaker sales, and its total installed base is catching up. The rules for using voice are also different in China, where voice is usually tied to a super app’s ecosystem.

Meanwhile, in the United States, the growth rate of smart speaker purchases is slowing. One possible reason is privacy concerns, sparked by a spate of negative articles in 2019 about big tech companies using third-party contractors to listen in on consumers’ conversations in order to test and improve voice device performance. Media coverage likely made people wary of using the devices as frequently as they used to, although sales remain robust.

Regional differences become even more striking when you examine the different assistants catching on globally. The big voice assistants such as Alexa, Cortana, Google Assistant, and Siri do not speak for the world. For example, different general-purpose assistants are proliferating in Asia and Europe.

When you think of the players outside the United States, you get a more complete view of how different voice assistants are taking hold.

This is a global movement in technology adoption and consumer behavior, which makes it exceedingly exciting for businesses around the world to be involved with and to continue exploring.

5 Voice Assistants Are Exploding

One of the biggest stories emerging in Voice is the proliferation of different types of voice assistants such as:

· Niche assistants such as Aider that provide back-office support.

· Branded in-house assistants such as those offered by BBC and Snapchat.

· White-label solutions such as Houndify that provide lots of capabilities and configurable toolsets.

With all these (perhaps commoditized) voice experiences, remember that value gets created from the experience and relationship with the user. Voice design and voice user interface (VUI) creation still matter greatly, and they should. It’s far too easy to create poor voice experiences. A poor voice user experience is frustrating for users and more harmful to a brand than a bad text-based website, because a voice-based experience is less forgiving. With a poorly designed VUI, the user lacks a way to decipher the content or information further. “Where do I go from here?”, “That’s not what I asked,” and “I’m not sure what to do with that information” are statements that VUI designers do not want to hear. And this assumes that the user was understood by the automated speech recognition (ASR) and natural language understanding (NLU) and received a response from the voice application at all. All of this decreases the user’s trust in the medium and pushes them back to, say, websites or phone calls. As a result, the user might not want to interact with the business via the voice interface again. Users are quick to try voice and then decide they like the old way better because it’s more reliable or because they know how to navigate it. Adobe is empowering creatives to get past these experience hurdles faster and smarter, as we heard from Mark Webster, Director of Product at Adobe.

With so many assistants proliferating globally, Voice will become a commodity like a website or an app. And that’s good. Soon (read: over the next couple of years), adopting voice will be table stakes for a business that wants to offer the lovable experience users expect. Consider that feeling you get when you realize a business doesn’t have a website: it makes you question its validity and reputation for quality. Voice isn’t quite there yet, but it’s moving in that direction.

Bret Kinsella’s talk resonated throughout the conference, especially as I considered the many ways voice is being adopted in healthcare now. I heard many presentations on voice use cases for wellness care, which is especially important in 2020 as we become more health and safety conscious, but also remote. In fact, it’s fascinating to consider how voice might evolve to help people manage life during COVID-19. Consider, for example, a voice app (with IoT sensors) that warns you when someone is getting too close to your six-foot zone of personal safety, or an app that reminds you to wear your mask as you go out in public.

The possibilities for voice are ever-expanding — getting smarter, more personalized, in more contexts — especially in how it fits into a brand’s digital or experience ecosystem. Start investigating your voice ideas by running a Voice Design Sprint. It’s a new world, and Voice is shaping it.

Mark Persaud

Practice Lead, Immersive Reality