2 Feb 2024

AI Revolutionizes Voice Interaction: The Dawn Of A New Era In Technology

It’s becoming increasingly common to find ourselves controlling and communicating with machines using our voices. This behavioral shift in how we interact with our most valuable and powerful tools has profound implications for our relationship with technology and many aspects of broader society.

This recent move towards a voice-controlled world has accelerated sharply with the arrival of generative AI and large language models. Rather than the stilted, often frustrating conversations we’ve become accustomed to having with machines like Alexa or Siri, generative AI offers naturally flowing, context-sensitive, two-way communications.

One person who’s carefully considered these implications is author and WillowTree president Tobias Dengel. In his recent book The Sound Of The Future – The Coming Age Of Voice Technology, Dengel explores the ways in which the world is likely to change as the final technical barriers to programming and controlling machines come crashing down.

Why Is Voice So Transformational?

Dengel argues that this shift in how we communicate with machines has far wider implications than simply letting us hold conversations with them.

The more contextual and flowing nature of natural language means we will use technology far more efficiently and that it will become far more accessible to many people.

In our conversation, he tells me, “Human beings can speak three times as fast as they can type on a keyboard – probably five times as fast as the average person can type on a mobile device … that’s the core insight.

“I guarantee you that when you take something that takes three minutes and you can now do it in 15 seconds, the world is going to change overnight.”

His position is that the world is about to shift very rapidly towards adopting a model where voice is our primary interface with machines. In other words, for complex machine operations – like programming computers – we no longer need to learn the language of machines because they’ll just speak ours.

Accessibility is a huge issue here. The move to voice will democratize technology, meaning that a larger and more diverse range of individuals can put complex systems to work to solve problems. Dengel’s position is that this isn’t just about convenience; it’s a fundamental change in our relationship with technology.

As he says, “You’re going to see every interface between humans and machines transition to voice-first.”

Voice In Action

Throughout his book – and our conversation – Dengel provides examples of how this change is already taking place.

Of course, these include the voice assistants that we all have in our homes and on our phones, but he makes it clear this trend will go far beyond Alexa and Siri.

One of his favorite examples is Cathay Pacific, which he says has implemented natural language technology in an assistant tool designed to help with routine maintenance and cleaning of aircraft.

“They now have a voice that says, ‘Hey, seat 13C has a broken armrest’, when they’re doing their thing.”

He also highlights the voice-control technology developed for military aircraft, which is now being deployed in civilian aviation.

“Accidents all happened because the pilots didn’t know what the plane was doing and couldn’t interface with it – if they had a voice that said, you know, turn off the autopilot, do XYZ, go there, whatever it was, it would have avoided those accidents.”

As an example of how it could revolutionize day-to-day technologies, he suggests that banking apps will vastly improve when users simply ask for what they want and get results rather than navigating hundreds of possible functions on a small screen.

He also mentions one WillowTree customer – a large soft drinks manufacturer – that has created voice systems that let it order replacement parts for any of its machines or dispensers in vending machines or restaurants using voice alone. This saves the hours previously spent searching catalogs for location and item codes.

Ethics And Challenges

The impact that this change is likely to have on society is hard to overstate. One of the biggest questions is about its implications for human jobs and employment.

“Everything is showing us that there will be more jobs,” says Dengel, “but there will be disruptions.

“And I think this is where policy decisions, government, has to come in and support.”

Most obviously at risk, he believes, are roles such as call center operator, which are already being made redundant by conversational AI tools.

But this will be offset, he argues, not just by the supposed new jobs such as “prompt engineer” that will be created but by the multitude of ways in which we’ll be able to create value using AI.

Just as serious are the issues raised around security. We’ve already seen AI voice spoofing being used by fraudsters and blackmailers. There’s a real risk that these attacks will scale as AI becomes cheaper and more accessible, leading to more victims.

However, Dengel isn’t so worried about the more far-fetched concerns that are sometimes raised.

He says, “People talk about AI running amok and battling humans … I’m not super-worried about that, at least in our lifetimes.

“ChatGPT is awesome, but it can’t even change your mailing address for your American Express card right now because it’s not wired into the system. But it can be used for evil pretty effectively.”

Preparing For The Voice-Powered Future

So what can we do to make sure we’re ready for this universal shift to voice-controlled tech and natural language conversations with machines?

Dengel suggests the answer lies in meeting the challenge head-on. This means drawing together teams made up of technologists, engineers, designers, communications experts and business leaders. Their core focus is to identify opportunities and potential risks to the business, allowing them to be managed proactively rather than reactively.

“That’s always the first step,” he says, “because you start defining what’s possible, but you’re doing it in the context of what’s realistic as well because you’ve got your tech folks involved as well … and then making a roadmap.”

It’s a “workshop” approach pioneered by Apple and adopted by various tech giants that have found themselves at the forefront of an emerging wave of transformation. But it’s equally applicable to just about any forward-looking business or organization that doesn’t want to be caught off-guard.

Dengel says that addressing a group of interns recently, he told them, “I wish I were in your shoes – the next five years is gonna be more innovation than there’s been in the last five or maybe the last 20 years, as conversational AI and generative AI come together. It’s just an amazing experience and an awesome time.”
