Google Duplex: Technical limitations, ethical implications, and market ramifications

by Lewis Leong on Aug 28, 2018

There is certainly a lot of buzz around Google Duplex, the most human-like artificial intelligence (AI) we’ve seen yet. But many questions remain unanswered. Is it a breakthrough technology with the power to change the way users and businesses interact? Or is it a pending annoyance on the level of robocall hang-ups while you’re trying to enjoy dinner? Should developers and marketers be thrilled with the idea of an AI-powered “voicebot” that can deal with the mundane conversations of everyday life? Beyond that, is Google Duplex even viable on an enterprise or global scale?

To find out, let’s take an in-depth look at Duplex and what it could mean for app developers and marketers alike.

A peek behind the curtain

Duplex’s smooth-sounding outward persona is the product of a recurrent neural network (RNN). RNNs are not new tech; they’ve been around since the early 1980s. One of the earliest versions of an RNN, the Hopfield network, was introduced as a model of associative memory, an early step toward systems that can interpret and react to human communication.

According to Google, Duplex was initially trained “on a corpus of anonymized phone conversation data. The network uses the output of Google’s automatic speech recognition (ASR) technology, as well as features from the audio, the history of a conversation, and the parameters of a conversation.” This allows Duplex to make realistic-sounding phone calls on your behalf, such as booking a dinner reservation.

Google Duplex ASR diagram

Source: Google – “Incoming sound is processed through an ASR system. This produces text that is analyzed with context data and other inputs to produce a response text that is read aloud through the TTS system.”
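To make the caption concrete, here is a minimal Python sketch of a single conversational turn moving through the three stages it names: ASR, a context-aware response step, and text-to-speech (TTS). This is an illustration only, not Google’s implementation. The names CallContext, run_turn, fake_asr, fake_respond, and fake_tts are all hypothetical, and the stand-in components just pass text around so the example runs without any speech libraries.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class CallContext:
    """Hypothetical context object: the goal of the call plus the turn history."""
    goal: str
    history: List[str] = field(default_factory=list)


def run_turn(
    audio: bytes,
    context: CallContext,
    asr: Callable[[bytes], str],
    respond: Callable[[str, CallContext], str],
    tts: Callable[[str], bytes],
) -> bytes:
    """Process one turn: incoming audio in, synthesized reply audio out."""
    heard = asr(audio)                       # 1. speech -> text
    context.history.append(f"them: {heard}")
    reply = respond(heard, context)          # 2. text + context -> response text
    context.history.append(f"us: {reply}")
    return tts(reply)                        # 3. response text -> speech


# Stand-in components (assumptions for illustration, not real speech models).
def fake_asr(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_respond(heard: str, context: CallContext) -> str:
    if "what time" in heard.lower():
        return f"I'd like {context.goal}."
    return "Sorry, could you repeat that?"


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


if __name__ == "__main__":
    ctx = CallContext(goal="a table for two at 7 pm")
    audio_out = run_turn(b"What time would you like?", ctx,
                         fake_asr, fake_respond, fake_tts)
    print(audio_out.decode("utf-8"))
    print(ctx.history)
```

Passing the three stages in as functions keeps the flow of a turn separate from whichever speech or language models sit behind it. In a real system, those models are where nearly all of the difficulty lives.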

Key to iterating on the earliest version of the technology, Duplex self-monitors to identify tasks it is incapable of completing, with a hand-off to a human being as the fail-safe. With time, the AI-driven, human-guided experience becomes more “conversational, intelligent, and universal by leveraging the same natural language, business rules and decision making, application flow, and back-end integration that feed into all self-service apps,” says Scott Horn, CMO of [24]7.ai, an AI platform for speech and digital.
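One simple way to picture that fail-safe is a confidence check that escalates to a person after several shaky turns in a row. The Python sketch below is a guess at the general shape of such a rule, not a description of how Duplex actually decides. TurnResult, the confidence score, and both constants are assumptions made up for illustration.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class TurnResult:
    reply: str
    confidence: float  # hypothetical self-assessed score between 0 and 1


HANDOFF_THRESHOLD = 0.6        # assumed cutoff below which a turn counts as shaky
MAX_LOW_CONFIDENCE_TURNS = 2   # assumed number of consecutive shaky turns tolerated


def choose_handler(turns: Iterable[TurnResult]) -> str:
    """Return "human" once enough consecutive low-confidence turns occur, else "bot"."""
    low_streak = 0
    for turn in turns:
        if turn.confidence < HANDOFF_THRESHOLD:
            low_streak += 1
            if low_streak >= MAX_LOW_CONFIDENCE_TURNS:
                return "human"   # fail-safe: hand the call to a person
        else:
            low_streak = 0       # a confident turn resets the streak
    return "bot"


if __name__ == "__main__":
    call = [
        TurnResult("A table for two, please.", 0.9),
        TurnResult("Sorry, could you repeat that?", 0.3),
        TurnResult("I'm not sure I understood.", 0.2),
    ]
    print(choose_handler(call))  # prints "human"
```

Requiring a streak rather than a single weak turn is a design choice: it avoids handing off every time one word is misheard, at the cost of letting a confused bot carry on for one extra exchange.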

Duplex’s engineers believe it could also help address accessibility and language barriers. That’s a great vision of the future for a voice-enabled chatbot, a product that is still in its infancy and has seen low adoption by enterprises. That said, if anyone has the engineering talent and market reach to make AI-powered voicebots ubiquitous, it’s Google. But the path ahead is technically complex and fraught with ethical questions.

Technical limitations and ethical implications

According to Horn, writing for VentureBeat, “Bots today do well so long as the customer stays on the ‘happy path,’ the ideal scenario where the customer says all the right things. That is, when question A is well understood and leads naturally and predictably to question B and so forth. Challenges begin when the questions deviate from this path.”
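Horn’s “happy path” can be sketched as a scripted dialogue in which each state only recognizes the question it expects next. The toy Python example below is an illustration under stated assumptions: the SCRIPT table, its keywords, and the step and run_call functions are all hypothetical, and real voicebots use far richer language understanding than keyword matching.

```python
from typing import Dict, List, Optional, Tuple

# Each state maps an expected keyword to (next state, bot reply).
# The states and keywords are invented for this illustration.
SCRIPT: Dict[str, Dict[str, Tuple[str, str]]] = {
    "greeting": {"help": ("ask_time", "I'd like to book a table.")},
    "ask_time": {"time": ("ask_party", "Seven pm, please.")},
    "ask_party": {"many": ("done", "Two people.")},
}


def step(state: str, utterance: str) -> Optional[Tuple[str, str]]:
    """Return (next_state, reply) if the utterance fits the script, else None."""
    for keyword, transition in SCRIPT.get(state, {}).items():
        if keyword in utterance.lower():
            return transition
    return None


def run_call(utterances: List[str]) -> str:
    """Walk the script; report where the conversation left the happy path."""
    state = "greeting"
    for utterance in utterances:
        result = step(state, utterance)
        if result is None:
            return f"off-script at {state}"
        state, _reply = result
    return state


if __name__ == "__main__":
    happy = ["How can I help you?", "What time?", "How many people?"]
    print(run_call(happy))    # prints "done"

    detour = ["How can I help you?", "We're closed on Mondays."]
    print(run_call(detour))   # prints "off-script at ask_time"
```

The second call shows the failure Horn describes: a perfectly reasonable answer from the person on the other end matches nothing the script expects, and the bot has nowhere to go.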

Beyond this very practical issue all developers must consider, there are other, more profound technical and ethical implications of AI-powered voicebots. First and foremost, how will Duplex integrate with backend legacy systems? As Horn wrote, “This may not be as much of an issue when interacting with a small business to make an appointment, but it can be a challenge for larger enterprises. Calling a bank for a financial transaction—say, transferring funds between accounts—and then asking whether you qualify for a car loan involves many bank systems. It is quite likely they are not integrated among themselves, let alone with the bot.” Realizing these limitations means developers and companies can begin deeply integrating their systems to make sure humans and AI alike can work cross-functionally to better serve customers.

Second, and highly pertinent to modern digital marketers, is the question of omnichannel integration. Duplex, by definition, is purpose-built for use via telephone conversations, but telephone is a shrinking channel in comparison to email, SMS, social, and in-app channels.

What gives Duplex cachet as a new technology is its promised ability to understand what we’re saying and interact with users in a natural way. But our “natural way” of communicating hasn’t been limited to speech in a long time, as we’ve been using email, messaging apps, and other forms of communication for years. Will Duplex evolve to play nice with each of these other channels that users and brands rely upon? Or are we looking at a one-trick pony?

Further, we can’t say whether Duplex’s AI brain will ever grow up enough to end its dependence on humans to correct it. The obvious roadblock is that it is not human, and it may never be able to communicate fully as if it were. The optimist may think it can, or at least that it can well enough. The realist thinks the machine will always need a flesh-and-blood custodian.

Finally, consumers must consider the role they are willing to allow eerily human-sounding bots to play. Even in Google CEO Sundar Pichai’s demo of Duplex, it was hard to miss the fact that the technology did not identify itself as a bot. Additionally, it never told the person who picked up the phone that it was recording, or that the data it was gathering would be used to evolve the technology.

“Does Google have an obligation to tell people they’re talking to a machine?” asks Ken Hanly of Digital Journal. “Does technology that mimics humans erode our trust in what we see and hear? And is this another example of tech privilege, where those in the know can offload boring conversations they don’t want to have to a machine?”

These questions remain unanswered, even as the market rushes towards the future Duplex promises.

Duplex is undeniably cool technology, as it has all the characteristics of a smart, modern, ubiquitous solution. It’s raised the bar on natural language processing, and many markets will be altered as a result. However, it’s still unproven and has a long way to go. Eventually, technical and ethical issues will be resolved and Duplex will find its market purpose. In the meantime, developers and marketers should continue to anticipate customer needs and find opportunities for Duplex and other AI-powered bots to serve them.

Lewis Leong is AppLovin's Content Marketing Manager.
