This week, everybody is talking about Google Duplex, announced at this week's Google I/O. Judging by past interactions with vendors' IVR systems when calling for customer support, Duplex is an impressive leap forward in natural language AI and offers hope of making some clerical tasks easier to complete. Duplex will be tested further by a limited number of users in Google Assistant this summer, refining its ability to complete specific tasks: getting holiday hours for a business, making restaurant reservations, and scheduling hair salon appointments.
So what does this mean for most businesses?
In the near term, your front-line retail employees may find themselves fielding calls from Duplex more often. The tasks being tested are still quite narrow in scope. This is by necessity: "closed domains" allow the deep training necessary for specific tasks. More broadly, anyone considering similar customer-facing AI initiatives will need to start small, with clearly defined task boundaries. The obvious improvement would be to your company's phone tree: using natural-language conversation to connect customers with the appropriate customer service representatives, as in the simplified sketch below.
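To make the phone-tree idea concrete, here is a deliberately simplified sketch, assuming plain Python and hypothetical department names and keywords. It is not how Duplex itself works, but it shows the shape of the task: classify the caller's free-form request, then route it.

```python
# Toy illustration of natural-language call routing (not Google Duplex itself):
# map a caller's free-form request to a hypothetical department queue
# by scoring simple keyword matches.

ROUTING_RULES = {
    "billing":    {"invoice", "bill", "charge", "refund", "payment"},
    "returns":    {"return", "exchange", "broken", "damaged"},
    "scheduling": {"appointment", "reservation", "book", "reschedule"},
}

def route_call(utterance: str, default: str = "general support") -> str:
    """Return the department whose keywords best match the caller's request."""
    words = set(utterance.lower().split())
    scores = {dept: len(words & keywords) for dept, keywords in ROUTING_RULES.items()}
    best_dept, best_score = max(scores.items(), key=lambda item: item[1])
    return best_dept if best_score > 0 else default

if __name__ == "__main__":
    print(route_call("Hi, I was charged twice on my last invoice"))    # billing
    print(route_call("I'd like to book an appointment for Saturday"))  # scheduling
```

A production system would replace the keyword sets with a trained language model, but the handoff pattern stays the same: classify the request, then connect the caller to a human.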
Another caveat: all of this training is language-specific – in this case, everything is in English. If you have a global customer base and want to improve your customers' experience on phone calls, your AI will need to be trained for each task separately in each language you wish to support.
In practice, Google Duplex requires a significant training effort to teach a specific task for a specific business environment in a specific language. This granularity makes Duplex a fit for individual tasks that are repeated often and have a relatively predictable set of conversational paths and outcomes; it is not a good fit for general conversation. As of now, Duplex is not Skynet or, if you prefer a more benign example, Janet from "The Good Place."
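To illustrate what a "relatively predictable set of conversational paths" looks like, here is a minimal sketch of a closed-domain task modeled as a finite state machine. The salon-booking states and expected answers are hypothetical, and a real system would sit behind speech recognition and synthesis.

```python
# Minimal sketch of a closed-domain task (hypothetical salon booking) as a
# finite state machine: each state has a bounded set of expected answers,
# which is what makes the conversation predictable enough to automate.

TRANSITIONS = {
    # state:       {expected caller answer: next state}
    "ask_service": {"haircut": "ask_day", "color": "ask_day", "other": "handoff"},
    "ask_day":     {"weekday": "ask_time", "weekend": "ask_time", "unsure": "handoff"},
    "ask_time":    {"morning": "confirm", "afternoon": "confirm", "unsure": "handoff"},
    "confirm":     {},   # terminal: booking confirmed
    "handoff":     {},   # terminal: escalate to a human employee
}

def next_state(state: str, caller_answer: str) -> str:
    """Advance the dialogue; anything outside the closed domain escalates."""
    return TRANSITIONS[state].get(caller_answer, "handoff")

# Example path: service -> day -> time -> confirmation
state = "ask_service"
for answer in ["haircut", "weekday", "morning"]:
    state = next_state(state, answer)
print(state)  # confirm
```

Anything the system does not recognize falls through to a human, which is exactly the boundary a closed-domain deployment needs.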
What about the ethics?
What's generating the most conversation, though, are the ethical implications of an AI that sounds more and more like a real person. In the last two years, we've gone from "Photoshop for audio" to using AI to put words in Barack Obama's mouth. Google has already promised that Duplex will identify itself when talking to people. Will this be enough?
That disclosure may be enough for the conversational aspect of the use case. But is the conversation being recorded by Google? Will Google be replacing the people who currently do these tasks? Is there any way to ensure that other companies developing similar solutions will adopt Google's approach as a standard for identifying AI-based conversations? Is an AI-based conversation equivalent to providing your personal information to a computer interface? And even outside of these direct ethical concerns, is the AI conversing in a manner that is acceptable from an HR or cultural perspective? Without sufficient programming and testing, it is easy to imagine the AI insistently heading down an incorrect conversational path while maintaining a personable tone.
Much of the criticism sounds familiar: how did Google not think through these ethical issues at some point during the creation process, before announcing the product to the public? Getting a wider audience's perspective is providing valuable feedback to Google – at the cost of its reputation. Improving workforce diversity could help catch some of these issues before they go public. It's a warning note to other companies considering AI initiatives: what are the possible ramifications of what we're trying to do? Did we catch all, or even most, of the possible problems before announcing it more broadly?
Recommendations for Enterprises Considering Google Duplex and Similar Technologies
Google Duplex will start rolling out in Google Assistant this summer, meaning it is only a matter of time before your company starts looking at some form of AI-based outreach. As your organization evaluates AI-assisted speech, Amalgam provides the following recommendations.
1) Make sure you understand the basis of your AI's conversational assumptions. For instance, Duplex uses a recurrent neural network to learn how to understand, interact, react, and speak. It combines a concatenative text-to-speech (TTS) engine with a synthesis TTS engine to control inflection. And, as of now, it is tuned to work only in English. Dig into the details of how your AI works so that you can be confident in how it will behave as it speaks with your clients, customers, and colleagues. (A minimal illustrative sketch of this kind of recurrent model follows these recommendations.)
2) Identify the most frequent and repetitive tasks for potential automation. Despite Amalgam's warnings, the potential for this technology in augmenting or replacing time-consuming human interactions with predictable outcomes is immense. Which activities in your organization require these types of conversations and are painfully cumbersome to conduct in a purely digital or computer-aided model? To the extent that these conversational activities exist in your organization, there are opportunities to use Duplex-like technology.
3) Get ready to train your AI solution. Although Google Duplex looks like an out-of-the-box solution that makes restaurant reservations, this capability took significant real-time supervised training across subject matter, human-grade response, and human inflection. Any corporate use of this type of technology will require similar training, not entirely dissimilar to the challenges of training IBM Watson to handle complex tasks. Although Google Duplex's demonstrated capabilities are impressive, results like this require training that works through every potential issue or challenge prior to going live. Don't underestimate the time and effort needed to train an AI to act human even for one or two basic tasks, since Duplex uses a brute-force method of imitating human behavior rather than recreating the human brain's methods of learning.
4) Weigh your AI plans against your company's ethical and compliance requirements. You have HR and corporate policies that define acceptable corporate behavior. If your AI is making calls on behalf of employees, you need to be able to ensure that it complies with those rules. If the AI is not compliant, or becomes non-compliant over time, how will your organization shut it down and/or seek reimbursement from the AI provider that has breached your corporate policies? Over time, AI used for outward-facing interactions will need to be treated like a consultant or contingent labor provider.
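To ground recommendation #1, the sketch below shows the general shape of a recurrent model for utterance understanding, assuming PyTorch and hypothetical vocabulary sizes and intents. It is not Google's actual model, and it omits the speech recognition and TTS stages entirely.

```python
# Illustrative sketch (assuming PyTorch) of a recurrent network that reads an
# utterance token by token and predicts a closed-domain intent. Hypothetical
# sizes and intents; ASR and TTS stages are omitted.
import torch
import torch.nn as nn

VOCAB_SIZE = 5000   # hypothetical token vocabulary
NUM_INTENTS = 3     # e.g. "give_hours", "take_reservation", "handoff"

class IntentClassifier(nn.Module):
    def __init__(self, vocab_size: int, num_intents: int, hidden: int = 128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        self.rnn = nn.GRU(hidden, hidden, batch_first=True)  # recurrent core
        self.head = nn.Linear(hidden, num_intents)

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        # token_ids: (batch, sequence_length) of integer token ids
        embedded = self.embed(token_ids)
        _, last_hidden = self.rnn(embedded)       # final hidden state summarizes the utterance
        return self.head(last_hidden.squeeze(0))  # (batch, num_intents) logits

model = IntentClassifier(VOCAB_SIZE, NUM_INTENTS)
fake_utterance = torch.randint(0, VOCAB_SIZE, (1, 12))  # one 12-token utterance
print(model(fake_utterance).shape)                      # torch.Size([1, 3])
```

The real work, per recommendations #2 and #3, is in assembling the labeled conversations such a model is trained on, one task and one language at a time.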
One final thought: the natural language capabilities of Google Duplex are an improvement over existing technology – but some of the tonality still sounds off, even smarmy. When you eventually deploy AI technology to speak on behalf of your company, tone will be key. If your AI sounds insincere, or its tone is otherwise out of line with your corporate policies, your company's reputation will take a hit.