Conversational AI has existed for several years now, but until recently the technology wasn’t ready for anything outside the lab. Thanks to advances in AI, accelerated computing, and machine-learning models, conversational AI is now ready to move into the business world as a mainstream technology, particularly in the area of customer experience.

Earlier this month, a panel of experts from T-Mobile, RingCentral, and Hugging Face gathered at NVIDIA’s GTC 2021 conference to discuss how conversational AI has enhanced their businesses and to share the trends shaping the future of this emerging technology. At the conference, NVIDIA also unveiled Riva Custom Voice, a new toolkit for creating custom voices from only 30 minutes of speech recording data.

The innovation happening around voice synthesis and speech data will “transform the way virtual assistants and chatbots are connecting and speaking back,” said Kari Briski, NVIDIA’s vice president of software product management for AI/HPC (high-performance computing). There is a huge opportunity to use data to build conversational AI models that account for people’s accents and distinct audio environments, such as noisy coffee shops and outdoor sporting events.

AI is in a position to teach itself 

“In the future, we’ll see AI define its own data,” said Prashant Kukde, assistant vice president of conversational AI at RingCentral. Much like a video background filter blurs out a messy room, Kukde said, AI could act as a real-time filter that removes a non-native speaker’s accent, so the person on the other end hears an accent they’re familiar with. This concept of bidirectional conversational AI is just one example of innovation in the space, Kukde said.

Stepping back to the present day, RingCentral’s current focus is applying AI to the spontaneous conversations typical of virtual meetings. The unified communications as a service (UCaaS) provider is bundling conversational AI into its existing product portfolio, and it recently launched an automated summarization feature that generates speech-to-text meeting summaries to give attendees a better experience and improve productivity.
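To make that kind of feature concrete, here is a minimal sketch of a transcribe-then-summarize pipeline built with open source Hugging Face models. The model choices and the meeting_audio.wav file are illustrative assumptions, not RingCentral’s actual implementation.

```python
# Minimal sketch: transcribe a meeting recording, then summarize the transcript.
# Assumes the transformers library (with PyTorch and ffmpeg) is installed;
# "meeting_audio.wav" is a hypothetical local file, and the models are generic
# open source choices, not RingCentral's production system.
from transformers import pipeline

# Speech-to-text: turn the meeting audio into a raw transcript.
asr = pipeline("automatic-speech-recognition", model="facebook/wav2vec2-base-960h")
transcript = asr("meeting_audio.wav")["text"]

# Summarization: condense the transcript into a short meeting summary.
# (A production system would restore punctuation and chunk long transcripts first.)
summarizer = pipeline("summarization", model="facebook/bart-large-cnn")
summary = summarizer(transcript, max_length=120, min_length=30)[0]["summary_text"]

print(summary)
```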

Why conversational AI should be in every contact center

T-Mobile’s conversational AI deployments, meanwhile, range from internal tools for employees to external, customer-facing services. T-Mobile uses AI in its contact centers to document conversations between customers and customer service agents, both through chatbots and self-service. The wireless carrier also uses AI to transcribe calls from speech to text to help agents working in call centers (agent-assist).

When COVID-19 hit, T-Mobile’s call centers were inundated with requests for payment plans from customers facing pandemic-related financial difficulties. T-Mobile automated this simple task by rolling out a chatbot that helped customers make payment arrangements. What T-Mobile didn’t expect was such a high return on investment (ROI) from what began as a small side project and turned into a widely used tool.

“We thought the chatbot would only live for the coronavirus season. But in the first 18 months of its life, we had a 750% ROI from this chatbot,” said Heather Nolis, T-Mobile’s principal machine-learning engineer. “There are many routine tasks that happen in our call centers where humans aren’t necessary. In fact, we found that about 30% of our customers don’t want to talk to a person and would prefer a conversational assistant.”
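For illustration, the routing step behind a chatbot like this can be approximated with zero-shot intent classification. The model, candidate intents, sample message, and confidence threshold below are assumptions for the sketch, not T-Mobile’s actual system.

```python
# Sketch of intent routing for a contact-center chatbot using zero-shot
# classification. The model, intents, and example message are illustrative
# assumptions, not T-Mobile's production setup.
from transformers import pipeline

classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

message = "I lost my job and can't pay my bill this month. Can I split it up?"
intents = [
    "set up a payment arrangement",
    "billing question",
    "technical support",
    "speak to a human agent",
]

result = classifier(message, candidate_labels=intents)
top_intent, confidence = result["labels"][0], result["scores"][0]

# Route confident matches to the automated flow; otherwise hand off to a person,
# echoing Nolis's point that a human takes over when the AI can't do the job.
if top_intent == "set up a payment arrangement" and confidence > 0.7:
    print("Start automated payment-arrangement flow")
else:
    print("Escalate to a customer service agent")
```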

Conversational AI continues its rapid evolution 

Over the last three years, conversational AI has evolved to include new types of models that deliver better predictions for summarizing and classifying text, understanding sentiment, and handling new tasks in both speech and vision. Moving forward, conversational AI will be driven by open collaboration through models, checkpoints, and implementations. That means organizations will have to embrace more pragmatic AI, which starts from the business problem rather than from a particular solution, said Jeff Boudier, product director at Hugging Face, creator of the Transformers open source natural language processing (NLP) library.

“One of the main challenges of the last three years has been to take science into the hands of the practitioners,” said Boudier. “Pragmatic AI requires using open source technologies as much as possible. That’s one of the defining characteristics of machine learning; it’s science-driven. It’s a living system that people are building.”

Hugging Face’s Transformers library encourages contributions from many people across different industries. There are more than 1,600 public data sets available in approximately 200 languages, and anybody can access 70,000 free transformer models provided by a community of 1,000 contributors (and growing). The data sets and models cover tasks ranging from classifying text to transcribing audio to recognizing objects in photos and videos (a minimal example of tapping into this ecosystem appears at the end of this article).

An open, collaborative future is the ultimate goal for conversational AI. But before organizations can get there, they need to understand why they’re building chatbots and other AI-based services in the first place. RingCentral’s Kukde thinks organizations should introduce conversational AI gradually and position it in a way that doesn’t make people feel it’s taking over their jobs. When AI is introduced progressively, organizations have time to collect feedback, gather more data, improve training, and keep building for the future, he said.

Nolis believes a good strategy is to create chatbots that give users a genuinely good experience rather than offering suggestions they already know. It’s important to understand where chatbots should and shouldn’t be used, so that people actually enjoy talking to chatbots instead of tolerating them while hoping to reach a real person eventually.

“Anyone building a chatbot should listen to their users by looking at the data they already have from social media interactions, complaints, and conversations with customer service agents,” Nolis concluded. “If the AI agent we are building cannot do what a human can do, then we’ll let a human handle it because we really care about the quality of our customer service.”
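As a concrete illustration of the open ecosystem Boudier describes, the sketch below pulls a public data set and a community-shared model from the Hugging Face Hub. The specific data set and model are illustrative choices, not ones named by the panel.

```python
# Minimal sketch of the open ecosystem: load a public data set and a free,
# community-contributed model from the Hugging Face Hub. The choices below
# (imdb, distilbert sentiment model) are illustrative, not from the article.
from datasets import load_dataset
from transformers import pipeline

# One of the thousands of public data sets hosted on the Hub.
reviews = load_dataset("imdb", split="test[:5]")

# One of the freely shared transformer models.
sentiment = pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)

for example in reviews:
    result = sentiment(example["text"][:512])[0]  # truncate long reviews
    print(result["label"], round(result["score"], 3))
```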