Justine Cassell, a professor at Carnegie Mellon University who studies the way humans interact with AI agents, says it will be fascinating to see how people respond to a voice-enabled chatbot capable of richer responses. “The goals are great, and I’m excited to see what they do,” she says.
However, Cassell says some of the things Amazon is promising, like responding to body language, remain extremely challenging. “There is no grammar of body language, the way there is for spoken and written language,” she says. If Alexa misreads someone’s posture or movements and responds incorrectly, things could get awkward.
Cassell says that even if Alexa gains more ChatGPT-like fluency, its efforts to mimic human personality and feeling through characteristics like intonation are unlikely to match human capabilities for some while yet. Expect the new Alexa to sometimes feel stilted in its responses.
Amazon says users will be able to apply to gain access to an additional test of its new technology, where Alexa’s new capabilities can be used to control other devices, including some not made by Amazon. Over time, the company plans to add new features to Alexa, potentially including the ability to discuss and recommend products from the company’s vast inventory.
If Alexa can respond to more complex queries while avoiding embarrassing errors, it could herald a wider—and much needed—upgrade in the capabilities of voice assistants.
When Amazon launched Alexa in 2014, it helped create a new category in personal computing built around voice interaction, spurring predictions that voice interfaces would soon dominate. Alexa and Apple’s Siri benefited from advances in machine learning that finally made it feasible for devices to reliably recognize and respond to a user’s voice. But the complexity of language has limited these devices to simple commands and left them unable to engage in anything resembling a real conversation. Even so, Amazon says that over half a billion devices featuring Alexa have been sold worldwide.
The advent of large language models trained on vast amounts of text has at last created algorithms that can handle more complex dialog. ChatGPT and other chatbots have startled both experts and the public with their flexibility and garrulousness, even though they are prone to spitting out statements that may be false, biased, or even offensive.
Prasad says Amazon developed a new cutting-edge large language model to invigorate Alexa. He says that the company fine-tuned this model toward phrasings appropriate for vocal conversation, and it uses additional algorithms to help with recognition of body language and intonation.
One of the big challenges for Amazon may prove to be handling the surprising errors that come with using large language models. When Microsoft added an advanced AI chatbot to its search engine Bing, users quickly discovered some odd behavior. “Is it 100 percent perfect? No,” Prasad says. “This is why it’s an early preview, because there will be occasional errors.”
Prasad says Amazon has already developed guardrails to prevent Alexa from straying off course. He adds that some of these guardrails will remind people they are talking to a machine and try to keep the assistant from presenting too much like a person. Some chatbot users form strong emotional and even romantic bonds with the simulated personalities they interact with. Prasad adds that Amazon is doing research on the long-term risks that may come from further advances in AI.