Google Unveils Gemini Live Voice Assistant to Rival ChatGPT Voice Mode
Google's new AI-powered voice assistant offers advanced conversational capabilities and app integrations
Google has unveiled Gemini Live, a conversational voice assistant that’s set to rival OpenAI’s Voice Mode.
Available through the Gemini app on Android and iOS, the new Live feature allows users to interact with the AI using their voice.
Powered by Google’s Gemini 1.5 Flash model, the Live feature can answer questions across a variety of generated voices, 10 in total. Users can ask the chatbot to manage their shopping lists or summarize incoming emails.
“With Gemini, we’re reimagining what it means for a personal assistant to be truly helpful,” said Sissie Hsiao, Google’s general manager for Gemini experiences and Google Assistant. “Gemini is evolving to provide AI-powered mobile assistance that will offer a new level of help — all while being more natural, conversational and intuitive.”
Google’s answer to ChatGPT Voice Mode lets users talk to the chatbot when moving to a different app and even when their phone is locked, allowing for interactions to occur as if they were taking a regular phone call.
Gemini Live is currently available in English to Gemini Advanced subscribers on Android phones, before coming to iOS and more languages in the coming weeks.
Gemini Advanced offers a free trial for the first month, with a subscription cost of $20 per month thereafter.
In addition to the new voice functionality, subscribers have access to the Gemini 1.5 Pro model, and its mammoth input length, as well as more storage, access to Gemini in Workspace applications and the ability to upload files for the chatbot to interact with.
Live is getting further extensions — including interoperability with other Google apps such as YouTube Music, where the chatbot can create playlists from voice prompts.
Also in the works is calendar support, allowing the chatbot to interact with a user’s calendar app to set reminders about upcoming events.
New features are expected in the coming weeks.
“Because Gemini has built deep integrations for Android, it can do more than just read the screen,” Hsiao wrote in a blog post. “ It can interact with many of the apps you already use. For example, you can drag and drop images that Gemini generates directly into apps like Gmail and Google Messages.”
In addition to new functionality, Google plans to improve the speed and quality of Live responses. The underlying 1.5 Flash model was unveiled at this year’s Google I/O event and despite being smaller than the flagship 1.5 Pro model, it still boasts the same hefty context window, meaning it can handle huge data inputs.
Gemini Live comes as OpenAI steps up its improvements to ChatGPT’s audio feature, with the new GPT-4o greatly improving the chatbot’s voice functionality.
OpenAI recently began rolling out the newly revamped ChatGPT Voice Mode, though it’s currently locked to a small group of ChatGPT Plus subscribers.
Some might say Google is simply copying ChatGPTs Voice Mode, but the search company has been working on something similar for some time.
Gemini Live is a glimpse of what its researchers have been working on, with a conversational agent being teased at I/O back in May under the tagline Project Astra.
Read more about:
ChatGPT / Generative AIAbout the Author
You May Also Like