The lab that built Stable Diffusion’s dataset said BUD-E is an open source AI voice assistant that understands context
AI voice assistants have come a long way since Siri was introduced in February 2010. Now, the lab that built the dataset behind Stable Diffusion wants to build a next-gen voice assistant that responds to requests in real time with a natural voice.
German nonprofit research lab LAION unveiled BUD-E, which stands for Buddy for Understanding and Digital Empathy. It is designed to provide more immersive conversational experiences than current AI voice assistants.
LAION claims that current voice assistants respond in what it describes as a “stilted, mechanical nature.” Also, “unlike human conversation partners, they often struggle with fully understanding and adapting to the nuanced, emotional, and contextually rich nature of human dialogue, leading to noticeable latencies and a disjointed conversational flow. Consequently, users often experience unsatisfactory exchanges.”
BUD-E sounds more natural than current systems and it also runs on consumer devices, the research lab said. Moreover, the system achieved latencies of 300 to 500 milliseconds, a fast response to user requests.
LAION, which built the underlying dataset for the text-to-image AI model Stable Diffusion, created BUD-E with the ELLIS Institute Tübingen, Collabora and the Tübingen AI Center.
It is still early days for BUD-E, with LAION dreaming of a voice assistant that can manage multi-speaker conversations with interruptions, affirmations and thinking pauses.
The current version of BUD-E uses Nvidia’s speech-to-text model FastConformer Streaming STT, Microsoft’s Phi-2 language model and the StyleTTS2 text-to-speech model.
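As a rough illustration of how three such stages chain together, here is a minimal Python sketch. It is not BUD-E's code: the `VoicePipeline` class and the `fake_stt`, `fake_llm` and `fake_tts` functions are hypothetical stand-ins for the speech-to-text, language model and text-to-speech components, and the timing only shows where end-to-end latency would be measured.

```python
import time
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class VoicePipeline:
    """Hypothetical three-stage voice assistant pipeline (illustration only)."""
    transcribe: Callable[[bytes], str]   # speech-to-text stage
    generate: Callable[[str], str]       # language model stage
    synthesize: Callable[[str], bytes]   # text-to-speech stage

    def respond(self, audio: bytes) -> Tuple[bytes, float]:
        """Run all three stages in order and return (speech, latency in ms)."""
        start = time.perf_counter()
        text = self.transcribe(audio)
        reply = self.generate(text)
        speech = self.synthesize(reply)
        latency_ms = (time.perf_counter() - start) * 1000
        return speech, latency_ms


# Stand-in stages: assumptions for this sketch, not real models.
def fake_stt(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_llm(prompt: str) -> str:
    return f"You said: {prompt}"


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


pipeline = VoicePipeline(fake_stt, fake_llm, fake_tts)
speech, latency_ms = pipeline.respond(b"hello")
print(speech, f"{latency_ms:.2f} ms")
```

In a real system each stage would wrap an actual model, and the measured latency would be dominated by model inference rather than by the glue code shown here.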
However, LAION wants to scale up the underlying models powering BUD-E. The lab expressed confidence that it could, in the future, produce low-latency responses using a larger model such as the 30-billion-parameter version of Meta’s Llama 2.
You can try BUD-E for yourself as all of its code is open source and available on GitHub.
You can also go one step further and contribute to the development of BUD-E. LAION has invited open source developers and researchers to help refine the voice assistant. Those interested can join the LAION Discord community or reach out at [email protected].