Experience the power of AI with this free AI voice generator demo. Utilizing Deepgram and Groq, we transform text into voice seamlessly. This repository contains a simple and efficient implementation using NextJS. Dive into the world of AI voice generation for free with our comprehensive demo. Contributions welcome.

Ejb503/ai-voice-generation

AI Powered Voice Chat Demo

By tyingshoelaces.com

Overview

This is a simple version of OpenAI's voice functionality using free APIs. This demo lets you talk, listen, and converse with LLMs.

  • Original blog post: Blog Post
  • YouTube video explainer: YouTube Video

Feel free to play around!

Tech Stack

  • LLM Host: Groq
  • LLM: Llama 3
  • TTS: Deepgram
  • STT: SpeechRecognition (Web Speech API)
  • Web Framework: NextJS (React front-end, Express API)

How to use

  1. Clone or download the repo.
  2. Run npm i to install dependencies.
  3. Create a .env.local file containing DEEPGRAM_API_KEY and GROQ_API_KEY.
  4. Run npm run dev to start the development server.
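
A missing key is an easy setup mistake to make, so a small startup check can help. The sketch below is not taken from the repo: `missingEnvKeys` is a hypothetical helper, and only the two variable names come from the steps above.

```typescript
// Hypothetical helper, not part of the repo: reports which required keys are unset.
// Only the two key names come from the setup steps above.
const REQUIRED_KEYS = ["DEEPGRAM_API_KEY", "GROQ_API_KEY"] as const;

type Env = Record<string, string | undefined>;

function missingEnvKeys(env: Env): string[] {
  return REQUIRED_KEYS.filter((key) => {
    const value = env[key];
    // Treat undefined and blank values as missing.
    return value === undefined || value.trim() === "";
  });
}

// In a NextJS server context the caller would typically pass process.env.
const missing = missingEnvKeys({ DEEPGRAM_API_KEY: "dg-example" });
console.log(missing); // lists the keys that still need to be set
```

If the returned list is non-empty, failing fast with a clear message is usually friendlier than letting the first API call fail.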

You might want to edit all the prompts to change the tone of the response.

The architecture is simple: Voice -> Text -> LLM -> Text -> Voice. RAG and all sorts of other creative techniques can be used to spice up the LLM step.
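
The Voice -> Text -> LLM -> Text -> Voice flow can be sketched as three injected stages. This is an illustrative sketch, not the repo's code: `VoiceStages`, `runVoiceTurn`, and the stub stages are assumed names, and the stubs stand in for the real SpeechRecognition, Groq, and Deepgram calls.

```typescript
// Each stage is injected so the flow can be read (and tested) without real API calls.
// The stage names are illustrative assumptions, not the repo's actual function names.
interface VoiceStages<Audio> {
  transcribe: (input: Audio) => Promise<string>; // Voice -> Text (STT)
  complete: (prompt: string) => Promise<string>; // Text -> LLM -> Text
  synthesize: (reply: string) => Promise<Audio>; // Text -> Voice (TTS)
}

async function runVoiceTurn<Audio>(
  input: Audio,
  stages: VoiceStages<Audio>
): Promise<Audio> {
  const userText = await stages.transcribe(input);
  const replyText = await stages.complete(userText);
  return stages.synthesize(replyText);
}

// Stub stages standing in for SpeechRecognition, Groq, and Deepgram.
const stubStages: VoiceStages<string> = {
  transcribe: async (audio) => audio.replace("audio:", ""),
  complete: async (prompt) => `You said: ${prompt}`,
  synthesize: async (reply) => `audio:${reply}`,
};

runVoiceTurn("audio:hello", stubStages).then((out) => console.log(out));
// logs "audio:You said: hello"
```

Keeping the stages separate like this means a retrieval (RAG) step could be added by wrapping `complete` without touching the speech code.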

Hints and tricks

You'll probably want to switch out SpeechRecognition for Whisper if you need support beyond Chrome or something more stable.
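
One way to handle this is to detect browser support at runtime and fall back to a server-side transcription route when it is missing. The sketch below assumes that design; `SpeechScope`, `SttBackend`, and `pickSttBackend` are hypothetical names. The only platform facts it relies on are that the constructor is exposed as `SpeechRecognition` or, in Chrome, as the prefixed `webkitSpeechRecognition`.

```typescript
// Minimal shape of the global object we inspect; in a browser the caller
// would pass the global window object (cast as needed).
type SpeechScope = {
  SpeechRecognition?: unknown;
  webkitSpeechRecognition?: unknown;
};

type SttBackend = "browser" | "server";

// Use in-browser recognition when a constructor is available; otherwise fall
// back to a server-side transcription route (for example one backed by Whisper).
function pickSttBackend(scope: SpeechScope): SttBackend {
  const ctor = scope.SpeechRecognition ?? scope.webkitSpeechRecognition;
  return typeof ctor === "function" ? "browser" : "server";
}

console.log(pickSttBackend({ webkitSpeechRecognition: class {} })); // "browser"
console.log(pickSttBackend({})); // "server"
```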

A production version would need much more work on state handling in the AudioPlayer; that is not necessary for this demo.

Playing with the prompts and context going to Groq is the key for personalisation.
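
As a rough illustration of what "prompts and context" means in practice, the sketch below assembles the message list for one chat request. It assumes the OpenAI-style role/content message format that Groq's chat completions endpoint accepts; `buildMessages`, `maxHistory`, and the example prompt are hypothetical and not taken from the repo.

```typescript
type Role = "system" | "user" | "assistant";

interface ChatMessage {
  role: Role;
  content: string;
}

// Hypothetical helper: the system prompt sets the tone, and a trimmed slice of
// earlier turns provides context without growing the request unboundedly.
function buildMessages(
  systemPrompt: string,
  history: ChatMessage[],
  userText: string,
  maxHistory = 6
): ChatMessage[] {
  // slice(-0) would return the whole array, so guard the zero case explicitly.
  const recent = maxHistory > 0 ? history.slice(-maxHistory) : [];
  return [
    { role: "system", content: systemPrompt },
    ...recent,
    { role: "user", content: userText },
  ];
}

const messages = buildMessages(
  "You are a cheerful assistant. Keep replies under two sentences.",
  [
    { role: "user", content: "Hi" },
    { role: "assistant", content: "Hello!" },
  ],
  "Tell me a joke"
);
console.log(messages.length); // 4
```

Short replies matter more here than in a text chat, because every extra sentence adds text-to-speech latency.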

Contact me for feedback!

What I Did

I built a demo where you can:

  1. Talk into the browser using the Web Speech API (SpeechRecognition).
  2. Stream the transcribed text to Groq for processing.
  3. Stream the response from Groq to Deepgram for text-to-speech conversion.
  4. Play the generated audio response in the browser.

My ratings for each piece of the stack:

  • NextJS: ★★★★★ - Wonderful technology, simplifies client and server-side development.
  • Groq: ★★★★★ - New benchmarks in speed and cost.
  • Llama 3: ★★★★☆ - Noticeable difference from GPT-4o, great for cheap requests and demos.
  • Deepgram: ★★★☆☆ - Generous starting credits, good latency. Still green as a tech.
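
For steps 2 and 3 of the demo, streaming the LLM reply into text-to-speech, one common approach is to buffer streamed fragments and hand over whole sentences as soon as they complete. The sketch below shows that idea only; `SentenceBuffer` is a hypothetical class and not necessarily how the repo does it.

```typescript
// Collects streamed text fragments and releases complete sentences,
// so each sentence can be sent to TTS as soon as it is ready.
class SentenceBuffer {
  private pending = "";

  // Add a streamed fragment; returns any sentences completed by it.
  push(fragment: string): string[] {
    this.pending += fragment;
    const sentences: string[] = [];
    // A sentence ends at ., !, or ? followed by whitespace.
    const boundary = /[.!?]\s+/;
    let match = boundary.exec(this.pending);
    while (match !== null) {
      const end = match.index + 1; // keep the punctuation mark
      sentences.push(this.pending.slice(0, end).trim());
      this.pending = this.pending.slice(match.index + match[0].length);
      match = boundary.exec(this.pending);
    }
    return sentences;
  }

  // Call once the stream ends to release whatever is left.
  flush(): string[] {
    const rest = this.pending.trim();
    this.pending = "";
    return rest ? [rest] : [];
  }
}

const buffer = new SentenceBuffer();
for (const token of ["Hi", " there.", " Nice", " day!"]) {
  for (const sentence of buffer.push(token)) console.log("speak:", sentence);
}
for (const sentence of buffer.flush()) console.log("speak:", sentence);
```

The punctuation rule is deliberately naive (it will split on abbreviations such as "e.g. "), which is usually acceptable for a demo.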

Links


Edward Ejb503, Tying Shoelaces Blog
