A lightweight vanilla JavaScript implementation of the Gemini 2.0 Flash Multimodal Live API client. This project provides real-time interaction with Gemini's API through text, audio, video, and screen sharing capabilities.
This is a simplified version of Google's original React implementation, created in response to this issue.
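As a rough sketch of what such a client does under the hood, the Live API is reached over a WebSocket. The host and the `v1alpha` `BidiGenerateContent` method name below reflect the publicly documented endpoint at the time of writing and may change, so treat this as illustrative rather than definitive:

```javascript
// Hypothetical helper: builds the Live API WebSocket URL for a given key.
// The host and method path are assumptions based on the public v1alpha API.
const LIVE_API_HOST = 'generativelanguage.googleapis.com';

function buildLiveApiUrl(apiKey) {
  if (!apiKey) {
    throw new Error('An API key from Google AI Studio is required');
  }
  return `wss://${LIVE_API_HOST}/ws/` +
    'google.ai.generativelanguage.v1alpha.GenerativeService.BidiGenerateContent' +
    `?key=${encodeURIComponent(apiKey)}`;
}

// In the browser, the client then opens the socket and sends a setup message:
// const ws = new WebSocket(buildLiveApiUrl(myKey));
// ws.onopen = () => ws.send(JSON.stringify({
//   setup: { model: 'models/gemini-2.0-flash-exp' }
// }));
```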
## Features

- Real-time chat with the Gemini 2.0 Flash Multimodal Live API
- Real-time audio responses from the model
- Real-time audio input from the user, allowing interruptions
- Real-time video streaming from the user's webcam
- Real-time screen sharing from the user's screen
- Function calling
- Transcription of the model's audio (if Deepgram API key provided)
- Built with vanilla JavaScript (no dependencies)
- Mobile-friendly
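Function calling works by declaring tools in the session setup message. The sketch below follows the Gemini API's tool schema (`functionDeclarations` inside `tools`), but the field names should be checked against the current docs, and `get_weather` is a made-up example function:

```javascript
// Illustrative sketch: a setup message declaring one callable function.
// `get_weather` and its parameters are hypothetical examples.
function buildSetupMessage(model, functionDeclarations) {
  return {
    setup: {
      model,
      tools: [{ functionDeclarations }],
    },
  };
}

const setupMsg = buildSetupMessage('models/gemini-2.0-flash-exp', [{
  name: 'get_weather',
  description: 'Look up the current weather for a city',
  parameters: {
    type: 'OBJECT',
    properties: { city: { type: 'STRING' } },
    required: ['city'],
  },
}]);

// ws.send(JSON.stringify(setupMsg));  // sent once, right after the socket opens
```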
## Prerequisites

- A modern web browser with WebRTC, WebSocket, and Web Audio API support
- A Google AI Studio API key
- A local web server to host `index.html`, e.g. `python -m http.server`, `npx http-server`, or the Live Server extension for VS Code
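Before connecting, it can help to verify the browser actually exposes the APIs listed above. A small feature check, written against a `scope` parameter so it can be exercised outside the browser, might look like:

```javascript
// Returns the names of required browser APIs missing from `scope`
// (pass `window` in the browser). Purely illustrative.
function missingApis(scope) {
  const required = {
    WebSocket: typeof scope.WebSocket === 'function',
    'Web Audio (AudioContext)': typeof scope.AudioContext === 'function',
    'WebRTC (getUserMedia)': Boolean(scope.navigator &&
      scope.navigator.mediaDevices &&
      typeof scope.navigator.mediaDevices.getUserMedia === 'function'),
  };
  return Object.keys(required).filter((name) => !required[name]);
}

// In the browser:
// const missing = missingApis(window);
// if (missing.length) alert(`Unsupported browser, missing: ${missing.join(', ')}`);
```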
## Getting Started

1. Get your API key from Google AI Studio.

2. Clone the repository:

   ```bash
   git clone https://github.com/ViaAnthroposBenevolentia/gemini-2-live-api-demo.git
   ```

3. Start the development server (adjust the port if needed):

   ```bash
   cd gemini-2-live-api-demo
   python -m http.server 8000
   # or: npx http-server -p 8000
   ```

   Alternatively, open `index.html` with the Live Server extension for VS Code.

4. Open the application at `http://localhost:8000`.

5. Open the settings at the top right, paste your API key, and click "Save".

6. (Optional) Get a free API key from Deepgram and paste it into the settings to enable real-time transcription of the model's audio.
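Transcription works by forwarding the model's PCM audio to Deepgram's live-streaming endpoint. The `/v1/listen` path and query parameters below follow Deepgram's documented streaming API, but the 24 kHz sample rate is an assumption about the model's audio output and worth verifying against your own stream:

```javascript
// Illustrative: build the Deepgram live-transcription WebSocket URL for raw
// 16-bit PCM. The 24 kHz default is an assumption about Gemini's audio output.
function buildDeepgramUrl({ sampleRate = 24000, language = 'en-US' } = {}) {
  const params = new URLSearchParams({
    encoding: 'linear16',
    sample_rate: String(sampleRate),
    language,
  });
  return `wss://api.deepgram.com/v1/listen?${params}`;
}

// In the browser, Deepgram accepts the API key as a WebSocket subprotocol:
// const ws = new WebSocket(buildDeepgramUrl(), ['token', deepgramApiKey]);
// ws.onmessage = (e) => {
//   const alt = JSON.parse(e.data).channel?.alternatives?.[0];
//   if (alt?.transcript) console.log(alt.transcript);
// };
```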
## Contributing

Contributions are welcome! Please feel free to submit issues and pull requests.
## License

This project is licensed under the MIT License.