reachsak/LLM-XR-smart-home

Multimodal AI Agents and Extended Reality (XR) Applications

Video Demo

AI Voice Chat and Image Understanding

Watch the demo video 1
Click on the image to view the video.

Watch the demo video 2
Click on the image to view the video.

Project Overview

This project develops LLM-based AI agents and extended reality (XR) applications for smart building control. Built with open-source models and Unity 3D, it integrates the LLaVA vision-language model with open-source Text-to-Speech (TTS) and Speech-to-Text (STT) models. The application targets the Microsoft HoloLens 2 and provides AI-powered voice chat and image understanding.

Manuscript

Under preparation...

Key Features

  • LLM-Based AI Agents: Uses large language models to interact with users and control the smart building.
  • Extended Reality (XR) Integration: Uses Unity 3D to build immersive XR applications for smart building control.
  • AI Voice Chat: Enables natural language communication with the smart building system through speech.
  • Image Understanding: Uses the LLaVA vision-language model to interpret visual data.
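
The voice chat and image understanding features above suggest a three-stage turn: speech is transcribed, the transcript (and optionally an image) is passed to the vision-language model, and the reply is synthesized back to audio. The following is a minimal Python sketch of that flow under stated assumptions, not the repository's implementation. Every name here (`VoiceChatAgent`, `handle_turn`, `make_echo_agent`, and the three callable types) is hypothetical, and the stages are stand-in stubs rather than real STT, LLaVA, or TTS calls.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical stage signatures (assumptions, not the project's API):
SpeechToText = Callable[[bytes], str]                       # audio -> transcript
VisionLanguageModel = Callable[[str, Optional[bytes]], str]  # prompt + image -> reply
TextToSpeech = Callable[[str], bytes]                       # reply text -> audio


@dataclass
class VoiceChatAgent:
    """Chains the three stages for a single conversational turn."""
    stt: SpeechToText
    vlm: VisionLanguageModel
    tts: TextToSpeech

    def handle_turn(self, audio: bytes, image: Optional[bytes] = None) -> bytes:
        prompt = self.stt(audio)         # 1. transcribe the user's speech
        reply = self.vlm(prompt, image)  # 2. answer, optionally grounded in an image
        return self.tts(reply)           # 3. synthesize the spoken reply


def make_echo_agent() -> VoiceChatAgent:
    """Builds an agent from stub stages so the flow can run without any models."""
    def stub_stt(audio: bytes) -> str:
        return audio.decode("utf-8")     # pretend the audio bytes are the transcript

    def stub_vlm(prompt: str, image: Optional[bytes]) -> str:
        tag = "image" if image else "no image"
        return f"[{tag}] {prompt}"       # echo the prompt, noting whether an image came in

    def stub_tts(text: str) -> bytes:
        return text.encode("utf-8")      # pretend the text bytes are synthesized audio

    return VoiceChatAgent(stt=stub_stt, vlm=stub_vlm, tts=stub_tts)


if __name__ == "__main__":
    agent = make_echo_agent()
    print(agent.handle_turn(b"What is on the desk?", image=b"fake-image-bytes"))
```

Keeping each stage behind a plain callable means a stub can later be swapped for a real model wrapper without changing `handle_turn`.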

Getting Started

Requirements

  • Microsoft HoloLens 2
  • Unity 3D
  • LLaVA vision language model
  • Open-source Text-to-Speech (TTS) and Speech-to-Text (STT) models

Contributing

Contributions are welcome! Please read the CONTRIBUTING.md file for details on how to contribute to this project.

License

This project is licensed under the MIT License – see the LICENSE file for details.
