Click on the image to view the video.
This project focuses on developing LLM-based AI agents and extended reality (XR) applications to enhance smart building control. Leveraging open-source models and Unity 3D, the project integrates the LLaVA vision language model, as well as open-source Text-to-Speech (TTS) and Speech-to-Text (STT) models. The application is designed for the Microsoft HoloLens 2, featuring AI-powered voice chat and image understanding capabilities.
Under preparation...
- LLM-Based AI Agents: Utilizes advanced language models for intelligent interaction and control.
- Extended Reality (XR) Integration: Implements XR technologies with Unity 3D to create immersive smart building control applications.
- AI Voice Chat: Enables natural language communication with the smart building system.
- Image Understanding: Incorporates vision language models for understanding and interpreting visual data.
- Microsoft HoloLens 2
- Unity 3D
- LLaVA vision language model
- Open-source Text-to-Speech (TTS) and Speech-to-Text (STT) models
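The components listed above suggest a single voice-chat turn flows from speech recognition, through the vision language model, to speech synthesis. The following Python sketch illustrates that flow only; it is not this project's implementation (the application itself is built with Unity 3D). Every name in it (`VoiceChatPipeline`, `handle_turn`, and the `fake_*` stand-ins) is hypothetical, and the stand-ins replace the real STT, LLaVA, and TTS backends so the example runs without any models installed.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical type aliases for the three model stages; not this project's API.
SpeechToText = Callable[[bytes], str]                       # audio -> transcript
VisionLanguageModel = Callable[[str, Optional[bytes]], str]  # prompt (+ image) -> reply
TextToSpeech = Callable[[str], bytes]                        # reply text -> audio


@dataclass
class VoiceChatPipeline:
    """Chains STT, a vision language model, and TTS for one chat turn."""

    stt: SpeechToText
    vlm: VisionLanguageModel
    tts: TextToSpeech

    def handle_turn(self, audio: bytes, image: Optional[bytes] = None) -> bytes:
        prompt = self.stt(audio)        # 1. transcribe the user's speech
        reply = self.vlm(prompt, image)  # 2. answer, optionally using a camera frame
        return self.tts(reply)           # 3. synthesize the spoken reply


# Stand-in backends (assumptions for illustration only).
def fake_stt(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_vlm(prompt: str, image: Optional[bytes]) -> str:
    seen = "with image" if image is not None else "no image"
    return f"[{seen}] You asked: {prompt}"


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


pipeline = VoiceChatPipeline(stt=fake_stt, vlm=fake_vlm, tts=fake_tts)
spoken_reply = pipeline.handle_turn(b"turn on the lights")
```

Keeping each stage behind a plain callable means any open-source STT, vision language, or TTS model could be swapped in without changing the turn logic.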
Contributions are welcome! Please read the CONTRIBUTING.md file for details on how to contribute to this project.
This project is licensed under the MIT License – see the LICENSE file for details.