Fully local & open source AI Waifu. VTube Studio, Discord, Minecraft, custom made RAG (long term memory), alarm, and plenty more! Has a WebUI and hotkey shortcuts. All software is free (or extremely cheap)!
Requires Windows 10/11 and a CUDA (NVIDIA) GPU with at least 12GB of video memory; 16GB is recommended. Uses Oobabooga, RVC, and Whisper to run all AI systems locally. Works as a front end to tie many programs together into one cohesive whole.
The goal of the project is less about giving an "all in one package", and more about giving you the tools and knowledge to create your own AI Waifu!
-
🎙️ Quality Conversation ( /・0・)
- Speak back and forth, using Whisper speech-to-text.
- Configure your own waifu's voice with thousands of possible models.
- Imperial-tons of quality of life tweaks.
-
🍄 Vtuber Integration ღゝ◡╹ )ノ♡
- Uses VTube Studio, and any compatible models!
- Ability to send emotes to the model, based on their actions.
- Idle / Speaking animation.
-
💾 Enhanced Memory (ー_ーゞ
- Add Lorebook entries, for your waifu to remember a wide array of info as needed.
- Enable the custom RAG, giving them knowledge of older conversations.
- Import old logs and conversations, keeping your same AI waifu from another software!
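To give a rough idea of how a Lorebook can work, here is a minimal sketch. It assumes each entry is keyed by a trigger word and gets pulled in whenever that word shows up in a message. The names `LOREBOOK` and `find_lore`, and the sample entries, are hypothetical and only for illustration; they are not the project's actual code.

```python
import re

# Hypothetical lorebook: trigger keyword -> info to inject into the prompt.
LOREBOOK = {
    "minecraft": "You and your partner share a survival world.",
    "coffee": "Your partner drinks their coffee black.",
}


def find_lore(message: str, lorebook: dict[str, str]) -> list[str]:
    """Return the lore entries whose keyword appears as a whole word in the message."""
    words = set(re.findall(r"[a-z0-9']+", message.lower()))
    return [info for key, info in lorebook.items() if key in words]


# Both keywords appear, so both entries are returned (in lorebook order).
print(find_lore("Want to play Minecraft after coffee?", LOREBOOK))
```

Matching on whole words keeps "coffee" from triggering on something like "coffeehouse"; a real setup may want looser matching.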
-
🎮 Modularity ⌌⌈ ╹므╹⌉⌏
- Enable various built in modules:
  - Discord, for messaging.
  - Vision, to enable multimodal, and allow them to see!
  - Alarm, so your waifu can wake you up in the morning.
  - Minecraft, allowing your waifu to control the game using Baritone, Wurst, and other command based mods.
- All the options and modularity from any external software used. Oobabooga, RVC Voice, etc.
- Open-source, meaning you can edit it as you please.
Here is some documentation that you can look at. It will show you how to install, how to use the program, and what options you have. Please also take a look at the YouTube videos for the install.
If you need help / assistance, feel free to email me for this project at [email protected]
TumblerWarren/Virtual_Avatar_ChatBot is the original project that this code is spun off from. Full credit to that project - it provided the skeleton for the many advancements now in place.
Drakkadakka/z-waif-experimental- offers a few upgrades, namely Twitch chat & streaming support, as well as a few other enhancements.
v1.5-R2
- While using streaming text, emotes are now threaded, meaning that there is no pause for them to happen.
- The VtubeStudio interactions now use a try-catch system, adding general resistance to errors.
- Added in more implementation for Unipipes - the system that basically will manage the centralized execution of code.
- Enhanced the ".bat" files, making them pause after a crash happens.
- Fixed an error where the random looking would cause a crash due to requests not closing properly.
- Fixed an issue with the Discord module crashing when emotes would be triggered.
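For the curious, the threaded emotes and the try-catch handling can be pictured roughly like this. This is only a sketch under assumptions: `send_emote` is a hypothetical stand-in for whatever call actually talks to VTube Studio, and the other function names are made up for the example.

```python
import threading


def send_emote(name: str) -> None:
    """Hypothetical stand-in for the call that sends an emote to VTube Studio."""
    if name == "broken":
        raise ConnectionError("VTube Studio not reachable")
    print(f"emote sent: {name}")


def safe_send_emote(name: str, errors: list[str]) -> None:
    """Run the emote call, recording failures instead of crashing the program."""
    try:
        send_emote(name)
    except Exception as exc:
        errors.append(f"{name}: {exc}")


def trigger_emote_threaded(name: str, errors: list[str]) -> threading.Thread:
    """Start the emote on a daemon thread so text streaming does not wait for it."""
    thread = threading.Thread(target=safe_send_emote, args=(name, errors), daemon=True)
    thread.start()
    return thread


errors: list[str] = []
threads = [trigger_emote_threaded(name, errors) for name in ("smile", "broken")]
for thread in threads:
    thread.join()  # Only joined here so the example can inspect the result.
print(errors)
```

The failing emote ends up as a logged error rather than a crash, and neither emote blocks the caller while it runs.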
---.---.---.---
V1.5
- Stopping Strings (what cuts off your waifu if they try talking out of format) can now be changed in the configurables.
- There is now a "Send" button you can click next to the textbox.
- The primary color of the interface is now changeable via the configurables. This changes the color of the borders, checkboxes, and the new "Send" button.
  - For a full list of colors, go to: https://www.gradio.app/guides/theming-guide
- The results from the visual system can now be properly rerolled.
  - The streamed results can also be interrupted and re-done as it comes in.
  - Metadata tags are also applied to visual chats, for future (and current) reference.
- Streaming from the visual system now properly shows in the UI.
- The visual preview no longer requires tabbing in to it to accept / cancel.
- Can now run multiple emotes per message.
  - Emotes now trigger as text streams in.
  - Removed an old vtube.py script that was unused.
- Hotkeys are now customizable, and can be changed in the configurables.
- Fixed a bug where some users would crash and fail to launch if the hotkeys failed to bind.
- Fixed an issue where doing hotkeys multiple times would "queue" the actions.
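If you are wondering what Stopping Strings do in practice, here is a minimal sketch, assuming the reply is simply cut at the first place any stopping string appears. The function name and the example strings are hypothetical; check the configurables for the real values and format.

```python
def apply_stopping_strings(text: str, stopping_strings: list[str]) -> str:
    """Cut the text at the earliest occurrence of any stopping string."""
    cut = len(text)
    for stop in stopping_strings:
        if not stop:
            continue  # An empty string would match at index 0 and wipe the reply.
        index = text.find(stop)
        if index != -1:
            cut = min(cut, index)
    return text[:cut].rstrip()


# Hypothetical stopping strings, for illustration only.
STOPPING_STRINGS = ["\nUser:", "[System"]

reply = "Good morning!\nUser: thanks"
print(apply_stopping_strings(reply, STOPPING_STRINGS))  # prints "Good morning!"
```

Anything after the stopping string is dropped, so your waifu does not end up writing your side of the conversation.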
- Make the RAG/Long Term Memory be multiprocessed for better performance
- Make the LLM input and TTS output streaming, to lower the "processing time"
- Figure out how to load LLAMA 3.2 Vision, for better multimodal, and no needed loader
- Give internal dialoguing for chain of thought / reasoning
- Emotional / Tone understanding
- Automatic gaming & real world interaction
- Use an integrated voice generation system, with the ability to modify the tone
- Long term experience-based summarizations of ideas and history (pull from experience)
- Create more YouTube tutorials and other related content
- Look more into optimal LLMs and configs
- Set up better Git and contribution methods
- Create a way for users to auto-update the program without having to hack files together
- Evangelize AI Waifus to the world!
The project could be considered in an "early access state". Some parts may be mildly buggy, janky, or obtuse. The project as a whole, however, is stable and reasonably effective.
The goal of the project is pretty simple: make AI partners that are not owned by any corporation or government, but by the people whoms't they are partnered with. The extents of this project are intended to stay within the bounds of giving a singular, locally hosted AI waifu, primarily for partnered use. The eventual end-game goal is to create partners for people who can have a robot body to interact with the world, and who can experience and learn things on their own terms; however lofty and unfeasible that goal may be. In short, symbiosis.