Skip to content

Latest commit

 

History

History
128 lines (93 loc) · 8.5 KB

README.md

File metadata and controls

128 lines (93 loc) · 8.5 KB

Jakey Bot

Jakey Bot is a Gemini-based chatbot with personality, powered by Gemini 1.5 Pro and Flash

Jakey Bot Banner

This chatbot is designed to utilize the Gemini API and combine with best Python and Discord APIs to create a helpful chatbots

UI/UX availability

Jakey AI is available as Discord Bot. Standalone UI is coming soon

Features

  • It uses the latest and greatest Gemini 1.5 models with extensive multimodal capabilties, this chatbot can accept text, images, video, and text files to input. With models to choose from
  • Enables and exposes AI tools and features such as JSON mode, code execution, function calling, and system instructions for personality
  • It can summarize messages and integrate to Discord
  • Chat history per guild or user session (chat history is stored under pickle that snapshots the Gemini API chat history objects)
  • Gemini API requests are asynchronous

Installation

Core dependencies is Python with PIP, depending on your distribution, pip must be installed separately along with venv. If you want to enable music chatbot mode, you'll also need to install ffmpeg/openjdk

Required permissions for Discord bot

  • Read message history (see #faq for privacy implications)
  • Embed messages (required for rendering text more than 4096 and for most commands)
  • Send messages (obviously)
  • Attach files
  • Create webhooks
  • Create slash commands
  • Voice related features such as connect, disconnect

Required dependencies

  • Python 3.10+ with pip
    If you use Linux distros, I strongly require you to install Python with venv support due to PEP 0668 and PEP 453 for rationale.

Optional dependencies

  • OpenJDK 17 with ffmpeg
    Needed for voice commands (wavelink/lavalink)

Installation

Once you activated your enviornment and has pip ready, you can run

pip3 install -r requirements.txt

After you installed the dependencies, don't run main.py just yet. You must run these commands before installing, since Wavelink installs discord.py as dependency and we use py-cord due to ease of use

pip3 uninstall py-cord discord.py
pip3 install py-cord

Configuring

After you install the required dependencies, head over to dev.env.template and save it as dev.env in the gitroot directory

Required fields to configure:

  • TOKEN - Your Discord Bot Token
  • GOOGLE_AI_TOKEN - Gemini API token, please see this link to obtain API keys (Its free)
  • SYSTEM_USER_ID - Its strongly advisable you to use your Discord user ID for administrative commands like eval. You probably don't want me to control your infrastructure 😉

Please see CONFIG.md for more information about configuration.

Voice commands configuration:

You can enable VC-related commands such as /voice play (which plays videos from YouTube and other supported sources) by downloading Lavalink jar file and placing it as wavelink/Lavalink.jar in project's root directory.

Activate voice by placing Lavalink.jar from lavalink releases and rename application.yml.template to application.yml and run java -jar Lavalink.jar in separate session before starting the bot.

Running the server

After everything is configured, you can run main.py

Get started by asking Jakey /ask prompt:Who are you and how can I get started

By default, it uses Gemini 1.5 Flash because it's cheap, widely used, and has the same multimodal and contextual capabilities as Pro but it is statistically nerfed in terms of performance and diverse domain understanding, but it is much better than 1.0 Pro and GPT-3.5 and on-parity (in some cases outclasses) with the first GPT-4 model snapshot from March 2023. Please see the LLM arena for comparison

Get started

Jakey provides commands such as:

  • /ask - Ask Jakey anything!
    • Get started by asking /ask prompt: Hey Jakey, I'm new, tell me your commands, features, and capabilities
    • Accepts file attachments in image, video, audio, text files, and PDFs (with images) by passing attachment: parameter
    • JSON mode with json_mode:True
    • Ephemeral conversation with append_hist:True
    • You can choose between Gemini 1.5 Flash or Gemini 1.5 Pro using model: parameter
  • /sweep - Clear the conversation
  • /feature - Extend Jakey skills by activating chat tools! (Clears conversation when feature are set)
  • /imagine - Create images using Stable Diffusion 3
  • /summarize - Summarize the current text channel or thread and gather insights into a single summary thanks to Gemini 1.5 Flash's long context it can understand conversations even from the past decade!
  • /mimic - Mimics other users using webhook
  • /voice - Basic streaming audio functionality from YouTube, soundcloud and more!

Jakey also has apps which is used to take action on a selected message. Such as explain, rephrase, or suggest messages.

apps

FAQ

This is FAQ for people using this bot, please see FAQ for technical users to understand how data is stored or how the code works under the hood.

Why Jakey instead of standard Gemini personality?

Personality is implemented in the chatbot so to make it more human-like. However, it is based on a guy and Jakey's name is based on Jake which is mostly a masculine name (and no, don't expect Jakey to be your AI girlfriend). Prefer to keep it neutral however.

Can it search the internet?

Web Search (beta) can be used by enabling it under /feature command capability named "Web Search with DuckDuckGo" and ask queries with keywords like "Search the web"

Web search performs in two steps

  1. It searches the query through DuckDuckGo API and collects the links needed for page summarization
  2. The list of URLs is then being scrapped and agregates them so the model can understand them

The maximum number of queries can be used is 6 to prevent tokens from depleting so quickly due to large articles and causing slower responses as context builds up. It does not use embeddings at the moment.

Its recommended to use Gemini 1.5 Pro to better utilize Tool use but Flash also works. Keep in mind that the model sometimes cannot pick up the tool schema needed to perform web search action, if it fabricates its responses, explicitly tell the model to search the web.

Using web search can affect the response overall performance, due to number of pages are being passed depending through the query through the model which is quite similar to attaching a single 20 page PDF being processed. Its recommended to use web search sparingly if you want the model to be aware with certain information. You can also tell the model how many searches it can perform (but queries are maximum to 6) optimally 2-3 searches.

Depending on a website, some pages may not be used for responses that does not have extractable textual data.

You can also attach HTML files manually as part of attachment if you want a single page summarization img

Are models free to use?

Yes, both 1.5 Pro and Flash are free to use, and the latter is used by default (overriden by model: parameter)
The only limit is rate limit. 1.5 pro rate limits are usually lowest than flash.

If you have an account with higher rate limits, we suggest to self-host this bot and use your own API keys from AI studio with billing enabled to serve your users. Vertex AI and other non-Google AI models are not supported at this time.

Can this bot be user-installable?

You can use /ask, /imagine and /sweep commands in the bot's DM once you install this app by tapping "Add app" in its profile card and clicking "Try it yourself" otherwise you will get "Integration error" when directly using these commands in DMs.

https://support.discord.com/hc/en-us/articles/23957313048343-Moderating-Apps-on-Discord#h_01HZQQQEADYVN2CM4AX4EZGKHM

Keep in mind that after installing the app to yourself, mentioned commands are exposed anywhere even if the bot is not authorized in guilds you've joined. Using /ask and /sweep commands are not supported outside DMs or guilds where the bot is authorized despite it can be visible from anywhere if its installed by user scope. This is due to because some actions like ctx.send will prematurely end the command with Missing Access error.