Local Whisper Cat

A plugin that transcribes audio files to text locally, on your GPU or CPU.

How it works

This plugin communicates with an API service running in a local container to transcribe audio files to text.

In the settings panel you can set the location of the container and the audio_key.

The plugin is meant to be agnostic about the transcription service, but in practice it targets Whisper ASR Webservice.
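As an illustration of the exchange described above, here is a minimal sketch of how a caller could send audio to the container using only the Python standard library. It is not the plugin's actual code. The /asr path, the audio_file form field, the task/language/output query parameters and the "text" key in the JSON reply are assumptions based on the Whisper ASR Webservice interface, so check them against the version of the container you run. The function names are hypothetical.

```python
import json
import urllib.parse
import urllib.request
import uuid


def build_asr_url(base_url: str, language: str = "en", task: str = "transcribe") -> str:
    """Return the transcription URL for a service base URL (assumed /asr endpoint)."""
    query = urllib.parse.urlencode({"task": task, "language": language, "output": "json"})
    return f"{base_url.rstrip('/')}/asr?{query}"


def encode_multipart(field: str, filename: str, content_type: str, data: bytes):
    """Encode a single file as multipart/form-data; return (body, Content-Type header)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def transcribe(base_url: str, audio: bytes, filename: str,
               content_type: str, language: str = "en") -> str:
    """POST the audio to the service and return the transcribed text."""
    # "audio_file" is the assumed name of the upload field.
    body, header = encode_multipart("audio_file", filename, content_type, audio)
    request = urllib.request.Request(
        build_asr_url(base_url, language),
        data=body,
        headers={"Content-Type": header},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=120) as response:
        # Assumes the JSON reply carries the transcript under "text".
        return json.loads(response.read().decode("utf-8"))["text"]
```

With the container running, a call such as transcribe("http://openai-whisper-asr-webservice:9000", audio_bytes, "clip.ogg", "audio/ogg") would return the transcript as a string, provided the assumptions above hold.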

How to set up the plugin settings

Choose the model URL

"http://openai-whisper-asr-webservice:9000" by default

Choose an audio key field for your WebSocket message

"audio_key" by default

Choose a language

"en" by default
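To show how the three settings fit together, here is a small sketch that holds them in a dataclass and uses the configured audio key to pull the audio out of an incoming message. The class name, field names and helper are hypothetical and are not taken from the plugin's source. Only the default values come from the list above.

```python
import base64
from dataclasses import dataclass
from typing import Optional


@dataclass
class WhisperSettings:
    # Hypothetical field names; the defaults mirror the ones documented above.
    api_url: str = "http://openai-whisper-asr-webservice:9000"
    audio_key: str = "audio_key"
    language: str = "en"


def extract_audio(message: dict, settings: WhisperSettings) -> Optional[bytes]:
    """Return the decoded audio bytes, or None if the message carries no audio."""
    encoded = message.get(settings.audio_key)
    if not encoded:
        return None
    return base64.b64decode(encoded)
```

If you rename the audio key in the settings (for example to "voice"), the client has to send the base64 data under that same field name, otherwise the lookup finds nothing.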

How to send audio files to the Cat

Your client should send a message with the following fields: text, user_id, audio_key, audio_type, audio_name, encodedBase64.

The audio_key field should contain the base64-encoded audio file, as in the following example:

your_json_fields = {
    "text": "",
    "user_id": "user69",
    "audio_key": "",  # the base64-encoded audio goes here
    "audio_type": "audio/ogg",
    "audio_name": "msg45430839-160807.ogg",
    "encodedBase64": True,
}
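A client could build this message from raw audio bytes roughly as follows. This is a sketch, not the code used by any particular client. The function name is hypothetical, and the audio_key parameter must match the field name configured in the plugin settings.

```python
import base64
import json


def build_audio_message(audio: bytes, audio_name: str, audio_type: str,
                        user_id: str, audio_key: str = "audio_key") -> str:
    """Return the JSON text of a message carrying base64-encoded audio."""
    message = {
        "text": "",
        "user_id": user_id,
        # The audio travels as base64 text under the configured key.
        audio_key: base64.b64encode(audio).decode("ascii"),
        "audio_type": audio_type,
        "audio_name": audio_name,
        "encodedBase64": True,
    }
    return json.dumps(message)
```

The returned string is what the client would send as its WebSocket message. In practice the bytes would come from reading the audio file in binary mode.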

For convenience, you can use Chatty!, a compatible Python client, to send 10-second audio clips in the right format.

To transcribe on the GPU, you need the nvidia-container-toolkit installed and a suitable NVIDIA graphics card.

You also need a running whisper-asr-webservice container.

The accepted audio formats are: mp3, wav, ogg, mpeg, mp4 (depending on the container settings).

Example of a full local instance with Ollama and an NVIDIA container, using docker-compose

networks:
    fullcat-network:
services:
    cheshire-cat-core:
        build:
            context: ./core
        container_name: cheshire_cat_core
        depends_on:
            - cheshire-cat-vector-memory
            - ollama
            - openai-whisper-asr-webservice
        environment:
            - PYTHONUNBUFFERED=1
            - WATCHFILES_FORCE_POLLING=true
            - CORE_HOST=${CORE_HOST:-localhost}
            - CORE_PORT=${CORE_PORT:-1865}
            - QDRANT_HOST=${QDRANT_HOST:-cheshire_cat_vector_memory}
            - QDRANT_PORT=${QDRANT_PORT:-6333}
            - CORE_USE_SECURE_PROTOCOLS=${CORE_USE_SECURE_PROTOCOLS:-}
            - API_KEY=${API_KEY:-}
            - LOG_LEVEL=${LOG_LEVEL:-DEBUG}
            - DEBUG=${DEBUG:-true}
            - SAVE_MEMORY_SNAPSHOTS=${SAVE_MEMORY_SNAPSHOTS:-false}
        ports:
            - ${CORE_PORT:-1865}:80
        volumes:
            - ./cat/static:/app/cat/static
            - ./cat/public:/app/cat/public
            - ./cat/plugins:/app/cat/plugins
            - ./cat/metadata.json:/app/metadata.json
        restart: unless-stopped
        networks:
            - fullcat-network
            
    cheshire-cat-vector-memory:
        image: qdrant/qdrant:latest
        container_name: cheshire_cat_vector_memory
        expose:
            - 6333
        volumes:
            - ./cat/long_term_memory/vector:/qdrant/storage
        restart: unless-stopped
        networks:
            - fullcat-network
            
    ollama:
        container_name: ollama_cat
        image: ollama/ollama:latest
        volumes:
            - ./ollama:/root/.ollama
        expose:
            - 11434
        environment:
            - gpus=all
        deploy:
            resources:
                reservations:
                    devices:
                        - driver: nvidia
                          count: 1
                          capabilities:
                              - gpu
        networks:
            - fullcat-network
            
    openai-whisper-asr-webservice:
        deploy:
            resources:
                reservations:
                    devices:
                        - driver: nvidia
                          count: all
                          capabilities:
                              - gpu
        ports:
            - 9000:9000
        expose:
            - 9000
        environment:
            - ASR_MODEL=base
            - ASR_ENGINE=openai_whisper
        image: onerahmet/openai-whisper-asr-webservice:latest-gpu
        networks:
            - fullcat-network

LorenzoSiena/local_whisper_cat
