
VLM UI

VLM UI is a web-based user interface for interacting with various Vision Language Models (VLMs).

It provides a convenient way to upload images, ask questions, and receive responses from the model.

VLM UI Screenshot

Features

  • Web-based interface using Gradio (see the sketch below)
  • Support for multiple VLM models
  • Image upload and processing
  • Real-time streaming responses
  • Dockerised deployment
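
As an illustration of the web interface and streaming features above, here is a rough, standalone sketch of how an image-plus-question Gradio app can stream its answers. It uses placeholder names and a stub model, not the code in this repository:

import time
import gradio as gr

# Stub "model": echoes the question word by word, purely to show how Gradio
# streams partial output yielded from a generator function.
def answer(image, question):
    if image is None or not question:
        yield "Please upload an image and enter a question."
        return
    partial = ""
    for word in f"(stub reply about the image) {question}".split():
        partial += word + " "
        time.sleep(0.05)
        yield partial  # each yield updates the response box in place

with gr.Blocks(title="VLM UI (sketch)") as demo:
    with gr.Row():
        image = gr.Image(type="pil", label="Image")
        with gr.Column():
            question = gr.Textbox(label="Question")
            response = gr.Textbox(label="Response")
            gr.Button("Ask").click(answer, inputs=[image, question], outputs=response)

demo.queue().launch(server_name="0.0.0.0", server_port=7860)

Yielding successive partial strings from the handler is what produces the real-time streaming behaviour: Gradio pushes each yielded value to the browser without any extra front-end code.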

Prerequisites

  • Docker
  • NVIDIA GPU with CUDA support (for running models)

Quick Start

  1. Clone the repository:

    git clone --depth=1 https://github.com/sammcj/vlm-ui.git
    cd vlm-ui
  2. Build and run the Docker container:

    docker build -t vlm-ui .
    docker run -d --gpus all -p 7860:7860 -e MODEL_NAME=OpenGVLab/InternVL2-8B vlm-ui
  3. Open your browser and navigate to http://localhost:7860 to access the VLM UI.

Configuration

You can customise the behaviour of VLM UI by setting the following environment variables:

  • SYSTEM_MESSAGE: The system message to use for the conversation (default: "Carefully follow the users request.")
  • TEMPERATURE: Controls randomness in the model's output (default: 0.3)
  • TOP_P: Controls diversity of the model's output (default: 0.7)
  • MAX_NEW_TOKENS: Maximum number of tokens to generate (default: 2048)
  • MAX_INPUT_TILES: Maximum number of image tiles to process (default: 12)
  • REPETITION_PENALTY: Penalizes repetition in the model's output (default: 1.0)
  • MODEL_NAME: The name of the model to use (default: OpenGVLab/InternVL2-8B)
  • LOAD_IN_8BIT: Whether to load the model in 8-bit precision (default: 1)

Example:

docker run -d --gpus all -p 7860:7860 \
  -e MODEL_NAME=OpenGVLab/InternVL2-8B \
  -e TEMPERATURE=0.3 \
  -e MAX_NEW_TOKENS=2048 \
  vlm-ui
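
For reference, here is a minimal sketch of how these environment variables could be read and turned into generation settings. It assumes a Hugging Face-style generation config; the names and structure are illustrative, not the repository's actual code:

import os

def env_float(name, default):
    return float(os.environ.get(name, default))

def env_int(name, default):
    return int(os.environ.get(name, default))

# Defaults mirror the list above.
MODEL_NAME = os.environ.get("MODEL_NAME", "OpenGVLab/InternVL2-8B")
SYSTEM_MESSAGE = os.environ.get("SYSTEM_MESSAGE", "Carefully follow the users request.")
LOAD_IN_8BIT = os.environ.get("LOAD_IN_8BIT", "1") in ("1", "true", "True")
MAX_INPUT_TILES = env_int("MAX_INPUT_TILES", 12)

generation_config = dict(
    temperature=env_float("TEMPERATURE", 0.3),
    top_p=env_float("TOP_P", 0.7),
    max_new_tokens=env_int("MAX_NEW_TOKENS", 2048),
    repetition_penalty=env_float("REPETITION_PENALTY", 1.0),
)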

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

  • Copyright Sam McLeod
  • This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgements

This app builds on the work of the following projects:
