C0nsumption/Consume-Blip3

python src/analyze.py path/to/directory "Describe the image"

BLIP3 Autocaptioning Tools

Welcome to the XGen-MM (BLIP3) Autocaptioning Tools repository! This project sets up tools for autocaptioning images with the XGen-MM (BLIP3) model.

✅ Chat Mode     ✅ Caption Mode     ✅ FastAPI Application

Introduction

XGen-MM (BLIP3) is designed to provide efficient and accurate (lol, to a degree) autocaptioning. This repository sets up the necessary environment and a few tools for working with the xgen-mm-phi3-mini-instruct-r-v1 model from Salesforce.

Setup

(Tested on Ubuntu 22.04 | CUDA 12.1 | Torch 2.3.1+cu121)
For Windows, let me know. I'll make a pull request to actually test it, but it should work fine.
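
To compare your machine against the tested configuration above, a small check like the following may help. This is a minimal sketch, not part of the repository: the function name `describe_environment` and the `TESTED_TORCH` constant are made up for illustration, and it only reports what it finds rather than enforcing anything.

```python
import importlib.util
import platform

# Torch build from the tested configuration listed above.
TESTED_TORCH = "2.3.1+cu121"


def describe_environment() -> dict:
    """Collect basic facts about the current Python environment."""
    info = {
        "python": platform.python_version(),
        "os": platform.system(),
        "torch_installed": importlib.util.find_spec("torch") is not None,
        "torch_version": None,
        "cuda_version": None,
        "cuda_available": False,
        "matches_tested_torch": False,
    }
    if info["torch_installed"]:
        # Import lazily so the check still runs before torch is installed.
        import torch

        info["torch_version"] = torch.__version__
        info["cuda_version"] = torch.version.cuda  # None on CPU-only builds
        info["cuda_available"] = torch.cuda.is_available()
        info["matches_tested_torch"] = torch.__version__ == TESTED_TORCH
    return info


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

A mismatch with the tested versions does not necessarily mean the tools will fail; it only tells you that you are outside the configuration that was actually tried.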

Follow the steps below to set up the project:

Option 1: Using Setup Scripts

Linux/Mac:

  1. Download and Run the Shell Script:
    wget https://raw.githubusercontent.com/C0nsumption/Consume-Blip3/main/setup/setup.sh
    chmod +x setup.sh
    ./setup.sh

Windows:

  1. Download and Run the Batch Script:
     curl -o setup.bat https://raw.githubusercontent.com/C0nsumption/Consume-Blip3/main/setup/setup.bat
     setup.bat

Option 2: Manual Installation

  1. Clone this Repo and Navigate to the Project Directory:

    git clone https://github.com/C0nsumption/Consume-Blip3.git
    cd Consume-Blip3
  2. Set Up a Virtual Environment:

    python -m venv venv
    source venv/bin/activate  # Linux/Mac
    venv\Scripts\activate     # Windows (run this instead of the line above)
  3. Initialize Git LFS (make sure it is installed first; ask ChatGPT if you need help):

    git lfs install
  4. Clone the Model Repository:

    git clone https://huggingface.co/Salesforce/xgen-mm-phi3-mini-instruct-r-v1
  5. Install Dependencies:

    pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu121
    pip install -r requirements.txt
  6. Run Tests:

    python test/test.py
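
If the tests fail, it can help to confirm that the manual steps above left things where they are expected. The following is a minimal sketch, not part of the repository: `check_setup` is a hypothetical helper, and it assumes the model was cloned into the project directory (as in step 4) so that a folder named `xgen-mm-phi3-mini-instruct-r-v1` sits next to `requirements.txt`.

```python
import shutil
import sys
from pathlib import Path

# Folder created by the `git clone` of the model repository in step 4.
MODEL_DIR = "xgen-mm-phi3-mini-instruct-r-v1"


def check_setup(project_root: str = ".") -> dict:
    """Report whether each manual installation step appears to be done."""
    root = Path(project_root)
    return {
        # True when Python is running from an activated virtual environment.
        "in_virtualenv": sys.prefix != sys.base_prefix,
        # Git LFS installs an executable named `git-lfs`.
        "git_lfs_on_path": shutil.which("git-lfs") is not None,
        "model_dir_present": (root / MODEL_DIR).is_dir(),
        "requirements_present": (root / "requirements.txt").is_file(),
    }


if __name__ == "__main__":
    for name, ok in check_setup().items():
        print(f"{'OK     ' if ok else 'MISSING'} {name}")
```

Run it from the `Consume-Blip3` directory. A missing model folder usually means step 4 was skipped or run from a different directory.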

Usage

After setting up the environment, you can start using the BLIP3 autocaptioning tools. Detailed usage instructions and examples can be found in the Usage Guide.
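
The command shown near the top of this README, `python src/analyze.py path/to/directory "Describe the image"`, can also be driven from Python, for example to caption several folders in one go. The following is a minimal sketch under assumptions: the helper names are hypothetical, and it assumes the first argument is a directory of images and the second is the prompt, which is how the command reads but is not confirmed here.

```python
import subprocess
import sys


def build_analyze_command(directory: str, prompt: str,
                          script: str = "src/analyze.py") -> list:
    """Build the argument list for the analyze script (hypothetical helper)."""
    # sys.executable keeps the call inside the currently active virtual environment.
    return [sys.executable, script, directory, prompt]


def run_analyze(directory: str, prompt: str) -> subprocess.CompletedProcess:
    """Run the script and raise CalledProcessError if it exits with an error."""
    return subprocess.run(build_analyze_command(directory, prompt), check=True)


if __name__ == "__main__":
    # Example: caption every folder passed on the command line.
    for folder in sys.argv[1:]:
        run_analyze(folder, "Describe the image")
```

Passing the arguments as a list avoids shell quoting issues when the prompt contains spaces or quotes.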

Contributing

I welcome contributions from the community! If you'd like to contribute, please fork the repository and submit a pull request. For major changes, please open an issue first to discuss what you would like to change.

  1. Fork the Project
  2. Create your Feature Branch (git checkout -b feature/AmazingFeature)
  3. Commit your Changes (git commit -m 'Add some AmazingFeature')
  4. Push to the Branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.


Feel free to reach out if you have any questions or need further assistance! But give me time, I'm very busy accelerating 🫡

