A lightweight tool for quantizing large language models to GGUF format with configurable bit precision.
- Support for 4-bit and 8-bit quantization
- Compatible with Hugging Face models
- Memory-efficient processing
- Simple API for custom implementations
- Built-in scaling factor calculation
- Automatic tensor type handling
Install from source:

```bash
git clone https://github.com/KevinDKao/gguf-quantization
cd gguf-quantization
pip install -r requirements.txt
```
Dependencies:

- torch
- transformers
- numpy
- gguf
Basic usage:

```python
from quantize import quantize_model

model_path = "path/to/model"
output_path = "quantized_model.gguf"

# Quantize to 4-bit
quantize_model(model_path, output_path, bits=4)
```
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from quantize import quantize_model

# 8-bit quantization
quantize_model("gpt2", "gpt2_quantized.gguf", bits=8)

# Custom model quantization (quantize_model loads the model and tokenizer
# from the path itself, so loading them here is optional)
model = AutoModelForCausalLM.from_pretrained("custom_model")
tokenizer = AutoTokenizer.from_pretrained("custom_model")
quantize_model("custom_model", "custom_quantized.gguf", bits=4)
```
- Loads the model and tokenizer from the specified path
- Calculates optimal scaling factors for quantization
- Converts float32 tensors to int4/int8 with scaling (see the sketch after this list)
- Preserves non-float tensors in original format
- Writes quantized model to GGUF format
- Automatically handles tokenizer configuration
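For intuition, the conversion step amounts to symmetric quantization: a scaling factor maps the largest absolute value in a tensor onto the signed integer range, values are rounded into that range, and dequantization multiplies back by the scale. The function below is an illustrative sketch of that idea, not the repository's exact implementation (it also stores int4 values in an int8 container for simplicity):

```python
import numpy as np

def quantize_symmetric(weights: np.ndarray, bits: int = 8):
    """Symmetric per-tensor quantization of float32 weights to a signed integer grid."""
    qmax = 2 ** (bits - 1) - 1              # 127 for int8, 7 for int4
    scale = np.abs(weights).max() / qmax    # per-tensor scaling factor
    q = np.clip(np.round(weights / scale), -qmax, qmax).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Approximate the original float32 weights from the quantized values."""
    return q.astype(np.float32) * scale

# Round-trip example: the reconstruction error shrinks as the bit width grows
w = np.random.randn(4, 4).astype(np.float32)
q, scale = quantize_symmetric(w, bits=4)
print(np.abs(w - dequantize(q, scale)).max())
```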
- Fork the repository
- Create a feature branch
- Commit your changes
- Push to the branch
- Open a Pull Request
MIT License
- GitHub: @KevinDKao
- Issues: https://github.com/KevinDKao/gguf-quantization/issues