Compressa.ai

llm-awq Public

AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

Python 1

compressa-perf Public

Python 1 1

Intel® Neural Compressor (formerly known as Intel® Low Precision Optimization Tool), targeting to provide unified APIs for network compression technologies, such as low precision quantization, spar…

Python

qlora Public

Forked from artidoro/qlora

QLoRA: Efficient Finetuning of Quantized LLMs

Jupyter Notebook

peft Public

Forked from huggingface/peft

🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.

Python

OmniQuant Public

Forked from OpenGVLab/OmniQuant

OmniQuant is a simple and powerful quantization technique for LLMs.

Python

