Quantized-LLMs

See the linked document for the literature review.

All the Colab notebooks are in the notebooks directory.

See QUANTIZE_and_BENCHMARK.md for LLM quantization and benchmarking. It covers all my implementation details, challenges, and adaptations.

See DEPLOY_MOBILE_GUIDE.md for deploying an LLM to mobile. It covers all my implementation details, challenges, and adaptations.

Published studio at Lightning AI

Benchmark GGUF-format quantized models using lm-evaluation-harness and llama-cpp-python
HumanEval benchmark (non-quantized and quantized: GPTQ, GGUF)
Research SmoothQuant
Pruning
Distillation
