DistillKitPlus

DistillKitPlus is an open-source toolkit for knowledge distillation (KD) of LLMs. The repo was inspired by arcee-ai/DistillKit. The main motivation behind the toolkit is to support offline distillation and parameter-efficient fine-tuning (PEFT) in low-compute settings.

For background on knowledge distillation, see "Knowledge Distillation: A Survey": https://arxiv.org/abs/2006.05525

Features

  • Logit Distillation: Supports same-architecture teacher and student models (see the loss sketch after this list).
  • Pre-Computed Logits: Enables memory-efficient training by generating logits in advance.
  • LoRA Fine-Tuning Integration: Efficient low-rank adaptation fine-tuning support.
  • Quantization Support: 4-bit model quantization for faster inference and reduced memory usage.
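For intuition, logit distillation of this kind typically combines a temperature-scaled KL-divergence term against the teacher's logits with the standard cross-entropy loss, blended by alpha (both parameters appear under the distillation section of the config). The snippet below is a minimal PyTorch sketch of that idea, not DistillKitPlus's actual loss; the function name and default values are illustrative.

import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, temperature=2.0, alpha=0.5):
    # Soften both distributions with the temperature and compare them with KL divergence.
    kd = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * (temperature ** 2)
    # Standard next-token cross-entropy on the ground-truth labels.
    ce = F.cross_entropy(student_logits.view(-1, student_logits.size(-1)), labels.view(-1))
    # Blend the two objectives; alpha weights the distillation term.
    return alpha * kd + (1.0 - alpha) * ce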

Installation

git clone https://github.com/agokrani/distillkitplus.git
cd distillkitplus
pip install -r requirements.txt
pip install .

Quick Start

  • Configure your distillation settings in config/default_config.json
  • Generate teacher logits (a sketch of what this step does follows the list):
    python scripts/local/generate_logits.py --config config/default_config.json
  • Run distillation:
    python scripts/local/distill_logits.py --config config/default_config.json
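Conceptually, the logit-generation step runs the teacher once over the dataset and saves its output distributions to disk, so the teacher never has to be held in memory during student training. The sketch below illustrates that idea with a Hugging Face causal LM; it is not the actual generate_logits.py, and the model name, output format, and file path are illustrative.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

@torch.no_grad()
def precompute_teacher_logits(model_name, texts, out_path, max_length=512):
    # Load the teacher a single time; only the saved logits are needed later.
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16)
    model.eval()
    saved = []
    for text in texts:
        inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=max_length)
        logits = model(**inputs).logits  # (1, seq_len, vocab_size)
        saved.append(logits.squeeze(0).cpu())
    # Persist the pre-computed logits for the offline distillation step.
    torch.save(saved, out_path)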

Optional: Modal Integration

DistillKitPlus also supports running its scripts on Modal. Use the following commands to perform knowledge distillation with Modal (a generic sketch of the pattern follows the list):

  • Generate teacher logits:
    python scripts/modal/generate_logits.py --config config/default_config.json
  • Run distillation:
    python scripts/modal/distill_logits.py --config config/default_config.json
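The actual Modal entry points live in scripts/modal/ and read the same JSON config. Purely as an illustration of the pattern (a Modal function that runs the heavy GPU work remotely), a minimal sketch might look like the following; the image contents, GPU type, and function names are assumptions, not the repository's real setup.

import modal

app = modal.App("distillkitplus-example")
# Container image with the dependencies the remote function needs.
image = modal.Image.debian_slim().pip_install("torch", "transformers", "peft")

@app.function(image=image, gpu="A100", timeout=60 * 60)
def generate_logits(config_path: str):
    # The teacher forward passes would run here, inside the remote GPU container.
    print(f"would generate logits using {config_path}")

@app.local_entrypoint()
def main(config: str = "config/default_config.json"):
    generate_logits.remote(config)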

Configuration

The toolkit uses a JSON configuration file with the following main sections:

  • project_name: Name of your distillation project
  • dataset: Dataset configuration including source and processing settings
  • models: Teacher and student model specifications
  • tokenizer: Tokenizer settings including max length and padding
  • training: Training hyperparameters
  • distillation: Distillation-specific parameters (temperature, alpha)
  • lora: LoRA configuration for efficient fine-tuning
  • quantization: Model quantization settings

See config/default_config.json for a complete example.
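To give a rough sense of the shape, the top-level sections map onto a structure like the one below; the section names come from the list above, but the individual fields and values are placeholders, not the real schema (see config/default_config.json for that).

# Illustrative structure only; the fields inside each section are placeholders.
example_config = {
    "project_name": "my-distillation-run",
    "dataset": {"name": "dataset-or-path", "split": "train"},
    "models": {"teacher": "teacher-model-id", "student": "student-model-id"},
    "tokenizer": {"max_length": 2048, "padding": "max_length"},
    "training": {"num_train_epochs": 1, "per_device_train_batch_size": 1},
    "distillation": {"temperature": 2.0, "alpha": 0.5},
    "lora": {"r": 16, "lora_alpha": 32},
    "quantization": {"load_in_4bit": True},
}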

Contributing

We welcome contributions from the community! If you have ideas for improvements, new features, or bug fixes, please feel free to open an issue or submit a pull request.

Contact

For any technical questions or issues, please open an issue in this repository. We appreciate your feedback and support!
