This repository implements an RL-based active learning framework in which a base language model (Llama-3.2-3B) guides its own training by sequencing non-overlapping key-value pairs drawn from Wikipedia articles.

Install the dependencies and run the main script:

    pip install -r requirements.txt
    python src/main.py

Features:

- Self-directed curriculum learning via RL
- Attention-guided key-value pair selection
- Efficient context window utilization with non-overlapping pairs
- Trajectory filtering for high-quality updates
- LoRA for efficient fine-tuning
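
Two of the features above, non-overlapping pair selection and trajectory filtering, can be illustrated with a minimal sketch. This is not the repository's implementation: the `Span` type, the function names, the greedy selection rule, the token `budget`, and the `keep_fraction` threshold are all assumptions made for illustration, and the scores stand in for whatever attention-derived signal the framework actually uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """Hypothetical candidate key-value pair as a half-open token range [start, end)."""
    start: int
    end: int
    score: float  # assumed attention-derived relevance score


def select_non_overlapping(candidates, budget):
    """Greedily keep the highest-scoring spans that neither overlap nor exceed the token budget."""
    chosen = []
    used = 0
    for span in sorted(candidates, key=lambda s: s.score, reverse=True):
        length = span.end - span.start
        if used + length > budget:
            continue  # would not fit in the remaining context window
        if any(span.start < c.end and c.start < span.end for c in chosen):
            continue  # overlaps a span that was already selected
        chosen.append(span)
        used += length
    # Return the selection in document order.
    return sorted(chosen, key=lambda s: s.start)


def filter_trajectories(trajectories, keep_fraction=0.5):
    """Keep the top fraction of (trajectory, reward) pairs, ranked by reward."""
    if not trajectories:
        return []
    ranked = sorted(trajectories, key=lambda t: t[1], reverse=True)
    keep = max(1, int(len(ranked) * keep_fraction))
    return ranked[:keep]
```

Greedy selection by score is a simple heuristic and does not guarantee the best total score under the budget; it is used here only because it makes the non-overlap constraint easy to see.
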

Run the tests with pytest:

    pytest tests/