In this project I have implemented the model architecture and the training pipeline from scratch. The end goal is a fully functional GPT-2 model that replicates the results of NanoGPT and integrates cleanly with Hugging Face and Lightning for future flexibility.
This repository is a work in progress, and I will be updating it as I go along.
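As a rough sketch of what the Hugging Face / Lightning integration could look like (hypothetical class and hyperparameter names, not the repository's actual code):

```python
# A minimal sketch of wrapping a Hugging Face GPT-2 in a LightningModule.
# Names and hyperparameters below are illustrative assumptions.
import torch
import lightning as L
from transformers import GPT2Config, GPT2LMHeadModel

class GPT2LitModule(L.LightningModule):
    def __init__(self, lr: float = 6e-4):
        super().__init__()
        # Config sized like the 124M-parameter GPT-2 that NanoGPT targets
        config = GPT2Config(n_layer=12, n_head=12, n_embd=768)
        self.model = GPT2LMHeadModel(config)
        self.lr = lr

    def training_step(self, batch, batch_idx):
        input_ids, labels = batch
        # GPT2LMHeadModel shifts labels internally and returns the LM loss
        out = self.model(input_ids=input_ids, labels=labels)
        self.log("train_loss", out.loss)
        return out.loss

    def configure_optimizers(self):
        return torch.optim.AdamW(self.parameters(), lr=self.lr)
```

With a wrapper like this, the same module can be trained by a Lightning `Trainer` while the underlying weights stay compatible with the Hugging Face ecosystem.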
- Progress list:
  - [x] Replicate NanoGPT on tiny_shakespeare
  - [ ] Introduce MoE code
  - [ ] Implement Llama 2
  - [ ] Implement Llama 3
- Resources and references: //TODO//