- gpt2 https://huggingface.co/openai-community/gpt2/resolve/main/model.safetensors
- DeepSeek-R1-Distill-Qwen-1.5B https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
- DeepSeek-R1-Distill-Qwen-7B https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
- Parse all the layers from the safetensors weights
- Tokenizer
- Matmul in C and CUDA
- MLP in C and CUDA
- Attention layer in C and CUDA
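
As a possible starting point for the matmul item, here is a minimal CPU sketch in C. It assumes row-major `float` matrices. The name `matmul_naive` and its signature are hypothetical, not taken from this project. It is an untuned triple loop meant as a reference to check a later CUDA kernel against, not an optimized implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference kernel (name and signature are assumptions).
 * Computes out (m x n) = a (m x k) * b (k x n), all stored row-major. */
void matmul_naive(const float *a, const float *b, float *out,
                  size_t m, size_t k, size_t n)
{
    assert(a != NULL && b != NULL && out != NULL);

    for (size_t i = 0; i < m; i++) {          /* row of a and out */
        for (size_t j = 0; j < n; j++) {      /* column of b and out */
            float acc = 0.0f;
            for (size_t p = 0; p < k; p++) {  /* shared inner dimension */
                acc += a[i * k + p] * b[p * n + j];
            }
            out[i * n + j] = acc;
        }
    }
}
```

For example, multiplying the 2x3 matrix `{1, 2, 3, 4, 5, 6}` by the 3x2 matrix `{7, 8, 9, 10, 11, 12}` with `matmul_naive(a, b, out, 2, 3, 2)` fills `out` with `{58, 64, 139, 154}`. The same routine could serve as the building block for the MLP and attention items, since both reduce largely to matrix products.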