I'm Peter, a software engineer embarking on a fresh journey into the world of transformers, and I invite you to join me! The course is a work in progress: it's free, open-source, and we'll be building it together, step by step. We'll explore key concepts, tackle practical exercises, and dissect seminal papers, all while learning and growing together. Using YouTube videos for clarity and Jupyter notebooks for hands-on practice, we're set for the journey ahead. Let's dive in together!
- Encoder-decoder architecture
- Self-attention (see the sketch after this list)
- Multi-head attention
- Positional encoding
- Keys, queries, and values
- Word embeddings
- Dynamic padding
- Tokenization
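To give a first taste of the concepts above, here is a minimal NumPy sketch of scaled dot-product self-attention. The variable names, shapes, and toy data are my own illustration, not code from the course notebooks.

```python
# A minimal sketch of scaled dot-product self-attention (illustrative only).
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, W_q, W_k, W_v):
    """X: (seq_len, d_model); W_q, W_k, W_v: (d_model, d_k) projection matrices."""
    Q, K, V = X @ W_q, X @ W_k, X @ W_v       # project inputs to queries, keys, values
    scores = Q @ K.T / np.sqrt(K.shape[-1])   # scaled dot-product similarity
    weights = softmax(scores, axis=-1)        # attention weights over the sequence
    return weights @ V                        # weighted sum of values

# Toy usage: 4 tokens, model dimension 8, head dimension 4
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))
W_q, W_k, W_v = (rng.normal(size=(8, 4)) for _ in range(3))
print(self_attention(X, W_q, W_k, W_v).shape)  # (4, 4)
```

Multi-head attention simply runs several of these projections in parallel and concatenates the results; we'll build that version step by step in the exercises.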
- Implement self-attention from scratch
- Implement multi-head attention from scratch
- Build a simple transformer model for a sequence-to-sequence task
- Fine-tune a pre-trained model like BERT or GPT-2 on a specific task
- Use a pre-trained transformer like GPT-2 for text generation (see the sketch after this list)
- Train a ViT on a custom dataset for image classification
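As a small preview of the text-generation exercise, here is a minimal sketch using the Hugging Face `transformers` library (assuming it is installed, e.g. `pip install transformers`). The prompt and generation settings are placeholders, not values from the course.

```python
# Minimal sketch: generate text with a pre-trained GPT-2 via the Hugging Face pipeline API.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
outputs = generator("Transformers are", max_new_tokens=30, num_return_sequences=1)
print(outputs[0]["generated_text"])
```

The fine-tuning and ViT exercises follow the same pattern: load a pre-trained checkpoint, swap in your data, and train or run inference from there.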
- "Attention Is All You Need" (2017) [link]
- "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding" (2018) [link]
- "ViT: An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale" (2020) [link]
- "DETR: End-to-End Object Detection with Transformers" (2020) [link]
- "CLIP: Learning Transferable Visual Models From Natural Language Supervision" (2021) [link]
- "GPT-3: Language Models are Few-Shot Learners" (2020) [link]
- Introduction to the course (coming soon)
- Self-attention (coming soon)
- Multi-head attention (coming soon)
- Paper review: "Attention Is All You Need" (coming soon)
I would love your help in making this repository even better! Whether you want to correct a typo, add new content, or suggest an improvement, feel free to open an issue.