This repo is a collection of neural network implementations written from scratch in Python. The goal is to understand the inner workings of neural networks by building each piece myself.
The idea of building neural nets from scratch came mainly from George Hotz's tinygrad and Andrej Karpathy's micrograd (I created nanograd while studying micrograd and the associated lecture series).
Unlike with nanograd, I won't copy any tutorial directly; instead, I will use tutorials as references for building my own neural network implementations.
- Book: Dive into Deep Learning
- https://d2l.ai
- My notes and exercise solutions:
- Andrej Karpathy
- Reproducing LeCun et al., 1989
- micrograd
- Neural Networks: Zero to Hero
- tinygrad
- 3Blue1Brown:
- Book: An Introduction to Statistical Learning
- https://www.statlearning.com
- My notes and exercise solutions:
- Book: All of Statistics, Wasserman 2004
- https://link.springer.com/book/10.1007/978-0-387-21736-9
- My notes and exercise solutions:
- https://github.com/Daniel-Sinkin/Research-and-Development/tree/main/Books-Courses-Exercises/Wasserman-Statistics
- Currently part of my general RnD repo; I should move this to a separate repo soon.
- Coursera: Deep Learning Specialization
- Bengio et al., 2003
- A Neural Probabilistic Language Model
- https://www.jmlr.org/papers/volume3/bengio03a/bengio03a.pdf
- Vaswani et al., 2017
- Attention Is All You Need
- The original transformer architecture paper
- https://arxiv.org/abs/1706.03762
- Kingma & Ba, 2014
- Adam: A Method for Stochastic Optimization
- The paper that introduced the Adam optimizer
- https://arxiv.org/abs/1412.6980
- Radford et al., 2019
- Language Models are Unsupervised Multitask Learners
- The original GPT-2 paper
- Code and models:
- LeCun et al., 1989
- Backpropagation Applied to Handwritten Zip Code Recognition
- http://yann.lecun.com/exdb/publis/pdf/lecun-89e.pdf
- Sennrich et al., 2015
- Neural Machine Translation of Rare Words with Subword Units
- https://arxiv.org/abs/1508.07909