En:

Deep Learning book: a classic. Gives a comprehensive overview of almost all vital topics in ML and DL. Available online at https://www.deeplearningbook.org
The Hundred-Page Machine Learning Book: link (available online, e.g. on GitHub)
Stanford lectures on Probability Theory: link
Matrix calculus notes from Stanford: link
Derivatives notes from Stanford: link
Reinforcement Learning: An Introduction, by Richard S. Sutton and Andrew G. Barto: link

Ru:

Excellent lectures by Zhenya Sokolov. Read the PDFs, preferably from the most recent year: link
“Handwritten textbook” by students of our course at FIVT: link
Vorontsov's course notes: link
A wonderful book by V.G. Spokoiny on linear estimators: link

Basics:

Bootstrap and bias-variance decomposition:

[en] Detailed description of bootstrap procedure: link
[en] Bias-variance tradeoff in a more general case: "A Unified Bias-Variance Decomposition and its Applications": link

Gradient Boosting and Feature Importances:

[en] Great interactive blogpost by Alex Rogozhnikov on Gradient Boosting: http://arogozhnikov.github.io/2016/06/24/gradient_boosting_explained.html
[en] And great gradient boosted trees playground by Alex Rogozhnikov: http://arogozhnikov.github.io/2016/07/05/gradient_boosting_playground.html
[en] SHAP values repo and explanation: https://github.com/slundberg/shap
[en] Kaggle tutorial on feature importances: https://www.kaggle.com/learn/machine-learning-explainability

Deep Learning:

[en] Notes on vector and matrix derivatives: http://cs231n.stanford.edu/vecDerivs.pdf
[en] More notes on matrix derivatives from Stanford: link
[en] Stanford notes on backpropagation: http://cs231n.github.io/optimization-2/
[en] Stanford notes on different activation functions (and general intuition): http://cs231n.github.io/neural-networks-1/
[en] Great Medium post by Andrej Karpathy on why you should understand backprop: https://medium.com/@karpathy/yes-you-should-understand-backprop-e2f06eab496b
[en] CS231n notes on data preparation (batch normalization is covered there too): http://cs231n.github.io/neural-networks-2/
[en] CS231n notes on gradient methods: http://cs231n.github.io/neural-networks-3/
[en] Original paper introducing Batch Normalization: https://arxiv.org/pdf/1502.03167.pdf
[en] What Every Computer Scientist Should Know About Floating-Point Arithmetic: https://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html
[en] The Unreasonable Effectiveness of Recurrent Neural Networks blog post by Andrej Karpathy: http://karpathy.github.io/2015/05/21/rnn-effectiveness/
[en] Understanding LSTM Networks: http://colah.github.io/posts/2015-08-Understanding-LSTMs/
[en] Convolutional Neural Networks: Architectures, Convolution / Pooling Layers: http://cs231n.github.io/convolutional-networks/
[en] Understanding and Visualizing Convolutional Neural Networks: http://cs231n.github.io/understanding-cnn/
[en] LR warm-up and useful tricks: article

Natural Language Processing:

[en] Great resource by Lena Voita (direct link to Word Embeddings explanation): https://lena-voita.github.io/nlp_course/word_embeddings.html
[en] Word2vec tutorial: http://mccormickml.com/2016/04/19/word2vec-tutorial-the-skip-gram-model/
[en] Beautiful post by Jay Alammar on word2vec: http://jalammar.github.io/illustrated-word2vec/
[en] Blog post about text classification with RNNs and CNNs: https://medium.com/jatana/report-on-text-classification-using-cnn-rnn-han-f0e887214d5f
[en] Convolutional Neural Networks for Sentence Classification: https://arxiv.org/abs/1408.5882
[en] Great blog post by Jay Alammar on Transformer: https://jalammar.github.io/illustrated-transformer/
Notebook on positional encoding: link
[en] Great Annotated Transformer article with code and comments by Harvard NLP group: https://nlp.seas.harvard.edu/2018/04/03/attention.html
[en] Harvard NLP full Transformer implementation in PyTorch
[en] OpenAI blog post Better Language Models and Their Implications (GPT-2)
[en] Paper describing positional encoding: "Convolutional Sequence to Sequence Learning"
[en] Paper presenting Layer Normalization
[en] The Illustrated BERT blog post
[en] DistilBERT overview blog post (distillation will be covered later in our course)
[en] Google AI Blog post about open sourcing BERT
[en] One more blog post explaining BERT
[en] Post about GPT-2 on the OpenAI blog (as of 04.10.2019)

Graph Neural Networks:
