I compiled this list when transitioning out of academia into industry. My background is in (quantitative) economics, and this list assumes that you know college-level math (calculus, linear algebra, probability/statistics) and have done some programming (e.g. in MATLAB).
- Read ISL (free) for a gentle introduction to the theory behind important machine learning algorithms
- Read Hands on ML (Part I) to learn how to code up the most important algorithms in Python
- Take part in a Kaggle Challenge
- Python
  - Data manipulation: Python for Data Analysis
  - Fundamentals: Learn Python the Hard Way (free)
  - Important libraries: numpy, pandas, seaborn, scikit-learn, keras, tensorflow
- R
  - Data manipulation and visualization: R for Data Science (free)
  - Fundamentals: Advanced R, Writing R Packages (free)
  - Important libraries: dplyr, ggplot2, tidyr, gbm, xgboost
- Version control
  - Happy git with R (free)
  - GitHub is your friend!
- Goodfellow/Bengio/Courville (free) for theory
- Google’s (short) TensorFlow course on Udacity (free)
- Hands on ML (Part II) for implementation using TensorFlow in Python
- The classic for theory: The Visual Display of Quantitative Information
- Workhorse libraries are ggplot2 (R) and seaborn (Python)
- You can build quick, interactive dashboards in R using shiny and plotly. If you are serious about interactive visualization, then you may want to invest in D3
- For a statistical perspective on many ML techniques
- For some of the more theoretical concepts in ML (e.g. VC-bound):
- Classics:
- Online courses
  - Andrew Ng's Machine Learning course on Coursera
  - MIT 6.00.1 (edX) for basic CS concepts with exercises in Python