ML course materials for bioinformatics students following the course Basic Machine Learning for Bioinformatics at UU.
- Day 1: Linear regression, gradient descent, introduction to linear algebra
- Day 2: Logistic regression, regularisation, ROC curve, introduction to neural networks (NNs)
- Day 3: NN backpropagation algorithm, convolutional neural networks explained, guest speaker on deep learning in Oxford Nanopore sequencing (13:15-14:05)
- Day 4: K-means clustering, hierarchical clustering, deep dive into phylogenetics
- Day 5: Problems with high-dimensional data, Principal Component Analysis (PCA)
- Day 6: Working with scikit-learn, introduction to Keras and TensorFlow, project introduction and start
The material assumes a local installation of Anaconda, including the packages numpy, scipy, pandas, sklearn, biopython, pandas-plink, tensorflow, notebook, matplotlib, and seaborn.
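To check whether a local environment has the packages listed above, a minimal sketch along these lines may help. It is not part of the course materials: the helper name `missing_packages` and the dictionary `COURSE_PACKAGES` are hypothetical, and the import names are assumptions about how each package is imported (for example, biopython is imported as `Bio` and pandas-plink as `pandas_plink`).

```python
from importlib.util import find_spec

# Package name as listed above -> module name used in `import ...`.
# The import names are assumptions; adjust them if your setup differs.
COURSE_PACKAGES = {
    "numpy": "numpy",
    "scipy": "scipy",
    "pandas": "pandas",
    "sklearn": "sklearn",
    "biopython": "Bio",
    "pandas-plink": "pandas_plink",
    "tensorflow": "tensorflow",
    "notebook": "notebook",
    "matplotlib": "matplotlib",
    "seaborn": "seaborn",
}


def missing_packages(packages):
    """Return the listed names whose module cannot be located.

    Uses find_spec so that heavy packages are located without being imported.
    """
    return [name for name, module in packages.items() if find_spec(module) is None]


if __name__ == "__main__":
    missing = missing_packages(COURSE_PACKAGES)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All course packages were found.")
```

Run it with the Python interpreter of the Anaconda environment you intend to use for the course. It only reports whether each module can be found, not whether its version is suitable.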
For more information and resources, read the course reader.
These materials are greatly inspired by, and based on, Andrew Ng's course on Coursera. The PCA part is based on Prof. Victor Lavrenko's excellent lecture series. Many thanks are owed to Dr. Jeroen de Ridder for expert assistance. I thank Dr. ir. Bas van Breukelen for long-term assistance and Prof. Dr. Berend Snel for comments on the phylogenetics part. Any errors remain my own (and, with your help, will hopefully be noticed and rectified soon).