This repository contains the course material for the Bayesian Data Analysis course at Aalto (CS-E5710). Aalto students should also check the MyCourses announcements.
The material will be updated during the course. Exercise instructions and slides will be updated at the latest on the Monday of the corresponding week.
- Basic terms of probability theory
- probability, probability density, distribution
- sum, product rule, and Bayes' rule
- expectation, mean, variance, median
- in Finnish, see e.g. Stokastiikka ja tilastollinen ajattelu
- in English, see e.g. Wikipedia and Introduction to probability and statistics
- Some algebra and calculus
- Basic visualisation techniques (R or Python)
- histogram, density plot, scatter plot
- see e.g. BDA_R_demos
- see e.g. BDA_py_demos
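The sum rule, product rule, and Bayes' rule listed above can be checked with a few lines of Python. This is only a minimal sketch: the two-state diagnostic-test setting and all the numbers in it are hypothetical, chosen for illustration and not taken from the course material.

```python
# Hypothetical numbers, chosen only for illustration.
prior = {"disease": 0.01, "healthy": 0.99}        # p(state)
likelihood = {"disease": 0.95, "healthy": 0.05}   # p(positive test | state)

# Product rule: p(positive, state) = p(positive | state) * p(state)
joint = {state: likelihood[state] * prior[state] for state in prior}

# Sum rule: p(positive) = sum over states of p(positive, state)
evidence = sum(joint.values())

# Bayes' rule: p(state | positive) = p(positive, state) / p(positive)
posterior = {state: joint[state] / evidence for state in joint}

print(posterior)
```

With these made-up numbers the posterior probability of disease after a positive test is about 0.16, far below the 0.95 test sensitivity, because the prior probability is small.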
If you find BDA3 too difficult to start with, I recommend
- For background prerequisites, see, e.g., chapters 2, 4 and 5 in Kruschke, "Doing Bayesian Data Analysis". Some of my students have found this useful.
- Richard McElreath's Statistical Rethinking book is easier, and the latest videos of Statistical Rethinking: A Bayesian Course Using R and Stan are highly recommended even if you are following BDA3.
- Michael Betancourt has a different point of view in his introductory material, and many have also found it enlightening. Furthermore, his Hamiltonian Monte Carlo videos are highly recommended if you are taking this course.
Assessment is based on exercises (67%) and project work (33%). At least 50% of the points must be obtained from each of the exercises and the project work.
Bayesian Data Analysis, 3rd ed, by Andrew Gelman, John Carlin, Hal Stern, David Dunson, Aki Vehtari, and Donald Rubin. Home page for the book. Errata for the book.
- Background (Ch 1)
- Single-parameter models (Ch 2)
- Multiparameter models (Ch 3)
- Computational methods (Ch 10)
- Markov chain Monte Carlo (Ch 11-12)
- Extra material for Stan and probabilistic programming (see below)
- Hierarchical models (Ch 5)
- Model checking (Ch 6)
- Evaluating and comparing models (Ch 7)
- Decision analysis (Ch 9)
- Large sample properties and Laplace approximation (Ch 4)
- In addition, you will learn a workflow for Bayesian data analysis
The recommended way to go through the material is:
- Read the reading instructions for a chapter in chapter_notes.
- Read the chapter in BDA3 and check that you find the terms listed in the reading instructions.
- Watch the corresponding lecture video to get explanations of the most important parts.
- Read the corresponding additional information in the chapter notes.
- Run the corresponding demos in R demos or Python demos.
- Read the exercise instructions and do the corresponding exercises. The demo code in R demos and Python demos has many useful examples for handling data and plotting figures. If you have problems, visit the TA sessions or ask in the course Slack channel.
- If you want to learn more, also do the self-study exercises listed below.
- Slides
- including code for reproducing some of the figures
- Chapter notes
- including reading instructions highlighting the most important parts and terms
Text licensed under CC-BY-NC 4.0. Code licensed under BSD-3.
Shorter video clips on selected topics are available in a Panopto folder.
- 1.1 Introduction to uncertainty and modelling
- 1.2 Introduction to the course contents
- 2.1 Observation model, likelihood, posterior and binomial model
- 2.2 Predictive distribution and benefit of integration
- 2.3 Priors and prior information
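The binomial model, posterior, and predictive distribution named in the clips above can be sketched with the conjugate Beta prior. This is a minimal illustration, assuming a Beta(1, 1) prior and made-up data (7 successes in 10 trials); the function name `beta_binomial_posterior` is hypothetical and does not come from the course demos.

```python
import random


def beta_binomial_posterior(y, n, a=1.0, b=1.0):
    """Return the Beta posterior parameters for y successes in n trials.

    With a Beta(a, b) prior and a binomial observation model,
    the posterior is Beta(a + y, b + n - y).
    """
    return a + y, b + (n - y)


# Made-up data, for illustration only.
y, n = 7, 10
a_post, b_post = beta_binomial_posterior(y, n)

# Posterior mean of theta. For this model it also equals the posterior
# predictive probability that the next trial is a success.
posterior_mean = a_post / (a_post + b_post)

# Monte Carlo check: average of draws from the Beta posterior.
rng = random.Random(1)
draws = [rng.betavariate(a_post, b_post) for _ in range(10_000)]
mc_mean = sum(draws) / len(draws)

print("analytic posterior mean:", posterior_mean)
print("Monte Carlo estimate:   ", mc_mean)
```

Integrating over the posterior, rather than plugging in a single point estimate, is what makes the predictive probability equal to the posterior mean here.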
The fall 2019 lecture videos will appear weekly in a Panopto folder.
- Lecture 2.1 and Lecture 2.2 on basics of Bayesian inference, observation model, likelihood, posterior and binomial model, predictive distribution and benefit of integration, priors and prior information, and one-parameter normal model.
- Lecture 3 on multiparameter models, joint, marginal and conditional distribution, normal model, bioassay example, grid sampling and grid evaluation.
- Lecture 4.1 on numerical issues, Monte Carlo, how many simulation draws are needed, how many digits to report, and Lecture 4.2 on direct simulation, curse of dimensionality, rejection sampling, importance sampling.
- Lecture 5.1 on Markov chain Monte Carlo, Gibbs sampling, Metropolis algorithm, and Lecture 5.2 on warm-up, convergence diagnostics, R-hat, and effective sample size.
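To make the Lecture 5 topics concrete, the following is a rough sketch of a random-walk Metropolis sampler with warm-up and a basic R-hat check. The standard normal target, the proposal scale, the starting points, and the chain lengths are all assumptions made for illustration. The R-hat here is the simple non-split version; the lectures and Stan may use a refined variant.

```python
import math
import random
import statistics


def log_target(theta):
    """Unnormalized log density of a standard normal (stand-in target)."""
    return -0.5 * theta * theta


def metropolis(n_iter, start, proposal_sd, rng):
    """Random-walk Metropolis with a symmetric normal proposal."""
    theta = start
    draws = []
    for _ in range(n_iter):
        proposal = theta + rng.gauss(0.0, proposal_sd)
        log_ratio = log_target(proposal) - log_target(theta)
        # Accept with probability min(1, target ratio).
        if rng.random() < math.exp(min(0.0, log_ratio)):
            theta = proposal
        draws.append(theta)
    return draws


def r_hat(chains):
    """Basic (non-split) potential scale reduction factor."""
    n = len(chains[0])
    chain_means = [statistics.fmean(c) for c in chains]
    within = statistics.fmean([statistics.variance(c) for c in chains])
    between = n * statistics.variance(chain_means)
    var_plus = (n - 1) / n * within + between / n
    return math.sqrt(var_plus / within)


rng = random.Random(2019)
warmup, n_keep = 1000, 2000
starts = [-10.0, -3.0, 3.0, 10.0]  # overdispersed starting points

# Run each chain and discard the warm-up draws.
chains = [metropolis(warmup + n_keep, s, 2.0, rng)[warmup:] for s in starts]
pooled = [x for chain in chains for x in chain]

print("estimated mean:", statistics.fmean(pooled))
print("R-hat:", r_hat(chains))
```

An R-hat close to 1 suggests that the chains, despite starting far apart, have mixed into the same distribution. Because successive Metropolis draws are correlated, the effective sample size is smaller than the number of draws kept.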
We strongly recommend using R in the course, as there are more packages for Stan and statistical analysis in R. If you are already fluent in Python but not in R, using Python may be easier, but learning R as well can still be more useful. Unless you are already experienced and have figured out your preferred way of working with R, we recommend installing RStudio Desktop. TAs will give a brief introduction to using RStudio during the first week's TA sessions.
Good self-study exercises for this course are listed below. Most of these also have model solutions available.
- 1.1-1.4, 1.6-1.8 (model solutions for 1.1-1.6)
- 2.1-2.5, 2.8, 2.9, 2.14, 2.17, 2.22 (model solutions for 2.1-2.5, 2.7-2.13, 2.16, 2.17, 2.20; the solution for 2.14 is in the slides)
- 3.2, 3.3, 3.9 (model solutions for 3.1-3.3, 3.5, 3.9, 3.10)
- 4.2, 4.4, 4.6 (model solutions for 3.2-3.4, 3.6, 3.7, 3.9, 3.10)
- 5.1, 5.2 (model solutions for 5.3-5.5, 5.7-5.12)
- 6.1 (model solutions for 6.1, 6.5-6.7)
- 9.1
- 10.1, 10.2 (model solution for 10.4)
- 11.1 (model solution for 11.1)
- Stan home page
- Introductory article in Journal of Statistical Software
- Documentation
- RStan installation
- PyStan installation
- Basics of Bayesian inference and Stan, Jonah Gabry & Lauren Kennedy Part 1 and Part 2
- Dicing with the unknown
- Logic, Probability, and Bayesian Inference by Michael Betancourt
- Origin of word Bayesian
The word "bayesilainen" (Finnish for "Bayesian") has a few different spellings in use in Finland. The form "bayesilainen" follows the general rules for inflecting foreign-language names:
"If a name has back vowels as written but front vowels as pronounced, a back vowel is usually written in the suffix instead of a front vowel, e.g. Birminghamissa, Thamesilla." Terho Itkonen, Kieliopas, 6th edition, Kirjayhtymä, 1997.
We now have an FAQ for the exercises here. It has solutions to commonly asked questions related to RStudio setup, errors during package installation, etc.