Hope this list is helpful. If I forgot any topics please let me know!
- Stan website
- The Stan Forums (get free help from Stan developers and users)
- Stan documentation (links to various kinds of documentation for Stan). Some of the most useful doc pages are:
- Stan developer Ben Goodrich's lecture videos and materials from his master's-level course at Columbia, Bayesian Statistics for the Social Sciences (YouTube videos, course materials)
- Contributed talks and materials from past Stan conferences, including videos, slides, and code (stancon_talks repository)
- rstan, the R interface to Stan
- rstanarm provides a traditional R formula interface for fitting common applied regression models with Stan, without having to write the Stan code yourself
- bayesplot provides plotting functions for use after fitting a model
- shinystan provides interactive tables and visualizations in a GUI
- loo provides tools for model comparison and averaging
- brms is similar to rstanarm with several advantages (more models are implemented, Stan code is simpler to read) and several disadvantages (models not pre-compiled, Stan code is less robust to numerical problems)
- rstantools provides tools for developing R packages interfacing with Stan
- projpred is for projection predictive variable selection, which is described in this paper: http://arxiv.org/abs/1508.02502
You can also find many R packages developed by Stan users that fit Stan models for you. Check out the list of packages that depend on the rstan package at cran.r-project.org/package=rstan (scroll down to the Reverse dependencies section).
- Visualization in Bayesian Workflow (paper, code)
- Jim Savage's A quick-start introduction to Stan for economists is a good guide to Bayesian data analysis workflow regardless of whether or not you care about economics
- Jim Savage's blog post Building useful models for industry—some tips
Chi Feng's interactive MCMC demos that we used in class:
- The Markov-chain Monte Carlo Interactive Gallery (website)
I highly recommend my Stan colleague Michael Betancourt's intro to HMC paper. Michael has a lot of very technical papers about HMC but this one is primarily focused on providing intuition (e.g., he has a whole section on the connection between HMC and the physics of planetary motion that I mentioned briefly in class):
- A Conceptual Introduction to Hamiltonian Monte Carlo (paper)
This next paper is aimed at ecologists, but the HMC explanation is well written and is worth reading regardless of your field of work/study:
- Faster Estimation of Bayesian Models in Ecology using Hamiltonian Monte Carlo (paper)
This case study from Stan developer Bob Carpenter uses simple simulations to demonstrate how things get strange (and challenging) very quickly as the number of dimensions grows due to the tension between probability density at the mode and volume in the tails:
- Typical Sets and the Curse of Dimensionality (case study)
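If you want to see the effect for yourself before reading the case study, here is a minimal sketch in plain Python (not code from the case study; the function name `distance_summary` and the choice of a standard normal target are my own assumptions for illustration). It draws from a standard normal in 1, 10, and 100 dimensions and reports how far the draws land from the mode at the origin. The mode has the highest density, but as the dimension grows essentially no draws fall near it, and the typical distance grows roughly like the square root of the dimension.

```python
import math
import random


def distance_summary(dim, n_draws=2000, seed=0):
    """Return (mean, min) Euclidean distance from the mode (the origin)
    for n_draws independent draws from a dim-dimensional standard normal."""
    rng = random.Random(seed)
    distances = [
        math.sqrt(sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(dim)))
        for _ in range(n_draws)
    ]
    return sum(distances) / n_draws, min(distances)


for dim in (1, 10, 100):
    mean_dist, min_dist = distance_summary(dim)
    # Compare the simulated mean distance with sqrt(dim).
    print(dim, round(mean_dist, 2), round(min_dist, 2), round(math.sqrt(dim), 2))
```

In one dimension plenty of draws sit right next to the mode; in 100 dimensions even the closest of the 2000 draws is far from it.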
- Visual MCMC diagnostics (tutorial vignette)
- Diagnosing biased inference with divergences (case study)
- A few simple reparameterizations (blog post)
- The impact of reparameterization on point estimates (case study)
- A bag of tips and tricks for dealing with scale issues (blog post/case study)
- The QR decomposition for regression models (case study)
- Prior Choice Recommendations wiki
- How the Shape of a Weakly Informative Prior Affects Inferences (case study)
- The prior can generally only be understood in the context of the likelihood (paper)
This is only a problem if your model lacks important structure. The generative modeling perspective provides a simple solution to this problem: build a model that allows for different amounts of variability in different subpopulations:
- Jeff Arnold's notes on heteroscedasticity, with RStan examples
- Informative priors on the relevant regression coefficients will help a lot
- The QR reparameterization helps avoid computational issues when you have highly correlated predictors.
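To give a feel for why the QR reparameterization helps, here is a minimal sketch in plain Python (an illustration only, not the code Stan or rstanarm uses; the helper names `center`, `dot`, `correlation`, and `thin_qr` and the simulated data are assumptions made for this example). It builds two nearly identical predictors, factors the centered predictor matrix as X = QR with modified Gram-Schmidt, and shows that the columns of Q are uncorrelated even though the original predictors are not. A regression on Q has coefficients theta = R * beta, so the coefficients on the original scale can be recovered afterwards by solving that triangular system.

```python
import math
import random


def center(col):
    """Subtract the mean so the column sums to zero."""
    mean = sum(col) / len(col)
    return [x - mean for x in col]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def correlation(a, b):
    """Pearson correlation of two equal-length lists."""
    a, b = center(a), center(b)
    return dot(a, b) / math.sqrt(dot(a, a) * dot(b, b))


def thin_qr(columns):
    """Modified Gram-Schmidt: return (Q columns, R) such that X = Q R."""
    k = len(columns)
    q_cols = []
    r = [[0.0] * k for _ in range(k)]
    for j, col in enumerate(columns):
        v = list(col)
        for i, q in enumerate(q_cols):
            r[i][j] = dot(q, v)  # component along an earlier Q column
            v = [vi - r[i][j] * qi for vi, qi in zip(v, q)]
        r[j][j] = math.sqrt(dot(v, v))
        q_cols.append([vi / r[j][j] for vi in v])
    return q_cols, r


rng = random.Random(1)
x1 = [rng.gauss(0.0, 1.0) for _ in range(200)]
x2 = [a + 0.05 * rng.gauss(0.0, 1.0) for a in x1]  # nearly a copy of x1

q_cols, r = thin_qr([center(x1), center(x2)])
print("correlation of x1 and x2:", correlation(x1, x2))
print("correlation of Q columns:", correlation(q_cols[0], q_cols[1]))
```

The sampler then works with the well-behaved Q columns instead of two predictors that are almost collinear.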
This is my paper (with many great coauthors!) that most of the course slides were based on:
We also have some vignettes for the bayesplot package that demonstrate many of the important graphical model checks:
- bayesplot tutorial vignettes (online vignettes)
- Chapter 10 in the Stan Manual v2.17.0
- Mitzi Morris' case study Spatial Models in Stan: Intrinsic Auto-Regressive (ICAR) Models for Areal Data
- Stan tutorial: Modern Bayesian Tools for Time Series Analysis contributed by Stan users Thomas P. Harte and R. Michael Weylandt.
- Jim Savage's blog post on Regime-switching models with Stan
- Jim Savage's blog post on Hierarchical vector autoregression (VAR) with Stan
- Lu Zhang's case study Nearest neighbor Gaussian process (NNGP) models in Stan
- Chapters 11 through 14 in the Stan Manual v2.17.0 (https://github.com/stan-dev/stan/releases/download/v2.17.0/stan-reference-2.17.0.pdf)
- Ben Goodrich's course materials for week 14
Some Stan users have written Python and R libraries to help fit certain survival models using Stan:
- Estimating Joint Models for Longitudinal and Time-to-Event Data with rstanarm
- Library of Stan Models for Survival Analysis from Jacki Novik and HammerLab
- survHE R package for fitting survival models via RStan from Gianluca Baio
- Chapters 11 through 15 in the Stan Manual v2.17.0 all have content that relates in some way to survival models even if not explicitly mentioned.
- Paper and Stan code for survival analysis with shrinkage priors from Aki Vehtari (link). (Note: this is a few years old so the Stan code may use some deprecated syntax)
The loo package has several useful vignettes that Aki Vehtari and I recently updated for version 2.0.0:
- Using the loo package (version >= 2.0.0)
- Bayesian Stacking and Pseudo-BMA weights using the loo package
- Writing Stan programs for use with the loo package
- Leave-one-out cross-validation for non-factorizable models
Aki Vehtari also has a bunch of tutorials online as well as some blog posts on the topic:
Papers from various authors (published in journals but I'm including links to the free arXiv preprint versions):
- Practical Bayesian model evaluation using leave-one-out cross-validation and WAIC (arXiv, R package)
- Understanding predictive information criteria for Bayesian models (arXiv)
- Projection predictive variable selection using Stan+R (arXiv, R package)
- Using stacking to average Bayesian predictive distributions (arXiv)
- Comparison of Bayesian predictive methods for model selection (arXiv)
- Michael Betancourt's Identifying Bayesian Mixture Models case study
- Chapter 13 in the Stan Manual v2.17.0
- Chapter 18 in the Stan Manual v2.17.0
- Michael Betancourt's case study Robust Gaussian Processes in Stan
- Lu Zhang's case study Nearest neighbor Gaussian process (NNGP) models in Stan
- Rob Trangucci's repository of multi-output GP code and slides
- Aki's talk at StanCon 2018 Asilomar about regularized horseshoe priors (video)
- Michael Betancourt's case study Bayes Sparse Regression
- Juho and Aki's paper Sparsity information and regularization in the horseshoe and other shrinkage priors
Conditional logit has different meanings in different fields. What we call conditional logit is implemented in the rstanarm package:
- rstanarm::stan_clogit() (function doc, vignette section)
Multinomial logit is a common discrete choice model (which may sometimes also be referred to as conditional logit in a small number of fields):
- Starting at section 9.3 of the Stan Manual v2.17.0, the next few sections discuss related topics
- Rob Trangucci's case study Hierarchical multinomial logistic regression models in Stan
Some blog posts on the topic from various authors:
Several Stan developers wrote a paper about the custom implementation of autodiff developed for Stan:
- The Stan Math Library: Reverse-Mode Automatic Differentiation in C++. arXiv 1509.07164
Here's a wiki page where we list a lot of things we want to add to Stan going forward. Many of these things are already in progress, but this should help give a sense of some of the current limitations: