RStan Getting Started (Português)
(Translation: Luiz Max Carvalho & Marco Inácio)
RStan is the R interface to Stan. For more information about Stan and its modeling language, visit the website at http://mc-stan.org/
Almost all of the installation instructions below are for the RStan version cited above, which requires R version 3.4.0 or later. If necessary, you can download the latest version of R here.
In addition, we strongly recommend that you use RStudio version 1.2.x or later, because it has great support for Stan.
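The R version requirement can be checked from the console. This is a minimal sketch; the variable names min_r and r_ok are illustrative only.

```r
# Minimum R version stated in the text above
min_r <- "3.4.0"

# getRversion() returns the running R version as a comparable object
r_ok <- getRversion() >= min_r

if (r_ok) {
  message("R ", getRversion(), " meets the minimum requirement (", min_r, ").")
} else {
  message("R ", getRversion(), " is older than ", min_r, "; please update R first.")
}
```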
As a precaution, it is often necessary to remove any existing RStan installation from your system by running
remove.packages("rstan")
if (file.exists(".RData")) file.remove(".RData")
After that, restart R.
In most cases, you can simply type (exactly like this)
install.packages("rstan", repos = "https://cloud.r-project.org/", dependencies = TRUE)
However, if you use Linux, if getOption("pkgType") is set to "source", or if R asks whether you want to install the latest version of RStan from source, go to the corresponding page for your platform: Windows, Mac, or Linux.
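The first two conditions can be checked from the R console. The following is only a sketch with illustrative variable names; it cannot detect the third case (R asking about a source install), which only shows up interactively during installation.

```r
# Package type R will try to install by default ("source" means no binaries)
pkg_type <- getOption("pkgType")

# Operating system name as reported by R ("Linux", "Windows", "Darwin", ...)
os_name <- Sys.info()[["sysname"]]

# TRUE if either condition from the text applies
needs_platform_page <- identical(pkg_type, "source") || identical(os_name, "Linux")

if (needs_platform_page) {
  message("Follow the platform-specific installation page.")
} else {
  message("The simple install.packages() call should usually be enough.")
}
```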
Checking the C++ Toolchain
In RStudio (preferably) or in R, execute once
pkgbuild::has_build_tools(debug = TRUE)
to check your C++ toolchain using the pkgbuild package, which is installed along with RStan.
If this line of code returns TRUE, then your toolchain is properly installed and you can skip to the next section.
Otherwise,
- If you use Windows and RStudio (recommended), a pop-up window will appear asking whether you want to install Rtools. Click Yes and wait for the installation to finish.
- If you use Windows but not RStudio, a message will appear in the R console telling you to install Rtools. More information about downloading and installing Rtools may be useful, but you normally do NOT need to go to the section titled "Installing RStan from source".
- If you use a Mac, a link will appear, but do not click on it. Instead, go here.
- If you use Linux (including the Windows Subsystem for Linux), then go here.
If you follow the instructions above but are unsuccessful, you can get help on the Stan Discourse forum, but please be sure to tell us your operating system, whether you use RStudio, and the output you get when you try the instructions above.
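The details requested above can be collected in one place before posting. This is a hedged sketch: the list name env_report is made up for illustration, and detecting RStudio through the RSTUDIO environment variable is an assumption about how RStudio marks its sessions.

```r
# Collect the facts a forum post should mention
env_report <- list(
  os = Sys.info()[["sysname"]],                        # operating system
  r_version = R.version.string,                        # full R version string
  in_rstudio = identical(Sys.getenv("RSTUDIO"), "1"),  # assumption: RStudio sets RSTUDIO=1
  has_build_tools = if (requireNamespace("pkgbuild", quietly = TRUE)) {
    pkgbuild::has_build_tools(debug = FALSE)           # same check as above, without debug output
  } else {
    NA                                                 # pkgbuild is not installed
  }
)

# Print a compact summary that can be pasted into a post
str(env_report)
```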
This step is optional, but it can result in compiled Stan programs that execute much faster than they otherwise would. Simply paste the following into R once
dotR <- file.path(Sys.getenv("HOME"), ".R")
if (!file.exists(dotR)) dir.create(dotR)
M <- file.path(dotR, ifelse(.Platform$OS.type == "windows", "Makevars.win", "Makevars"))
if (!file.exists(M)) file.create(M)
cat("\nCXX14FLAGS=-O3 -march=native -mtune=native",
if( grepl("^darwin", R.version$os)) "CXX14FLAGS += -arch x86_64 -ftemplate-depth-256" else
if (.Platform$OS.type == "windows") "CXX11FLAGS=-O3 -march=native -mtune=native" else
"CXX14FLAGS += -fPIC",
file = M, sep = "\n", append = TRUE)
However, be aware that changing the optimization level to O3 may cause problems for packages other than RStan and that, in rare cases, specifying -march=native -mtune=native may cause Stan programs not to work. If you ever need to change your C++ toolchain configuration, you can execute
M <- file.path(Sys.getenv("HOME"), ".R", ifelse(.Platform$OS.type == "windows", "Makevars.win", "Makevars"))
file.edit(M)
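To inspect the current settings without opening an editor, the file can be printed to the console instead. This is a small sketch that reuses the path construction from above and only reads the file.

```r
# Same path as above: ~/.R/Makevars (or Makevars.win on Windows)
M <- file.path(Sys.getenv("HOME"), ".R",
               ifelse(.Platform$OS.type == "windows", "Makevars.win", "Makevars"))

if (file.exists(M)) {
  # Print the file contents line by line
  writeLines(readLines(M))
} else {
  message("No Makevars file found at ", M)
}
```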
The rest of this document assumes that you have already installed RStan by following the instructions above.
The package name is rstan (all lowercase), so we start by executing
library("rstan") # observe startup messages
As the startup message says, if you are using rstan locally on a multicore machine and have plenty of RAM to estimate your model in parallel, at this point execute
options(mc.cores = parallel::detectCores())
In addition, you should follow the second startup message that says to execute
rstan_options(auto_write = TRUE)
which allows you to automatically save a bare version of a compiled Stan program to the hard disk so that it does not need to be recompiled (unless you change it).
Finally, if you use Windows, there will be a third startup message saying to execute
Sys.setenv(LOCAL_CPPFLAGS = '-march=native')
which is not necessary if you followed the C++ toolchain configuration advice in the previous section.
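If you would rather leave one core free for other work, or want to guard against detectCores() returning NA, a more conservative variant is sketched below. Using "all cores minus one" is an assumption made for illustration, not advice from the RStan startup message.

```r
# Number of cores R can see; may be NA on some systems
n_cores <- parallel::detectCores()
if (is.na(n_cores)) n_cores <- 1L

# Keep one core free, but never go below one
options(mc.cores = max(1L, n_cores - 1L))
getOption("mc.cores")
```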
This is an example in Section 5.5 of Gelman et al (2003), which studied coaching effects from eight schools. For simplicity, we call this example "eight schools."
We start by writing a Stan program for the model in a text file. If you are using RStudio version 1.2.x or greater, click on File -> New File -> Stan File. Otherwise, open your favorite text editor. Either way, paste in the following and save your work to a file called 8schools.stan in R's working directory (which can be seen by executing getwd()).
// saved as 8schools.stan
data {
int<lower=0> J; // number of schools
real y[J]; // estimated treatment effects
real<lower=0> sigma[J]; // standard error of effect estimates
}
parameters {
real mu; // population treatment effect
real<lower=0> tau; // standard deviation in treatment effects
vector[J] eta; // unscaled deviation from mu by school
}
transformed parameters {
vector[J] theta = mu + tau * eta; // school treatment effects
}
model {
target += normal_lpdf(eta | 0, 1); // prior log-density
target += normal_lpdf(y | theta, sigma); // log-likelihood
}
Be sure that your Stan program ends in a blank line without any characters, including spaces and comments.
In this Stan program, we let theta be a transformation of mu and eta instead of directly declaring theta as a parameter, which allows the sampler to run more efficiently (see detailed explanation). We can prepare the data (which typically is a named list) in R with:
schools_dat <- list(J = 8,
y = c(28, 8, -3, 7, -1, 1, 18, 12),
sigma = c(15, 10, 16, 11, 9, 11, 10, 18))
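As an aside on the transformed parameters block in the Stan program above: if eta is standard normal, then mu + tau * eta is normal with mean mu and standard deviation tau, so theta keeps the intended distribution. The simulation below is only an illustrative sketch in plain R with arbitrarily chosen values for mu and tau; it does not call Stan, and the names mu_demo and tau_demo are made up for this example.

```r
set.seed(1)

# Arbitrary illustrative values, not estimates from the eight schools data
mu_demo <- 5
tau_demo <- 2

# Draw standard normal deviations and apply the same transformation as the Stan program
eta_demo <- rnorm(1e5)
theta_demo <- mu_demo + tau_demo * eta_demo

# The sample mean and standard deviation should be close to mu_demo and tau_demo
c(mean = mean(theta_demo), sd = sd(theta_demo))
```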
And we can get a fit with the following R command. Note that the argument to file = should point to where the file is on your file system, unless you have put it in R's working directory, in which case the command below will work as written.
fit <- stan(file = '8schools.stan', data = schools_dat)
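If the call above fails, two quick checks can rule out the most common causes: a file that is not where R is looking, and data vectors whose lengths do not match J. This is a sketch with illustrative variable names; it repeats the data definition so that it can run on its own.

```r
stan_file <- "8schools.stan"  # file name used in the text above

# Same data as above, repeated so this block is self-contained
schools_dat <- list(J = 8,
                    y = c(28, 8, -3, 7, -1, 1, 18, 12),
                    sigma = c(15, 10, 16, 11, 9, 11, 10, 18))

# Is the Stan file in the current working directory?
file_found <- file.exists(stan_file)
if (!file_found) message("Cannot find ", stan_file, " in ", getwd())

# Do y and sigma both have J elements, as the data block requires?
lengths_ok <- length(schools_dat$y) == schools_dat$J &&
  length(schools_dat$sigma) == schools_dat$J
if (!lengths_ok) message("y and sigma must both have length J")
```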
The object fit, returned from function stan, is an S4 object of class stanfit. Methods such as print, plot, and pairs are associated with the fitted result, so we can use the following code to check out the results in fit. print provides a summary for the parameters of the model as well as the log-posterior with name lp__ (see the print(fit) call below). For more methods and details of class stanfit, see the help of class stanfit.
In particular, we can use the extract function on stanfit objects to obtain the samples. extract extracts samples from the stanfit object as a list of arrays for parameters of interest, or just an array. In addition, S3 functions as.array, as.matrix, and as.data.frame are defined for stanfit objects (use help("as.array.stanfit") to check out the help document in R).
print(fit)
plot(fit)
pairs(fit, pars = c("mu", "tau", "lp__"))
la <- extract(fit, permuted = TRUE) # return a list of arrays
mu <- la$mu
### return an array of three dimensions: iterations, chains, parameters
a <- extract(fit, permuted = FALSE)
### use S3 functions on stanfit objects
a2 <- as.array(fit)
m <- as.matrix(fit)
d <- as.data.frame(fit)
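To get a feel for the three-dimensional array (iterations, chains, parameters) without running Stan, the sketch below builds a stand-in array of random numbers with that layout and summarizes it. The array a_fake, its size, and its parameter names are made up for illustration; with a real fit you would use the object a from above instead.

```r
set.seed(123)

# Stand-in for extract(fit, permuted = FALSE): 1000 iterations, 4 chains, 3 parameters
a_fake <- array(rnorm(1000 * 4 * 3), dim = c(1000, 4, 3),
                dimnames = list(iterations = NULL,
                                chains = paste0("chain:", 1:4),
                                parameters = c("mu", "tau", "lp__")))

# Mean of each parameter, pooling all iterations and chains
post_means <- apply(a_fake, 3, mean)

# Mean of each parameter within each chain (a chains x parameters matrix)
chain_means <- apply(a_fake, c(2, 3), mean)

post_means
chain_means
```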
The Rats example is also popular. We can find the OpenBUGS version here; it is originally from Gelfand et al (1990). The data are weekly weight measurements of 30 rats over five weeks. In the following table, we list the data, using x to denote the days on which the data were collected. We can try this example using the linked data rats.txt and model code rats.stan.
Rat | x=8 | x=15 | x=22 | x=29 | x=36 | Rat | x=8 | x=15 | x=22 | x=29 | x=36
---|---|---|---|---|---|---|---|---|---|---|---
1 | 151 | 199 | 246 | 283 | 320 | 16 | 160 | 207 | 248 | 288 | 324
2 | 145 | 199 | 249 | 293 | 354 | 17 | 142 | 187 | 234 | 280 | 316
3 | 147 | 214 | 263 | 312 | 328 | 18 | 156 | 203 | 243 | 283 | 317
4 | 155 | 200 | 237 | 272 | 297 | 19 | 157 | 212 | 259 | 307 | 336
5 | 135 | 188 | 230 | 280 | 323 | 20 | 152 | 203 | 246 | 286 | 321
6 | 159 | 210 | 252 | 298 | 331 | 21 | 154 | 205 | 253 | 298 | 334
7 | 141 | 189 | 231 | 275 | 305 | 22 | 139 | 190 | 225 | 267 | 302
8 | 159 | 201 | 248 | 297 | 338 | 23 | 146 | 191 | 229 | 272 | 302
9 | 177 | 236 | 285 | 350 | 376 | 24 | 157 | 211 | 250 | 285 | 323
10 | 134 | 182 | 220 | 260 | 296 | 25 | 132 | 185 | 237 | 286 | 331
11 | 160 | 208 | 261 | 313 | 352 | 26 | 160 | 207 | 257 | 303 | 345
12 | 143 | 188 | 220 | 273 | 314 | 27 | 169 | 216 | 261 | 295 | 333
13 | 154 | 200 | 244 | 289 | 325 | 28 | 157 | 205 | 248 | 289 | 316
14 | 171 | 221 | 270 | 326 | 358 | 29 | 137 | 180 | 219 | 258 | 291
15 | 163 | 216 | 242 | 281 | 312 | 30 | 153 | 200 | 244 | 286 | 324
y <- as.matrix(read.table('https://raw.github.com/wiki/stan-dev/rstan/rats.txt', header = TRUE))
x <- c(8, 15, 22, 29, 36)
xbar <- mean(x)
N <- nrow(y)
T <- ncol(y)
rats_fit <- stan('https://raw.githubusercontent.com/stan-dev/example-models/master/bugs_examples/vol1/rats/rats.stan')
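The call above relies on stan finding y, x, xbar, N, and T in the R workspace, and it assigns to T, which is also R's shorthand for TRUE. An alternative is to pass the data as an explicit named list, as in the eight schools example. The sketch below assumes that rats.stan expects data named N, T, x, y, and xbar (matching the variables created above), and it uses only the first three rats from the table so that it runs without a network connection; the names y_small and rats_dat are illustrative.

```r
# First three rows of the table above (rats 1 to 3), one column per measurement day
y_small <- matrix(c(151, 199, 246, 283, 320,
                    145, 199, 249, 293, 354,
                    147, 214, 263, 312, 328),
                  nrow = 3, byrow = TRUE)
x <- c(8, 15, 22, 29, 36)

# Named list of data; T is set inside the list, so R's T (TRUE) is left untouched
rats_dat <- list(N = nrow(y_small),
                 T = ncol(y_small),
                 x = x,
                 xbar = mean(x),
                 y = y_small)
str(rats_dat)

# With rstan loaded and the full data in place, the fit would then be requested as:
# rats_fit <- stan(file = "rats.stan", data = rats_dat)
```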
You can run many of the BUGS examples and some others that we have created in Stan by executing
model <- stan_demo()
and choosing an example model from the list that pops up. The first time you call stan_demo(), it will ask you if you want to download these examples. You should choose option 1 to put them in the directory where rstan was installed so that they can be used in the future without redownloading them. The model object above is an instance of class stanfit, so you can call print, plot, pairs, extract, etc. on it afterward.
More details about RStan can be found in the documentation, including the vignette of package rstan. For example, use help(stan) and help("stanfit-class") to check out the help for function stan and S4 class stanfit.
And see Stan's modeling language manual for details about Stan's samplers, optimizers, and the Stan modeling language.
In addition, the Stan User's Mailing list can be used to discuss the use of Stan, post examples or ask questions about (R)Stan. When help is needed, it is important to provide enough information such as the following:
- properly formatted syntax in the Stan modeling language
- data
- necessary R code
- dump of error message using verbose=TRUE and cores=1 when calling the stan or sampling functions
- information about R by using function sessionInfo() in R
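The session information from the last item can be saved to a text file so that it is easy to attach to a question. This is a minimal sketch: the file is written to a temporary location chosen for illustration, and the stan call in the comment reuses the eight schools example from above.

```r
# Save the output of sessionInfo() to a text file
out_file <- tempfile(pattern = "session_info_", fileext = ".txt")
writeLines(utils::capture.output(print(utils::sessionInfo())), out_file)
message("Session information written to ", out_file)

# To capture a detailed error message, the model would be rerun on a single core:
# fit <- stan(file = "8schools.stan", data = schools_dat, verbose = TRUE, cores = 1)
```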
- Gelman, A., Carlin, J. B., Stern, H. S., and Rubin, D. B. (2003). Bayesian Data Analysis, CRC Press, London, 2nd Edition.
- Stan Development Team. Stan Modeling Language User's Guide and Reference Manual.
- Gelfand, A. E., Hills, S. E., Racine-Poon, A., and Smith, A. F. M. (1990). "Illustration of Bayesian Inference in Normal Data Models Using Gibbs Sampling", Journal of the American Statistical Association, 85, 972-985.
- Stan
- R
- BUGS
- OpenBUGS
- JAGS
- Rcpp