rmarkdown
working in order to ‘knit’ the HTML output.ctrl/cmd
+ shift
+ K
in RStudio) to produce a HTML file.rmarkdown
(comes with RStudio)tidyverse
(or ggplot2
, dplyr
, purrr
, tibble
)TeachingDemos
ggm
maps
rstan
ggmcmc
___As always, load the required packages and change the seed to your last name.
library(TeachingDemos)
library(ggm)
library(maps)
library(tidyverse)
library(rstan)
library(ggmcmc)
# set cores to use to the total number of cores (minimally 4)
options(mc.cores = max(parallel::detectCores(), 4))
# save a compiled version of the Stan model file
rstan_options(auto_write = TRUE)
lastname <- "YOURLASTNAME"
char2seed(lastname)
Previously during this course, we have looked at a particular instantiation of a \(t\)-test as tools to decide whether to retain or abandon a null hypothesis that the means of two sets of samples are equal. Let’s now use Bayesian parameter inference to address the same kind of question.
Here is some data:
set.seed(2018)
N1 <- 20
N2 <- 20
dataList <- list(y1 = rnorm(N1, mean = 100, sd = 20),
y2 = rnorm(N2, mean = 107, sd = 20),
N1 = N1,
N2 = N2)
Here is a (clumsy!) Stan model for this data. Notice that we estimate the means of both vectors/groups independently and use the generated quantities
block to obtain samples to approximate our posterior beliefs about the difference between these means.
data {
int N1;
int N2;
real y1[N1] ;
real y2[N2] ;
}
parameters {
real mu1 ;
real mu2 ;
real sd1 ;
real sd2 ;
}
model {
mu1 ~ normal(100, 100); // diffuse but roughly accurate priors
mu2 ~ normal(100, 100);
sd1 ~ gamma(1, 100); // sds are assumed to be low a priori
sd2 ~ gamma(1, 100);
y1 ~ normal(mu1,sd1);
y2 ~ normal(mu2,sd2);
}
generated quantities {
real delta;
delta = mu1 - mu2;
}
model_string <- "
paste the model code here
"
fit <- stan(model_code = model_string,
iter = 10000,
data = dataList)
ANSWER:
(your answer here)
# your code here
(your answer here)
delta
. (Hint: we first extract the samples from the stan_fit
object into a tibble, and then use a function from the packages HDInterval
, which you may have to install.)samples <- ggmcmc::ggs(stan_fit)
HDInterval::hdi( ... something with samples and delta ... )
(your answer here)
with(dataList, t.test(y1,y2, paired = FALSE, var.equal = FALSE))
##
## Welch Two Sample t-test
##
## data: y1 and y2
## t = -1.5525, df = 37.525, p-value = 0.1289
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -22.492760 2.971995
## sample estimates:
## mean of x mean of y
## 97.17022 106.93060
How would you interpret this result? Would you reject the null hypothesis (that the means are equal)?
(your answer here)
End of assignment