This homework assignment is due July 28th 2017 before noon. Submit your results in a zipped archive with name BDA+CM_HW5_YOURLASTNAME.zip
which includes both Rmarkdown and a compiled version (preferably HTML). Use the same naming scheme for the *.Rmd
and *.html
files. All other files, if needed, should start with BDA+CM_HW5_YOURLASTNAME_
as well. Upload the archive to the Dropbox folder.
Keep your descriptions and answers as short and concise as possible, without reverting to bullet points. All of the exercises below are required and count equally to the final score if you are taking this course for 6 credit points. If you want 3 credit points for this course, you only need to do the first exercise.
We will look once more at the IQ data from class (plucked from Kruschke’s textbook).
iq_data = readr::read_csv('../data/07_Kruschke_TwoGroupIQ.csv')
head(iq_data)
## # A tibble: 6 x 2
## Score Group
## <int> <chr>
## 1 102 Smart Drug
## 2 107 Smart Drug
## 3 92 Smart Drug
## 4 101 Smart Drug
## 5 110 Smart Drug
## 6 68 Smart Drug
Here’s a plot of the data to compare the distribution of measured IQ scores in both groups:
iq_data %>% group_by(Group) %>% mutate(mean_score = mean(Score)) %>%
ggplot(aes(x = Score)) + geom_density() +
geom_point(aes(x = Score,
y = 0 - runif(nrow(iq_data), 0, 0.005)),
color = "skyblue", size = 0.5) +
facet_grid(Group ~ .) +
geom_vline(aes(xintercept = mean_score), color = "firebrick")
The graphs show the estimated densities (black lines), a jittered scatter plot of each data point (blue dots) and the means (red vertical lines) of both control and treatment group. We see that there are quite some outliers, which a Gaussian noise model might not be ideally suited to deal with. For this reason, we will look at two linear regression models, both of which predict metric variable Score
in terms of categorical variable Group
. One model has a Gaussian noise model, the other uses a \(t\)-distribution instead.
The Gaussian noise model is this:
\[ \begin{align*} y_{\text{pred}} & = \beta_0 + \beta_1 x & \ \ \ \ \ \ \ \ \ \ \ \ \ \ y & \sim \mathcal{N}(\mu = y_{\text{pred}}, \sigma_{err}) \\ \beta_0 & \sim \mathcal{N}(100, 15) & \ \ \ \ \ \ \ \ \ \ \ \ \ \ \frac{1}{\sigma_{err}^2} & \sim \text{Gamma}(0.1,0.1) \\ \beta_1 & \sim \mathcal{N}(0, 30) \end{align*} \]
Here’s the JAGS code for the Gaussian model, which we used in class:
# specify model
modelString = "
model{
sigma_e = 1/sqrt(tau_e)
tau_e ~ dgamma(0.1,0.1)
b0 ~ dnorm(100, 1/15^2)
b1 ~ dnorm(0, 1/30^2)
for (i in 1:k){
yPred[i] = b0 + b1 * x[i]
y[i] ~ dnorm(yPred[i], tau_e)
}
}
"
# prepare data for JAGS
dataList = list(y = iq_data$Score,
x = ifelse(iq_data$Group == "Placebo", 0, 1),
k = nrow(iq_data))
# set up and run model
params <- c('b0', "b1", "sigma_e")
jagsModel = jags.model(file = textConnection(modelString),
data = dataList,
n.chains = 3, quiet = TRUE)
update(jagsModel, 5000) # 5000 burn-in samples
codaSamples = coda.samples(jagsModel,
variable.names = params,
n.iter = 20000)
ggmcmc::ggs(codaSamples) %>%
group_by(Parameter) %>%
summarize(HDI_lo = coda::HPDinterval(as.mcmc(value))[1],
mean = mean(value),
HDI_hi = coda::HPDinterval(as.mcmc(value))[2])
## # A tibble: 3 x 4
## Parameter HDI_lo mean HDI_hi
## <fctr> <dbl> <dbl> <dbl>
## 1 b0 94.4351031 100.113340 105.78655
## 2 b1 -0.1720315 7.672494 15.60908
## 3 sigma_e 19.5371152 22.309443 25.25697
Based on these results from parameter inference, would you conclude that it is credible that there is a difference between control and treatment group? (Hint: is some relevant parameter’s 95% HDI credibly different from some relevant value?)
Implement the following robust regression model (Hints: you only need to change a few lines of code in the script above; the exponential distribution with shape parameter \(\frac{1}{30}\) in JAGS is dexp(1/30)
; it is not entirely improbable that Kruschke’s textbook contains all and exactly the information you need):
\[ \begin{align*} y_{\text{pred}} & = \beta_0 + \beta_1 x & \ \ \ \ \ \ \ \ \ \ \ \ \ \ y & \sim \mathcal{t}(\mu = y_{\text{pred}}, \sigma_{err}, \nu) \\ \beta_0 & \sim \mathcal{N}(100, 15) & \ \ \ \ \ \ \ \ \ \ \ \ \ \ \frac{1}{\sigma_{err}^2} & \sim \text{Gamma}(0.1,0.1) \\ \beta_1 & \sim \mathcal{N}(0, 30) & \nu - 1 & \sim \text{Exponential}(1/30) \end{align*} \]
The goal of this exercise is to demonstrate how the inclusion of mixed effects can lead to more conservative conclusions. We consider the self-paced reading data from the study on Chinese relative clauses again. But we will rig the data slightly:
rt_data = readr::read_delim('../data/08_gibsonwu2012data.txt', delim = " ") %>%
filter(region == "headnoun") %>%
mutate(so = ifelse(type == "subj-ext", "-1", "1")) %>%
select(subj, item, so, rt) %>%
mutate(rt_rigged = ifelse(subj <= 10 & so == 1, rt*0.75, rt * 0.85)) # rigging the data!
Make sure that you understand what the last line did. We made some subjects respond faster tout court, with a stronger increase in RTs for the object relative clause condition. (Just to be absolutely clear: such sata tweaking is a total no-go if serious inference from the data is desired! We do it just for illustration of the effect that the inclusion of mixed effects can have on some (artificial!) data set.)
brms
package. Regress rigged log-reading time against condition factor so
, just as we did in class for the orginal, non-rigged data. As we did in class, use the default (flat/improper) priors of the brms
package. Inspect the 95% HDI for parameter so
and check whether there is a credible effect of relative clause type. (Let us not worry about ROPEs for the sake of clarity; just look at whether the relevant value is in the 95% HDI.)loo::loo(model_1, model_2)
from the loo
package. This function returns the loo-values as discussed in class multiplied by \(-2\) (thus making the loo score play in the same ballpark as standard information criteria). Interpret the outcome of this model comparison. Which model is better? (Be mindful about whether bigger numbers are better or smaller ones are.) Check the estimated standard error of the loo-IC scores, as returned by the loo
function, and validate whether the result of the model comparison is trustworthy given the loo
function’s estimation method. (There may be warning messages about slightly high diagnostic values, which you would normally take very serious (read the paper that goes with the loo
package if you can!), but for this exercise you may pass them over.)