rmarkdown
working in order to ‘knit’ the HTML output.ctrl/cmd
+ shift
+ K
in RStudio) to produce a HTML file.rmarkdown
(comes with RStudio)tidyverse
(or ggplot2
, dplyr
, purrr
, tibble
)TeachingDemos
rstan
ggmcmc
brms
loo
plotrix
As always, load the required packages and change the seed to your last name. Remember to set eval = TRUE
.
library(TeachingDemos)
library(tidyverse)
library(rstan)
library(ggmcmc)
library(brms)
library(loo)
library(plotrix)
# set cores to use to the total number of cores (minimally 4)
options(mc.cores = max(parallel::detectCores(), 4))
# save a compiled version of the Stan model file
rstan_options(auto_write = TRUE)
lastname <- "YOURLASTNAME"
char2seed(lastname)
library(TeachingDemos)
library(tidyverse)
library(rstan)
library(ggmcmc)
library(brms)
library(loo)
library(plotrix)
# set cores to use to the total number of cores (minimally 4)
options(mc.cores = max(parallel::detectCores(), 4))
# save a compiled version of the Stan model file
rstan_options(auto_write = TRUE)
lastname <- "bayes"
char2seed(lastname)
For this homework, you will analyze data from a speech production experiment in which male and female speakers produced utterances in a number of scenarios. Scenarios differed in attitude: either informal or polite. For example, a scenario could be something like a request for salt. An informal request could be “Can I get the salt?” and a formal request would be “Would you mind passing the salt, please?”.
The data gives the mean voice pitch frequency of the utterances. This research is interested in finding whether there are differences in pitch frequency between informal and polite scenarios for either female or male speaker. Concretely, we are interested in three hypotheses:
Remember to set eval = TRUE
.
#read the data file
politeness_data <- ..FILL ME..%>%
# remove an entry with a missing data point for frequency
filter(..FILL ME..) %>%
# specify gender, subject, scenario and attitude as factors
mutate(..FILL ME ..)
#show the tibble
politeness_data
politeness_data <- read_csv('data/politeness_data.csv') %>%
# remove an entry with a missing data point
filter(!is.na(frequency)) %>%
# specify factors
mutate(scenario = as.factor(scenario),
attitude = as.factor(attitude),
subject = as.factor(subject),
gender = as.factor(gender))
## Parsed with column specification:
## cols(
## subject = col_character(),
## gender = col_character(),
## scenario = col_double(),
## attitude = col_character(),
## frequency = col_double()
## )
politeness_data
## # A tibble: 83 x 5
## subject gender scenario attitude frequency
## <fct> <fct> <fct> <fct> <dbl>
## 1 F1 F 1 pol 213.
## 2 F1 F 1 inf 204.
## 3 F1 F 2 pol 285.
## 4 F1 F 2 inf 260.
## 5 F1 F 3 pol 204.
## 6 F1 F 3 inf 287.
## 7 F1 F 4 pol 251.
## 8 F1 F 4 inf 277.
## 9 F1 F 5 pol 232.
## 10 F1 F 5 inf 252.
## # … with 73 more rows
Create a summary of mean frequency with standard error by gender and attitude. Remember to set eval = TRUE
.
politeness_summary <- politeness_data %>%
# group by gender and attitude
group_by(..FILL ME..) %>%
# create a summary of mean freq and standard error
summarize(mean_frequency = ..FILL ME..,
standard_error = plotrix::std.error(frequency))
politeness_summary <- politeness_data %>%
group_by(gender, attitude) %>%
summarize(mean_frequency = mean(frequency),
standard_error = plotrix::std.error(frequency))
Make a bar plot with gender on the x-axis, mean frequency on the y-axis and vary fill colour by attitude. Include error bars for standard error.
Remember to set eval = TRUE
.
politeness_summary %>%
ggplot(aes(x = ..FILL ME..,
y = ..FILL ME..,
fill = ..FILL ME..)) +
# bars for mean frequency
geom_bar(stat = "identity", position = "dodge") +
# error bars for standard error
geom_errorbar(aes(..FILL ME..), position = "dodge" )
politeness_summary %>%
ggplot(aes(x = gender,
y = mean_frequency,
fill = attitude)) +
geom_bar(stat = "identity", position = "dodge") +
geom_errorbar(aes(ymin = mean_frequency - standard_error,
ymax = mean_frequency + standard_error),position = "dodge")
Specify the formula
argument for three regression models.
The first one, model_FE
, only has fixed effects. It tries to explain frequency
in terms of gender
, attitude
and their interaction.
The second one, model_intercept_only
, is like model_FE
but also adds random intercepts for both scenario
and subject
.
Finally, model_max_RE
is like model_FE
but also specifies the following random effects structure: by-scenario random intercepts, and random slopes for gender
, attitude
and their interaction, as well as by-subject random intercepts and random slopes for attitude
.
Remember to set eval = TRUE
.
model_FE <- brm(formula = ..FILL ME..,
data = politeness_data)
model_intercept_only <- brm(formula = ..FILL ME..,
data = politeness_data)
model_max_RE <- brm(formula = ..FILL ME..,
data = politeness_data)
model_FE <- brm(formula = frequency ~ gender * attitude,
data = politeness_data)
## Compiling the C++ model
## Start sampling
model_intercept_only <- brm(formula = frequency ~ gender * attitude +
(1 | scenario + subject),
data = politeness_data)
## Compiling the C++ model
## Start sampling
model_max_RE <- brm(formula = frequency ~ gender * attitude +
(1 + gender * attitude | scenario) +
(1 + attitude | subject),
data = politeness_data)
## Compiling the C++ model
## Start sampling
## Warning: There were 10 divergent transitions after warmup. Increasing adapt_delta above 0.8 may help. See
## http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
## Warning: Examine the pairs() plot to diagnose sampling problems
The following function attempts to calculate the probability of each hypothesis being true, given the model.
Fill in the correct formula for the “male-informal” cell in the design matrix.
Specify the correct names and comparisons for hypothesis 2 and 3
Run the function for each model and interpret the output
Remember to set eval = TRUE
.
extract_comparisons <- function(model) {
# get posterior samples
post_samples <- posterior_samples(model)
# mnemonic names for reconstructed predictor values for all cells in the design matrix
# female-informal
F_inf <- post_samples$b_Intercept
# female-polite
F_pol <- post_samples$b_Intercept +
post_samples$b_attitudepol
# male-informal
M_inf <- ..FILL ME..
# male-polite
M_pol <- post_samples$b_Intercept +
post_samples$b_genderM +
post_samples$b_attitudepol +
post_samples$`b_genderM:attitudepol`
# create a tibble for recording hypotheses probabilities
tibble(
# names of hypotheses
hypothesis = c(
# h1:
"Female-polite < Female-informal",
# h2:
"Male-polite < Male-informal",
# h3:
"..FILL ME.."
),
# probability of hypotheses being true
probability = c(
# h1:
mean(F_pol < F_inf),
# h2:
mean(..FILL ME..),
# h3
mean(..FILL ME..)
)
)
}
# run the function on each model
# YOUR CODE HERE
extract_comparisons <- function(model) {
# get posterior samples
post_samples <- posterior_samples(model)
# mnemonic names for reconstructed predictor values for all cells in the design matrix
# female informal
F_inf <- post_samples$b_Intercept
# female polite
F_pol <- post_samples$b_Intercept +
post_samples$b_attitudepol
# male informal
M_inf <- post_samples$b_Intercept +
post_samples$b_genderM
# male polite
M_pol <- post_samples$b_Intercept +
post_samples$b_genderM +
post_samples$b_attitudepol +
post_samples$`b_genderM:attitudepol`
# create a tibble of for recording hypotheses
tibble(hypothesis = c("Female-polite < Female-informal",
"Male-polite < Male-informal",
"Male-informal < Female-polite"),
probability = c(mean(F_pol < F_inf),
mean(M_pol < M_inf),
mean(M_inf < F_pol)
))
}
# run the function on each model
extract_comparisons(model_FE)
## # A tibble: 3 x 2
## hypothesis probability
## <chr> <dbl>
## 1 Female-polite < Female-informal 0.992
## 2 Male-polite < Male-informal 0.846
## 3 Male-informal < Female-polite 1
extract_comparisons(model_intercept_only)
## # A tibble: 3 x 2
## hypothesis probability
## <chr> <dbl>
## 1 Female-polite < Female-informal 1
## 2 Male-polite < Male-informal 0.931
## 3 Male-informal < Female-polite 0.992
extract_comparisons(model_max_RE)
## # A tibble: 3 x 2
## hypothesis probability
## <chr> <dbl>
## 1 Female-polite < Female-informal 0.977
## 2 Male-polite < Male-informal 0.814
## 3 Male-informal < Female-polite 0.985
Given the output, what would you say about the probability of each hypothesis being true? Which, if any, are likely to be true under all models? Which, if any, depend on the model considered?
ANSWER:
SOLUTION:
Under all models the posterior probability that hypotheses 1 and 3 are true is very high (bigger than .95). There is less posterior credence that hypothesis 2 is true on all accounts.
Look at the estimates for the coefficients in the random effects structure of the model with the maximal RE structure (get them from summary()
).
# YOUR CODE HERE
summary(model_max_RE)
## Warning: There were 10 divergent transitions after warmup. Increasing adapt_delta above 0.8 may help.
## See http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
## Family: gaussian
## Links: mu = identity; sigma = identity
## Formula: frequency ~ gender * attitude + (1 + gender * attitude | scenario) + (1 + attitude | subject)
## Data: politeness_data (Number of observations: 83)
## Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
## total post-warmup samples = 4000
##
## Group-Level Effects:
## ~scenario (Number of levels: 7)
## Estimate Est.Error l-95% CI u-95% CI
## sd(Intercept) 21.98 10.67 7.21 48.65
## sd(genderM) 10.55 8.53 0.40 31.50
## sd(attitudepol) 14.99 11.35 0.60 42.62
## sd(genderM:attitudepol) 16.85 13.00 0.78 49.68
## cor(Intercept,genderM) -0.23 0.44 -0.90 0.71
## cor(Intercept,attitudepol) 0.02 0.42 -0.76 0.80
## cor(genderM,attitudepol) -0.06 0.44 -0.83 0.77
## cor(Intercept,genderM:attitudepol) -0.09 0.43 -0.83 0.72
## cor(genderM,genderM:attitudepol) -0.05 0.44 -0.82 0.78
## cor(attitudepol,genderM:attitudepol) -0.14 0.45 -0.88 0.73
## Eff.Sample Rhat
## sd(Intercept) 1633 1.00
## sd(genderM) 2156 1.00
## sd(attitudepol) 1434 1.00
## sd(genderM:attitudepol) 1882 1.00
## cor(Intercept,genderM) 3746 1.00
## cor(Intercept,attitudepol) 3927 1.00
## cor(genderM,attitudepol) 2497 1.00
## cor(Intercept,genderM:attitudepol) 4114 1.00
## cor(genderM,genderM:attitudepol) 3061 1.00
## cor(attitudepol,genderM:attitudepol) 3060 1.00
##
## ~subject (Number of levels: 6)
## Estimate Est.Error l-95% CI u-95% CI Eff.Sample
## sd(Intercept) 36.49 17.96 14.99 85.94 1476
## sd(attitudepol) 9.41 9.34 0.32 34.78 1716
## cor(Intercept,attitudepol) 0.03 0.57 -0.94 0.94 4221
## Rhat
## sd(Intercept) 1.00
## sd(attitudepol) 1.00
## cor(Intercept,attitudepol) 1.00
##
## Population-Level Effects:
## Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
## Intercept 259.64 25.24 207.02 308.67 1168 1.00
## genderM -113.74 33.87 -177.99 -40.64 1115 1.00
## attitudepol -27.30 12.92 -52.64 -1.34 2171 1.00
## genderM:attitudepol 15.84 17.58 -19.26 51.38 2153 1.00
##
## Family Specific Parameters:
## Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
## sigma 25.01 2.35 20.88 30.08 2713 1.00
##
## Samples were drawn using sampling(NUTS). For each parameter, Eff.Sample
## is a crude measure of effective sample size, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
Which of these random effects coefficients are very, very credibly estimated to be different from zero?
ANSWER:
SOLUTION:
None except the by-subject and by-scenario intercepts. The lower 95% CIs for the standard deviations are very close to zero.
Compare the three models with leave-one-out cross-validation.
Remember to set eval = TRUE
.
loo(..FILL ME..,
reloo = T)
loo(model_FE, model_intercept_only, model_max_RE,
reloo = T)
## No problematic observations found. Returning the original 'loo' object.
## No problematic observations found. Returning the original 'loo' object.
## 1 problematic observation(s) found.
## The model will be refit 1 times.
##
## Fitting model 1 out of 1 (leaving out observation 31)
## Start sampling
## LOOIC SE
## model_FE 834.91 11.75
## model_intercept_only 788.76 17.06
## model_max_RE 796.28 17.90
## model_FE - model_intercept_only 46.15 11.24
## model_FE - model_max_RE 38.63 12.24
## model_intercept_only - model_max_RE -7.52 3.77
Which model is best?
Keep in mind that:
ANSWER:
SOLUTION:
As all the differences are greater than their standard errors, they are all meaningful. The intercept only model is the best as it has the lowest LOO-IC.
End of assignment