Required R packages


As always, load the required packages and change the seed to your last name. Remember to set eval = TRUE.

library(TeachingDemos)
library(tidyverse)
library(rstan)
library(ggmcmc)
library(brms)
library(loo)
library(plotrix)

# use as many cores as are available (but at least 4)
options(mc.cores = max(parallel::detectCores(), 4))
# save a compiled version of the Stan model file
rstan_options(auto_write = TRUE)

lastname <- "YOURLASTNAME"

char2seed(lastname)

For this homework, you will analyze data from a speech production experiment in which male and female speakers produced utterances in a number of scenarios. Scenarios differed in attitude: either informal or polite. For example, a scenario could be a request for salt. An informal request could be “Can I get the salt?” and a polite request would be “Would you mind passing the salt, please?”.

The data give the mean voice pitch frequency of each utterance. The research question is whether pitch frequency differs between informal and polite scenarios for either female or male speakers. Concretely, we are interested in three hypotheses:

  1. Do female speakers have a lower average frequency in polite scenarios than in informal scenarios?
  2. Do male speakers have a lower average frequency in polite scenarios than in informal scenarios?
  3. Do male speakers have a lower average frequency in informal scenarios than female speakers in polite scenarios?

1. Read and plot the data

a. Load and clean the data (1 point)

Remember to set eval = TRUE.

# read the data file
politeness_data <- ..FILL ME.. %>% 
  
  # remove an entry with a missing data point for frequency
  filter(..FILL ME..) %>%  
  
  # specify gender, subject, scenario and attitude as factors
  mutate(..FILL ME..)

# show the tibble
politeness_data

b. Create a summary of the data (1 point)

Create a summary of mean frequency with standard error by gender and attitude. Remember to set eval = TRUE.

politeness_summary <- politeness_data %>% 
  
  # group by gender and attitude
  group_by(..FILL ME..) %>% 
  
  # create a summary of mean freq and standard error
  summarize(mean_frequency = ..FILL ME..,
            standard_error = plotrix::std.error(frequency))
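
For reference, the same group_by() / summarize() pattern applied to a made-up toy tibble (the columns group and value are hypothetical and not part of the politeness data):

toy_data <- tibble(group = rep(c("a", "b"), each = 3),
                   value = c(1, 2, 3, 4, 5, 6))

toy_data %>% 
  group_by(group) %>% 
  summarize(mean_value     = mean(value),
            standard_error = plotrix::std.error(value))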

c. Plot the summary (1 point)

Make a bar plot with gender on the x-axis, mean frequency on the y-axis and vary fill colour by attitude. Include error bars for standard error.

Remember to set eval = TRUE.

politeness_summary %>% 
  ggplot(aes(x = ..FILL ME..,
             y = ..FILL ME..,
             fill = ..FILL ME..)) +
  
  # bars for mean frequency
  geom_bar(stat = "identity", position = "dodge") +
  
  # error bars for standard error
  geom_errorbar(aes(..FILL ME..), position = "dodge")
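
Again purely for reference, a dodged bar chart with error bars built from a made-up toy summary (all column names hypothetical):

toy_summary <- tibble(group      = c("a", "a", "b", "b"),
                      condition  = c("x", "y", "x", "y"),
                      mean_value = c(2.0, 2.5, 4.0, 3.0),
                      se         = c(0.2, 0.3, 0.25, 0.2))

toy_summary %>% 
  ggplot(aes(x = group, y = mean_value, fill = condition)) +
  geom_bar(stat = "identity", position = "dodge") +
  geom_errorbar(aes(ymin = mean_value - se, ymax = mean_value + se),
                position = "dodge")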

2. Fit some regression models

a. Specify regression models (3 points)

Specify the formula argument for three regression models.

The first one, model_FE, only has fixed effects. It tries to explain frequency in terms of gender, attitude and their interaction.

The second one, model_intercept_only, is like model_FE but also adds random intercepts for both scenario and subject.

Finally, model_max_RE is like model_FE but also specifies the following random effects structure: by-scenario random intercepts, and random slopes for gender, attitude and their interaction, as well as by-subject random intercepts and random slopes for attitude.
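
As a syntax reminder, brms uses lme4-style formula notation for random effects. The sketch below uses hypothetical variables y, x1, x2 and a grouping factor g, not the variables of this data set:

# hypothetical variables; these are not the formulas you need to write
bf(y ~ x1 * x2)                   # fixed effects only, including the interaction
bf(y ~ x1 * x2 + (1 | g))         # ... plus a by-g random intercept
bf(y ~ x1 * x2 + (1 + x2 | g))    # ... plus a by-g random slope for x2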

Remember to set eval = TRUE.

model_FE <- brm(formula = ..FILL ME.., 
               data = politeness_data)

model_intercept_only <- brm(formula = ..FILL ME.., 
                          data = politeness_data)

model_max_RE <- brm(formula = ..FILL ME.., 
                  data = politeness_data)

b. Extracting comparisons (3 points)

The following function attempts to calculate the probability of each hypothesis being true, given the model.

  1. Fill in the correct formula for the “male-informal” cell in the design matrix (a generic sketch of the treatment-coding arithmetic follows this list).

  2. Specify the correct names and comparisons for hypotheses 2 and 3.

  3. Run the function for each model and interpret the output.
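
For orientation, here is the general arithmetic of reconstructing cell means from treatment-coded (dummy-coded) coefficients, sketched with hypothetical factors A (reference level a1) and B (reference level b1) and made-up coefficient values; these are not the coefficients of the fitted models:

# hypothetical 2x2 design, treatment coding; made-up values for illustration only
b_Intercept <- 200   # reference cell (a1, b1)
b_Aa2       <- -60   # effect of switching A from a1 to a2 (at b1)
b_Bb2       <- -20   # effect of switching B from b1 to b2 (at a1)
b_Aa2.Bb2   <- 5     # interaction adjustment for the (a2, b2) cell

cell_a1_b1 <- b_Intercept                               # reference cell
cell_a2_b1 <- b_Intercept + b_Aa2                       # only A switched
cell_a1_b2 <- b_Intercept + b_Bb2                       # only B switched
cell_a2_b2 <- b_Intercept + b_Aa2 + b_Bb2 + b_Aa2.Bb2   # both switched, plus interaction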

Remember to set eval = TRUE.

extract_comparisons <- function(model) {
  
  # get posterior samples
  post_samples <- posterior_samples(model)
  
  # mnemonic names for reconstructed predictor values for all cells in the design matrix
  
  # female-informal
  F_inf <- post_samples$b_Intercept

  # female-polite
  F_pol <- post_samples$b_Intercept + 
    post_samples$b_attitudepol
  
  # male-informal
  M_inf <- ..FILL ME..
  
  # male-polite
  M_pol <- post_samples$b_Intercept + 
    post_samples$b_genderM + 
    post_samples$b_attitudepol + 
    post_samples$`b_genderM:attitudepol`
  
  # create a tibble for recording hypotheses probabilities
  tibble(
    # names of hypotheses
    hypothesis = c(
      # h1: 
      "Female-polite < Female-informal",
      # h2:
      "Male-polite < Male-informal",
      # h3:
      "..FILL ME.."
      ),
         
    # probability of hypotheses being true
    probability = c(
      # h1:
      mean(F_pol < F_inf),
      # h2:
      mean(..FILL ME..),
      # h3
      mean(..FILL ME..)
      )
    )
}

# run the function on each model
# YOUR CODE HERE

Given the output, what would you say about the probability of each hypothesis being true? Which, if any, are likely to be true under all models? Which, if any, depend on the model considered?

ANSWER:

(your answer here)

c. Inspecting random effects (1 point)

Look at the estimates of the random-effects coefficients of the model with the maximal RE structure (get them from summary()).

# YOUR CODE HERE

Which of these random effects coefficients are very, very credibly estimated to be different from zero?

ANSWER:

(your answer here)

3. Model comparison

a. Leave-one-out cross-validation (1 point)

Compare the three models with leave-one-out cross-validation.

Remember to set eval = TRUE.

loo(..FILL ME..,
    reloo = TRUE)

b. Interpretation (1 point)

Which model is best?

Keep in mind that:

  • lower LOO-IC scores are better
  • the SE is the standard error of the LOO-IC estimate
  • we judge a difference between LOO-IC scores to be meaningful only if the SE of the difference does not exceed the absolute value of the difference itself (a toy illustration with made-up numbers follows this list)
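
A minimal sketch of this decision rule, using made-up numbers that are not output from any of the models above:

# made-up numbers for illustration only (not model output)
looic_diff <- -23.4   # difference in LOO-IC between two models
se_diff    <- 11.2    # standard error of that difference

# meaningful only if the SE of the difference does not exceed its absolute value
abs(looic_diff) > se_diff   # TRUE here, so this difference would count as meaningful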

ANSWER:

(your answer here)


End of assignment