Required R packages


As always, load the required packages and change the seed to your last name.

library(TeachingDemos)
library(tidyverse)
## ── Attaching packages ─────────────────────────────── tidyverse 1.2.1 ──
## ✔ ggplot2 3.1.0     ✔ purrr   0.2.5
## ✔ tibble  2.0.1     ✔ dplyr   0.7.8
## ✔ tidyr   0.8.2     ✔ stringr 1.3.1
## ✔ readr   1.3.1     ✔ forcats 0.3.0
## ── Conflicts ────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
library(rstan)
## Loading required package: StanHeaders
## rstan (Version 2.18.2, GitRev: 2e1f913d3ca3)
## For execution on a local, multicore CPU with excess RAM we recommend calling
## options(mc.cores = parallel::detectCores()).
## To avoid recompilation of unchanged Stan programs, we recommend calling
## rstan_options(auto_write = TRUE)
## 
## Attaching package: 'rstan'
## The following object is masked from 'package:tidyr':
## 
##     extract
library(ggmcmc)
library(bridgesampling)
library(polspline)

# use all available cores (but at least 4)
options(mc.cores = max(parallel::detectCores(), 4))
# save a compiled version of the Stan model file
rstan_options(auto_write = TRUE)

lastname <- "YOURLASTNAME"

char2seed(lastname)

Here is some data from a forgetting experiment. t is the time point, and y is the proportion of people who correctly recalled the stimulus.

# proportion of correct recalls at each time point
y <- c(.94, .77, .40, .26, .24, .16)
# retention interval (time point)
t <- c(1,   3,   6,   9,  12,  18)
# number of correct recalls out of 100 observations
obs <- y * 100

forgetting_data <- tibble(obs = obs,
                          N = 100,
                          t = t)

1. Implement models

Below is the code for the power model of forgetting we discussed in class: \[P(\text{recall} | t; c, d) = ct^{-d}\] where \(0 < c, d < 1.5\)

a. (1 point)

Implement an alternative model based on this exponential model for each participant:

\[P(\text{recall} | t; a, b) = a \exp(-bt)\] where \(0 < a, b < 1.5\)

power_model_str <- "
// power model of forgetting data
data {
  int obs[6] ;
  real t[6] ;
}
parameters {
  real<lower=0,upper=1.5> c ;
  real<lower=0,upper=1.5> d ;
} 
transformed parameters {
  real theta[6];
  for (i in 1:6) {
    theta[i] = c * t[i]^(-d) ;   
  }
}
model {
  // increment target log score with
  // probability of current samples from prior
  target += uniform_lpdf(c | 0, 1.5);
  target += uniform_lpdf(d | 0, 1.5);
  // increment target log score with
  // likelihood of data given current parameter samples
  target += binomial_lpmf(obs | 100, theta);
}

"

exponential_model_str <- "
// exponential model of forgetting data
// STAN CODE HERE

"

2. Parameter inference

a. (2 points)

For both models, complete the code below to draw samples from the posterior distribution. NB: you should use a large number of samples (as indicated) so that the later bridge sampling is reliable. We use the control argument here to fine-tune the inference for these models. Remember to set eval = TRUE in the chunk options.

# complete the code below
stanfit_exponential <- stan(# specify model
                            iter = 50000,
                            warmup = 10000,
                            control = list(adapt_delta = 0.999),
                            pars = , # specify the parameters to monitor
                            data =  ## provide data
                            )

stanfit_power <- stan(# specify model
                      iter = 50000,
                      warmup = 10000,
                      control = list(adapt_delta = 0.999),
                      pars = , # specify the parameters to monitor
                      data =  ## provide data
                      )                 
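
If it helps, here is a sketch of what the completed calls could look like, assuming the model strings above and the forgetting_data tibble; the name stan_data and the pars choices are only suggestions:

# sketch: data list whose names match the Stan data block
# (obs is coerced to integers, as required by the int type in Stan)
stan_data <- list(obs = as.integer(round(forgetting_data$obs)),
                  t   = forgetting_data$t)

stanfit_exponential <- stan(model_code = exponential_model_str,
                            iter = 50000,
                            warmup = 10000,
                            control = list(adapt_delta = 0.999),
                            pars = c("a", "b"),
                            data = stan_data)

stanfit_power <- stan(model_code = power_model_str,
                      iter = 50000,
                      warmup = 10000,
                      control = list(adapt_delta = 0.999),
                      pars = c("c", "d"),
                      data = stan_data)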

b. (1 point)

Show a summary for each fit

# YOUR CODE HERE
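
One possibility (a sketch): the default print method for a stanfit object gives a summary table.

print(stanfit_exponential, pars = c("a", "b"))
print(stanfit_power, pars = c("c", "d"))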

c. (1 point)

Make traceplots for the samples for both models

# YOUR CODE HERE
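
A sketch using ggmcmc, as in the lecture (rstan::traceplot would work as well):

ggs_traceplot(ggs(stanfit_exponential))
ggs_traceplot(ggs(stanfit_power))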

d. (1 point)

Plot the density for posterior samples for both models

# YOUR CODE HERE
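
A sketch, again via ggmcmc (rstan::stan_dens is an alternative):

ggs_density(ggs(stanfit_exponential))
ggs_density(ggs(stanfit_power))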

3. Bridge Sampling

a. (1 point)

Approximate the marginal likelihood of each model using the bridgesampling package (Hint: use bridgesampling::bridge_sampler() as we did in the lecture, and use option silent = T to suppress output in the markdown).

# YOUR CODE HERE
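
A sketch of how this could look; the object names bridge_exponential and bridge_power are only suggestions:

bridge_exponential <- bridge_sampler(stanfit_exponential, silent = T)
bridge_power       <- bridge_sampler(stanfit_power, silent = T)
bridge_exponential
bridge_power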

b. (1 point)

Determine the error of the estimated marginal likelihoods using bridgesampling::error_measures().

# YOUR CODE HERE
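
A sketch, assuming the bridge-sampling objects from the previous step:

error_measures(bridge_exponential)
error_measures(bridge_power)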

c. (1 point)

The previous results only give you an estimate for the marginal likelihood of the data given each model. But you can use these estimates to compute the Bayes factor in favour of the exponential model.

# YOUR CODE HERE
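
A sketch, assuming the bridge-sampling objects from above; bf() takes the ratio of the two estimated marginal likelihoods:

bf(bridge_exponential, bridge_power)
# equivalently, working on the log scale:
# exp(bridge_exponential$logml - bridge_power$logml)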

d. (1 point)

Interpret the Bayes factor. What is the value? What does it mean?

ANSWER:

(your answer here)

4. Savage-Dickey

Use the Savage-Dickey method to compute the Bayes factor in favor of the exponential model, when compared against a special case (properly nested model) of the exponential model where a = 1.

a. (1 point)

Fill in the required code:

# NB: we take only a subset of the samples because otherwise polspline will not work
post_samples <- filter(ggs(stanfit_exponential), 
                       Parameter == "a" & Iteration <= 1000)$value

# fit a density estimate to the posterior samples of a
fit.posterior <- polspline::oldlogspline(post_samples, 
                                         ubound=1.5, lbound=0)

# posterior density of a evaluated at the nested value a = 1
posterior_a <- polspline::doldlogspline(1, fit.posterior)

# calculate the Bayes factor via the Savage-Dickey method
# YOUR CODE HERE
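
A sketch of the Savage-Dickey computation: under the full model the prior on a is uniform on [0, 1.5], so its density at a = 1 is 1/1.5; the Bayes factor in favour of the full exponential model is the ratio of prior to posterior density at that point (the name BF_full_vs_nested is only a suggestion):

# prior density of a at the nested value a = 1
prior_a <- dunif(1, min = 0, max = 1.5)
# Bayes factor in favour of the full exponential model over the nested model (a = 1)
BF_full_vs_nested <- prior_a / posterior_a
BF_full_vs_nested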

b. (1 point)

Interpret the result

Is this strong or mild evidence in favor of one of the two models?

ANSWER:

(your answer here)

5. Posterior predictive distribution

We want to approximate the posterior predictive distribution. For this, we draw samples from the posterior, using essentially the same scripts as above for posterior estimation. Additionally, for each sampled vector of parameter values, we generate a “sample imaginary prediction” of the data we would expect to see in an exact repetition of the experiment. To do so, we use the generated quantities block at the end of the Stan model code.

a. (2 points)

Complete the templates below for the exponential and the power model, replacing the pseudo-code with actual Stan code that does this. (Hint: use the function binomial_rng to take samples from a binomial distribution; check the Stan manual for its syntax.)

power_pred_model_str <- "
// Stan code here

// fill in the gaps and replace the pseudo-code
data {
 // as before
}
parameters {
 // as before
} 
transformed parameters {
  // as before
}
model {
  // as before
}
// PSEUDO-CODE!
generated quantities {
 // fill in for yourself
 // DECLARE VARIABLE 'post_predict_sample' WITH SAME SHAPE AS THE DATA
 // post_predict_sample = SAMPLE FROM BINOMIAL DISTRIBUTION, WITH CURRENT PARAMETER ESTIMATES
}

"

exponential_pred_model_str <- "
// Stan code here

// fill in the gaps and replace the pseudo-code
data {
 // as before
}
parameters {
 // as before
} 
transformed parameters {
  // as before
}
model {
  // as before
}
// PSEUDO-CODE!
generated quantities {
 // fill in for yourself
 // DECLARE VARIABLE 'post_predict_sample' WITH SAME SHAPE AS THE DATA
 //  post_predict_sample = SAMPLE FROM BINOMIAL DISTRIBUTION, WITH CURRENT PARAMETER ESTIMATES
}

"

b. (2 points)

Plot the data and a visual posterior predictive check

Use the samples of the variable post_predict_sample to generate plots for a visual posterior predictive check, similar to what we had in the lecture. Use y limits of 0 to 100.

Hint: use geom_point() for the real data and geom_pointrange() for the predicted values and 95% credible intervals.

# YOUR CODE HERE
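
A sketch of one way to do this, assuming the fit of the predictive exponential model is called stanfit_exponential_pred (the name is only a suggestion); the power model is treated analogously:

# extract the posterior predictive samples (iterations x 6 matrix)
pred_samples <- rstan::extract(stanfit_exponential_pred)$post_predict_sample

# summarise the predictions per time point: mean and 95% credible interval
pred_summary <- tibble(t     = t,
                       obs   = obs,
                       mean  = apply(pred_samples, 2, mean),
                       lower = apply(pred_samples, 2, quantile, probs = 0.025),
                       upper = apply(pred_samples, 2, quantile, probs = 0.975))

ggplot(pred_summary, aes(x = t)) +
  geom_pointrange(aes(y = mean, ymin = lower, ymax = upper), colour = "darkgrey") +
  geom_point(aes(y = obs), colour = "firebrick") +
  ylim(0, 100)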

6. Computing the posterior predictive \(p\)-value

To compute the posterior predictive \(p\)-value most efficiently (in the sense of having to write the least code, not necessarily in the sense of the fastest run time), we will extend the Stan code further. Based on your previous code, which computes samples from the posterior predictive distribution, extend the Stan code to also generate two more quantities. The first is the likelihood of the observed data under the currently sampled parameter values. The second is the likelihood of post_predict_sample under the currently sampled parameter values. The posterior predictive \(p\)-value is then approximated by the mean of the variable post_pred_p_value_flag in the pseudo-code below.

a. (2 points)

Complete the code accordingly.

power_ppp_model_str <- "
// fill in the gaps and replace the pseudo-code
data {
 // as before
}
parameters {
 // as before
} 
transformed parameters {
  // as before
}
model {
  // as before
}
// PSEUDO-CODE!
generated quantities {
  // fill in for yourself
  // DECLARE VARIABLE 'post_predict_sample' WITH SAME SHAPE AS THE DATA
  // post_predict_sample = SAMPLE FROM BINOMIAL DISTRIBUTION, WITH CURRENT PARAMETER ESTIMATES
  // likelihood_data = binomial_lpmf( ... FILL ME ... )
  // likelihood_post_predict_sample = binomial_lpmf( ... FILL ME ... )
  // post_pred_p_value_flag = likelihood_data >= likelihood_post_predict_sample;
}

"

exponential_ppp_model_str <- "
// fill in the gaps and replace the pseudo-code
data {
 // as before
}
parameters {
 // as before
} 
transformed parameters {
  // as before
}
model {
  // as before
}
// PSEUDO-CODE!
generated quantities {
  // fill in for yourself
  // DECLARE VARIABLE 'post_predict_sample' WITH SAME SHAPE AS THE DATA
  // post_predict_sample = SAMPLE FROM BINOMIAL DISTRIBUTION, WITH CURRENT PARAMETER ESTIMATES
  // likelihood_data = binomial_lpmf( ... FILL ME ... )
  // likelihood_post_predict_sample = binomial_lpmf( ... FILL ME ... )
  // post_pred_p_value_flag = likelihood_data >= likelihood_post_predict_sample;
}

"

b. (1 point)

Retrieve estimates of the posterior predictive \(p\)-values for both the exponential and power models.

# YOUR CODE HERE
# Run the stan models and retrieve an estimate of the posterior predictive 
# p-value for both exponential and power model.
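
A sketch, assuming the extended fits are called stanfit_exponential_ppp and stanfit_power_ppp and reuse the stan_data list from section 2 (names are only suggestions); the posterior predictive p-value is the mean of the flag over all posterior samples:

stanfit_exponential_ppp <- stan(model_code = exponential_ppp_model_str,
                                iter = 50000, warmup = 10000,
                                control = list(adapt_delta = 0.999),
                                data = stan_data)
stanfit_power_ppp <- stan(model_code = power_ppp_model_str,
                          iter = 50000, warmup = 10000,
                          control = list(adapt_delta = 0.999),
                          data = stan_data)

mean(rstan::extract(stanfit_exponential_ppp)$post_pred_p_value_flag)
mean(rstan::extract(stanfit_power_ppp)$post_pred_p_value_flag)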

Can either of the models be criticised based on these values? If so, which and why?

ANSWER:

(your answer here)


End of assignment