Here is code to load (and if necessary, install) required packages, and to set some global options (for plotting and efficient fitting of Bayesian models).
Code
# install packages from CRAN (unless installed)pckgs_needed <-c("tidyverse","brms","remotes","tidybayes","shinystan")pckgs_installed <-installed.packages()[,"Package"]pckgs_2_install <- pckgs_needed[!(pckgs_needed %in% pckgs_installed)]if(length(pckgs_2_install)) {install.packages(pckgs_2_install)} # install additional packages from GitHub (unless installed)if (!"aida"%in% pckgs_installed) { remotes::install_github("michael-franke/aida-package")}if (!"faintr"%in% pckgs_installed) { remotes::install_github("michael-franke/faintr")}if (!"cspplot"%in% pckgs_installed) { remotes::install_github("CogSciPrag/cspplot")}# load the required packagesx <-lapply(pckgs_needed, library, character.only =TRUE)library(aida)library(faintr)library(cspplot)# these options help Stan run fasteroptions(mc.cores = parallel::detectCores())# use the CSP-theme for plottingtheme_set(theme_csp())# global color scheme from CSPproject_colors = cspplot::list_colors()[c(1,3,4,5,2,6:14),"hex", drop =TRUE]# setting theme colors globallyscale_colour_discrete <-function(...) {scale_colour_manual(..., values = project_colors)}scale_fill_discrete <-function(...) {scale_fill_manual(..., values = project_colors)}
This tutorial provides demonstrations of how to check the quality of MCMC samples obtained from brms model fits.
A good model
To have something to go on, here are two model fits, one of this is good, the other is … total crap. The first model fits a smooth line to the average world temperature. (We need to set the seed here to have reproducible results.)
fit_good <-brm(formula = avg_temp ~s(year), data = aida::data_WorldTemp,seed =1969)
The good model is rather well behaved. Here is a generic plot of its posterior fits and traceplots:
plot(fit_good)
Traceplots look like madly-in-love caterpillars doing their thing.
We can check \(\hat{R}\) and effective sample sizes also in the summary of the model:
summary(fit_good)
Family: gaussian
Links: mu = identity; sigma = identity
Formula: avg_temp ~ s(year)
Data: aida::data_WorldTemp (Number of observations: 269)
Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
total post-warmup draws = 4000
Smooth Terms:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sds(syear_1) 3.56 1.08 1.96 6.17 1.00 907 1689
Population-Level Effects:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept 8.31 0.02 8.27 8.35 1.00 3732 2652
syear_1 14.55 2.25 10.11 19.08 1.00 2182 2371
Family Specific Parameters:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma 0.33 0.01 0.30 0.36 1.00 3823 2962
Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
Interestingly, there is one warning message about one divergent transition. We are recommended to check the pairs() plot, so here goes:
pairs(fit_good)
This is actually not too bad. (Wait until you see a terrible case below!)
We can try to fix this problem with a single divergent transition by doing as recommended by the warning message, namely increasing the adapt_delta parameter in the control structure:
Family: gaussian
Links: mu = identity; sigma = identity
Formula: avg_temp ~ s(year)
Data: aida::data_WorldTemp (Number of observations: 269)
Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
total post-warmup draws = 4000
Smooth Terms:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sds(syear_1) 3.59 1.09 1.97 6.30 1.00 1003 1613
Population-Level Effects:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept 8.31 0.02 8.27 8.35 1.00 3632 2619
syear_1 14.59 2.40 9.96 19.34 1.00 2425 2510
Family Specific Parameters:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma 0.33 0.01 0.30 0.36 1.00 3588 2868
Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
That looks better, but what did we just do? — When the sampler “warms up”, it tries to find good parameter values for the case at hand. The adapt_delta parameter is the minimum amount of accepted proposals (where to jump next) before “warm up” counts as done and successfull. So with a small problem like this, just making the adaptation more ambitious may have have solved the problem. It has also, however, made the sampling slower, less efficient.
A powerful interactive tool for exploring a fitted model (diagnostics and more) is shinystan:
shinystan::launch_shinystan(fit_good_adapt)
A terrible model
The main (maybe only) reason for serious problems with the NUTS sampling is this: sampling issues arise for bad models. So, let’s come up with a really stupid model.
Here’s a model that is like the previous but adds a second predictor, which is a normal (non-smoothed) regression coefficient that is almost identical to the original year information. You may already intuit that this cannot possibly be a good idea; the model is notionally deficient. So, we exepct nightmares during sampling:
Family: gaussian
Links: mu = identity; sigma = identity
Formula: avg_temp ~ s(year) + year_perturbed
Data: mutate(aida::data_WorldTemp, year_perturbed = rnor (Number of observations: 269)
Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
total post-warmup draws = 4000
Smooth Terms:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sds(syear_1) 3.62 1.09 2.01 6.24 1.00 1239 1594
Population-Level Effects:
Estimate Est.Error l-95% CI
Intercept 2225135220368.58 2790632812490.53 -1139137522767.06
year_perturbed -1271506170.07 1594647735.12 -4693478867.37
syear_1 14.53 2.32 10.16
u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept 8213585887067.27 1.76 6 35
year_perturbed 650935896.17 1.76 6 35
syear_1 19.27 1.00 2285 2953
Family Specific Parameters:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma 0.33 0.01 0.30 0.36 1.00 4150 2656
Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
Indeed, that looks pretty bad. We managed to score badly on all major accounts:
large \(\hat{R}\)
extremely poor efficient sample size
ridiculously far ranging posterior estimates for the main model components
tons of divergent transitions
maximum treedepth reached more often than hipster touches their phone in a week
Some of these caterpillars look like they are in a vicious rose war:
plot(fit_bad)
We also see that that the intercept of and the slope for year_perturbed are the main troublemakers (in terms of traceplots).
Interestingly, a simple posterior check doesn’t look half-bad:
pp_check(fit_bad)
But now, have a look at the pairs() plot:
pairs(fit_bad)
Aha, there we see a clear problem! The joint posterior for the intercept and the slope for year_perturbed looks like a line. This means that these parameters could in princple do the same “job”.
This suggests a possible solution stragey. The model is too unconstrained. It can allow these two parameters meander to wherever they want (or so it seems). We could therefore try honing them in by specifying priors, like so:
Family: gaussian
Links: mu = identity; sigma = identity
Formula: avg_temp ~ s(year) + year_perturbed
Data: mutate(aida::data_WorldTemp, year_perturbed = rnor (Number of observations: 269)
Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
total post-warmup draws = 4000
Smooth Terms:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sds(syear_1) 3.62 1.08 2.04 6.16 1.00 1074 1808
Population-Level Effects:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept -187.55 58370.12 -91895.80 96032.53 1.02 2239 613
year_perturbed 0.11 33.35 -54.87 52.52 1.02 2239 613
syear_1 14.60 2.32 9.93 19.16 1.00 2006 2539
Family Specific Parameters:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma 0.33 0.01 0.30 0.36 1.00 3180 2729
Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
Well, alright! That isn’t too bad anymore. But it is still clear from the posterior pairs plot that this model has two parameters that steal each other’s show. The model remains a bad model … for our data.
pairs(fit_bad)
Here’s what’s wrong: year_perturbed is a constant! The model is a crappy model of the data, because the data is not what we thought it would be. Check it out:
That’s more like what we thought it was: year_perturbed is supposed to be noisy version of the actual year. So, let’s try again, leaving out the smoothing, just for some more chaos-loving fun:
fit_bad_2 <-brm(formula = avg_temp ~ year + year_perturbed, data = data_WorldTemp_perturbed,seed =1969,prior =prior("student_t(1,0,5)", coef ="year_perturbed")) summary(fit_bad_2)
Family: gaussian
Links: mu = identity; sigma = identity
Formula: avg_temp ~ year + year_perturbed
Data: data_WorldTemp_perturbed (Number of observations: 269)
Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
total post-warmup draws = 4000
Population-Level Effects:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept -3.45 0.58 -4.58 -2.32 1.00 4456 3133
year 0.01 0.00 0.01 0.01 1.00 4059 2558
year_perturbed -0.00 0.00 -0.00 0.00 1.00 4094 2708
Family Specific Parameters:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma 0.40 0.02 0.37 0.44 1.00 1456 1573
Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
There are no warnings, so this model must be good, right? – No!
If we check the pairs plot, we see that we now have introduced a fair correlation between the two predictor variables.
pairs(fit_bad_2)
We should just not have year_perturbed; it’s nonsense, and it shows in the diagnostics.