Predictions from generalized linear models

Bayesian regression: theory & practice

Author

Michael Franke

Preamble

Here is code to load (and if necessary, install) required packages, and to set some global options (for plotting and efficient fitting of Bayesian models).

Toggle code
# install packages from CRAN (unless installed)
pckgs_needed <- c(
  "tidyverse",
  "brms",
  "rstan",
  "rstanarm",
  "remotes",
  "tidybayes",
  "bridgesampling",
  "shinystan",
  "mgcv"
)
pckgs_installed <- installed.packages()[,"Package"]
pckgs_2_install <- pckgs_needed[!(pckgs_needed %in% pckgs_installed)]
if(length(pckgs_2_install)) {
  install.packages(pckgs_2_install)
} 

# install additional packages from GitHub (unless installed)
if (! "aida" %in% pckgs_installed) {
  remotes::install_github("michael-franke/aida-package")
}
if (! "faintr" %in% pckgs_installed) {
  remotes::install_github("michael-franke/faintr")
}
if (! "cspplot" %in% pckgs_installed) {
  remotes::install_github("CogSciPrag/cspplot")
}

# load the required packages
x <- lapply(pckgs_needed, library, character.only = TRUE)
library(aida)
library(faintr)
library(cspplot)

# these options help Stan run faster
options(mc.cores = parallel::detectCores())

# use the CSP-theme for plotting
theme_set(theme_csp())

# global color scheme from CSP
project_colors = cspplot::list_colors() |> pull(hex)
# names(project_colors) <- cspplot::list_colors() |> pull(name)

# setting theme colors globally
scale_colour_discrete <- function(...) {
  scale_colour_manual(..., values = project_colors)
}
scale_fill_discrete <- function(...) {
   scale_fill_manual(..., values = project_colors)
}

Model predictions

Usually, when we think about what predictions a model makes, we think about the predictions it makes about the data, e.g., which data points \(y'\) is deems likely after having been conditioned on some data \(y\). These are the types of predictions that usually matter. They address the question: what will the future be like?

But complex models, with various latent variables, not only make predictions about data \(y\) but also about all computational steps from parameters \(\theta\) to data \(y\), so to speak.

A vanilla linear model (whether before or after being conditioned on some data), makes two kinds of predictions, namely:

  1. the shape of the (hypothetical) data \(y'\) for \(x\), and
  2. the central tendency of data \(y\) for some predictor \(x\).

Generalized linear models often also disassociate a prediction of central tendency (point 2 above), from the linear predictor that is used to compute that prediction of central tendency. So, all linear models also predict:

  1. a linear predictor value given values of \(x\),

but for some linear models (with the identity function as a link function), there is no difference between 1 and 3.

When we speak of the *posterior predictive distribution* we usually mean predictions about data \(y'\), but the term can be (leniently) also applied to the latter two types of predictions.

Predictive samples

Samples for all of the three types of posterior predictive distributions can be obtained from a fitted model, e.g., with different functions from the tidyverse package. Here, it does not matter whether the model was fitted to data or it is a “prior model”, so to speak, fit with the flag sample_prior = "only". We look at the posterior case first, then at the prior predicitives.

Posterior predictives

Here is an example for a logistic regression model (where all the three measures clearly show their conceptual difference). Fit the model to some data first (here: predicting accuracy for two categorical factors with two levels each):

Toggle code
fit_MT_logistic <- 
  brms::brm(
    formula = correct ~ group * condition,
    data    = aida::data_MT,
    family  = brms::bernoulli()
  )

The posterior predictive (in the most general sense) makes predictions about the to-be-expected data, here a Boolean value of whether a response was correct.

Toggle code
# 2 samples from the predictive distribution (data samples)
data_MT |> 
  select(group, condition) |> 
  unique() |> 
  tidybayes::add_predicted_draws(
    fit_MT_logistic,
    ndraws = 2
    )
# A tibble: 8 × 7
# Groups:   group, condition, .row [4]
  group condition  .row .chain .iteration .draw .prediction
  <chr> <chr>     <int>  <int>      <int> <int>       <int>
1 touch Atypical      1     NA         NA     1           1
2 touch Atypical      1     NA         NA     2           1
3 touch Typical       2     NA         NA     1           1
4 touch Typical       2     NA         NA     2           0
5 click Atypical      3     NA         NA     1           1
6 click Atypical      3     NA         NA     2           1
7 click Typical       4     NA         NA     1           1
8 click Typical       4     NA         NA     2           1

A predicted central tendency for this logistic model is a probability of giving a correct answer.

Toggle code
# 2 samples from the predicted central tendency
data_MT |> 
  select(group, condition) |> 
  unique() |> 
  tidybayes::add_epred_draws(
    fit_MT_logistic,
    ndraws = 2
    )
# A tibble: 8 × 7
# Groups:   group, condition, .row [4]
  group condition  .row .chain .iteration .draw .epred
  <chr> <chr>     <int>  <int>      <int> <int>  <dbl>
1 touch Atypical      1     NA         NA     1  0.935
2 touch Atypical      1     NA         NA     2  0.918
3 touch Typical       2     NA         NA     1  0.938
4 touch Typical       2     NA         NA     2  0.936
5 click Atypical      3     NA         NA     1  0.885
6 click Atypical      3     NA         NA     2  0.849
7 click Typical       4     NA         NA     1  0.964
8 click Typical       4     NA         NA     2  0.950

Predictions at the linear predictor level are sometimes not so easy to interpret. The interpretation depends on the kind of link function used (more on this under the topic of “generalized linear models”). For a logistic regression, this number is a log-odds ratio (which determines the predicted correctness-probability).

Toggle code
# 2 samples for the linear predictor
data_MT |> 
  select(group, condition) |> 
  unique() |> 
  tidybayes::add_linpred_draws(
    fit_MT_logistic,
    ndraws = 2
    )
# A tibble: 8 × 7
# Groups:   group, condition, .row [4]
  group condition  .row .chain .iteration .draw .linpred
  <chr> <chr>     <int>  <int>      <int> <int>    <dbl>
1 touch Atypical      1     NA         NA     1     2.10
2 touch Atypical      1     NA         NA     2     2.29
3 touch Typical       2     NA         NA     1     2.74
4 touch Typical       2     NA         NA     2     2.55
5 click Atypical      3     NA         NA     1     1.89
6 click Atypical      3     NA         NA     2     2.33
7 click Typical       4     NA         NA     1     3.46
8 click Typical       4     NA         NA     2     3.52