13.1 Simple linear regression with brms

The main function of the brms package is brm (short for Bayesian Regression Model). It behaves very similarly to the glm function we saw above.1 Here is how we apply it to the current case study, which is based on the world temperature data set:

# load the brms package (which provides the brm function)
library(brms)

fit_temperature <- brm(
  # specify what to explain in terms of what
  #  using the formula syntax
  formula = avg_temp ~ year,
  # which data to use
  data = aida::data_WorldTemp
)

The formula syntax y ~ x tells R that we want to explain or predict the dependent variable y in terms of the associated measurements of x, as stored in the data set (tibble or data.frame) supplied as the data argument of the function call.
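
For comparison, the same formula and data arguments also work with base R's lm function, which fits the corresponding frequentist model. The following is a minimal sketch, not part of the Bayesian analysis; the object name fit_temperature_lm is just illustrative, and the column names avg_temp and year are taken from the data set used above.

# the corresponding frequentist fit, using the same formula syntax
fit_temperature_lm <- lm(
  formula = avg_temp ~ year,
  data = aida::data_WorldTemp
)
# point estimates for intercept and slope
coef(fit_temperature_lm)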

The object returned by this function call is a special-purpose object of class brmsfit. If we print this object to the screen, we get a summary (which we can also produce with the explicit call summary(fit_temperature)).

fit_temperature
##  Family: gaussian 
##   Links: mu = identity; sigma = identity 
## Formula: avg_temp ~ year 
##    Data: aida::data_WorldTemp (Number of observations: 269) 
##   Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
##          total post-warmup draws = 4000
## 
## Population-Level Effects: 
##           Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept    -3.51      0.61    -4.68    -2.30 1.00     3961     2594
## year          0.01      0.00     0.01     0.01 1.00     3967     2626
## 
## Family Specific Parameters: 
##       Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma     0.41      0.02     0.37     0.44 1.00     1394     1605
## 
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).

This output tells us which model we fitted and states some properties of the MCMC sampling routine used to obtain samples from the posterior distribution. The most important pieces of information for drawing conclusions from this analysis are the summaries for the estimated parameters, here called “Intercept” (the \(\beta_0\) of the regression model), “year” (the slope coefficient \(\beta_1\) for the year column in the data), and “sigma” (the standard deviation of the Gaussian error function around the central predictor). The “Estimate” shown for each parameter is its posterior mean. The columns “l-95% CI” and “u-95% CI” give the lower and upper bounds of the central 95% quantile range (credible interval) of each parameter’s marginal posterior distribution.
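
If we want to work with the posterior samples directly, we can extract them from the fitted object and reproduce these summary statistics by hand. The following is a minimal sketch, assuming a recent version of brms in which draws are accessible via as_draws_df and the slope coefficient for year is stored under the name b_year (following brms’s naming convention for population-level effects).

# extract posterior draws as a data frame (one column per parameter)
posterior_draws <- as_draws_df(fit_temperature)
# posterior mean of the slope coefficient for 'year'
mean(posterior_draws$b_year)
# central 95% quantile range (credible interval) for the slope
quantile(posterior_draws$b_year, probs = c(0.025, 0.975))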


  1. Actually, brm is similar to the lmer function from the lme4 package, which is more general than glm. Both lmer and brm also cover so-called hierarchical regression models.↩︎