10 Model Comparison

Parameter estimation (the topic of the last chapter) asks: given a single model and the data, what are good (e.g., credible) values of the model’s parameters? Model comparison (the topic of this chapter) asks: based on the data at hand, which of several models is better? Or even: how much better is this model compared to another, given the data?

The pivotal criterion by which to compare models is how well a model explains the observed data. A good explanation of observed data \(D\) is one that makes \(D\) unsurprising. Intuitively, we long for an explanation for things that puzzle us. A good explanation is a way of looking at the world in which puzzles disappear, in which all observations make sense, in which what we have seen would have been quite expectable after all. Consequently, the pivotal quantity for comparing models is how likely \(D\) is given a model \(M_i\): \(P(D \mid M_i)\).
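To make this quantity concrete, here is a minimal sketch in Python. It rests on assumptions not found in the text: hypothetical data (7 heads in 10 coin flips) and two hypothetical models that each fix the coin's bias at a single value, so neither model has free parameters or priors. The function name `binomial_likelihood` is ours and does not come from any library.

```python
from math import comb


def binomial_likelihood(k, n, theta):
    """Probability of k successes in n trials when the success probability is theta."""
    return comb(n, k) * theta**k * (1 - theta) ** (n - k)


# Hypothetical data D (an assumption for illustration): 7 heads in 10 flips.
k, n = 7, 10

# Two hypothetical models, each fixing the coin's bias at one value.
lik_fair = binomial_likelihood(k, n, theta=0.5)    # M_1: fair coin
lik_biased = binomial_likelihood(k, n, theta=0.7)  # M_2: coin biased towards heads

print(f"P(D | M_1) = {lik_fair:.4f}")
print(f"P(D | M_2) = {lik_biased:.4f}")
print(f"ratio P(D | M_2) / P(D | M_1) = {lik_biased / lik_fair:.2f}")
```

Under these assumptions, the model that assigns the higher probability to the observed data is the one that makes the data less surprising. Models with free parameters need more care, because \(P(D \mid M_i)\) then depends on how those parameters are handled.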

Intuitively, however, there is more to a good explanation: all else being equal, a good explanation is also simple. If theories \(A\) and \(B\) both explain the facts equally well, but \(A\) does so with less “mental machinery”, most people would choose the more economical explanation \(A\).

In this chapter, we will look at two common methods of comparing models: the Akaike information criterion (AIC) and Bayes factors. The AIC is a non-Bayesian method in the sense that it does not require (or simply ignores) a model’s priors over parameter values. Bayes factors are the flagship Bayesian method for model comparison. There are many other approaches to model comparison (e.g., other kinds of information criteria, or methods based on cross-validation). Our goal is not to be exhaustive, but to introduce the main ideas of model comparison and showcase a reasonable selection of representative approaches.
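For orientation, the two quantities are commonly defined as follows. The notation is an assumption made here and may differ from the notation used later in the chapter: \(k_i\) is the number of free parameters of model \(M_i\), and \(\hat{\theta}_i\) is the parameter value that maximizes the likelihood of \(D\) under \(M_i\).

\[
\text{AIC}(M_i, D) = 2 k_i - 2 \log P(D \mid \hat{\theta}_i, M_i)
\qquad \qquad
\text{BF}_{12} = \frac{P(D \mid M_1)}{P(D \mid M_2)}
\]

A lower AIC indicates a better model, and a Bayes factor \(\text{BF}_{12}\) greater than 1 favors \(M_1\) over \(M_2\). In the Bayes factor, \(P(D \mid M_i)\) is the marginal likelihood, which averages the likelihood over the model's prior on parameters. This is the point where priors enter the Bayes factor but not the AIC.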

The learning goals for this chapter are:

- understand the differences between estimation and model comparison
- understand and apply the two covered methods:
  - Akaike information criterion
  - Bayes factor
- become familiar with the pros and cons of each of these methods
- [optional] get acquainted with some methods for computing Bayes factors