11 Bayesian hypothesis testing
This chapter introduces common Bayesian methods of testing what we could call statistical hypotheses. A statistical hypothesis is a hypothesis about a particular model parameter or a set of model parameters. Most often, such a hypothesis concerns one parameter, and the assumption in question is that this parameter takes on a specific value, or some value from a specific interval. Henceforth, we speak just of a “hypothesis” even though we mean a specific hypothesis about particular model parameters. For example, we might be interested in what we will call a point-valued hypothesis, stating that the value of parameter \(\theta\) is fixed to a specific value \(\theta = \theta^*\). Section 11.1 introduces different kinds of statistical hypotheses in more detail.
Given a statistical hypothesis about parameter values, we are interested in “testing” it. Strictly speaking, the term “testing” should probably be reserved for statistical decision procedures which deliver clear categorical judgements, such as whether to reject a hypothesis, to accept it as true, or to withhold judgement because no decision can be made (yet). While we will encounter such categorical decision routines in this chapter, Bayesian approaches to hypothesis “testing” are first and foremost concerned not with categorical decisions, but with quantifying evidence in favor of or against the hypothesis in question. (In a second step, using Bayesian decision theory, which also weighs in the utility of different policy choices, we can of course use Bayesian inference for informed decision making.) But instead of speaking of “Bayesian inference to weigh evidence for/against a hypothesis”, we will just speak of “Bayesian hypothesis testing” for ease of parlance.
We consider two conceptually distinct approaches within Bayesian hypothesis testing.
- Estimation-based testing considers just one model. It uses the observed data \(D_\text{obs}\) to retrieve posterior beliefs \(P(\theta \mid D_{\text{obs}})\) and checks whether, a posteriori, our hypothesis is credible.
- Comparison-based testing uses Bayesian model comparison, in the form of Bayes factors, to compare two models: one model that assumes that the hypothesis in question is true, and one model that assumes that the complement of the hypothesis is true.
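To make the contrast concrete, here is a minimal sketch (not from the text; all numbers are hypothetical) of estimation-based testing for a simple coin-flip scenario: a Binomial likelihood with a Beta(1, 1) prior, observed data of 7 successes in 24 trials, and the point-valued hypothesis \(\theta^* = 0.5\). We approximate the (conjugate) posterior on a grid and check whether \(\theta^*\) falls inside a central 95% credible interval:

```python
import math

def beta_logpdf(x, a, b):
    # log-density of a Beta(a, b) distribution at x (0 < x < 1)
    return (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
            + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x))

# hypothetical data: k successes in n coin flips
k, n = 7, 24
a, b = 1 + k, 1 + (n - k)  # conjugate Beta posterior under a Beta(1, 1) prior

# grid approximation of the posterior and its central 95% credible interval
m = 100_000
xs = [(i + 0.5) / m for i in range(m)]
weights = [math.exp(beta_logpdf(x, a, b)) for x in xs]
total = sum(weights)

cdf, lo, hi = 0.0, None, None
for x, w in zip(xs, weights):
    cdf += w / total
    if lo is None and cdf >= 0.025:
        lo = x
    if hi is None and cdf >= 0.975:
        hi = x

theta_star = 0.5
print(f"95% credible interval: [{lo:.3f}, {hi:.3f}]")
print("theta* credible a posteriori?", lo <= theta_star <= hi)
```

Comparison-based testing would instead quantify how strongly the data favor \(\theta = \theta^*\) over an alternative model, e.g., via a Bayes factor; methods for this are developed later in the chapter.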
The main difference between these two approaches is that estimation-based hypothesis testing is simpler (conceptually and computationally), but less informative than comparison-based hypothesis testing. Comparison-based methods give a clearer picture of the quantitative evidence for/against a hypothesis because they explicitly take into account a concrete alternative to the hypothesis to be tested. As we will see in this chapter, the technical obstacles for comparison-based approaches can be overcome. For special but common use cases, like testing directional hypotheses, there are efficient methods of performing comparison-based hypothesis testing.
The learning goals for this chapter are:

- understand the notion of a statistical hypothesis
  - point-valued, ROPE-d and directional hypotheses
  - complement / alternative hypothesis
- be able to apply Bayesian hypothesis testing to (simple) case studies
- understand and be able to apply the Savage-Dickey method (and its extension to interval-based hypotheses in terms of encompassing models)
- become familiar with a Bayesian \(t\)-test model for comparing the means of two groups of metric measurements
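As a preview of the Savage-Dickey method mentioned in the goals above: for a point-valued hypothesis \(\theta = \theta^*\) nested inside an encompassing model, the Bayes factor in favor of the point hypothesis equals the ratio of posterior to prior density at \(\theta^*\). A minimal sketch for the conjugate Beta-Binomial case (hypothetical numbers, plain Python, no external libraries):

```python
import math

def beta_logpdf(x, a, b):
    # log-density of a Beta(a, b) distribution at x (0 < x < 1)
    return (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
            + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x))

def savage_dickey_bf01(theta_star, k, n, a0=1.0, b0=1.0):
    """Bayes factor in favor of theta = theta_star over the encompassing
    model, via the Savage-Dickey density ratio:
    BF01 = posterior density at theta_star / prior density at theta_star."""
    log_post = beta_logpdf(theta_star, a0 + k, b0 + (n - k))
    log_prior = beta_logpdf(theta_star, a0, b0)
    return math.exp(log_post - log_prior)

# hypothetical data: 7 successes in 24 trials, testing theta* = 0.5
bf01 = savage_dickey_bf01(0.5, 7, 24)
print(f"BF01 = {bf01:.3f}")  # ≈ 0.516: BF01 < 1, i.e., (weak) evidence against theta = 0.5
```

The conjugate setup makes the density ratio available in closed form; for models without conjugate posteriors, the same ratio can be estimated from posterior samples, as discussed later in the chapter.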