part 1
- (discrete) probability distributions
- \(p\)-value
- statistical significance
\[ \definecolor{firebrick}{RGB}{178,34,34} \newcommand{\red}[1]{{\color{firebrick}{#1}}} \] \[ \definecolor{mygray}{RGB}{128,128,128} \newcommand{\mygray}[1]{{\color{mygray}{#1}}} \] \[ \newcommand{\set}[1]{\{#1\}} \] \[ \newcommand{\tuple}[1]{\langle#1\rangle} \] \[\newcommand{\States}{{T}}\] \[\newcommand{\state}{{t}}\] \[\newcommand{\pow}[1]{{\mathcal{P}(#1)}}\]
a discrete probability distribution over a finite set of mutually exclusive world states \(\States\) is a function \(P \colon \States \rightarrow [0,1]\) such that \(\sum_{\state \in \States} P(\state) = 1\).
for finite \(\States\), \(P(\state)\) is \(\state\)'s probability mass
okay, winter is coming; but who will sit the Iron Throne next spring?
## Targaryen Lannister Baratheon   Greyjoy     Stark 
##     0.375     0.188     0.062     0.125     0.250
if \(f \colon \States \rightarrow \mathbb{R}^{\ge0}\), then
\[ P(\state) \propto f(\state) \]
is shorthand notation for
\[ P(\state) = \frac{f(\state)}{ \sum_{\state' \in \States} f(\state')} \]
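as a quick sketch in R (the scores below are illustrative, chosen to reproduce the Iron Throne table above), normalization turns any non-negative scores \(f\) into a probability distribution:

```r
# illustrative non-negative scores f over world states
f <- c(Targaryen = 6, Lannister = 3, Baratheon = 1, Greyjoy = 2, Stark = 4)
# P(t) = f(t) / sum over t' of f(t')
P <- f / sum(f)
round(P, 3)
```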
the binomial distribution gives the probability of observing \(k\) successes in \(n\) coin flips with a bias of \(\theta\):
\[ B(k ; n,\theta) = \binom{n}{k} \theta^{k} \, (1-\theta)^{n-k} \]
probability of observing at least one success in \(n\) coin flips with bias \(\theta\):
\[ \begin{align*} B(k > 0; n,\theta) & = 1 - B(k = 0; n,\theta) \\ & = 1 - \binom{n}{0} \theta^{0} \, (1-\theta)^{n-0} \\ & = 1 - (1-\theta)^{n} \\ \end{align*} \]
this will be useful later
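a quick sanity check of the closed form in R, against the binomial CDF function `pbinom`:

```r
# probability of at least one success in n flips with bias theta
n <- 24; theta <- 0.5
p_closed <- 1 - (1 - theta)^n
# same quantity via the CDF: P(k > 0) = 1 - P(k <= 0)
p_cdf <- 1 - pbinom(0, size = n, prob = theta)
all.equal(p_closed, p_cdf)
## [1] TRUE
```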
which of these are true?
many statistical concepts are misunderstood by the majority of researchers
[from Haller & Kraus (2002)]
the \(p\)-value is the probability of observing, under infinite hypothetical repetitions of the same experiment, a value of a test statistic at least as extreme as that of the observed data, given that the null hypothesis is true
in the general case, the \(p\)-value of observation \(x\) under null hypothesis \(H_0\), with sample space \(X\), sampling distribution \(P(\cdot \mid H_0) \in \Delta(X)\) and test statistic \(t \colon X \rightarrow \mathbb{R}\) is:
\[ p(x ; H_0, X, P(\cdot \mid H_0), t) = \int_{\left\{ \tilde{x} \in X \ \mid \ t(\tilde{x}) \ge t(x) \right\}} P(\tilde{x} \mid H_0) \ \text{d}\tilde{x}\]
intuitive slogan: probability of at least as extreme outcomes
for an exact test we get:
\[ p(x ; H_0, X, P(\cdot \mid H_0)) = \int_{\left\{ \tilde{x} \in X \ \mid \ P(\tilde{x} \mid H_0) \le P(x \mid H_0) \right\}} P(\tilde{x} \mid H_0) \ \text{d}\tilde{x}\]
intuitive slogan: probability of at least as unlikely outcomes
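for a discrete sampling distribution, this definition can be computed directly by summing the mass of all outcomes at most as likely as the observed one; a sketch in R for 7 successes in 24 fair coin flips (the small tolerance factor guards against floating-point ties, mirroring what `binom.test` does internally):

```r
# exact two-sided p-value: total probability mass of all outcomes
# no more likely under H0 than the observed outcome
n <- 24; theta <- 0.5; k_obs <- 7
probs <- dbinom(0:n, size = n, prob = theta)
p_exact <- sum(probs[probs <= dbinom(k_obs, n, theta) * (1 + 1e-7)])
p_exact
## [1] 0.06391466
```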
notation: \(\Delta(X)\) – set of all probability measures over \(X\)
fair coin?
\[ B(k ; n = 24, \theta = 0.5) = \binom{n}{k} \theta^{k} \, (1-\theta)^{n-k} \]
binom.test(7,24)
## 
##  Exact binomial test
## 
## data:  7 and 24
## number of successes = 7, number of trials = 24, p-value = 0.06391
## alternative hypothesis: true probability of success is not equal to 0.5
## 95 percent confidence interval:
##  0.1261521 0.5109478
## sample estimates:
## probability of success 
##              0.2916667
binom.test(7,24)$p.value
## [1] 0.06391466
fix a significance level, e.g.: \(0.05\)
we say that a test result is significant iff the \(p\)-value is below the pre-determined significance level
we reject the null hypothesis in case of significant test results
the significance level thereby determines the \(\alpha\)-error of falsely rejecting the null hypothesis
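putting the pieces together for the fair-coin example, a sketch of the decision rule at significance level \(0.05\):

```r
# decision rule: reject H0 iff the p-value falls below alpha
alpha <- 0.05
p <- binom.test(7, 24)$p.value
p < alpha
## [1] FALSE
```

the test result is not significant, so we do not reject the null hypothesis that the coin is fair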