In some sense the p-value offers a first defense line against being fooled by randomness, separating signal from noise.
Yoav Benjamini, 2016
\[ p\text{-value} \stackrel{\text{def}}{=} \text{P}(\text{data at least as extreme as observed} \mid \text{$H_0$ is true}) \]
\[ p\text{-value} \neq \text{P}(\text{$H_0$ is true} \mid \text{data}) \]
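One way to see why the two quantities differ is Bayes' rule, written here in the same notation. The prior \(\text{P}(\text{$H_0$ is true})\) is an ingredient that the \(p\)-value calculation never uses, so the \(p\)-value alone cannot give the probability that \(H_0\) is true.
\[ \text{P}(\text{$H_0$ is true} \mid \text{data}) = \frac{\text{P}(\text{data} \mid \text{$H_0$ is true})\,\text{P}(\text{$H_0$ is true})}{\text{P}(\text{data})} \]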
We suspect a coin is biased towards heads.
\[ \small{ \begin{aligned} &\text{$H_0$: the coin is fair ($p=0.5$).} \\ &\text{$H_a$: the coin is biased towards heads ($p>0.5$).} \\ \end{aligned} } \]
We tossed the coin 10 times and got 10 heads.
\[ \small{ \begin{aligned} \text{$p$-value}&=\text{P}(\text{get 10 heads $\mid$ $H_0$ is true}) \\ &={10 \choose 10}0.5^{10} \approx 0.001 \\ \end{aligned} } \]
Assuming \(H_0\) is true, it is highly unlikely to get 10 heads.
Since we indeed got 10 heads, the data give strong evidence against \(H_0\).
\[ \small{ \begin{aligned} &\text{$H_0$: the coin is fair ($p=0.5$).} \\ &\text{$H_a$: the coin is biased towards heads ($p>0.5$).} \\ \end{aligned} } \]
We tossed the coin 10 times and got 9 heads.
\[ \small{ \begin{aligned} \text{$p$-value}=\;&\text{P}(\text{get 9 heads $\mid$ $H_0$ is true})\;+ \;\color{gray}{\leftarrow\text{as extreme}} \\ &\text{P}(\text{get 10 heads $\mid$ $H_0$ is true}) \;\;\;\color{gray}{\leftarrow\text{more extreme}} \\ =\;&{10 \choose 9}0.5^{10}+{10 \choose 10}0.5^{10} \approx 0.011 \\ \end{aligned} } \]
Assuming \(H_0\) is true, it is unlikely to get 9 or more heads.
Since we indeed got 9 heads, the data give evidence against \(H_0\).
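The "as extreme or more extreme" sum can be checked numerically. The following is a minimal sketch using only the Python standard library. The function name `upper_tail_p_value` and its parameters are illustrative choices, not part of the original material.

```python
from math import comb

def upper_tail_p_value(heads, tosses=10, p0=0.5):
    """P(at least `heads` heads in `tosses` tosses) when P(heads) = p0."""
    return sum(
        comb(tosses, k) * p0**k * (1 - p0) ** (tosses - k)
        for k in range(heads, tosses + 1)
    )

print(round(upper_tail_p_value(10), 3))  # 10 heads         -> 0.001
print(round(upper_tail_p_value(9), 3))   # 9 or more heads  -> 0.011
```

Each term in the sum is one of the "as extreme" or "more extreme" outcomes under \(H_0\).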
\[ \small{ \begin{aligned} &\text{$H_0$: the coin is fair ($p=0.5$).} \\ &\text{$H_a$: the coin is biased towards heads ($p>0.5$).} \\ \end{aligned} } \]
We tossed the coin 10 times and got 8 heads.
\[ \small{ \begin{aligned} p\text{-value}=\;&\text{P}(\text{got 8 heads $\mid$ $H_0$ is true})\;+ \;\color{gray}{\leftarrow\text{as extreme}} \\ &\text{P}(\text{got 9 heads $\mid$ $H_0$ is true})\;+ \;\color{gray}{\leftarrow\text{more extreme}} \\ &\text{P}(\text{got 10 heads $\mid$ $H_0$ is true}) \;\;\;\color{gray}{\leftarrow\text{more extreme}} \\ =\;&{10 \choose 8}0.5^{10}+{10 \choose 9}0.5^{10}+{10 \choose 10}0.5^{10} \approx 0.055 \\ \end{aligned} } \]
Assuming \(H_0\) is true, it is unlikely (?) to get 8 or more heads.
We tossed the coin 10 times and got 7 heads.
\[ \small{ \begin{aligned} p\text{-value}=\;&\text{P}(\text{got 7 heads $\mid$ $H_0$ is true})\;+ \;\color{gray}{\leftarrow\text{as extreme}} \\ &\text{P}(\text{got 8 heads $\mid$ $H_0$ is true})\;+ \;\color{gray}{\leftarrow\text{more extreme}} \\ &\text{P}(\text{got 9 heads $\mid$ $H_0$ is true})\;+ \;\color{gray}{\leftarrow\text{more extreme}} \\ &\text{P}(\text{got 10 heads $\mid$ $H_0$ is true}) \;\;\;\color{gray}{\leftarrow\text{more extreme}} \\ =\;&{10 \choose 7}0.5^{10}+{10 \choose 8}0.5^{10}+{10 \choose 9}0.5^{10}+{10 \choose 10}0.5^{10} \\ \approx\;&0.172 \\ \end{aligned} } \]
Assuming \(H_0\) is true, it is not unlikely to get 7 or more heads.
Getting 7 heads is not enough evidence to reject \(H_0\).
We tossed the coin 10 times and got 6 heads.
\[ \small{ \begin{aligned} p\text{-value}=\;&\text{P}(\text{got 6 heads $\mid$ $H_0$ is true})\;+ \;\color{gray}{\leftarrow\text{as extreme}} \\ &\text{P}(\text{got 7 heads $\mid$ $H_0$ is true})\;+ \;\color{gray}{\leftarrow\text{more extreme}} \\ &\text{P}(\text{got 8 heads $\mid$ $H_0$ is true})\;+ \;\color{gray}{\leftarrow\text{more extreme}} \\ &\text{P}(\text{got 9 heads $\mid$ $H_0$ is true})\;+ \;\color{gray}{\leftarrow\text{more extreme}} \\ &\text{P}(\text{got 10 heads $\mid$ $H_0$ is true}) \;\;\;\color{gray}{\leftarrow\text{more extreme}} \\ =\;&{10 \choose 6}0.5^{10}+{10 \choose 7}0.5^{10}+\cdots+{10 \choose 10}0.5^{10} \\ \approx\;&0.377 \\ \end{aligned} } \]
Assuming \(H_0\) is true, it is not unlikely to get 6 or more heads. Do not reject \(H_0\).
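The five \(p\)-values can also be read as long-run frequencies: if a fair coin were tossed 10 times over and over, what fraction of the runs would show at least that many heads? The sketch below estimates this by simulation. The function name, the number of trials, and the fixed seed are assumptions made for illustration, and the estimates only approximate the exact values.

```python
import random

def simulate_upper_tail(heads, tosses=10, trials=100_000, seed=0):
    """Estimate P(at least `heads` heads) for a fair coin by simulation."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    hits = 0
    for _ in range(trials):
        # Each of the `tosses` random bits is one fair toss (1 = heads).
        n_heads = bin(rng.getrandbits(tosses)).count("1")
        if n_heads >= heads:
            hits += 1
    return hits / trials

for k in (10, 9, 8, 7, 6):
    print(k, round(simulate_upper_tail(k), 3))
```

The printed estimates should land close to the exact values 0.001, 0.011, 0.055, 0.172, and 0.377.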
First we specify a significance level \(\alpha\) (the largest type I error probability we are willing to tolerate).
Then,
\[ \small{ \begin{aligned} &\text{Reject $H_0$ if $p\text{-value} < \alpha$}, \\ \\ &\text{Fail to reject $H_0$ if $p\text{-value} \geq \alpha$}. \\ \end{aligned} } \]
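The decision rule is a single comparison, sketched below and applied to the rounded one-sided \(p\)-values from the coin examples. The value \(\alpha = 0.05\) is an assumption made for illustration (no particular level is fixed here), and `decide` is a hypothetical helper name.

```python
def decide(p_value, alpha=0.05):
    """Apply the rule above: reject H0 only when p-value < alpha."""
    return "reject H0" if p_value < alpha else "fail to reject H0"

# Rounded p-values from the one-sided examples (10, 9, 8, 7, 6 heads).
p_values = {10: 0.001, 9: 0.011, 8: 0.055, 7: 0.172, 6: 0.377}
for heads, p in p_values.items():
    print(f"{heads} heads: p = {p:.3f} -> {decide(p)}")
```

With this assumed \(\alpha\), 9 or 10 heads lead to rejection, while 8 heads (\(p \approx 0.055\)) falls just short.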
We suspect a coin is biased (towards either heads or tails).
\[ \small{ \begin{aligned} &\text{$H_0$: the coin is fair ($p=0.5$).} \\ &\text{$H_a$: the coin is biased ($p \neq 0.5$).} \\ \end{aligned} } \]
We tossed the coin 10 times and got 9 heads.
\[ \small{ \begin{aligned} \text{$p$-value} =\;&\text{P}(\text{got 9 heads $\mid$ $H_0$ is true})+ \;\color{gray}{\leftarrow\text{as extreme}} \\ &\text{P}(\text{got 9}\color{red}{\text{ tails }} \text{$\mid$ $H_0$ is true})+ \;\color{gray}{\leftarrow\text{as extreme}} \\ &\text{P}(\text{got 10 heads $\mid$ $H_0$ is true})+ \;\color{gray}{\leftarrow\text{more extreme}} \\ &\text{P}(\text{got 10}\color{red}{\text{ tails }} \text{$\mid$ $H_0$ is true}) \;\color{gray}{\leftarrow\text{more extreme}} \\ =\;&2\left[{10 \choose 9}0.5^{10}+{10 \choose 10}0.5^{10}\right] \approx 0.021 \\ \end{aligned} } \]
\[ \text{$p$-value} \approx 0.021 \]
Assuming \(H_0\) is true, it is unlikely to get 9 or more heads or 9 or more tails.
Since we indeed got 9 heads, the data give evidence against \(H_0\).
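For the two-sided test, outcomes in both tails count as "as extreme". The sketch below sums the probabilities of every head count at least as far from the expected 5 heads as the observed count. This way of defining "extreme" is an assumption that suits the symmetric null \(p = 0.5\), and `two_sided_p_value` is an illustrative name.

```python
from math import comb

def two_sided_p_value(heads, tosses=10):
    """Two-sided p-value for H0: fair coin (p = 0.5).

    Sums the probabilities of every head count at least as far from the
    expected count (tosses / 2) as the observed one.
    """
    observed_gap = abs(heads - tosses / 2)
    return sum(
        comb(tosses, k) * 0.5**tosses
        for k in range(tosses + 1)
        if abs(k - tosses / 2) >= observed_gap
    )

print(round(two_sided_p_value(9), 4))  # 9 heads out of 10 -> 0.0215
```

The exact value is \(22/1024\), twice the one-sided \(p\)-value for 9 heads.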
Absence of evidence is not evidence of absence.
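A small calculation shows why failing to reject \(H_0\) says little with only 10 tosses. Assuming \(\alpha = 0.05\), the one-sided test rejects only when we see 9 or 10 heads. The sketch below computes how often that happens for a fair coin and for a hypothetical biased coin with \(p = 0.7\). Both the level and the value 0.7 are assumptions chosen for illustration.

```python
from math import comb

def rejection_probability(true_p, tosses=10, min_heads=9):
    """P(at least `min_heads` heads) when the coin's true P(heads) is `true_p`."""
    return sum(
        comb(tosses, k) * true_p**k * (1 - true_p) ** (tosses - k)
        for k in range(min_heads, tosses + 1)
    )

print(round(rejection_probability(0.5), 3))  # fair coin             -> 0.011
print(round(rejection_probability(0.7), 3))  # hypothetical p = 0.7  -> 0.149
```

Under these assumptions the biased coin would escape detection about 85% of the time, so a non-significant result is not evidence that the coin is fair.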
