8.2 p-value

In some sense the p-value offers a first defense line against being fooled by randomness, separating signal from noise.

Yoav Benjamini, 2016

  • Roughly speaking, p-values tell us how surprising the observed data is, assuming there is no effect.
  • The p-value is defined as the conditional probability of getting a test statistic as extreme as, or more extreme than, the one actually observed, given that the null hypothesis is true.

\[ p\text{-value} \stackrel{\text{def}}{=} \text{P}(\text{data as extreme or more extreme than observed} \mid \text{$H_0$ is true}) \]

  • It is not the probability that the null hypothesis is true given the data.

\[ p\text{-value} \neq \text{P}(\text{$H_0$ is true} \mid \text{data}) \]

We suspect a coin is biased towards heads.

\[ \small{ \begin{aligned} &\text{$H_0$: the coin is fair ($p=0.5$).} \\ &\text{$H_a$: the coin is biased towards heads ($p>0.5$).} \\ \end{aligned} } \]

We tossed the coin 10 times and got 10 heads.

\[ \small{ \begin{aligned} \text{$p$-value}&=\text{P}(\text{get 10 heads $\mid$ $H_0$ is true}) \\ &={10 \choose 10}0.5^{10} \approx 0.001 \\ \end{aligned} } \]

Assuming \(H_0\) is true, it is highly unlikely to get 10 heads.

Since we indeed got 10 heads, the data are strong evidence against \(H_0\).

\[ \small{ \begin{aligned} &\text{$H_0$: the coin is fair ($p=0.5$).} \\ &\text{$H_a$: the coin is biased towards heads ($p>0.5$).} \\ \end{aligned} } \]

We tossed the coin 10 times and got 9 heads.

\[ \small{ \begin{aligned} \text{$p$-value}=\;&\text{P}(\text{get 9 heads $\mid$ $H_0$ is true})\;+ \;\color{gray}{\leftarrow\text{as extreme}} \\ &\text{P}(\text{get 10 heads $\mid$ $H_0$ is true}) \;\;\;\color{gray}{\leftarrow\text{more extreme}} \\ =\;&{10 \choose 9}0.5^{10}+{10 \choose 10}0.5^{10} \approx 0.011 \\ \end{aligned} } \]

Assuming \(H_0\) is true, it is unlikely to get 9 or more heads.

Since we indeed got 9 heads, the data are strong evidence against \(H_0\).
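The two calculations above follow the same recipe: add up the binomial probabilities of the observed head count and of every larger head count, all computed under \(H_0\). A minimal Python sketch of that recipe is given below; the function name `upper_tail_p_value` and its parameters are illustrative choices, not something defined in these notes.

```python
from math import comb


def upper_tail_p_value(heads: int, tosses: int, null_p: float = 0.5) -> float:
    """P(X >= heads) for X ~ Binomial(tosses, null_p), i.e. the one-sided
    p-value when the alternative is 'biased towards heads'."""
    return sum(
        comb(tosses, k) * null_p**k * (1 - null_p) ** (tosses - k)
        for k in range(heads, tosses + 1)
    )


print(round(upper_tail_p_value(10, 10), 3))  # 10 heads: 0.001
print(round(upper_tail_p_value(9, 10), 3))   # 9 heads:  0.011
```

The exact values are 1/1024 and 11/1024, which round to the 0.001 and 0.011 quoted in the text.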

\[ \small{ \begin{aligned} &\text{$H_0$: the coin is fair ($p=0.5$).} \\ &\text{$H_a$: the coin is biased towards heads ($p>0.5$).} \\ \end{aligned} } \]

We tossed the coin 10 times and got 8 heads.

\[ \small{ \begin{aligned} p\text{-value}=\;&\text{P}(\text{got 8 heads $\mid$ $H_0$ is true})\;+ \;\color{gray}{\leftarrow\text{as extreme}} \\ &\text{P}(\text{got 9 heads $\mid$ $H_0$ is true})\;+ \;\color{gray}{\leftarrow\text{more extreme}} \\ &\text{P}(\text{got 10 heads $\mid$ $H_0$ is true}) \;\;\;\color{gray}{\leftarrow\text{more extreme}} \\ =\;&{10 \choose 8}0.5^{10}+{10 \choose 9}0.5^{10}+{10 \choose 10}0.5^{10} \approx 0.055 \\ \end{aligned} } \]

Assuming \(H_0\) is true, it is unlikely (?) to get 8 or more heads.

We tossed the coin 10 times and got 7 heads.

\[ \small{ \begin{aligned} p\text{-value}=\;&\text{P}(\text{got 7 heads $\mid$ $H_0$ is true})\;+ \;\color{gray}{\leftarrow\text{as extreme}} \\ &\text{P}(\text{got 8 heads $\mid$ $H_0$ is true})\;+ \;\color{gray}{\leftarrow\text{more extreme}} \\ &\text{P}(\text{got 9 heads $\mid$ $H_0$ is true})\;+ \;\color{gray}{\leftarrow\text{more extreme}} \\ &\text{P}(\text{got 10 heads $\mid$ $H_0$ is true}) \;\;\;\color{gray}{\leftarrow\text{more extreme}} \\ =\;&{10 \choose 7}0.5^{10}+{10 \choose 8}0.5^{10}+{10 \choose 9}0.5^{10}+{10 \choose 10}0.5^{10} \\ \approx\;&0.172 \\ \end{aligned} } \]

Assuming \(H_0\) is true, it is not unlikely to get 7 or more heads.

Getting 7 heads is not enough evidence to reject \(H_0\).

We tossed the coin 10 times and got 6 heads.

\[ \small{ \begin{aligned} p\text{-value}=\;&\text{P}(\text{got 6 heads $\mid$ $H_0$ is true})\;+ \;\color{gray}{\leftarrow\text{as extreme}} \\ &\text{P}(\text{got 7 heads $\mid$ $H_0$ is true})\;+ \;\color{gray}{\leftarrow\text{more extreme}} \\ &\text{P}(\text{got 8 heads $\mid$ $H_0$ is true})\;+ \;\color{gray}{\leftarrow\text{more extreme}} \\ &\text{P}(\text{got 9 heads $\mid$ $H_0$ is true})\;+ \;\color{gray}{\leftarrow\text{more extreme}} \\ &\text{P}(\text{got 10 heads $\mid$ $H_0$ is true}) \;\;\;\color{gray}{\leftarrow\text{more extreme}} \\ =\;&{10 \choose 6}0.5^{10}+{10 \choose 7}0.5^{10}+\cdots+{10 \choose 10}0.5^{10} \\ \approx\;&0.377 \\ \end{aligned} } \]

Assuming \(H_0\) is true, it is not unlikely to get 6 or more heads. Do not reject \(H_0\).
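The five cases (10, 9, 8, 7 and 6 heads) differ only in where the sum starts, so they can be reproduced with one running total that begins at the most extreme outcome and works downwards. The sketch below does that for 10 tosses of a fair coin; the variable names are illustrative.

```python
from math import comb

TOSSES = 10

# Probability of each head count under H0 (fair coin): C(10, k) * 0.5^10.
pmf = [comb(TOSSES, k) * 0.5**TOSSES for k in range(TOSSES + 1)]

# Accumulate from 10 heads downwards, so that
# p_values[k] = P(k or more heads | H0).
p_values = {}
running_total = 0.0
for k in range(TOSSES, -1, -1):
    running_total += pmf[k]
    p_values[k] = running_total

for k in range(10, 5, -1):
    print(f"{k} heads: p-value = {p_values[k]:.3f}")
# 10 heads: p-value = 0.001
# 9 heads: p-value = 0.011
# 8 heads: p-value = 0.055
# 7 heads: p-value = 0.172
# 6 heads: p-value = 0.377
```

Each step down adds one more term to the sum, which is why the p-value grows as the observed result becomes less extreme.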

Decision rule based on p-value

First we specify a significance level \(\alpha\), the largest type I error probability we are willing to tolerate.

Then,

\[ \small{ \begin{aligned} &\text{Reject $H_0$ if $p\text{-value} < \alpha$}, \\ \\ &\text{Fail to reject $H_0$ if $p\text{-value} \geq \alpha$}. \\ \end{aligned} } \]
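As a small illustration, the rule can be applied to the p-values from the coin examples. The sketch below assumes \(\alpha = 0.05\), a common convention rather than a value fixed by the rule itself, and the helper name `decide` is hypothetical.

```python
def decide(p_value: float, alpha: float = 0.05) -> str:
    """Apply the decision rule: reject H0 only if p-value < alpha."""
    return "reject H0" if p_value < alpha else "fail to reject H0"


# Rounded p-values from the worked examples (heads out of 10 tosses).
examples = {10: 0.001, 9: 0.011, 8: 0.055, 7: 0.172, 6: 0.377}

for heads, p in examples.items():
    print(f"{heads} heads: p-value = {p} -> {decide(p)}")
```

With this choice of \(\alpha\), 9 or 10 heads lead to rejection, while 8 heads (p-value about 0.055) falls just short, which settles the borderline case once the threshold is fixed in advance.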

We suspect a coin is biased (towards either heads or tails).

\[ \small{ \begin{aligned} &\text{$H_0$: the coin is fair ($p=0.5$).} \\ &\text{$H_a$: the coin is biased ($p \neq 0.5$).} \\ \end{aligned} } \]

We tossed the coin 10 times and got 9 heads.

\[ \small{ \begin{aligned} \text{$p$-value} =\;&\text{P}(\text{got 9 heads $\mid$ $H_0$ is true})+ \;\color{gray}{\leftarrow\text{as extreme}} \\ &\text{P}(\text{got 9}\color{red}{\text{ tails }} \text{$\mid$ $H_0$ is true})+ \;\color{gray}{\leftarrow\text{as extreme}} \\ &\text{P}(\text{got 10 heads $\mid$ $H_0$ is true})+ \;\color{gray}{\leftarrow\text{more extreme}} \\ &\text{P}(\text{got 10}\color{red}{\text{ tails }} \text{$\mid$ $H_0$ is true}) \;\color{gray}{\leftarrow\text{more extreme}} \\ =\;&2\left[{10 \choose 9}0.5^{10}+{10 \choose 10}0.5^{10}\right] \approx 0.021 \\ \end{aligned} } \]

Assuming \(H_0\) is true, it is unlikely to get 9 or more heads, or 9 or more tails.

Since we indeed got 9 heads, the data are evidence against \(H_0\).
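For the two-sided alternative, "as extreme or more extreme" means at least as far from the expected 5 heads in either direction. The sketch below encodes that reading for a fair-coin null. The function name `two_sided_p_value` is illustrative, and measuring extremeness by distance from the centre is an assumption that suits the symmetric case \(p = 0.5\).

```python
from math import comb


def two_sided_p_value(heads: int, tosses: int) -> float:
    """Two-sided p-value for H0: fair coin (p = 0.5).

    Sums the probabilities of every head count at least as far from
    tosses / 2 as the observed count, in either direction.
    """
    center = tosses / 2
    observed_distance = abs(heads - center)
    return sum(
        comb(tosses, k) * 0.5**tosses
        for k in range(tosses + 1)
        if abs(k - center) >= observed_distance
    )


print(round(two_sided_p_value(9, 10), 4))  # 0.0215
```

The outcomes counted are 0, 1, 9 and 10 heads, giving exactly 22/1024, twice the one-sided value of 11/1024.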

What does it mean if p-value > 0.05?

  • p-value > 0.05 doesn’t mean the null hypothesis is true.
  • It simply means that we do not have enough evidence to reject the null hypothesis.
  • There may be an effect that more data would reveal.

Absence of evidence is not evidence of absence.
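To make this concrete, suppose the coin really is biased, with a hypothetical true heads probability of 0.6, and we run the one-sided test at \(\alpha = 0.05\). The sketch below computes exactly how often the test would reject \(H_0\) for a given number of tosses. The value 0.6, the sample sizes and the function names are assumptions chosen for illustration.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def rejection_probability(n: int, true_p: float, alpha: float = 0.05) -> float:
    """Chance that the one-sided test of H0: p = 0.5 rejects at level alpha
    when the coin's real heads probability is true_p."""
    p_value = 0.0      # P(k or more heads | H0), built up from k = n downwards
    prob_reject = 0.0  # probability, under true_p, of landing on a rejecting k
    for k in range(n, -1, -1):
        p_value += binom_pmf(k, n, 0.5)
        if p_value >= alpha:
            break  # this and all smaller head counts fail to reject
        prob_reject += binom_pmf(k, n, true_p)
    return prob_reject


for n in (10, 100):
    print(n, round(rejection_probability(n, true_p=0.6), 3))
```

With only 10 tosses the test rejects in under 5% of experiments even though the coin is biased, so a p-value above 0.05 is the usual outcome. With 100 tosses the same bias is detected far more often.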