[Image: Table 1 of Cassidy et al. (2019) — categories of fallacious descriptions of 'statistical significance']

The meaning of ‘statistical significance’ and of p-values

A 2019 paper in Advances in Methods and Practices in Psychological Science found that most psychology textbooks, instructors, and students misinterpret ‘statistical significance’ and p-values. Talk about a headline! More important than the headline, however, are the right interpretations and what we can do to correct widespread misinterpretations. In this post, I explain the authors’ findings and the three solutions they propose.

The Bad News

Turns out 89% of introductory psychology textbooks described statistical significance incorrectly (Cassidy et al. 2019). So did 100% of psychology undergraduates, 80% of methodology instructors, and 90% of scientific psychologists (Haller & Krauss 2002). Yikes!

So what’s the right description of a p-value?

The Meaning of ‘Statistical Significance’ & P-values

Can the meaning of ‘statistical significance’ be derived from its parts, ‘statistical’ and ‘significance’? No. It’s more technical than that. So what’s the technical definition?

Here’s a jargony definition of “statistically significant” (e.g., p < 0.05):

Assuming that the null hypothesis is true and the study is repeated an infinite number of times by drawing random samples from the same population(s), less than 5% of these results will be more extreme than the current result

Cassidy and colleagues’ adaptation of Klein 2013, p. 75

In other words, a p-value is the probability of obtaining a result at least as extreme as the one observed, if the null hypothesis were true.

This explains why smaller p-values are treated as reasons to reject null hypotheses: a small p-value means that a result as extreme as the observed one would be improbable if the null hypothesis were true.
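To make the repeated-sampling idea concrete, here is a minimal simulation sketch (my own illustration, not from Cassidy et al.): it generates one hypothetical study, then re-runs the study many times with the null hypothesis true and counts how often the result is at least as extreme as the observed one. The sample size, effect size, and seed are arbitrary assumptions.

# A minimal sketch of the repeated-sampling definition of a p-value.
# All numbers below (sample size, effect size, seed) are made up for illustration.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# One hypothetical study: 30 scores, testing the null hypothesis that the
# population mean is 0 with a two-sided one-sample t-test.
observed = rng.normal(loc=0.45, scale=1.0, size=30)
t_obs, p_analytic = stats.ttest_1samp(observed, popmean=0.0)

# "Repeat" the study many times with the null hypothesis true (mean = 0) and
# count how often the t-statistic is at least as extreme as the observed one.
n_repeats = 100_000
null_data = rng.normal(loc=0.0, scale=1.0, size=(n_repeats, 30))
t_null, _ = stats.ttest_1samp(null_data, popmean=0.0, axis=1)
p_simulated = np.mean(np.abs(t_null) >= abs(t_obs))

print(f"analytic p-value:  {p_analytic:.4f}")
print(f"simulated p-value: {p_simulated:.4f}")  # the two should roughly agree

The simulated proportion and the analytic p-value should converge on the same number, which is exactly what the definition above describes.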

It may be tempting to infer more than this from a p-value, especially if it’s smaller than conventional thresholds (e.g., 0.05). However, you might want to read to the end to save yourself from drawing an embarrassing conclusion.
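For example, one tempting (but fallacious) reading is that p < 0.05 means there is less than a 5% chance that the null hypothesis is true. A quick back-of-the-envelope sketch shows why that doesn’t follow; the base rate and statistical power below are made-up assumptions for illustration, not figures from Cassidy et al.

# Why p < .05 does not mean "the null hypothesis has a < 5% chance of being true."
# The base rate and power below are assumptions for illustration only.
alpha = 0.05           # chance of p < .05 when the null hypothesis is true
power = 0.80           # assumed chance of p < .05 when the null hypothesis is false
base_rate_null = 0.90  # assumed proportion of tested null hypotheses that are true

# Among all significant results, what fraction come from true null hypotheses?
sig_true_null = base_rate_null * alpha
sig_false_null = (1 - base_rate_null) * power
p_null_given_sig = sig_true_null / (sig_true_null + sig_false_null)

print(f"P(null is true | p < .05) = {p_null_given_sig:.2f}")  # 0.36 under these assumptions

In other words, how likely the null hypothesis is after a significant result depends on things a p-value alone doesn’t tell you, such as how often the hypotheses we test are false to begin with.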

P-value & Statistical Significance Fallacies

How do the correct descriptions of ‘statistical significance’ and p-values above differ from what we get from textbooks, methodology instructors, scientists, and their students?

Below are 8 categories of “fallacious” descriptions from Cassidy and colleagues’ Table 1 (2019). The 1st, 5th, and 6th fallacies are the ones I hear most often.

What Can You Do About It?

Now that you see how widespread the misunderstanding is and how simple the correct definition is, what should you do? Cassidy et al. give the following tips:

  • Fix the textbooks.
  • Remove discussions of the term from books.
  • Use their free teaching materials on the Open Science Framework: https://osf.io/qg9t2/

In addition to trying to spread the news in this post, I also teach about p-values in the probability section of my Logic course. If you’re not sure what you can do, you could share this post with someone you think might be interested in it.
