Theoretical and empirical distributions of the p value

J. S. Butler, Peter Jones

Research output: Contribution to journal, Article, peer-review


The use of p values in null hypothesis statistical tests (NHST) has long been controversial in applied statistics, owing to several problems: arbitrary levels of Type I error, failure to trade off Type I and Type II error, misunderstanding of p values, failure to report effect sizes, and neglect of better ways to report estimates of policy impacts, such as effect sizes, interpreted confidence intervals, and conditional frequentist tests. This paper analyzes the theory of p values and summarizes the problems with NHST. Using a large data set of public school districts in the United States, we demonstrate empirically the unreliability of p values and hypothesis tests, as the theory predicts. We offer specific suggestions for reporting policy research.
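The sampling variability of the p value that the abstract refers to can be illustrated with a small simulation. The following is a minimal sketch and is not taken from the paper: it assumes a two-sided z-test with known unit variance, and the sample size, effect size, replication count, and function names (`two_sided_p`, `simulate_p_values`, `share_below`) are all hypothetical choices made for illustration.

```python
import math
import random


def two_sided_p(sample, mu0=0.0, sigma=1.0):
    """Two-sided z-test p value for H0: mean == mu0, assuming sigma is known."""
    n = len(sample)
    z = (sum(sample) / n - mu0) / (sigma / math.sqrt(n))
    # Standard normal CDF evaluated at |z|, via the error function.
    cdf = 0.5 * (1.0 + math.erf(abs(z) / math.sqrt(2.0)))
    return 2.0 * (1.0 - cdf)


def simulate_p_values(true_mean, n=30, reps=5000, seed=0):
    """Draw `reps` samples of size n from N(true_mean, 1); test H0: mean == 0 on each."""
    rng = random.Random(seed)
    return [
        two_sided_p([rng.gauss(true_mean, 1.0) for _ in range(n)])
        for _ in range(reps)
    ]


def share_below(p_values, alpha=0.05):
    """Fraction of simulated p values that fall below alpha."""
    return sum(p < alpha for p in p_values) / len(p_values)


# Illustrative settings only: a true null, and a modest true effect of 0.3.
null_p = simulate_p_values(true_mean=0.0)
alt_p = simulate_p_values(true_mean=0.3)

print("share of p < 0.05 under H0:", share_below(null_p))
print("share of p < 0.05 under the alternative:", share_below(alt_p))
```

Under these assumptions the p value is approximately uniform on [0, 1] when the null is true, so close to 5% of replications fall below 0.05. With a real but modest effect, the same design yields p values that range from very small to large, and fewer than half of the replications reach significance.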

Original language: English
Pages (from-to): 1-30
Number of pages: 30
Issue number: 1
State: Published - Apr 1 2018

Bibliographical note

Publisher Copyright:
© 2017, Sapienza Università di Roma.


Keywords

  • Education finance
  • Null hypothesis statistical tests (NHST)
  • p values

ASJC Scopus subject areas

  • Statistics and Probability

