Code of Statistical Practice

The ESTDatS Lab advocates interpreting statistical research findings in the context in which both the data and the scientific questions arise. For example, there is almost never any context in which a hypothesis test at a 5% significance level is justified when the study involves, say, n=3 or n=10^6 independent observations.
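To make the sample-size point concrete, here is a minimal sketch, not the Lab's own method. It assumes a two-sided one-sample z-test with known standard deviation, and the function name, effect sizes, and sample sizes are illustrative choices. It shows how a fixed 5% cutoff can miss a large effect in a very small sample yet flag a negligible effect in a very large one.

```python
import math

def two_sided_z_pvalue(effect, sigma, n):
    """Two-sided p-value for H0: mean = 0 when the observed sample mean is `effect`.

    Assumes n independent observations with known standard deviation `sigma`.
    """
    z = effect / (sigma / math.sqrt(n))
    # For a standard normal Z, P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical scenario A: a large observed effect (one full sigma), tiny sample.
p_tiny_sample = two_sided_z_pvalue(effect=1.0, sigma=1.0, n=3)

# Hypothetical scenario B: a negligible observed effect (1% of sigma), huge sample.
p_huge_sample = two_sided_z_pvalue(effect=0.01, sigma=1.0, n=1_000_000)

print(f"n=3,         effect=1.00 sigma: p = {p_tiny_sample:.3g}")
print(f"n=1,000,000, effect=0.01 sigma: p = {p_huge_sample:.3g}")
```

Under these assumptions, the large effect is not significant at the 5% level while the negligible one is, which is why the level should be chosen with the study's context and sample size in mind.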

While the proper application of statistical principles may require a deep understanding of all types of data structures and the scientific context at hand, the improper usage of p-values and associated hypothesis testing is more straightforward to avoid: 

1. Which is H0, and which is Ha?

  1. H0 represents the status quo, and should remain so unless your data indicate otherwise. Thus, hypothesis testing is all about Ha — you use data as evidence to add weight to Ha, but this evidence may or may not be enough to topple the status quo.
  2. Disagreement between data and H0 alone may not be enough to reject H0. (a) Disagreement between data and H0 and (b) agreement between data and Ha together form the basis of rejecting H0.
  3. It is statistically illegal to construct H0 and Ha based on the current data which will be used again to conduct the test — double-dipping# may constitute scientific fraud.
  4. When in doubt, ask a professional statistician for advice.

2. At what significance level?

  1. The level α (alpha) must be prespecified prior to inspecting the data,@ and should be context specific.
  2. When in doubt, ask a professional statistician for advice. 

3. But scientifically I know H0 should be rejected even if my data tell me otherwise. Now what?

  1. In light of item 1.3, the statistically legal approach is to construct a new pair of H0 and Ha (possibly based on the current data), collect brand new data, test the new H0 and Ha with the brand new data, and finally present both sets of test results in the same report. Cherry picking& may constitute scientific fraud.
  2. Bayesian statisticians typically don’t employ hypothesis tests, because Bayesian inference can automatically integrate presumed scenarios into the estimator, and uncertainty is communicated as probabilities. Ask a professional Bayesian for advice.
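As a rough illustration of what "uncertainty is communicated as probabilities" can look like, the sketch below uses a Beta-Binomial model. It is only one simple Bayesian setup, and the uniform prior, the counts (7 successes, 3 failures), the 0.5 threshold, and the function name are all assumptions made for illustration. The posterior probability is approximated by Monte Carlo draws from the standard library.

```python
import random

def posterior_prob_greater(successes, failures, threshold=0.5,
                           prior_a=1.0, prior_b=1.0,
                           draws=100_000, seed=0):
    """Approximate P(theta > threshold | data) under a Beta(prior_a, prior_b) prior.

    With binomial data, the posterior is Beta(prior_a + successes, prior_b + failures).
    """
    rng = random.Random(seed)
    a = prior_a + successes
    b = prior_b + failures
    hits = sum(rng.betavariate(a, b) > threshold for _ in range(draws))
    return hits / draws

# Hypothetical data: 7 successes and 3 failures, uniform Beta(1, 1) prior.
prob = posterior_prob_greater(successes=7, failures=3)
print(f"Posterior probability that theta exceeds 0.5: {prob:.3f}")
```

The result is a direct probability statement about the parameter given the data and the prior, rather than a reject or fail-to-reject decision. A different prior would change the number, which is one reason to seek a professional Bayesian's advice.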

# Delve deeper into the garden of forking paths and dangers of multiple inference.
@ Delve deeper into the dangers of data snooping.
& Delve deeper into the so-called file-drawer problem.
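The double-dipping and data-snooping warnings above can be checked with a small simulation. This is a sketch under stated assumptions, not a definitive analysis: every group is drawn from the same normal distribution, so H0 is true everywhere; the test is a two-sided z-test with known standard deviation; and the number of groups, sample size, and function names are hypothetical choices.

```python
import math
import random

def z_pvalue(sample, sigma=1.0):
    """Two-sided z-test p-value for H0: mean = 0, with known sigma."""
    n = len(sample)
    z = (sum(sample) / n) / (sigma / math.sqrt(n))
    return math.erfc(abs(z) / math.sqrt(2))

def simulate(reps=2000, groups=10, n=20, alpha=0.05, seed=0):
    """Return false-positive rates for a prespecified vs. a data-chosen hypothesis."""
    rng = random.Random(seed)
    prespecified = 0
    double_dipped = 0
    for _ in range(reps):
        # Every group has true mean 0, so any rejection is a false positive.
        data = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(groups)]
        # Prespecified: the group to test is fixed before seeing any data.
        if z_pvalue(data[0]) < alpha:
            prespecified += 1
        # Double-dipping: pick the most extreme-looking group, then test it
        # on the very same data that suggested it.
        chosen = max(data, key=lambda g: abs(sum(g)))
        if z_pvalue(chosen) < alpha:
            double_dipped += 1
    return prespecified / reps, double_dipped / reps

prespecified_rate, double_dipped_rate = simulate()
print(f"False-positive rate, prespecified hypothesis: {prespecified_rate:.3f}")
print(f"False-positive rate, data-chosen hypothesis:  {double_dipped_rate:.3f}")
```

In this setup the prespecified test should reject at roughly the nominal 5% rate, while the data-chosen test rejects far more often even though nothing is going on. Testing the data-chosen hypothesis on freshly collected data, as item 3.1 recommends, avoids this inflation.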

 


 American Statistical Association’s official statement on the use of p-values

   
Link to article titled 'American Statistical Association Releases Statement on Statistical Significance and P-values - Provides Principles to Improve the Conduct and Interpretation of Quantitative Science' (2016-03-07)


 

The American Statistician special issue on moving beyond p-values

   
Link to The American Statistician (TAS) special issue on moving beyond p-values (2019)

 

 

Journals' and authors' disdain for p-values and statistical significance

 
Link to article titled 'Rewriting results sections in the language of evidence' from the journal Trends in Ecology & Evolution (2022)
  
Link to article titled 'Scientists rise up against statistical significance' from the journal Nature (2019)
   
Link to article titled 'P value ban: small step for a journal, giant leap for science - Editors reject flawed system of null hypothesis testing' from Science News (2015)
   
Link to article titled 'Statistics: P values are just the tip of the iceberg' from the journal Nature (2015)
   
Link to article titled 'Scientific method: Statistical errors' from the journal Nature (2014)

 


  Examples of statistical misuse or statistical abuse?

 

Examples of scientific fraud or statistical misuse?

  • COVID-19 research (1, 2)
  • cancer research (1, 2)