Code of Statistical Practice

The ESTDatS Lab advocates interpreting statistical research findings in the context in which both the data and the scientific questions arise. For example, there is almost never a context in which a hypothesis test at a 5% significance level is justified when the study involves, say, n=3 or n=10⁶ independent observations: with n=3 the test has very little power to detect even a large effect, and with n=10⁶ even a scientifically negligible effect can be declared statistically significant.
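The sample-size point can be illustrated with a small sketch. It assumes a two-sided z-test with a known standard deviation of 1; the two scenarios and the function names `two_sided_p` and `z_statistic` are hypothetical, chosen only for illustration and not taken from any particular study.

```python
import math


def two_sided_p(z: float) -> float:
    """Two-sided p-value for a z statistic under a standard normal null."""
    return math.erfc(abs(z) / math.sqrt(2))


def z_statistic(mean_diff: float, sigma: float, n: int) -> float:
    """z statistic when the sample mean differs from the H0 value by mean_diff."""
    return mean_diff / (sigma / math.sqrt(n))


# Hypothetical scenario A: an observed shift of one full standard deviation, n = 3.
p_small_n = two_sided_p(z_statistic(mean_diff=1.0, sigma=1.0, n=3))

# Hypothetical scenario B: an observed shift of 0.003 standard deviations, n = 10**6.
p_large_n = two_sided_p(z_statistic(mean_diff=0.003, sigma=1.0, n=10**6))

print(f"n = 3,     observed shift 1.0 SD:   p = {p_small_n:.4f}")
print(f"n = 10**6, observed shift 0.003 SD: p = {p_large_n:.4f}")
```

Under these assumptions the large shift at n=3 is not significant at the 5% level, while the negligible shift at n=10⁶ is. The same α therefore means very different things in the two contexts.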

Fisheries scientists Midway & Daugherty (2025) identify major pitfalls when reporting statistical results in science (not just fisheries), and discuss some preventative measures. For example, "[n]ever use a model you cannot understand."

While the proper application of statistical principles may require a deep understanding of all types of data structures and the scientific context at hand, the improper usage of p-values and associated hypothesis testing is more straightforward to avoid:

1. Which is H₀, which is H_a?

H₀ represents the status quo, and should remain so unless your data indicate otherwise. Thus, hypothesis testing is all about H_a — you use data as evidence to add weight to H_a, but this evidence may or may not be enough to topple the status quo.
Disagreement between data and H₀ alone may not be enough to reject H₀. (a) Disagreement between data and H₀ and (b) agreement between data and H_a together form the basis of rejecting H₀.
It is statistically illegal to construct H₀ and H_a from the same data that will then be used to conduct the test; such double-dipping^# may constitute scientific fraud.
When in doubt, ask a professional statistician for advice.
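One simple form of double-dipping can be simulated as a rough illustration. The sketch below assumes normally distributed data with a known standard deviation of 1 and a true mean of 0, so that H₀ is true in every simulated study. It compares a one-sided test whose direction was fixed in advance against one whose direction is chosen after looking at the sign of the sample mean. The function names, sample size, number of replications and seed are all hypothetical choices.

```python
import math
import random


def upper_tail_p(z: float) -> float:
    """P(Z >= z) for a standard normal Z."""
    return 0.5 * math.erfc(z / math.sqrt(2))


def rejection_rates(n=20, reps=20000, alpha=0.05, seed=1):
    """Simulate studies in which H0 (mean = 0) is true; return false-rejection rates."""
    rng = random.Random(seed)
    prespecified = 0
    double_dipped = 0
    for _ in range(reps):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        z = (sum(sample) / n) * math.sqrt(n)  # known sigma = 1
        # Prespecified alternative H_a: mean > 0, fixed before seeing the data.
        if upper_tail_p(z) < alpha:
            prespecified += 1
        # Double-dipped alternative: direction picked to match the observed sign.
        if upper_tail_p(abs(z)) < alpha:
            double_dipped += 1
    return prespecified / reps, double_dipped / reps


prespecified_rate, double_dipped_rate = rejection_rates()
print(f"false rejection rate, prespecified H_a:  {prespecified_rate:.3f}")
print(f"false rejection rate, data-chosen H_a:   {double_dipped_rate:.3f}")
```

In this setup the prespecified test rejects a true H₀ at roughly the nominal 5% rate, while the data-chosen test rejects at roughly twice that rate, even though both report the same α.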

2. At what significance level?

The level α (alpha) must be prespecified prior to inspecting the data,^@ and should be context specific.
When in doubt, ask a professional statistician for advice.

3. But scientifically I know H₀ should be rejected even if my data tell me otherwise. Now what?

In light of #1.3, the statistically legal approach is to construct a new pair of hypotheses H₀ and H_a (possibly based on the current data), collect brand new data, test the new H₀ and H_a with the brand new data, and finally report both sets of test results in the same report. Cherry picking^& may constitute scientific fraud.
Bayesian statisticians typically don’t employ hypothesis tests, because Bayesian inference can automatically integrate presumed scenarios into the estimator, and uncertainty is communicated as probabilities.⁺ Ask a professional Bayesian for advice.
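As a minimal sketch of the Bayesian alternative, the example below assumes a normal prior on an unknown mean and normally distributed data with a known standard deviation (the standard conjugate normal model). The prior settings, the data summary, and the function names `normal_posterior` and `prob_greater_than` are hypothetical and serve only to show how a presumed scenario enters through the prior and how uncertainty is reported as a probability.

```python
import math


def normal_posterior(prior_mean, prior_sd, sigma, n, sample_mean):
    """Posterior mean and sd of a normal mean with known sigma and a normal prior."""
    prior_precision = 1.0 / prior_sd**2
    data_precision = n / sigma**2
    post_precision = prior_precision + data_precision
    post_mean = (prior_precision * prior_mean
                 + data_precision * sample_mean) / post_precision
    post_sd = math.sqrt(1.0 / post_precision)
    return post_mean, post_sd


def prob_greater_than(threshold, mean, sd):
    """P(X > threshold) for X ~ Normal(mean, sd)."""
    return 0.5 * math.erfc((threshold - mean) / (sd * math.sqrt(2)))


# Hypothetical inputs: prior centered at 0 (the presumed scenario),
# then n = 10 observations with sample mean 0.5 and known sigma = 1.
post_mean, post_sd = normal_posterior(
    prior_mean=0.0, prior_sd=1.0, sigma=1.0, n=10, sample_mean=0.5
)
p_positive = prob_greater_than(0.0, post_mean, post_sd)

print(f"posterior mean = {post_mean:.3f}, posterior sd = {post_sd:.3f}")
print(f"P(mean > 0 | data) = {p_positive:.3f}")
```

The posterior mean is pulled from the sample mean toward the prior mean, and the result is stated directly as the probability that the mean is positive rather than as a reject or do-not-reject decision.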