## “A Litany of Problems With p-values”

Bayesian, likelihoodist, and frequentist views appear in the comments on “A Litany of Problems With p-values” at Statistical Thinking.

## Lower the statistical significance threshold to 0.005—or 0.001?

D. R. Bickel, “Sharpen statistical significance: Evidence thresholds and Bayes factors sharpened into Occam’s razors,” Working Paper, University of Ottawa, <hal-01851322>, https://hal.archives-ouvertes.fr/hal-01851322 (2018). 2018 preprint

## Pre-data insights update priors via Bayes’s theorem

D. R. Bickel, “Bayesian revision of a prior given prior-data conflict, expert opinion, or a similar insight: A large-deviation approach,” *Statistics* **52**, 552-570 (2018). Full text | 2015 preprint | Simple explanation

## How to adjust statistical inferences for the simplicity of distributions

D. R. Bickel, “Confidence intervals, significance values, maximum likelihood estimates, etc. sharpened into Occam’s razors,” Working Paper, University of Ottawa, <hal-01799519>, https://hal.archives-ouvertes.fr/hal-01799519 (2018). 2018 preprint | Slides

## Should the default significance level be changed from 0.05 to 0.005?

My comments in this discussion of “Redefine statistical significance”:

The call for smaller significance levels cannot be based only on mathematical arguments that p values tend to be much lower than posterior probabilities, as Andrew Gelman and Christian Robert pointed out in their comment (“Revised evidence for statistical standards”).
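To make the kind of mathematical argument at issue concrete, here is a minimal sketch. It assumes the bound of Sellke, Bayarri, and Berger, under which the Bayes factor in favor of the null is at least -e p ln(p) for p < 1/e. That bound is one common choice and holds only under its own assumptions; it is not necessarily the argument any commenter had in mind. The function name and the default prior odds are illustrative.

```python
import math


def min_null_posterior(p_value, prior_odds_null=1.0):
    """Lower bound on P(null | data) implied by the -e * p * ln(p) bound.

    prior_odds_null is the prior odds of the null against the alternative
    (1.0 means the two hypotheses are equally probable a priori).
    """
    if not 0.0 < p_value < 1.0 / math.e:
        raise ValueError("the bound applies only for 0 < p < 1/e")
    # Smallest Bayes factor (null over alternative) allowed by the bound.
    min_bayes_factor_null = -math.e * p_value * math.log(p_value)
    # Bayes's theorem in odds form: posterior odds = Bayes factor * prior odds.
    posterior_odds_null = min_bayes_factor_null * prior_odds_null
    return posterior_odds_null / (1.0 + posterior_odds_null)


for p in (0.05, 0.005):
    print(f"p = {p}: P(null | data) is at least {min_null_posterior(p):.3f}")
```

Under these assumptions and equal prior odds, the lower bound on the posterior probability of the null is well above the p value itself, which is the gap the mathematical argument relies on.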

In the rejoinder, Valen Johnson made it clear that the call is also based on empirical findings of non-reproducible research results. How many of those findings are significant at the 0.005 level? Should meta-analysis have a less stringent standard?

…

“Irreplicable results can’t possibly add empirical clout to the mathematical argument unless it is already known or assumed to be caused by a given cut-off, and further, that lowering it would diminish those problems.”

The preprint cites empirical results to support its use of the 1:10 prior odds. If that is in fact a reliable estimate of the prior odds for the reference class of previous studies, then, in the absence of other relevant information, it would be reasonable to use it as input for Bayes’s theorem.
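As a sketch of what using prior odds as input for Bayes’s theorem amounts to, the odds form of the theorem multiplies the prior odds by the Bayes factor. The function name and the Bayes factor values below are hypothetical, chosen only to show how much the choice between 1:10 and 1:1 prior odds matters.

```python
def posterior_probability(prior_odds, bayes_factor):
    """Posterior probability of the alternative hypothesis.

    prior_odds:   prior odds of the alternative against the null (1:10 -> 0.1).
    bayes_factor: Bayes factor of the alternative against the null.
    """
    posterior_odds = prior_odds * bayes_factor  # Bayes's theorem, odds form
    return posterior_odds / (1.0 + posterior_odds)


# Hypothetical Bayes factors, for illustration only.
for bayes_factor in (3.0, 20.0, 100.0):
    with_1_to_10 = posterior_probability(1 / 10, bayes_factor)
    with_1_to_1 = posterior_probability(1.0, bayes_factor)
    print(f"BF = {bayes_factor:>5}: 1:10 prior -> {with_1_to_10:.2f}, "
          f"1:1 prior -> {with_1_to_1:.2f}")
```

The same evidence yields quite different posterior probabilities under the two priors, which is why the reliability of the 1:10 estimate matters.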

John Byrd asks, “Is 1:10 replicable?” Is it important to ask whether a 1:1 prior odds can be rejected at the 0.005 significance level?


## Inference to the best explanation of the evidence

The *p* value and Bayesian methods have well-known drawbacks when it comes to measuring the strength of the evidence supporting one hypothesis over another. To overcome those drawbacks, this paper proposes an alternative method of quantifying how much support a hypothesis has from evidence consisting of data.

D. R. Bickel, “The strength of statistical evidence for composite hypotheses: Inference to the best explanation,” *Statistica Sinica* **22**, 1147-1198 (2012). Full article | 2010 version

The special law of likelihood has many advantages over more commonly used approaches to measuring the strength of statistical evidence. However, it can measure the support only of a hypothesis that corresponds to a single distribution. The proposed general law of likelihood can also measure the extent to which the data support a hypothesis that corresponds to multiple distributions. That is accomplished by formalizing inference to the best explanation.
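The contrast between the two laws can be sketched in a toy setting. The sketch below assumes a normal model with known standard deviation, composite hypotheses given as closed intervals for the mean, and a reading of “best explanation” as the distribution within each hypothesis that maximizes the likelihood. The paper’s formal definition may differ in its details, and the data and function names are made up for illustration.

```python
import math


def log_likelihood(mean, data, sd=1.0):
    """Log-likelihood of a normal mean with known standard deviation."""
    return sum(
        -0.5 * math.log(2 * math.pi * sd ** 2) - (x - mean) ** 2 / (2 * sd ** 2)
        for x in data
    )


def special_log_ratio(mean_1, mean_2, data, sd=1.0):
    """Special law: each hypothesis is a single distribution."""
    return log_likelihood(mean_1, data, sd) - log_likelihood(mean_2, data, sd)


def best_mean_in_interval(interval, data):
    """Mean in a closed interval that maximizes the normal likelihood.

    The likelihood is unimodal in the mean, so the maximizer is the
    sample mean clipped to the interval.
    """
    low, high = interval
    sample_mean = sum(data) / len(data)
    return min(max(sample_mean, low), high)


def general_log_ratio(interval_1, interval_2, data, sd=1.0):
    """Assumed general law: compare the best explanation in each hypothesis."""
    best_1 = best_mean_in_interval(interval_1, data)
    best_2 = best_mean_in_interval(interval_2, data)
    return special_log_ratio(best_1, best_2, data, sd)


data = [0.8, 1.3, 0.4, 1.1]  # made-up observations

# Simple hypotheses: mean = 1 versus mean = 0.
print(special_log_ratio(1.0, 0.0, data))
# Composite hypotheses: mean in [0.5, 2] versus mean in [-1, 0].
print(general_log_ratio((0.5, 2.0), (-1.0, 0.0), data))
```

When both intervals shrink to single points, the general ratio reduces to the special one, which is the sense in which the general law extends the special law in this sketch.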