## Inference to the best explanation of the evidence

The *p* value and Bayesian methods have well-known drawbacks as measures of the strength of the evidence supporting one hypothesis over another. To overcome those drawbacks, this paper proposes an alternative method of quantifying how much support a hypothesis receives from the data.

D. R. Bickel, “The strength of statistical evidence for composite hypotheses: Inference to the best explanation,” *Statistica Sinica* **22**, 1147-1198 (2012). Full article | 2010 version

The special law of likelihood has many advantages over more commonly used approaches to measuring the strength of statistical evidence. However, it can measure only the support of a hypothesis that corresponds to a single distribution. The proposed general law of likelihood can also measure the extent to which the data support a hypothesis that corresponds to multiple distributions. That is accomplished by formalizing inference to the best explanation.
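The contrast between the two laws can be illustrated with a minimal sketch. It assumes a normal model with known standard deviation, and it assumes that a composite hypothesis is represented by its best-fitting distribution, so that the evidence is a ratio of maximized likelihoods. The function names and the interval encoding of hypotheses are hypothetical choices for illustration, not taken from the paper.

```python
import math

def normal_log_likelihood(mu, data, sigma=1.0):
    """Log-likelihood of independent N(mu, sigma^2) observations."""
    n = len(data)
    return (-0.5 * n * math.log(2 * math.pi * sigma ** 2)
            - sum((x - mu) ** 2 for x in data) / (2 * sigma ** 2))

def simple_log_ratio(mu1, mu2, data, sigma=1.0):
    """Special law: each hypothesis is a single distribution (mu = mu1 vs. mu = mu2)."""
    return (normal_log_likelihood(mu1, data, sigma)
            - normal_log_likelihood(mu2, data, sigma))

def max_log_likelihood(lower, upper, data, sigma=1.0):
    """Largest log-likelihood over mu in [lower, upper].

    For this model the maximizer is the sample mean clipped to the interval.
    """
    sample_mean = sum(data) / len(data)
    best_mu = min(max(sample_mean, lower), upper)
    return normal_log_likelihood(best_mu, data, sigma)

def composite_log_ratio(interval1, interval2, data, sigma=1.0):
    """Assumed form of the general law: compare the best explanation in each hypothesis."""
    return (max_log_likelihood(*interval1, data, sigma)
            - max_log_likelihood(*interval2, data, sigma))

# Illustrative, made-up observations.
observations = [0.8, 1.2, 1.0]
print(simple_log_ratio(1.0, 0.0, observations))
print(composite_log_ratio((0.0, math.inf), (-math.inf, 0.0), observations))
```

When each interval shrinks to a single point, `composite_log_ratio` reduces to `simple_log_ratio`, which is the sense in which the general law extends the special one in this sketch.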

## Empirical Bayes single-comparison procedure

D. R. Bickel, “Small-scale inference: Empirical Bayes and confidence methods for as few as a single comparison,” *International Statistical Review* **82**, 457-476 (2014). Published version | 2011 preprint | Simple explanation (link added 21 June 2017)

Parametric empirical Bayes methods of estimating the local false discovery rate by maximum likelihood apply not only to the large-scale settings for which they were developed, but, with a simple modification, also to small numbers of comparisons. In fact, data for a single comparison are sufficient under broad conditions, as seen from applications to measurements of the abundance levels of 20 proteins and from simulation studies with confidence-based inference as the competitor.
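As a rough illustration of estimating the local false discovery rate by maximum likelihood, the sketch below fits a two-component normal mixture by grid search. The model (a standard normal null, a unit-variance alternative with unknown mean), the grids, and all function names are assumptions made for this example. It shows only the plain maximum-likelihood estimate for several comparisons and does not reproduce the paper's modification for a single comparison.

```python
import math

def normal_pdf(z, mean=0.0):
    """Density of N(mean, 1) at z."""
    return math.exp(-0.5 * (z - mean) ** 2) / math.sqrt(2 * math.pi)

def mixture_log_likelihood(pi0, mu, zs):
    """Log-likelihood of z ~ pi0 * N(0, 1) + (1 - pi0) * N(mu, 1)."""
    return sum(math.log(pi0 * normal_pdf(z) + (1 - pi0) * normal_pdf(z, mu))
               for z in zs)

def fit_by_grid(zs, pi0_grid, mu_grid):
    """Return the (pi0, mu) pair on the grid with the highest likelihood."""
    return max(((p, m) for p in pi0_grid for m in mu_grid),
               key=lambda pm: mixture_log_likelihood(pm[0], pm[1], zs))

def local_fdr(z, pi0, mu):
    """Estimated probability that the comparison with statistic z is null."""
    null_part = pi0 * normal_pdf(z)
    return null_part / (null_part + (1 - pi0) * normal_pdf(z, mu))

# Illustrative, made-up test statistics (not data from the paper).
z_values = [-0.3, 0.1, 0.4, -1.1, 0.7, 3.2, 3.8, -0.6, 0.2, 4.1]
pi0_grid = [i / 100 for i in range(1, 101)]   # null proportions 0.01 .. 1.00
mu_grid = [j / 10 for j in range(0, 61)]      # alternative means 0.0 .. 6.0

pi0_hat, mu_hat = fit_by_grid(z_values, pi0_grid, mu_grid)
for z in z_values:
    print(z, round(local_fdr(z, pi0_hat, mu_hat), 3))
```

A grid search keeps the example free of external dependencies; a numerical optimizer would be the more usual choice for anything beyond a toy model.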

## Optimal strength of evidence

D. R. Bickel, “Minimax-optimal strength of statistical evidence for a composite alternative hypothesis,” *International Statistical Review* **81**, 188-206 (2013). 2011 version | Simple explanation (added 2 July 2017)

This publication generalizes the likelihood measure of evidential support for a hypothesis with the help of tools originally developed by information theorists for minimizing the number of letters in a message. The approach is illustrated with an application to proteomics data.

## Confidence levels as degrees of belief

D. R. Bickel, “A frequentist framework of inductive reasoning,” *Sankhya A* **74**, 141-169 (2013). Published version | 2009 version | Relationship to a working paper | Simple explanation (added 17 July 2017)

A confidence measure is a parameter distribution that encodes all confidence intervals for a given data set, model, and pivot. This article establishes some properties of the confidence measure that commend it as a viable alternative to the Bayesian posterior distribution.

Confidence (correct frequentist coverage) and coherence (compliance with Ramsey-type restrictions on rational belief) are both presented as desirable properties. The only distributions on a scalar parameter space that have both properties are confidence measures.
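To make the idea of a distribution that encodes all confidence intervals concrete, here is a minimal sketch for the textbook case of a normal mean with known standard deviation, where the pivot is the standardized sample mean. The function names are hypothetical, and this simple model is an assumption chosen for illustration rather than an example from the article.

```python
import math
from statistics import NormalDist, mean

def confidence_measure(data, sigma):
    """Confidence measure for the mean of N(mu, sigma^2) data with sigma known.

    Under the pivot (sample mean - mu) / (sigma / sqrt(n)), the measure is
    N(sample mean, sigma^2 / n), treated as a distribution on the parameter.
    """
    return NormalDist(mu=mean(data), sigma=sigma / math.sqrt(len(data)))

def confidence_interval(measure, level):
    """Equal-tailed interval read off as two quantiles of the measure."""
    tail = (1 - level) / 2
    return measure.inv_cdf(tail), measure.inv_cdf(1 - tail)

def confidence_level(measure, lower, upper):
    """Confidence that the parameter lies in [lower, upper]."""
    return measure.cdf(upper) - measure.cdf(lower)

# Illustrative, made-up measurements.
measure = confidence_measure([4.8, 5.1, 5.3, 4.8], sigma=1.0)
low, high = confidence_interval(measure, 0.95)
print(low, high)
print(confidence_level(measure, low, high))
```

Because every interval is a pair of quantiles of one distribution, the same object assigns a confidence level to any interval hypothesis about the parameter, which is what allows it to be read like a posterior distribution without a prior.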

## Observed confidence levels for microarrays, etc.

D. R. Bickel, “Estimating the null distribution to adjust observed confidence levels for genome-scale screening,” *Biometrics* **67**, 363-370 (2011). Abstract and article | French abstract | Supplementary material | Simple explanation

This paper describes the first application of observed confidence levels to high-dimensional biological data. The proposed method for multiple comparisons can take advantage of the estimated null distribution without any prior distribution. The new method is applied to microarray data to illustrate its advantages.