## How to adjust statistical inferences for the simplicity of distributions

D. R. Bickel, “Confidence intervals, significance values, maximum likelihood estimates, etc. sharpened into Occam’s razors,” Working Paper, University of Ottawa, <hal-01799519> https://hal.archives-ouvertes.fr/hal-01799519 (2018). 2018 preprint | Slides

## Against ideological philosophies of probability

Burdzy, Krzysztof

Resonance—from probability to epistemology and back. *Imperial College Press, London,* 2016. xx+408 pp. ISBN: 978-1-78326-920-4

60A05 (00A30 03A10 62A01)

Burdzy defines probability in terms of six “laws of probability”, intended as an accurate description of how probability is used in science (pp. 8–9, 217). Unlike the axiomatic systems from Kolmogorov onward that are distinct from their potential applications [see A. Rényi, Rev. Inst. Internat. Statist. 33 (1965), 1–14; MR0181483], the laws require that mathematical probability by definition agree with features of objective events. Potentially subject to scientific or philosophical refutation (pp. 258–259), the laws are analogous to Maxwell’s equations (p. 222). The testable claim is that they accurately describe science’s use of epistemic probabilities as well as physical probabilities (pp. 259–261).

Laws 3, 4, and 6 are especially physical. Burdzy argues that probability theory could not be applied if symmetries such as physical independence (Law 3) could not be recognized and tentatively accepted by resonance (Section 11.4). Such symmetries do not include the law of the iterated logarithm or many other properties of Martin-Löf sequences, which he finds “totally useless from the practical point of view” (Section 4.14). Law 4, the requirement that assigning equal probabilities should be based on known physical symmetries rather than on ignorance (Section 11.25), echoes R. Chuaqui Kettlun’s Truth, possibility and probability [North-Holland Math. Stud., 166, North-Holland, Amsterdam, 1991 (Sections III.2 and XX.3); MR1159708]. Law 6 needs some qualification or further explanation since it does not apply directly to continuous random variables: “An event has probability 0 if and only if it cannot occur. An event has probability 1 if and only if it must occur” (p. 217).
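A standard textbook illustration, not taken from the book, shows why Law 6 needs qualification in the continuous case. Assume a random variable $X$ uniformly distributed on $[0, 1]$ (a choice made here only for concreteness). Every individual value then has probability 0, even though some value is always realized:

$$
P(X = x) = \int_x^x 1 \, dt = 0 \quad \text{for every } x \in [0, 1], \qquad P(X \in [0, 1]) = \int_0^1 1 \, dt = 1 .
$$

Read literally, Law 6 would declare each event $\{X = x\}$ impossible, yet one of them occurs in every realization.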

There is some dissonance in applications to statistics. On the frequentist side, a confidence interval with a high level of confidence should be used to predict that the parameter value lies within the observed confidence interval (Section 11.11, as explained by pp. 292, 294). Even though that generalizes predicting that the parameter values corresponding to rejected null hypotheses are not equal to the true parameter value, Burdzy expresses doubt about how to formalize hypothesis testing in terms of prediction (Section 13.4). His predictive-testing idea may be seen as an application of Cournot’s principle (pp. 22, 278; see [M. R. Fréchet, Les mathématiques et le concret, Presses Univ. France, Paris, 1955 (pp. 201–202, 209–213, 216–217, 221); MR0075110]). On the Bayesian side, Burdzy concedes that priors based on resonance often work well and yet judges them too susceptible to prejudice for scientific use (Section 14.4.3). By ridiculing subjective Bayesian theory as if it legitimized assigning probabilities at will (Section 7.1), Burdzy calls attention to its failure to specify all criteria for rational probability assignment.

Burdzy adds color to the text with random references to religion from the perspective of an atheistic probabilist who left Catholicism (p. 178). Here are some representative examples. First, in contrast to attempts to demonstrate that an objective probability of God’s existence is low [R. Dawkins, The God delusion, Bantam Press, 2006] or high [R. Swinburne, The resurrection of God incarnate, Clarendon Press, Oxford, 2003], he denies the feasibility of computing such a probability (Section 16.7). Second, Burdzy is convinced that religions, like communism, philosophical theories of probability, and other secular ideologies, have inconsistencies to the point of hypocrisy, insisting that his “‘resonance’ theory” (p. 13) is not an ideology (Chapter 15), much as D. V. Lindley denied that his Bayesianism is a religion [Understanding uncertainty, revised edition, Wiley Ser. Probab. Stat., Wiley, Hoboken, NJ, 2014 (pp. 380–381); MR3236718]. Lastly, Burdzy attributes the infinite consequences underlying Pascal’s Wager to efforts to deceive and manipulate (Section 16.2.2). However, documenting the historical origins of teachings of eternal bliss and eternal retribution on the basis of primitive Christian and pre-Christian sources lies far beyond the scope of the book.

Under the resonance banner, this probabilist rushes in with a unique barrage of controversial and well-articulated philosophical claims with implications for science and beyond. Those resisting will find themselves challenged to counter with alternative solutions to the problems raised.

Reviewed by David R. Bickel

## Should simpler distributions have more prior probability?

D. R. Bickel, “Computable priors sharpened into Occam’s razors,” Working Paper, University of Ottawa, <hal-01423673> https://hal.archives-ouvertes.fr/hal-01423673 (2016). 2016 preprint

## Profile likelihood & MDL for measuring the strength of evidence

D. R. Bickel, “Pseudo-likelihood, explanatory power, and Bayes’s theorem [Comment on ‘A likelihood paradigm for clinical trials’],” *Journal of Statistical Theory and Practice* **7**, 178-182 (2013).

## Estimates of the local FDR

Z. Yang, Z. Li, and D. R. Bickel, “Empirical Bayes estimation of posterior probabilities of enrichment: A comparative study of five estimators of the local false discovery rate,” *BMC Bioinformatics* **14**, art. 87 (2013). published version | 2011 version | 2010 version

This paper adapts novel empirical Bayes methods to the problem of detecting enrichment in the form of differential representation of genes associated with a biological category with respect to a list of genes identified as differentially expressed.
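For orientation, the local false discovery rate is commonly defined under a two-group mixture model. The notation below ($\pi_0$, $f_0$, $f_1$, and the statistic $z$) is an assumption of this sketch and is not necessarily the notation or the exact model of the paper:

$$
f(z) = \pi_0 f_0(z) + (1 - \pi_0) f_1(z), \qquad \operatorname{lfdr}(z) = P(\text{null} \mid z) = \frac{\pi_0 f_0(z)}{f(z)} .
$$

Here $\pi_0$ is the prior probability that the null hypothesis holds, and $f_0$ and $f_1$ are the densities of the statistic under the null and alternative hypotheses. In this framing, estimators of the local false discovery rate differ mainly in how they estimate $\pi_0$ and $f$ from the data.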

## Optimal strength of evidence

D. R. Bickel, “Minimax-optimal strength of statistical evidence for a composite alternative hypothesis,” *International Statistical Review* **81**, 188-206 (2013). 2011 version | Simple explanation (added 2 July 2017)

This publication generalizes the likelihood measure of evidential support for a hypothesis with the help of tools originally developed by information theorists for minimizing the number of letters in a message. The approach is illustrated with an application to proteomics data.