Archive for the ‘empirical Bayes’ Category

Model fusion & multiple testing in the likelihood paradigm

11 January 2015

D. R. Bickel, “Model fusion and multiple testing in the likelihood paradigm: Shrinkage and evidence supporting a point null hypothesis,” Working Paper, University of Ottawa, deposited in uO Research at (2014). 2014 preprint | Supplement (link added 10 February 2015)

Errata for Theorem 4:

  1. The weights of evidence should not be conditional.
  2. Some of the equal signs should be “is a member of” signs.

Fiducial error propagation for empirical Bayes set estimates

10 January 2015

D. R. Bickel, “A fiducial continuum from confidence sets to empirical Bayes set estimates as the number of comparisons increases,” Working Paper, University of Ottawa, deposited in uO Research at (2014). 2014 preprint

Two problems confronting the eclectic approach to statistics result from its lack of a unifying theoretical foundation. First, there is typically no continuity between a p-value reported as a level of evidence for a hypothesis in the absence of the information needed to estimate a relevant prior on one hand and an estimated posterior probability of a hypothesis reported in the presence of such information on the other hand. Second, the empirical Bayes methods recommended do not propagate the uncertainty due to estimating the prior.

The latter problem is addressed by applying a coherent form of fiducial inference to hierarchical models, yielding empirical Bayes set estimates that reflect uncertainty in estimating the prior. Plugging in the maximum likelihood estimator, while not propagating that uncertainty, provides continuity from single comparisons to large numbers of comparisons.

Causality, Probability, and Time (by Kleinberg)—a review

8 August 2014

Kleinberg, Samantha
Causality, probability, and time. Cambridge University Press, Cambridge, 2013. viii+259 pp. ISBN: 978-1-107-02648-3
60A99 (03A05 03B48 62A01 62P99 68T27 91G80 92C20)

This informative and engaging book introduces a novel method of inferring a cause of an event on the basis of the assumption that each cause changes the frequency-type probability of some effect occurring later in time. Unlike most previous approaches to causal inference, the author explicitly models time lags between causes and effects since timing is often crucial to effective prediction and control.
Arguably an equally valuable contribution of the book is its integration of relevant work in philosophy, computer science, and statistics. While the first two disciplines have benefited from the productive interactions exemplified in [J. Pearl, Probabilistic reasoning in intelligent systems: networks of plausible inference, Morgan Kaufmann Ser. Represent. Reason., Morgan Kaufmann, San Mateo, CA, 1988; MR0965765 (90g:68003)] and [J. Williamson, Bayesian nets and causality, Oxford Univ. Press, Oxford, 2005; MR2120947 (2005k:68198)], the statistics community has developed its own theory of causal inference in relative isolation. Rather than following S. L. Morgan and C. Winship [Counterfactuals and causal inference: methods and principles for social research, Cambridge Univ. Press, New York, 2007] and others in bringing that theory into conversation with that of Pearl [op. cit.], the author creatively employs recent developments in statistical inference to identify causes.
For the specific situation in which many putative causes are tested but only a few are true causes, she explains how to estimate the local rate of discovering false causes. In this context, the local false discovery rate (LFDR) corresponding to a putative cause is a posterior probability that it is not a true cause. This is an example of an empirical Bayes method in that the prior distribution is estimated from the data rather than assigned.
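To make the LFDR idea concrete, here is a minimal sketch of an empirical Bayes calculation under a two-group model. It is not the book's procedure: the standard normal null, the unit-variance normal alternative, the grid search, and the names `fit_two_group` and `lfdr` are all assumptions made for illustration. The prior proportion of non-causes (`pi0`) is estimated from the data by maximum likelihood and then plugged into Bayes's theorem.

```python
import math


def normal_pdf(z, mean=0.0, sd=1.0):
    """Density of a normal distribution at z."""
    u = (z - mean) / sd
    return math.exp(-0.5 * u * u) / (sd * math.sqrt(2.0 * math.pi))


def fit_two_group(zs, grid_size=99, alt_means=(1.0, 2.0, 3.0, 4.0)):
    """Grid-search maximum likelihood estimate of (pi0, mu).

    Assumed model (hypothetical): each statistic z is drawn from
    pi0 * N(0, 1) + (1 - pi0) * N(mu, 1).
    """
    best = None
    for k in range(1, grid_size + 1):
        pi0 = k / (grid_size + 1)  # candidate prior probability of the null
        for mu in alt_means:
            loglik = sum(
                math.log(pi0 * normal_pdf(z) + (1.0 - pi0) * normal_pdf(z, mu))
                for z in zs
            )
            if best is None or loglik > best[0]:
                best = (loglik, pi0, mu)
    return best[1], best[2]


def lfdr(z, pi0, mu):
    """Estimated posterior probability that z came from the null component."""
    null = pi0 * normal_pdf(z)
    alt = (1.0 - pi0) * normal_pdf(z, mu)
    return null / (null + alt)


# Made-up test statistics, one per putative cause.
zs = [-1.2, -0.5, 0.1, 0.3, 0.8, -0.3, 1.1, 3.2, 3.9, -0.9]
pi0_hat, mu_hat = fit_two_group(zs)
for z in zs:
    print(f"z = {z:5.1f}  estimated LFDR = {lfdr(z, pi0_hat, mu_hat):.3f}")
```

Because the prior is estimated rather than assigned, the same statistic can receive a different LFDR estimate when it is analyzed alongside a different set of putative causes.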
Building on [P. Suppes, A probabilistic theory of causality, North-Holland, Amsterdam, 1970; MR0465774 (57 #5663)], the book emphasizes the importance for prediction not only of whether something is a cause but also of the strength of a cause. A cause is ε-significant if its causal strength, defined in terms of changing the probability of its effect, is at least ε, where ε is some nonnegative number. Otherwise, it is ε-insignificant.
The author poses an important problem and comes close to solving it, i.e., the problem of inferring whether a cause is ε-significant. The solution attempted in Section 4.2 confuses causal significance (ε-significance) with statistical significance (LFDR estimate below some small positive number α). This is by no means a fatal criticism of the approach since it can be remedied in principle by defining a false discovery as a discovery of an ε-insignificant cause. This tests the null hypothesis that the cause is ε-insignificant for a specified value of ε rather than the book’s null hypothesis, which in effect asserts that the cause is lim_{ε→0} ε-insignificant, i.e., ε-insignificant for all ε>0. In the case of a specified value of ε, a cause should be considered ε-significant if the estimated LFDR is less than α, provided that the LFDR is defined in terms of the null hypothesis of ε-insignificance. The need to fill in the technical details and to answer more general questions arising from this distinction between causal significance and statistical significance opens up exciting opportunities for further research guided by insights from the literature on seeking substantive significance as well as statistical significance [see, e.g., M. A. van de Wiel and K. I. Kim, Biometrics 63 (2007), no. 3, 806–815; MR2395718].
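The remedy described above can be sketched in a few lines, with the caveat that the technical details remain open and everything specific here is an assumption: a discrete prior over nonnegative causal strengths (taken as already estimated), normally distributed estimation error, and the hypothetical names `lfdr_eps` and `is_eps_significant`. The point is only that the null hypothesis is "strength below ε", so the LFDR is the posterior probability of ε-insignificance.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    u = (x - mean) / sd
    return math.exp(-0.5 * u * u) / (sd * math.sqrt(2.0 * math.pi))


def lfdr_eps(estimate, se, strengths, weights, eps):
    """Posterior probability that the true causal strength is below eps.

    strengths, weights: an assumed discrete prior over causal strengths,
    treated here as if it had already been estimated from the data.
    estimate, se: observed strength and its standard error (hypothetical).
    """
    post = [w * normal_pdf(estimate, s, se) for s, w in zip(strengths, weights)]
    total = sum(post)
    return sum(p for s, p in zip(strengths, post) if s < eps) / total


def is_eps_significant(estimate, se, strengths, weights, eps, alpha):
    """Declare eps-significance when the eps-based LFDR is below alpha."""
    return lfdr_eps(estimate, se, strengths, weights, eps) < alpha


# Made-up prior: most putative causes have zero or negligible strength.
strengths = [0.0, 0.05, 0.2, 0.4]
weights = [0.7, 0.1, 0.1, 0.1]

for est in (0.02, 0.4):
    value = lfdr_eps(est, 0.05, strengths, weights, eps=0.1)
    decision = is_eps_significant(est, 0.05, strengths, weights, eps=0.1, alpha=0.05)
    print(f"estimate = {est}: LFDR = {value:.4f}, eps-significant = {decision}")
```

Raising ε enlarges the null hypothesis, so the LFDR computed this way can only stay the same or increase; a cause that is statistically distinguishable from zero may still fail to be ε-significant.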

Reviewed by David R. Bickel

This review first appeared at Causality, Probability, and Time (Mathematical Reviews) and is used with permission from the American Mathematical Society.

Categories: empirical Bayes, reviews

Assessing multiple models

1 June 2014

Small dimensional empirical Bayes inference

9 May 2013

D. R. Bickel, “Simple estimators of false discovery rates given as few as one or two p-values without strong parametric assumptions,” Statistical Applications in Genetics and Molecular Biology 12, 529–543 (2013). 2011 version | erratum


To address multiple comparison problems in small-to-high-dimensional biology, this paper introduces estimators of the local false discovery rate (LFDR), reports their main properties, and illustrates their use with proteomics data. The new estimators have the following advantages:

  1. proven asymptotic conservatism;
  2. simplicity of calculation without the tuning of smoothing parameters;
  3. no strong parametric assumptions;
  4. applicability to very small numbers of hypotheses as well as to very large numbers of hypotheses.

The link to the erratum was added 31 March 2015.

Estimates of the local FDR

13 February 2013

Z. Yang, Z. Li, and D. R. Bickel, “Empirical Bayes estimation of posterior probabilities of enrichment: A comparative study of five estimators of the local false discovery rate,” BMC Bioinformatics 14, art. 87 (2013). published version | 2011 version | 2010 version


This paper adapts novel empirical Bayes methods for the problem of detecting enrichment in the form of differential representation of genes associated with a biological category with respect to a list of genes identified as differentially expressed.

Optimal strength of evidence

13 February 2013

D. R. Bickel, “Minimax-optimal strength of statistical evidence for a composite alternative hypothesis,” International Statistical Review 81, 188–206 (2013). 2011 version | Simple explanation (added 2 July 2017)


This publication generalizes the likelihood measure of evidential support for a hypothesis with the help of tools originally developed by information theorists for minimizing the number of letters in a message. The approach is illustrated with an application to proteomics data.

MLE of the local FDR

13 February 2013

Local FDR estimation for low-dimensional data

18 October 2012

M. Padilla and D. R. Bickel, “Estimators of the local false discovery rate designed for small numbers of tests,” Statistical Applications in Genetics and Molecular Biology 11 (5), art. 4 (2012). Full article | 2010 & 2012 preprints


This article describes estimators of local false discovery rates, compares their biases for small-scale inference, and illustrates the methods using a quantitative proteomics data set. In addition, theoretical results are presented in the appendices.

How to combine statistical methods

29 August 2012

D. R. Bickel, “Game-theoretic probability combination with applications to resolving conflicts between statistical methods,” International Journal of Approximate Reasoning 53, 880–891 (2012). Full article | 2011 preprint | Slides | Simple explanation

This paper proposes both a novel solution to the problem of combining probability distributions and a framework for using the new method to combine the results of differing statistical methods that may legitimately be used to analyze the same data set. While the paper emphasizes theoretical development, it is motivated by the need to combine two conflicting estimators of the probability of differential gene expression.