Archive for the ‘proteomics’ Category

Empirical Bayes single-comparison procedure

1 July 2016

D. R. Bickel, “Small-scale inference: Empirical Bayes and confidence methods for as few as a single comparison,” International Statistical Review 82, 457–476 (2014). Published version | 2011 preprint | Simple explanation (link added 21 June 2017)

Parametric empirical Bayes methods of estimating the local false discovery rate by maximum likelihood apply not only to the large-scale settings for which they were developed, but, with a simple modification, also to small numbers of comparisons. In fact, data for a single comparison are sufficient under broad conditions, as seen from applications to measurements of the abundance levels of 20 proteins and from simulation studies with confidence-based inference as the competitor.
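For orientation, the local false discovery rate under a two-component mixture is conventionally written as below. The symbols \pi_0, f_0, f_1, \theta and N are notation assumed here, not taken from the paper, and the plug-in estimate is only a generic sketch of maximum-likelihood estimation; the paper's modification for small numbers of comparisons is not reproduced.

```latex
\begin{align*}
f(x;\pi_0,\theta) &= \pi_0 f_0(x) + (1-\pi_0)\, f_1(x;\theta),\\
\mathrm{LFDR}(x) &= \frac{\pi_0 f_0(x)}{f(x;\pi_0,\theta)},\\
(\hat\pi_0,\hat\theta) &= \arg\max_{\pi_0,\theta}\ \sum_{i=1}^{N} \log f(x_i;\pi_0,\theta),\\
\widehat{\mathrm{LFDR}}(x_i) &= \frac{\hat\pi_0 f_0(x_i)}{f(x_i;\hat\pi_0,\hat\theta)}.
\end{align*}
```

Here \pi_0 is the prior probability that a comparison is null, f_0 is the null density of the test statistic, f_1 is the alternative density, and N is the number of comparisons, which in the small-scale setting may be as low as 1.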

Small-scale empirical Bayes & fiducial estimators

22 March 2015

M. Padilla and D. R. Bickel, “Empirical Bayes and fiducial effect-size estimation for small numbers of tests,” Working Paper, University of Ottawa, deposited in uO Research at http://hdl.handle.net/10393/32151 (2015). 2015 preprint

Small dimensional empirical Bayes inference

9 May 2013

D. R. Bickel, “Simple estimators of false discovery rates given as few as one or two p-values without strong parametric assumptions,” Statistical Applications in Genetics and Molecular Biology 12, 529–543 (2013). 2011 version | erratum

To address multiple comparison problems in small-to-high-dimensional biology, this paper introduces estimators of the local false discovery rate (LFDR), reports their main properties, and illustrates their use with proteomics data. The new estimators have the following advantages:

  1. proven asymptotic conservatism;
  2. simplicity of calculation without the tuning of smoothing parameters;
  3. no strong parametric assumptions;
  4. applicability to very small numbers of hypotheses as well as to very large numbers of hypotheses.

The link to the erratum was added 31 March 2015.

Optimal strength of evidence

13 February 2013

D. R. Bickel, “Minimax-optimal strength of statistical evidence for a composite alternative hypothesis,” International Statistical Review 81, 188–206 (2013). 2011 version | Simple explanation (added 2 July 2017)

This publication generalizes the likelihood measure of evidential support for a hypothesis with the help of tools originally developed by information theorists for minimizing the number of letters in a message. The approach is illustrated with an application to proteomics data.

Local FDR estimation for low-dimensional data

18 October 2012

M. Padilla and D. R. Bickel, “Estimators of the local false discovery rate designed for small numbers of tests,” Statistical Applications in Genetics and Molecular Biology 11 (5), art. 4 (2012). Full article | 2010 & 2012 preprints

This article describes estimators of local false discovery rates, compares their biases for small-scale inference, and illustrates the methods using a quantitative proteomics data set. In addition, theoretical results are presented in the appendices.

Local FDR estimation software

30 June 2012

LFDRenrich is a suite of R functions for the estimation of local false discovery rates by maximum likelihood under a two-component or three-component parametric mixture model of 2 × 2 tables such as those used in gene enrichment analyses.

LFDRhat is a more general suite of R functions for the estimation of local false discovery rates by maximum likelihood under a two-component or three-component parametric mixture model.
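The general idea of maximum-likelihood LFDR estimation under a two-component mixture can be sketched in a few lines of Python. This is not the LFDRhat interface or its algorithm: the function names, the choice of a standard normal null with a unit-variance normal alternative, the grid search, and the example z-scores are all assumptions made purely for illustration.

```python
import math


def normal_pdf(z, mean=0.0):
    """Density of a unit-variance normal distribution with the given mean."""
    return math.exp(-0.5 * (z - mean) ** 2) / math.sqrt(2.0 * math.pi)


def log_likelihood(zs, pi0, mu):
    """Log-likelihood of the assumed mixture pi0*N(0,1) + (1-pi0)*N(mu,1)."""
    total = 0.0
    for z in zs:
        density = pi0 * normal_pdf(z) + (1.0 - pi0) * normal_pdf(z, mu)
        total += math.log(max(density, 1e-300))  # guard against log(0)
    return total


def fit_two_component(zs):
    """Maximize the log-likelihood over a coarse (pi0, mu) grid.

    A grid search is used only to keep the sketch dependency-free;
    a numerical optimizer or EM would be a natural alternative.
    """
    pi0_grid = [i / 100.0 for i in range(101)]      # 0.00, 0.01, ..., 1.00
    mu_grid = [i / 10.0 for i in range(-60, 61)]    # -6.0, -5.9, ..., 6.0
    best = max(
        (log_likelihood(zs, pi0, mu), pi0, mu)
        for pi0 in pi0_grid
        for mu in mu_grid
    )
    return best[1], best[2]


def lfdr(z, pi0, mu):
    """Plug-in local false discovery rate at statistic z."""
    null_part = pi0 * normal_pdf(z)
    return null_part / (null_part + (1.0 - pi0) * normal_pdf(z, mu))


# Hypothetical z-scores: five near zero, three large.
z_scores = [0.1, -0.3, 0.5, 4.0, -0.8, 3.6, 0.2, 4.3]
pi0_hat, mu_hat = fit_two_component(z_scores)
estimates = [lfdr(z, pi0_hat, mu_hat) for z in z_scores]
```

Each entry of `estimates` is the estimated probability that the corresponding comparison is null given its statistic, so small values flag likely non-null features. A three-component variant would add a second alternative component to the mixture in the same way.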

Minimax strength of statistical evidence

24 November 2011

D. R. Bickel, “A predictive approach to measuring the strength of statistical evidence for single and multiple comparisons,” Canadian Journal of Statistics 39, 610–631 (2011). Full text | Revised preprint | 2010 draft

This paper introduces a novel approach to the multiple comparisons problem by generalizing a promising method of model selection developed by information theorists. The first two sections present that method and its main advantages over conventional approaches without burdening statisticians with unfamiliar terms from coding theory. A quantitative proteomics case study facilitates application of the new method to the analysis of data sets involving multiple biological features. The theorems describe its operating characteristics.

The cited medium-scale paper presented previous minimum description length (MDL) methods. Unlike those methods, the new MDL methods of the current paper are based on a conflation of the normalized maximum likelihood (NML) with the weighted likelihood (WL). The previous MDL methods are used in the CJS article for comparison with its NML/WL methods.