Archive for the ‘Types of data’ Category

Small dimensional empirical Bayes inference

9 May 2013

D. R. Bickel, “Simple estimators of false discovery rates given as few as one or two p-values without strong parametric assumptions,” Statistical Applications in Genetics and Molecular Biology 12, 529–543 (2013). 2011 version | erratum

To address multiple comparison problems in small-to-high-dimensional biology, this paper introduces estimators of the local false discovery rate (LFDR), reports their main properties, and illustrates their use with proteomics data. The new estimators have the following advantages:

  1. proven asymptotic conservatism;
  2. simplicity of calculation without the tuning of smoothing parameters;
  3. no strong parametric assumptions;
  4. applicability to very small numbers of hypotheses as well as to very large numbers of hypotheses.
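
For orientation, the quantity being estimated can be written under the common two-group mixture model. This model and the symbols \pi_0, f_0, and f_1 are assumptions introduced here for illustration; the paper's own estimators are not reproduced.

```latex
\mathrm{LFDR}(p) = \Pr(\text{null hypothesis true} \mid P = p)
  = \frac{\pi_0 f_0(p)}{\pi_0 f_0(p) + (1 - \pi_0) f_1(p)},
\qquad f_0(p) = 1 \ \text{for } 0 \le p \le 1 .
```

Here \pi_0 is the prior proportion of true null hypotheses, f_0 is the uniform density of a p-value under the null, and f_1 is its density under the alternative. In this setting, a conservative estimator is usually understood as one that tends not to underestimate the LFDR.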

The link to the erratum was added 31 March 2015.

Estimates of the local FDR

13 February 2013

Z. Yang, Z. Li, and D. R. Bickel, “Empirical Bayes estimation of posterior probabilities of enrichment: A comparative study of five estimators of the local false discovery rate,” BMC Bioinformatics 14, art. 87 (2013). published version | 2011 version | 2010 version

This paper adapts novel empirical Bayes methods for the problem of detecting enrichment in the form of differential representation of genes associated with a biological category with respect to a list of genes identified as differentially expressed.

Optimal strength of evidence

13 February 2013

D. R. Bickel, “Minimax-optimal strength of statistical evidence for a composite alternative hypothesis,” International Statistical Review 81, 188–206 (2013). 2011 version | Simple explanation (added 2 July 2017)

This publication generalizes the likelihood measure of evidential support for a hypothesis with the help of tools originally developed by information theorists for minimizing the number of letters in a message. The approach is illustrated with an application to proteomics data.

MLE of the local FDR

13 February 2013

Local FDR estimation for low-dimensional data

18 October 2012

M. Padilla and D. R. Bickel, “Estimators of the local false discovery rate designed for small numbers of tests,” Statistical Applications in Genetics and Molecular Biology 11 (5), art. 4 (2012). Full article | 2010 & 2012 preprints

This article describes estimators of local false discovery rates, compares their biases for small-scale inference, and illustrates the methods using a quantitative proteomics data set. In addition, theoretical results are presented in the appendices.

Bayes/non-Bayes blended inference

5 October 2012

Updated with a new multiple comparison procedure and applications on 30 June 2012 and with slides for a presentation on 5 October 2012:

D. R. Bickel, “Blending Bayesian and frequentist methods according to the precision of prior information with applications to hypothesis testing,” Working Paper, University of Ottawa, deposited in uO Research at http://hdl.handle.net/10393/23124 (2012). 2012 preprint | 2011 preprint | Slides

This framework of statistical inference facilitates the development of new methodology to bridge the gap between the frequentist and Bayesian theories. As an example, a simple and practical method for combining p-values with a set of possible posterior probabilities is provided.

In this new approach to statistics, Bayesian inference is used when the prior distribution is known, frequentist inference is used when nothing is known about the prior, and both types of inference are blended according to game theory when the prior is known to be a member of some set. (The robust Bayes framework represents knowledge about a prior in terms of a set of possible priors.) If the benchmark posterior that corresponds to frequentist inference lies within the set of Bayesian posteriors derived from the set of priors, then the benchmark posterior is used for inference. Otherwise, the posterior within that set that is closest to the benchmark posterior is used for inference.
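
The selection rule in the paragraph above can be sketched for the simplest case, a single posterior probability of the null hypothesis. This is a minimal illustration under stated assumptions, not the paper's procedure: the function name `blended_posterior` is hypothetical, the set of Bayesian posteriors is assumed to be an interval `[lower, upper]`, the benchmark is assumed to be a single number such as a p-value, and closeness is measured on the probability scale.

```python
def blended_posterior(benchmark, lower, upper):
    """Return the posterior probability used for inference.

    benchmark: posterior probability corresponding to frequentist
               inference (assumed here to be a single number in [0, 1]).
    lower, upper: endpoints of the interval of posterior probabilities
                  derived from the set of plausible priors (an assumed
                  interval representation of that set).
    """
    if not 0.0 <= lower <= upper <= 1.0:
        raise ValueError("need 0 <= lower <= upper <= 1")
    if not 0.0 <= benchmark <= 1.0:
        raise ValueError("benchmark must lie in [0, 1]")

    # Benchmark inside the set: use it unchanged.
    if lower <= benchmark <= upper:
        return benchmark
    # Otherwise use the member of the set closest to the benchmark,
    # which for an interval is the nearer endpoint.
    return lower if benchmark < lower else upper


# Hypothetical numbers: the priors imply a posterior between 0.10 and 0.40.
print(blended_posterior(0.03, 0.10, 0.40))  # benchmark below the set
print(blended_posterior(0.25, 0.10, 0.40))  # benchmark inside the set
print(blended_posterior(0.90, 0.10, 0.40))  # benchmark above the set
```

With an interval, the rule reduces to clamping the benchmark to the endpoints. Sets of posteriors that are not intervals, or other measures of closeness, would need a more general search over the set.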

Local FDR estimation software

30 June 2012

LFDRenrich is a suite of R functions for the estimation of local false discovery rates by maximum likelihood under a two-component or three-component parametric mixture model of 2×2 tables such as those used in gene enrichment analyses.

LFDRhat is a more general suite of R functions for the estimation of local false discovery rates by maximum likelihood under a two-component or three-component parametric mixture model.
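
As a rough illustration of maximum-likelihood LFDR estimation under a two-component mixture, the sketch below fits a toy model in Python. It does not reproduce the LFDRhat or LFDRenrich code or their R interfaces. The model is an assumption made for this example: each z-statistic is drawn from N(0, 1) with probability `pi0` (null) or from N(`mu`, 1) otherwise, and the likelihood is maximized by a crude grid search. All function names are hypothetical.

```python
import math


def normal_pdf(x, mean=0.0):
    """Density of a normal distribution with unit variance."""
    return math.exp(-0.5 * (x - mean) ** 2) / math.sqrt(2.0 * math.pi)


def log_likelihood(z_values, pi0, mu):
    """Log-likelihood of the assumed two-component mixture."""
    return sum(
        math.log(pi0 * normal_pdf(z) + (1.0 - pi0) * normal_pdf(z, mu))
        for z in z_values
    )


def fit_two_component(z_values):
    """Grid-search maximum-likelihood estimates of (pi0, mu)."""
    pi0_grid = [i / 100 for i in range(1, 101)]   # 0.01, ..., 1.00
    mu_grid = [j / 10 for j in range(-60, 61)]    # -6.0, ..., 6.0
    _, pi0_hat, mu_hat = max(
        (log_likelihood(z_values, pi0, mu), pi0, mu)
        for pi0 in pi0_grid
        for mu in mu_grid
    )
    return pi0_hat, mu_hat


def lfdr(z, pi0, mu):
    """Estimated posterior probability that the null holds given z."""
    null_part = pi0 * normal_pdf(z)
    return null_part / (null_part + (1.0 - pi0) * normal_pdf(z, mu))


# Hypothetical data: four statistics near zero and one far from it.
z_values = [0.1, -0.4, 0.3, 4.0, -0.2]
pi0_hat, mu_hat = fit_two_component(z_values)
for z in z_values:
    print(z, round(lfdr(z, pi0_hat, mu_hat), 4))
```

Because the fit needs only a handful of statistics, the same code runs on very small numbers of hypotheses, although the estimates are then correspondingly uncertain. A numerical optimizer would replace the grid search in practice.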

Extending the likelihood paradigm

15 June 2012

D. R. Bickel, “The strength of statistical evidence for composite hypotheses: Inference to the best explanation,” Statistica Sinica 22, 1147–1198 (2012). Full article | 2010 version

Effect-size estimates from hypothesis probabilities

25 February 2012

D. R. Bickel, “Empirical Bayes interval estimates that are conditionally equal to unadjusted confidence intervals or to default prior credibility intervals,” Statistical Applications in Genetics and Molecular Biology 11 (3), art. 7 (2012). Full article | 2010 preprint

The method contributed in this paper adjusts confidence intervals in multiple-comparison problems according to the estimated local false discovery rate. This shrinkage method performs substantially better than standard confidence intervals when the data are independent across comparisons. A special case of the confidence intervals is the posterior median, which provides an improved method of ranking biological features such as genes, proteins, or genetic variants. The resulting ranks of features lead to better prioritization of which features to investigate further.

Estimating probabilities of enrichment

4 January 2012

Z. Yang, Z. Li, and D. R. Bickel, “Empirical Bayes estimation of posterior probabilities of enrichment,” Technical Report, Ottawa Institute of Systems Biology, arXiv:1201.0153 (2011). Full preprint | 2010 seed

This paper adapts novel empirical Bayes methods for the problem of detecting enrichment in the form of differential representation of genes associated with a biological category with respect to a list of genes identified as differentially expressed. A microarray case study illustrates the methods using Gene Ontology (GO) terms, and a simulation study compares their performance. We find that the best-performing enrichment method depends strongly on how many GO terms or other biological categories are of interest.