Archive for the ‘Methods’ Category

Confidence + coherence = fiducial shrinkage

30 June 2012

D. R. Bickel, “A prior-free framework of coherent inference and its derivation of simple shrinkage estimators,” Working Paper, University of Ottawa, deposited in uO Research at http://hdl.handle.net/10393/23093 (2012). 2012 preprint

This paper proposes a new method of shrinking point and interval estimates on the basis of fiducial inference. Since problems with the interpretation of fiducial probability have prevented its widespread use, this manuscript first places fiducial inference within a general framework that has Bayesian and frequentist foundations.

Extending the likelihood paradigm

15 June 2012

D. R. Bickel, “The strength of statistical evidence for composite hypotheses: Inference to the best explanation,” Statistica Sinica 22, 1147-1198 (2012). Full article | 2010 version

Confidence-based decision theory

1 May 2012

D. R. Bickel, “Coherent frequentism: A decision theory based on confidence sets,” Communications in Statistics – Theory and Methods 41, 1478-1496 (2012). Full article (open access) | 2009 version | Simple explanation (link added 27 June 2018)

To combine the self-consistency of Bayesian statistics with the objectivity of frequentist statistics, this paper formulates a framework of inference for developing novel statistical methods. The framework is based on a confidence posterior, a parameter probability distribution that does not require any prior distribution. While the Bayesian posterior is defined in terms of a conditional distribution given the observed data, the confidence posterior is instead defined such that the probability that the parameter value lies in any fixed subset of parameter space, given the observed data, is equal to the coverage rate of the corresponding confidence interval. Inferences based on the confidence posterior are reliable in the sense that the certainty level of a composite hypothesis is a weakly consistent estimate of the 0-1 indicator of hypothesis truth. At the same time, the confidence posterior is as non-contradictory as the Bayesian posterior since both satisfy the same coherence axioms. Using the theory of coherent upper and lower probabilities, the confidence posterior is generalized for situations in which no approximate or exact confidence set is available. Examples of hypothesis testing and estimation illustrate the range of applications of the proposed framework.
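As a rough illustration of the confidence posterior described above, the following sketch treats the simplest textbook case: a normal mean with known standard deviation. The function name, the parameter names, and the choice of model are assumptions made for this example and are not taken from the paper. In this case the confidence posterior of the mean is normal, centered at the sample mean with the standard error as its scale, so the probability assigned to an interval matches the confidence level of that interval.

```python
from math import sqrt
from statistics import NormalDist


def confidence_posterior_prob(xbar, sigma, n, lower, upper):
    """Confidence-posterior probability that a normal mean lies in [lower, upper].

    Assumes (for illustration only) i.i.d. normal data with known standard
    deviation `sigma`, sample size `n`, and observed sample mean `xbar`.
    No prior distribution is used.
    """
    standard_error = sigma / sqrt(n)
    # Confidence distribution of the mean: N(xbar, standard_error^2).
    confidence_dist = NormalDist(mu=xbar, sigma=standard_error)
    return confidence_dist.cdf(upper) - confidence_dist.cdf(lower)


# Hypothetical data summary: sample mean 1.2 from n = 25 observations, sigma = 2.
xbar, sigma, n = 1.2, 2.0, 25
half_width = 1.96 * sigma / sqrt(n)

# The usual 95% confidence interval receives probability of about 0.95.
p_interval = confidence_posterior_prob(xbar, sigma, n, xbar - half_width, xbar + half_width)

# Certainty level of the composite hypothesis "mean > 0".
p_positive = confidence_posterior_prob(xbar, sigma, n, 0.0, float("inf"))
```

Here `p_positive` plays the role of the certainty level of a composite hypothesis; under this simple model it equals one minus the one-sided p-value for testing that the mean is at most zero.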

Additional summaries appear in the abstract and in Section 1.3 of the paper.

How to use priors with caution

13 April 2012

D. R. Bickel, “Controlling the degree of caution in statistical inference with the Bayesian and frequentist approaches as opposite extremes,” Electronic Journal of Statistics 6, 686-709 (2012). Full text (open access) | 2011 preprint

This paper reports a novel probability-interval framework for combining strengths of frequentist and Bayesian methods on the basis of game-theoretic first principles. It enables data analysis on the basis of the posterior distribution that is a blend between a set of plausible Bayesian posterior distributions and a parameter distribution that represents an alternative method of data analysis. This paper’s framework of statistical inference is intended to facilitate the development of new methods to bridge the gap between the frequentist and Bayesian approaches. Four concrete examples illustrate how such intermediate methods can leverage strengths of the two extreme approaches.

Effect-size estimates from hypothesis probabilities

25 February 2012

D. R. Bickel, “Empirical Bayes interval estimates that are conditionally equal to unadjusted confidence intervals or to default prior credibility intervals,” Statistical Applications in Genetics and Molecular Biology 11 (3), art. 7 (2012). Full article | 2010 preprint

The method contributed in this paper adjusts confidence intervals in multiple-comparison problems according to the estimated local false discovery rate. This shrinkage method performs substantially better than standard confidence intervals under the independence of the data across comparisons. A special case of the confidence intervals is the posterior median, which provides an improved method of ranking biological features such as genes, proteins, or genetic variants. The resulting ranks of features lead to better prioritization of which features to investigate further.

Estimating probabilities of enrichment

4 January 2012

Z. Yang, Z. Li, and D. R. Bickel, “Empirical Bayes estimation of posterior probabilities of enrichment,” Technical Report, Ottawa Institute of Systems Biology, arXiv:1201.0153 (2011). Full preprint | 2010 seed

This paper adapts novel empirical Bayes methods to the problem of detecting enrichment, that is, differential representation of the genes associated with a biological category among a list of genes identified as differentially expressed. A microarray case study illustrates the methods using Gene Ontology (GO) terms, and a simulation study compares their performance. We report that the best-performing enrichment method depends strongly on how many GO terms or other biological categories are of interest.

Combining inferences from different methods

28 November 2011

D. R. Bickel, “Resolving conflicts between statistical methods by probability combination: Application to empirical Bayes analyses of genomic data,” Technical Report, Ottawa Institute of Systems Biology, arXiv:1111.6174 (2011). Full preprint

This paper proposes a solution to the problem of combining the results of differing statistical methods that may legitimately be used to analyze the same data set. The motivating application is the combination of two estimators of the probability of differential gene expression: one uses an empirical null distribution, and the other uses the theoretical null distribution. Since there is usually not any reliable way to predict which null distribution will perform better for a given data set and since the choice between them often has a large impact on the conclusions, the proposed hedging strategy addresses a pressing need in statistical genomics. Many other applications are also mentioned in the abstract and described in the introduction.

Minimax strength of statistical evidence

24 November 2011

D. R. Bickel, “A predictive approach to measuring the strength of statistical evidence for single and multiple comparisons,” Canadian Journal of Statistics 39, 610–631 (2011). Full text | Revised preprint | 2010 draft

This paper introduces a novel approach to the multiple comparisons problem by generalizing a promising method of model selection developed by information theorists. The first two sections present that method and its main advantages over conventional approaches without burdening statisticians with unfamiliar terms from coding theory. A quantitative proteomics case study facilitates application of the new method to the analysis of data sets involving multiple biological features. The theorems describe its operating characteristics.

The cited medium-scale paper presented previous minimum description length (MDL) methods. Unlike those methods, the new MDL methods of the current paper are based on a conflation of the normalized maximum likelihood (NML) with the weighted likelihood (WL). The previous MDL methods are used in the CJS article for comparison with its NML/WL methods.

Degree of caution in inference

26 September 2011

D. R. Bickel, “Controlling the degree of caution in statistical inference with the Bayesian and frequentist approaches as opposite extremes,” Technical Report, Ottawa Institute of Systems Biology, arXiv:1109.5278 (2011). Full preprint

This paper’s framework of statistical inference is intended to facilitate the development of new methods to bridge the gap between the frequentist and Bayesian approaches. Three concrete examples illustrate how such intermediate methods can leverage strengths of the two extreme approaches.

Software for local false discovery rate estimation

15 August 2011

LFDR-MLE is a suite of R functions for the estimation of local false discovery rates by maximum likelihood under a two-group parametric mixture model of test statistics.
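
To give a feel for what maximum likelihood estimation of local false discovery rates under a two-group mixture involves, here is a minimal Python sketch. It is not the LFDR-MLE code and does not reproduce its R interface. It assumes, purely for illustration, that null test statistics follow N(0, 1) and that non-null statistics follow N(mu, 1) with a single unknown mean; the function names `fit_two_group` and `local_fdr` are hypothetical. The likelihood is maximized with a basic EM iteration, which finds a local maximum.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist(0.0, 1.0)  # assumed null distribution of the statistics


def local_fdr(z, p0, mu):
    """Local false discovery rate of each statistic: p0*f0(z) / f(z)."""
    alternative = NormalDist(mu, 1.0)
    rates = []
    for zi in z:
        null_part = p0 * STD_NORMAL.pdf(zi)
        rates.append(null_part / (null_part + (1.0 - p0) * alternative.pdf(zi)))
    return rates


def fit_two_group(z, n_iter=200):
    """EM estimates of (p0, mu) for the mixture p0*N(0,1) + (1-p0)*N(mu,1)."""
    p0, mu = 0.5, max(z, key=abs)  # crude starting values
    for _ in range(n_iter):
        # E-step: current posterior probability that each statistic is null.
        w = local_fdr(z, p0, mu)
        # M-step: update the null proportion and the non-null mean.
        p0 = sum(w) / len(z)
        non_null_weight = sum(1.0 - wi for wi in w)
        if non_null_weight > 0.0:
            mu = sum((1.0 - wi) * zi for wi, zi in zip(w, z)) / non_null_weight
    return p0, mu


# Simulated example: 800 null statistics and 200 non-null statistics with mean 3.
random.seed(0)
z = [random.gauss(0.0, 1.0) for _ in range(800)]
z += [random.gauss(3.0, 1.0) for _ in range(200)]

p0_hat, mu_hat = fit_two_group(z)
lfdr = local_fdr(z, p0_hat, mu_hat)
```

Statistics with a small estimated local false discovery rate are the ones most plausibly non-null. A practical implementation would also guard against numerical underflow for extreme statistics and would try several starting values.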