## Was the loss function a mistake?

Laplace’s “introduction of a loss function proved to be a serious mistake, which came to hamper the development of an objective theory of statistical inference to the present day” (Hald, 2007, pp. 3-4).

## Fisherian alternatives to conventional statistics

Novel developments in statistics and information theory call for a reconsideration of important aspects of two of R. A. Fisher’s most controversial ideas: the fiducial argument and the direct use of the likelihood function. Some key features of observed confidence levels, the direct use of the likelihood function, and the minimum description length principle are summarized here:

- Like the fiducial distribution, a probability measure of observed confidence levels is in effect a posterior probability distribution of the parameter of interest that does not require any prior distribution. Derived from sets of confidence intervals, this probability distribution of a parameter of interest is traditionally known as a confidence distribution. When the parameter of interest is scalar, the observed confidence level of a composite hypothesis is equal to its fiducial probability. On the other hand, observed confidence levels do not suffer from the difficulties of constructing a fiducial distribution of a vector parameter.
- The likelihood ratio serves not only as a tool for the construction of point estimators, *p*-values, confidence intervals, and posterior probabilities, but is also fruitfully interpreted as a measure of the strength of statistical evidence for one hypothesis over another through the lens of a family of distributions. Modern versions of Fisher’s evidential use of the likelihood overcome multiplicity problems that arise in standard frequentism without resorting to a prior distribution.
- A related approach is to select the family of distributions using a modern information-theoretic reinterpretation of the likelihood function. In particular, the minimum description length principle extends the scope of Fisherian likelihood inference to the challenging problem of model selection.
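
The first two ideas in the list above can be illustrated with a minimal sketch for the simplest textbook case: a normal mean with known standard deviation. In that case the confidence distribution of the mean is itself normal, centered at the sample mean with standard deviation `sigma / sqrt(n)`. The function names, the data, and the choice of model are assumptions made only for illustration, not the lab's software.

```python
from math import exp
from statistics import NormalDist, mean


def observed_confidence_level(data, sigma, lower, upper):
    """Confidence-distribution probability that the mean lies in (lower, upper).

    Assumes i.i.d. normal data with known standard deviation `sigma`, so the
    confidence distribution is Normal(sample mean, sigma / sqrt(n)).
    """
    n = len(data)
    conf_dist = NormalDist(mu=mean(data), sigma=sigma / n ** 0.5)
    return conf_dist.cdf(upper) - conf_dist.cdf(lower)


def likelihood_ratio(data, sigma, theta1, theta0):
    """Likelihood of mean theta1 relative to mean theta0 (same normal model)."""
    def log_lik(theta):
        # Terms that do not depend on theta cancel in the ratio, so omit them.
        return sum(-0.5 * ((x - theta) / sigma) ** 2 for x in data)
    return exp(log_lik(theta1) - log_lik(theta0))


# Made-up measurements, used only to exercise the functions.
data = [0.8, 1.3, 0.4, 1.9, 1.1]

# Observed confidence level of the composite hypothesis 0 < mean < 2.
level = observed_confidence_level(data, sigma=1.0, lower=0.0, upper=2.0)

# Strength of evidence for mean = 1 over mean = 0.
ratio = likelihood_ratio(data, sigma=1.0, theta1=1.0, theta0=0.0)

print(f"observed confidence level: {level:.3f}")
print(f"likelihood ratio: {ratio:.2f}")
```

Neither quantity requires a prior distribution: the first is read off the confidence distribution, and the second compares two simple hypotheses directly through the likelihood function.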

## Information Theoretic Methods

2010 Workshop on Information Theoretic Methods in Science and Engineering

August 16 – 18, 2010 | Tampere, Finland

## Research at the Statomics Lab

### Overview

At the Statomics Lab, we discover ways to assess complex information relevant to health care, renewable energy, and other applications in the post-genomic era.

David Bickel and the trainees in the Statomics Lab are improving statistical methods of weighing evidence to enable more reliable interpretations of both (1) experimental measurements of transcript, protein, and metabolite levels in the cell and (2) case-control measurements of genomes. A more thorough understanding of these data advances biomedicine and biotechnology, with the aim of higher-quality health care and sustainable energy availability.

### Statistical systems biology

In the first component of the research program, the lab is developing statistical methods for the analysis of gene expression microarray data and other functional genomics data. The methods include the creation and testing of new ways to estimate levels of microarray gene expression. For example, this work covers analogous methods for unpaired data, such as data from proteomics and metabolomics platforms and from single-channel microarrays, as well as reliable estimation of the fold change of each gene. Since the emerging field of lipidomics has a need for such methods of data analysis, David Bickel is a mentor in the CIHR Training Program in Neurodegenerative Lipidomics.
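
As a point of reference for fold-change estimation with unpaired data, the sketch below computes the naive estimate for one gene: the difference between the mean log2 intensities of two independent groups. This is only the standard baseline, not one of the lab's estimators; the function name and the intensities are hypothetical.

```python
from math import log2
from statistics import mean


def log2_fold_change(treatment, control):
    """Naive log2 fold change for one gene from two unpaired groups.

    The groups may differ in size because no sample in one group is
    matched to a sample in the other. Intensities must be positive.
    """
    return mean(log2(x) for x in treatment) - mean(log2(x) for x in control)


# Hypothetical single-channel intensities for one gene.
treatment = [8.0, 16.0, 8.0]
control = [4.0, 4.0, 2.0, 4.0]

lfc = log2_fold_change(treatment, control)
fold_change = 2 ** lfc  # back on the original intensity scale

print(f"log2 fold change: {lfc:.3f}, fold change: {fold_change:.2f}")
```

With thousands of genes and few samples per group, this per-gene estimate is noisy, which is one motivation for more reliable estimators.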

### Inferring genome-wide associations

For the second component of this research program in high-dimensional statistics, the lab is extending methods similar to those developed for gene expression data to genome-wide association (GWA) studies. We are developing and comparing statistical methods of estimating odds ratios that account for multiple comparisons. In particular, we are devising shrinkage estimators for use in the presence of multiple comparisons. We are also creating methods of reliably approximating probabilities of association in order to obtain better point and interval estimates of the effect sizes.
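
To make the idea of shrinking odds-ratio estimates concrete, here is a minimal sketch that uses a generic normal-normal empirical Bayes rule, which is a stand-in and not the lab's method. It assumes that the true log odds ratios across markers are centered at zero, estimates their spread by the method of moments, and pulls each raw estimate toward zero in proportion to its sampling variance. All names and counts are hypothetical.

```python
from math import exp, log
from statistics import mean


def log_odds_ratio(a, b, c, d):
    """Log odds ratio and its approximate variance from a 2x2 table.

    a, b: cases with / without the variant; c, d: controls with / without.
    Adds 0.5 to every cell (Haldane-Anscombe correction) to avoid zeros.
    """
    a, b, c, d = (x + 0.5 for x in (a, b, c, d))
    return log(a * d / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d


def shrink_toward_zero(estimates, variances):
    """Normal-normal empirical Bayes shrinkage with an assumed prior mean of 0."""
    # Method-of-moments estimate of the variance of the true effects.
    tau2 = max(0.0, mean(e * e for e in estimates) - mean(variances))
    return [e * tau2 / (tau2 + v) for e, v in zip(estimates, variances)]


# Hypothetical case-control counts for three markers.
tables = [(30, 70, 20, 80), (25, 75, 25, 75), (45, 55, 20, 80)]
raw, var = zip(*(log_odds_ratio(*t) for t in tables))
shrunk = shrink_toward_zero(raw, var)

for r, s in zip(raw, shrunk):
    print(f"raw OR {exp(r):.2f} -> shrunken OR {exp(s):.2f}")
```

Estimates with larger sampling variance are shrunk more strongly, which tempers the overestimation that results from selecting the most extreme of many markers.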

### More information

For details on the research summarized above, see the lab’s publications and preprints.