## Entropy sightings

Entropy and its many avatars. (English summary)

*J. Math. Soc. Japan* 67 (2015), no. 4, 1845–1857.

94A17 (37A35 60-02 60K35 82B05)

The author, a chief architect of the theory of large deviations, chronicles several manifestations of entropy. Entropy appears in the realms indicated by these section headings:

- Entropy and information theory
- Entropy and dynamical systems
- Relative entropy and large deviations
- Entropy and duality
- Log Sobolev inequality
- Gibbs states
- Interacting particle systems

The topics are connected whenever a concept introduced in one section is treated in more depth in a later section. In this way, relative entropy is seen to play a key role in large deviations, Gibbs states, and systems of interacting particles.

Less explicit connections are left to the reader’s enjoyment and education. For example, the relation between Boltzmann entropy and Shannon entropy in the information theory section is a special case both of Sanov’s theorem, presented in the section on large deviations, and of the relation of free energy and relative entropy, in the section on Gibbs states.
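As a rough numerical illustration of the first of these connections (not taken from the paper under review), the sketch below compares the Boltzmann entropy per symbol, (1/n) log W, with the Shannon entropy H(p), where W is the number of length-n sequences whose empirical distribution is p. The function names and the example distribution are assumptions chosen for illustration. The relative entropy to the uniform distribution, log k − H(p), is included because it is the rate that Sanov's theorem assigns to observing p under uniform sampling.

```python
import math

def shannon_entropy(p):
    """Shannon entropy H(p) = -sum p_i log p_i, in nats."""
    return -sum(x * math.log(x) for x in p if x > 0)

def log_multinomial(counts):
    """log of n! / (n_1! ... n_k!), computed stably via lgamma."""
    n = sum(counts)
    return math.lgamma(n + 1) - sum(math.lgamma(c + 1) for c in counts)

def boltzmann_entropy_per_symbol(p, n):
    """(1/n) log W, with W the number of length-n sequences of type p."""
    counts = [round(x * n) for x in p]
    return log_multinomial(counts) / sum(counts)

def relative_entropy_to_uniform(p):
    """D(p || uniform) = sum p_i log(k p_i) = log k - H(p)."""
    k = len(p)
    return sum(x * math.log(k * x) for x in p if x > 0)

# Hypothetical example distribution on three symbols.
p = [0.5, 0.25, 0.25]
for n in (100, 10_000, 1_000_000):
    # The first printed value approaches the second as n grows.
    print(n, boltzmann_entropy_per_symbol(p, n), shannon_entropy(p))

# Under uniform sampling, (1/n) log P(type = p) = (1/n) log W - log k,
# which tends to -D(p || uniform).
print(relative_entropy_to_uniform(p), math.log(len(p)) - shannon_entropy(p))
```

The gap between the two entropies shrinks roughly like (log n)/n, which is the Stirling-formula correction.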

The paper ends with a tribute to Professor Kiyosi Itô.

Reviewed by David R. Bickel

**References**

- J. Aczél and Z. Daróczy, On Measures of Information and Their Characterizations, Academic Press, New York, 1975. MR0689178
- L. Boltzmann, Über die Mechanische Bedeutung des Zweiten Hauptsatzes der Wärmetheorie, Wiener Berichte, 53 (1866), 195–220.
- R. Clausius, Théorie mécanique de la chaleur, 1ère partie, Paris: Lacroix, 1868.
- H. Cramér, On a new limit theorem in the theory of probability, Colloquium on the Theory of Probability, Hermann, Paris, 1937.
- J. D. Deuschel and D. W. Stroock, Large deviations, Pure and Appl. Math., 137, Academic Press, Inc., Boston, MA, 1989, xiv+307 pp. MR0997938
- M. D. Donsker and S. R. S. Varadhan, Asymptotic evaluation of certain Markov process expectations for large time, IV, Comm. Pure Appl. Math., 36 (1983), 183–212. MR0690656
- A. Feinstein, A new basic theorem of information theory, IRE Trans. Information Theory PGIT-4 (1954), 2–22. MR0088413
- L. Gross, Logarithmic Sobolev inequalities, Amer. J. Math., 97 (1975), 1061–1083. MR0420249
- M. Z. Guo, G. C. Papanicolaou and S. R. S. Varadhan, Nonlinear diffusion limit for a system with nearest neighbor interactions, Comm. Math. Phys., 118 (1988), 31–59. MR0954674
- A. I. Khinchin, On the fundamental theorems of information theory, Translated by Morris D. Friedman, 572 California St., Newtonville MA 02460, 1956, 84 pp. MR0082924
- A. N. Kolmogorov, A new metric invariant of transitive dynamical systems and automorphisms of Lebesgue spaces, (Russian) Topology, ordinary differential equations, dynamical systems, Trudy Mat. Inst., Steklov., 169 (1985), 94–98, 254. MR0836570
- O. Lanford, Entropy and equilibrium states in classical statistical mechanics, Statistical Mechanics and Mathematical Problems, Lecture notes in Physics, 20, Springer-Verlag, Berlin and New York, 1971, 1–113.
- D. S. Ornstein, Ergodic theory, randomness, and dynamical systems, James K. Whittemore Lectures in Mathematics given at Yale University, Yale Mathematical Monographs, No. 5. Yale University Press, New Haven, Conn.-London, 1974, vii+141 pp. MR0447525
- I. N. Sanov, On the probability of large deviations of random magnitudes, (Russian) Mat. Sb. (N. S.), 42 (84) (1957), 11–44. MR0088087
- C. E. Shannon, A mathematical theory of communication, Bell System Tech. J., 27 (1948), 379–423, 623–656. MR0026286
- Y. G. Sinai, On a weak isomorphism of transformations with invariant measure, (Russian) Mat. Sb. (N.S.), 63 (105) (1964), 23–42. MR0161961
- H. T. Yau, Relative entropy and hydrodynamics of Ginzburg-Landau models, Lett. Math. Phys., 22 (1991), 63–80. MR1121850

## Undergraduate research project or internship

Acquire a statistical bioinformatics skill set by developing novel scientific software in the frontiers of genomics for high impact on medical science. Learn to analyze genomics data with newly created statistical methods. Make new biostatistics software accessible worldwide by improving the usability and functionality of the Statomics Lab’s data analysis code and by adding documentation. Providing scientists with these reliable biostatistical tools can advance medical research by improving the accuracy of conclusions drawn from genomics and clinical data.

Scientific breakthroughs from genome-sequencing projects brought the realization that reliable interpretation of the resulting information makes unprecedented demands for contemporaneous advances in computation and mathematical modeling. As the complexity of genomic data sets drives innovative statistics research, the Statomics Lab (http://davidbickel.com) aims to develop and apply novel methodology and algorithms to solve current problems in analyzing gene-expression, proteomics, metabolomics, SNP, ChIP-chip, and/or clinical data.

Intellectual curiosity and high mathematical aptitude are essential, as is the ability to quickly code and debug computer programs. Strong self-motivation and good communication skills are also absolutely necessary. The following qualities are desirable but not required: coursework in bioinformatics, computer science, numerical methods, numerical analysis, software engineering, statistics, and/or biology; familiarity with BUGS, R, S-PLUS, C, Fortran, and/or LaTeX; experience with UNIX or Linux.

To be considered, send a PDF CV that has your GPA and contact information of two references to dbickel@uOttawa.ca with either “research project” or “internship” in the Subject line of the message and with a cover letter in the body of the message. Only those students selected for further consideration will receive a response.

## Statistics & biostatistics graduate studentships

Reliable interpretation of genomic information makes unprecedented demands for innovations in statistical methodology and its application to biological systems. This unique opportunity drives research at the Statomics Lab of the Ottawa Institute of Systems Biology (http://davidbickel.com). The Statomics Lab seeks new graduate students who will conduct original research involving the creation and evaluation of novel statistical tools for application to the analysis of transcriptomics, proteomics, metabolomics, and/or genome-wide-association data.

Each student will work toward an MSc or PhD degree in the Mathematics and Statistics Program at the University of Ottawa. MSc students have the additional option of choosing a Bioinformatics or Biostatistics Specialization. Financial support is available.

Intellectual curiosity and high mathematical aptitude are essential, as is the ability to quickly code and debug computer programs. Strong self-motivation and good communication skills are also absolutely necessary. The following qualities are desirable but not required: coursework in bioinformatics, computer science, numerical methods, numerical analysis, software engineering, statistics, and/or biology; familiarity with BUGS, R, S-PLUS, C, Fortran, and/or LaTeX; experience with UNIX or Linux.

Canadians (by citizenship or permanent residency) are especially encouraged to apply, as are all exceptional students. To be considered, send a PDF CV that has your GPA and contact information of two references to dbickel@uOttawa.ca with either “MSc” or “PhD” and any specialization in the Subject line of the message and with a cover letter in the body of the message. Only those selected for further consideration will receive a response.

## Estimates of the local false discovery rate based on prior information: Application to GWAS

A. Karimnezhad and D. R. Bickel, “Incorporating prior knowledge about genetic variants into the analysis of genetic association data: An empirical Bayes approach,” Working Paper, University of Ottawa, deposited in uO Research at http://hdl.handle.net/10393/34889 (2016). 2016 preprint

## Empirical Bayes single-comparison procedure

D. R. Bickel, “Small-scale inference: Empirical Bayes and confidence methods for as few as a single comparison,” *International Statistical Review* **82**, 457–476 (2014). Published version | 2011 preprint | Simple explanation (link added 21 June 2017)

Parametric empirical Bayes methods of estimating the local false discovery rate by maximum likelihood apply not only to the large-scale settings for which they were developed, but, with a simple modification, also to small numbers of comparisons. In fact, data for a single comparison are sufficient under broad conditions, as seen from applications to measurements of the abundance levels of 20 proteins and from simulation studies with confidence-based inference as the competitor.
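To make the general idea concrete, here is a minimal sketch of estimating the local false discovery rate by maximum likelihood under a generic two-group normal model. It is not the paper's procedure and does not implement the modification for small numbers of comparisons. The model (null N(0, 1), alternative N(delta, 1) with delta > 0), the grid search, the function names, and the z-scores are all assumptions made for illustration.

```python
import math

def normal_pdf(z, mean=0.0):
    """Density of a unit-variance normal distribution at z."""
    return math.exp(-0.5 * (z - mean) ** 2) / math.sqrt(2.0 * math.pi)

def fit_two_group(zs, pi0_grid, delta_grid):
    """Grid-search maximum likelihood for (pi0, delta) in the mixture
    pi0 * N(0, 1) + (1 - pi0) * N(delta, 1)."""
    best = None
    for pi0 in pi0_grid:
        for delta in delta_grid:
            loglik = sum(
                math.log(pi0 * normal_pdf(z) + (1.0 - pi0) * normal_pdf(z, delta))
                for z in zs
            )
            if best is None or loglik > best[0]:
                best = (loglik, pi0, delta)
    return best[1], best[2]

def local_fdr(z, pi0, delta):
    """Posterior probability that z came from the null component."""
    null_part = pi0 * normal_pdf(z)
    alt_part = (1.0 - pi0) * normal_pdf(z, delta)
    return null_part / (null_part + alt_part)

# Hypothetical z-scores, invented for illustration only.
zs = [0.1, -0.3, 0.5, 3.2, 2.9, -0.8, 0.2, 3.5]
pi0_grid = [i / 20 for i in range(1, 20)]      # 0.05, 0.10, ..., 0.95
delta_grid = [0.5 * i for i in range(1, 13)]   # 0.5, 1.0, ..., 6.0

pi0_hat, delta_hat = fit_two_group(zs, pi0_grid, delta_grid)
for z in zs:
    print(z, round(local_fdr(z, pi0_hat, delta_hat), 3))
```

Because the alternative mean is positive in this sketch, the estimated local false discovery rate decreases as z increases, so the largest z-scores are the ones least compatible with the null component.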

## Adaptively selecting an empirical Bayes reference class

F. A. Aghababazadeh, M. Alvo, and D. R. Bickel, “Estimating the local false discovery rate via a bootstrap solution to the reference class problem,” Working Paper, University of Ottawa, deposited in uO Research at http://hdl.handle.net/10393/34295 (2016). 2016 preprint