Quantifying Relative Incomplete Information for Hypothesis Testing in Statistical and Genetic Studies

Xiao-Li Meng
Department of Statistics, Harvard University

This paper attempts to establish a general framework for quantifying the relative amount of missing information in the context of hypothesis testing with incomplete data. The work is motivated by applications to studies, such as linkage analyses and haplotype-based association projects, designed to identify genetic contributions to complex diseases. In these genetic studies, the information measures are used for experimental design, technology comparison, interpretation of the data, and understanding the behavior of some of the inference tools. The central difficulties in constructing such information measures arise from the multiple, and often conflicting, aims in practice. For large samples, we show that a satisfactory, likelihood-based general solution exists by using appropriate forms of the relative Kullback-Leibler information, and that the proposed measures are computationally inexpensive given the maximized likelihoods with the observed data. We illustrate the measures on data from mapping studies of inflammatory bowel disease and stroke. For small-sample problems, which appear rather frequently in practice and sometimes in disguised forms (e.g., measuring an individual's contribution to a large study), the robust Bayesian approach holds great promise, though the choice of a general-purpose "default prior" is still a very challenging problem. We also report several intriguing connections we encountered in our investigation, such as the connection with the fundamental identity for the EM algorithm, the connection with the second CR (Chapman-Robbins) lower information bound, and connections between likelihood ratios and Bayes factors.

(Note: This talk is based on Kong, Nicolae and Meng (2004), which has the
same title and the above abstract.)

Short Course: Information Theory & Statistics
Bin Yu & Mark Hansen
June 1, 2005
Colorado State University Campus
Fort Collins, CO 80523

Graybill Conference
June 2-3, 2005
Hilton Fort Collins
(Formerly: University Park Holiday Inn)
Fort Collins, CO 80526

www.stat.colostate.edu/graybillconference
Last Updated: Friday, May 24, 2005