Wei Pan


Statistical significance analysis of longitudinal gene expression data

Xu Guo (1), Huilin Qi (2), Catherine M. Verfaillie (2), Wei Pan (1)
(1) Division of Biostatistics, University of Minnesota
(2) Department of Medicine, University of Minnesota

Time-course microarray experiments are designed to study biological processes as they unfold over time. Longitudinal gene expression data arise when biological samples taken from the same subject at different time points are used to measure gene expression levels. It has been observed that the expression patterns of samples of a given tumor measured at different time points are likely to be much more similar to each other than the expression patterns of tumor samples of the same type taken from different subjects. In statistics, this phenomenon is called within-subject correlation of repeated measurements, and the resulting data are called longitudinal data. It is well known from other applications that a valid statistical analysis must appropriately account for possible correlation in longitudinal data. For this reason, we apply estimating equation techniques to construct a robust statistic for detecting genes with temporal changes in expression; the statistic is a variant of the generalized Wald statistic and accounts for the correlation in longitudinal gene expression data. We assign significance levels to the proposed statistic either by incorporating the idea of the Significance Analysis of Microarrays (SAM) method (Tusher et al., 2001) or by using the mixture model method (MMM) (Pan et al., 2001), and thereby identify significant genes. The utility of the statistic is demonstrated by applying it to an important study of osteoblast lineage-specific differentiation. Using simulated data, we also show the pitfalls of drawing statistical inference when the correlation in longitudinal gene expression data is ignored.
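To make the role of within-subject correlation concrete, here is a minimal Python sketch of a generic Wald-type statistic for one gene. It is not the statistic proposed in the abstract, whose exact form is not given here. The function names, the successive-difference contrast matrix, and the use of a plain empirical covariance across subjects are all assumptions made for illustration. The second function deliberately discards the off-diagonal covariance terms to mimic an analysis that ignores the correlation.

```python
import numpy as np

def _contrasts(t):
    # Successive-difference contrasts (row i is e_{i+1} - e_i); an illustrative choice.
    return np.diff(np.eye(t), axis=0)

def wald_time_effect(y):
    """Wald-type statistic for H0: equal mean expression at all time points.

    y has shape (n_subjects, n_times) and holds one gene's values.
    Assumes n_subjects >= n_times so the contrast covariance is invertible.
    """
    y = np.asarray(y, dtype=float)
    n, t = y.shape
    mean = y.mean(axis=0)
    resid = y - mean
    # Empirical covariance of the mean vector; the off-diagonal entries
    # carry the within-subject correlation between time points.
    cov_mean = resid.T @ resid / (n * (n - 1))
    L = _contrasts(t)
    d = L @ mean
    V = L @ cov_mean @ L.T
    return float(d @ np.linalg.solve(V, d))

def naive_time_effect(y):
    """Same statistic, but treating time points as uncorrelated (hypothetical comparison)."""
    y = np.asarray(y, dtype=float)
    n, t = y.shape
    mean = y.mean(axis=0)
    resid = y - mean
    # Keep only the per-time-point variances, dropping all covariances.
    var_mean = (resid ** 2).sum(axis=0) / (n * (n - 1))
    L = _contrasts(t)
    d = L @ mean
    V = L @ np.diag(var_mean) @ L.T
    return float(d @ np.linalg.solve(V, d))
```

In this sketch, adding a constant subject-specific offset to every time point of a subject leaves `wald_time_effect` unchanged, because the contrasts cancel it, whereas `naive_time_effect` absorbs the offset into its variance estimates and shrinks. This is one simple way in which ignoring the correlation can distort inference about temporal change.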

Graybill Conference
June 18-20, 2003
University Park Holiday Inn
Fort Collins, CO 80526
email: hari@stat.colostate.edu Fax: (970)491-7895 Phone: (970)491-5269
Last Updated: Wednesday, February 12, 2003