Testing Whether Genetic Variation Explains Correlation of Quantitative Measures of Gene Expression, and Application to Genetic Network Analysis
Zhaoxia Yu
Ph.D. Rice University
Friday, 26, 2007
Genetic networks for gene expression data are often built with graphical models, which in turn are built from pairwise correlations of gene expression levels. A key step in building graphical models is evaluating the conditional independence of two traits, given other traits. Conditional independence is assessed by the partial correlation; if the partial correlation is not significantly different from zero, then conditional independence is assumed. When conditional independence can be assumed, the traits that are conditioned on are considered to “explain” the correlation of a pair of traits, allowing efficient building and interpretation of a network. Overlaying genetic polymorphisms, such as single nucleotide polymorphisms (SNPs), on quantitative measures of gene expression provides a much richer set of data to build a genetic network, because it is possible to evaluate whether sets of SNPs “explain” the correlation of gene expression levels. However, there is strong evidence that gene expression levels are controlled by multiple interacting genes, suggesting that it will be difficult to reduce the partial correlation completely to zero. Ignoring the fact that a set of SNPs can explain at least part, if not all, of the correlation between gene expression levels might miss important clues about the genetic control of gene expression. To enrich the assessment of the causes of correlation between gene expression levels, we develop methods to evaluate whether a set of covariates (e.g., SNPs, or even a set of quantitative expression transcripts) explains at least some of the correlation of gene expression levels. We achieve this by developing a statistical method to compare the marginal correlation of gene expression with the partial correlation, conditional on a set of covariates.
We develop a two-stage procedure to first determine a set of covariates to condition on and then to test whether the set decreases the partial correlation, relative to the marginal correlation. These methods can be used to assist the interpretation of regulation of gene expression and the construction of gene regulation networks.
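To make the two quantities being compared concrete, the following is a minimal Python sketch that computes the marginal correlation of two expression traits and their partial correlation given a covariate. It is not the testing procedure developed in this work: it only illustrates the quantities, using the standard residual-based definition of partial correlation. The function names, the simulated SNP genotype, and the effect sizes are all assumptions chosen for illustration.

```python
import numpy as np

def marginal_corr(x, y):
    """Pearson correlation of two traits, ignoring all covariates."""
    return float(np.corrcoef(x, y)[0, 1])

def partial_corr(x, y, covariates):
    """Correlation of x and y after linearly regressing each on the covariates.

    `covariates` is a 1-D array (one covariate) or an (n, k) array.
    """
    n = len(x)
    # Design matrix with an intercept column followed by the covariates.
    Z = np.column_stack([np.ones(n), covariates])
    # Residuals of each trait after least-squares projection onto Z.
    rx = x - Z @ np.linalg.lstsq(Z, x, rcond=None)[0]
    ry = y - Z @ np.linalg.lstsq(Z, y, rcond=None)[0]
    return float(np.corrcoef(rx, ry)[0, 1])

# Simulated data (hypothetical): one SNP coded 0/1/2 shifts both traits,
# so the traits are correlated only through the shared SNP effect.
rng = np.random.default_rng(0)
n = 500
snp = rng.integers(0, 3, size=n).astype(float)
x = 1.0 * snp + rng.normal(size=n)
y = 1.0 * snp + rng.normal(size=n)

r_marginal = marginal_corr(x, y)
r_partial = partial_corr(x, y, snp)
print(f"marginal correlation: {r_marginal:.3f}")
print(f"partial correlation given SNP: {r_partial:.3f}")
```

In this simulation the SNP is the only source of shared variation, so the partial correlation should fall close to zero while the marginal correlation stays clearly positive. With real data, where several interacting genes contribute, the reduction would typically be partial, which is the situation the comparison of marginal and partial correlation is meant to address.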