Statistical Methods for Detection of Pedigree Errors
in Genetic Linkage Studies
Lei Sun
University of Toronto
Wednesday, 1 December 2004
3:10 PM
E202, Engineering
ABSTRACT
Genetic linkage studies look for regions of the genome that are shared, in excess of what is expected under the null hypothesis of no linkage, by affected relatives. The excess sharing is evaluated assuming a known pedigree that determines the relationships among the affected individuals. Unidentified pedigree errors can have serious consequences for linkage studies, resulting in either reduced power or false positive evidence for linkage. Genome-screen data collected for linkage studies can be used to detect misspecified relationships.
Mathematical models for the underlying segregation and transmission of the chromosomes will be described. Under these models, all the crossover processes in a pedigree can be viewed jointly as a continuous-time Markov random walk on the vertices of a hypercube. In practice, only limited information on this Markov process can be observed and the dimension of the hypercube is generally large. To circumvent the computational difficulties, we construct augmented Markov processes that have substantially reduced numbers of states, and we use a hidden Markov method to calculate the likelihood of observed genotype data for specific pairs of individuals. This allows us to perform hypothesis tests for detection of misspecified relationships and to construct confidence sets of relationships compatible with the data. For complex pedigrees, the likelihood calculations become infeasible. As an alternative, we propose some new statistics that are computationally simpler, yet result in powerful hypothesis tests for detection of pedigree errors. In the case when the null relationship for a pair is rejected, we propose a simple method using the EM algorithm to infer the likely true relationship for the pair.
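As a rough illustration of the hidden Markov likelihood calculation mentioned above, the following Python sketch runs a scaled forward recursion over a sequence of markers. It is a generic toy version, not the PREST implementation: the function name, the state space, and all transition and emission probabilities are hypothetical placeholders, and the reduced-state augmented processes of the talk are not modelled here.

```python
import math


def forward_log_likelihood(init, trans, emit):
    """Log-likelihood of observed data under a discrete hidden Markov model.

    All inputs are illustrative assumptions, not quantities from PREST:
      init[s]        probability of hidden state s at the first marker
      trans[t][r][s] probability of moving from state r at marker t
                     to state s at marker t + 1
      emit[t][s]     probability of the data observed at marker t
                     given hidden state s
    """
    n_states = len(init)

    # Initialise with the first marker, then rescale to avoid underflow.
    alpha = [init[s] * emit[0][s] for s in range(n_states)]
    scale = sum(alpha)
    if scale == 0.0:
        return float("-inf")
    log_lik = math.log(scale)
    alpha = [a / scale for a in alpha]

    # Propagate along the chromosome one marker at a time.
    for t in range(1, len(emit)):
        step = trans[t - 1]
        alpha = [
            emit[t][s] * sum(alpha[r] * step[r][s] for r in range(n_states))
            for s in range(n_states)
        ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]

    return log_lik


# Toy example: two hidden states, two markers, made-up probabilities.
toy_init = [0.5, 0.5]
toy_trans = [[[0.9, 0.1], [0.2, 0.8]]]
toy_emit = [[0.5, 0.1], [0.4, 0.3]]
toy_log_lik = forward_log_likelihood(toy_init, toy_trans, toy_emit)
```

Evaluating such a likelihood under each candidate relationship for a pair, with the relationship determining the initial and transition probabilities, is what would make likelihood-based comparison of relationships possible. The rescaling at each marker keeps the recursion numerically stable over the many markers of a genome screen.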
I will also discuss the implementation of the methods, with applications to several data sets collected for linkage studies. The software, PREST, is freely available at http://utstat.utoronto.ca/sun/Software/Prest. This is joint work with Mary Sara McPeek of the University of Chicago.
Refreshments will be served at 2:45 p.m. in Room 008 of the Statistics Building.