Modeling nucleotide sequence variation in a viral quasispecies

Philip Dixon, Department of Statistics, Iowa State University

A viral quasispecies is a collection of related but distinct genetic sequences. Viral mutation rates are high, so a population founded by a single genetic sequence rapidly diversifies. Some of this sequence variation changes the encoded amino acid sequence; most is neutral. Several groups have recently described temporal changes in a viral quasispecies. These changes are usually summarized using a phylogenetic tree that ignores the temporal structure of the data. A Bayesian hierarchical model offers a better description of the sequence change. It combines a model for the mutation process, a model for the abundance of each sequence over time, and a model for the sampling process. Such a model can account for unobserved mutations. Inference is by Markov chain Monte Carlo (MCMC). The approach is illustrated using data from a long-term study of the porcine reproductive and respiratory syndrome (PRRS) virus, which causes a recently emerged and devastating disease in pigs.
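The three layers named in the abstract (mutation, abundance over time, sampling) can be illustrated with a small forward simulation. This is only a sketch under assumptions that are not taken from the abstract: mutation is treated as independent per-site substitution to a uniformly chosen different base, abundance changes by resampling a fixed-size population in proportion to current abundance, and the observed data are a multinomial sample of sequenced clones at each time point. The function names and parameters (`mutate`, `simulate`, `pop_size`, `n_sampled`, `rate`) are hypothetical, and the sketch does not perform the MCMC inference itself.

```python
import numpy as np

BASES = "ACGT"

def mutate(seq, rate, rng):
    """Assumed mutation layer: each site switches to a different base with probability `rate`."""
    out = list(seq)
    for i, base in enumerate(out):
        if rng.random() < rate:
            out[i] = str(rng.choice([b for b in BASES if b != base]))
    return "".join(out)

def simulate(founder, n_times, pop_size, n_sampled, rate, seed=0):
    """Simulate observed sequence counts at each time point from a single founder."""
    rng = np.random.default_rng(seed)
    population = {founder: pop_size}  # latent abundance of each distinct sequence
    samples = []
    for _ in range(n_times):
        # Assumed abundance layer: offspring choose parents in proportion to abundance.
        seqs = list(population)
        probs = np.array([population[s] for s in seqs], dtype=float)
        probs /= probs.sum()
        parents = rng.choice(len(seqs), size=pop_size, p=probs)
        new_population = {}
        for p in parents:
            child = mutate(seqs[p], rate, rng)
            new_population[child] = new_population.get(child, 0) + 1
        population = new_population
        # Assumed sampling layer: a multinomial draw of n_sampled clones.
        seqs = list(population)
        probs = np.array([population[s] for s in seqs], dtype=float)
        probs /= probs.sum()
        counts = rng.multinomial(n_sampled, probs)
        samples.append({s: int(c) for s, c in zip(seqs, counts) if c > 0})
    return samples

if __name__ == "__main__":
    for t, obs in enumerate(simulate("ACGTACGTAC", 5, 200, 20, 0.01), start=1):
        print(t, obs)
```

Because only `n_sampled` clones are observed at each time point, sequences that exist in the latent population can be missing from the sample, which is one way mutations go unobserved in data of this kind.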

