Models and Methods for Microarray Experiments: Before and After the Fold Change Calculation
Wednesday, 6 April 2005
Department of Statistics
Colorado State University
E202 Engineering Building
Microarrays allow scientists to monitor expression levels of thousands of genes simultaneously. Scientists use microarrays to find relative expression (fold change) under various conditions. While the use of microarrays is now widely accepted, there are many proposed methods for analyzing microarray data. In this dissertation, we develop a framework for comparing methods for microarray data analysis and determination of sample size. We address issues relating to microarray data preprocessing, estimability of quantities of interest and diagnostics for checking model assumptions. We also propose a systematic procedure for transcriptional regulation analysis.
We compare the performance of the most popular methods for analysis of oligonucleotide microarray data (Microarray Suite 5.0, RMA and dChip) using a simulation framework (SimArray). We use a simulation study because it lets us manipulate many aspects of an experiment, including the number of arrays, the amount and sources of variability, the proportion of genes that are affected under a certain experimental condition, and the fold changes. We also discuss the use of SimArray as a sample size calculator, which allows scientists to choose an appropriate number of microarrays for a given experiment based on power and false discovery rate. A unique feature of SimArray is that it begins with probe level information.
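The idea of simulation-based sample size selection can be illustrated with a minimal sketch. This is not SimArray: it works at the gene level rather than the probe level, treats the noise standard deviation as known (a z-test), and uses the Benjamini-Hochberg procedure as one possible way to control the false discovery rate. The function name, parameters and default values are all hypothetical and chosen only for illustration.

```python
import random
from statistics import NormalDist, mean


def simulate_power_fdr(n_arrays, n_genes=2000, prop_affected=0.1,
                       log2_fc=1.0, sigma=0.5, fdr_level=0.05, seed=0):
    """Simulate one two-group experiment with n_arrays arrays per group.

    Returns (power, false discovery proportion). All inputs are
    illustrative assumptions, not values taken from the dissertation.
    """
    rng = random.Random(seed)
    std_normal = NormalDist()
    n_affected = int(round(n_genes * prop_affected))
    # Standard error of the difference in group means, sigma assumed known.
    se = sigma * (2.0 / n_arrays) ** 0.5

    pvalues = []
    for g in range(n_genes):
        # The first n_affected genes carry a true log2 fold change.
        shift = log2_fc if g < n_affected else 0.0
        control = [rng.gauss(0.0, sigma) for _ in range(n_arrays)]
        treated = [rng.gauss(shift, sigma) for _ in range(n_arrays)]
        z = (mean(treated) - mean(control)) / se
        pvalues.append(2.0 * (1.0 - std_normal.cdf(abs(z))))

    # Benjamini-Hochberg step-up: reject the k smallest p-values, where k is
    # the largest rank with p_(k) <= fdr_level * k / n_genes.
    order = sorted(range(n_genes), key=lambda g: pvalues[g])
    n_reject = 0
    for rank, g in enumerate(order, start=1):
        if pvalues[g] <= fdr_level * rank / n_genes:
            n_reject = rank
    rejected = order[:n_reject]

    true_pos = sum(1 for g in rejected if g < n_affected)
    power = true_pos / n_affected if n_affected else 0.0
    fdp = (n_reject - true_pos) / n_reject if n_reject else 0.0
    return power, fdp


if __name__ == "__main__":
    for n in (2, 4, 8):
        power, fdp = simulate_power_fdr(n)
        print(f"arrays per group = {n}: power = {power:.2f}, FDP = {fdp:.2f}")
```

Running the loop over several candidate array counts shows how a scientist might pick the smallest design that reaches a target power at an acceptable false discovery proportion. A realistic calculator would average over many simulated experiments and would model probe level variability.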
A number of data preprocessing steps are taken before a model is fit to microarray data. One such step is normalization, which attempts to correct for systematic array differences. We closely examine some commonly used normalization methods and detail their limitations. We also propose a number of diagnostics that check the effectiveness of the preprocessing and the consistency between model assumptions and observed data.
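As a concrete example of a normalization step, the sketch below implements quantile normalization, the method used in the RMA pipeline. The abstract does not say which normalization methods the dissertation examines, so this choice is an assumption made for illustration, and the function name and data layout are hypothetical.

```python
def quantile_normalize(arrays):
    """Quantile-normalize a list of arrays (one list of intensities per array).

    Every array is forced to share the same empirical distribution: the mean
    of the k-th smallest values across arrays. Ties within an array are
    broken by original position, which is a simplification.
    """
    n_probes = len(arrays[0])
    if any(len(a) != n_probes for a in arrays):
        raise ValueError("all arrays must have the same number of probes")

    # Reference distribution: average of the sorted intensities across arrays.
    sorted_arrays = [sorted(a) for a in arrays]
    reference = [
        sum(s[k] for s in sorted_arrays) / len(arrays) for k in range(n_probes)
    ]

    normalized = []
    for a in arrays:
        # Indices of the probes ordered from lowest to highest intensity.
        order = sorted(range(n_probes), key=lambda i: a[i])
        out = [0.0] * n_probes
        for rank, i in enumerate(order):
            out[i] = reference[rank]
        normalized.append(out)
    return normalized
```

Because every array ends up with an identical intensity distribution, a method like this implicitly assumes that most genes are unaffected by the experimental condition, which is the kind of assumption a diagnostic could be designed to check.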
We propose a unified model that would allow all preprocessing steps to be incorporated into a single model for the analysis of microarray data, and we discuss its relationship to other models.
Finally, we present a case study involving transcriptional regulation analysis, which uses estimated fold changes as input.
Refreshments will be served at 2:45 p.m. in Room 008 of the Statistics Building.