Microarray studies often involve thousands of simultaneous hypothesis tests, making it nearly certain that some positive results will occur merely by chance. This problem has led to renewed interest in statistical methods for multiple testing. Traditional simultaneous testing procedures were designed to control the probability of family-wise error, defined as the occurrence of any false positive in an entire study. This goal is reasonable when only a small number of comparisons is made, for example between each pair of treatment arms in a clinical trial. However, this approach can result in low power to detect true positive results when the number of tests is very large. Family-wise error methods are especially ill-suited to large exploratory studies, where most investigators would willingly accept a small number of false positives. The false discovery rate was proposed as a less conservative alternative for multiple testing. However, false discovery rates are defined in terms of an expectation and do not indicate whether the observed test results are “surprising” under the null hypothesis. This talk will explore the distribution of the number of false positives in microarray studies, estimated using standard randomization procedures. Detailed examination shows that even moderate levels of correlation among the data for individual tests can greatly increase the variability of the number of false positives. I will propose a number of rigorous probability statements based on the randomization distribution, which are less conservative than traditional methods for family-wise error. The described methods are intended as simple and readily interpretable tools for evaluating positive test results in microarray and other studies involving multiple tests.
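
The idea of a randomization distribution for the number of false positives can be illustrated with a minimal sketch. It is not the speaker's method: the two-group design, the Welch t-statistic, the fixed threshold, the shared-factor model for correlated genes, and every function name below are assumptions chosen for illustration. Group labels are permuted repeatedly, and the number of tests exceeding the threshold is recorded each time.

```python
import numpy as np


def t_statistics(data, labels):
    """Welch t-statistic for each row (gene) of a two-group comparison."""
    a = data[:, labels]
    b = data[:, ~labels]
    mean_diff = a.mean(axis=1) - b.mean(axis=1)
    se = np.sqrt(a.var(axis=1, ddof=1) / a.shape[1]
                 + b.var(axis=1, ddof=1) / b.shape[1])
    return mean_diff / se


def null_positive_counts(data, labels, threshold, n_perm=500, seed=0):
    """Number of tests with |t| > threshold under each random relabeling."""
    rng = np.random.default_rng(seed)
    counts = np.empty(n_perm, dtype=int)
    for i in range(n_perm):
        permuted = rng.permutation(labels)  # shuffle the group labels
        counts[i] = np.sum(np.abs(t_statistics(data, permuted)) > threshold)
    return counts


def tail_probability(counts, observed):
    """Estimated P(count >= observed) under randomization (add-one estimate)."""
    return (1 + np.sum(counts >= observed)) / (len(counts) + 1)


def simulate_null_data(n_genes, n_samples, rho, seed=0):
    """Hypothetical null data: genes share one factor, giving correlation rho."""
    rng = np.random.default_rng(seed)
    shared = rng.standard_normal(n_samples)
    noise = rng.standard_normal((n_genes, n_samples))
    return np.sqrt(rho) * shared + np.sqrt(1 - rho) * noise


if __name__ == "__main__":
    labels = np.array([True] * 10 + [False] * 10)
    for rho in (0.0, 0.5):
        data = simulate_null_data(2000, 20, rho)
        counts = null_positive_counts(data, labels, threshold=2.1)
        print(f"rho={rho}: mean={counts.mean():.1f}, sd={counts.std():.1f}")
```

Under these assumptions, the spread of the counts for the correlated data should be visibly wider than for the independent data. `tail_probability` turns the randomization distribution into a statement about how surprising an observed number of positives would be.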

Graybill Conference