| | Clustering, Classification and Validation via the L1 Data Depth
Rebecka Jornsten, Department of Statistics, Rutgers University

Among
the many important tasks in the analysis of microarray data are (i) the
clustering of samples or genes, and (ii) the classification of samples. In
addition, it is useful to have tools for validating the clustering or
classification results. We present two new methods for clustering and
classification based on the intuitively simple concept of data depth. We
demonstrate on real and simulated data that our clustering method, DDclust, can
substantially improve clustering accuracy compared with the popular PAM
algorithm. The data-depth-based classifier, DDclass, is highly competitive with
the best reported methods. We also discuss a validation tool, the Relative Data
Depth (ReD), for clustering and classification. The ReD statistic is an
excellent tool for identifying outliers in clustering and for selecting the number
of clusters. In addition, the ReD statistic is shown to be a reliable indicator
of classification confidence. |
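
The abstract does not spell out how a depth is computed, so the following is only a minimal sketch of the general idea, not the DDclust or DDclass implementation. It assumes the standard equal-weight form of the L1 data depth: one minus the length of the average unit vector pointing from a location to the observations, with a correction for observations that coincide with the location. The function names `l1_depth` and `classify_by_depth`, the maximum-depth assignment rule, and the depth margin returned alongside the label are illustrative assumptions; the margin is meant in the spirit of a relative depth and is not the ReD definition used in the talk.

```python
import math


def l1_depth(x, points):
    """L1 data depth of location x relative to a list of points.

    Assumes equal weights on all points. Returns a value in [0, 1]:
    close to 1 when x is central, close to 0 when x is far outside.
    """
    dim = len(x)
    sums = [0.0] * dim   # running sum of unit vectors from x to each point
    ties = 0             # number of points that coincide with x
    for p in points:
        diff = [pj - xj for pj, xj in zip(p, x)]
        norm = math.sqrt(sum(d * d for d in diff))
        if norm == 0.0:
            ties += 1
            continue
        for j in range(dim):
            sums[j] += diff[j] / norm
    n = len(points)
    mean_norm = math.sqrt(sum(s * s for s in sums)) / n
    # Coincident points contribute their weight as a correction term.
    return 1.0 - max(0.0, mean_norm - ties / n)


def classify_by_depth(x, classes):
    """Assign x to the class in which it is deepest (hypothetical rule).

    classes maps a label to a list of points. Returns the chosen label and
    the gap between the largest and second-largest depth (illustrative only).
    """
    depths = {label: l1_depth(x, pts) for label, pts in classes.items()}
    best = max(depths, key=depths.get)
    ranked = sorted(depths.values(), reverse=True)
    margin = ranked[0] - ranked[1] if len(ranked) > 1 else ranked[0]
    return best, margin


if __name__ == "__main__":
    # Toy data invented for illustration: two well-separated square clusters.
    toy = {
        "a": [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)],
        "b": [(10.0, 10.0), (11.0, 10.0), (10.0, 11.0), (11.0, 11.0)],
    }
    print(classify_by_depth((0.4, 0.6), toy))
```

A small margin in this sketch would flag an observation that is nearly as deep in a competing group as in its assigned one, which is the kind of low-confidence case a relative depth statistic is meant to expose.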