"Everything should be made as simple as possible, but not simpler." - Albert Einstein

Seminar Announcement

Efficient Adaptive Mixture Estimation

Jiayang Sun, Case Western.

Monday, March 9, 2009

4:00 pm, 223 Weber

Data mining is important in scientific research, knowledge discovery and decision making. In this article, we develop and study new adaptive estimation procedures, the partial EM (PEM) and its Bayesian variants (BMAP & BPEM), for analyzing large data sets from heterogeneous populations. The adaptive procedures are computationally fast and provide good alternatives to a full EM for large or streaming data, whether or not a full EM procedure can be run. Under mild conditions, the PEM is further shown to be consistent and efficient, and to exhibit "super-efficiency" when the mixture distribution for the second batch of the data has an extra component (a.k.a. "contaminations" or "intrusions") relative to that of the first batch. Applications to a small data set and a network intrusion data set will also be shown. (This is joint work with Peng Liu and Jiahua Chen.)