Efficient Adaptive Mixture Estimation
Jiayang Sun, Case Western.
Monday, March 9, 2009
4:00 pm 223 Weber
Data mining is important in scientific research, knowledge discovery
and decision making. In this article, we develop and study new adaptive
estimation procedures, the partial EM (PEM) and its Bayesian variants
(BMAP & BPEM) for analyzing large data from heterogeneous populations.
The adaptive procedures are computationally fast and provide good
alternatives to a full EM for large or streaming data, whether or not a
full EM procedure can be run. Under mild conditions, the PEM is
further shown to be consistent and efficient, and to have a
"super-efficiency" in the case that a mixture distribution for the
second batch of the data has an extra component (a.k.a.
"contaminations" or "intrusions") relative to that of the first batch of the
data. Applications to a small data set and a network intrusion
data set will also be shown. (This is joint work with Peng Liu and