Data Mining Techniques for Temporal Point Processes Applied to Insurance Claims Data
Todd Iverson, Ph.D. Candidate, Department of Statistics, Colorado State University
Tuesday, June 19, 2008
9 a.m. 006 Statistics
In this talk we address the problem of predicting patients
at risk of type 2 diabetes using insurance claims data.
We propose and test several methods that use support vector
machines, which are kernel-based classifiers.
We designed several kernels specifically for this
task and compared their performance on three years of
insurance claims data derived from the MarketScan database.
The resulting classifier is able to predict patients at risk
of type 2 diabetes with nearly 80% success when combining
several of the proposed kernels.
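
As a rough illustration of what combining kernels can look like in practice, the sketch below averages cosine-normalized Gram matrices; a non-negative weighted sum of valid kernels is itself a valid kernel. The feature vectors, the helper names (linear_gram, normalize_gram, combine_grams), and the equal weighting are assumptions made for this example only and are not taken from the talk.

```python
import math

def linear_gram(X):
    """Gram matrix of the linear kernel for a list of feature vectors."""
    return [[sum(a * b for a, b in zip(x, y)) for y in X] for x in X]

def normalize_gram(K):
    """Cosine-normalize a Gram matrix so every diagonal entry equals 1."""
    n = len(K)
    return [[K[i][j] / math.sqrt(K[i][i] * K[j][j]) for j in range(n)]
            for i in range(n)]

def combine_grams(grams, weights=None):
    """Weighted sum of normalized Gram matrices (weights must be non-negative)."""
    if weights is None:
        weights = [1.0 / len(grams)] * len(grams)  # assumption: equal weights
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative to keep a valid kernel")
    normalized = [normalize_gram(K) for K in grams]
    n = len(normalized[0])
    return [[sum(w * K[i][j] for w, K in zip(weights, normalized))
             for j in range(n)] for i in range(n)]

# Hypothetical features for three patients: counts of three claim codes,
# and the number of visits. Neither comes from the MarketScan data.
code_counts = [[1, 0, 2], [0, 1, 1], [1, 1, 0]]
visit_counts = [[3], [1], [2]]

K_combined = combine_grams([linear_gram(code_counts), linear_gram(visit_counts)])
```

A combined matrix of this kind can be handed to any SVM implementation that accepts a precomputed kernel, for example scikit-learn's SVC with kernel="precomputed".
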
The specific form of the data, that of a sequence with timing
information attached to it, led us to develop two new kernels
inspired by dynamic time warping. The Global Time Warping (GTW)
and Local Time Warping (LTW) kernels build on an existing
time warping kernel by including the timing coefficients
present in classical time warping, while providing a solution for
the diagonal dominance present in most alignment methods. We show
that the LTW kernel performs significantly better than the existing
time warping kernel when the times contain relevant information.
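
For readers unfamiliar with the alignment idea behind these kernels, the sketch below implements classical dynamic time warping over sequences of (code, gap) events, where gap is the number of days since the previous event. It is not the GTW or LTW kernel from the talk; the local cost function event_cost and its time_weight parameter are hypothetical choices made only to show how timing information can enter the alignment cost.

```python
def dtw_distance(a, b, cost):
    """Classical dynamic time warping distance between sequences a and b."""
    inf = float("inf")
    n, m = len(a), len(b)
    # D[i][j] holds the cheapest cost of aligning a[:i] with b[:j].
    D = [[inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = cost(a[i - 1], b[j - 1])
            D[i][j] = step + min(D[i - 1][j],      # a[i-1] also matches earlier b
                                 D[i][j - 1],      # b[j-1] also matches earlier a
                                 D[i - 1][j - 1])  # both sequences advance
    return D[n][m]

def event_cost(x, y, time_weight=0.1):
    """Hypothetical local cost: 0/1 code mismatch plus a weighted gap difference."""
    (code_x, gap_x), (code_y, gap_y) = x, y
    return float(code_x != code_y) + time_weight * abs(gap_x - gap_y)

# Made-up claim sequences: (claim code, days since the previous claim).
patient_a = [("A", 0), ("B", 30), ("C", 10)]
patient_b = [("A", 0), ("B", 30), ("B", 5), ("C", 10)]

distance = dtw_distance(patient_a, patient_b, event_cost)
```

Setting time_weight to 0 reduces the cost to a pure code comparison, which is one simple way to check whether the timing information changes the alignment.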