"Everything should be made as simple as possible, but not simpler." - Albert Einstein

Seminar Announcement

Data Mining Techniques for Temporal Point Processes Applied to Insurance Claims Data

Todd Iverson, Ph.D. Candidate, Department of Statistics, Colorado State University

Tuesday, June 19, 2008

9 a.m., 006 Statistics

In this talk we address the problem of predicting patients at risk of type 2 diabetes using insurance claims data. We propose and test several methods based on support vector machines, which are kernel-based classifiers. We designed several kernels specifically for this task and compared their performance on three years of insurance claims data derived from the MarketScan database. When several of the proposed kernels are combined, the resulting classifier predicts patients at risk of type 2 diabetes with nearly 80% success.
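
As a rough illustration of what combining kernels can mean, the sketch below forms a weighted sum of two generic kernels; a non-negative weighted sum of valid kernels is itself a valid kernel, so the result can be handed to any support vector machine that accepts a precomputed Gram matrix. The linear and RBF kernels, the weights, and the toy feature vectors are assumptions chosen for illustration only; they are not the kernels or the combination scheme presented in the talk.

```python
from math import exp

def linear_kernel(x, y):
    # Plain dot product between two feature vectors.
    return sum(xi * yi for xi, yi in zip(x, y))

def rbf_kernel(x, y, gamma=0.5):
    # Gaussian (RBF) kernel; gamma is an illustrative choice.
    return exp(-gamma * sum((xi - yi) ** 2 for xi, yi in zip(x, y)))

def combine(kernels, weights):
    # Non-negative weighted sum of kernels (hypothetical helper).
    def combined(x, y):
        return sum(w * k(x, y) for w, k in zip(weights, kernels))
    return combined

def gram_matrix(kernel, data):
    # Matrix of pairwise kernel values, as consumed by an SVM solver.
    return [[kernel(a, b) for b in data] for a in data]

# Toy feature vectors standing in for per-patient summaries (made up).
patients = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
combined = combine([linear_kernel, rbf_kernel], [1.0, 2.0])
K = gram_matrix(combined, patients)
```

The matrix K is symmetric, and its diagonal holds each patient's combined self-similarity.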

The specific form of the data, a sequence of events with timing information attached, led us to develop two new kernels inspired by dynamic time warping. The Global Time Warping (GTW) and Local Time Warping (LTW) kernels build on an existing time warping kernel by including the timing coefficients present in classical time warping, while also providing a solution to the diagonal dominance present in most alignment methods. We show that the LTW kernel performs significantly better than the existing time warping kernel when the times contain relevant information.
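
For readers unfamiliar with the underlying idea, the following is a minimal sketch of classical dynamic time warping applied to sequences of (time, code) pairs. It is not the GTW or LTW kernel from the talk: it computes an alignment cost rather than a kernel, and the local cost (a 0/1 code mismatch plus a weighted time difference), the `time_weight` parameter, and the example claim codes are all assumptions made for illustration.

```python
from math import inf

def timed_dtw(a, b, time_weight=1.0):
    """Classical DTW cost between two sequences of (time, code) pairs.

    Assumed local cost: 1 if the codes differ, plus
    time_weight * |time difference| between the aligned events.
    """
    n, m = len(a), len(b)
    # cost[i][j] = cheapest alignment of a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            (ta, ca), (tb, cb) = a[i - 1], b[j - 1]
            local = (0.0 if ca == cb else 1.0) + time_weight * abs(ta - tb)
            # Extend the best of the three admissible predecessor alignments.
            cost[i][j] = local + min(cost[i - 1][j],
                                     cost[i][j - 1],
                                     cost[i - 1][j - 1])
    return cost[n][m]

# Hypothetical claim histories: (time since enrollment, claim code).
patient_a = [(0, "A"), (5, "B")]
patient_b = [(0, "A"), (7, "B")]
distance = timed_dtw(patient_a, patient_b, time_weight=0.5)
```

Here the codes match pairwise and only the second event is shifted by two time units, so the cost comes entirely from the timing term.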