Suboptimality of MDL and Bayes in Classification under Misspecification
Peter Grünwald
CWI Amsterdam/EURANDOM Eindhoven
We show that forms of Bayesian and MDL learning that are often applied
to classification problems can be *statistically inconsistent*.
We present a classification model (a large family of classifiers) and a
data generating distribution such that the best classifier within the
model has generalization error (expected 0/1-prediction loss) almost 0.
Nevertheless, no matter how many data are observed, both the classifier
inferred by MDL and the classifier based on the Bayesian posterior will
behave much worse than this best classifier in the sense that their
expected 0/1-prediction loss is substantially larger. The reason is
that, in order to apply Bayes and MDL to models consisting of
classifiers, these classifiers first have to be converted into
probability distributions or, equivalently, codes. The standard method
for doing this is the logistic transformation. However, the resulting
probability models cannot be expected to contain the true distribution,
and this causes the problem: our result can be re-interpreted as showing
that under misspecification, Bayes and MDL do not always converge to the
distribution in the model that is closest in KL divergence to the data
generating distribution. We compare this result with earlier results on
Bayesian inconsistency by Diaconis, Freedman and Barron.
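The logistic transformation and the notion of the KL-closest model element can be illustrated with a small sketch. This is a toy illustration only, with made-up names (`logistic_model`, `eta`, and so on) and an invented two-point distribution; it is not the construction from the paper:

```python
import math

# Illustrative sketch (not the paper's exact construction): a 0/1-valued
# classifier c is turned into a conditional probability model via the
# logistic transformation. For a weight eta >= 0, the model puts
# probability e^eta / (1 + e^eta) on the label c predicts.

def logistic_model(classifier, eta):
    """P(Y=1 | X=x) under the logistic transformation of `classifier`."""
    def p_y1_given_x(x):
        # margin is +eta if the classifier predicts 1, -eta if it predicts 0
        margin = eta if classifier(x) == 1 else -eta
        return 1.0 / (1.0 + math.exp(-margin))
    return p_y1_given_x

def expected_01_loss(classifier, xs, p_x, p_y1):
    """Generalization error: probability that classifier(x) != y."""
    return sum(p_x[x] * (p_y1[x] if classifier(x) == 0 else 1.0 - p_y1[x])
               for x in xs)

def conditional_kl(xs, p_x, p_y1, model):
    """Expected KL divergence from the true P(Y|X) to the model's P(Y|X)."""
    kl = 0.0
    for x in xs:
        q1 = model(x)
        for p, q in ((p_y1[x], q1), (1.0 - p_y1[x], 1.0 - q1)):
            if p > 0.0:
                kl += p_x[x] * p * math.log(p / q)
    return kl

# Toy example: two x-values, and a classifier that predicts y = x.
xs = [0, 1]
p_x = {0: 0.5, 1: 0.5}            # marginal distribution of X
p_y1 = {0: 0.1, 1: 0.9}           # true P(Y=1 | X=x)
c = lambda x: x

print(expected_01_loss(c, xs, p_x, p_y1))                          # ≈ 0.1
# KL to the truth shrinks as eta approaches the true log-odds ln 9:
print(conditional_kl(xs, p_x, p_y1, logistic_model(c, 0.5)))
print(conditional_kl(xs, p_x, p_y1, logistic_model(c, math.log(9))))
```

In this toy case the KL-optimal weight recovers the true log-odds, because the two-point model happens to be well specified; the paper's point is precisely that when the logistic model is misspecified, the distribution Bayes and MDL converge to need not be the KL-closest one, nor correspond to the classifier with smallest 0/1 loss.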
Joint work with John Langford of the Toyota Technological Institute,
Chicago.