Spatial Models with Applications in Computer Experiments
Ke Wang
Ph.D. Candidate, Colorado State University, Department of Statistics
Thursday, March 1, 2007
1:00 p.m.
Much scientific research involves conducting computer experiments to study complicated physical phenomena. In a typical computer experiment, a high-dimensional vector x ∈ R^d is used as input to a computer code, yielding an output y(x). The output y is deterministic, i.e., running the code with the same inputs x gives the same outputs. Because the code is expensive to execute, one of the major goals of computer experiments is to find an approximation model (metamodel) that is close to the true code but faster to run. A statistical approach to the problem is to model the response y(x) as a realization of a stochastic process and to construct a predictor appropriate for that process. For example, a stationary Gaussian process (GP) leads to kriging, or empirical best linear unbiased prediction (BLUP), which is a popular technique in computer experiments. A closely related approach is Bayesian prediction of the deterministic function under a GP model, which has also been studied extensively during the past several years.
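The kriging idea above can be sketched in a few lines of Python. This is a minimal illustration only, assuming a constant mean, a Gaussian correlation function, and a fixed correlation parameter theta (in empirical BLUP, theta would be estimated from the data). The names gaussian_corr, krige, and toy_code are hypothetical, and toy_code stands in for an expensive computer code.

```python
import numpy as np

def gaussian_corr(A, B, theta):
    """Stationary Gaussian correlation: exp(-theta * ||a - b||^2)."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-theta * d2)

def krige(X, y, X0, theta=10.0, nugget=1e-10):
    """Kriging (BLUP) predictor with a constant mean, for a fixed theta.

    Assumption: theta is supplied rather than estimated; the tiny nugget
    is only there for numerical stability of the linear solves.
    """
    n = len(y)
    R = gaussian_corr(X, X, theta) + nugget * np.eye(n)
    ones = np.ones(n)
    # Generalized least squares estimate of the constant mean.
    mu_hat = (ones @ np.linalg.solve(R, y)) / (ones @ np.linalg.solve(R, ones))
    # Correlations between the new sites and the design sites.
    r0 = gaussian_corr(X0, X, theta)
    return mu_hat + r0 @ np.linalg.solve(R, y - mu_hat * ones)

def toy_code(X):
    """Hypothetical deterministic 'computer code' with a one-dimensional input."""
    return np.sin(2 * np.pi * X[:, 0]) + X[:, 0] ** 2

X = np.linspace(0.0, 1.0, 8).reshape(-1, 1)   # design sites
y = toy_code(X)                               # deterministic outputs
X0 = np.array([[0.33], [0.71]])               # unobserved input sites
pred = krige(X, y, X0)
```

Because the output is deterministic, the predictor interpolates the observed runs: krige(X, y, X) reproduces y up to numerical error.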
Though a GP modeling approach can easily accommodate prior knowledge in the form of covariance functions, can produce prediction intervals, and is conceptually straightforward to implement, it has limitations. The stationarity assumption is a severe restriction for inhomogeneous functions whose smoothness varies considerably over the input space. Outputs of computer experiments do have such heterogeneous features, so a stationary GP model will oversmooth in some regions while undersmoothing in others. To overcome this limitation, both nonstationary GP and regression techniques have been developed. Gaussian processes with nonstationary covariance functions work well in low-dimensional (R^2 or R^3) input spaces, but their application in high-dimensional spaces is intractable. Adaptive regression splines and regression trees have been applied to computer experiments; the number and locations of the knots are determined adaptively from the data to account for inhomogeneity. Artificial neural networks can also be used to model heterogeneous functions, with “hidden” layers introducing flexibility into the model. These approaches have only implicit covariance functions, and because of their large number of coefficients, the fitted models have no clear interpretation. To overcome the limitations of existing approaches, we propose a new spatial model that allows for more flexibility in capturing the salient features of computer outputs.
Our work is illustrated by the Susceptible-Infected-Recovered (SIR) model, an epidemiological model of the spread of a disease in a population. The dynamics are described by a system of ordinary differential equations that can be solved numerically. The SIR model has a high-dimensional input space and provides not only the quantities of interest but also their first partial derivatives. It raises many interesting statistical problems, such as predicting responses at unobserved inputs, uncertainty analysis, and incorporating the derivative information into the model.
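To make the notion of a deterministic computer code concrete, the sketch below solves the textbook three-compartment SIR equations with a fixed-step Runge-Kutta scheme. It is an assumption-laden stand-in: the SIR model used in this work has a high-dimensional input and also returns derivatives, whereas this toy version has only two inputs (a transmission rate beta and a recovery rate gamma), and the function names are hypothetical.

```python
import numpy as np

def sir_rhs(state, beta, gamma):
    """Right-hand side of the basic SIR equations (population fractions)."""
    s, i, r = state
    new_infections = beta * s * i
    recoveries = gamma * i
    return np.array([-new_infections, new_infections - recoveries, recoveries])

def solve_sir(beta, gamma, s0=0.99, i0=0.01, t_end=100.0, n_steps=2000):
    """Integrate the SIR system with the classical fourth-order Runge-Kutta method."""
    h = t_end / n_steps
    traj = np.empty((n_steps + 1, 3))
    traj[0] = (s0, i0, 1.0 - s0 - i0)
    for k in range(n_steps):
        y = traj[k]
        k1 = sir_rhs(y, beta, gamma)
        k2 = sir_rhs(y + 0.5 * h * k1, beta, gamma)
        k3 = sir_rhs(y + 0.5 * h * k2, beta, gamma)
        k4 = sir_rhs(y + h * k3, beta, gamma)
        traj[k + 1] = y + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
    return traj

def peak_infected(x):
    """Scalar output y(x): the largest infected fraction, for input x = (beta, gamma)."""
    beta, gamma = x
    return solve_sir(beta, gamma)[:, 1].max()

y_run = peak_infected((0.5, 0.1))
```

Running peak_infected twice with the same input returns the same value, which is the sense in which the output of a computer experiment is deterministic.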
We model the deterministic computer response as a realization of a stochastic heteroscedastic process (SHP). The SHP is a non-Gaussian, stationary process. Conditional on a latent process, the SHP is a Gaussian process with a nonstationary covariance function. Its versatile sample paths show that the new model is more flexible than the traditional GP model. The SHP model can also recover Gaussian-like sample paths for certain values of the model parameters.
For parameter estimation, we propose to use maximum likelihood. Because the latent process is high-dimensional, we develop an importance sampling method for computing the likelihood. Our preliminary simulation results show that this importance sampling method is computationally efficient. Based on the estimated model parameters, we can estimate the latent process at the sampled locations and predict it at unobserved input sites x0. Empirical BLUP is then used to predict y(x0). We also propose an alternative method, called the pseudo-best predictor (BP), which predicts responses from the estimated model parameters and latent process values.
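The general idea of computing a likelihood by importance sampling over a latent quantity can be illustrated on a deliberately tiny example. Everything here is an assumption made for illustration and is not the sampler developed in this work: a single observation y is N(0, exp(z)) given a scalar latent log-variance z with a N(0, tau^2) prior, and the proposal is a wider normal centered at zero. A brute-force grid sum is included only as a check, which is feasible here because the latent variable is one-dimensional.

```python
import numpy as np

def log_lik_given_z(y, z):
    """log p(y | z) for the toy model y | z ~ N(0, exp(z))."""
    return -0.5 * (np.log(2 * np.pi) + z + y ** 2 * np.exp(-z))

def log_normal_pdf(z, mean, sd):
    """Log density of N(mean, sd^2) evaluated at z."""
    return -0.5 * np.log(2 * np.pi) - np.log(sd) - 0.5 * ((z - mean) / sd) ** 2

def is_marginal_likelihood(y, tau=1.0, n=20000, seed=0):
    """Importance sampling estimate of p(y) = integral of p(y | z) p(z) dz."""
    rng = np.random.default_rng(seed)
    q_sd = 1.5 * tau                        # proposal wider than the prior
    z = rng.normal(0.0, q_sd, size=n)       # draws from the proposal q
    log_w = (log_lik_given_z(y, z)
             + log_normal_pdf(z, 0.0, tau)     # prior p(z)
             - log_normal_pdf(z, 0.0, q_sd))   # proposal q(z)
    return np.exp(log_w).mean()

def grid_marginal_likelihood(y, tau=1.0):
    """Reference value from a fine grid sum over the latent variable."""
    z = np.linspace(-12.0, 12.0, 24001)
    dz = z[1] - z[0]
    integrand = np.exp(log_lik_given_z(y, z) + log_normal_pdf(z, 0.0, tau))
    return integrand.sum() * dz

estimate = is_marginal_likelihood(1.5)
reference = grid_marginal_likelihood(1.5)
```

With a high-dimensional latent process a grid is no longer available, which is why a well-chosen proposal distribution matters for the efficiency of the sampling estimate.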
Our spatial model can be adapted to model the first partial derivative process. The derivative process provides additional information about the shape and smoothness of the underlying deterministic function and can assist in the prediction of responses at unobserved sites. The unconditional correlation function for the derivative process presents some interesting properties, and can be used as a new class of spatial correlation functions.
Part of this dissertation will explore the Bayesian approach for parameter estimation and prediction. We will fit the SHP model to some computer experiment data, such as outputs from the SIR model. If possible, we would like to combine derivative information in prediction. The major difficulties of bringing in derivative information are the increase in the dimensionality of the latent process and the numerical problems of inverting the enlarged covariance matrix.