"Model Choice for Agricultural Data: A Comparison of Regression-Type Models to Predict Weed Counts"
Maggie Stanislawski
Master's Candidate, Department of Statistics, Colorado State University
Wednesday, July 25, 2007
Statistical models are used widely in agricultural science, both for prediction and to explore the factors that affect pests and yield. Agricultural data do not usually meet the underlying assumptions of linear regression models, but these models are commonly used for many applications. A valid statistical model is extremely important for valid biological inference. For regression-type models, model building typically focuses on selecting the most suitable covariates to include in a model. In this analysis, we took a broader perspective; we aimed to find the most appropriate types of statistical models for weed count data that were gathered via a spatially dense sampling design from two grower-managed fields (Field 6 and Field 39) in Eastern Colorado from 1997 to 2000. Our goal was to investigate and compare a number of models to predict the observed weed patterns for pigweed (Amaranthus retroflexus L.) and nightshade (Solanum sarrachoides Sendtner) across these years. After finding a sound statistical model, we aimed to explore the biological factors that influence weed distributions. We first investigated the issue of spatial autocorrelation in the Field 39 data. We fit linear regression models with and without a trend surface, as well as spatial regression models. Various model selection methods and outlier analysis were performed to select an appropriate model for each year/weed combination. Since the R² values of the final models were relatively low, we concluded that alternative models should also be explored. We then compared negative binomial (NB), zero-inflated negative binomial (ZINB), and linear regression models (with a trend surface) for the analysis of the data from both fields. According to Akaike's information criterion (AIC), the ZINB and NB models were consistently better than the linear regression models. Predictions and the signs and significances of the regression coefficients were also compared, and these results tended to be consistent across model types. However, ZINB models allow for separate analysis of the factors that influence weed density and weed absence. This additional information is a major benefit, particularly for site-specific pest management.
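
For readers unfamiliar with the NB/ZINB comparison, the following is a minimal Python sketch of how an AIC comparison between the two count models works. It is illustrative only and is not the analysis code from this study. The counts are hypothetical, the models are intercept-only (no covariates or trend surface), and a coarse grid search stands in for a proper maximum-likelihood optimizer. The names nb_logpmf, zinb_logpmf, aic, and best_loglik are invented for this sketch.

```python
import math
from itertools import product

def nb_logpmf(y, mu, alpha):
    """Log-probability of count y under a negative binomial with mean mu
    and dispersion alpha (variance = mu + alpha * mu**2)."""
    r = 1.0 / alpha          # "size" parameter
    p = r / (r + mu)         # success probability
    return (math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
            + r * math.log(p) + y * math.log(1.0 - p))

def zinb_logpmf(y, mu, alpha, pi):
    """Zero-inflated NB: with probability pi the count is a structural zero
    (weed absence); otherwise it follows the NB above. Requires 0 <= pi < 1."""
    nb = math.exp(nb_logpmf(y, mu, alpha))
    if y == 0:
        return math.log(pi + (1.0 - pi) * nb)
    return math.log((1.0 - pi) * nb)

def aic(log_likelihood, n_params):
    """Akaike's information criterion: 2k - 2 log L (smaller is better)."""
    return 2.0 * n_params - 2.0 * log_likelihood

def best_loglik(counts, logpmf, grids):
    """Assumption: a coarse grid search approximates the maximized log-likelihood."""
    return max(sum(logpmf(y, *params) for y in counts)
               for params in product(*grids))

# Hypothetical weed counts with many zeros (not data from the study).
counts = [0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 3, 0, 12, 0, 0, 5, 2, 0, 0, 7]

mus = [0.5 * i for i in range(1, 21)]      # candidate means, 0.5 to 10
alphas = [0.25 * i for i in range(1, 21)]  # candidate dispersions, 0.25 to 5
pis = [0.05 * i for i in range(0, 19)]     # candidate zero-inflation, 0 to 0.9

nb_ll = best_loglik(counts, nb_logpmf, [mus, alphas])
zinb_ll = best_loglik(counts, zinb_logpmf, [mus, alphas, pis])

print("NB   AIC:", aic(nb_ll, 2))    # 2 parameters: mu, alpha
print("ZINB AIC:", aic(zinb_ll, 3))  # 3 parameters: mu, alpha, pi
```

Because the ZINB model has one more parameter, its AIC is lower only when the extra zero-inflation term improves the log-likelihood by more than one unit. In a full analysis, mu and pi would each depend on covariates, which is what allows weed density and weed absence to be examined separately.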