Modeling and Predicting Median Substrate Size in Oregon and
Washington Streams Utilizing Geographic Information Systems
Julia J. Smith
Department of Statistics
Colorado State University
Monday, December 12, 2005
E204 Engineering Building
Median substrate size is a statistic that summarizes the size of the substrate in a stream. It can be indicative of stream health, as it is used to estimate bed load transport, stream power, microinvertebrate habitat, and the spawning habits of some fish species. In its Environmental Monitoring and Assessment Program (EMAP), the U.S. Environmental Protection Agency collected data, including a measure of median substrate size (LD 50), at 485 streams in Oregon and Washington between 1994 and 2004. Using EMAP data and Geographic Information System data compiled at the University of Colorado, several models were created to predict LD 50 using only predictors that did not need to be measured on-site. The goal was to create a model with a small subset of the available variables so that LD 50 could be predicted without on-site sampling. The coarse nature of LD 50 and the large set of predictors made this task difficult to accomplish. Three approaches were used in this analysis: stepwise variable selection with multiple regression, classification and regression trees, and a hybrid of multiple regression and classification and regression trees. The hybrid models provided the best predictions, at the cost of parsimony.
The Coast Range Ecoregion had a less coarse distribution of LD 50 than the entire data set. We present a separate analysis for this ecoregion utilizing the same methods. The goal was to find a model that allows prediction without on-site sampling and can be applied to regions with similar ecosystems. These models had better predictive capabilities than those for the entire data set, and the hybrid models again provided the best predictions.