"Everything should be made as simple as possible, but not simpler." - Albert Einstein

Seminar Announcement

Habitat Estimation through Synthesis of Species Presence/Absence Information and Environmental Covariate Data

Grant Dornan, Department of Statistics, Colorado State University

Wednesday, October 26, 2011

12:00 p.m. - 1:00 p.m., Statistics Building, room 006


This paper investigates the statistical model developed by Foster et al. (2011) to estimate marine habitat maps from environmental covariate data and species presence/absence information while treating habitat definition probabilistically. The model assumes that two sites belonging to the same habitat have approximately the same species presence probabilities, so both environmental data and species presence observations help to distinguish habitats at locations across a study region. I develop a computational method to estimate the model parameters by maximum likelihood using a blocked nonlinear Gauss-Seidel algorithm. The main part of my work is developing and conducting simulation studies to evaluate estimation performance and to study related questions, including the impacts of sample size, model bias, and model misspecification. Seven test scenarios are developed, with between 3 and 9 habitats, 15 and 40 species, and 150 and 400 sampling sites. Estimation performance is evaluated primarily through fitted habitat maps and is shown to be excellent for the seven scenarios examined: rates of successful habitat classification range from 0.92 to 0.98. I show that there is a roughly balanced tradeoff between increasing the number of sites and increasing the number of species for improving estimation performance. Standard model selection techniques are shown to work for selection of covariates, but selection of the number of habitats benefits from supplementing quantitative techniques with qualitative expert judgement. Although habitat boundaries are estimated very accurately, the rate of probabilistic transition between habitats is shown to be difficult to estimate accurately. Future research should address this issue. An appendix to this thesis includes a comprehensive, annotated collection of the R code developed during this project.
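
As a rough illustration of the kind of model described above, the sketch below treats each site as a draw from a finite mixture of habitats. The specific form is an assumption made for illustration and is not taken from the abstract: habitat membership probabilities are a multinomial-logit function of the site's covariates, and species presences are independent Bernoulli outcomes given the habitat. All names (habitat_priors, posterior_membership, beta, theta) and the toy numbers are hypothetical, and the sketch is written in Python rather than the R used in the thesis. It computes one site's log-likelihood contribution and its posterior habitat membership probabilities; it does not implement the blocked nonlinear Gauss-Seidel estimation.

```python
import math


def softmax(scores):
    """Turn real-valued scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def habitat_priors(x, beta):
    """Covariate-driven habitat probabilities for one site (assumed logit form).

    x    : covariates for the site, with 1.0 first for the intercept
    beta : one coefficient list per habitat
    """
    scores = [sum(b * xi for b, xi in zip(beta_k, x)) for beta_k in beta]
    return softmax(scores)


def site_log_terms(y, priors, theta):
    """For each habitat k: log prior_k + log Bernoulli likelihood of y."""
    terms = []
    for prior_k, theta_k in zip(priors, theta):
        ll = math.log(prior_k)
        for y_j, p in zip(y, theta_k):
            ll += math.log(p if y_j == 1 else 1.0 - p)
        terms.append(ll)
    return terms


def site_log_likelihood(y, x, beta, theta):
    """Log of the mixture likelihood for one site (log-sum-exp over habitats)."""
    terms = site_log_terms(y, habitat_priors(x, beta), theta)
    m = max(terms)
    return m + math.log(sum(math.exp(t - m) for t in terms))


def posterior_membership(y, x, beta, theta):
    """Probability that the site belongs to each habitat, given x and y."""
    return softmax(site_log_terms(y, habitat_priors(x, beta), theta))


# Toy example (made-up numbers): 2 habitats, 3 species, 1 covariate.
beta = [[0.0, 0.0], [-1.0, 2.0]]            # habitat 0 is the reference
theta = [[0.9, 0.8, 0.1], [0.1, 0.2, 0.9]]  # species presence prob. per habitat
x_site = [1.0, 0.5]                         # intercept, covariate value
y_site = [1, 1, 0]                          # observed presence/absence

post = posterior_membership(y_site, x_site, beta, theta)
loglik = site_log_likelihood(y_site, x_site, beta, theta)
print(post, loglik)
```

In this toy case the covariates alone leave the two habitats equally likely, so the classification is driven almost entirely by the species observations, which is the sense in which both data sources can inform the habitat map.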