Generalized Hartley-Ross and Two-Stage Local Polynomial Regression Estimation for Finite Populations
Jehad Al-Jararha
Ph.D. Candidate, Colorado State University, Department of Statistics
Wednesday, April 19, 2007
In many applications, the value of an auxiliary variable, x, is available for each element in a finite population and is correlated with the study variable of interest, y. This auxiliary variable can be used to improve the precision of the estimator of the y-total. One method of improving precision is through finite population ratio estimation, which has been extensively discussed in the literature, especially under simple random sampling without replacement (SI). Hartley and Ross (1954) obtained an exactly unbiased estimator for the finite population ratio under SI, and hence an unbiased ratio estimator of the y-total. Other authors have obtained almost unbiased estimators for the finite population ratio, or have considered alternative sampling designs to obtain unbiased or almost unbiased estimators for these parameters.
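As an illustration of the classical construction, the following is a minimal Python sketch of the Hartley-Ross estimator of the y-total under SI, not the generalized estimator developed in this work. It uses the standard form: the sample mean of the ratios y/x times the known x-total, plus a bias correction n(N-1)/(n-1) times (ybar - rbar * xbar). The function name `hartley_ross_total` and the toy population are hypothetical. Exact rational arithmetic is used so that unbiasedness can be checked by enumerating every possible sample.

```python
from fractions import Fraction
from itertools import combinations


def hartley_ross_total(y, x, N, x_total):
    """Classical Hartley-Ross estimate of the y-total from an SI sample.

    y, x    : sampled values of the study and auxiliary variables (x nonzero)
    N       : population size (assumed known)
    x_total : population total of x (assumed known)
    """
    n = len(y)
    # Sample mean of the element-level ratios r_i = y_i / x_i.
    r_bar = sum(Fraction(yi) / Fraction(xi) for yi, xi in zip(y, x)) / n
    y_bar = sum(Fraction(v) for v in y) / n
    x_bar = sum(Fraction(v) for v in x) / n
    # Bias correction: (N - 1) times the sample covariance of r and x.
    correction = Fraction(n * (N - 1), n - 1) * (y_bar - r_bar * x_bar)
    return r_bar * x_total + correction


# Hypothetical toy population, used only to illustrate unbiasedness.
pop_x = [1, 2, 3, 4, 5]
pop_y = [3, 5, 4, 9, 10]
N, n = len(pop_x), 3

# Under SI every sample of size n is equally likely, so the design
# expectation is the plain average over all C(N, n) samples.
estimates = [
    hartley_ross_total([pop_y[i] for i in s], [pop_x[i] for i in s], N, sum(pop_x))
    for s in combinations(range(N), n)
]
average = sum(estimates) / len(estimates)
print(average, sum(pop_y))
```

Averaging over all equally likely samples reproduces the true y-total exactly, which is the unbiasedness property referred to above. Individual estimates still vary from sample to sample.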
In this work, the Hartley and Ross estimator is generalized to an arbitrary measurable sampling design, producing a Generalized Hartley-Ross (GHR) estimator. This estimator is unbiased, and both its exact variance and an unbiased estimator of that variance are obtained. Computing the exact variance and its unbiased estimator requires higher-order inclusion probabilities (up to fourth order), which are not easily obtained in general. To overcome this problem, two methods of approximation are given.
The GHR estimator is shown to be mean square consistent under mild conditions. These conditions are met by simple random sampling without replacement, simple random cluster sampling, and stratified sampling designs. The GHR estimator is evaluated under a superpopulation model, and it is shown that the Godambe-Joshi (1965) lower bound is not attainable for it. The GHR estimator is compared to other estimators analytically and via simulation.
Another finite population estimation problem arises when auxiliary information x is available not for every element in the finite population, but for every cluster of elements. In this case, we consider nonparametric regression estimation for finite population totals under two-stage sampling, in which an initial probability sample of clusters is selected, and then elements are sampled from each selected cluster.
The finite population mean of the ith cluster is t_i/N_i, where t_i and N_i are the y-total and the size of the ith cluster, respectively. A nonparametric superpopulation model is used to describe the model expectation of a cluster mean as a smooth function of x.
Nonparametric model-assisted estimators based on local polynomial regression are derived, using the N_i and estimated cluster totals. These estimators are linear combinations of the estimated cluster totals with weights that are calibrated to known control means. The estimators are asymptotically design-unbiased and design consistent under mild assumptions. Also, a consistent estimator for the design mean squared error of the estimator is obtained. In simulation studies, the nonparametric estimator gives better results than parametric estimators under a misspecified parametric model, and competitive results when the parametric model is in fact correct. Furthermore, when estimated cluster sizes, N̂_i, are used in place of the N_i in the computations, better results are obtained. Theory that covers the case when the N̂_i are used is a topic of future work.
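To make the smoothing step concrete, the following is a minimal Python sketch of a generic local linear (degree-one local polynomial) smoother applied to cluster-level data. It is only a building block, not the estimator derived in this work. The function name `local_linear_fit`, the Epanechnikov kernel, the bandwidth value, and the optional `weights` argument (which could hold, for example, inverse cluster inclusion probabilities) are all assumptions made for illustration.

```python
def local_linear_fit(x0, xs, ys, bandwidth, weights=None):
    """Local linear estimate of a smooth mean function m at the point x0.

    xs, ys    : cluster-level auxiliary values and (estimated) cluster means
    bandwidth : kernel bandwidth (an assumed tuning parameter)
    weights   : optional positive weights per cluster (assumption: e.g. 1/pi_i)
    """
    if weights is None:
        weights = [1.0] * len(xs)
    s0 = s1 = s2 = t0 = t1 = 0.0
    used = 0
    for xj, yj, wj in zip(xs, ys, weights):
        d = xj - x0
        u = d / bandwidth
        # Epanechnikov kernel up to a constant; zero outside the bandwidth.
        k = wj * max(0.0, 1.0 - u * u)
        if k > 0.0:
            used += 1
        s0 += k
        s1 += k * d
        s2 += k * d * d
        t0 += k * yj
        t1 += k * d * yj
    det = s0 * s2 - s1 * s1
    if used < 2 or det <= 0.0:
        raise ValueError("too few distinct points within the bandwidth of x0")
    # Intercept of the kernel-weighted least squares line in (x - x0).
    return (s2 * t0 - s1 * t1) / det


# Hypothetical cluster-level data whose means lie exactly on a line.
cluster_x = [1.0, 2.0, 3.0, 4.0, 5.0]
cluster_mean = [2.0 + 3.0 * x for x in cluster_x]
fit = local_linear_fit(2.5, cluster_x, cluster_mean, bandwidth=2.0)
print(fit)
```

Because the fit is locally linear, it reproduces an exactly linear mean function up to rounding, which is one way to see why such estimators can remain competitive when a linear parametric model is in fact correct.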