Abstract

In many sample surveys from finite populations, the value of an auxiliary variable x is available (at least in aggregate form) for the entire finite population, and is correlated with the study variable of interest y. This auxiliary variable can be used to improve the precision of the estimator of the y-total.

One method of improving precision is finite population ratio estimation, which has been discussed extensively in the literature, especially under simple random sampling without replacement (SI). Hartley and Ross (1954) obtained an exactly unbiased estimator for the finite population ratio under SI, and hence an unbiased ratio estimator of the y-total. Other authors have obtained almost unbiased estimators for the finite population ratio, or have considered alternative sampling designs to obtain unbiased or almost unbiased estimators for this parameter.
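As an illustration of exact unbiasedness under SI, the following minimal sketch computes the classical Hartley and Ross estimator of the y-total in its usual textbook form, r̄ t_x + n(N-1)/(n-1) (ȳ - r̄ x̄), where r̄ is the sample mean of the ratios y_k/x_k. The population values, the sample size, and all function and variable names are hypothetical and chosen only for the example; x is assumed strictly positive. The sketch enumerates every SI sample and averages the estimates with exact rational arithmetic.

```python
from fractions import Fraction
from itertools import combinations


def hartley_ross_total(y_s, x_s, N, t_x):
    """Classical Hartley-Ross estimate of the y-total from an SI sample.

    y_s, x_s: sample values (x must be nonzero); N: population size;
    t_x: known population x-total.
    """
    n = len(y_s)
    r_bar = sum(Fraction(y) / Fraction(x) for y, x in zip(y_s, x_s)) / n
    y_bar = Fraction(sum(y_s)) / n
    x_bar = Fraction(sum(x_s)) / n
    # Mean-of-ratios term plus the Hartley-Ross bias correction.
    return r_bar * t_x + Fraction(n * (N - 1), n - 1) * (y_bar - r_bar * x_bar)


# Hypothetical toy population (illustrative values only).
y = [3, 7, 4, 10, 6]
x = [2, 5, 3, 8, 4]
N, n, t_x = len(y), 3, sum(x)

# Under SI every subset of size n is equally likely, so the design
# expectation is the plain average over all such subsets.
samples = list(combinations(range(N), n))
expected_hr = sum(
    hartley_ross_total([y[k] for k in s], [x[k] for k in s], N, t_x)
    for s in samples
) / len(samples)

print(expected_hr, sum(y))
```

Because the arithmetic is exact, the average over all samples can be compared with the true y-total directly rather than up to rounding error.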

In this work, the Hartley and Ross (1954) estimator is generalized to unequal probability sampling designs, under the condition of measurability (strictly positive second-order inclusion probabilities). This results in generalized Hartley and Ross (GHR) estimation. Two distinct versions are considered.

The first builds on the Horvitz and Thompson (1952) estimator. This GHR estimator is unbiased; its exact variance and an unbiased estimator of that variance are obtained. Computing both requires higher-order inclusion probabilities (up to fourth order), which are not easily obtained in general. To overcome this problem, two methods of approximation are given.
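For orientation, the sketch below shows only the Horvitz and Thompson (1952) building block, not the GHR estimator itself: the estimator sums y_k/π_k over the sampled units, where π_k is the first-order inclusion probability. Poisson sampling is used as the design because sample probabilities are then products of independent inclusion events and can be enumerated exactly. The population values, the inclusion probabilities, and all names are hypothetical.

```python
from fractions import Fraction
from itertools import product


def horvitz_thompson_total(y_s, pi_s):
    """Horvitz-Thompson estimate: sum of y_k / pi_k over sampled units."""
    return sum((Fraction(y) / p for y, p in zip(y_s, pi_s)), Fraction(0))


# Hypothetical toy population and inclusion probabilities.
y = [3, 7, 4, 10]
pi = [Fraction(1, 4), Fraction(1, 2), Fraction(1, 3), Fraction(3, 4)]

# Poisson sampling: unit k is included independently with probability pi[k].
# Enumerate all 2^N inclusion patterns and weight each estimate by the
# probability of its pattern.
expected_ht = Fraction(0)
for indicators in product([0, 1], repeat=len(y)):
    prob = Fraction(1)
    for included, p in zip(indicators, pi):
        prob *= p if included else 1 - p
    y_s = [y[k] for k, included in enumerate(indicators) if included]
    pi_s = [pi[k] for k, included in enumerate(indicators) if included]
    expected_ht += prob * horvitz_thompson_total(y_s, pi_s)

print(expected_ht, sum(y))
```

The enumeration only checks first-order unbiasedness; the variance expressions discussed above involve higher-order inclusion probabilities and are not reproduced here.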

The GHR estimator is shown to be mean square consistent under mild conditions. These conditions are met, for example, by simple random sampling without replacement, simple random cluster sampling, and stratified sampling designs.

Central limit theorems (CLTs) are established for GHR under the SI design and under the Poisson sampling (PO) design. The asymptotic variance and a consistent estimator for the asymptotic variance are given under both designs.

The GHR is evaluated under a super-population model, and it is shown that the Godambe and Joshi (1965) lower bound is attainable for GHR under SI and PO sampling designs. The GHR is compared to other estimators analytically and via simulation.

The second version of GHR is derived using a Hansen and Hurwitz (1943) type estimator for with-replacement sampling. This estimator is unbiased and is studied under two asymptotic scenarios: when the population size N is fixed and the number of independent draws m tends to infinity, and when both m and N tend to infinity. In each case, a CLT is established and the asymptotic variance and a consistent estimator for the variance are given. The Godambe and Joshi (1965) lower bound is shown to be almost attainable in the first case and attainable in the second.
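The with-replacement building block can be sketched in the same way. The code below implements only the standard Hansen and Hurwitz (1943) estimator, the average of y_k/p_k over m independent draws with draw probabilities p_k; it is not the second GHR version. The population values, the draw probabilities, the number of draws, and all names are hypothetical.

```python
from fractions import Fraction
from itertools import product


def hansen_hurwitz_total(draws, y, p):
    """Hansen-Hurwitz estimate: mean of y_k / p_k over the m ordered draws."""
    m = len(draws)
    return sum(Fraction(y[k]) / p[k] for k in draws) / m


# Hypothetical toy population and draw probabilities (must sum to 1).
y = [3, 7, 4, 10]
p = [Fraction(1, 10), Fraction(3, 10), Fraction(2, 10), Fraction(4, 10)]
m = 3  # number of independent draws

# Enumerate all N^m ordered outcomes; each has probability prod(p[k]).
expected_hh = Fraction(0)
for draws in product(range(len(y)), repeat=m):
    prob = Fraction(1)
    for k in draws:
        prob *= p[k]
    expected_hh += prob * hansen_hurwitz_total(draws, y, p)

print(expected_hh, sum(y))
```

Here N is held fixed and only m would grow, which corresponds to the first of the two asymptotic scenarios described above.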

An important problem in applications is estimation of the population total t_y under a stratified sampling design when stratum x-totals are known, particularly when stratum sizes are small. If biased estimators are used to estimate within-stratum population y-totals, the bias may accumulate across strata. The unbiased GHR estimators can be used effectively in such situations by introducing a separate GHR estimator, analogous to the classic separate ratio estimator of survey statistics. A CLT is proven for the separate GHR estimator under a stratified sampling design when the stratum sizes are fixed and the number of strata tends to infinity. Simulation results show that GHR under different sampling designs performs well compared with other almost unbiased estimators proposed in the literature, even when the number of strata is not large.