Optimal sampling strategies for weighted linear regression estimation
We searched for a sampling strategy that would be clearly superior to others across a range of linearity conditions and variance structures for linear models, and found instead that several strategies were equally efficient. Each of these stratified the population to the maximum extent feasible, i.e., used n strata based on a covariate. Which of the two stratification methods was used, and whether the unit in each stratum was selected by simple random sampling or with probability proportional to size, did not seem to matter much. Two regression estimators, one using both probability and variance weights (Ŷgr) and one using only probability weights (Ŷpi), are the preferred estimators under the five efficient selection schemes, which select one unit per stratum with either equal or unequal probabilities. The bootstrap variance estimator is generally the least biased of the variance estimators considered, though conservative, and yields reliable coverage rates for 95% confidence intervals for most populations studied.
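The one-unit-per-stratum selection described above can be illustrated with a minimal sketch. It rests on assumptions not stated in the abstract: strata are formed as equal-count blocks of the units ranked by the covariate x (only one possible way of stratifying, and not necessarily either of the two compared here), x is strictly positive when used as the size measure, and the estimator shown is the plain Horvitz-Thompson total rather than Ŷgr or Ŷpi. The function names are hypothetical.

```python
import random


def one_per_stratum_sample(x, n, pps=False, rng=None):
    """Select one unit from each of n strata formed on the covariate x.

    Returns a list of (unit_index, inclusion_probability) pairs.
    Assumption: strata are equal-count blocks of the units ranked by x,
    len(x) >= n, and x > 0 whenever pps=True.
    """
    rng = rng or random.Random()
    order = sorted(range(len(x)), key=lambda i: x[i])  # rank units by covariate
    N = len(order)
    sample = []
    for h in range(n):
        # Stratum h is a contiguous block of the ranked units.
        stratum = order[h * N // n:(h + 1) * N // n]
        if pps:
            # Unequal probability: proportional to x within the stratum.
            sizes = [x[i] for i in stratum]
            unit = rng.choices(stratum, weights=sizes, k=1)[0]
            pi = x[unit] / sum(sizes)
        else:
            # Equal probability: simple random sampling of one unit.
            unit = rng.choice(stratum)
            pi = 1.0 / len(stratum)
        sample.append((unit, pi))
    return sample


def horvitz_thompson_total(y, sample):
    """Estimate the population total of y by inverse-probability weighting."""
    return sum(y[i] / pi for i, pi in sample)


# Toy population (made up for illustration): y exactly proportional to x.
x = [float(i) for i in range(1, 21)]
y = [2.0 * xi for xi in x]
pps_sample = one_per_stratum_sample(x, n=4, pps=True, rng=random.Random(0))
pps_estimate = horvitz_thompson_total(y, pps_sample)
```

In this toy population y is exactly proportional to x, so under probability-proportional-to-size selection every possible sample reproduces the true total (up to floating-point rounding), which is a convenient check that the inclusion probabilities are computed consistently with the selection step.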