Statistics Canada

Appendix C Statistics used in sampling bias study

In Chapter 6, it is stated that under random sampling, the statistic

$$Z = \frac{\hat{X} - X}{\sqrt{V(\hat{X})}}$$

should follow an approximately normal (0,1) distribution. Here $\hat{X}$ denotes the estimator, based on initial weights, of the known population count $X$, and $V(\hat{X})$ denotes its variance. A justification for this is given here. Sampling was done independently in each CU. Therefore, $\hat{X}$ is the sum of $H$ independent random variables, where $H$ is the number of CUs in Canada. There are 46,510 CUs in Canada that contain sampled households, so $H$ is very large. Thus, according to the central limit theorem, $(\hat{X} - E(\hat{X}))/\sqrt{V(\hat{X})}$ will follow an approximately normal (0,1) distribution (see Kendall and Stuart [1963], p. 193), as will $Z$ if $E(\hat{X}) = X$. The $Z$ statistic, however, would not have a mean of 0 if the CU-level samples of households were significantly biased for any reason.
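The central limit theorem argument above can be checked numerically. The following is a minimal sketch, not the procedure used in the study: it assumes a small hypothetical population of CUs, each holding households with a 0/1 characteristic, draws an independent simple random sample without replacement in every CU, and forms the standardized difference between the weighted estimate and the known count. The function names, the number and size of CUs, the sample size and the 30% prevalence are all illustrative assumptions.

```python
import math
import random
import statistics


def make_population(num_cus, cu_size, rng):
    """Build a hypothetical population: one list of 0/1 household values per CU."""
    return [[1 if rng.random() < 0.3 else 0 for _ in range(cu_size)]
            for _ in range(num_cus)]


def design_variance(population, sample_size):
    """Variance of the weighted total under independent SRS without replacement per CU."""
    total = 0.0
    for cu in population:
        big_n = len(cu)
        s2 = statistics.variance(cu)  # within-CU variance, divisor N_h - 1
        total += big_n ** 2 * (1 - sample_size / big_n) * s2 / sample_size
    return total


def estimate_total(population, sample_size, rng):
    """Weighted estimate of the count, using initial weights N_h / n_h in each CU."""
    estimate = 0.0
    for cu in population:
        weight = len(cu) / sample_size
        estimate += weight * sum(rng.sample(cu, sample_size))
    return estimate


def simulate_z(num_cus=200, cu_size=50, sample_size=10, reps=500, seed=1):
    """Return one Z value per replicate: (estimate - known count) / standard error."""
    rng = random.Random(seed)
    population = make_population(num_cus, cu_size, rng)
    true_count = sum(sum(cu) for cu in population)
    std_error = math.sqrt(design_variance(population, sample_size))
    return [(estimate_total(population, sample_size, rng) - true_count) / std_error
            for _ in range(reps)]


if __name__ == "__main__":
    z_values = simulate_z()
    print("mean of Z:", statistics.mean(z_values))
    print("std dev of Z:", statistics.stdev(z_values))
```

Under unbiased sampling, the replicate values should have a mean near 0 and a standard deviation near 1. Replacing the random draw in `estimate_total` with a selection rule that favours households having the characteristic would shift the mean away from 0, which is the behaviour the Z statistic is meant to detect.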

An additional statistic will now be derived which allows us to test whether the bias between two regions or two censuses is the same. Let $\hat{X}_1$ and $\hat{X}_2$ be estimators (based on initial weights) of the known population counts $X_1$ and $X_2$ for two distinct geographic areas or for two different censuses. Let $B_1 = (E(\hat{X}_1) - X_1)/X_1$ and $B_2 = (E(\hat{X}_2) - X_2)/X_2$ be the relative biases of $\hat{X}_1$ and $\hat{X}_2$. We wish to test whether the null hypothesis $H_0: B_1 = B_2$ is true. This can be done using the statistic

$$W = \frac{\hat{B}_1 - \hat{B}_2}{\sqrt{V(\hat{B}_1) + V(\hat{B}_2)}}$$

where $\hat{B}_1 = (\hat{X}_1 - X_1)/X_1$ and $\hat{B}_2 = (\hat{X}_2 - X_2)/X_2$ are unbiased estimators of $B_1$ and $B_2$ respectively. Thus, if the null hypothesis $H_0$ is true, the expectation of $W$ is zero. Note also that the denominator of $W$ is the standard error of the numerator of $W$ (there is no covariance term because estimates from separate regions or from different censuses are independent), and hence $W$ has a variance of 1. Now if $\hat{X}_1$ approximately follows a normal distribution (again based on the central limit theorem), $\hat{B}_1$, being a linear function of $\hat{X}_1$, will also approximately follow a normal distribution, as will $\hat{B}_2$ and $\hat{B}_1 - \hat{B}_2$. Thus, $W$ follows approximately a normal (0,1) distribution if the null hypothesis $H_0$ is true.
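As an illustration of how the W statistic might be computed, here is a minimal sketch. It assumes that each estimated relative bias is the estimate minus the known count, divided by the known count, so that its variance is the variance of the estimate divided by the squared known count. The function names and the input figures in the usage lines are hypothetical and are not taken from the study.

```python
import math


def relative_bias(estimate, known_count):
    """Estimated relative bias: (estimate - known count) / known count."""
    return (estimate - known_count) / known_count


def relative_bias_variance(estimate_variance, known_count):
    """Variance of the estimated relative bias; the known count is a constant."""
    return estimate_variance / known_count ** 2


def w_statistic(est1, var1, count1, est2, var2, count2):
    """Difference of estimated relative biases divided by its standard error.

    No covariance term is included because the two estimates are assumed
    to be independent (separate regions or different censuses).
    """
    numerator = relative_bias(est1, count1) - relative_bias(est2, count2)
    denominator = math.sqrt(relative_bias_variance(var1, count1)
                            + relative_bias_variance(var2, count2))
    return numerator / denominator


def two_sided_p_value(w):
    """Two-sided p-value for W under a normal (0,1) reference distribution."""
    return math.erfc(abs(w) / math.sqrt(2))


if __name__ == "__main__":
    # Hypothetical inputs: estimate, variance of the estimate, known count.
    w = w_statistic(1020.0, 100.0, 1000.0, 2000.0, 400.0, 2000.0)
    print("W =", w, "p-value =", two_sided_p_value(w))
```

A large absolute value of W (equivalently, a small p-value) would suggest that the two relative biases differ. In practice the variances would themselves be estimated, which this sketch does not address.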
