In Chapter 6, it is stated that under random sampling, the statistic
$Z = (\hat{X} - X) / \sqrt{\hat{V}(\hat{X})}$, where $\hat{X}$ is the estimate (based on initial weights) of a known population count $X$,
should follow an approximately normal (0,1) distribution. A justification for this is given here.
Sampling was done independently in each CU.
Therefore $\hat{X} = \sum_{h=1}^{H} \hat{X}_h$ is the sum of H independent random variables,
where H is the number of CUs in Canada. There are
46,510 CUs in Canada that contain sampled households, so
H is very large. Thus, according to the central limit theorem,
$(\hat{X} - X) / \sqrt{V(\hat{X})}$ will
follow an approximately normal (0,1) distribution (see Kendall and Stuart [1963], p. 193),
as will $Z$ if $\hat{V}(\hat{X})$ is close to $V(\hat{X})$.
$Z$, however, would not
have a mean of 0 if the CU level samples of households were
significantly biased for any reason.
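The central limit argument above can be checked numerically. The following is a minimal sketch, not the procedure used for the census: it assumes a small artificial population of CUs, a 0/1 household characteristic, a simple random sample without replacement drawn independently in each CU, and the usual variance estimator for that design. The function names (`simulate_z`, `mean_and_sd`) and all sizes are hypothetical and chosen only for illustration.

```python
import math
import random


def simulate_z(num_cus=100, households_per_cu=50, sample_size=10,
               replicates=300, seed=2024):
    """Simulate the standardized estimation error under random sampling.

    All sizes are illustrative assumptions, not census values.
    """
    rng = random.Random(seed)
    # Fixed finite population: one 0/1 characteristic per household.
    population = [
        [1 if rng.random() < 0.3 else 0 for _ in range(households_per_cu)]
        for _ in range(num_cus)
    ]
    true_total = sum(sum(cu) for cu in population)
    weight = households_per_cu / sample_size      # initial (design) weight
    fpc = 1 - sample_size / households_per_cu     # finite population correction

    z_values = []
    for _ in range(replicates):
        estimate = 0.0
        variance = 0.0
        for cu in population:
            # Independent simple random sample without replacement in each CU.
            sample = rng.sample(cu, sample_size)
            mean = sum(sample) / sample_size
            s2 = sum((y - mean) ** 2 for y in sample) / (sample_size - 1)
            estimate += weight * sum(sample)
            # Estimated variance of the CU-level total; CU terms add because
            # the CU samples are independent.
            variance += households_per_cu ** 2 * fpc * s2 / sample_size
        z_values.append((estimate - true_total) / math.sqrt(variance))
    return z_values


def mean_and_sd(values):
    """Return the sample mean and sample standard deviation of a list."""
    m = sum(values) / len(values)
    sd = math.sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))
    return m, sd


if __name__ == "__main__":
    z_mean, z_sd = mean_and_sd(simulate_z())
    print(f"mean of Z: {z_mean:.3f}, standard deviation of Z: {z_sd:.3f}")
```

Under these assumptions the simulated values should have a mean near 0 and a standard deviation near 1. Introducing a systematic selection bias in the CU samples would be expected to shift the mean away from 0.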
An additional statistic will now be derived which allows us to test whether the bias in
two regions or two censuses is the same. Let $\hat{X}_1$ and $\hat{X}_2$ be estimators (based on initial
weights) of the known population counts $X_1$ and $X_2$ for
two distinct geographic areas or for two different censuses. Let
$B_1 = (E(\hat{X}_1) - X_1) / X_1$ and $B_2 = (E(\hat{X}_2) - X_2) / X_2$
be the relative biases of $\hat{X}_1$ and $\hat{X}_2$. We wish to test whether the null hypothesis
$H_0: B_1 = B_2$ is true. This can be done using the statistic
$$Z_B = \frac{\hat{B}_1 - \hat{B}_2}{\sqrt{V(\hat{B}_1) + V(\hat{B}_2)}}$$
where $\hat{B}_1 = (\hat{X}_1 - X_1) / X_1$ and $\hat{B}_2 = (\hat{X}_2 - X_2) / X_2$ are
unbiased estimators of $B_1$ and $B_2$ respectively, and $V(\hat{B}_i) = V(\hat{X}_i) / X_i^2$ because $X_i$ is a known constant.
Thus, if the null hypothesis
above is true, the expectation of $Z_B$ is zero. Note also
that the denominator of $Z_B$ is the standard error of the numerator of $Z_B$ (there is no
covariance term because estimates from separate regions or from different censuses are independent)
and hence $Z_B$ has a variance of 1. Now if $\hat{X}_i$
approximately follows a normal distribution (again based on
the central limit theorem), $\hat{B}_i$ will also approximately follow a
normal distribution, as will $\hat{B}_1 - \hat{B}_2$ and $Z_B$. Thus, $Z_B$
follows approximately a normal (0,1)
distribution if the null hypothesis $H_0$ is true.
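The test of equal relative bias can be sketched in a few lines. This is only an illustration: it assumes the statistic is the difference of the two estimated relative biases divided by the standard error of that difference, and that the variance of each estimate is supplied by the user (in practice an estimated variance would be used). The function names `relative_bias_z` and `two_sided_p_value` and the input numbers are hypothetical.

```python
import math


def relative_bias_z(est1, count1, var1, est2, count2, var2):
    """Standardized difference between two estimated relative biases.

    est_i   : estimate based on initial weights (assumed input)
    count_i : known population count
    var_i   : variance of est_i (assumed to be supplied by the caller)
    """
    bias1 = (est1 - count1) / count1
    bias2 = (est2 - count2) / count2
    # The known count is a constant, so Var(bias_i) = var_i / count_i**2.
    # No covariance term: the two estimates are treated as independent.
    std_error = math.sqrt(var1 / count1 ** 2 + var2 / count2 ** 2)
    return (bias1 - bias2) / std_error


def two_sided_p_value(z):
    """Two-sided p-value under a normal (0,1) reference distribution."""
    return math.erfc(abs(z) / math.sqrt(2))


if __name__ == "__main__":
    # Made-up numbers for two regions, for illustration only.
    z = relative_bias_z(1030, 1000, 100, 2000, 2000, 400)
    print(f"Z = {z:.3f}, two-sided p-value = {two_sided_p_value(z):.4f}")
```

A large absolute value of the statistic (equivalently, a small p-value) would suggest that the relative biases of the two regions or censuses differ.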