Statistics Canada

Archived Content

Information identified as archived is provided for reference, research or recordkeeping purposes. It is not subject to the Government of Canada Web Standards and has not been altered or updated since it was archived. Please contact us to request a format other than those available.

10. Evaluation of Coverage Studies

10.1 Reverse Record Check

10.1.1 Introduction

10.1.2 Comparisons with census counts

10.1.2.1 Enumerated

10.1.3 Comparison with population estimates

10.1.3.1 Deceased persons

10.1.3.2 Interprovincial migration

10.1.4 Components of population growth

10.2 Census Overcoverage Study

10.2.1 Comparison of 2001 and 2006 Automated Match Study (AMS)

10.2.2 Comparison of 2006 AMS and 2006 COS

10.2.3 Reliability

10.3 Population estimates

10.3.1 Error of closure

10.3.2 Error in two terms

10.3.3 Growth

10.3.4 Conclusion

10.1 Reverse Record Check

10.1.1 Introduction

The results of the largest coverage study, the Reverse Record Check (RRC), can be evaluated by comparing its estimates with data on the same characteristics from other sources, such as the 2006 Census database. These comparisons serve both to evaluate the RRC estimates and to quantify conceptual and measurement differences between the sources.

Despite some conceptual differences between the RRC and the 2006 Census, the RRC estimate of persons enumerated in the 2006 Census can be compared with the census count. However, to render the two numbers comparable, certain adjustments were made to the census counts before comparing them.

RRC intercensal components of growth estimates can be compared with estimates from other sources. The RRC estimate of persons who died between the 2001 Census and the 2006 Census can be compared with the count from vital statistics files. Estimates of counts of net interprovincial migration from Canada Customs and Revenue Agency data can be compared with RRC estimates. It is not possible, however, to construct strict comparisons for this characteristic, since reasonable adjustments for conceptual differences cannot be derived. Last, RRC estimates of population growth components can be compared with similar estimates from administrative data.

10.1.2 Comparisons with census counts

Since the RRC single-stage stratified sampling design results in unbiased estimators, differences between RRC estimates and census counts are due to sampling error on the part of the RRC estimates, conceptual differences between the two sources, and/or systematic biases in the two sources that result in an underestimate or overestimate of the characteristic under study.

10.1.2.1 Enumerated

Provincial and national comparisons are presented in Table 10.1.2.1 with the standard error of the RRC estimate and the t-value for testing the hypothesis that there is no difference between the RRC estimate and the comparable census count. The following adjustments were made to published census counts to account for conceptual differences between the two sources:

  • Imputations added in the whole household imputation stage of the census based on the results of the Dwelling Classification Study are not included. This is because while they are included in census counts, they are not part of the RRC estimate of enumerated persons.

  • 2006 Census overcoverage is subtracted. This is because the census counts contain overcoverage whereas the RRC estimate is based on the number of unique persons enumerated rather than the number of enumerations.

  • The census count of persons living outside Canada five years ago (based on Form 2B data) excluding immigrants from the intercensal period and non-permanent residents, is also subtracted. This is because the RRC estimates do not include these persons.

  • Last, 2001 Census overcoverage is added. This is because there is overcoverage in the RRC estimates via the initial weights in the 2001 Census sampling frame. These weights were not adjusted for this overcoverage.
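The four adjustments and the t-value test described above can be sketched in a few lines of Python. This is an illustrative sketch only: the class and function names are hypothetical, every figure is invented, and the t-value is assumed to be the difference divided by the RRC standard error.

```python
from dataclasses import dataclass


@dataclass
class CensusAdjustments:
    """Adjustments to a published census count (all values hypothetical)."""
    whole_household_imputations: float  # based on the Dwelling Classification Study
    overcoverage_2006: float            # persons enumerated more than once in 2006
    returning_from_abroad: float        # lived outside Canada five years ago, excluding
                                        # intercensal immigrants and non-permanent residents
    overcoverage_2001: float            # carried into the RRC by the 2001 frame weights


def comparable_census_count(published: float, adj: CensusAdjustments) -> float:
    """Subtract the first three adjustments and add the fourth, as listed above."""
    return (published
            - adj.whole_household_imputations
            - adj.overcoverage_2006
            - adj.returning_from_abroad
            + adj.overcoverage_2001)


def t_value(rrc_estimate: float, census_count: float, rrc_standard_error: float) -> float:
    """Assumed form of the test statistic: difference over the RRC standard error."""
    return (rrc_estimate - census_count) / rrc_standard_error


# Invented figures, for illustration only.
adj = CensusAdjustments(20_000, 15_000, 5_000, 10_000)
comparable = comparable_census_count(1_000_000, adj)
t = t_value(972_000, comparable, 2_500)
print(comparable, t)
```

Only the RRC standard error appears in the denominator because the census count is treated here as a fixed figure rather than a sample estimate.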

Nationally, the RRC estimate of the number of persons enumerated in the 2006 Census is slightly lower (by 0.03%) than the comparable 2006 Census count. In 2001, the RRC overestimated the comparable census count by 0.07%, while in 1996 the RRC underestimated the census count by 0.08%. Provincially, the greatest difference occurs for Quebec (t-value of 1.62), where the RRC estimate of the number of enumerated persons exceeds the comparable census count by 49,656. In the majority of provinces the difference is negative, though relatively small in most cases. None of the observed differences are statistically significant.

The most significant differences must be investigated, since they may be due to a bias in RRC classification (including, for example, province of residence on Census Day). However, other factors also play an important role. Apart from sampling errors, the difference may be explained by biases in the adjustments applied to the published census count to obtain a conceptually comparable figure (e.g., returning Canadians). The RRC non-response bias may also affect this difference, since the non-response adjustment is designed to obtain the best result for estimating missed persons rather than enumerated persons. Though there are few significant differences, the fact that most of them are negative may indicate a slight bias.

10.1.3 Comparison with population estimates

10.1.3.1 Deceased persons

Table 10.1.3.1 compares the estimated number of persons deceased during most of the intercensal period (i.e., May 15, 2001 to December 31, 2005) by RRC province of classification with counts from vital statistics (VS) death files. Deaths in 2006 are excluded, since vital statistics for 2006 were not yet available at the time of the analysis. At the national level, the RRC estimate is higher than the VS count by 9,134 (0.9%). The highest relative difference is observed in Manitoba (-6,635/45,687, or -14.5%). In absolute value, the relative differences vary from 0.8% to 14.5%. The largest t-values occur for British Columbia (2.05), where the RRC estimate is higher than the VS count, and for Manitoba (-1.87), where the RRC estimate is lower than the VS count. For all other provinces, the t-values are well below the threshold for significance at the 95% confidence level. Despite the slightly higher difference in British Columbia, these results indicate no need for further investigation.
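As a quick check of the arithmetic in this comparison, the short sketch below recomputes the Manitoba relative difference from the two figures quoted in the text and compares the two quoted t-values with a critical value. The function names are hypothetical, and the critical value of 1.96 (a two-sided test at the 95% level under a normal approximation) is an assumption, since the text does not state which threshold was used.

```python
def percent_of(difference: float, base: float) -> float:
    """Express a difference as a percentage of a base count."""
    return 100 * difference / base


def exceeds_critical_value(t: float, critical: float = 1.96) -> bool:
    """True if |t| exceeds the assumed two-sided 95% critical value."""
    return abs(t) > critical


# Figures quoted in the text for Manitoba: a difference of -6,635 on a base of 45,687.
manitoba_pct = round(percent_of(-6_635, 45_687), 1)
print(manitoba_pct)

# The two t-values quoted in the text.
for t in (2.05, -1.87):
    print(t, exceeds_critical_value(t))
```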

10.1.3.2 Interprovincial migration

Table 10.1.3.2 compares RRC estimates of net interprovincial migration for the intercensal period with corresponding figures from Canada Customs and Revenue Agency (CCRA) files. In general, in-migration and out-migration statistics are not comparable, since the RRC only takes into account migration flows that occurred between the sampling frame reference date (e.g., May 15, 2001 for the census frame) and Census Day 2006, while estimates based on CCRA data take annual migration into account. Accordingly, only net migration estimates are presented.

The difference is significant for Alberta (t-value of 2.38) where the RRC estimates a much higher positive net migration than estimates derived from CCRA data. While both sources estimate a strong positive net migration, the net amount differs depending on the source. It is recognized that there has been considerable migration to Alberta, and that it may be hard to distinguish between permanent and temporary migration. Some who migrate to Alberta to work have settled there permanently. Others have gone there to work, but have retained their residence in their province of origin and return there on a more or less frequent basis. Census respondents do not always correctly identify the location where they should be enumerated. Furthermore, the census question used to identify mobility refers to the place where the person lived one year ago and five years ago but does not specify the concept of usual residence. It is therefore possible that the respondent will provide a temporary place of residence, leading to a misinterpretation of his or her mobility. The combination of these two factors may affect the accuracy of census estimates.

For all provinces except British Columbia, both estimates show net migration in the same direction. The Reverse Record Check estimate for British Columbia indicates a low negative net migration of 2,316, while the demographic data indicate a positive net migration of 12,887. The t-values for this comparison do not suggest a need for further investigation.

10.1.4 Components of population growth

An extensive comparison of RRC estimates of the components of intercensal population growth with census counts and population estimates derived from administrative data was done by the Demography Division (this topic is also discussed in Section 10.3). The RRC estimates are a by-product of the RRC and therefore not necessarily precise. Table 10.1.4 compares the two estimates of population growth by component. Note that estimates of returning Canadians, and of persons living on Indian reserves or settlements that were incompletely enumerated in 2001 and enumerated in 2006, were added to the RRC estimates to make them comparable to the administrative data counts.

The administrative data counts are a combination of estimates of several components of population growth: births, deaths, immigration, internal migration, emigration, net number of non-permanent residents, and growth of non-enumerated Indian reserves. These counts are subject to varying amounts of measurement error depending on the source. This is particularly so for the net number of non-permanent residents. It is also important to note that the RRC is not designed to produce estimates of the components of growth. The component estimates are a by-product of the RRC. Therefore, differences between the RRC estimates and the administrative data counts are to be expected.

Nationally, RRC estimates differ by 5.1% from the administrative data estimates. The largest differences occur for British Columbia (-77,192) and Ontario (-49,371). As a percentage of the administrative data estimates, these differences amount to 32.0% and 6.0% respectively.
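To make the comparison concrete, the sketch below combines the growth components listed above into a single growth figure and expresses the RRC difference as a percentage of the administrative estimate. The sign convention is the usual demographic balancing equation and is an assumption here, as are the function names; all figures are invented.

```python
def population_growth(births, deaths, immigration, emigration,
                      net_internal_migration, net_non_permanent_residents,
                      reserve_growth):
    """Assumed balancing equation: inflows are added, outflows subtracted."""
    return (births - deaths
            + immigration - emigration
            + net_internal_migration
            + net_non_permanent_residents
            + reserve_growth)


def percent_difference(rrc_growth, admin_growth):
    """RRC minus administrative estimate, as a percentage of the latter."""
    return 100 * (rrc_growth - admin_growth) / admin_growth


# Invented figures (in thousands), for illustration only.
admin_growth = population_growth(100, 60, 50, 10, 5, 3, 2)
rrc_growth = 85.5
print(admin_growth, percent_difference(rrc_growth, admin_growth))
```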

10.2 Census Overcoverage Study

Due to methodology changes in the way overcoverage was measured in 2006, we need a tool to evaluate whether the changes would cause a significant variation in overcoverage rates.

In general, the 2006 overcoverage can be divided into two parts: the part that would have been covered in 2001 by the Automated Match Study (AMS), and the part that would have been covered by the Reverse Record Check (RRC). For simplicity, and since it covers only a small portion of the total overcoverage (less than 1%), the Collective Dwelling Study (CDS) is not considered in this evaluation.

10.2.1 Comparison of 2001 and 2006 Automated Match Study (AMS)

The AMS was conducted again in 2006, to compare the 2001 overcoverage estimates with those of 2006 and to ensure differences observed between 2001 and 2006 were not due to methodology changes. The 'Monster Match' and 'Mini-Monster Match' computer programs were executed, to find similar pairs of households in the 2006 AMS sample frame. This portion of the overcoverage can be measured using the 2001 methodology. We should point out that the geographic variables used to identify a dwelling in 2001, i.e., province, provincial electoral district (PED) and enumeration area (EA), were replaced by a combination of province, census division (CD) and collection unit (CU). Since we wished to duplicate the 2001 methodology as closely as possible, CDs were converted into PEDs. However, we could not do so with as much accuracy for the 2006 collection units. Nonetheless, the concept of collection unit is similar to the concept of enumeration area.

Table 10.2.1 compares the 2001 AMS estimates with those of 2006.

We noted a steady increase in overcoverage between 1996 and 2006. Nationally, there was a 56% increase between 1996 and 2001 and a 99.8% increase between 2001 and 2006. The largest increases were observed in the Atlantic provinces, while the smallest ones occurred in the territories.

Since the AMS methodology did not change between 1996 and 2006, we may conclude that the increases are necessarily due to an actual increase in overcoverage in the Census. We noted that coefficients of variation (CVs) decreased between 1996 and 2006. Nationally, the CV decreased from 3.74% in 1996 to 1.56% in 2006.

10.2.2 Comparison of 2006 AMS and 2006 COS

One evaluation tool involves comparing COS and AMS overcoverage estimates. The procedure consists of matching all COS pairs that contain overcoverage with the AMS survey frame (this is referred to as the AMS domain of the COS). For matching pairs, we use COS weighting to estimate the AMS domain of the COS.
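The matching procedure above amounts to a domain estimate: COS weights are summed over the overcoverage pairs that also appear in the AMS frame. A minimal sketch follows, assuming a pair can be identified by the unordered set of its two household identifiers; that identification, the field names and all data are hypothetical and not taken from the study.

```python
def ams_domain_estimate(cos_pairs, ams_frame):
    """Sum COS weights over overcoverage pairs that are also in the AMS frame."""
    return sum(pair["weight"]
               for pair in cos_pairs
               if pair["overcoverage"] and pair["households"] in ams_frame)


# Hypothetical COS sample: household pairs with their COS weights.
cos_pairs = [
    {"households": frozenset({"H1", "H2"}), "weight": 1.0, "overcoverage": True},
    {"households": frozenset({"H3", "H4"}), "weight": 12.5, "overcoverage": True},
    {"households": frozenset({"H5", "H6"}), "weight": 8.0, "overcoverage": True},   # not in AMS frame
    {"households": frozenset({"H7", "H8"}), "weight": 4.0, "overcoverage": False},  # no overcoverage
]

# Hypothetical AMS frame; frozensets make the match independent of pair order.
ams_frame = {frozenset({"H2", "H1"}), frozenset({"H4", "H3"}), frozenset({"H7", "H8"})}

domain_estimate = ams_domain_estimate(cos_pairs, ams_frame)
print(domain_estimate)
```

The resulting figure can then be set against the AMS estimate for the same domain to obtain relative differences of the kind reported in this section.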

Here are the comparative results of overcoverage estimates of the 2006 AMS and of the AMS domain of the COS.

Nationally, we noted a difference of 8.28%. The largest differences were in the territories, especially in Nunavut where the AMS domain of the COS was 33.33% lower than the 2006 AMS. The difference was also consistently negative.

The final part of the evaluation entailed verifying whether cases identified as overcoverage by the AMS were also identified as overcoverage by the COS. This evaluation identified a bias in the COS estimates. Estimates from the AMS were used to adjust COS estimates for this bias, as outlined in Table 8.3.3, Section 8.3.

10.2.3 Reliability

Table 10.2.3 shows the reliability of the 2001 and 2006 overcoverage estimates in terms of estimated coefficient of variation. In 2006 all coefficients of variation (CVs) were below 5%, and we noted a significant reduction in CVs. This is because more than half the estimate was based on overcoverage cases with a weight of one. The reduction was also partly due to the fact that in 2001, standard errors deriving from the RRC were high compared to those of the AMS, the two main contributors to the overcoverage estimate.
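For reference, the coefficient of variation is the standard error of an estimate expressed as a percentage of the estimate itself. The sketch below uses invented figures and a hypothetical function name.

```python
def cv_percent(estimate: float, standard_error: float) -> float:
    """Coefficient of variation: standard error as a percentage of the estimate."""
    return 100 * standard_error / estimate


# Invented overcoverage estimate and standard error, for illustration only.
cv = cv_percent(estimate=50_000, standard_error=2_000)
print(cv, cv < 5)
```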

10.3 Population estimates

10.3.1 Error of closure

Statistics Canada's Population Estimates Program (PEP) determines provincial and territorial population counts on Census Day by adding census population counts and estimates of census population net undercoverage (see note 1). The PEP then extends these adjusted census counts to July 1, whereupon they become the base population for postcensal population estimates. For more information on population estimates, see Estimates of Total Population, Canada, Provinces and Territories.

When determining the adjusted census counts, the PEP evaluates the quality of postcensal estimates for the five-year period preceding the census by comparing postcensal estimates for Census Day with the adjusted census counts. The difference between the two is referred to as the error of closure. A detailed review of this error is the main quality evaluation of the postcensal estimates.

Table 10.3.1 provides errors of closure for 2006 and 2001 by province and territory. Note that a positive error means the postcensal estimate has overestimated the population. For Canada in 2006 the error of closure was +105,352, an error rate of +0.32%. This is double the 2001 rate. For eight provinces, the 2006 error rates were between -0.5% and +0.5%. The rates were higher for the Yukon Territory, British Columbia, the Northwest Territories, Alberta and Nunavut. Of these five regions, only British Columbia had a positive error. Compared to 2001, the 2006 rates were higher for the Yukon Territory, British Columbia and Alberta, and lower for the Atlantic provinces and Saskatchewan. Overall, as in 2001, the majority of provinces had small errors of closure. However, unlike 2001, the largest errors in 2006 occurred in two of the most populous provinces.
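The error of closure and its rate can be computed as in the sketch below. The figures are invented and the function names are hypothetical. The text does not state which population serves as the denominator of the rate; the adjusted census count is assumed here.

```python
def error_of_closure(postcensal_estimate: float, adjusted_census_count: float) -> float:
    """Positive when the postcensal estimate overestimates the population."""
    return postcensal_estimate - adjusted_census_count


def error_rate_percent(postcensal_estimate: float, adjusted_census_count: float) -> float:
    """Error of closure as a percentage of the adjusted census count (assumed base)."""
    return 100 * error_of_closure(postcensal_estimate, adjusted_census_count) / adjusted_census_count


# Invented Census Day figures, for illustration only.
error = error_of_closure(1_005_000, 1_000_000)
rate = error_rate_percent(1_005_000, 1_000_000)
print(error, rate)
```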

10.3.2 Error in two terms

The size of the error of closure depends, on the one hand, on error in the postcensal population estimates and, on the other hand, on error in the estimate of census population net undercoverage. In order to evaluate error in the postcensal estimates, it is useful to express the error of closure in two terms:

\[
\text{Error of closure} = P_{PEP} - P_{AC}
\]

\[
\text{Error of closure} = (P_{RRC} - P_{AC}) - (P_{RRC} - P_{PEP})
\]

where

P_{RRC}:   Population estimated directly from the Reverse Record Check (RRC)
            = RRC estimate of the number of persons enumerated + census population net undercoverage + persons on incompletely enumerated Indian reserves (IEIR)
P_{AC}:    Adjusted census population
            = census enumerations + census population net undercoverage + persons on IEIR
P_{PEP}:   Postcensal estimate of population on Census Day

The first term compares the Census Day population estimated directly by the RRC with the adjusted census population. This difference, which is the difference between the RRC estimate of the number of enumerated persons and the number of census enumerations, should be due mainly to RRC sampling error (see note 2).

The second term compares the population estimated directly by the RRC with the postcensal estimate of population. This difference is a comparison of the RRC estimate of population growth with the PEP estimate. This term helps to determine whether the PEP estimates of population growth have important errors that may have contributed to the error of closure.
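The two-term decomposition described above can be verified numerically. In the sketch below all figures are invented and the variable names are hypothetical. Because net undercoverage and the IEIR population enter both the RRC-based and the adjusted census populations, they cancel in the first term, leaving only the difference in the number of enumerated persons.

```python
# Invented Census Day figures, for illustration only.
rrc_enumerated = 980_000      # RRC estimate of persons enumerated
census_enumerated = 978_500   # census enumerations
net_undercoverage = 25_000    # census population net undercoverage
ieir = 1_500                  # persons on incompletely enumerated Indian reserves
p_pep = 1_010_000             # postcensal estimate on Census Day

p_rrc = rrc_enumerated + net_undercoverage + ieir
p_ac = census_enumerated + net_undercoverage + ieir

first_term = p_rrc - p_ac     # difference in enumerated persons
second_term = p_rrc - p_pep   # difference between RRC-based and PEP populations
closure = first_term - second_term

# The decomposition reproduces the direct definition of the error of closure.
assert closure == p_pep - p_ac
# Net undercoverage and IEIR cancel out of the first term.
assert first_term == rrc_enumerated - census_enumerated

print(first_term, second_term, closure)
```

With these figures the first term is small and the second term accounts for most of the error of closure, which is the pattern the analysis looks for when attributing the error to the postcensal estimates.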

Table 10.3.2 presents the two error of closure terms for 2006 and their associated standard errors. First, we note that the error of closure is significant for Alberta, British Columbia, the Yukon Territory and the Northwest Territories.

For Canada, the error of closure is clearly dominated by the second term, the difference between the RRC and PEP estimates of growth. This is true for British Columbia and Ontario as well, although the difference is not significant for Ontario. For Alberta, on the other hand, the significant error of closure is due in equal measure to both terms, neither of which is significant on its own. For most of the other provinces, the two types of error are smaller. In fact, sampling error in the RRC estimates, as expressed by differences in the number of enumerated persons, does not contribute significantly to the error of closure for any of the provinces, while error in the postcensal estimates does contribute significantly to the error of closure for Canada and British Columbia.

Analysis of the two error terms is not appropriate for the territories, since the last step in RRC estimation, for the territories only, is to calibrate to census counts. In addition, the RRC cannot directly estimate population growth there for 2001 to 2006 because only a sampling frame for the 2006 Census Day is used. In general, migration patterns for the territorial population, especially short-term moves, which are more difficult to estimate, account for the higher rates of error of closure in the territories than in the provinces.

10.3.3 Growth

The second error term may be broken down further. Table 10.3.3 presents the differences between RRC and PEP estimates of population growth by growth component for the two provinces where the error of closure is significant, Alberta and British Columbia, and for the provinces combined.

For Canada excluding the territories, the difference between the RRC and PEP estimates of population growth is almost entirely due to the difference between the RRC and PEP estimates of net international migration. The PEP estimate includes immigration (a very reliable component), net non-permanent residents (which may also be considered reliable) and emigration. The RRC-PEP difference is therefore somewhat due to the emigration difference. For Canada excluding the territories, the difference between the RRC and PEP estimates of emigration is also significant. That is, the PEP estimate of the number of persons exiting the country is significantly lower than the RRC estimate. This gap comes largely from British Columbia and Ontario.

For British Columbia, as for Canada, the difference in net international migration comprises the largest portion of the difference between RRC and PEP growth estimates. PEP emigration estimates for this province are significantly lower than those of the RRC. The difference in natural growth from a high RRC estimate of deaths also contributes to the overall gap. Last, the difference in net interprovincial migration is a factor, though the difference is not significant.

For Alberta, the difference between RRC and PEP growth is not significant. This is due to a difference that is positive and significant for net interprovincial migration, but negative for the other components. The RRC estimate of net interprovincial migration is significantly higher than that of the PEP. Without this difference the overall gap between RRC and PEP would have been smaller, as would the error of closure. A slightly low PEP estimate of net interprovincial migration for Alberta would thus have contributed to the negative error of closure for this province.

10.3.4 Conclusion

In conclusion and allowing for RRC sampling error, the PEP estimates are consistent with census data adjusted for population net undercoverage. Only two provinces and territories have significant errors of closure. Review of the error of closure reveals that an emigration estimate that may have been too low would have significantly contributed to the positive error for British Columbia. To this may be added a slightly high net interprovincial migration, probably due to an underestimate of people leaving for Alberta. A slightly low estimate of net interprovincial migration would have contributed to the negative error of closure for Alberta. This province received a considerable number of migrants in the year preceding the census, many of whom were temporary migrants. As was the case with errors for the territories, the greater difficulty in estimating this migration is almost certainly responsible for the larger error.

Notes:

  1. The PEP also adds estimates of populations living on incompletely enumerated Indian reserves (IEIR) to these figures.
  2. The calculation of the differences in the number of enumerations requires certain adjustments to make the census and RRC numbers comparable. In particular, returning Canadians among those enumerated in the census but not among those enumerated by the RRC must be considered.
