Guide to the Census of Population, 2016
Chapter 10 – Data quality assessment

Introduction

Data quality assessment provides an evaluation of the overall quality of census data. The results of this assessment are used to inform users of the reliability of the data, to make improvements for the next census, to adjust census data for non-response and, in the case of two coverage studies (the Reverse Record Check and the Census Overcoverage Study), to produce official population estimates. Quality assessment activities take place throughout the census process, beginning prior to data collection and ending after dissemination.

Sources of error

However well a census is designed, the data collected will inevitably contain errors. Errors can occur at virtually every stage of the census process, from material preparation to creation of the list of dwellings, data collection and processing. Census data users should be aware of the types of errors that can occur, so they can assess the usefulness of the data for their own purposes.

Main types of errors:

Coverage errors occur when dwellings and/or persons are missed, incorrectly enumerated or counted more than once.

Non-response errors occur when some or all information about individuals, households or dwellings is not provided.

Response errors occur when a question is misunderstood or a characteristic is misreported by the respondent, the census enumerator or the Census Help Line operator.

Processing errors can occur at any stage of processing. They include errors made at data capture; during coding operations, when written responses are converted into numerical codes; and during imputation, when valid (but not necessarily accurate) values are inserted into a record to replace missing or invalid data.

Sampling errors apply only when answers to questions are obtained from a sample. This type of error applies only to the 2016 Census long-form questionnaire.

Measuring data quality

Many data quality studies have been conducted for recent censuses to allow data users to assess the impact of errors and improve their own understanding of how errors occur. For the 2016 Census, special studies examine errors in coverage and data quality, i.e., non-response, response and processing.

Three studies are conducted to measure coverage errors:

  1. Dwelling Classification Survey – One of the sources of coverage error in the census is the misclassification of dwellings on Census Day. This error can occur when an occupied dwelling is classified as unoccupied, or when an unoccupied dwelling is classified as occupied. The purpose of the Dwelling Classification Survey is to study these types of classification errors and adjust counts, if necessary. A sample of dwellings for which no census questionnaire was returned is contacted, information is collected on the occupancy status and, if occupied, on the number of usual residents. This information is used to adjust the census data for dwellings, households and persons. This is done by correcting the classification errors and adjusting household size distribution through imputation for dwellings that did not return the questionnaire. It is done in time for the initial population count release.
  2. Reverse Record Check – This study provides estimates of persons missed by the census (after accounting for the adjustments described in the Dwelling Classification Survey above). Estimates are developed for each province and territory and for various population subgroups (e.g., age-sex groups and marital status).
  3. Census Overcoverage Study – In the 2011 and 2016 censuses, double-counting of persons is determined by searching for linked records that have a high degree of matching on sex, date of birth and name. Linked records are sampled and checked manually, and results are used to estimate the census overcoverage (or the number of duplicate persons).
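
The search for linked records described for the Census Overcoverage Study can be illustrated with a minimal sketch. It is not the method used by the census: it assumes exact agreement on sex, date of birth and a normalized name, whereas the study relies on a high degree of matching followed by manual checks on a sample. The record layout, the function names (normalize_name, candidate_duplicates) and the example persons are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations
import unicodedata


def normalize_name(name: str) -> str:
    """Lower-case, strip accents and collapse whitespace so trivial variants compare equal."""
    decomposed = unicodedata.normalize("NFKD", name)
    without_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(without_accents.lower().split())


def candidate_duplicates(records):
    """Return sorted pairs of record ids that agree on sex, date of birth and normalized name."""
    # Group records by the matching key; any group with 2+ records holds candidate duplicates.
    blocks = defaultdict(list)
    for rec in records:
        key = (rec["sex"], rec["date_of_birth"], normalize_name(rec["name"]))
        blocks[key].append(rec["id"])
    pairs = []
    for ids in blocks.values():
        pairs.extend(combinations(sorted(ids), 2))
    return sorted(pairs)


# Made-up records for illustration only (hypothetical layout, not census data).
records = [
    {"id": 1, "name": "Émilie  Tremblay", "sex": "F", "date_of_birth": "1980-03-14"},
    {"id": 2, "name": "emilie tremblay", "sex": "F", "date_of_birth": "1980-03-14"},
    {"id": 3, "name": "John Smith", "sex": "M", "date_of_birth": "1975-11-02"},
    {"id": 4, "name": "John Smith", "sex": "M", "date_of_birth": "1990-01-20"},
]

print(candidate_duplicates(records))  # [(1, 2)]
```

In this sketch, records 3 and 4 share a name and sex but not a date of birth, so they are not linked. A pair returned here is only a candidate: as the text notes, linked records are sampled and checked manually before overcoverage is estimated.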

Certification

Certification consists of several activities to rigorously assess the quality of census data at specific levels of geography in order to ensure that the quality standards for public release are met. This evaluation includes the certification of population and dwelling counts, and variables related to dwelling and population characteristics.

During certification, response rates, invalid responses, edit failure rates, and a comparison of data before and after imputation are among the data quality measures used. Tabulations for the 2016 Census are produced and compared with corresponding data from past censuses, other surveys and administrative sources. Detailed cross-tabulations are also checked for consistency and accuracy.
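
Two of the measures named above, a non-response rate for a single question and a comparison of data before and after imputation, can be sketched as follows. This is an illustrative assumption about how such indicators might be computed, not the certification procedure itself. The variable (housing tenure), its values and the function names are hypothetical, and missing responses are represented by None.

```python
from collections import Counter


def category_shares(values):
    """Share of each category among non-missing values."""
    observed = [v for v in values if v is not None]
    total = len(observed)
    if total == 0:
        return {}
    counts = Counter(observed)
    return {category: count / total for category, count in counts.items()}


def imputation_shift(before, after):
    """Change in each category's share (after minus before), in percentage points."""
    shares_before = category_shares(before)
    shares_after = category_shares(after)
    categories = sorted(set(shares_before) | set(shares_after))
    return {
        c: round(100 * (shares_after.get(c, 0.0) - shares_before.get(c, 0.0)), 1)
        for c in categories
    }


# Made-up responses to one question; None marks a missing answer.
before = ["owner", "renter", None, "owner", None, "renter", "owner", "owner"]
# The same records after imputation has filled the missing answers.
after = ["owner", "renter", "renter", "owner", "owner", "renter", "owner", "owner"]

item_nonresponse_rate = sum(v is None for v in before) / len(before)
print(item_nonresponse_rate)            # 0.25
print(imputation_shift(before, after))  # {'owner': -4.2, 'renter': 4.2}
```

A large shift in a category's share after imputation would be a signal to look more closely at that variable before release.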

Depending on the certification results, census data can be released in one of three ways:

For more information on the quality indicators and certification results, see the reference guides for the various domains of interest.

Response rate for the 2016 Census of Population

One of the key data quality measures used for the Census of Population is the response rate. Table 10.1 shows the response rates for the 2016 Census of Population both nationally and for each province and territory. The rates are provided for all occupied private dwellings for which a short form or long form was to be received and for the subset of occupied private dwellings for which a long form was to be received. For the long form, the unweighted response rate and the weighted response rate are provided.

The rates in Table 10.1 were calculated following data processing and data quality assessment. The response rate is the number of private dwellings for which a questionnaire was filled out divided by the number of private dwellings classified as occupied according to the census database. The final classification of dwelling occupancy status is based on analysis of the data collected by field staff, the data provided by respondents and the results of a quality study on the occupancy status of a sample of dwellings. The rates in Table 10.1 differ from the collection response rates previously disseminated because they take into account data processing and verification of the dwelling occupancy status; they are therefore considered final. Weighted response rates are based on the long form's final sampling weights: the weighted number of sampled private dwellings for which a questionnaire was filled out divided by the weighted number of sampled private dwellings classified as occupied.
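
The two calculations described above can be written as a short sketch. It assumes a hypothetical record layout in which each dwelling carries an occupancy flag, a response flag and a final sampling weight. The field names, the response_rate function and the example values are illustrative only and do not come from the census database.

```python
def response_rate(dwellings, weight_key=None):
    """Percentage of occupied private dwellings with a completed questionnaire.

    With weight_key=None every dwelling counts once (unweighted rate).
    Otherwise each dwelling counts as its sampling weight (weighted rate).
    """
    def weight(dwelling):
        return dwelling[weight_key] if weight_key else 1.0

    # Only dwellings classified as occupied enter the denominator.
    occupied = [d for d in dwellings if d["occupied"]]
    denominator = sum(weight(d) for d in occupied)
    if denominator == 0:
        raise ValueError("no occupied dwellings to compute a rate from")
    numerator = sum(weight(d) for d in occupied if d["responded"])
    return 100.0 * numerator / denominator


# Made-up sample of long-form dwellings (hypothetical values).
sample = [
    {"occupied": True, "responded": True, "weight": 4.0},
    {"occupied": True, "responded": True, "weight": 4.0},
    {"occupied": True, "responded": False, "weight": 2.0},
    {"occupied": True, "responded": True, "weight": 6.0},
    {"occupied": False, "responded": False, "weight": 4.0},  # unoccupied: excluded
]

unweighted = response_rate(sample)                      # 3 of 4 occupied dwellings
weighted = response_rate(sample, weight_key="weight")   # 14 of 16 weighted dwellings
print(unweighted, weighted)  # 75.0 87.5
```

The two rates differ whenever non-responding dwellings carry weights that are not typical of the sample, which is why Table 10.1 reports both for the long form.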

Table 10.1 2016 Census of Population: Response rates
Province/territory | Short and long form response rates (%) | Unweighted response rates from the long form only (%) | Weighted response rates from the long form only (%)
--- | --- | --- | ---
Canada | 97.4 | 96.7 | 96.9
Newfoundland and Labrador | 97.4 | 96.6 | 96.8
Prince Edward Island | 97.5 | 96.9 | 97.0
Nova Scotia | 97.6 | 97.1 | 97.2
New Brunswick | 97.6 | 97.1 | 97.2
Quebec | 97.6 | 97.2 | 97.3
Ontario | 97.6 | 97.0 | 97.2
Manitoba | 97.4 | 96.3 | 96.9
Saskatchewan | 96.7 | 96.2 | 96.3
Alberta | 97.0 | 96.3 | 96.4
British Columbia | 96.5 | 95.7 | 96.0
Yukon | 95.8 | 93.5 | 95.2
Northwest Territories | 93.9 | 92.8 | 93.1
Nunavut | 92.7 | 92.7 | 92.7