Appendix 1.4 – Note describing the Wood Buffalo census subdivision data collection methodology and the use of administrative data sources


On May 1, 2016, a wildfire began southwest of Fort McMurray, Alberta. On May 3, it swept through the community, destroying many homes and buildings and forcing the largest wildfire evacuation in Alberta's history. Statistics Canada then decided to suspend census data collection (referred to as 'field data collection') in the evacuated areas.

Statistics Canada used a set of measures to ensure that residents of the Wood Buffalo census subdivision (CSD) (referred to as the Specialized municipality of Wood Buffalo or Wood Buffalo) were included in the 2016 Census of Population. Data for the evacuated area were derived from a combination of sources. First, many residents of the area responded online or by returning a paper questionnaire. Then field data collection was performed for a number of households using short- or long-form questionnaires. Lastly, short-form data were derived from a number of administrative data sources for the households residing in dwellings where field data collection was not possible. Data for all areas not evacuated due to the wildfire are from direct field data collection.

Reference date

For the 2016 Census, the reference date for data reporting is May 10, 2016. For residents of the areas evacuated during the wildfire, the reference date is May 1, 2016, to reflect the situation as it existed before the fire.

Data collection

Prior to the evacuation, and even in the following weeks when census data collection was suspended, some responses were received from the residents of the evacuated area. In August 2016, data collection was reinstated in Wood Buffalo and census representatives went door to door to complete census questionnaires. Efforts were focussed on collecting data for the one in four dwellings included in the long-form questionnaire sample. This was particularly important, as administrative data sources do not provide information for long-form questions. To further improve data quality, field data collection was also performed for dwellings in the areas for which no administrative data were available and for collective dwellings. In areas where enumerators prepare a list of dwellings and deliver census materials, field data collection was done for all dwellings.

Administrative data

Wherever possible, when no direct response had been received for a dwelling, data from various administrative data sources with a reference date as close as possible to May 2016 were used for variables such as name, date of birth, sex and marital status. As administrative data files did not contain information on language as collected on the census questionnaire, record linkages between the administrative sources and the 2011 Census database were performed. For successful linkages, the 2011 responses to the language questions were used as a proxy for the 2016 language questions. Census questions for which no comparable information could be obtained from administrative data files, such as Relationship to Person 1 and common-law status, were derived during data processing.
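The language proxy step can be illustrated with a minimal sketch. It assumes linkage has already produced a mapping from administrative person identifiers to 2011 Census record identifiers. The function name, identifiers and field names below are hypothetical and are not taken from Statistics Canada's processing systems. The note does not say how unlinked records were handled, so the sketch simply returns None for them.

```python
from typing import Optional

# Hypothetical structures, for illustration only:
#   links_2011:     administrative person id -> 2011 Census record id
#   responses_2011: 2011 Census record id -> answers to the language questions
def language_proxy(
    admin_person_id: str,
    links_2011: dict[str, str],
    responses_2011: dict[str, dict[str, str]],
) -> Optional[dict[str, str]]:
    """Return 2011 language responses as a proxy for 2016, if linkage succeeded."""
    record_id = links_2011.get(admin_person_id)
    if record_id is None:
        # No successful linkage: no proxy is available from this step.
        return None
    return responses_2011.get(record_id)


links = {"A001": "C2011-17"}
responses = {"C2011-17": {"mother_tongue": "English", "home_language": "English"}}

linked = language_proxy("A001", links, responses)
unlinked = language_proxy("A002", links, responses)
```

Here `linked` carries the 2011 answers forward, while `unlinked` is None because no 2011 record was matched.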

Statistics Canada worked closely with both provincial and local authorities in Alberta to obtain access to administrative records to assist in the validation of the data derived from administrative data sources available at Statistics Canada.


If a census response was obtained for residents of a dwelling, this took precedence over any available administrative data. For the remaining cases, during data processing and for the calculation of response rates, data from administrative sources were considered as a response to the same extent as a direct response obtained through traditional collection methods.
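The precedence rule and its effect on response rates can be sketched as follows. This is an illustration under stated assumptions, not the actual processing code: the `DwellingRecord` class, its fields and the two functions are hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DwellingRecord:
    # Hypothetical record layout, for illustration only.
    dwelling_id: str
    census_response: Optional[dict] = None  # online, paper or field collection
    admin_data: Optional[dict] = None       # derived from administrative sources


def select_source(rec: DwellingRecord) -> tuple[str, Optional[dict]]:
    """A census response takes precedence over administrative data."""
    if rec.census_response is not None:
        return "census_response", rec.census_response
    if rec.admin_data is not None:
        return "administrative", rec.admin_data
    return "none", None


def response_rate(records: list[DwellingRecord]) -> float:
    """Count administrative data as a response, like a direct response."""
    responded = sum(1 for r in records if select_source(r)[0] != "none")
    return responded / len(records)


records = [
    DwellingRecord("D1", census_response={"persons": 2}, admin_data={"persons": 3}),
    DwellingRecord("D2", admin_data={"persons": 1}),
    DwellingRecord("D3", census_response={"persons": 4}),
    DwellingRecord("D4"),  # neither source available
]

rate = response_rate(records)
```

In this toy example, dwelling D1 uses its census response even though administrative data exist, and three of the four dwellings count as responding.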

Data quality for population and dwelling counts

For the population and dwelling counts, the Wood Buffalo CSD data went through the same quality assessments as the overall census data. A supplementary pre-validation activity was performed by Statistics Canada once data from field collection and administrative sources were combined. This additional step was done to certify that the alternative methods developed for this exceptional situation were providing satisfactory results.

Short-form questionnaire data quality

To obtain data on age, sex and families, Statistics Canada used administrative data for 54% of households, questionnaire data for 40% of households and imputation for the remaining households (5%).

For households for which administrative data were used, the age and sex data were taken directly from administrative data files. The distributions of age and sex are comparable for the two enumeration methods (administrative data and completed questionnaires). However, for households enumerated using administrative data, the main determinants for attributing family characteristics were marital status and the parent-child relationships established during linkage with the tax data. Processing of the administrative data yielded a larger proportion of lone-parent families than the completed questionnaires did: 19.2% (administrative data) compared with 9.8% (completed questionnaires).

This discrepancy had an impact on the proportion of families consisting of couples without children, which was 30.0% for administrative data and 40.9% for completed questionnaires. The household data taken from the administrative data files also appear to differ in the number of people living common-law: the proportion is smaller in the administrative data than in data from traditional collection. Household size also differs significantly: proportionally, there are far more one-person households and households of six or more persons in the administrative data than in the questionnaire data.

For households for which administrative data were used, the 2016 Census data on language were obtained from responses to the 2011 questions on language when linkage was possible. A comparison of the distributions of the language variables shows fewer differences for households enumerated using administrative data than for households whose data came from completed questionnaires. Comparisons of the 2011 and 2016 figures for the family and language variables for the Wood Buffalo CSD must be made with caution.

