Part A: Context
Table of contents
- 2. Introduction
- 3. Evolution of the Canadian Census Program: 1871 to 2006
- 4. 2011 Census Program: introduction of the National Household Survey
2. Introduction
Statistics Canada's mandate is to ensure that Canadians have access to a trusted source of statistics that meet their highest priority information needs. Statistics Canada is responsible under the Statistics Act (R.S.C., 1985, c. S-19) for conducting the census every five years. By law, the government (by an order in council) prescribes the questions to be asked in the census. By the same law, each person is required to provide the information requested in the census and Statistics Canada must protect the confidentiality of the personal information provided by respondents.
Since 1971, Statistics Canada's Census Program has used two questionnaires: a long form, distributed to a sample of households, which contained the full set of questions; and a short form, distributed to the remaining households, which contained only a basic set of questions. Up to and including 2006, both the short and the long forms were mandatory. In 2011, a voluntary sample survey called the National Household Survey (NHS) was conducted, replacing the mandatory long-form census in providing a detailed statistical portrait of Canada and its people. The short form remained mandatory and formed the census, distributed to all households. For the purpose of this report, the term 'Census Program' refers in a general way to the Canadian Census of Population, whether the short and long forms from 1971 to 2006, the 2011 Census of Population and NHS, or any Census of Population prior to 1971.
Conducting a census is an important statistical undertaking, not only in Canada but in most countries. In a report presented at the forty-second session of the United Nations Statistical Commission, the Secretary-General noted that by the end of the 2010 round of population and housing censuses (spanning the period 2005 to 2014), 98% of the world's population should be counted (United Nations 2011a). Censuses are so essential that the United Nations has issued principles and recommendations for population and housing censuses since 1958. In the last revision prepared for the 2010 round of censuses (United Nations 2008), the introductory paragraphs speak to the importance of the censuses:
The most important capital a society can have is human capital. Assessing the quantity and quality of this capital at small area, regional and national levels is an essential component of modern government.
Aside from the answer to the question 'How many are we?' there is also a need to provide an answer to 'Who are we?' in terms of age, sex, education, occupation, economic activity and other crucial characteristics, as well as to 'Where do we live?' in terms of housing, access to water, availability of essential facilities, and access to the Internet. The answers to these questions provide a numerical profile of a nation which is the sine qua non of evidence-based decision-making at all levels, and is indispensable for monitoring universally recognized and internationally adopted Millennium Development Goals (United Nations 2008, p. 1).
While the traditional census remains the main approach to collecting this information [Footnote 1], there is a shift internationally towards other census approaches. In the United Nations Economic Commission for Europe (UNECE), of which Canada is a member, 10 of the 37 countries (see Table 1) that conducted a traditional census in the 2000 round and responded to the survey have moved to a new approach for the 2010 round (UNECE 2010). In 8 of the 10 cases, the new approach relies either partially ('combined') or completely ('register-based only') on administrative registers [Footnote 2], bringing to 20 the number of UNECE responding countries using such an approach in the 2010 round. Two countries, France and the United States, have moved to an approach ('other') that involves a form of sample survey conducted on a continuous basis rather than at a specific point in time [Footnote 3]. High costs concentrated in the one- or two-year period around the census year, the difficulty in maintaining participation in such a large-scale operation and the need for more frequent data are among the reasons most often cited for the change to a new approach.
Other countries that planned a traditional census in the 2010 round are contemplating changes for the future. On April 1, 2011, the Office for National Statistics (ONS) in the United Kingdom launched the four-year Beyond 2011 program to establish and test census models for meeting future user requirements for population and wider sociodemographic statistics. Critical among these will be the use of administrative data (Nolan 2011). The ONS expects that it will complete its work on 2021 Census options development in 2014.
In New Zealand, the traditional census planned for March 2011 was postponed due to the earthquake in Christchurch, and will now be conducted in March 2013. Statistics New Zealand has examined the feasibility of moving to a register-based census (Bycroft 2010) and concluded that New Zealand met none of the pre-conditions for introducing such an approach. Statistics New Zealand is currently preparing a paper for presentation to government in early 2012 on the strategic direction for the New Zealand census. The draft proposal includes a long-term vision of an administrative census; a phased approach towards this vision, with decision points for government; and a recommendation to continue a five-yearly census, with efficiencies focussed on reducing real costs, until alternative options become feasible.
The move in Canada from a mandatory long-form census questionnaire to the voluntary National Household Survey (NHS) for the 2011 Census Program brought the notion of privacy intrusiveness to the forefront and raised questions as to whether Canadians should be obliged to answer certain questions and whether the information collected by the Census Program is relevant. As part of its customary process to review and evaluate its statistical programs, Statistics Canada launched the 2016 Census Strategy Project in December 2010. The objective of this project was to study options and, in 2012, deliver a recommendation to the federal government on the methodology of the 2016 Census Program.
The 2016 Census Strategy Project operated under a very tight timeframe. The recommendation for 2016 had to be delivered to the government early enough in 2012 to allow sufficient time for the planning, development, testing and implementation of the 2016 Census Program methodology and, very importantly, for the approval of funding for the planning and development phase. The timeline significantly limited the depth of the analysis in specific areas, as noted in the report.
At the same time, the project was wide in scope. It reviewed the approaches for population censuses that exist around the world and evaluated their applicability to the Canadian context, as well as their adherence to Statistics Canada's mandate and business model. The project included a review of the constitutional and statutory requirements for the census and provided a content determination framework, including criteria for inclusion of content in the 2016 Census Program. Statistics Canada also identified areas worth investigating for the Census Program beyond 2016.
To support the 2016 Census Strategy Project, Statistics Canada put in place a rigorous governance structure. Internally, the project received guidance from an advisory committee consisting of directors general from relevant branches. Direction was also obtained from a steering committee of assistant chief statisticians. Externally, the project got advice through the existing structure of advisory committees, including the National Statistics Council (NSC), the most senior of these external advisory committees. Finally, a subcommittee of the NSC and an Expert Panel Review Committee of six international experts in official statistics and census-taking were struck specifically to review and advise on Statistics Canada's work on the 2016 Census Strategy Project.
The result of this 12-month project is presented in this report. Part A of the report continues with Section 3, which describes the evolution of the Canadian Census Program, including the drivers for change and how changes were introduced. Section 4 focuses particularly on the changes to the 2011 cycle and reports initial data quality findings [Footnote 4].
Part B of this report is about the Census Program objectives and content needs. Sections 5 and 6 discuss the definition of a census and situate Census Program objectives in the context of legislative review and discussion with users conducted as part of the project.
Part C of this report discusses the Census Program methodology in response to the objectives and content needs presented in Part B. Section 7 builds on an independent report (Royce 2011) and presents three census-taking approaches that exist internationally. It lists the pre-conditions presented in Mr. Royce's report for conducting such approaches in any given country and adds to his assessment the results of additional reviews conducted by Statistics Canada to determine which approaches are possible for 2016 in Canada. Section 8 presents a process to determine content needs for the 2016 cycle in a structured and transparent way. Section 9 presents conclusions for the 2016 Census Program.
3. Evolution of the Canadian Census Program: 1871 to 2006
The Census Program has undergone many changes in the 140 years since Confederation [Footnote 5]. It has evolved both to meet the changing needs of the country for information about itself, and to adapt to changes in the social, legislative, administrative, financial and technological environment in which the Census Program is carried out. In addition, the Census Program is not isolated from other parts of the statistical system; changes in the latter affect Census Program content and methodology, and vice versa. This section describes the evolution of the content and the methodology of the Canadian Census Program, as well as the role played by research and testing, between 1871 and 2006. The 2011 Census Program is covered in Section 4.
3.1 Evolution of content
The topics covered in the Census Program have always reflected the issues of the day and the associated information needs. Questions have thus evolved over time, reflecting the changes in Canadian society while maintaining, to the extent possible, the integrity of time series. This section briefly discusses the development of Canada's statistical system and highlights the evolution of the Census Program content in light of the issues and environment of the time.
The first census of Canada, following Confederation in 1867, was taken in 1871. The main goal of the Census Program in 1871 was to determine appropriate representation by population in the new Parliament, and since 1871, decennial Census Program data have been the cornerstone of representative government in Canada.
At that time, the Census Program was the only means of gathering information about all aspects of the population and the economy. Thus it had nine questionnaires for a total of 211 questions covering area, land holdings, vital statistics, religion, education, administration, the military, justice, agriculture, commerce, industry and finance. The population questions included the age, sex, religion, education, race, ancestral origins and occupation of each person—topics which are still asked in the Census Program today.
By 1901, the need for information on the wave of immigrants arriving in Canada, in conjunction with the more general requirement for extended and improved economic statistics, inspired the new Census and Statistics Act of 1905 (Worton 1998, p. 38).
A mid-decade Census Program of Manitoba, Alberta and Saskatchewan was introduced in 1906 to measure the rapid growth and development of the Prairies. This was part of the agreement for the Prairie provinces to join Confederation (see Section 5.2.1).
With the creation of the Dominion Bureau of Statistics in 1918, other household and business surveys were introduced, decreasing the need for certain content in the Census Program. For example, the introduction of a separate Vital Statistics Program meant that the Census Program of 1921 no longer had a questionnaire for persons who had died. In fact, there were only five questionnaires in 1921: population; agriculture; animals, animal products and fruits not on farms; manufacturing and trading establishments; and a supplemental questionnaire for the blind and for deaf-mutes.
The rapid growth after World War II, with its large population movements among provinces and into urban areas, demanded more frequent information on Canada's population, and in 1956, the Dominion Bureau of Statistics carried out the first nationwide quinquennial Census Program.
In 1971, the new Statistics Act made it a legal requirement for Statistics Canada, formerly the Dominion Bureau of Statistics, to hold censuses of population and of agriculture every five years.
The 1986 cycle broke with the pattern, established in 1956, of alternating full and shorter questionnaires for the Census Program. By repeating most of the questions asked in the full Census Program of 1981, many procedures and processing systems could be re-used, resulting in significant cost savings while producing a full range of data.
The 1986 cycle also saw a major expansion of the use of the Census Program to identify specific subpopulations for postcensal surveys, when a postcensal survey on health and activity limitations was introduced. This methodology reduces overall respondent burden and is a cost-effective means of collecting information. Since then, postcensal surveys have been used to respond to important information needs on geographically dispersed segments of the Canadian population or on very targeted groups, needs that cannot be met through the sampling approaches used for broad general household surveys such as the General Social Survey or the Canadian Community Health Survey [Footnote 6].
Examples of content changes to the Census Program which responded to emerging information needs include:
- In 1901, questions on the characteristics of immigrants—place of birth, year of immigration, racial origin or nationality, mother tongue—were added because of the policy to settle the Prairies.
- In 1931, in response to the Great Depression, questions were asked to gauge the level of unemployment and to analyze its causes.
- In 1971, a new question on language spoken at home was added to complement data on mother tongue and knowledge of official languages; this was to provide information on the degree of assimilation of different language groups. Also in 1971, questions on weeks worked, on full-time and part-time employment, and on place of work were added. The questions on weeks worked and on full-time and part-time employment permitted a distinction to be made between types of employment, an important difference when interpreting employment-income data. Place of work was introduced to help plan transportation facilities and to identify commuting areas used to determine census metropolitan area boundaries.
- The 1971 cycle was the last to ask about wartime service, availability of hot and cold running water, access to a bath or shower and presence of flush toilets [Footnote 7]. In 1991, the decennial question on how many children were ever born to women aged 15 and over was asked for the last time.
- The 1996 cycle saw the addition of new questions on Aboriginal identity and population group to meet information needs related to an increasingly diverse population, in particular, for employment equity. Mode of transportation to work was added to better understand commuting patterns and the use of public transit, while questions on household activities or unpaid work were added to shed light on the contributions of care givers.
- In the 2006 cycle, questions on education were re-worded to improve both relevance and response quality, including a new question on location of study to respond to the need for more precise information on credential recognition.
As well, question wording and response categories changed to reflect social realities. For example, in 1871, the only options for 'marital status' were married, widowed or other. Today, there are five categories: legally married, separated but still legally married, divorced, widowed, or never legally married.
From 1951 to 1971, the term 'head of household' was used, referring to the husband in a husband-wife family. In 1976, the term 'head' was changed to refer to either the husband or wife. The reference to 'head' was dropped altogether in the 1981 cycle, to be replaced by the neutral 'Person 1.'
In 2001, the definition of common-law couples was changed to include members of the opposite sex or same sex living together as a couple, but who are not legally married to each other. The 2006 cycle was the first Census Program to count same-sex married couples, reflecting the legalization of same-sex marriages at that time.
Thus, the Census Program has evolved with the Canadian statistical system, adapting both to societal changes and to changing needs for information.
3.2 Evolution of methodology
While the methodology of the Canadian Census Program is much different today than in 1871, the motivations for methodological change have remained relatively constant over time, and include the following:
- improving the efficiency of the Census Program to reduce costs and improve timeliness
- minimizing respondent burden while still meeting the need for data
- addressing concerns of privacy and confidentiality
- improving the quality of Census Program data.
Many of the changes to the methodology of the Census Program have addressed more than one of these goals. The major developments in Census Program methods are grouped into collection of data, processing of data, and data quality improvement and measurement.
3.2.1 Collection of data
From 1871 to 1966 inclusive, the Census Program was conducted by the traditional canvassing methodology, where an enumerator visited the dwelling in person to complete the questionnaire with the household. By the mid-1960s, however, societal changes and concerns about privacy had made this approach increasingly expensive and problematic. Self-enumeration, where respondents complete their own questionnaires, was introduced in the 1971 cycle as a means to reduce collection costs, enhance privacy, and improve data quality. The door-to-door methodology is now restricted primarily to remote areas of the country.
Between 1971 and 2001, questionnaires were dropped off by enumerators, who subsequently received the completed forms either by mail or by picking them up at the household. These enumerators edited the questionnaires and followed up with the households, to complete any missing information.
Increasing privacy concerns related to this verification role of the local enumerator were a major driver for change in 2006. Questionnaires were mailed to 66% of households, with drop-off of questionnaires in areas where the Address Register [Footnote 8] was not yet of sufficient quality. Most importantly, instead of questionnaires being returned to a local enumerator, paper questionnaires were mailed back to a central office, where they were scanned and edited by computer. As well, respondents were offered the option of completing their questionnaire entirely by Internet, further enhancing privacy and improving data quality. Some 18% of respondents took advantage of this new way of completing their questionnaire. Follow-up to complete any missing information was conducted from the central office by telephone. The only contact by an enumerator in the field was for households that had failed to return their questionnaire.
With the development of scientific sampling methods in the 1930s, it was realized that high-quality data could be produced while reducing both costs and respondent burden. The first use of sampling was in 1941, when every tenth household received a supplementary questionnaire on housing. A major expansion of the use of sampling occurred in the 1971 cycle, when it was decided that most questions only needed to be asked of a one-in-three sample of households. The remaining two thirds of households received only a short version of the questionnaire. The sampling fraction for the long form was further reduced to one in five households in the 1981 cycle, where it remained through to 2006.
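The sampling fractions mentioned above can be illustrated with a minimal sketch. It assumes a simple systematic selection (every k-th dwelling from a list, after a random start), which is an illustrative assumption and not necessarily the selection procedure used in any Census Program cycle. The function name `systematic_sample` and the list of 3,000 dwelling identifiers are hypothetical.

```python
import random


def systematic_sample(units, k, seed=None):
    """Select every k-th unit after a random start in [0, k).

    Illustrative only: the actual selection procedures are not
    described in this report.
    """
    if k < 1:
        raise ValueError("k must be a positive integer")
    rng = random.Random(seed)
    start = rng.randrange(k)  # random starting position
    return units[start::k]


# Hypothetical list of dwelling identifiers (not real data).
dwellings = list(range(3000))

# Sampling intervals taken from the text: 1 in 10, 1 in 3, 1 in 5.
intervals = {
    "1941 housing supplement": 10,
    "1971 long form": 3,
    "1981 to 2006 long form": 5,
}

sample_sizes = {
    label: len(systematic_sample(dwellings, k, seed=1))
    for label, k in intervals.items()
}
```

Under these assumptions, moving from a one-in-three to a one-in-five sample reduces the number of households asked the full set of questions from 1,000 to 600 for every 3,000 dwellings.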
The 2006 cycle also saw the first instance of the use of administrative data as a substitute for direct collection of data from the respondents. Respondents were asked for permission to link to their tax return to obtain their income data, and over 82% of respondents agreed to this option, eliminating the need for them to answer a series of detailed income questions, thus reducing respondent burden and improving data quality.
3.2.2 Processing of data
The early Census Program questionnaires were processed and tabulated by hand, unaided by machines of any kind. By 1921, mechanical tabulation methods were beginning to be used, and in 1931, a new sorter-tabulator made production 50 times faster than was previously possible. Census Program processing has continued to take advantage of the latest developments in technology and statistical methods. For example, automated edit and imputation methods were introduced in the 1981 cycle, automated coding of write-in responses was introduced in 1991, and automated edits were built into the Internet version of the questionnaire in 2006. Replacement of many of the manual operations of the Census Program by automation has reduced costs, improved timeliness, improved data quality and enhanced privacy by having less human interaction with respondent data.
The technology for capturing data from questionnaires has also evolved over time. 'Mark-sense' technology was first used in 1951, where answers were recorded in designated positions on the questionnaire and then scanned into the computer. From 1981 to 2001, the key-entry capacity at the Canada Revenue Agency was used to transform the data from paper questionnaires to computer tapes. In 2006, with the centralization of editing and processing, paper questionnaires were scanned and intelligent character recognition technology was introduced. In the case of Internet responses, the data entry is done by the respondent directly, with edits applied in real time to catch errors and to allow the respondent to correct them.
Until the 1980s, Census Program results were compiled and published in paper form; for example, the 1871 results were published in five bilingual volumes in 1873. Today, dissemination of results by Internet has largely replaced paper publications. As well as improving the timeliness of dissemination, the Internet allows the data user much more flexibility in finding the exact information sought. For the researcher, the production of public use microdata files and access to information in Statistics Canada's research data centres have enhanced the accessibility of Census Program data while maintaining strict confidentiality protection rules.
3.2.3 Data quality improvement and measurement
The quality of Census Program data is of utmost importance, with the single most important output being the population counts. The Census Program methodologies have developed over the decades both to improve the accuracy of the census population counts and to measure the coverage of the Census Program. Early Census Program cycles concentrated primarily on developing consistent procedures and on supervisory checks of the quality of the enumerator's work. In the 1950s and 1960s, independent measures, such as having local letter carriers check the quality of the enumerators' dwelling lists, began to be introduced. In 1961, the reverse record check [Footnote 9] was developed to estimate the number of persons missed, and to identify the principal reasons for undercoverage in the Census Program. This led to additional measures being introduced, such as a follow-up of a sample of dwellings classified by the enumerator as vacant and a study of persons temporarily away from their usual place of residence. The results of these studies were included in the official census population counts. Statistical quality control methods were also introduced for manual operations such as enumerator editing and follow-up, coding, and data capture. Methods for producing estimates based on the long-form sample and for estimating the amount of sampling error were also improved during the 1971 to 1986 period.
The results of the 1986 reverse record check showed a significant increase in the level of undercoverage compared to previous Census Program cycles, and undercoverage continued to be unevenly distributed across provinces. As well as affecting the quality of the Census Program results themselves, the increased undercoverage was a serious concern for the Population Estimates Program [Footnote 10], which rebases its estimates to the census population counts every five years. Billions of dollars of major federal transfers to the provinces and territories are directly tied to the population estimates, and so coverage became a major concern for the 1991 Census Program.
A two-part strategy was adopted. The first part was to develop new methods to improve coverage; examples included the development and use of an Address Register as a coverage check on the interviewers' address listings (similar to the letter carrier checks which had been conducted in earlier cycles but later dropped for cost reasons), and the addition of new coverage questions on the questionnaire.
The second part of the strategy was to expand the coverage measurement program, with the goal of producing estimates of net undercoverage (undercoverage minus overcoverage) that were sufficiently reliable to be used by the Population Estimates Program. The reverse record check was expanded to the territories, the sample size of the reverse record check in the provinces was increased significantly, and a program to estimate the level of overcoverage was developed. The expanded program provided estimates of net undercoverage for the provinces and territories of sufficient quality to be incorporated into the 1991 and subsequent estimates of the Population Estimates Program.
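The definition of net undercoverage given above (undercoverage minus overcoverage) can be written as a short calculation. This is a minimal sketch: the function names are hypothetical, the figures are invented for illustration, and the adjustment step (adding net undercoverage to the census count) is a simplifying assumption rather than a description of the Population Estimates Program's actual method.

```python
def net_undercoverage(undercoverage, overcoverage):
    """Net undercoverage as defined in the text: persons missed
    minus persons counted more than once (or in error)."""
    return undercoverage - overcoverage


def adjusted_population(census_count, undercoverage, overcoverage):
    """Assumed adjustment: census count plus net undercoverage."""
    return census_count + net_undercoverage(undercoverage, overcoverage)


# Invented figures for a hypothetical province (not real data).
census_count = 1_000_000
persons_missed = 40_000      # estimated undercoverage
persons_overcounted = 10_000  # estimated overcoverage

net = net_undercoverage(persons_missed, persons_overcounted)
adjusted = adjusted_population(census_count, persons_missed, persons_overcounted)

# Net undercoverage expressed as a share of the adjusted population.
net_rate = net / adjusted
```

With these invented figures, the net undercoverage is 30,000 persons and the adjusted population is 1,030,000.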
3.3 Role of research and testing
Because many important decisions depend on the Census Program data and the opportunity to 'get it right' occurs only every five years, major changes to both content and methodology need to be thoroughly tested and proven before being introduced in the Census Program. Mistakes are potentially costly, both in terms of wasted money and the unavailability of essential data. Statistics Canada has a long tradition of carrying out tests and experiments before introducing major changes into the Census Program.
Major changes are sometimes tested in a large-scale 'dress rehearsal' carried out two to three years in advance of a given Census Program cycle. Tests of new or modified questions may involve large nationwide samples, so that problems of question wording can be detected. Such large-scale tests provide valuable experience to allow adjustments to be made before the Census Program itself is conducted. In other cases, however, new questions or methodologies can only be tested properly within the environment of the real-time Census Program activities, with its high level of public awareness and the scale of its operations. Statistics Canada often takes advantage of this opportunity; for example, centralized edit was first tested in the 1996 cycle and the Internet collection methodology in the 2001 cycle, before both were rolled out nationally in 2006.
Census Program methodology often evolves only over several cycles. New methods may first be introduced in a limited way, and then expanded as experience with them is gained and improvements are made. Tests of mailing out questionnaires date as far back as the 1960s, but it was only by 2006 that the creation of an Address Register was cost effective and had reached a sufficient level of quality to be used to mail questionnaires. Automated coding of write-in responses was first used in a limited way in 1991, but was only extended to all write-in questions in 2006.
The same is true of content; for example, concepts such as head of household and marital status have changed along with society, and have seen gradual, but significant, changes.
Evaluating the results of each Census Program cycle in order to develop, test and implement new questions and new methodologies is an ongoing process that requires a long-term focus and commitment.
4. 2011 Census Program: introduction of the National Household Survey
In the 2011 Census Program, all households received the same 2011 Census questionnaire. It consisted of the same eight questions (name, date of birth, sex, marital and common-law statuses, household relationships, mother tongue and the question seeking consent to release their personal information to Library and Archives Canada in 92 years) that appeared on the 2006 Census short-form questionnaire, plus two additional questions on language that had appeared on the 2006 Census long-form questionnaire. This 2011 Census questionnaire was the only form referred to as the 2011 Census in the 2011 Census Program, and completing it remained mandatory.
Instead of the census long form that was administered to 20% of the households on a mandatory basis in 2006, a new voluntary National Household Survey (NHS) was administered to 30% of the households. In addition to the same questions as the 2011 Census questionnaire, the NHS questionnaire included most of the same topics as the 2006 Census long form. Topics covered in the NHS were (in order of appearance on the questionnaire):
- Activities of daily living
- Sociocultural information
- Citizenship and immigration
- Ethnic origin, population group
- Aboriginal identity, Registered or Treaty Indian status, member of a First Nation/Indian band
- Place of birth of parents
- Labour market activities, including:
  - Place of work and journey to work
  - Work activity
  - Language at work
- Child care
- Housing, including shelter costs
- 92-year consent
The design of the 2011 Census Program was based on testing and results from previous cycles, with a particular focus on mail and Internet response. Self-enumeration areas included 98.9% of the private dwellings in Canada. In 'mail-out-letter' areas (61.4% of private dwellings), Internet promotion letters providing only an individualized and secure Internet code and a toll-free number to request a paper questionnaire were mailed. In 'mail-out-questionnaire' areas, an additional 18.1% of private dwellings (for a total mail-out of 79.5%) received a questionnaire package that also included an individualized and secure Internet code. For private dwellings where mail-out was not possible (19.4%), enumerators listed the dwellings and delivered the questionnaire packages in person ('list/leave' areas). Again, the questionnaire packages contained an individualized and secure Internet code. The remaining 1.1% of private dwellings consisted primarily of reserves and remote areas where enumerators listed dwellings and conducted face-to-face interviews (called 'canvasser' areas).
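The dwelling shares quoted above can be checked for internal consistency with a few lines of arithmetic. This is only a sketch of the bookkeeping: the dictionary keys are labels chosen here for illustration, and the percentages are those given in the paragraph above. `Decimal` is used so that the one-decimal figures add exactly.

```python
from decimal import Decimal

# Shares of private dwellings by collection method, in percent,
# as quoted in the text.
shares = {
    "mail-out-letter": Decimal("61.4"),
    "mail-out-questionnaire": Decimal("18.1"),
    "list/leave": Decimal("19.4"),
    "canvasser": Decimal("1.1"),
}

# Total mail-out: letter areas plus questionnaire areas.
mail_out_total = shares["mail-out-letter"] + shares["mail-out-questionnaire"]

# Self-enumeration areas: mail-out plus list/leave.
self_enumeration_total = mail_out_total + shares["list/leave"]

# All private dwellings: self-enumeration plus canvasser areas.
overall_total = self_enumeration_total + shares["canvasser"]
```

The totals reproduce the figures in the text: 79.5% for total mail-out, 98.9% for self-enumeration areas, and 100% overall.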
This mail-out strategy relied extensively on Statistics Canada's Address Register, a frame of residential location-based addresses that also maintains mailing addresses (McClean and Charland 2011). To delineate the mail-out areas (for both the 2011 Census and NHS), the location address had to follow civic style addressing (i.e., it contains at least a civic number, a street name and a municipality) and Canada Post Corporation (CPC) had to use the same civic style addressing for mail distribution. Small pockets of potential mail-out areas within otherwise non-mail-out areas were removed to ease the management of field collection operations.
To promote response while remaining cost-effective, a new 'wave' collection methodology was tested and put in place for 2011. Adjustments to the original plans were required to react to CPC disruptions and environmental events in certain parts of the country (e.g., forest fires). The wave collection methodology is summarized in the box below.
Wave collection methodology for the 2011 Census and NHS
On May 3 (i.e., seven days before Census Day), the Internet promotion letters or questionnaires for the 2011 Census were delivered by CPC in mail-out areas and questionnaires were dropped off by enumerators in list/leave areas. Between May 13 and 19, reminder letters were sent to census non-respondents. Between May 20 and 31, census non-respondents in the mail-out-letter areas received for the first time a questionnaire package, and on May 24, those in the mail-out-questionnaire areas received a telephone voice broadcast reminder when Statistics Canada had a phone number for them. All census non-respondents were subsequently moved to the non-response follow-up stage, performed by enumerators, which started on June 1 and ended on August 5 in all areas.
For the NHS, census Internet respondents who were in the NHS sample were offered the option to respond online to the NHS immediately after completion of their census questionnaire. Between June 6 and 9, these NHS households received a reminder letter if they had not yet completed the questionnaire. Between June 3 and 14, census paper questionnaire respondents in the NHS sample received their NHS paper questionnaire for the first time. Between June 13 and 16, these NHS households received a reminder letter if they had not yet completed the questionnaire. All NHS non-respondents (including census non-respondents selected in the NHS) were subsequently moved to the non-response follow-up, performed by enumerators, which started as early as June 8. On July 14, remaining NHS non-respondents were subsampled at 33%, and only those in the subsample were followed up further. This follow-up ended on August 24 in all areas.
It is worth noting that the total duration of census collection activities was one month shorter in 2011 compared to 2006, and about two weeks shorter for the NHS compared to the census long form in 2006. In 2006, there were particular challenges in mobilizing the necessary workforce to perform all collection activities in certain parts of the country.
The move from the mandatory census long form in 2006 to the voluntary NHS in 2011 was expected to have a significant impact on response rates. With lower response rates comes the risk of increased non-response bias. No tests to predict these impacts had ever been done in Canada on a survey of this magnitude with self-enumeration as the main collection mode. The only example was a test conducted in the Labour Force Survey (which is conducted by interviews, not self-enumeration) in late 1998 and early 1999 to assess the possible impact of a change from mandatory to voluntary status. The analysis was limited to those who refused to complete the survey. The refusal rate for the voluntary portion of the test (around 3%) was consistently about twice that of the mandatory portion (around 1.5%). The study also found that a voluntary survey could lead to significant differences in the national unemployment rate.
The closest experience internationally was a 2003 test of making response to the American Community Survey (ACS) in the United States voluntary. The ACS, which replaced the decennial census long form last conducted in 2000, is carried out continuously. For a given monthly sample, collection by mail is attempted the first month, followed by telephone follow-up the next month and finally by face-to-face interviews the third month but only for a subsample of the non-respondents. The voluntary test resulted in a drop of 20.8 percentage points in the mail cooperation rate and a drop of 13.3 percentage points in the telephone cooperation rate (Griffin and Raglin 2011). As a result, if the ACS were to become voluntary, more non-respondents would have to be moved to the telephone follow-up and more would reach the stage of face-to-face interviews, two collection modes that are more costly than mail. Griffin (2011) has estimated that an additional 66.5 million U.S. dollars (an increase of 48%) would be needed to preserve the same quality with a voluntary ACS. If the budget remained unchanged, the loss in quality (sampling variance) would be of the order of 45%.
To compensate for the loss of responses (increased sampling variance) and the potential increase in bias, three main actions were taken for the 2011 NHS. First, the sampling fraction was increased from 20% (about 2.5 million dwellings) for the 2006 Census long form to 30% (about 4.5 million dwellings) for the 2011 NHS. With an achieved response rate of 93.5% in the 2006 Census long form and an anticipated response rate of only 50% for the 2011 NHS, the sampling fraction increase was meant to achieve approximately the same effective sample size (and similar sampling variance), i.e., approximately the same number of responding dwellings. It is important to note that increasing the sample size does not prevent non-response bias; it only ensures that the dissemination of 2011 results for very small local areas would not be restricted, compared to 2006, by a lack of responding units in the sample.
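The reasoning behind the larger sample can be illustrated with a rough calculation. The sketch below multiplies the approximate dwelling counts quoted above by the corresponding response rates. The function name is hypothetical, and the calculation assumes, for simplicity, that the response rate applies to all sampled dwellings (in practice it applies to occupied dwellings only).

```python
def expected_respondents(sampled_dwellings: int, response_rate: float) -> float:
    """Rough number of responding dwellings: sample size times response rate."""
    return sampled_dwellings * response_rate


# 2006 Census long form: about 2.5 million dwellings, 93.5% achieved response.
long_form_2006 = expected_respondents(2_500_000, 0.935)

# 2011 NHS: about 4.5 million dwellings, 50% anticipated response (planning value).
nhs_2011_planned = expected_respondents(4_500_000, 0.50)

# Ratio of the planned NHS respondent count to the 2006 long-form count.
ratio = nhs_2011_planned / long_form_2006

print(f"2006 long form: {long_form_2006:,.0f} responding dwellings")
print(f"2011 NHS (planned): {nhs_2011_planned:,.0f} responding dwellings")
print(f"ratio: {ratio:.2f}")
```

Under these simplifying assumptions, the planned number of responding dwellings in 2011 is within a few percent of the 2006 figure, which is the sense in which the effective sample size was expected to be approximately the same.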
Second, to address the issue of non-response bias, a random subsample of the remaining NHS non-respondents as of July 14 was selected, with particular attention given to areas likely to contain populations at risk of not responding or to be heterogeneous according to 2006 Census Program data. Additional efforts were made to follow up and obtain responses from this subsample. Subsampling and additional follow-up of non-respondents is a well-known technique in the survey methodology literature and was first proposed by Hansen and Hurwitz (1946). For example, as mentioned earlier, it is used by the ACS to subsample the non-respondents to be followed up by face-to-face interviews. The technique assumes that the extra efforts made to elicit responses for all units in the subsample (e.g., by moving from mail-out/mail-back collection to face-to-face interviews for the subsample) will result in responses being obtained from most of the subsample. In practice, however, there is no collection strategy that will ensure full response for the subsample. Statistics Canada's planning assumption was for a 75% response rate in the subsample.
Third, 2011 Census data are available for almost all households in Canada and this information can be used in weighting the NHS responding households. This method can reduce non-response bias in the NHS to the extent to which non-response in the NHS is correlated with the 2011 Census variables. In addition, there is a plan to further adjust for non-response using income tax information. It is unlikely, however, that all NHS non-response bias can be explained via these variables.
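The general idea of using census information to weight responding households can be sketched with a simple weighting-class adjustment. This is a generic textbook technique shown for illustration only, not a description of the method Statistics Canada applied. The classes, design weights and response indicators below are all hypothetical.

```python
from collections import defaultdict


def nonresponse_adjusted_weights(units):
    """Weighting-class adjustment for non-response.

    `units` is a list of dicts with keys 'cls' (a class formed from census
    variables), 'design_weight' and 'responded'. Within each class, the
    weights of respondents are inflated so that they sum to the total design
    weight of the class. Non-respondents receive a weight of 0.
    """
    class_total = defaultdict(float)
    class_responding = defaultdict(float)
    for unit in units:
        class_total[unit["cls"]] += unit["design_weight"]
        if unit["responded"]:
            class_responding[unit["cls"]] += unit["design_weight"]

    adjusted = []
    for unit in units:
        if unit["responded"]:
            factor = class_total[unit["cls"]] / class_responding[unit["cls"]]
            adjusted.append(unit["design_weight"] * factor)
        else:
            adjusted.append(0.0)
    return adjusted


# Hypothetical sample: two classes of four households, design weight 3.0 each.
sample = (
    [{"cls": "A", "design_weight": 3.0, "responded": r} for r in (True, True, True, False)]
    + [{"cls": "B", "design_weight": 3.0, "responded": r} for r in (True, True, False, False)]
)
weights = nonresponse_adjusted_weights(sample)
print(weights)
```

In this hypothetical example, respondents in the class with the lower response rate receive the larger adjusted weight, and the adjusted weights still sum to the total design weight. The adjustment removes bias only to the extent that respondents and non-respondents within a class are similar, which is why the text cautions that not all non-response bias can be explained by the census variables.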
4.2 Preliminary results
As mentioned in Section 4.1, data collection ended in August for the 2011 Census and NHS. Because the quality of the census and NHS data has not been fully assessed, most of the findings presented here are preliminary. The evaluation of data quality will continue for many more months as Statistics Canada conducts the different evaluations which will lead to the release of the census and NHS information, beginning respectively in February 2012 and in 2013.
The following tables provide a comparison of the response rates for the 2011 Census and NHS to those of the 2006 Census. The response rates are as of August 24, 2011, for the 2011 Census and as of September 30, 2011, for the 2011 NHS, and as such are considered preliminary. They are obtained directly from collection results, i.e., before data processing and data quality verification. They are calculated as the number of private dwellings that returned a questionnaire divided by the number of private dwellings classified as occupied by field staff. For the NHS, the rate is calculated on the sampled private dwellings only. The 2006 response rates are the final ones, i.e., after data processing and data quality verification.
Table 2 shows that the 2011 Census response rates are slightly better than those of the 2006 Census, except in the territories. However, these rates must be compared with great caution. First, as mentioned above, the 2006 rates are final as opposed to the 2011 ones. Second, these differences result from both substantial changes in the design of the census (e.g., moving from a combination of short and long forms in 2006 to a single short form in 2011) and improvement in data collection methods in 2011. More processing and analysis will be needed to get a better understanding of these differences.
Table 3 shows that the response rates from the 2011 NHS are much lower than those obtained during the 2006 Census with the long form. Once again, caution must be exercised when comparing these rates. However, further adjustments to the 2011 rates are unlikely to significantly change these differences. It should be noted that the 69.3% response rate obtained in 2011 is much higher than the 50% rate initially expected during the planning of the survey.
Table 4 shows the return rates of census questionnaires obtained in 2006 and 2011. The return rate is equal to the number of questionnaires completed and returned by Canadian households, either by Internet or by mail using the paper questionnaire, divided by the number of occupied dwellings in parts of Canada where self-enumeration was planned.
Table 4: Internet or mail return rate of 2006 and 2011 Census questionnaires (self-enumerated regions only)
In most provinces and territories, return rates for the 2011 Census are higher than those of the previous census. At this stage of the analysis, it is thought that the increase is the result of the new wave collection methodology, introduced with the 2011 Census Program.
When considering the return rate of 2011 Census questionnaires according to response mode, the rates obtained in 2011 are 53.7% for Internet and 31.5% for mail, with a total return rate of 85.2%, as shown in Table 4, whereas these rates were 18.3% and 62.4% respectively in 2006. The very large increase in Internet returns for the 2011 Census is mainly due to the mail-out, to more than 60% of Canadian private dwellings, of a letter with an Internet access code and instructions on how to obtain a paper questionnaire, while in 2006, dwellings had received a questionnaire with a return envelope.
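The return rates by mode quoted above can be combined to compare the two census cycles. The sketch below uses only the percentages stated in the preceding paragraph. The variable names are illustrative.

```python
# Return rates by response mode, in percent, as quoted in the text.
RETURN_RATES = {
    2006: {"internet": 18.3, "mail": 62.4},
    2011: {"internet": 53.7, "mail": 31.5},
}

# Total return rate for each census year (Internet plus mail).
totals = {year: round(sum(modes.values()), 1) for year, modes in RETURN_RATES.items()}

# Change in the total return rate between 2006 and 2011, in percentage points.
gain_points = round(totals[2011] - totals[2006], 1)

# Ratio of the 2011 Internet return rate to the 2006 Internet return rate.
internet_ratio = round(
    RETURN_RATES[2011]["internet"] / RETURN_RATES[2006]["internet"], 2
)

print(totals, gain_points, internet_ratio)
```

The total return rate rises by about four and a half percentage points between the two cycles, while the Internet return rate is nearly three times its 2006 level.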
Table 5 shows the distribution, by 2011 Census and NHS response mode, of households that completed the 2011 Census and were selected for the 2011 NHS. Households that did not complete the 2011 Census are excluded from this table.
Table 5: Distribution of households that completed the 2011 Census and were selected for the 2011 National Household Survey (NHS), by 2011 Census and NHS response mode
The distribution analysis in Table 5 shows that 13.7% of households that completed the 2011 Census via Internet did not respond to the NHS, compared to 48.2% of those households that completed the 2011 Census by mail. Among households that completed the 2011 Census by Internet, 81.1% also completed the NHS by Internet. Overall, 64.2% (not shown in Table 5) of all households that responded to the NHS chose the Internet response mode.
It is important to remember that households that completed their 2011 Census questionnaire by Internet and were selected for the NHS were immediately offered the option of completing the NHS. This was not the case for households that returned their completed census questionnaire by mail; the NHS questionnaire was sent to them about four weeks later. This could suggest that an integrated census and NHS collection increases the NHS response rate. However, it is premature to conclude that more NHS questionnaires would have been returned by mail if both questionnaires had been sent out at the same time, or even combined. More tests and analyses are required.
Another important measure for the NHS is the weighted collection response rate. This rate takes into account the subsampling of NHS non-respondents, which was intended to minimize potential non-response bias. The weighted response rate represents the response rate that would have been achieved if follow-ups had been done for all non-responding NHS households instead of only a subsample of them. According to initial analyses, the preliminary weighted response rate at the national level is slightly less than 80%. This rate is well below the 93.5% rate obtained with the 2006 long form. As mentioned in Section 4.1, Statistics Canada had hoped to obtain responses for 75% of the subsample of non-respondents, but in reality, this rate does not exceed 50% according to initial analyses.
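A weighted rate of this kind can be sketched as follows. The formula is an assumption based on the standard treatment of subsampled non-respondents (each respondent in the subsample represents the non-respondents that were not selected), and it is not taken from this report. All counts below are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def response_rates(occupied, early_respondents, subsample_size, subsample_respondents):
    """Return (unweighted, weighted) response rates under subsampling.

    occupied:              sampled dwellings classified as occupied
    early_respondents:     dwellings that responded before subsampling
    subsample_size:        non-respondents selected for further follow-up
    subsample_respondents: dwellings in the subsample that responded
    """
    remaining = occupied - early_respondents
    # Each subsampled dwelling represents this many remaining non-respondents.
    subsample_weight = remaining / subsample_size

    unweighted = (early_respondents + subsample_respondents) / occupied
    weighted = (early_respondents + subsample_weight * subsample_respondents) / occupied
    return unweighted, weighted


# Hypothetical example: 1,000 occupied dwellings, 610 early respondents,
# 130 of the 390 remaining non-respondents subsampled (one in three),
# and 60 responses obtained from the subsample.
unweighted, weighted = response_rates(1000, 610, 130, 60)
print(f"unweighted: {unweighted:.1%}, weighted: {weighted:.1%}")
```

In this hypothetical case, the weighted rate exceeds the unweighted rate because each response obtained in the subsample counts for three non-responding dwellings. The gap between the two rates grows with the response rate achieved in the subsample.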
That being said, greater non-response does not automatically imply bias. Further assessments will have to be carried out, but preliminary analyses based on 2011 Census data showed no major differences between NHS respondents and non-respondents in terms of age or gender. Small differences were noted with regard to marital status (slightly fewer separated, divorced and widowed persons responded to the NHS) and mother tongue (slightly fewer persons whose mother tongue is English responded to the NHS).
Analysis of preliminary response rates (weighted and unweighted) also revealed that rates vary little across provinces and across census metropolitan areas (CMAs). However, at a lower level, such as census subdivisions (i.e., municipalities as defined by provincial or territorial legislation), there are considerable variations for subdivisions with fewer than 10,000 dwellings. For some of them, response rates are lower than the national response rate. Variation increases for even smaller geographic units, such as subdivisions with fewer than 1,000 dwellings or dissemination areas. Such variation was also observed during the 2006 Census Program, but to a much smaller extent than with the 2011 NHS.
For households that chose to respond to the NHS, preliminary analysis of response rates to NHS questions seems to show that rates for the first modules of the NHS questionnaire, up to the education module, are not very different from the rates for the same modules in the 2006 long form. However, the differences are larger starting with the labour market activity module. The main reasons for the differences compared to 2006 could be the absence of follow-up for partial non-response to NHS questions and the voluntary nature of the questionnaire, which may have had an impact on the perseverance of respondents.
In conclusion, for the 2011 Census, the preliminary response rate is generally higher than the 2006 final response rate. The questionnaire return rate is higher than that of 2006 by nearly five percentage points. The Internet completion rate of questionnaires is nearly three times higher than that of 2006. Finally, the total duration of collection operations was almost one month shorter compared to 2006.
As for the NHS, the preliminary response rate is approximately 25 percentage points below the final response rate for the 2006 long form. When the NHS sample design is taken into account, the preliminary weighted response rate narrows this gap by at most 10 percentage points. Several methods are being developed and implemented to process and weight the NHS results in order to reduce the impact of non-response on estimates. An analysis of the effects of increased non-response on the quality of estimates is also underway.