# 2011 Census: Data Quality and Confidentiality Standards and Guidelines (Public)

## Confidentiality (non-disclosure) rules

The following describes the various rules used to ensure confidentiality (or non-disclosure) of individual respondent identity and characteristics. All census data are subject to confidentiality (non-disclosure) rules.

## Area suppression for standard and non-standard geographic areas

Area suppression is used to remove all characteristic data for geographic areas below a specified population size.

The specified population size for all standard areas or aggregations of standard areas is 40, except for blocks, block-faces or postal codes. Consequently, no characteristics or tabulated data are to be released if the total population of the area is less than 40.

The specified population size for six-character postal codes (forward sortation area – local distribution unit [FSA-LDU]), geocoded areas and custom areas built from the block, block-face or LDU levels is 100. Consequently, no characteristics or tabulated data are to be released if the total population of the area is less than 100. Generally, blocks and individual urban block-faces (one side of the street between two intersections) will be too small to meet the population size thresholds specified above. Where an aggregation of blocks or block-faces falls above the applicable population size threshold, data can be retrieved through a custom tabulation.

These specified population size thresholds are applied to 2011 Census data as well as all previous census data.
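The two thresholds can be expressed as a simple check. The following is a minimal sketch only; the constant and function names are hypothetical, and the flag `built_from_small_units` is an assumed way of marking postal code, geocoded and custom areas built from the block, block-face or LDU levels.

```python
# Thresholds taken from the text above; all names here are illustrative.
STANDARD_AREA_MIN_POPULATION = 40
SMALL_UNIT_AREA_MIN_POPULATION = 100  # postal codes, geocoded and custom areas


def minimum_population(built_from_small_units: bool) -> int:
    """Return the population threshold that applies to an area."""
    if built_from_small_units:
        return SMALL_UNIT_AREA_MIN_POPULATION
    return STANDARD_AREA_MIN_POPULATION


def is_area_suppressed(total_population: int, built_from_small_units: bool) -> bool:
    """True when no characteristic data may be released for the area."""
    return total_population < minimum_population(built_from_small_units)


# A standard area with 39 people is suppressed; one with 40 is not.
print(is_area_suppressed(39, built_from_small_units=False))  # True
print(is_area_suppressed(40, built_from_small_units=False))  # False
# A custom area built from block-faces needs at least 100 people.
print(is_area_suppressed(60, built_from_small_units=True))   # True
```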

Please refer to the section "Postal code minimum aggregation rules" for additional rules applicable to postal code data.

## Postal code minimum aggregation rules

In addition to the confidentiality rules on disseminating census data by postal code, the following rules apply to postal codes. These rules fall under clause 03.01 (n) of the Commercial Non-Mailing licence between Statistics Canada and Canada Post Corporation.

- All requests must include batches of two or more postal codes. The only exception is for postal codes that have a zero as the second character (rural postal codes).
- Groups of postal codes are to be assigned a unique classification/number (e.g. K1A 0T6, 0T7, 0T8 = Custom Area 1). Under the terms of the contract listed above, clients cannot be provided with lists of postal codes; only the name specified in the client's request can be used.
- All other confidentiality rules for custom extractions still apply, as per "Area suppression for standard and non-standard geographic areas".
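The batching rule in the first bullet can be sketched as a small validation step. This is an illustration under stated assumptions, not the procedure Statistics Canada uses: the function names are hypothetical, the rural marker is assumed to be a zero in the second position of the six-character code, and the exception is read as allowing a single rural postal code to be requested on its own.

```python
from typing import Iterable


def normalize_postal_code(code: str) -> str:
    """Drop spaces and upper-case the code, e.g. 'k1a 0t6' -> 'K1A0T6'."""
    return code.replace(" ", "").upper()


def is_rural_postal_code(code: str) -> bool:
    """Assumed rural test: a zero in the second position of the code."""
    normalized = normalize_postal_code(code)
    return len(normalized) == 6 and normalized[1] == "0"


def batch_meets_minimum(codes: Iterable[str]) -> bool:
    """True when a request holds two or more distinct postal codes,
    or exactly one postal code that is rural."""
    distinct = {normalize_postal_code(code) for code in codes}
    if len(distinct) >= 2:
        return True
    return len(distinct) == 1 and all(is_rural_postal_code(c) for c in distinct)


print(batch_meets_minimum(["K1A 0T6"]))             # False: single urban code
print(batch_meets_minimum(["K1A 0T6", "K1A 0T7"]))  # True: batch of two
print(batch_meets_minimum(["K0A 1A0"]))             # True: single rural code
```

A batch that passes this check is still subject to the area suppression threshold of 100 described earlier.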

Also, the following disclaimer is applicable to all postal code custom requests:

**Postal code validation disclaimer:** Statistics Canada makes no representation or warranty as to, or validation of, the accuracy of any postal code^{OM} data submitted to Statistics Canada.

Please note these rules are applicable to historical postal code requests as well.

## Random rounding

All counts in census tabulations are subjected to random rounding. Random rounding transforms all raw counts to random rounded counts. This reduces the possibility of identifying individuals within the tabulations.

All counts are rounded to a base of 5, meaning they will end in either 0 or 5. The random rounding algorithm employed controls the results and rounds the unit value of the count according to a predetermined frequency. The table below shows those frequencies. Note that counts ending in 0 or 5 are not changed and remain as 0 or 5.

| Unit value of count | Will round to count ending in 0 | Will round to count ending in 5 |
|---|---|---|
| 1 | 4 times out of 5 | 1 time out of 5 |
| 2 | 3 times out of 5 | 2 times out of 5 |
| 3 | 2 times out of 5 | 3 times out of 5 |
| 4 | 1 time out of 5 | 4 times out of 5 |
| 5 | Never | Always |
| 6 | 1 time out of 5 | 4 times out of 5 |
| 7 | 2 times out of 5 | 3 times out of 5 |
| 8 | 3 times out of 5 | 2 times out of 5 |
| 9 | 4 times out of 5 | 1 time out of 5 |
| 0 | Always | Never |

The random rounding algorithm uses a random seed value to initiate the rounding pattern for tables. In these routines, the method used to seed the pattern can result in the same count in the same table being rounded up in one execution and rounded down in the next.
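The rounding frequencies described in this section can be reproduced with a short sketch. It assumes each count is rounded independently, with the probability of rounding up equal to the distance from the lower multiple of 5 divided by 5; the production algorithm "controls the results" in ways not described here, so this is an approximation of the behaviour, not the actual routine. The function name and the seed values are hypothetical.

```python
import random

BASE = 5  # all counts are rounded to a multiple of 5


def random_round(count: int, rng: random.Random) -> int:
    """Randomly round a non-negative count to an adjacent multiple of 5."""
    remainder = count % BASE
    if remainder == 0:
        return count  # counts ending in 0 or 5 are left unchanged
    lower = count - remainder
    # Round up `remainder` times out of 5, down otherwise.
    if rng.random() < remainder / BASE:
        return lower + BASE
    return lower


# Two executions with different seeds may round the same count differently.
first_run = random.Random(1)
second_run = random.Random(2)
print(random_round(13, first_run), random_round(13, second_run))
```

Under this scheme a count ending in 1 or 6 moves down four times out of five and up once, which matches the published frequencies, and the expected value of the rounded count equals the raw count.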

## Disclosure avoidance for statistics

Statistics (such as mean, standard error, sum, median, percentile, ratio or percentage) are not subject to random rounding. However, when shown in tabulations alongside the counts used to calculate them, they can result in the disclosure of individuals. To prevent this, we use statistic suppression methods or special statistic calculations.

### Statistic suppression

For all quantitative variables, a statistic is suppressed if the number of actual records used in the calculation is less than 4.
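As an illustration, the rule can be applied as a guard in front of any statistic. The sketch below uses the mean as an example; the function name and the use of `None` to represent a suppressed value are assumptions made for this example.

```python
from statistics import mean
from typing import Optional, Sequence

MIN_RECORDS = 4  # statistics based on fewer actual records are suppressed


def mean_or_suppressed(values: Sequence[float]) -> Optional[float]:
    """Return the mean, or None when fewer than 4 records contribute."""
    if len(values) < MIN_RECORDS:
        return None  # suppressed
    return mean(values)


print(mean_or_suppressed([1, 2, 3]))     # None: only 3 records
print(mean_or_suppressed([1, 2, 3, 4]))  # 2.5
```

The check uses the actual (unrounded) number of records, as the rule states, not the randomly rounded frequency.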

### Special statistic calculations

1. The statistic value is never rounded, except for frequencies.
2. All statistics based on ranks (medians, percentiles) are calculated in the usual way and are never rounded. We never release the minimum or the maximum of a statistic.
3. When a sum is specified for age, the program multiplies the unrounded average of the group in question by the rounded frequency. If a sum is specified for a variable other than age, the program rounds the actual sum.

When a division is specified (averages, percentages, ratios, etc.), the program must apply point (3) to both the numerator and the denominator before it proceeds with the division.

**Note:** Statistics based on ranks, such as medians and percentiles, are always calculated via linear interpolation. As a result, these statistics are not reliable for low-count cells, which is why no additional confidentiality measures are applied to them.

**Note:** The average of an age is not altered by the rounding, because the numerator is the product of the true average and the rounded frequency, and the denominator is the rounded frequency. The two frequencies cancel each other, leaving the true average untouched.
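The age calculation and the cancellation described in the note can be checked numerically. This is a minimal sketch: the function names and the sample ages are invented for illustration, the rounded frequency is supplied by the caller, and a positive rounded frequency is assumed so that the division is defined.

```python
from typing import Sequence


def published_age_sum(ages: Sequence[float], rounded_frequency: int) -> float:
    """Sum for age: unrounded group average times the rounded frequency."""
    if not ages or rounded_frequency <= 0:
        raise ValueError("need at least one record and a positive frequency")
    true_average = sum(ages) / len(ages)
    return true_average * rounded_frequency


def published_age_average(ages: Sequence[float], rounded_frequency: int) -> float:
    """Average for age: the special sum divided by the rounded frequency."""
    return published_age_sum(ages, rounded_frequency) / rounded_frequency


# Seven hypothetical records; their count could be rounded to 5 or to 10.
ages = [20, 30, 41, 52, 33, 27, 45]
for rounded_frequency in (5, 10):
    print(rounded_frequency,
          published_age_sum(ages, rounded_frequency),
          published_age_average(ages, rounded_frequency))
```

Whichever way the frequency is rounded, the published sum changes but the published average stays at the true average of the group, up to floating-point error.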