Regulation (EU) 2018/1091 states that "the Commission is to respect the confidentiality of the data transmitted in line with Regulation (EC) No 223/2009 of the European Parliament and of the Council. The necessary protection of confidentiality of data should be ensured, among other means, by limiting the use of the location parameters to spatial analysis of information and by appropriate aggregation when publishing statistics. For that reason a harmonised approach for the protection of confidentiality and quality aspects for data dissemination should be developed, while making efforts to render online access to official statistics easy and user-friendly".

Article 3 of Regulation (EC) No 223/2009 defines 'confidential data' as data which allow a statistical unit (i.e. the person, company or organisation to which the data refer) to be identified, either directly or indirectly, thereby disclosing individual information. To determine whether a statistical unit is identifiable, account shall be taken of all relevant means that might reasonably be used by a third party to identify the statistical unit.

The risk of a statistical unit being identified is the only factor that qualifies data as confidential. It does not matter which information is disclosed or whether this information is sensitive<a href="#risk" aria-describedby="footnote-label" id="risk-ref"></a>. In this light, one cannot argue that some variables (e.g. crops, livestock) are less sensitive than others (e.g. labour force).

GDPR


On 8 February 2018, the Directors-General and Presidents of the National Statistical Institutes (NSIs) and of the European Union's statistical authority (Eurostat) met at an informal workshop on the implications of the GDPR for European statistics and issued the following conclusions. The participants:

  1. acknowledged the high relevance of the GDPR implementation for the production of high quality official statistics and for maintaining the confidence of the respondents providing personal data for statistical purposes;
  2. recognised that in almost all Member States procedures have been initiated to enact derogations from the data subjects' rights referred to in some or all of the following Articles of the GDPR: 15 (access), 16 (rectification), 18 (restriction) and 21 (objection);
  3. agreed that the same derogations should apply across all statistical domains and should not be domain-specific;
  4. acknowledged that the NSIs and other statistical authorities (ONAs) are responsible for the protection of all personal data they process, both those collected in the framework of an EU regulation and those collected for purely national interests;
  5. noted that appropriate derogations in national law, when granted, could in most cases be sufficient to effectively address the potential ramifications of the GDPR and the specific needs of statistical production in each Member State;
  6. agreed that, in the interest of harmonising the protection of the data subjects' rights in the field of official statistics, additional uniform derogations at EU level, notably in Regulation (EC) No 223/2009, could be useful and should be considered once enough experience in the application of the GDPR has been collected; in this respect, discussion at expert level should be organised at a later stage;
  7. agreed to share experience and best practice in addressing the implications of the GDPR for official statistics at the national level; to this end, a collaborative platform will be created by Eurostat to store and share examples of national provisions and justifications for derogations;
  8. emphasised the need to establish constructive dialogue with data protection authorities at national and European level in order to clarify the specificities of statistical production, including a better understanding of statistical methodology and existing safeguards.

Data storage and dissemination

To be developed

Confidentiality measures for microdata

See 9.5.3 - Scientific use files

Confidentiality measures for location

Geospatial data confront disclosure control experts with a paradox. On the one hand, such data need more protection because they make identification easier; on the other hand, they offer many possibilities for analysis, which users do not want to see distorted too much by data suppression. Disclosure risk is higher when considering geospatial data:

Technically, the dissemination classification (zoning, administrative boundaries, or regular tessellations such as grid squares) is a categorical variable like any other (an additional dimension of tabular data). It is therefore possible to deal with disclosure risk without any geographical consideration. Nevertheless, a geographically intelligent management of disclosure issues will preserve the underlying spatial phenomenon. A risk-utility compromise has to be made, using relevant distortion indicators (EFGS & Eurostat, 2017). The risk of identifying holdings by crossing census gridded data with the proposed scientific use files (see 9.5.3 - Scientific use files) is close to zero. This is due to the following facts:

Figure 65 – Agricultural holding density (number of farms per square km of UAA)


To protect confidentiality in the case of very large holdings, where a grid cell may contain only one farm, the position of that farm can be allocated to the nearest neighbouring cell that contains at least one other holding. If none of the 8 neighbouring cells (examined in random order) contains at least one other holding, the search has to be extended to cells further away until a suitable grid cell is found. As far as possible, the chosen cell should lie within the same NUTS3 region as the original cell. A cell is considered to belong to a NUTS3 region if its lower-left coordinate is inside the polygon that defines the NUTS3 region at the 1:100.000 scale.

Figure 66 – If only one farm at a location, assign it to a random neighbouring cell within the same NUTS3; if still not possible, enlarge the area. 
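The relocation rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the production procedure: cells are identified by hypothetical integer (column, row) indices, `nuts3_of` is a hypothetical lookup returning the NUTS3 code of a cell, `max_radius` is an arbitrary search limit, and the choice to prefer a more distant cell in the same NUTS3 region over a nearer cell in another region is one possible reading of the rule.

```python
import random
from collections import Counter


def ring(cell, radius):
    """Cells at Chebyshev distance `radius` from `cell` (the 8 neighbours for radius 1)."""
    col, row = cell
    return [
        (col + dc, row + dr)
        for dc in range(-radius, radius + 1)
        for dr in range(-radius, radius + 1)
        if max(abs(dc), abs(dr)) == radius
    ]


def find_target(cell, live, nuts3_of, max_radius, rng):
    """Nearest occupied cell, preferring the NUTS3 region of `cell` (assumed reading)."""
    fallback = None
    for radius in range(1, max_radius + 1):
        candidates = [c for c in ring(cell, radius) if live[c] >= 1]
        rng.shuffle(candidates)  # neighbours are examined in random order
        for candidate in candidates:
            if nuts3_of(candidate) == nuts3_of(cell):
                return candidate
        if fallback is None and candidates:
            fallback = candidates[0]  # nearest occupied cell in another region
    return fallback  # None if no occupied cell exists within max_radius


def relocate_single_holdings(holdings, nuts3_of, max_radius=50, seed=None):
    """Return {holding_id: cell} where holdings alone in a cell have been moved.

    holdings: dict mapping a holding id to its (column, row) cell index.
    Counts are updated after each move, so two isolated neighbours end up
    in the same cell instead of swapping places.
    """
    rng = random.Random(seed)
    result = dict(holdings)
    live = Counter(result.values())  # current number of holdings per cell
    for holding_id, cell in holdings.items():
        if live[cell] != 1:
            continue  # the cell is shared (or has already gained a holding)
        target = find_target(cell, live, nuts3_of, max_radius, rng)
        if target is None:
            continue  # left in place; would need manual treatment
        live[cell] -= 1
        live[target] += 1
        result[holding_id] = target
    return result
```

For example, with `holdings = {"a": (0, 0), "b": (1, 0), "c": (1, 0)}` and a single NUTS3 region, holding "a" is alone in its cell and is moved to cell (1, 0), which already contains two holdings.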

Multi-resolution grids

Multi-resolution grids are represented by a hierarchical structure through two associations. Each StatisticalGrid instance can be associated with a lower and/or an upper resolution grid through the Hierarchical relation association. A StatisticalGridCell belonging to a given StatisticalGrid is composed of the overlapping cells of its grid's lower resolution grid, and composes the cell it overlaps in its grid's higher resolution grid. Lower and upper StatisticalGridCells are associated through the Hierarchical composition.

Figure 71 – INSPIRE Grid

Source: https://inspire.ec.europa.eu/id/document/tg/su 
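The hierarchical composition between cells of two nested grids can be illustrated with a small Python sketch. It is not the INSPIRE data model itself: `GridCell`, `parent_cell` and `child_cells` are hypothetical names, a cell is reduced to its side length and lower-left coordinate in metres, and the two grids are assumed to share an origin and to have resolutions that divide evenly. The terms "coarser" and "finer" are used here to avoid any ambiguity about the direction of the hierarchy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GridCell:
    resolution: int  # cell side length in metres
    east: int        # easting of the lower-left corner in metres
    north: int       # northing of the lower-left corner in metres


def parent_cell(cell, coarser_resolution):
    """Cell of the coarser grid that contains `cell`."""
    if coarser_resolution % cell.resolution != 0:
        raise ValueError("resolutions must nest evenly")
    return GridCell(
        coarser_resolution,
        cell.east // coarser_resolution * coarser_resolution,
        cell.north // coarser_resolution * coarser_resolution,
    )


def child_cells(cell, finer_resolution):
    """Cells of the finer grid that together make up `cell`."""
    if cell.resolution % finer_resolution != 0:
        raise ValueError("resolutions must nest evenly")
    offsets = range(0, cell.resolution, finer_resolution)
    return [
        GridCell(finer_resolution, cell.east + de, cell.north + dn)
        for de in offsets
        for dn in offsets
    ]
```

Under these assumptions, a 1 km cell with lower-left corner (4334000, 2684000) belongs to the 10 km cell with lower-left corner (4330000, 2680000), and that 10 km cell is composed of 100 cells of 1 km.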

Confidentiality for tabular data

Eurostat disseminates a large number of statistical tables. All these tabular data are treated for primary confidentiality. Primary confidentiality concerns tabular cell data whose dissemination would permit attribute disclosure. The two main reasons for declaring data to be primary confidential are too few contributors in a cell and dominance of the n largest contributors in a cell<a href="#conf" aria-describedby="footnote-label" id="conf-ref"></a>.
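As an illustration of the two reasons mentioned above, the following Python sketch tests a single cell against a minimum-frequency rule and an (n, k) dominance rule. The function name and the parameters `min_contributors`, `n` and `k` are hypothetical, and the values used in the example are purely illustrative; they are not the thresholds applied by Eurostat.

```python
def is_primary_confidential(contributions, min_contributors, n, k):
    """Return True if a cell fails the frequency or the (n, k) dominance rule.

    contributions: individual contributions of the units in the cell.
    min_contributors: smallest number of contributors allowed in a published cell.
    n, k: the cell is dominated if its n largest contributions exceed k % of the total.
    """
    values = sorted((abs(v) for v in contributions), reverse=True)
    if len(values) < min_contributors:
        return True  # too few contributors
    total = sum(values)
    if total == 0:
        return False  # nothing to disclose through dominance
    return 100 * sum(values[:n]) > k * total  # dominance of the n largest


# Illustrative parameters only: at least 3 contributors, (1, 85) dominance.
print(is_primary_confidential([50, 30, 20], min_contributors=3, n=1, k=85))
print(is_primary_confidential([95, 3, 2], min_contributors=3, n=1, k=85))
```

With these illustrative parameters the first cell passes both rules, while the second is flagged because its largest contributor accounts for 95 % of the cell total.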

In the tables disseminated on the Eurostat website, a cell is confidential if:

A confidential value is replaced with ":c".

For non-confidential cells, the extrapolated number of holdings and all values of variables in cells are rounded to the closest multiple of 10.

Because of the confidentiality treatment, the sum of the individual cells does not always match the value of the "total" cell.
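A short Python sketch shows why rounded and suppressed cells need not add up to the published total. The helper names are hypothetical, values are assumed to be non-negative, and rounding ties upwards (so that 15 becomes 20) is an assumption, since the text only specifies "the closest multiple of 10".

```python
def round_to_10(value):
    """Round a non-negative value to the closest multiple of 10 (ties go up: assumption)."""
    return int((value + 5) // 10) * 10


def publish(value, confidential):
    """Published content of a cell: ':c' if confidential, otherwise the rounded value."""
    return ":c" if confidential else round_to_10(value)


# Three cells of one table row: (true value, confidential flag).
cells = [(14, False), (23, False), (4, True)]
published = [publish(value, flag) for value, flag in cells]
published_total = round_to_10(sum(value for value, _ in cells))
visible_sum = sum(p for p in published if p != ":c")

print(published)        # cell values as they would appear in the table
print(published_total)  # rounded value of the "total" cell
print(visible_sum)      # sum of the cells a user can actually see
```

In this example the true total is 41, which is published as 40, whereas the visible cells (10 and 20) add up to only 30 because one cell is suppressed and the others are rounded independently.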


<footer>
    <h2 class="visually-hidden" id="footnote-label">Footnotes</h2>
    <ol>
      <li id="risk"> European Business Statistics Manual. <a href="https://ec.europa.eu/eurostat/statistics-explained/index.php?title=European_business_statistics_manual_-_Statistical_Disclosure_Control#SDC_rules_and_methods_for_tabular_data_" target="_blank"> See https://ec.europa.eu/eurostat/statistics-explained/index.php?title=European_business_statistics_manual_-_Statistical_Disclosure_Control#SDC_rules_and_methods_for_tabular_data_</a>. <a href="#risk-ref" aria-label="Back to content">↩</a></li>
      <li id="conf"> Handbook on Statistical Disclosure Control, version 1.2, Jan 2010. <a href="https://ec.europa.eu/eurostat/cros/system/files/SDC_Handbook.pdf" target="_blank"> See https://ec.europa.eu/eurostat/cros/system/files/SDC_Handbook.pdf</a>. <a href="#conf-ref" aria-label="Back to content">↩</a></li>
    </ol>
  </footer>



Table 21 – Problems, possible improvements and proposals for application of primary confidentiality