
Each country delivers a single dataset with core data and data for the modules.

For the reference year 2023, core and module data are requested for holdings in the main frame and are not requested for holdings in the frame extension. However, countries may send core and module data for the frame extension on a voluntary basis if they collect such data for national purposes.

The dataset includes the field HLD_FEF (Holding in frame extension flag), which flags holdings that belong to the frame extension. This allows Eurostat to distinguish between holdings in the main frame and holdings in the frame extension, so that data can be published for the relevant population coverage and trends (over 2020-2026) can be analysed meaningfully on the same population coverage.

The modules are to be collected for all holdings for which core data are collected or for a sub-sample of holdings of the core.

To ensure that Eurostat clearly understands and properly uses the correct fields for weighting the data and calculating variance estimates for the variables of the core and the modules, the dataset contains a separate set of sampling design and extrapolation factor fields for the core and for each module. See the table below.

For core and each module:

  • Three extrapolation factor fields are foreseen, but in many cases (census or one-stage stratified random sampling) only one extrapolation factor field should be filled in.
  • Many sampling design data fields are foreseen, but in many cases either no data field should be filled in (census) or only the stratum identification number should be filled in (e.g. one-stage stratified random sampling).

Eurostat has foreseen all these fields to make variance estimation possible for complex sampling designs. In FSS 2013 and FSS 2016, Hungary and North Macedonia (for rural areas) used stratified one-stage cluster sampling, while North Macedonia (for urban areas) used stratified two-stage sampling. The present fields can record information for sampling designs of up to three stages.

3.8.1 Extrapolation factors and sampling design fields


| Status | Code | Label |
|---|---|---|
| M | HLD_FEF | Holding in frame extension flag |
| M | EXTPOL_FACT1_CORE | Extrapolation factor 1 for the core |
| V | EXTPOL_FACT2_CORE | Extrapolation factor 2 for the core |
| V | EXTPOL_FACT3_CORE | Extrapolation factor 3 for the core |
| V | STRA_ID_CORE | Stratum identification number (core) |
| M | STRA_IDF_CORE | Stratum identification number flag (core) |
| V | PSU_CORE | Primary sampling unit (core) |
| M | PSUF_CORE | Primary sampling unit flag (core) |
| V | SSU_CORE | Secondary sampling unit (core) |
| M | SSUF_CORE | Secondary sampling unit flag (core) |
| V | OSU_S1_CORE | Order of selection of the unit in the first stage (core) |
| M | OSU_SF1_CORE | Order of selection of the unit in the first stage flag (core) |
| C | EXTPOL_FACT1_LAFO | Extrapolation factor 1 for labour force and other gainful activities |
| V | EXTPOL_FACT2_LAFO | Extrapolation factor 2 for labour force and other gainful activities |
| V | EXTPOL_FACT3_LAFO | Extrapolation factor 3 for labour force and other gainful activities |
| V | STRA_ID_LAFO | Stratum identification number (labour force and other gainful activities) |
| M | STRA_IDF_LAFO | Stratum identification number flag (labour force and other gainful activities) |
| V | PSU_LAFO | Primary sampling unit (labour force and other gainful activities) |
| M | PSUF_LAFO | Primary sampling unit flag (labour force and other gainful activities) |
| V | SSU_LAFO | Secondary sampling unit (labour force and other gainful activities) |
| M | SSUF_LAFO | Secondary sampling unit flag (labour force and other gainful activities) |
| V | OSU_S1_LAFO | Order of selection of the unit in the first stage (labour force and other gainful activities) |
| M | OSU_SF1_LAFO | Order of selection of the unit in the first stage flag (labour force and other gainful activities) |
| C | EXTPOL_FACT1_RDEV | Extrapolation factor 1 for rural development |
| V | EXTPOL_FACT2_RDEV | Extrapolation factor 2 for rural development |
| V | EXTPOL_FACT3_RDEV | Extrapolation factor 3 for rural development |
| V | STRA_ID_RDEV | Stratum identification number (rural development) |
| M | STRA_IDF_RDEV | Stratum identification number flag (rural development) |
| V | PSU_RDEV | Primary sampling unit (rural development) |
| M | PSUF_RDEV | Primary sampling unit flag (rural development) |
| V | SSU_RDEV | Secondary sampling unit (rural development) |
| M | SSUF_RDEV | Secondary sampling unit flag (rural development) |
| V | OSU_S1_RDEV | Order of selection of the unit in the first stage (rural development) |
| M | OSU_SF1_RDEV | Order of selection of the unit in the first stage flag (rural development) |
| C | EXTPOL_FACT1_MIRR | Extrapolation factor 1 for irrigation |
| V | EXTPOL_FACT2_MIRR | Extrapolation factor 2 for irrigation |
| V | EXTPOL_FACT3_MIRR | Extrapolation factor 3 for irrigation |
| V | STRA_ID_MIRR | Stratum identification number (irrigation) |
| M | STRA_IDF_MIRR | Stratum identification number flag (irrigation) |
| V | PSU_MIRR | Primary sampling unit (irrigation) |
| M | PSUF_MIRR | Primary sampling unit flag (irrigation) |
| V | SSU_MIRR | Secondary sampling unit (irrigation) |
| M | SSUF_MIRR | Secondary sampling unit flag (irrigation) |
| V | OSU_S1_MIRR | Order of selection of the unit in the first stage (irrigation) |
| M | OSU_SF1_MIRR | Order of selection of the unit in the first stage flag (irrigation) |
| C | EXTPOL_FACT1_MSMP | Extrapolation factor 1 for soil management practices |
| V | EXTPOL_FACT2_MSMP | Extrapolation factor 2 for soil management practices |
| V | EXTPOL_FACT3_MSMP | Extrapolation factor 3 for soil management practices |
| V | STRA_ID_MSMP | Stratum identification number (soil management practices) |
| M | STRA_IDF_MSMP | Stratum identification number flag (soil management practices) |
| V | PSU_MSMP | Primary sampling unit (soil management practices) |
| M | PSUF_MSMP | Primary sampling unit flag (soil management practices) |
| V | SSU_MSMP | Secondary sampling unit (soil management practices) |
| M | SSUF_MSMP | Secondary sampling unit flag (soil management practices) |
| V | OSU_S1_MSMP | Order of selection of the unit in the first stage (soil management practices) |
| M | OSU_SF1_MSMP | Order of selection of the unit in the first stage flag (soil management practices) |
| C | EXTPOL_FACT1_MMEQ | Extrapolation factor 1 for machinery and equipment |
| V | EXTPOL_FACT2_MMEQ | Extrapolation factor 2 for machinery and equipment |
| V | EXTPOL_FACT3_MMEQ | Extrapolation factor 3 for machinery and equipment |
| V | STRA_ID_MMEQ | Stratum identification number (machinery and equipment) |
| M | STRA_IDF_MMEQ | Stratum identification number flag (machinery and equipment) |
| V | PSU_MMEQ | Primary sampling unit (machinery and equipment) |
| M | PSUF_MMEQ | Primary sampling unit flag (machinery and equipment) |
| V | SSU_MMEQ | Secondary sampling unit (machinery and equipment) |
| M | SSUF_MMEQ | Secondary sampling unit flag (machinery and equipment) |
| V | OSU_S1_MMEQ | Order of selection of the unit in the first stage (machinery and equipment) |
| M | OSU_SF1_MMEQ | Order of selection of the unit in the first stage flag (machinery and equipment) |
| C | EXTPOL_FACT1_MORC | Extrapolation factor 1 for orchard |
| V | EXTPOL_FACT2_MORC | Extrapolation factor 2 for orchard |
| V | EXTPOL_FACT3_MORC | Extrapolation factor 3 for orchard |
| V | STRA_ID_MORC | Stratum identification number (orchard) |
| M | STRA_IDF_MORC | Stratum identification number flag (orchard) |
| V | PSU_MORC | Primary sampling unit (orchard) |
| M | PSUF_MORC | Primary sampling unit flag (orchard) |
| V | SSU_MORC | Secondary sampling unit (orchard) |
| M | SSUF_MORC | Secondary sampling unit flag (orchard) |
| V | OSU_S1_MORC | Order of selection of the unit in the first stage (orchard) |
| M | OSU_SF1_MORC | Order of selection of the unit in the first stage flag (orchard) |

In the table above:

  • M = Mandatory: the field should always be filled in (it is always applicable).
  • C = Conditional: the field is mandatory only for holdings with module data and should be set to null for holdings without module data.
  • V = Voluntary: the field can be either filled in (if applicable) or set to null (if not applicable).

3.8.2 Sampling strategies

When the data for both core and a module are collected on samples, two strategies are identified to draw the samples: positive coordination and two-phase sampling.

3.8.2.1 Positive coordination

The core and module samples are drawn with positive coordination from the same frame and at the same time, using the Permanent Random Number technique to obtain maximum overlap among the samples. All holdings in the module sample are included in the core sample.
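The Permanent Random Number idea can be illustrated with a minimal Python sketch. It assumes a single stratum, a hypothetical list of holding identifiers, and a common starting point of 0 for both samples; the function names and the seed are illustrative and are not part of the IFS specification.

```python
import random

def assign_prn(holding_ids, seed=2023):
    """Give every holding in the frame one permanent random number in [0, 1)."""
    rng = random.Random(seed)
    return {holding: rng.random() for holding in holding_ids}

def prn_sample(prn, size):
    """Select the `size` holdings with the smallest permanent random numbers."""
    return set(sorted(prn, key=prn.get)[:size])

# Hypothetical frame of 100 holdings in one stratum.
frame = [f"H{i:03d}" for i in range(1, 101)]
prn = assign_prn(frame)

core_sample = prn_sample(prn, 40)    # larger core sample
module_sample = prn_sample(prn, 10)  # smaller module sample, same start point

# Because both samples start from the same point on the PRN line,
# every holding in the module sample is also in the core sample.
print(module_sample <= core_sample)
```

In a stratified design the same selection would be repeated independently within each stratum.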

If the core and module samples are drawn using one-stage stratified sampling, the usual procedures for one-stage stratified sampling are used to calculate the extrapolation factors and the variance estimates for the core and for each module. The situation is analogous for other sampling designs: positive coordination among samples does not change the procedures.

3.8.2.2 Two-phase sampling

In the case of two-phase sampling, the core sample is selected from the frame in the first phase and the module sub-sample is selected from the core sample in the second phase.

There are at least two fundamental differences between multi-stage sampling and multi-phase sampling:

  • In multi-stage sampling, the units of selection are generally different at different stages, forming some kind of hierarchy. Most frequently, this hierarchy is determined by different levels of spatial units (e.g. enumeration areas, holdings). In multi-phase sampling, on the other hand, selection units are the same at each phase.
  • In multi-stage sampling, information is collected only from the units selected at the last stage of sampling (from holdings). In multi-phase sampling, on the other hand, information is collected after each phase, and information collected in the previous phase(s) of sampling is used in the later procedures. There are two main ways in which information from the previous phase(s) can be used:
    • Information is used for the later phase of the sampling procedure (e.g. for stratification). Some core variables collected with the core sample (e.g. utilised agricultural area, livestock) can be used for the stratification necessary for the selection of the module sub-sample(s).
    • Information is used as auxiliary information (e.g. for ratio estimators) in the estimation procedure. The estimates of the totals of main variables in the larger core sample can be used in the procedure for calibrating the modules' smaller (sub) samples.

3.8.2.2.1 Assumption to simplify point and variance estimation

Two-phase sampling can be simplified to one-stage sampling, i.e. the selection of a module sub-sample from a core sample can be considered a direct selection of the module sub-sample from the sampling frame. This simplification is possible if the independence condition is fulfilled. Independence basically means that the information collected for the core sample is not used in selecting or calibrating the module sub-sample. The independence condition is met if the module sub-sample is selected at the same time as the core sample, the data for the core sample and the module sub-sample are collected in parallel, and the collected core data are not used for calibrating the module data. In such a case, the theory is straightforward, since we are dealing with two independent sampling mechanisms and the inclusion probabilities for the module are products of two unconditional probabilities.

Taking into consideration that in practice similar assumptions are already made:

  • Unit non-response is a case of two-phase sampling, where the sample is selected in the first phase and the respondent 'sample' is self-selected in the second phase. In practice, the Missing Completely at Random or the Missing at Random response mechanism assumption is accepted. This means that independent sampling in the second phase (more explicitly, for generating the response mechanism) is assumed, and the re-weighting formulae in fact assume direct sampling of respondents;
  • In some cases, the estimates of a data collection are calibrated to the totals estimated in another data collection based on a larger sample size. The additional variability caused by the sampling errors of those calibration totals is assumed not significant and neglected in practice,

we propose to assume the above-mentioned independence condition for the module data. Namely:

  • The estimation of module data is based on the product of the extrapolation factor(s) corresponding to the selection of the core sample from the frame and the extrapolation factor(s) corresponding to the selection of the module sub-sample from the core sample. This extrapolation factor is equivalent to the one calculated as if the module sub-sample is selected from the sampling frame, as long as the same stratification is used in the core sample and module sub-sample.
  • The variance estimation of module data assumes that the module sub-sample is selected from the sampling frame, of course taking into account all the relevant sampling design information. Otherwise, a more complicated approach for the variance estimation (usually model-based) must be employed.

For example, suppose that the core sample is selected from the frame using one-stage stratified random sampling and that the module sub-sample is selected from the core sample, using random sub-selection of units in each stratum.

  • The estimation of module data is based on the product of two extrapolation factors (corresponding to the selection of the core sample from the frame and to the selection of the module sub-sample from the core sample, respectively). The field EXTPOL_FACT1_* for the module should record this product.
  • The variance estimation of module data assumes that the module sample was selected using one-stage stratified sampling from the frame (and not two-phase stratified sampling from the frame). Of course, the variance estimation should take into account the strata (which are common for the selection of the core sample and the module sub-sample).
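The equivalence used in this example can be checked numerically. The sketch below assumes equal-probability selection within a stratum in both phases and no non-response adjustment or calibration; the stratum sizes and function names are hypothetical.

```python
def two_phase_factor(frame_size, core_size, module_size):
    """Module extrapolation factor as the product of the two phase factors."""
    core_factor = frame_size / core_size        # frame -> core sample
    subsample_factor = core_size / module_size  # core sample -> module sub-sample
    return core_factor * subsample_factor

def direct_factor(frame_size, module_size):
    """Factor as if the module sub-sample were drawn directly from the frame."""
    return frame_size / module_size

# Hypothetical stratum: 1000 holdings in the frame, 200 in the core sample,
# 50 in the module sub-sample.
stratum = {"frame_size": 1000, "core_size": 200, "module_size": 50}

product = two_phase_factor(**stratum)  # (1000 / 200) * (200 / 50)
direct = direct_factor(stratum["frame_size"], stratum["module_size"])

print(product, direct)
```

The value of `product` is what EXTPOL_FACT1_* of the module would record for every holding of this stratum in the module sub-sample.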

Independently of the sampling strategy, calibration procedures are welcome. Where calibration is used, it is recommended to take its effect into account when estimating variance. For Eurostat, however, this recommendation is not feasible: to estimate the variance correctly in the presence of calibration, Eurostat would need, for each target variable, the residuals of the regression of the target variable on the calibration variables, and countries use a multitude of different sets of calibration variables and calibration methods. According to the conclusions for FSS in the Working Group meeting in October 2017 and for other domains (e.g. the Labour Market Statistics Working Group in December 2015), calibration variability can be ignored when Eurostat estimates variance. The final calibrated weights are nevertheless used by Eurostat when estimating variance.

3.8.2.3 Extrapolation factor fields

This section presents the principles for filling in the extrapolation factor fields.  

These principles allow Eurostat to calculate the extrapolated aggregate for any variable in the core or in a module by multiplying the value of the core or module variable with the product of the corresponding extrapolation factors EXTPOL_FACT1_* x EXTPOL_FACT2_* x EXTPOL_FACT3_* (once null values are replaced with 1). 
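As a minimal sketch of this calculation, the Python snippet below multiplies each holding's value by the product of its three extrapolation factors, replacing nulls (represented here as `None`) with 1. The variable name `UAA_HA` and the record layout as dictionaries are assumptions made for illustration only.

```python
from math import prod

def combined_weight(record, suffix):
    """Product of EXTPOL_FACT1..3 for the given suffix, with null treated as 1."""
    factors = (record.get(f"EXTPOL_FACT{i}_{suffix}") for i in (1, 2, 3))
    return prod(1.0 if f is None else f for f in factors)

def extrapolated_total(records, variable, suffix):
    """Sum of variable value times combined weight over all holdings."""
    total = 0.0
    for record in records:
        value = record.get(variable)
        if value is None:
            continue  # holding has no data for this variable
        total += value * combined_weight(record, suffix)
    return total

# Two hypothetical holdings; UAA_HA is an illustrative variable name.
holdings = [
    {"UAA_HA": 10.0, "EXTPOL_FACT1_CORE": 2.0,
     "EXTPOL_FACT2_CORE": 3.0, "EXTPOL_FACT3_CORE": None},
    {"UAA_HA": 5.0, "EXTPOL_FACT1_CORE": 1.0,
     "EXTPOL_FACT2_CORE": None, "EXTPOL_FACT3_CORE": None},
]

# 10 * (2 * 3 * 1) + 5 * (1 * 1 * 1) = 65
print(extrapolated_total(holdings, "UAA_HA", "CORE"))
```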

3.8.2.3.1 Extrapolation factors for the core (EXTPOL_FACT*_CORE)

For the reference year 2023, the core data collection is required for the main frame and may be carried out as a census or a sample survey. Countries may send core data for the frame extension on a voluntary basis if they collect such data for national purposes.

The first extrapolation factor field for the core (EXTPOL_FACT1_CORE) should always be filled in (it is always mandatory), irrespective of the national coverage of the core data (main frame or main frame plus frame extension). In case of a census, it is in principle completed with 1; values different from 1 are accepted, because non-response adjustment and calibration are done via the extrapolation factors. In case of a sample-based data collection, it is completed with values that are in principle higher than or equal to 1 (depending on whether or not the sampled holdings belong to take-all strata); where calibration is applied, some values can be lower than 1.

The subsequent extrapolation factor fields for the core (EXTPOL_FACT2_CORE and EXTPOL_FACT3_CORE) should be completed only when applicable depending on the sampling design. Where not applicable, they should be set to null. 

The expected completion of the extrapolation factors depending on the sampling design is as follows:

  • Only the first extrapolation factor (EXTPOL_FACT1_CORE) is completed in case of census or one-stage sampling;

  • Only the first two extrapolation factors (EXTPOL_FACT1_CORE and EXTPOL_FACT2_CORE) are completed in case of two-stage sampling or one-stage cluster sampling;

  • The three extrapolation factors (EXTPOL_FACT1_CORE, EXTPOL_FACT2_CORE and EXTPOL_FACT3_CORE) are completed in case of three-stage sampling or two-stage cluster sampling.

See the examples in the section "Examples for IFS 2023", below.
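The expected completion pattern listed above can be expressed as a small consistency check. This is only a sketch: the dataset does not carry a sampling design label, so the design keys below are hypothetical names introduced for the example, and nulls are represented as `None`.

```python
# Number of core extrapolation factor fields expected to be filled in.
# The design labels are hypothetical keys chosen for this sketch.
EXPECTED_FILLED = {
    "census": 1,
    "one-stage": 1,
    "one-stage cluster": 2,
    "two-stage": 2,
    "two-stage cluster": 3,
    "three-stage": 3,
}

def check_core_factors(record, design):
    """Return messages for core factor fields that break the expected pattern."""
    expected = EXPECTED_FILLED[design]
    problems = []
    for i in (1, 2, 3):
        name = f"EXTPOL_FACT{i}_CORE"
        value = record.get(name)
        if i <= expected and value is None:
            problems.append(f"{name} should be filled in for {design}")
        elif i > expected and value is not None:
            problems.append(f"{name} should be null for {design}")
    return problems

record = {"EXTPOL_FACT1_CORE": 12.5,
          "EXTPOL_FACT2_CORE": None,
          "EXTPOL_FACT3_CORE": None}

print(check_core_factors(record, "one-stage"))  # no problems expected
print(check_core_factors(record, "two-stage"))  # second factor is missing
```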


3.8.2.3.2 Extrapolation factors for the modules

According to Article 7(2) of Regulation (EU) 2018/1091, module data collection is required for the main frame. Countries may send module data for the frame extension on a voluntary basis if they collect such data for national purposes.

A module should be collected from all or from a sub-sample of holdings for which core data are collected. The following applies irrespective of the national coverage of the module data (main frame or main frame plus frame extension):

1. If a module is collected for all holdings for which core data are collected, then for all holdings:

  • the first extrapolation factor field(s) of the module should copy the information from the completed extrapolation factor field(s) of the core,
  • any possible remaining subsequent extrapolation factor field of the module should be null.

See examples 1, 4, 5, 7 and 10 in section "Examples for IFS 2023", below.

2. If a module is collected for a sub-sample of the core, then the following is valid only for the holdings in the sub-sample:

  • the first extrapolation factor field(s) of the module should account for the selection of the holdings from the frame into the module sample.

Suppose that the core is collected on census and the module sample is selected from the core population using one-stage stratified random sampling. Then the EXTPOL_FACT1_* of the module should record the extrapolation factors corresponding to the selection of the module sample from the frame population using one-stage stratified random sampling.

Suppose that the core sample is selected from the frame using one-stage stratified random sampling and that the module sub-sample is selected from the core sample, using random sub-selection of holdings in each stratum. Then the EXTPOL_FACT1_* of the module should record the product of two extrapolation factors of the holdings (corresponding to the selection of the core sample from the frame using one-stage stratified random sampling and to the selection of the module sub-sample from the core sample, respectively). This is equivalent to the extrapolation factor calculated as if the module sub-sample is directly selected from the frame, as long as the same stratification is used in the core sample and module sub-sample.

  • any possible remaining subsequent extrapolation factor field of the module should be null.

See examples 2, 3, 6, 8 and 9 in section "Examples for IFS 2023", below.

Therefore the following rules are valid:

  • all holdings with module data:
    • should have at least the first extrapolation factor (EXTPOL_FACT1_*) filled in;
    • should in principle have each filled-in extrapolation factor (EXTPOL_FACT1_*, EXTPOL_FACT2_*, EXTPOL_FACT3_*) higher than or equal to the corresponding extrapolation factor of the core (EXTPOL_FACT1_CORE, EXTPOL_FACT2_CORE, EXTPOL_FACT3_CORE). However, following calibration, an extrapolation factor of the module can be lower than the corresponding extrapolation factor of the core.
  • all holdings without module data should have all extrapolation factor fields for the module null.
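The two hard rules above (first factor filled in for holdings with module data, all factors null otherwise) can be sketched as a record-level check. The `has_module_data` argument is a hypothetical indicator supplied by the caller; the "in principle" comparison with the core factors is deliberately left out because calibration can legitimately break it.

```python
def check_module_factors(record, module, has_module_data):
    """Return messages for module extrapolation factors that break the rules."""
    names = [f"EXTPOL_FACT{i}_{module}" for i in (1, 2, 3)]
    if has_module_data:
        # At least the first extrapolation factor must be filled in.
        if record.get(names[0]) is None:
            return [f"{names[0]} must be filled in for holdings with module data"]
        return []
    # Holdings without module data: every factor field must be null.
    return [f"{name} must be null for holdings without module data"
            for name in names if record.get(name) is not None]

# Hypothetical holdings for the irrigation module (suffix MIRR).
in_module = {"EXTPOL_FACT1_MIRR": 20.0,
             "EXTPOL_FACT2_MIRR": None,
             "EXTPOL_FACT3_MIRR": None}
not_in_module = {"EXTPOL_FACT1_MIRR": 20.0,
                 "EXTPOL_FACT2_MIRR": None,
                 "EXTPOL_FACT3_MIRR": None}

print(check_module_factors(in_module, "MIRR", has_module_data=True))
print(check_module_factors(not_in_module, "MIRR", has_module_data=False))
```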


3.8.2.4 Sampling design fields

This section presents how to fill in the sampling design fields.

The sampling design fields are of two types: data fields and flag fields. The data fields are: STRA_ID_*, PSU_*, SSU_* and OSU_S1_*. The flag fields are: STRA_IDF_*, PSUF_*, SSUF_* and OSU_SF1_*. In IFS 2023, "*" stands for "CORE", "LAFO", "RDEV", "MIRR", "MSMP", "MMEQ" and "MORC". For example, STRA_ID_CORE (Stratum identification number (core)) is a data field, while STRA_IDF_CORE (Stratum identification number flag (core)) is a flag field.

Depending on the national sampling design for a data collection, data are required in no field, one, some or all data fields. Therefore, the data fields can be either filled in (where applicable) or set to null (where not applicable).

All flag fields should be filled in all cases. They should indicate the applicability of the data fields and where relevant the specific meaning of the data fields.

See the examples in section "Examples for IFS 2023", below.

3.8.2.4.1 Sampling design fields for core and modules

For the reference year 2023, core and module data collection is required for the main frame and may be carried out as a census or a sample survey. Countries may send core and module data for the frame extension on a voluntary basis if they collect such data for national purposes.

A module should be collected from all or from a sub-sample of holdings for which core data are collected. The following applies irrespective of the national coverage of the module data (main frame or main frame plus frame extension):

1. If a module is collected for all holdings for which core data are collected, then for all holdings:

  • the first data field(s) and flag field(s) of the module should copy the information (if any) from the data field(s) and flag field(s) of the core,
  • any possible remaining subsequent data field(s) of the module should be null and any remaining subsequent flag field(s) of the module should be _Z (not applicable).

See examples 1, 4, 5, 7 and 10 in section "Examples for IFS 2023", below.

2. If a module is collected for a sub-sample of the core, then only for the holdings in the sub-sample:

  • the first data field(s) and flag field(s) of the module should include relevant information on the selection of the holdings from the frame into the module.

Suppose that the core is collected on census and the module sample is selected from the core population using one-stage stratified random sampling. Then the STRA_ID_* of the module should record the strata used for the selection of the module sample from the frame population.

Suppose that the core sample is selected from the frame using one-stage stratified random sampling and that the module sub-sample is selected from the core sample, using random sub-selection of units in each stratum. Then the STRA_ID_* of the module should record the same information as the STRA_ID_CORE.

  • any possible remaining subsequent data field(s) of the module should be null and any remaining subsequent flag field(s) of the module should be _Z (not applicable).

See examples 2, 3, 6, 8 and 9 in section "Examples for IFS 2023", below.

Therefore the following rules are generally valid:

  • all holdings with module data:
    • should in principle copy the completed data field(s) and the flag field(s) of the core for the module (and possibly have additional data fields filled in). This holds "in principle" rather than always, because the stratification of the core and module samples can be different.
  • all holdings without module data should have all data fields null and all flag fields _Z (not applicable) for the module.
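The general rules above can be sketched as a helper that derives the module's sampling design fields from the core fields. It is a simplification that assumes the module uses the same design information as the core (the same stratification in particular); the `has_module_data` argument and the record layout are hypothetical.

```python
DATA_FIELDS = ("STRA_ID", "PSU", "SSU", "OSU_S1")
FLAG_FIELDS = ("STRA_IDF", "PSUF", "SSUF", "OSU_SF1")

def module_design_fields(core_record, module, has_module_data):
    """Copy the core design fields to a module, or blank them out.

    Holdings with module data copy the core values; holdings without
    module data get null data fields and "_Z" (not applicable) flags.
    """
    fields = {}
    for stem in DATA_FIELDS:
        value = core_record.get(f"{stem}_CORE") if has_module_data else None
        fields[f"{stem}_{module}"] = value
    for stem in FLAG_FIELDS:
        value = core_record.get(f"{stem}_CORE") if has_module_data else "_Z"
        fields[f"{stem}_{module}"] = value
    return fields

# Hypothetical core record from a one-stage stratified random sample.
core = {"STRA_ID_CORE": "S01", "STRA_IDF_CORE": "1",
        "PSU_CORE": None, "PSUF_CORE": "_Z",
        "SSU_CORE": None, "SSUF_CORE": "_Z",
        "OSU_S1_CORE": None, "OSU_SF1_CORE": "_Z"}

filled = module_design_fields(core, "LAFO", has_module_data=True)
blank = module_design_fields(core, "LAFO", has_module_data=False)
print(filled["STRA_ID_LAFO"], filled["STRA_IDF_LAFO"])
print(blank["STRA_ID_LAFO"], blank["STRA_IDF_LAFO"])
```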

3.8.3 Description of sampling design fields

3.8.3.1 Stratum identification number (STRA_ID_*)

Identification code for the stratum of the holding

Stratifying a population means dividing it into non-overlapping subpopulations, called strata. Independent samples are then selected in each stratum. The population is usually stratified before the units are selected in the first stage. The units selected in the first stage are either:

  • holdings when one-stage (not cluster) sampling is used OR
  • primary sampling units (PSUs), i.e. the hierarchical clusters superior to the holdings and to which the holdings belong, when one-stage cluster sampling or two- or more-stage sampling is used. Examples of PSUs: districts, municipalities, etc.

The code indicates the primary stratum each holding belongs to. The code should uniquely identify all primary strata in the dataset. The code refers to the original strata at the time of the selection, except for:

  • the large holdings (for which strata may be updated to take-all strata);
  • the case of self-representing PSUs which become themselves strata (see below "Self-representing PSU");
  • the case when strata are collapsed due to a single unit in the stratum (see below "Collapsed stratum due to a single unit in the stratum").

The code does not refer to the strata used for post-stratification or calibration.

The code refers to explicit strata. Systematic sampling with implicit stratification will be accounted for through the use of field OSU_S1_* (see the explanation of this field, below).

3.8.3.2 Stratum identification number flag (STRA_IDF_*)

Flag indicating the applicability and the origin of the stratum

  • 1 - Original stratum or updated stratum for large units
  • 2 - Self-representing PSU
  • 3 - Collapsed stratum due to a single unit selected in the first stage in the stratum
  • _Z - Not applicable (no strata)

3.8.3.2.1 Self-representing PSU

Self-representing PSUs are PSUs selected with certainty (with a probability of 1). For example, a self-representing PSU is a municipality selected in the first sampling stage from a stratum with one municipality. For the purpose of estimating variance, self-representing PSUs should be treated as primary strata. Therefore, for a self-representing PSU, a separate, unique value is assigned to STRA_ID_* for its identification. STRA_IDF_* should receive the flag 2. See example 11.

3.8.3.2.2 Collapsed stratum due to a single unit in the stratum

If a stratum consists of only one unit selected in the first stage (among a larger number of units in the stratum population), or if a stratum contains only one respondent unit selected in the first stage (among a larger number of selected units), primary strata have to be collapsed such that every stratum consists of at least two units. To do so, strata should be grouped with the strata that are most similar in terms of the main variables. The decision on which strata are collapsed should be based on information that is available in the sampling frame. Preferably, strata similar in terms of holding size or farm type are collapsed. The stratum code of the collapsed stratum is equal to the stratum code of the stratum that already contained more than one unit before collapsing. The holdings in the collapsed stratum receive STRA_IDF_* equal to 3. See example 12.
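A first step in practice is to find the strata that need collapsing. The sketch below only detects them; it assumes one-stage (non-cluster) sampling, so that the units selected in the first stage are the holdings themselves, and it leaves the choice of the most similar stratum to the analyst. The record layout is hypothetical.

```python
from collections import Counter

def single_unit_strata(records, suffix="CORE"):
    """Return the stratum codes that contain exactly one responding holding."""
    field = f"STRA_ID_{suffix}"
    counts = Counter(r[field] for r in records if r.get(field) is not None)
    return sorted(code for code, n in counts.items() if n == 1)

# Hypothetical respondent records: stratum S02 has a single holding.
records = [
    {"STRA_ID_CORE": "S01"},
    {"STRA_ID_CORE": "S01"},
    {"STRA_ID_CORE": "S02"},
    {"STRA_ID_CORE": "S03"},
    {"STRA_ID_CORE": "S03"},
]

print(single_unit_strata(records))
```

For cluster or multi-stage designs the count would have to be made over distinct PSU codes within each stratum rather than over holdings.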

3.8.3.3 Primary sampling unit (PSU_*)

Code of the primary sampling unit

A population is divided into clusters (i.e. disjoint sub-populations) in case direct-element sampling is either impossible (due to lack of sampling frame) or its implementation too expensive (the population is widely distributed geographically). A sample of clusters (PSUs) is then selected at the first stage of sampling. Primary sampling units (PSUs) refer to hierarchical clusters superior to agricultural holdings selected in the first stage of sampling. The code should uniquely identify all PSUs in the dataset, irrespective of the strata which they belong to. The field is applicable in case of one-stage cluster sampling or two- or more-stage (cluster) sampling. For example:

  • in a one-stage cluster sampling, municipalities are PSUs selected in the first stage and all holdings from the selected municipalities are included;
  • in a two-stage sampling, municipalities are PSUs selected in the first stage and then some holdings from each PSU are selected in the second stage;
  • in a three-stage sampling, municipalities are PSUs selected in the first stage, enumeration areas are SSUs selected in the second stage and then some holdings from each SSU are selected in the third stage.

If PSUs are selected several times (i.e. they are sampled with replacement), at each selection the selected PSU should receive a separate unique code. This is due to the fact that if PSUs are drawn with replacement, the variance estimation procedure treats repeated instances of the same PSU as separate PSUs.
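One possible way to give each draw its own code is to append a draw counter to the PSU identifier, as in the sketch below. The code format (identifier, underscore, counter) is an assumption made for illustration; any scheme that keeps the codes unique across the dataset serves the same purpose.

```python
def unique_psu_codes(draws):
    """Assign a separate code to every draw of a PSU sampled with replacement.

    `draws` lists the PSU identifiers in order of selection; a PSU that is
    drawn twice appears twice and receives two different codes.
    """
    times_drawn = {}
    codes = []
    for psu in draws:
        times_drawn[psu] = times_drawn.get(psu, 0) + 1
        codes.append(f"{psu}_{times_drawn[psu]}")
    return codes

# Hypothetical draws: municipality M01 is selected twice.
draws = ["M01", "M02", "M01"]
print(unique_psu_codes(draws))  # ['M01_1', 'M02_1', 'M01_2']
```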

The case of self-representing PSUs (see the definition of the self-representing PSUs, above)

For the purpose of estimating variance, self-representing PSUs should be treated as primary strata and their secondary sampling units (SSUs) should be treated as PSUs. The field PSU_* is filled with the SSUs, as follows:

  • If SSUs are selected in the second stage, they should be treated as if they are PSUs and should receive a unique PSU code (PSU_*).
  • If holdings are selected in the second stage, they should receive a unique PSU code (PSU_*). Even if holdings are not SSUs in their correct meaning (an SSU is hierarchically superior to agricultural holdings), this is done so as to keep a consistent structure of the completed dataset.

3.8.3.4 Primary sampling unit flag (PSUF_*)

Flag indicating the applicability of primary sampling unit

  • 1 - Filled (one-stage cluster sampling or two- or more-stage sampling)
  • _Z - Not applicable (other cases)

3.8.3.5 Secondary sampling unit (SSU_*)

Code for the secondary sampling unit

Secondary sampling units (SSUs) are clusters which form the PSUs and which are hierarchically superior to agricultural holdings. SSUs are disjoint sub-populations independently drawn from each PSU.

The field should uniquely identify all SSUs in the dataset, irrespective of the strata and PSUs which they belong to.

The completion of this field is applicable only in case of two-stage cluster sampling or three- or more-stage (cluster) sampling. For example:

  • in a two-stage cluster sampling, municipalities are PSUs selected in the first stage, enumeration areas are SSUs selected in the second stage and all holdings from the selected enumeration areas are included.
  • in a three-stage sampling, municipalities are PSUs selected in the first stage, enumeration areas are SSUs selected in the second stage and then some holdings from each SSU are selected in the third stage.

If SSUs are selected several times (i.e. they are sampled with replacement), at each selection the selected SSU should receive a separate code. This is due to the fact that if SSUs are drawn with replacement, the variance estimation procedure treats repeated instances of the same SSU as separate SSUs.

The case of self-representing PSUs (see the definition of the self-representing PSUs, above)

For the purpose of estimating variance, self-representing PSUs should be treated as primary strata, their SSUs should be treated as PSUs and their tertiary sampling units (TSUs) should be treated as SSUs. The field SSU_* is filled with the TSUs, as follows:

  • If TSUs are selected in the third stage, they should be treated as if they are SSUs and should receive a unique SSU code (SSU_*).
  • If holdings are selected in the third stage, they should receive a unique SSU code (SSU_*). Even if holdings are not TSUs in their correct meaning (a TSU is hierarchically superior to agricultural holdings), this is done so as to keep a consistent structure of the completed dataset.

3.8.3.6 Secondary sampling unit flag (SSUF_*)

Flag indicating the applicability of secondary sampling unit

  • 1 - Filled (two-stage cluster or three- or more-stage sampling)
  • _Z - Not applicable (other cases)

3.8.3.7 Order of selection of the unit in the first stage (OSU_S1_*)

Rank of the selection of the units in the first stage

The unit selected in the first stage is the:

  • holding itself when one-stage (not cluster) sampling is used OR
  • primary sampling unit – PSU (i.e. the hierarchical cluster superior to the holding and to which the holding belongs) when one-stage cluster or two- or more-stage sampling is used.

This information is important for variance estimation because systematic selection from a judiciously ordered sampling frame may substantially decrease sampling errors. The order of the sampling units is relevant only when the frame is sorted by a variable that is correlated with the main variables.

This information makes it possible to take into account the effect of implicit stratification on the overall variance. For this purpose, Eurostat computes an additional field of 'computational' strata in the dataset (see Calculation of weights, variance estimation and quality rating system - IFS-Integrated-Farm-Statistics - EC Extranet Wiki (europa.eu)).
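The following Python sketch illustrates how the rank of selection could be recorded during a systematic draw within one stratum. It is an illustration under stated assumptions, not Eurostat's procedure for building the computational strata: the function `systematic_sample`, the frame layout and the sort variable (standard output in EUR) are all hypothetical.

```python
import random

def systematic_sample(frame, sort_key, n, seed=None):
    """Draw n units systematically from a frame sorted by sort_key.

    Returns (unit, rank) pairs; rank runs from 1 to n in order of selection.
    """
    if not 0 < n <= len(frame):
        raise ValueError("n must be between 1 and the frame size")
    ordered = sorted(frame, key=sort_key)           # ordering gives implicit stratification
    step = len(ordered) / n                         # sampling interval
    start = random.Random(seed).random() * step     # random start within the first interval
    last = len(ordered) - 1
    picks = [ordered[min(int(start + i * step), last)] for i in range(n)]
    return [(unit, rank) for rank, unit in enumerate(picks, start=1)]

# Hypothetical frame of one stratum: (holding id, standard output in EUR)
frame = [("H1", 52000), ("H2", 8000), ("H3", 130000), ("H4", 21000),
         ("H5", 76000), ("H6", 15000), ("H7", 98000), ("H8", 33000),
         ("H9", 61000)]
sample = systematic_sample(frame, sort_key=lambda h: h[1], n=3, seed=1)
for (hld_id, so_eur), rank in sample:
    print(hld_id, so_eur, rank)   # the rank is the value that would go to OSU_S1_*
```

Because the frame is sorted before the draw, the rank also reflects the position of each selected unit along the sort variable, which is the information the variance estimation needs.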

3.8.3.8 Order of selection of the unit in the first stage flag (OSU_SF1_*)

Flag indicating the applicability of systematic sampling

  • 1 - Filled (systematic selection)
  • _Z - Not applicable (no systematic selection)


3.8.4 Examples for IFS 2023

For the reference year 2023, the core and module data are required only for the main frame and may be collected by census or by sample survey. Examples 1 to 10 therefore cover various combinations of census and sample for holdings in the main frame.

Core and module data on frame extension can be sent voluntarily to Eurostat.  In this case, the completion of the fields for the holdings in the frame extension is the same as for the holdings in the main frame.

As illustrated in the examples from 1 to 10, the following general rules apply for core and modules, irrespective of the population coverage (main frame or main frame plus frame extension):

  • In the case of census or one-stage (not cluster) sampling:
    • only the first extrapolation factor (EXTPOL_FACT1_*) is completed,
    • PSU_* should not be completed and PSUF_* should be flagged _Z,
    • SSU_* should not be completed and SSUF_* should be flagged _Z.

  • In the case of two-stage sampling or one-stage cluster sampling:
    • only the first two extrapolation factors (EXTPOL_FACT1_* and EXTPOL_FACT2_*) are completed,
    • PSU_* should be completed and PSUF_* should be flagged 1,
    • SSU_* should not be completed and SSUF_* should be flagged _Z.

  • In the case of three-stage sampling or two-stage cluster sampling:
    • All three extrapolation factors (EXTPOL_FACT1_*, EXTPOL_FACT2_* and EXTPOL_FACT3_*) are completed,
    • PSU_* should be completed and PSUF_* should be flagged 1,
    • SSU_* should be completed and SSUF_* should be flagged 1.
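The general rules above can be read as a lookup table. The Python sketch below encodes them and checks one holding record against the expected pattern. The design labels, the dictionary-based record layout and the function `check_record` are assumptions made for illustration; they are not part of the official data structure or validation rules.

```python
# Expected completion pattern per sampling design, transcribed from the rules above.
# The keys are labels chosen for this sketch, not official codes.
RULES = {
    "census":            {"factors": 1, "PSUF": "_Z", "SSUF": "_Z"},
    "one_stage":         {"factors": 1, "PSUF": "_Z", "SSUF": "_Z"},
    "one_stage_cluster": {"factors": 2, "PSUF": "1",  "SSUF": "_Z"},
    "two_stage":         {"factors": 2, "PSUF": "1",  "SSUF": "_Z"},
    "two_stage_cluster": {"factors": 3, "PSUF": "1",  "SSUF": "1"},
    "three_stage":       {"factors": 3, "PSUF": "1",  "SSUF": "1"},
}

def check_record(record, design, suffix="CORE"):
    """Return the names of the fields that break the rules for the given design."""
    rule = RULES[design]
    problems = []
    for k in (1, 2, 3):
        name = f"EXTPOL_FACT{k}_{suffix}"
        filled = record.get(name) is not None
        if filled != (k <= rule["factors"]):      # only the first `factors` fields are filled
            problems.append(name)
    for unit in ("PSU", "SSU"):
        flag = rule[f"{unit}F"]
        if record.get(f"{unit}F_{suffix}") != flag:
            problems.append(f"{unit}F_{suffix}")
        if (record.get(f"{unit}_{suffix}") is not None) != (flag == "1"):
            problems.append(f"{unit}_{suffix}")
    return problems

# A holding from a one-stage cluster sample (made-up values).
cluster_rec = {"EXTPOL_FACT1_CORE": 12.5, "EXTPOL_FACT2_CORE": 1,
               "EXTPOL_FACT3_CORE": None, "PSU_CORE": "M01", "PSUF_CORE": "1",
               "SSU_CORE": None, "SSUF_CORE": "_Z"}
# A census holding where PSU_CORE was filled in by mistake.
census_rec = {"EXTPOL_FACT1_CORE": 1, "PSU_CORE": "M01", "PSUF_CORE": "_Z",
              "SSU_CORE": None, "SSUF_CORE": "_Z"}

print(check_record(cluster_rec, "one_stage_cluster"))
print(check_record(census_rec, "census"))
```

The first record follows the rules and yields an empty list; the second is reported for PSU_CORE, which should stay empty in a census.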

Finally, example 11 shows how to fill in the dataset in case of self-representing PSUs while example 12 shows how to fill in the dataset in case of collapsed strata.

3.8.4.1 Example 1 - Census for core and for modules, on main frame 

Table 10 – Coverage and sampling strategy of the data collections for Example 1


| | Core variables | Module variables |
|---|---|---|
| Main frame | Census | Census |

The dataset should be filled in as presented below:

| | HLD_FEF | EXTPOL_FACT1_CORE | EXTPOL_FACT2_CORE | EXTPOL_FACT3_CORE | STRA_ID_CORE | STRA_IDF_CORE | PSU_CORE | PSUF_CORE | SSU_CORE | SSUF_CORE | OSU_S1_CORE | OSU_SF1_CORE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Main frame | 0 | 1 | (null) | (null) | (null) | _Z | (null) | _Z | (null) | _Z | (null) | _Z |

| | EXTPOL_FACT1_MODULE | EXTPOL_FACT2_MODULE | EXTPOL_FACT3_MODULE | STRA_ID_MODULE | STRA_IDF_MODULE | PSU_MODULE | PSUF_MODULE | SSU_MODULE | SSUF_MODULE | OSU_S1_MODULE | OSU_SF1_MODULE |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Main frame | 1 | (null) | (null) | (null) | _Z | (null) | _Z | (null) | _Z | (null) | _Z |

3.8.4.2 Example 2 - Census for core and stratified one-stage random sampling for modules, on main frame 

Table 11 – Coverage and sampling strategy of the data collections for Example 2


| | Core variables | Module variables |
|---|---|---|
| Main frame | Census | Stratified one-stage random sampling |

The dataset should be filled in as presented below:

| | HLD_FEF | EXTPOL_FACT1_CORE | EXTPOL_FACT2_CORE | EXTPOL_FACT3_CORE | STRA_ID_CORE | STRA_IDF_CORE | PSU_CORE | PSUF_CORE | SSU_CORE | SSUF_CORE | OSU_S1_CORE | OSU_SF1_CORE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Main frame | 0 | 1 | (null) | (null) | (null) | _Z | (null) | _Z | (null) | _Z | (null) | _Z |

| | EXTPOL_FACT1_MODULE | EXTPOL_FACT2_MODULE | EXTPOL_FACT3_MODULE | STRA_ID_MODULE | STRA_IDF_MODULE | PSU_MODULE | PSUF_MODULE | SSU_MODULE | SSUF_MODULE | OSU_S1_MODULE | OSU_SF1_MODULE |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Main frame | weight of the holding | (null) | (null) | code of the stratum from which the holding is extracted | 1 | (null) | _Z | (null) | _Z | (null) | _Z |


3.8.4.3 Example 3 – Census for core and stratified one-stage systematic sampling for the modules, on main frame 

Table 12 – Coverage and sampling strategy of the data collections for Example 3


| | Core variables | Module variables |
|---|---|---|
| Main frame | Census | Stratified one-stage systematic sampling |

The dataset should be filled in as presented below:

| | HLD_FEF | EXTPOL_FACT1_CORE | EXTPOL_FACT2_CORE | EXTPOL_FACT3_CORE | STRA_ID_CORE | STRA_IDF_CORE | PSU_CORE | PSUF_CORE | SSU_CORE | SSUF_CORE | OSU_S1_CORE | OSU_SF1_CORE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Main frame | 0 | 1 | (null) | (null) | (null) | _Z | (null) | _Z | (null) | _Z | (null) | _Z |

| | EXTPOL_FACT1_MODULE | EXTPOL_FACT2_MODULE | EXTPOL_FACT3_MODULE | STRA_ID_MODULE | STRA_IDF_MODULE | PSU_MODULE | PSUF_MODULE | SSU_MODULE | SSUF_MODULE | OSU_S1_MODULE | OSU_SF1_MODULE |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Main frame | weight of the holding | (null) | (null) | code of the stratum from which the holding is extracted | 1 | (null) | _Z | (null) | _Z | the rank of selection of the holding in the stratum (in each stratum, the rank goes from 1 to n, where n is the number of sampled holdings in the stratum) | 1 |


3.8.4.4 Example 4 -  Stratified one-stage random sampling for core and modules, on main frame 

Table 13 – Coverage and sampling strategy of the data collections for Example 4


| | Core variables | Module variables |
|---|---|---|
| Main frame | Stratified one-stage random sampling | Stratified one-stage random sampling |

The dataset should be filled in as presented below:

| | HLD_FEF | EXTPOL_FACT1_CORE | EXTPOL_FACT2_CORE | EXTPOL_FACT3_CORE | STRA_ID_CORE | STRA_IDF_CORE | PSU_CORE | PSUF_CORE | SSU_CORE | SSUF_CORE | OSU_S1_CORE | OSU_SF1_CORE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Main frame | 0 | weight of the holding | (null) | (null) | code of the stratum from which the holding is extracted | 1 | (null) | _Z | (null) | _Z | (null) | _Z |

| | EXTPOL_FACT1_MODULE | EXTPOL_FACT2_MODULE | EXTPOL_FACT3_MODULE | STRA_ID_MODULE | STRA_IDF_MODULE | PSU_MODULE | PSUF_MODULE | SSU_MODULE | SSUF_MODULE | OSU_S1_MODULE | OSU_SF1_MODULE |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Main frame | weight of the holding | (null) | (null) | code of the stratum from which the holding is extracted | 1 | (null) | _Z | (null) | _Z | (null) | _Z |


3.8.4.5 Example 5 - Stratified one-stage cluster sampling for core and modules, on main frame

Table 14 – Coverage and sampling strategy of the data collections for Example 5


| | Core variables | Module variables |
|---|---|---|
| Main frame | Stratified one-stage cluster sampling | Stratified one-stage cluster sampling |

The dataset should be filled in as presented below:

| | HLD_FEF | EXTPOL_FACT1_CORE | EXTPOL_FACT2_CORE | EXTPOL_FACT3_CORE | STRA_ID_CORE | STRA_IDF_CORE | PSU_CORE | PSUF_CORE | SSU_CORE | SSUF_CORE | OSU_S1_CORE | OSU_SF1_CORE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Main frame | 0 | weight of the PSU to which the holding belongs | 1 | (null) | code of the stratum from which the PSU is extracted | 1 | code of the PSU to which the holding belongs | 1 | (null) | _Z | (null) | _Z |

| | EXTPOL_FACT1_MODULE | EXTPOL_FACT2_MODULE | EXTPOL_FACT3_MODULE | STRA_ID_MODULE | STRA_IDF_MODULE | PSU_MODULE | PSUF_MODULE | SSU_MODULE | SSUF_MODULE | OSU_S1_MODULE | OSU_SF1_MODULE |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Main frame | weight of the PSU to which the holding belongs | 1 | (null) | code of the stratum from which the PSU is extracted | 1 | code of the PSU to which the holding belongs | 1 | (null) | _Z | (null) | _Z |



3.8.4.6 Example 6 - Stratified one-stage cluster sampling for core and stratified two-stage sampling for modules, on main frame

Table 15 – Coverage and sampling strategy of the data collections for Example 6


| | Core variables | Module variables |
|---|---|---|
| Main frame | Stratified one-stage cluster sampling | Stratified two-stage sampling |

The dataset should be filled in as presented below:

| | HLD_FEF | EXTPOL_FACT1_CORE | EXTPOL_FACT2_CORE | EXTPOL_FACT3_CORE | STRA_ID_CORE | STRA_IDF_CORE | PSU_CORE | PSUF_CORE | SSU_CORE | SSUF_CORE | OSU_S1_CORE | OSU_SF1_CORE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Main frame | 0 | weight of the PSU to which the holding belongs | 1 | (null) | code of the stratum from which the PSU is extracted | 1 | code of the PSU to which the holding belongs | 1 | (null) | _Z | (null) | _Z |

| | EXTPOL_FACT1_MODULE | EXTPOL_FACT2_MODULE | EXTPOL_FACT3_MODULE | STRA_ID_MODULE | STRA_IDF_MODULE | PSU_MODULE | PSUF_MODULE | SSU_MODULE | SSUF_MODULE | OSU_S1_MODULE | OSU_SF1_MODULE |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Main frame | weight of the PSU to which the holding belongs | weight of the holding extracted from the PSU | (null) | code of the stratum from which the PSU is extracted | 1 | code of the PSU to which the holding belongs | 1 | (null) | _Z | (null) | _Z |


3.8.4.7 Example 7 – Stratified two-stage sampling for core and modules, on main frame

Table 16 – Coverage and sampling strategy of the data collections for Example 7


| | Core variables | Module variables |
|---|---|---|
| Main frame | Stratified two-stage sampling | Stratified two-stage sampling |

The dataset should be filled in as presented below:

| | HLD_FEF | EXTPOL_FACT1_CORE | EXTPOL_FACT2_CORE | EXTPOL_FACT3_CORE | STRA_ID_CORE | STRA_IDF_CORE | PSU_CORE | PSUF_CORE | SSU_CORE | SSUF_CORE | OSU_S1_CORE | OSU_SF1_CORE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Main frame | 0 | weight of the PSU to which the holding belongs | weight of the holding extracted from the PSU | (null) | code of the stratum from which the PSU is extracted | 1 | code of the PSU to which the holding belongs | 1 | (null) | _Z | (null) | _Z |

| | EXTPOL_FACT1_MODULE | EXTPOL_FACT2_MODULE | EXTPOL_FACT3_MODULE | STRA_ID_MODULE | STRA_IDF_MODULE | PSU_MODULE | PSUF_MODULE | SSU_MODULE | SSUF_MODULE | OSU_S1_MODULE | OSU_SF1_MODULE |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Main frame | weight of the PSU to which the holding belongs | weight of the holding extracted from the PSU | (null) | code of the stratum from which the PSU is extracted | 1 | code of the PSU to which the holding belongs | 1 | (null) | _Z | (null) | _Z |


3.8.4.8 Example 8 –  Stratified one-stage sampling for core and sub-sample from the core sample for modules, on main frame

Table 17 – Coverage and sampling strategy of the data collections for Example 8


| | Core variables | Module variables |
|---|---|---|
| Main frame | Stratified one-stage sampling | The module sub-sample is selected from the core sample using random sub-selection of units in each stratum |

The dataset should be filled in as presented below:

| | HLD_FEF | EXTPOL_FACT1_CORE | EXTPOL_FACT2_CORE | EXTPOL_FACT3_CORE | STRA_ID_CORE | STRA_IDF_CORE | PSU_CORE | PSUF_CORE | SSU_CORE | SSUF_CORE | OSU_S1_CORE | OSU_SF1_CORE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Main frame | 0 | weight of the holding | (null) | (null) | code of the stratum from which the holding is extracted | 1 | (null) | _Z | (null) | _Z | (null) | _Z |

| | EXTPOL_FACT1_MODULE | EXTPOL_FACT2_MODULE | EXTPOL_FACT3_MODULE | STRA_ID_MODULE | STRA_IDF_MODULE | PSU_MODULE | PSUF_MODULE | SSU_MODULE | SSUF_MODULE | OSU_S1_MODULE | OSU_SF1_MODULE |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Main frame | weight of the holding (product of core holding weight and the module sub-selection holding weight) | (null) | (null) | code of the stratum from which the holding is extracted | 1 | (null) | _Z | (null) | _Z | (null) | _Z |
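The module weight in this example is a product of two weights, which can be sketched as follows. The sketch assumes that the sub-selection within each stratum is a simple random draw with equal probabilities, so that the sub-selection weight equals the number of core holdings in the stratum divided by the number retained for the module. The function name and the figures are hypothetical.

```python
def module_weight(core_weight, n_core_in_stratum, n_module_in_stratum):
    """Value for EXTPOL_FACT1_MODULE of a holding sub-sampled from the core sample.

    Assumes equal-probability random sub-selection within the stratum.
    """
    sub_selection_weight = n_core_in_stratum / n_module_in_stratum
    return core_weight * sub_selection_weight

# A holding with core weight 5.0 in a stratum where 10 of the 40 core
# holdings were retained for the module.
w = module_weight(core_weight=5.0, n_core_in_stratum=40, n_module_in_stratum=10)
print(w)
```

With these figures the sub-selection weight is 4, so the module weight is four times the core weight.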

3.8.4.9 Example 9 – Stratified two-stage sampling for core and sub-sample from the core sample for modules, on main frame

Table 18 – Coverage and sampling strategy of the data collections for Example 9


| | Core variables | Module variables |
|---|---|---|
| Main frame | Stratified two-stage sampling | The module sub-sample is selected from the core sample using random sub-selection of units in each stratum |

The dataset should be filled in as presented below:

| | HLD_FEF | EXTPOL_FACT1_CORE | EXTPOL_FACT2_CORE | EXTPOL_FACT3_CORE | STRA_ID_CORE | STRA_IDF_CORE | PSU_CORE | PSUF_CORE | SSU_CORE | SSUF_CORE | OSU_S1_CORE | OSU_SF1_CORE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Main frame | 0 | weight of the PSU to which the holding belongs | weight of the holding extracted from the PSU | (null) | code of the stratum from which the PSU is extracted | 1 | code of the PSU to which the holding belongs | 1 | (null) | _Z | (null) | _Z |

| | EXTPOL_FACT1_MODULE | EXTPOL_FACT2_MODULE | EXTPOL_FACT3_MODULE | STRA_ID_MODULE | STRA_IDF_MODULE | PSU_MODULE | PSUF_MODULE | SSU_MODULE | SSUF_MODULE | OSU_S1_MODULE | OSU_SF1_MODULE |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Main frame | weight of the PSU to which the holding belongs | weight of the holding (product of core holding weight and the module sub-selection holding weight) | (null) | code of the stratum from which the PSU is extracted | 1 | code of the PSU to which the holding belongs | 1 | (null) | _Z | (null) | _Z |


3.8.4.10 Example 10 – Stratified three-stage sampling for core and modules, on main frame

Table 19 – Coverage and sampling strategy of the data collections for Example 10


| | Core variables | Module variables |
|---|---|---|
| Main frame | Stratified three-stage sampling | Stratified three-stage sampling |

The dataset should be filled in as presented below:

| | HLD_FEF | EXTPOL_FACT1_CORE | EXTPOL_FACT2_CORE | EXTPOL_FACT3_CORE | STRA_ID_CORE | STRA_IDF_CORE | PSU_CORE | PSUF_CORE | SSU_CORE | SSUF_CORE | OSU_S1_CORE | OSU_SF1_CORE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Main frame | 0 | weight of the PSU extracted from the frame of PSUs | weight of the SSU extracted from the PSU | weight of the holding extracted from the SSU | code of the stratum from which the PSU is extracted | 1 | code of the PSU to which the holding belongs | 1 | code of the SSU to which the holding belongs | 1 | (null) | _Z |

| | EXTPOL_FACT1_MODULE | EXTPOL_FACT2_MODULE | EXTPOL_FACT3_MODULE | STRA_ID_MODULE | STRA_IDF_MODULE | PSU_MODULE | PSUF_MODULE | SSU_MODULE | SSUF_MODULE | OSU_S1_MODULE | OSU_SF1_MODULE |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Main frame | weight of the PSU extracted from the frame of PSUs | weight of the SSU extracted from the PSU | weight of the holding extracted from the SSU | code of the stratum from which the PSU is extracted | 1 | code of the PSU to which the holding belongs | 1 | code of the SSU to which the holding belongs | 1 | (null) | _Z |
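In a multi-stage design each extrapolation factor field holds the weight of one stage. The Python sketch below combines them into an overall holding weight, assuming (as is usual for multi-stage designs, although this section does not state how Eurostat combines the fields) that the overall weight is the product of the stage weights that are filled in. The record layout and the function name are hypothetical.

```python
from math import prod

def overall_weight(record, suffix="CORE"):
    """Multiply the filled-in extrapolation factors of one holding record."""
    factors = [record.get(f"EXTPOL_FACT{k}_{suffix}") for k in (1, 2, 3)]
    return prod(f for f in factors if f is not None)   # empty (null) stages are skipped

# Three-stage design: PSU weight 2.0, SSU weight 3.0, holding weight 4.0.
three_stage = {"EXTPOL_FACT1_CORE": 2.0, "EXTPOL_FACT2_CORE": 3.0,
               "EXTPOL_FACT3_CORE": 4.0}
# One-stage design: only the first factor is filled in.
one_stage = {"EXTPOL_FACT1_CORE": 5.0, "EXTPOL_FACT2_CORE": None,
             "EXTPOL_FACT3_CORE": None}

print(overall_weight(three_stage))
print(overall_weight(one_stage))
```

Under this assumption the same function covers all designs described in Examples 1 to 10, because unused stages are left empty.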



3.8.4.11 Example 11 - Self-representing PSUs

Consider Example 7, where the core variables on the main frame are collected using stratified two-stage sampling. In the first stage, a stratum (STRA_ID_CORE=3000) contains two municipalities (PSUs) in the population, and both are selected with certainty, so their EXTPOL_FACT1_CORE is 1: they are self-representing PSUs. The codes of the PSUs (PSU_CORE) are 1 and 2. In the second stage, two holdings are selected from each municipality, with weights (EXTPOL_FACT2_CORE) equal to 3.00 in the first municipality and 4.00 in the second. The dataset therefore contains 4 holdings belonging to 2 PSUs:

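| HLD_ID | EXTPOL_FACT1_CORE | EXTPOL_FACT2_CORE | EXTPOL_FACT3_CORE | STRA_ID_CORE | STRA_IDF_CORE | PSU_CORE | PSUF_CORE | SSU_CORE | SSUF_CORE |
|---|---|---|---|---|---|---|---|---|---|
| 16 | 1.00 | 3.00 | | 3000 | 1 | 1 | 1 | | _Z |
| 17 | 1.00 | 3.00 | | 3000 | 1 | 1 | 1 | | _Z |
| 18 | 1.00 | 4.00 | | 3000 | 1 | 2 | 1 | | _Z |
| 19 | 1.00 | 4.00 | | 3000 | 1 | 2 | 1 | | _Z |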

Now let's apply the rules on self-representing PSUs:

  • The 2 PSUs (municipalities) should become strata and receive different and unique values in STRA_ID_CORE (these values should not be already in use for indicating some other stratum). In this example, the PSU_CORE codes (1 and 2) are simply transferred to STRA_ID_CORE (assuming that the codes 1 and 2 are not already in use for indicating some other stratum).
  • STRA_IDF_CORE receives code 2 for the 2 PSUs and all 4 holdings belonging to them.
  • The holdings receive unique PSU_CORE codes, let's say 100, 110, 200 and 210.
  • The information from EXTPOL_FACT2_CORE moves to EXTPOL_FACT1_CORE.
  • EXTPOL_FACT2_CORE no longer carries any sampling information but is set to 1, in order to keep a consistent structure of the completed dataset.

Therefore, the dataset will display the following information:

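| HLD_ID | EXTPOL_FACT1_CORE | EXTPOL_FACT2_CORE | EXTPOL_FACT3_CORE | STRA_ID_CORE | STRA_IDF_CORE | PSU_CORE | PSUF_CORE | SSU_CORE | SSUF_CORE |
|---|---|---|---|---|---|---|---|---|---|
| 16 | 3.00 | 1.00 | | 1 | 2 | 100 | 1 | | _Z |
| 17 | 3.00 | 1.00 | | 1 | 2 | 110 | 1 | | _Z |
| 18 | 4.00 | 1.00 | | 2 | 2 | 200 | 1 | | _Z |
| 19 | 4.00 | 1.00 | | 2 | 2 | 210 | 1 | | _Z |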
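The recoding in this example can be expressed as a short transformation. The Python sketch below applies the rules to the four holdings of the example; the function name, the dictionary-based record layout and the mapping of holdings to new PSU codes are assumptions made for illustration. As in the example, it presumes that the old PSU codes are not already in use as stratum codes.

```python
def recode_self_representing(records, holding_psu_codes):
    """Apply the self-representing PSU rules to two-stage CORE records.

    records: dicts for holdings that all belong to self-representing PSUs.
    holding_psu_codes: HLD_ID -> new unique PSU_CORE code, chosen by the country.
    """
    recoded = []
    for rec in records:
        new = dict(rec)
        new["STRA_ID_CORE"] = rec["PSU_CORE"]                 # the PSU becomes a stratum
        new["STRA_IDF_CORE"] = 2                              # flag value used in the example
        new["PSU_CORE"] = holding_psu_codes[rec["HLD_ID"]]    # the holding becomes the PSU
        new["EXTPOL_FACT1_CORE"] = rec["EXTPOL_FACT2_CORE"]   # weight moves up one stage
        new["EXTPOL_FACT2_CORE"] = 1.0                        # kept at 1 for a consistent structure
        recoded.append(new)
    return recoded

original = [
    {"HLD_ID": 16, "EXTPOL_FACT1_CORE": 1.0, "EXTPOL_FACT2_CORE": 3.0,
     "STRA_ID_CORE": 3000, "STRA_IDF_CORE": 1, "PSU_CORE": 1},
    {"HLD_ID": 17, "EXTPOL_FACT1_CORE": 1.0, "EXTPOL_FACT2_CORE": 3.0,
     "STRA_ID_CORE": 3000, "STRA_IDF_CORE": 1, "PSU_CORE": 1},
    {"HLD_ID": 18, "EXTPOL_FACT1_CORE": 1.0, "EXTPOL_FACT2_CORE": 4.0,
     "STRA_ID_CORE": 3000, "STRA_IDF_CORE": 1, "PSU_CORE": 2},
    {"HLD_ID": 19, "EXTPOL_FACT1_CORE": 1.0, "EXTPOL_FACT2_CORE": 4.0,
     "STRA_ID_CORE": 3000, "STRA_IDF_CORE": 1, "PSU_CORE": 2},
]
recoded = recode_self_representing(original, {16: 100, 17: 110, 18: 200, 19: 210})
for rec in recoded:
    print(rec)
```

The product of the two extrapolation factors is the same before and after the recoding, so the estimates are unchanged; only the structure used for variance estimation differs.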

3.8.4.12 Example 12 - Collapsed strata

Consider Example 7, where the core variables on the main frame are collected using stratified two-stage sampling. To keep the table short, assume that in the first stage:

  • from one stratum (STRA_ID_CORE=3000), 1 municipality (PSU) with code PSU_CORE=1 is selected, with EXTPOL_FACT1_CORE equal to 2.50;
  • from another stratum (STRA_ID_CORE=4000), 2 municipalities (PSUs) with codes PSU_CORE=2 and PSU_CORE=3 are selected, each with EXTPOL_FACT1_CORE equal to 2.00.

In the second stage, 1 holding belonging to PSU_CORE=1, 2 holdings belonging to PSU_CORE=2 and 2 holdings belonging to PSU_CORE=3 were selected and responded. The dataset includes 5 selected holdings belonging to 3 selected PSUs:

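| HLD_ID | EXTPOL_FACT1_CORE | EXTPOL_FACT2_CORE | EXTPOL_FACT3_CORE | STRA_ID_CORE | STRA_IDF_CORE | PSU_CORE | PSUF_CORE | SSU_CORE | SSUF_CORE |
|---|---|---|---|---|---|---|---|---|---|
| 18 | 2.50 | 3.00 | | 3000 | 1 | 1 | 1 | | _Z |
| 19 | 2.00 | 4.00 | | 4000 | 1 | 2 | 1 | | _Z |
| 20 | 2.00 | 4.00 | | 4000 | 1 | 2 | 1 | | _Z |
| 21 | 2.00 | 3.50 | | 4000 | 1 | 3 | 1 | | _Z |
| 22 | 2.00 | 3.50 | | 4000 | 1 | 3 | 1 | | _Z |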

The stratum of PSU_CORE=1 is collapsed: it receives the code (STRA_ID_CORE=4000) of another stratum, here assumed to be the most similar one in terms of holding size or farm type. The stratum identification number flag (STRA_IDF_CORE) becomes 3 for the holdings belonging to the collapsed stratum.

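| HLD_ID | EXTPOL_FACT1_CORE | EXTPOL_FACT2_CORE | EXTPOL_FACT3_CORE | STRA_ID_CORE | STRA_IDF_CORE | PSU_CORE | PSUF_CORE | SSU_CORE | SSUF_CORE |
|---|---|---|---|---|---|---|---|---|---|
| 18 | 2.50 | 3.00 | | 4000 | 3 | 1 | 1 | | _Z |
| 19 | 2.00 | 4.00 | | 4000 | 1 | 2 | 1 | | _Z |
| 20 | 2.00 | 4.00 | | 4000 | 1 | 2 | 1 | | _Z |
| 21 | 2.00 | 3.50 | | 4000 | 1 | 3 | 1 | | _Z |
| 22 | 2.00 | 3.50 | | 4000 | 1 | 3 | 1 | | _Z |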
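The collapsing step can be sketched in Python as follows. The sketch assumes, as in this example, that a stratum is collapsed when it contains a single sampled PSU, and that the country supplies the stratum judged most similar. The function name, the record layout and the `merge_into` mapping are hypothetical.

```python
from collections import defaultdict

def collapse_single_psu_strata(records, merge_into):
    """Recode strata that contain only one sampled PSU.

    merge_into: old STRA_ID_CORE -> STRA_ID_CORE of the stratum judged most similar.
    """
    psus = defaultdict(set)
    for rec in records:
        psus[rec["STRA_ID_CORE"]].add(rec["PSU_CORE"])   # sampled PSUs per stratum
    collapsed = []
    for rec in records:
        new = dict(rec)
        if len(psus[rec["STRA_ID_CORE"]]) == 1:
            new["STRA_ID_CORE"] = merge_into[rec["STRA_ID_CORE"]]
            new["STRA_IDF_CORE"] = 3                     # flag value used in the example
        collapsed.append(new)
    return collapsed

holdings = [
    {"HLD_ID": 18, "STRA_ID_CORE": 3000, "STRA_IDF_CORE": 1, "PSU_CORE": 1},
    {"HLD_ID": 19, "STRA_ID_CORE": 4000, "STRA_IDF_CORE": 1, "PSU_CORE": 2},
    {"HLD_ID": 20, "STRA_ID_CORE": 4000, "STRA_IDF_CORE": 1, "PSU_CORE": 2},
    {"HLD_ID": 21, "STRA_ID_CORE": 4000, "STRA_IDF_CORE": 1, "PSU_CORE": 3},
    {"HLD_ID": 22, "STRA_ID_CORE": 4000, "STRA_IDF_CORE": 1, "PSU_CORE": 3},
]
result = collapse_single_psu_strata(holdings, merge_into={3000: 4000})
for rec in result:
    print(rec["HLD_ID"], rec["STRA_ID_CORE"], rec["STRA_IDF_CORE"])
```

Only holding 18 is recoded; the extrapolation factors are not touched by the collapsing, so they are left out of the sketch.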