...
The operations in this API support SDMX 3.0 artefacts and implement the SDMX REST API specification v2.0.
To make the most of this guide, a basic knowledge of XML and REST web services is required.
The main elements are referred to as SDMX artefacts. Below are short definitions of some terms used in SDMX:
Dataset: a collection of related observations, organised according to a predefined structure
Data Structure Definition (DSD): metadata describing the structure and organisation of a dataset, the statistical concepts, and the code lists attached to them that are used within the dataset
Dimensions: concepts that determine the dataset’s "physical" structure
Codelist: a code list is a predefined list from which some statistical coded concepts take their values. Each code list has the following properties:
identifier (it provides a unique identification within the set of code lists specified by a structural definitions maintenance agency);
name (also unique);
description (a description of the purpose of the code list); and
code value length (either an exact or a maximum number of characters and a type, i.e. numeric or alphanumeric).
Attributes: give additional information about the concepts used and do not affect the dataset structure itself
Dataflow: a structure which describes, categorises and constrains the allowable content of a dataset that providers supply for different reference periods
Concept scheme: the descriptive information for an arrangement or division of concepts into groups based on characteristics, which the objects have in common. A concept scheme is a maintained list of concepts that are used in key family and metadata structure definitions (Definitions from EUROSTAT SDMX info space and OECD Glossary of statistical terms)
...
Tip |
---|
The calls in this guide explicitly request an uncompressed response, but compression is recommended for automated or recurrent calls. |
Looking
...
into the metadata of a dataset
in the SDMX Dataflow
Taking the dataset ISOC_CI_ID_H as an example, its main information is available in its Dataflow SDMX artefact
...
Additionally, a set of annotations provides further information (omitted in the previous example; please expand the full XML below to see them)
Annotation type | Description | Value(s) (in AnnotationTitle or multi-lingual AnnotationText) |
---|---|---|
OBS_COUNT | Number of statistical observations in the dataset | 95814 |
OBS_PERIOD_OVERALL_OLDEST | Oldest TIME position reported in an observation | 2002 |
OBS_PERIOD_OVERALL_LATEST | Latest TIME position reported in an observation | 2014 |
UPDATE_STRUCTURE | Timestamp when the dataset structure last changed | 2021-02-08T23:00:00+0100 |
UPDATE_DATA | Timestamp when the dataset data last changed | 2023-05-10T11:00:00+0200 |
ESMS_HTML | Link to Reference Metadata page | https://ec.europa.eu/eurostat/cache/metadata/en/isoc_i_esms.htm |
ESMS_SDMX | Link to Reference Metadata archive | https://ec.europa.eu/eurostat/estat-navtree-portlet-prod/BulkDownloadListing?file=metadata/isoc_i_esms.sdmx.zip |
SOURCE_INSTITUTIONS | Source institution | Eurostat |
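As an illustration, the annotations listed above could be read from a Dataflow message with a few lines of Python. This is only a sketch: the namespace is the SDMX-ML 3.0 "common" namespace used by the messages in this guide, while the XML fragment and the `read_annotations` helper are hypothetical and trimmed for the example.

```python
import xml.etree.ElementTree as ET

# Namespace of the SDMX-ML 3.0 "common" elements, as used in this guide's messages.
NS = {"c": "http://www.sdmx.org/resources/sdmxml/schemas/v3_0/common"}

# Hypothetical, trimmed fragment; a real Dataflow message carries more elements.
SAMPLE = """
<c:Annotations xmlns:c="http://www.sdmx.org/resources/sdmxml/schemas/v3_0/common">
  <c:Annotation>
    <c:AnnotationTitle>95814</c:AnnotationTitle>
    <c:AnnotationType>OBS_COUNT</c:AnnotationType>
  </c:Annotation>
  <c:Annotation>
    <c:AnnotationTitle>2014</c:AnnotationTitle>
    <c:AnnotationType>OBS_PERIOD_OVERALL_LATEST</c:AnnotationType>
  </c:Annotation>
</c:Annotations>
"""


def read_annotations(xml_text):
    """Return {annotation type: title} for every annotation that has both."""
    root = ET.fromstring(xml_text.strip())
    result = {}
    for annotation in root.iter("{%s}Annotation" % NS["c"]):
        kind = annotation.findtext("c:AnnotationType", namespaces=NS)
        title = annotation.findtext("c:AnnotationTitle", namespaces=NS)
        if kind is not None and title is not None:
            result[kind] = title
    return result


annotations = read_annotations(SAMPLE)
print(annotations["OBS_COUNT"])
```

Annotations whose value is carried in a multi-lingual AnnotationText rather than in AnnotationTitle are skipped by this sketch and would need their own handling.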
...
These resources are versioned, so the version present in the reference must be used to ensure consistency.
This definition provides the list of dimensions used to define the time-series of the dataset.
The order of the dimensions is needed later to build the key filter in the data query.
For each dimension, a reference is provided:
- to the concept holding the dimension label; the concept is one item of a concept scheme.
Tip: In the current Eurostat Dissemination Chain, one DSD and one Concept Scheme are generated for each dataset, with an identical identifier (but potentially a different version).
- to the code list holding the codes and labels for the dimension positions.
These code lists are reference metadata and may contain more codes and labels than those used by a specific dataset.
To know the list of positions present in the dataset, please refer to the Content Constraint artefact (next section).
...
- the mandatory TIME_PERIOD time-dimension, where the values are expressed using the ISO 8601 standard
- the primary measure OBS_VALUE holding the statistical observation value
- the optional attribute OBS_FLAG holding the statistical status (also referred to as flags)
...
- provides annual data (freq = A)
- provides data for 14 indicators
- provides data in 2 units
- provides 17 breakdowns and a TOTAL on "Type of Household" [hhtyp]
- provides data for EU aggregates, Member States and other countries
- provides data from 2002 to 2010, plus 2014
Code Block | ||||||||
---|---|---|---|---|---|---|---|---|
| ||||||||
<?xml version='1.0' encoding='UTF-8'?> <m:Structure xmlns:m="http://www.sdmx.org/resources/sdmxml/schemas/v3_0/message" xmlns:s="http://www.sdmx.org/resources/sdmxml/schemas/v3_0/structure" xmlns:c="http://www.sdmx.org/resources/sdmxml/schemas/v3_0/common"> <m:Header> <m:ID>BA8A9F67ED104DEEB816A3660F8170BE</m:ID> <m:Test>false</m:Test> <m:Prepared>2023-05-10T09:10:50.349Z</m:Prepared> <m:Sender id="ESTAT"> <c:Name xml:lang="de">Statistische Amt der Europäischen Union (Eurostat)</c:Name> <c:Name xml:lang="en">Statistical Office of the European Union (Eurostat)</c:Name> <c:Name xml:lang="fr">Office de statistique de l'Union européenne (Eurostat)</c:Name> </m:Sender> <m:Receiver id="unknown"/> </m:Header> <m:Structures> <s:DataConstraints> <s:DataConstraint agencyID="ESTAT" id="ISOC_CI_ID_H" version="1.0" urn="urn:sdmx:org.sdmx.infomodel.registry.DataConstraint=ESTAT:ISOC_CI_ID_H(1.0)" role="Actual"> <c:Annotations> <c:Annotation> <c:AnnotationTitle>3</c:AnnotationTitle> <c:AnnotationType>TABLE_COMPLEXITY</c:AnnotationType> </c:Annotation> </c:Annotations> <c:Name xml:lang="en">Cube description for dataflow ISOC_CI_ID_H</c:Name> <s:ConstraintAttachment> <s:Dataflow>urn:sdmx:org.sdmx.infomodel.datastructure.Dataflow=ESTAT:ISOC_CI_ID_H(1.0)</s:Dataflow> </s:ConstraintAttachment> <s:CubeRegion include="true"> <s:KeyValue id="freq"> <s:Value>A</s:Value> </s:KeyValue> <s:KeyValue id="indic_is"> <s:Value>H_IPC</s:Value> <s:Value>H_ITV</s:Value> <s:Value>H_IPALM</s:Value> <s:Value>H_IMPH</s:Value> <s:Value>H_IGAME</s:Value> <s:Value>H_IPCQ</s:Value> <s:Value>H_ITVQ</s:Value> <s:Value>H_IPALMQ</s:Value> <s:Value>H_IMPHQ</s:Value> <s:Value>H_IGAMEQ</s:Value> <s:Value>H_IOTHDV</s:Value> <s:Value>H_IDKPC</s:Value> <s:Value>H_IPORT</s:Value> <s:Value>H_ITV2</s:Value> </s:KeyValue> <s:KeyValue id="unit"> <s:Value>PC_HH</s:Value> <s:Value>PC_HH_IACC</s:Value> </s:KeyValue> <s:KeyValue id="hhtyp"> <s:Value>TOTAL</s:Value> <s:Value>A1</s:Value> <s:Value>A1_DCH</s:Value> 
<s:Value>A2</s:Value> <s:Value>A2_DCH</s:Value> <s:Value>A_GE3</s:Value> <s:Value>A_GE3_DCH</s:Value> <s:Value>ALL_NDCH</s:Value> <s:Value>ALL_DCH</s:Value> <s:Value>HH_O1</s:Value> <s:Value>HH_NO1</s:Value> <s:Value>HH_DEG1</s:Value> <s:Value>HH_DEG2</s:Value> <s:Value>HH_DEG3</s:Value> <s:Value>HHI_Q1</s:Value> <s:Value>HHI_Q2</s:Value> <s:Value>HHI_Q3</s:Value> <s:Value>HHI_Q4</s:Value> </s:KeyValue> <s:KeyValue id="geo"> <s:Value>EU27_2020</s:Value> <s:Value>EU28</s:Value> <s:Value>EU27_2007</s:Value> <s:Value>EU25</s:Value> <s:Value>EU15</s:Value> <s:Value>EA</s:Value> <s:Value>BE</s:Value> <s:Value>BG</s:Value> <s:Value>CZ</s:Value> <s:Value>DK</s:Value> <s:Value>DE</s:Value> <s:Value>EE</s:Value> <s:Value>IE</s:Value> <s:Value>EL</s:Value> <s:Value>ES</s:Value> <s:Value>FR</s:Value> <s:Value>HR</s:Value> <s:Value>IT</s:Value> <s:Value>CY</s:Value> <s:Value>LV</s:Value> <s:Value>LT</s:Value> <s:Value>LU</s:Value> <s:Value>HU</s:Value> <s:Value>MT</s:Value> <s:Value>NL</s:Value> <s:Value>AT</s:Value> <s:Value>PL</s:Value> <s:Value>PT</s:Value> <s:Value>RO</s:Value> <s:Value>SI</s:Value> <s:Value>SK</s:Value> <s:Value>FI</s:Value> <s:Value>SE</s:Value> <s:Value>IS</s:Value> <s:Value>NO</s:Value> <s:Value>CH</s:Value> <s:Value>UK</s:Value> <s:Value>MK</s:Value> <s:Value>RS</s:Value> <s:Value>TR</s:Value> </s:KeyValue> <s:KeyValue id="TIME_PERIOD"> <s:Value>2002</s:Value> <s:Value>2003</s:Value> <s:Value>2004</s:Value> <s:Value>2005</s:Value> <s:Value>2006</s:Value> <s:Value>2007</s:Value> <s:Value>2008</s:Value> <s:Value>2009</s:Value> <s:Value>2010</s:Value> <s:Value>2014</s:Value> </s:KeyValue> </s:CubeRegion> </s:DataConstraint> </s:DataConstraints> </m:Structures> </m:Structure> |
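To check programmatically which positions a dataset really uses, the CubeRegion of such a constraint can be turned into a dictionary. The sketch below assumes the SDMX-ML 3.0 "structure" namespace shown in the message above; the `allowed_values` helper is hypothetical and the sample is a trimmed copy of the CubeRegion, not the full message.

```python
import xml.etree.ElementTree as ET

# SDMX-ML 3.0 "structure" namespace, as declared in the message above.
S = "http://www.sdmx.org/resources/sdmxml/schemas/v3_0/structure"

# Trimmed copy of the CubeRegion above (only two dimensions kept).
SAMPLE = """
<s:CubeRegion xmlns:s="http://www.sdmx.org/resources/sdmxml/schemas/v3_0/structure" include="true">
  <s:KeyValue id="freq"><s:Value>A</s:Value></s:KeyValue>
  <s:KeyValue id="unit"><s:Value>PC_HH</s:Value><s:Value>PC_HH_IACC</s:Value></s:KeyValue>
</s:CubeRegion>
"""


def allowed_values(xml_text):
    """Map each dimension id to the list of positions present in the dataset."""
    root = ET.fromstring(xml_text.strip())
    values = {}
    for key_value in root.iter(f"{{{S}}}KeyValue"):
        dim_id = key_value.get("id")
        values[dim_id] = [v.text for v in key_value.findall(f"{{{S}}}Value")]
    return values


constraint = allowed_values(SAMPLE)
print(len(constraint["unit"]))
```

The same function should work on a complete Structure message, since `iter` walks all nested elements, but that has not been verified against a live response here.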
...
Instead of specifying a dataset code in the dataflow request, the special character * can be used to mean ALL and retrieve a list of all Eurostat datasets in one request
...
The above link retrieves the complete dataset in the default format: SDMX-ML 3.0 Structure Specific Data (this is the default format because Generic Data has been removed from SDMX 3.0).
The data file is composed of a set of time-series, each identified by its series-key and containing observations such as the one shown below
...
It is possible to further customise the query to retrieve only the needed data or to request a different output format.
Filtering data
In the SDMX 3.0 REST API, there are two kinds of filtering mechanism:
- Filtering on series-keys: for each dimension, one value or a wildcard can be defined to scope the data extraction; this filter must follow the dimension order
- Filtering by component value: for each dimension, a list of values to filter on can be defined; the TIME_PERIOD dimension also allows ranges; this filter is not affected by the dimension order
These two filtering mechanisms can be combined with the logical limitation that a dimension first specified on the series-keys filter should not be repeated in the component value filter.
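A client can guard against this limitation before sending a request. The following sketch assumes the dimension order of ISOC_CI_ID_H as it appears in this guide's examples (freq, indic_is, unit, hhtyp, geo) and a dot-separated series-key; the `conflicting_dimensions` helper is hypothetical.

```python
# Dimension order of ISOC_CI_ID_H as it appears in this guide's examples (assumption:
# for another dataset, read the order from its DSD).
DIMENSIONS = ["freq", "indic_is", "unit", "hhtyp", "geo"]


def conflicting_dimensions(series_key, component_filters):
    """Return the dimensions set both in the series-key and in the component filters."""
    parts = series_key.split(".")
    # A dimension counts as "specified" in the key when its position is not a wildcard.
    # zip() stops at the shorter sequence, so a partial key is handled naturally.
    in_key = {dim for dim, part in zip(DIMENSIONS, parts) if part != "*"}
    # Dimension ids are case insensitive in component value filters.
    in_components = {dim.lower() for dim in component_filters}
    return sorted(in_key & in_components)


print(conflicting_dimensions("A.H_IPC.*.*.*", {"geo": ["EA", "BE"]}))
print(conflicting_dimensions("A.H_IPC", {"FREQ": ["A"]}))
```

An empty list means the two filters can be combined; any returned id should be moved to one filter only.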
...
The first level of filtering selects how the dataset is sliced. It is done by filtering on the series-keys, following the dimension order specified in the DSD
...
with the following syntax:
- '*' is a wildcard for ALL values, meaning no filtering is applied to this dimension
- A specific value can be set for a dimension
- A partial key is supported and assumes the '*' wildcard for the remaining dimensions, see example below
Scope | Details on the series-keys filter | Link | ||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Single time-series fully specified |
| https://ec.europa.eu/eurostat/api/dissemination/sdmx/3.0/data/dataflow/ESTAT/ISOC_CI_ID_H/1.0/A.H_IPC.PC_HH_IACC.TOTAL.EA?compress=false | ||||||||||||
Single indicator + single unit, all types and countries |
|
This can prove useful on the FREQ dimension, in datasets holding data for multiple frequencies, to ensure that a single type of data (annual, quarterly, monthly...) is retrieved per request, as FREQ is usually the first dimension of a dataset. | ||||||||||||
Single GEO data | As the GEO dimension is the last one, the previous dimensions must be wildcarded
|
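The series-key rules above can be sketched in Python as follows. The dimension order is the one of ISOC_CI_ID_H as used in this guide; the `build_series_key` helper and its `partial` flag are hypothetical names introduced for the illustration.

```python
# Dimension order of ISOC_CI_ID_H as used in this guide (assumption: for another
# dataset, read the order from its DSD).
DIMENSIONS = ["freq", "indic_is", "unit", "hhtyp", "geo"]


def build_series_key(selection, partial=False):
    """Build a dot-separated series-key, using '*' for unfiltered dimensions."""
    unknown = set(selection) - set(DIMENSIONS)
    if unknown:
        raise ValueError(f"unknown dimension(s): {sorted(unknown)}")
    parts = [selection.get(dim, "*") for dim in DIMENSIONS]
    if partial:
        # A partial key implies '*' for the remaining dimensions,
        # so trailing wildcards can be dropped.
        while len(parts) > 1 and parts[-1] == "*":
            parts.pop()
    return ".".join(parts)


full = build_series_key(
    {"freq": "A", "indic_is": "H_IPC", "unit": "PC_HH_IACC", "hhtyp": "TOTAL", "geo": "EA"}
)
print(full)
print(build_series_key({"geo": "EA"}))
print(build_series_key({"freq": "A"}, partial=True))
```

The resulting key is appended to the data URL after the dataflow version, in the same position as in the fully specified link of the table above.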
...
Component value filtering has the following syntax, c[dim_id]=value, where:
- dim_id is the dimension id as found in the DSD
- value can list several positions separated by a comma
- both dim_id and value are case insensitive
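A minimal sketch of how such filters could be assembled into a query string is given below. The helper names are hypothetical, and the sketch deliberately does not percent-encode the brackets or commas so that the output matches the syntax described above; whether encoding is needed depends on the HTTP client in use.

```python
def component_filter(dim_id, values):
    """Format one component value filter: c[dim_id]=v1,v2,..."""
    if not values:
        raise ValueError(f"no value given for dimension {dim_id!r}")
    return f"c[{dim_id}]={','.join(values)}"


def build_query(filters):
    """Join several component value filters with '&' (order does not matter)."""
    return "&".join(component_filter(dim, vals) for dim, vals in filters.items())


query = build_query({"geo": ["EA", "BE"], "unit": ["PC_HH"]})
print(query)
```

The result can be appended after the '?' of a data URL, next to other parameters such as compress=false.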
...
Filtering the observations to be returned based on their TIME_PERIOD value is also controlled via component value filtering, as for other dimensions. Following the SDMX syntax, it additionally supports value range definition by prefixing the value with a short operator and a colon ':'
Reusing the single time-series example above, it can be restricted on TIME_PERIOD
Scope | Description | Link |
---|---|---|
Single time value | Get the value for 2010 only | Without an operator or, to align with the other examples below, by specifying the equals operator: eq |
Rolling time | Get values from 2010 onwards | Inclusive operator "greater than or equal to": ge |
Until time | Get values up to and including 2010 | Inclusive operator "less than or equal to": le |
From - to | By combining the previous operators with a '+', a time range can be specified to get data between 2008 and 2010 | |
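The TIME_PERIOD operators of the table above could be combined in a small helper like the one below. This is a sketch only: the `time_filter` function is hypothetical, and placing the ge condition before the le condition is an assumption about ordering, not something this guide prescribes.

```python
def time_filter(start=None, end=None):
    """Build a TIME_PERIOD component filter from inclusive bounds.

    start -> 'ge:<start>', end -> 'le:<end>'; both are joined with '+'.
    """
    conditions = []
    if start is not None:
        conditions.append(f"ge:{start}")
    if end is not None:
        conditions.append(f"le:{end}")
    if not conditions:
        raise ValueError("at least one of start or end is required")
    return "c[TIME_PERIOD]=" + "+".join(conditions)


print(time_filter(start=2010))
print(time_filter(end=2010))
print(time_filter(start=2008, end=2010))
```

Note that some HTTP libraries rewrite a literal '+' in query strings, so it is worth checking the final URL that is actually sent.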
...
...
Retrieving Navigation artefacts
It is worth mentioning that secondary navigation artefacts exist that represent, as SDMX artefacts, a classification of datasets into categories (also referred to as the "Navigation Tree" in Eurostat)
- Category Scheme: hierarchy of categories
- Categorisation: one categorisation references one dataset in a category of a Category Scheme
...