...

The operations in this API support SDMX 3.0 artefacts and implement the SDMX REST API specification v2.0.

To make the most of this guide, a basic knowledge of XML and REST webservices is required.

The main elements are referred to as SDMX artefacts. Below are some terms used in SDMX and their definitions:

  • Dataset: a collection of related observations, organised according to a predefined structure

  • Data Structure Definition (DSD): metadata describing the structure and organisation of a dataset, the statistical concepts, and the code lists attached to them that are used within the dataset

  • Dimensions: concepts that determine the dataset’s "physical" structure

  • Codelist: a code list is a predefined list from which some statistical coded concepts take their values. Each code list has the following properties:

    • identifier (it provides a unique identification within the set of code lists specified by a structural definitions maintenance agency);

    • name (also unique);

    • description (a description of the purpose of the code list); and

    • code value length (either an exact or a maximum number of characters and a type, i.e. numeric or alphanumeric).

  • Attributes: give additional information about the concepts used and do not affect the dataset structure itself

  • Dataflow: a structure which describes, categorises and constrains the allowable content of a dataset that providers supply for different reference periods

  • Concept scheme: the descriptive information for an arrangement or division of concepts into groups based on characteristics, which the objects have in common. A concept scheme is a maintained list of concepts that are used in key family and metadata structure definitions (Definitions from EUROSTAT SDMX info space and OECD Glossary of statistical terms)

...

Tip

The calls in this guide explicitly request an uncompressed response (compress=false), but using compression is recommended for automated or recurrent calls.

Looking

...

into the metadata of a dataset

in the SDMX Dataflow

Taking the dataset ISOC_CI_ID_H as an example, its main information is available in its Dataflow SDMX artefact.

...

Additionally, a set of annotations provides further information (omitted in the previous example; please expand the full XML below to see them).

Annotation type | Description | Value(s) (in AnnotationTitle or multi-lingual AnnotationText)
OBS_COUNT | Number of statistical observations in the dataset | 95814
OBS_PERIOD_OVERALL_OLDEST | Oldest TIME position reported in an observation | 2002
OBS_PERIOD_OVERALL_LATEST | Latest TIME position reported in an observation | 2014
UPDATE_STRUCTURE | Timestamp when the dataset structure last changed (structural change to the list of dimensions; change in list of dimension positions) | 2021-02-08T23:00:00+0100
UPDATE_DATA | Timestamp when the dataset data last changed | 2023-05-10T11:00:00+0200
ESMS_HTML | Link to Reference Metadata page | https://ec.europa.eu/eurostat/cache/metadata/en/isoc_i_esms.htm
ESMS_SDMX | Link to Reference Metadata archive | https://ec.europa.eu/eurostat/estat-navtree-portlet-prod/BulkDownloadListing?file=metadata/isoc_i_esms.sdmx.zip
SOURCE_INSTITUTIONS | Source institution | Eurostat
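
As a rough illustration of how these annotations could be read programmatically, the sketch below maps each annotation type to its value using Python's standard library. The XML fragment is hypothetical: it reuses the element names and namespace seen in the SDMX-ML 3.0 messages of this guide and a few values from the table, and it assumes that SOURCE_INSTITUTIONS is carried in an AnnotationText element, which may not match the actual response.

```python
import xml.etree.ElementTree as ET

# Namespace URI as it appears in the SDMX-ML 3.0 messages of this guide.
COMMON_NS = "http://www.sdmx.org/resources/sdmxml/schemas/v3_0/common"
# Expanded name of the predefined xml:lang attribute.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical fragment for illustration only; values come from the table above.
SAMPLE_ANNOTATIONS = f"""
<c:Annotations xmlns:c="{COMMON_NS}">
  <c:Annotation>
    <c:AnnotationTitle>95814</c:AnnotationTitle>
    <c:AnnotationType>OBS_COUNT</c:AnnotationType>
  </c:Annotation>
  <c:Annotation>
    <c:AnnotationTitle>2002</c:AnnotationTitle>
    <c:AnnotationType>OBS_PERIOD_OVERALL_OLDEST</c:AnnotationType>
  </c:Annotation>
  <c:Annotation>
    <c:AnnotationType>SOURCE_INSTITUTIONS</c:AnnotationType>
    <c:AnnotationText xml:lang="en">Eurostat</c:AnnotationText>
  </c:Annotation>
</c:Annotations>
"""


def read_annotations(xml_text, lang="en"):
    """Map each annotation type to its title, or to its text in `lang`."""
    root = ET.fromstring(xml_text.strip())
    values = {}
    for annotation in root.iter(f"{{{COMMON_NS}}}Annotation"):
        ann_type = annotation.findtext(f"{{{COMMON_NS}}}AnnotationType")
        value = annotation.findtext(f"{{{COMMON_NS}}}AnnotationTitle")
        if value is None:
            # Fall back to the multi-lingual AnnotationText in the requested language.
            for text in annotation.findall(f"{{{COMMON_NS}}}AnnotationText"):
                if text.get(XML_LANG) == lang:
                    value = text.text
        values[ann_type] = value
    return values


print(read_annotations(SAMPLE_ANNOTATIONS))
```

The same function can be applied to a full Dataflow message, since it walks all Annotation elements regardless of their depth.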

...

(warning) These resources are versioned, so the version present in the reference must be used to ensure consistency.

This definition provides the list of dimensions used to define the time-series of the dataset.

The order of dimensions will help build key filtering in the data query later.

For each dimension, a reference is provided:

  1. to the concept holding the dimension label; the concept is one item of a concept scheme.

    Tip

    In the current Eurostat Dissemination Chain, there is one DSD and one Concept Scheme generated for each dataset, with an identical identifier (but potentially a different version).

  2. to the code list holding the codes and labels for the dimension positions

    These code lists are reference metadata and may contain more codes and labels than the ones used by a specific dataset.

    To know the list of positions present in the dataset, please refer to the Content Constraint artefact (next section).

...

...

  • provides annual data (freq = A)
  • provides data for 14 indicators
  • provides data in 2 units
  • provides 17 breakdowns plus a TOTAL on "Type of Household" [hhtyp]
  • provides data for EU aggregates and member states + other countries
  • provides data from 2002 to 2010, plus 2014

Code Block
Data Constraint SDMX XML
<?xml version='1.0' encoding='UTF-8'?>
<m:Structure xmlns:m="http://www.sdmx.org/resources/sdmxml/schemas/v3_0/message" xmlns:s="http://www.sdmx.org/resources/sdmxml/schemas/v3_0/structure" xmlns:c="http://www.sdmx.org/resources/sdmxml/schemas/v3_0/common">
	<m:Header>
		<m:ID>BA8A9F67ED104DEEB816A3660F8170BE</m:ID>
		<m:Test>false</m:Test>
		<m:Prepared>2023-05-10T09:10:50.349Z</m:Prepared>
		<m:Sender id="ESTAT">
			<c:Name xml:lang="de">Statistische Amt der Europäischen Union (Eurostat)</c:Name>
			<c:Name xml:lang="en">Statistical Office of the European Union (Eurostat)</c:Name>
			<c:Name xml:lang="fr">Office de statistique de l'Union européenne (Eurostat)</c:Name>
		</m:Sender>
		<m:Receiver id="unknown"/>
	</m:Header>
	<m:Structures>
		<s:DataConstraints>
			<s:DataConstraint agencyID="ESTAT" id="ISOC_CI_ID_H" version="1.0" urn="urn:sdmx:org.sdmx.infomodel.registry.DataConstraint=ESTAT:ISOC_CI_ID_H(1.0)" role="Actual">
				<c:Annotations>
					<c:Annotation>
						<c:AnnotationTitle>3</c:AnnotationTitle>
						<c:AnnotationType>TABLE_COMPLEXITY</c:AnnotationType>
					</c:Annotation>
				</c:Annotations>
				<c:Name xml:lang="en">Cube description for dataflow ISOC_CI_ID_H</c:Name>
				<s:ConstraintAttachment>
					<s:Dataflow>urn:sdmx:org.sdmx.infomodel.datastructure.Dataflow=ESTAT:ISOC_CI_ID_H(1.0)</s:Dataflow>
				</s:ConstraintAttachment>
				<s:CubeRegion include="true">
					<s:KeyValue id="freq">
						<s:Value>A</s:Value>
					</s:KeyValue>
					<s:KeyValue id="indic_is">
						<s:Value>H_IPC</s:Value>
						<s:Value>H_ITV</s:Value>
						<s:Value>H_IPALM</s:Value>
						<s:Value>H_IMPH</s:Value>
						<s:Value>H_IGAME</s:Value>
						<s:Value>H_IPCQ</s:Value>
						<s:Value>H_ITVQ</s:Value>
						<s:Value>H_IPALMQ</s:Value>
						<s:Value>H_IMPHQ</s:Value>
						<s:Value>H_IGAMEQ</s:Value>
						<s:Value>H_IOTHDV</s:Value>
						<s:Value>H_IDKPC</s:Value>
						<s:Value>H_IPORT</s:Value>
						<s:Value>H_ITV2</s:Value>
					</s:KeyValue>
					<s:KeyValue id="unit">
						<s:Value>PC_HH</s:Value>
						<s:Value>PC_HH_IACC</s:Value>
					</s:KeyValue>
					<s:KeyValue id="hhtyp">
						<s:Value>TOTAL</s:Value>
						<s:Value>A1</s:Value>
						<s:Value>A1_DCH</s:Value>
						<s:Value>A2</s:Value>
						<s:Value>A2_DCH</s:Value>
						<s:Value>A_GE3</s:Value>
						<s:Value>A_GE3_DCH</s:Value>
						<s:Value>ALL_NDCH</s:Value>
						<s:Value>ALL_DCH</s:Value>
						<s:Value>HH_O1</s:Value>
						<s:Value>HH_NO1</s:Value>
						<s:Value>HH_DEG1</s:Value>
						<s:Value>HH_DEG2</s:Value>
						<s:Value>HH_DEG3</s:Value>
						<s:Value>HHI_Q1</s:Value>
						<s:Value>HHI_Q2</s:Value>
						<s:Value>HHI_Q3</s:Value>
						<s:Value>HHI_Q4</s:Value>
					</s:KeyValue>
					<s:KeyValue id="geo">
						<s:Value>EU27_2020</s:Value>
						<s:Value>EU28</s:Value>
						<s:Value>EU27_2007</s:Value>
						<s:Value>EU25</s:Value>
						<s:Value>EU15</s:Value>
						<s:Value>EA</s:Value>
						<s:Value>BE</s:Value>
						<s:Value>BG</s:Value>
						<s:Value>CZ</s:Value>
						<s:Value>DK</s:Value>
						<s:Value>DE</s:Value>
						<s:Value>EE</s:Value>
						<s:Value>IE</s:Value>
						<s:Value>EL</s:Value>
						<s:Value>ES</s:Value>
						<s:Value>FR</s:Value>
						<s:Value>HR</s:Value>
						<s:Value>IT</s:Value>
						<s:Value>CY</s:Value>
						<s:Value>LV</s:Value>
						<s:Value>LT</s:Value>
						<s:Value>LU</s:Value>
						<s:Value>HU</s:Value>
						<s:Value>MT</s:Value>
						<s:Value>NL</s:Value>
						<s:Value>AT</s:Value>
						<s:Value>PL</s:Value>
						<s:Value>PT</s:Value>
						<s:Value>RO</s:Value>
						<s:Value>SI</s:Value>
						<s:Value>SK</s:Value>
						<s:Value>FI</s:Value>
						<s:Value>SE</s:Value>
						<s:Value>IS</s:Value>
						<s:Value>NO</s:Value>
						<s:Value>CH</s:Value>
						<s:Value>UK</s:Value>
						<s:Value>MK</s:Value>
						<s:Value>RS</s:Value>
						<s:Value>TR</s:Value>
					</s:KeyValue>
					<s:KeyValue id="TIME_PERIOD">
						<s:Value>2002</s:Value>
						<s:Value>2003</s:Value>
						<s:Value>2004</s:Value>
						<s:Value>2005</s:Value>
						<s:Value>2006</s:Value>
						<s:Value>2007</s:Value>
						<s:Value>2008</s:Value>
						<s:Value>2009</s:Value>
						<s:Value>2010</s:Value>
						<s:Value>2014</s:Value>
					</s:KeyValue>
				</s:CubeRegion>
			</s:DataConstraint>
		</s:DataConstraints>
	</m:Structures>
</m:Structure>
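
The constraint above can be summarised per dimension with a few lines of code. The following is a minimal sketch using Python's standard library; the embedded XML is a shortened copy of the CubeRegion shown above (most values dropped for brevity), and the function name is only illustrative.

```python
import xml.etree.ElementTree as ET

# Namespace URI copied from the Data Constraint message above.
STRUCTURE_NS = "http://www.sdmx.org/resources/sdmxml/schemas/v3_0/structure"

# Shortened copy of the CubeRegion above, kept small for illustration.
SAMPLE_CUBE_REGION = f"""
<s:CubeRegion xmlns:s="{STRUCTURE_NS}" include="true">
  <s:KeyValue id="freq"><s:Value>A</s:Value></s:KeyValue>
  <s:KeyValue id="unit"><s:Value>PC_HH</s:Value><s:Value>PC_HH_IACC</s:Value></s:KeyValue>
  <s:KeyValue id="TIME_PERIOD"><s:Value>2010</s:Value><s:Value>2014</s:Value></s:KeyValue>
</s:CubeRegion>
"""


def positions_by_dimension(xml_text):
    """Return {dimension id: [positions]} for every KeyValue in the message."""
    root = ET.fromstring(xml_text.strip())
    positions = {}
    for key_value in root.iter(f"{{{STRUCTURE_NS}}}KeyValue"):
        positions[key_value.get("id")] = [
            value.text for value in key_value.findall(f"{{{STRUCTURE_NS}}}Value")
        ]
    return positions


for dimension, codes in positions_by_dimension(SAMPLE_CUBE_REGION).items():
    print(dimension, len(codes), codes)
```

Because the function iterates over all KeyValue elements at any depth, it should also work when given the complete message rather than the isolated CubeRegion.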

...

Instead of specifying a dataset code in the dataflow request, the special character * can be used to mean ALL and retrieve a list of all Eurostat datasets in one request.

...

The link above retrieves the complete dataset in the default format, SDMX-ML 3.0 Structure Specific Data (this is the default format, as Generic Data has been removed from SDMX 3.0).

The data file is composed of a set of time-series, each identified by its series-key and containing observations, such as the one shown below

...

It is possible to further customise the query to retrieve only the needed data or to request a different output format.

Filtering data

In the SDMX 3.0 REST API, there are two kinds of filtering mechanism:

  • Filtering on series-keys: it is possible to define for each dimension one value or a wildcard to scope the data extraction; (warning) this filter must follow the dimension order
  • Filtering by component value: it is possible to define for each dimension a list of values to filter on; the TIME_PERIOD dimension also allows ranges; this filter is not impacted by the dimension order

These two filtering mechanisms can be combined with the logical limitation that a dimension first specified on the series-keys filter should not be repeated in the component value filter.

...

The first level of filtering selects how the dataset is sliced. It is done by filtering on the series-keys, following the dimension order as specified in the DSD

...

with the following syntax:

  • '*' is a wildcard for ALL values, meaning no filtering is applied for this dimension
  • A specific value can be set for a dimension
  • Partial keys are supported and assume the '*' wildcard value for the remaining dimensions; see the example below

Scope | Details on the series-keys filter | Link
Single time-series, fully specified | FREQ=A, INDIC_IS=H_IPC, UNIT=PC_HH_IACC, HHTYP=TOTAL, GEO=EA | https://ec.europa.eu/eurostat/api/dissemination/sdmx/3.0/data/dataflow/ESTAT/ISOC_CI_ID_H/1.0/A.H_IPC.PC_HH_IACC.TOTAL.EA?compress=false
Single indicator + single unit, all types and countries | FREQ=A, INDIC_IS=H_IPC, UNIT=PC_HH_IACC, HHTYP=*, GEO=* | https://ec.europa.eu/eurostat/api/dissemination/sdmx/3.0/data/dataflow/ESTAT/ISOC_CI_ID_H/1.0/A.H_IPC.PC_HH_IACC.*.*?compress=false
Single GEO data | As the GEO dimension is the last, the previous dimensions must be wildcarded: *.*.*.*.EU27_2020 | https://ec.europa.eu/eurostat/api/dissemination/sdmx/3.0/data/dataflow/ESTAT/ISOC_CI_ID_H/1.0/*.*.*.*.EU27_2020?compress=false

(lightbulb) With partial key support, the trailing .*.* of the second example can be omitted from the URL, so it is equivalent to

https://ec.europa.eu/eurostat/api/dissemination/sdmx/3.0/data/dataflow/ESTAT/ISOC_CI_ID_H/1.0/A.H_IPC.PC_HH_IACC?compress=false

This can prove useful on the FREQ dimension, in datasets holding data for multiple frequencies, to ensure that a single type of data (annual, quarterly, monthly...) is retrieved per request, as FREQ is usually the first dimension of a dataset.
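
The series-key rules above (dimension order, '*' wildcard, partial keys) can be sketched as a small helper. This is only an illustration: the function names are hypothetical, and the dimension order is the one used for ISOC_CI_ID_H in this guide, so it would need to be read from the DSD for any other dataset.

```python
# Base URL and dimension order as used in the examples of this guide.
BASE_URL = "https://ec.europa.eu/eurostat/api/dissemination/sdmx/3.0/data/dataflow/ESTAT"
DIMENSION_ORDER = ["freq", "indic_is", "unit", "hhtyp", "geo"]


def series_key(order, **selection):
    """Build a series-key in dimension order, using '*' for unset dimensions."""
    parts = [selection.get(dimension, "*") for dimension in order]
    # Partial key support: trailing wildcards can be omitted (keep at least one part).
    while len(parts) > 1 and parts[-1] == "*":
        parts.pop()
    return ".".join(parts)


def data_url(dataset, version, key):
    """Assemble a data query URL that requests an uncompressed response."""
    return f"{BASE_URL}/{dataset}/{version}/{key}?compress=false"


key = series_key(DIMENSION_ORDER, freq="A", indic_is="H_IPC", unit="PC_HH_IACC")
print(data_url("ISOC_CI_ID_H", "1.0", key))

# A value on the last dimension forces wildcards on all previous ones.
print(series_key(DIMENSION_ORDER, geo="EU27_2020"))
```

Dropping trailing wildcards is a design choice that keeps URLs short; emitting the full key with explicit '*' values should be equivalent according to the partial key rule.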


...

Component value filtering has the following syntax c[dim_id]=value:

  • dim_id is the dimension id as found in the DSD 
  • value can list several positions separated by a comma
  • both dim_id and value are case insensitive

...

Filtering the observations to be returned based on their TIME_PERIOD value is also controlled via component value filtering, as for other dimensions. Additionally, following SDMX syntax, it supports value range definition by prefixing the value with a short operator and a colon ':'.

Reusing the single time-series example above, it can be restricted on TIME_PERIOD:

Scope | Description | Link
Single time value | Get value for 2010 only | https://ec.europa.eu/eurostat/api/dissemination/sdmx/3.0/data/dataflow/ESTAT/ISOC_CI_ID_H/1.0/A.H_IPC.PC_HH_IACC.TOTAL.EA?c[TIME_PERIOD]=2010&compress=false
Single time value | Same, specifying the equals operator eq to align with the other examples below | https://ec.europa.eu/eurostat/api/dissemination/sdmx/3.0/data/dataflow/ESTAT/ISOC_CI_ID_H/1.0/A.H_IPC.PC_HH_IACC.TOTAL.EA?c[TIME_PERIOD]=eq:2010&compress=false
Rolling time | Get value after 2010, inclusive operator "greater than or equals": ge | https://ec.europa.eu/eurostat/api/dissemination/sdmx/3.0/data/dataflow/ESTAT/ISOC_CI_ID_H/1.0/A.H_IPC.PC_HH_IACC.TOTAL.EA?c[TIME_PERIOD]=ge:2010&compress=false
Rolling time | Get value after 2010, exclusive operator "greater than": gt | https://ec.europa.eu/eurostat/api/dissemination/sdmx/3.0/data/dataflow/ESTAT/ISOC_CI_ID_H/1.0/A.H_IPC.PC_HH_IACC.TOTAL.EA?c[TIME_PERIOD]=gt:2010&compress=false
Until time | Get value before 2010, inclusive operator "lower than or equals": le | https://ec.europa.eu/eurostat/api/dissemination/sdmx/3.0/data/dataflow/ESTAT/ISOC_CI_ID_H/1.0/A.H_IPC.PC_HH_IACC.TOTAL.EA?c[TIME_PERIOD]=le:2010&compress=false
Until time | Get value before 2010, exclusive operator "lower than": lt | https://ec.europa.eu/eurostat/api/dissemination/sdmx/3.0/data/dataflow/ESTAT/ISOC_CI_ID_H/1.0/A.H_IPC.PC_HH_IACC.TOTAL.EA?c[TIME_PERIOD]=lt:2010&compress=false
From - to | By combining the previous operators with a '+', it is possible to specify a time range to get data between 2008 and 2010: c[TIME_PERIOD]=ge:2008+le:2010 | https://ec.europa.eu/eurostat/api/dissemination/sdmx/3.0/data/dataflow/ESTAT/ISOC_CI_ID_H/1.0/A.H_IPC.PC_HH_IACC.TOTAL.EA?c[TIME_PERIOD]=ge:2008+le:2010&compress=false
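
The TIME_PERIOD operators listed above can be assembled with a small helper such as the one below. This is a minimal sketch: the function name is hypothetical, only the five operators shown in this guide are accepted, and the '+' is written literally as in the links above, without any URL encoding.

```python
# Operators shown in this guide for TIME_PERIOD filtering.
OPERATORS = {"eq", "ge", "gt", "le", "lt"}


def time_period_filter(*conditions):
    """Build c[TIME_PERIOD]=op:value, joining several conditions with '+'."""
    parts = []
    for operator, period in conditions:
        if operator not in OPERATORS:
            raise ValueError(f"unsupported operator: {operator}")
        parts.append(f"{operator}:{period}")
    return "c[TIME_PERIOD]=" + "+".join(parts)


# Single value, then a range between 2008 and 2010 (both inclusive).
print(time_period_filter(("eq", 2010)))
print(time_period_filter(("ge", 2008), ("le", 2010)))
```

The resulting string can be appended after the '?' of a data query, before &compress=false, as in the links above.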

...

...

Retrieving Navigation artefacts

It is worth mentioning that secondary navigation artefacts exist that represent, as SDMX artefacts, a classification of datasets in categories (also referred to as the "Navigation Tree" in Eurostat)

  • Category Scheme: hierarchy of categories
  • Categorisation: one categorisation references one dataset in a category of a Category Scheme

...