Overview

The TSV format available in the SDMX 2.1 and 3.0 APIs is a format specific to Eurostat.

This format originates from the tab-delimited data files provided by Eurostat via Bulk Download Facility.

While usage of standard format issued from SDMX standard is recommended, this format is kept for compatibility with existing clients and ease of use.

Details on the format

TSV API responses

Raw input from legacy documentation

‘Tsv’ files are flat files that include a ‘tab delimited’ sequence of values time-series in each line
instead of one value per line/record .
– In most files the sequences of values are time series.
– For datasets without the dimension time (e.g. ‘area of the regions’), or that cover only
one period of time, the sequences of values are not time series but another dimension,
e.g. geographical series.as in SDMX-CSV.

– Contains one Header line then one or more Data lines

– The columns (or fields or cells) of the records are ‘tab delimited’.

– Time series lines are sorted in ascending alphabetical order on their seriesKeys identifier, i.e on the first column,

– ATTENTION: time-series descending order (for explanation see the chapter ‘Hints for
Excel users’).
– ATTENTION: cells for which there is "no data available" at all are NOT stored in the tsv
files on the bulk download, doing so would explode the size of the tsv files. When using
the on-line extraction tools like Tables, Graphs and Maps or Data Explorer, such cells
contain the symbol ":"
EXAMPLE
Dataset with time series (with made-up values)
present in the tsv file

Info

In below examples,

orange dot represents a space
orange arrow represents a tabulation

Header line

First line of the TSV file, for example from the header line of https://ec.europa.eu/eurostat/api/dissemination/sdmx/2.1/data/ei_isin_m?format=TSV

Image Added

SeriesKeys column (first column)

sequence of dimension codes separated by a comma providing the format of the time-series seriesKeys identifier used in data lines followed by a back slash and the time dimension code that is always TIME_PERIOD in SDMX standard \TIME_PERIOD

Code Block
freq,unit,s_adj,

...

nace_r2,indic,geo\TIME_PERIOD

For each of these dimension code there is a corresponding SDMX codelist with the same code available also in TSV format

Dimension code	Corresponding codelist download link in TSV
freq	https://ec.europa.eu/eurostat/api/dissemination/sdmx/2.1/codelist/ESTAT/freq?format=TSV
unit	https://ec.europa.eu/eurostat/api/dissemination/sdmx/2.1/codelist/ESTAT/unit?format=TSV
s_adj	https://ec.europa.eu/eurostat/api/dissemination/sdmx/2.1/codelist/ESTAT/s_adj?format=TSV
nace_r2	https://ec.europa.eu/eurostat/api/dissemination/sdmx/2.1/codelist/ESTAT/nace_r2?format=TSV
indic	https://ec.europa.eu/eurostat/api/dissemination/sdmx/2.1/codelist/ESTAT/indic?format=TSV
geo	https://ec.europa.eu/eurostat/api/dissemination/sdmx/2.1/codelist/ESTAT/geo?format=TSV

Observation column(s)

In the header line, other columns contains the observation time period

Observation columns are sorted in ascending order. The notation follows SDMX and ISO8601 standards ( characters in bold are fixed)

There is a trailing space in each column that correspond to the separator between observation value and abservation statuses, it is important to align columns when considering both tab and space as column separator

Period	Format	Example
year	YYYY	2015
semester	YYYY-SN	2015-S1
quarter	YYYY-QN	2015-Q4
month	YYYY-MM	2015-02
week	YYYY-WNN	2015-W01
day	YYYY-MM-DD	2015-12-31

Data line(s)

SeriesKeys column (first column)

sequence of dimension codes separated by a comma providing the seriesKeys identifier for the data time series for the current line.

Code Block
M,I2015,CA,B,IS-IP,AT

This information uniquely identify the time-series of the dataset.

Dimension code	Position code	Position label
freq	M	Monthly
unit	I2015	Index, 2015=100
s_adj	CA	Calendar adjusted data, not seasonally adjusted data
nace_r2	B	Mining and quarrying
indic	IS-IP	Production index
geo	AT	Austria

Observation column(s)

All other columns but the first line represent the sequence of observation value and statuses corresponding to the time period in the header line.

When an observation value is not available a special character colon is displayed instead ":"

Where available, status time 2004m05 2004m04 2004m03 2004m02
mio-eur,nsa,ext_eurozone,net,bp-100,eurozone 11148 10660 13398 9437
mio-eur,nsa,ext_eurozone,net,bp-200,eurozone 3386e 539 -185 -432
mio-eur,nsa,ext_eurozone,net,bp-300,eurozone -5626e -6696i 1902 919
mio-eur,nsa,ext_eurozone,net,bp-379,eurozone -5758e -4165 -3970 -4703
mio-eur,nsa,ext_eurozone,net,bp-993,eurozone 3151.5e 338.7i 11146.1 5221.0
mio-eur,nsa,ext_eurozone,net,bp-994,eurozone 2314 669 543 2113
mio-eur,nsa,ext_eurozone,net,bp-010,eurozone 5465.1 1006.5 11689.0 7334.3– First line: header.
– Other lines: records with the sequence of values.
– First column — first line: sequence of codes separated by a comma followed by a
code separated by a back slash ‘\’
The codes separated by a comma ‘,’ are the ‘names’ of the dimensions used for
identifying each (time) series.
For each of these codes there is a file (with the same name plus the extension
‘dic’) in the directory dic.
The code separated by a back slash ‘\’ is the ‘name’ of the dimension of the
sequence of values, e.g. ‘time’ (if this is a time series) or ‘geo’ (in the case of a
geographical series).
– First column except the first line: sequence of codes separated by a comma ‘,’
that represent the ‘names’ of the items (or instances or positions) of the
dimensions. The label/title of these codes can be found in the ‘dic’ file that has
the same name of the corresponding dimension.
– Other columns of the first line: sequence of codes corresponding to the items of
the dimension.
– All other columns but the first line: sequence of values.
Where available, flags are attached to values. The separator used between values
values and flags statuses is a blank. If there are no flags, the value is followed by a blank.
– The decimal symbol used in the files is the dot ‘.’.
Note for Excel users: these files can be straightforwardly opened in Excel (see chapter
Hints for Excel users).

Should copy and refresh the section 3. HINTS FOR EXCEL USERS

Raw input from Migrating to API TSV

Should be converted to a confluence page with minor adaptation

simple space.

When an observation have not statuses, there remains a trailing space in the column to keep columns aligned when considering both tab and space as column separator

Following excerpt show two time-series for which there are data since the first available time position

Image Added

For Ireland (IE) there are observation values without statuses, so the extra space character.

For Netherlands (NL) the observalue values have a "provisional" status so the p status flag is attached to the value, separated by a space

Special case : datasets with multiple time frequencies

When a dataset contains time series of different frequencies, the data file contains one row for each frequency respecting the general rule of alphabetical order.

The columns will be ordered according to the relevant frequency code alphabetical order, so for a same year and a dataset with Annual, Quarterly and Monthly data, the order will be

Annual, then Monthly and finally Quarterly observations, please see below example come from AVIA_GOR_ME dataset

freq	unit	tra_meas	airp_pr\TIME_PERIOD	...	2019	2019-01	2019-02	...	2019-11	2019-12	2019-Q1	2019-Q2	2019-Q3	2019-Q4	...
A	T	FRM_BRD	ME_LYPG_AT_LOWW	...	70.6	:	:	...	:	:	:	:	:	:	...
[ ... ]
M	T	FRM_BRD	ME_LYPG_AT_LOWW	...	:	5.1	4.4	...	6.7	5.8	:	:	:	:	...
[ ... ]
Q	T	FRM_BRD	ME_LYPG_AT_LOWW	...	:	:	:	...	:	:	14.7	15.2	19.2	21.4	...

Due to this format, this situation must extend the meaning of the ':' special character from "not available" to "not applicable" as it is logically that there cannot be data for 2019 on the monthly time-serie.

...

Child pages

Versions Compared

Old Version 2

New Version Current

Key

Overview

Details on the format

Raw input from legacy documentation

Header line

SeriesKeys column (first column)

Observation column(s)

Data line(s)

SeriesKeys column (first column)

Observation column(s)

Raw input from Migrating to API TSV

Special case : datasets with multiple time frequencies

Child pages

Page History

Versions Compared

Old Version 2

New Version Current

Key

Overview

Details on the format

Raw input from legacy documentation

Header line

SeriesKeys column (first column)

Observation column(s)

Data line(s)

SeriesKeys column (first column)

Observation column(s)

Raw input from Migrating to API TSV

Special case : datasets with multiple time frequencies