API - FAQ - TSV data format

Overview

The TSV format available in the SDMX 2.1 and 3.0 APIs is the only format specific to Eurostat.

This format originates from the tab-delimited data files previously provided via the Eurostat Bulk Download Facility.

While using the standard formats defined by the SDMX standard is recommended, this format is kept for compatibility with existing clients and for ease of use.

Details on the format

TSV API responses are flat files in which each line holds one tab-delimited time series, instead of one value per line/record as in SDMX-CSV.

– The file contains one header line followed by one or more data lines.

– The columns (or fields or cells) of the records are ‘tab delimited’.

– Time series lines are sorted in ascending alphabetical order on their seriesKeys identifier, i.e. on the first column.

ATTENTION: time series for which no data is available at all are NOT present in the TSV file.
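The layout described above can be read with plain string splitting. The following is a minimal sketch, not an official client: the function name `split_tsv` and the sample content are invented for illustration and do not come from a real dataset.

```python
def split_tsv(text):
    """Split a TSV response into (header_cells, data_rows).

    Cells are returned unmodified, so trailing blanks are preserved.
    """
    # Ignore empty lines, such as the one after the final newline.
    lines = [line for line in text.split("\n") if line.strip()]
    header, *data = (line.split("\t") for line in lines)
    return header, data


# Invented sample for illustration only, not real Eurostat data.
sample = (
    "freq,geo\\TIME_PERIOD\t2014 \t2015 \n"
    "A,BE\t1.5 \t1.7 \n"
    "A,FR\t2.0 \t2.1 \n"
)

header, rows = split_tsv(sample)
series_keys = [row[0] for row in rows]
```

Because series with no data at all are omitted from the file, a client should not assume that every expected seriesKeys identifier appears in `series_keys`.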


In the examples below,

  • an orange dot represents a space
  • an orange arrow represents a tab character

Header line

First line of the TSV file, for example

SeriesKeys column (first column)


A sequence of dimension codes separated by commas, giving the format of the seriesKeys identifier used in the data lines, followed by a backslash and the time dimension code, which is always TIME_PERIOD in the SDMX standard (\TIME_PERIOD).




freq,unit,s_adj,nace_r2,indic,geo\TIME_PERIOD

For each of these dimension codes there is a corresponding SDMX codelist with the same code, also available in TSV format.
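As an illustration of how the first header cell can be used, the sketch below splits it into the list of dimension codes and the time dimension code, then maps a data-line seriesKeys identifier onto those dimensions. The helper names and the sample key are hypothetical; only the header string is taken from the example above.

```python
def parse_series_header(cell):
    """Return (dimension_codes, time_dimension) from the first header cell."""
    dims, sep, time_dim = cell.strip().rpartition("\\")
    if not sep:
        raise ValueError("no backslash found in header cell: %r" % cell)
    return dims.split(","), time_dim


def key_to_dict(dimensions, series_key):
    """Map a comma-separated seriesKeys identifier onto the dimension codes."""
    codes = series_key.strip().split(",")
    if len(codes) != len(dimensions):
        raise ValueError("key does not match the header dimensions")
    return dict(zip(dimensions, codes))


dimensions, time_dim = parse_series_header(
    "freq,unit,s_adj,nace_r2,indic,geo\\TIME_PERIOD"
)

# Hypothetical key: the code values are placeholders, not real codelist items.
series = key_to_dict(dimensions, "A,U1,S1,N1,I1,BE")
```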

(minus) TODO dimensions label from the concept in TSV?


Observation column(s)

In the header line, each of the other columns contains an observation time period.

Observation columns are sorted in ascending order.

(warning) Trailing space is important to align columns when 

The notation follows the SDMX and ISO 8601 standards (the hyphens and the letters S, Q and W are fixed characters).

Period    Format      Example
year      YYYY        2015
semester  YYYY-SN     2015-S1
quarter   YYYY-QN     2015-Q4
month     YYYY-MM     2015-02
week      YYYY-WNN    2015-W01
day       YYYY-MM-DD  2015-12-31
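The period notations listed above can be recognised with regular expressions. The sketch below is one possible approach, assuming that semester numbers are 1 or 2 and quarter numbers are 1 to 4; it checks only the shape of the label, not whether the date exists. The names `PERIOD_PATTERNS` and `classify_period` are illustrative.

```python
import re

# One pattern per period type, in the notation of the table above.
PERIOD_PATTERNS = {
    "year": r"\d{4}",
    "semester": r"\d{4}-S[12]",
    "quarter": r"\d{4}-Q[1-4]",
    "month": r"\d{4}-(0[1-9]|1[0-2])",
    "week": r"\d{4}-W\d{2}",
    "day": r"\d{4}-\d{2}-\d{2}",
}


def classify_period(label):
    """Return the period type of a header label, or None if unrecognised."""
    text = label.strip()  # header cells may carry a trailing blank
    for name, pattern in PERIOD_PATTERNS.items():
        if re.fullmatch(pattern, text):
            return name
    return None
```

For example, `classify_period("2015-Q4 ")` returns `"quarter"`, and a label that matches none of the patterns returns `None`.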

(minus) TODO note on multi-freq

Data line(s)

First column

(minus) seriesKeys

Observation column

(minus) Not applicable vs. not available




EXAMPLE

– Other columns of the first line: a sequence of codes corresponding to the items of the dimension.
– All other columns of every line but the first: a sequence of values. Where available, flags are attached to values. The separator between a value and its flags is a blank. If there are no flags, the value is followed by a blank.
– The decimal symbol used in the files is the dot ‘.’.
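The value and flag rules above can be sketched as a small cell parser. This is an assumption-laden illustration: the function name `parse_cell` and the flag letter `p` in the usage note are invented, and any non-numeric token (for example a marker for a missing observation, whose exact spelling is not described here) is simply returned as `None`.

```python
def parse_cell(cell):
    """Split an observation cell into (value, flags).

    The value and its flags are separated by a blank; a value without
    flags is followed by a single blank. The decimal symbol is the dot.
    """
    value_text, _, flags = cell.strip().partition(" ")
    try:
        value = float(value_text)
    except ValueError:
        # Non-numeric token: treated here as "no numeric value".
        value = None
    return value, flags.strip()
```

With this sketch, `parse_cell("12.5 p")` gives `(12.5, "p")` and `parse_cell("7 ")` gives `(7.0, "")`.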

Hints for Excel users

Should review the section 3. HINTS FOR EXCEL USERS from Bulk merged with the updated guide in the Migrating PDF


Important changes

Raw input from Migrating to API TSV

Should be converted to a Confluence page with minor adaptation

  • case of dimension header
  • why it is TIME_PERIOD uppercase