Child pages
  • API - FAQ - Usage in Excel Power Query

You are viewing an old version of this page. View the current version.

Compare with Current View Page History

« Previous Version 3 Next »

Introduction

With Power Query, Excel can perform ETL tasks on external data (ETL = Extract, transform, load) 

About Power Query (quote from Microsoft documention)

With Power Query (known as Get & Transform in Excel), you can import or connect to external data, and then shape that data, for example remove a column, change a data type, or merge tables, in ways that meet your needs. Then, you can load your query into Excel to create charts and reports. Periodically, you can refresh the data to make it up to date. Power Query is available on three Excel applications, Excel for Windows, Excel for Mac and Excel for the Web.


The API textual data formats (SDMX-CSV or TSV) work well as a Web data source to such ETL tool.

About JSON format

JSON data provided by Eurostat is following JSON-STAT format (https://json-stat.org/format ) that is tailored for Web visualization as it requires a toolkit for its processing, the main toolkit is in Javascript, and there is none for Excel.

Therefore such JSON data are not directly usable within Excel and is out of scope of our support.

This guide propose a few examples of getting started with loading data into Excel.

About data refresh

Connecting directly to the API will ensure having freshest available data, as content can be fully refreshed and the defined transformation pipeline in Excel Power Query will be applied again.

Further in this page, there is an example on how the refresh date can be visible.

In case you need to work, not with the lastest data, but a specific data extraction. it is recommended to work with files you downloaded from the API and organised in your storage that can eventually be loaded CSV File within Excel.

In this case the extraction date could be set as the file modified date as provided by the operating system. Please see below on how to retrieve this information from Excel Power Query

Read file modified date time

Please create first a blank query to hold the path to the csv file on your disk

  • Name = FilePath
  • Value= C:\Test Folder\Source File.csv

Then a second query with the following setup to extract the modified timestamp

  • Name = Date Time Modified
  • Formula 

    let
        //Get the file path
        FilePathString = FilePath,
    
        //Get length of file path
        FilePathLength = Text.Length(FilePathString),
    
        //Get last slash position
        LastSlash = Text.PositionOf(FilePathString,"\",Occurrence.Last),
    
        //Get the folder path
        FolderPath = Text.Start(FilePathString,LastSlash + 1),
    
        //Get the file name
        FileName = Text.End(FilePathString,FilePathLength - LastSlash - 1),
    
        //Get the date modified
        DateModified = Folder.Contents(FolderPath){[Name=FileName]}[Date modified]
    
    in
        DateModified

Starting from SDMX-CSV 2.0 data

General information

It is possible to build an API link to download SDMX-CSV-2.0 data for a dataset via the Databrowser download options menu.

In following example a complete dataset is used so the URL syntax is simple.

Step by step basic example 

Working with the ISOC_CI_ID_H dataset download link on SDMX 3.0 API with parameter to retrieve SDMX-CSV 2.0 data

https://ec.europa.eu/eurostat/api/dissemination/sdmx/3.0/data/dataflow/ESTAT/ISOC_CI_ID_H/1.0?format=csvdata&formatVersion=2.0&compress=false

Starting from a blank worksheet, it is possible to load the data for the Data menu with the From Web menu item

It opens a window where to enter the URL  

This load the Power Query Editor with an initial Source  step configured 

= Web.Contents("https://ec.europa.eu/eurostat/api/dissemination/sdmx/3.0/data/dataflow/ESTAT/ISOC_CI_ID_H/1.0?format=csvdata&formatVersion=2.0&compress=false")

While this guide try to minimize the need to edit the formula as much as possible, for daily use it is recommended to use compressed input by

  • removing the &compress=false parameter in the URL
  • wrapping the Web.Contents into a Binary.Decompress() method for GZIP uncompression
= Binary.Decompress(Web.Contents("https://ec.europa.eu/eurostat/api/dissemination/sdmx/3.0/data/dataflow/ESTAT/ISOC_CI_ID_H/1.0?format=csvdata&formatVersion=2.0"),Compression.GZip)


(lightbulb) It is recommended to change the query name to use the dataset code


Right click on the file icon named ec.europa.eu and select CSV option

This automatically apply the following modification 

  • Modify the Source step to split by comma Split by Comma

    = Csv.Document(Web.Contents("https://ec.europa.eu/eurostat/api/dissemination/sdmx/3.0/data/dataflow/ESTAT/ISOC_CI_ID_H/1.0?format=csvdata&formatVersion=2.0&compress=false"),[Delimiter=",", Columns=10, Encoding=1252, QuoteStyle=QuoteStyle.None])
  • Add a Promote headers step

    = Table.PromoteHeaders(Source, [PromoteAllScalars=true])
  • Add a Changed Type step

    = Table.TransformColumnTypes(#"Promoted Headers",{{"STRUCTURE", type text}, {"STRUCTURE_ID", type text}, {"freq", type text}, {"indic_is", type text}, {"unit", type text}, {"hhtyp", type text}, {"geo", type text}, {"TIME_PERIOD", Int64.Type}, {"OBS_VALUE", Int64.Type}, {"OBS_FLAG", type text}})

    in the Changed Type step, converting to number is choosing a number type based on the first value read, so it is possible that Integer type would be used by default

    In case your data report error it means it contains decimal number that your Excel regional settings cannot interpret as such

    You must ensure that type number is defined for OBS_VALUE and the English Europe is used as regional settings to use dot as a separator

    Changed Type step formula should be modified to be similar to the following :

    = Table.TransformColumnTypes(#"Promoted Headers",{{"STRUCTURE", type text}, {"STRUCTURE_ID", type text}, {"freq", type text}, [...OTHER_DIMENSIONS...], {"TIME_PERIOD", Int64.Type}, {"OBS_VALUE", type number}, {"OBS_FLAG", type text}}, "en-150")

From this point, data is already usable and can be loaded into excel via the Close and Load button.

The Excel workbook can be further customised to your need, the data source can be refreshed via the Refresh button of Excel.

Query can be reopened to apply further configuration, you may decide to remove the unnecessary first column as you may save your excel workbook with the code of the dataset in the filename.

Complete result of this step as visible in the Advanced Editor

(lightbulb) This code can be copy pasted 

let
    Source = Binary.Decompress(Web.Contents("https://ec.europa.eu/eurostat/api/dissemination/sdmx/3.0/data/dataflow/ESTAT/ISOC_CI_ID_H/1.0?format=csvdata&formatVersion=2.0"),Compression.GZip),
    #"Imported CSV" = Csv.Document(Source,[Delimiter=",", Columns=10, Encoding=1252, QuoteStyle=QuoteStyle.None]),
    #"Promoted Headers" = Table.PromoteHeaders(#"Imported CSV", [PromoteAllScalars=true]),
    #"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"STRUCTURE", type text}, {"STRUCTURE_ID", type text}, {"freq", type text}, {"indic_is", type text}, {"unit", type text}, {"hhtyp", type text}, {"geo", type text}, {"TIME_PERIOD", Int64.Type}, {"OBS_VALUE", Int64.Type}, {"OBS_FLAG", type text}}, "en-150"),
    #"Removed Columns" = Table.RemoveColumns(#"Changed Type",{"STRUCTURE", "STRUCTURE_ID"})
in
    #"Removed Columns"

Preparing a query from an example code block in this page

Starting from a blank worksheet, in the Data ribbon menu, in Get Data > Launch Power Query Editor

In New Source menu > Other Sources > Blank Query 

Finally opening the Advanced Editor

=>

Then simply paste an example of a query definition taken from this page to have it loaded into excel.


Extension on the example : working with labels and last update date from the data file

There are 2 options to enable on the initial data query that could enpower further data rendering in Excel

  • &labels=name => Adding this option will add an extra column with the label for each corresponding codified column
    • an extra column with the dataset label will be added next to the dataset identifier
    • an extra column with the dimension label as header and the position label as values will be added for each dimension
  • &returnLastUpdateDate=true => Adding this option will add a last column names "LAST UPDATE" with the timestamp of the data last update
    • By design in SDMX there could be a different date per time-series or observation, for Eurostat this date is the same for the complete dataset but the value will be repeated on each row

If you edit the URL in the Source step with options adding extra column it is best to restart from zero as some steps will not be complete (especially the Source and Changed Type one)

For example the number or columns to render is a number that is set during the very fist CSV transformation 


It is now possible to get rid of unnecessary columns to keep the columns of importance either

  • during the data query definition by removing the column
  • by hiding them after data is loaded into Excel worksheet

Example : building a summary sheet on the dataset from the first row of data

Having loaded the data and renamed the sheet as data, then created a Summary sheet.

It is possible to reference content from the data sheet with the standard syntax sheetname!$COLUMN_LETTER$ROW_NUMBER

These columns could then be hidden in the data sheet to not hinder data viewing

Information on Date Time Refreshed

Information about when data was last refreshed is available from the Queries & Connections sidebar as illustrated below

Refreshing content can be done from here or from the Data ribbon.

To add the refresh time information to your summary sheet, the best option is to define a new Query with the following setup

  • Name = Date Time Refreshed
  • Formula = DateTime.LocalNow()

Then load this query into the information sheet you are building

Example : building a series key column instead of having one column per dimension code

In case you want to visualise and filter the observations for a one or more time-series.

It could be useful to have a series keys value available.

To do so, please ensure your query is opened in Powery Query Editor

Then select the coded dimension columns and select the Merge Columns in the Transform ribbon  and select dot character as separator, this would using using the series key in SDMX REST API calls

 


This column can be moved by Drag and Drop before the others,  this action is recorded as a ReoderColumns action in the query definition

Complete result of this step as visible in the Advanced Editor

let
    Source = Binary.Decompress(Web.Contents("https://ec.europa.eu/eurostat/api/dissemination/sdmx/3.0/data/dataflow/ESTAT/ISOC_CI_ID_H/1.0?format=csvdata&formatVersion=2.0&labels=name&returnLastUpdateDate=true"),Compression.GZip),
    #"Imported CSV" = Csv.Document(Source,[Delimiter=",", Columns=20, Encoding=1252, QuoteStyle=QuoteStyle.None]),
    #"Promoted Headers" = Table.PromoteHeaders(#"Imported CSV", [PromoteAllScalars=true]),
    #"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"STRUCTURE", type text}, {"STRUCTURE_ID", type text}, {"STRUCTURE_NAME", type text}, {"freq", type text}, {"Time frequency", type text}, {"indic_is", type text}, {"Information society indicator", type text}, {"unit", type text}, {"Unit of measure", type text}, {"hhtyp", type text}, {"Type of household", type text}, {"geo", type text}, {"Geopolitical entity (reporting)", type text}, {"TIME_PERIOD", Int64.Type}, {"Time", type text}, {"OBS_VALUE", Int64.Type}, {"Observation value", type text}, {"OBS_FLAG", type text}, {"Observation status (Flag)", type text}, {"LAST UPDATE", type datetime}}, "en-150"),
    #"Removed Columns" = Table.RemoveColumns(#"Changed Type",{"STRUCTURE", "Time", "Observation value", "Observation status (Flag)"}),
    #"Merged Columns" = Table.CombineColumns(#"Removed Columns",{"freq", "indic_is", "unit", "hhtyp", "geo"},Combiner.CombineTextByDelimiter(".", QuoteStyle.None),"Series Keys"),
    #"Reordered Columns" = Table.ReorderColumns(#"Merged Columns",{"STRUCTURE_ID", "STRUCTURE_NAME", "Series Keys", "Time frequency", "Information society indicator", "Unit of measure", "Type of household", "Geopolitical entity (reporting)", "TIME_PERIOD", "OBS_VALUE", "OBS_FLAG", "LAST UPDATE"})
in
    #"Reordered Columns"

Extension on the example : pivoting data 

As the SDMX-CSV display one statistical observation per line, it can be necessary to build more column from another dimension. 

TIME_PERIOD is the most common dimension to PIVOT one

Starting from the basic example, of having the data loaded as CSV columns, to apply pivoting, the following steps are necessary

Merge OBS_FLAG column with OBS_VALUE column separated by a space.

Select TIME_PERIOD column and "Pivot column", then select previously generated column and advanced option to not aggregate

This operation can take a few minutes.

As you may have spotted the columns are not ordered

You need to modify the formula to include a List.Sort to control the order of the column

= Table.Pivot(Table.TransformColumnTypes(#"Merged Columns", {{"TIME_PERIOD", type text}}, "fr-BE"), List.Sort(List.Distinct(Table.TransformColumnTypes(#"Merged Columns", {{"TIME_PERIOD", type text}}, "fr-BE")[TIME_PERIOD]),Order.Ascending), "TIME_PERIOD", "Obs")


or with Descending option

= Table.Pivot(Table.TransformColumnTypes(#"Merged Columns", {{"TIME_PERIOD", type text}}, "fr-BE"), List.Sort(List.Distinct(Table.TransformColumnTypes(#"Merged Columns", {{"TIME_PERIOD", type text}}, "fr-BE")[TIME_PERIOD]),Order.Descending), "TIME_PERIOD", "Obs")


In case you prefer to have value and attributes in separated column, you could split back the column based on the space separator

(lightbulb) As this operation must be done on each column, it may be easier to manipulate the advanced query text than use the interface


let
    Source = Binary.Decompress(Web.Contents("https://ec.europa.eu/eurostat/api/dissemination/sdmx/3.0/data/dataflow/ESTAT/ISOC_CI_ID_H/1.0?format=csvdata&formatVersion=2.0"),Compression.GZip),
    #"Imported CSV" = Csv.Document(Source,[Delimiter=",", Columns=10, Encoding=1252, QuoteStyle=QuoteStyle.None]),
    #"Promoted Headers" = Table.PromoteHeaders(#"Imported CSV", [PromoteAllScalars=true]),
    #"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"STRUCTURE", type text}, {"STRUCTURE_ID", type text}, {"freq", type text}, {"indic_is", type text}, {"unit", type text}, {"hhtyp", type text}, {"geo", type text}, {"TIME_PERIOD", Int64.Type}, {"OBS_VALUE", Int64.Type}, {"OBS_FLAG", type text}},"en-150"),
    #"Removed Columns" = Table.RemoveColumns(#"Changed Type",{"STRUCTURE", "STRUCTURE_ID"}),
    #"Merged Columns" = Table.CombineColumns(Table.TransformColumnTypes(#"Removed Columns", {{"OBS_VALUE", type text}}, "fr-BE"),{"OBS_VALUE", "OBS_FLAG"},Combiner.CombineTextByDelimiter(" ", QuoteStyle.None),"Obs"),
    #"Pivoted Column" = Table.Pivot(Table.TransformColumnTypes(#"Merged Columns", {{"TIME_PERIOD", type text}}, "fr-BE"), List.Sort(List.Distinct(Table.TransformColumnTypes(#"Merged Columns", {{"TIME_PERIOD", type text}}, "fr-BE")[TIME_PERIOD]),Order.Ascending), "TIME_PERIOD", "Obs"),
    #"Split Column by Delimiter 2002" = Table.SplitColumn(#"Pivoted Column", "2002", Splitter.SplitTextByDelimiter(" ", QuoteStyle.Csv), {"2002.OBS", "2002.FLAG"}),
    #"Changed Type 2002" = Table.TransformColumnTypes(#"Split Column by Delimiter 2002",{{"2002.OBS", Int64.Type}, {"2002.FLAG", type text}},"en-150"),
    #"Split Column by Delimiter 2003" = Table.SplitColumn(#"Changed Type 2002", "2003", Splitter.SplitTextByDelimiter(" ", QuoteStyle.Csv), {"2003.OBS", "2003.FLAG"}),
    #"Changed Type 2003" = Table.TransformColumnTypes(#"Split Column by Delimiter 2003",{{"2003.OBS", Int64.Type}, {"2003.FLAG", type text}},"en-150"),
    #"Split Column by Delimiter 2004" = Table.SplitColumn(#"Changed Type 2003", "2004", Splitter.SplitTextByDelimiter(" ", QuoteStyle.Csv), {"2004.OBS", "2004.FLAG"}),
    #"Changed Type 2004" = Table.TransformColumnTypes(#"Split Column by Delimiter 2004",{{"2004.OBS", Int64.Type}, {"2004.FLAG", type text}},"en-150"),
    #"Split Column by Delimiter 2005" = Table.SplitColumn(#"Changed Type 2004", "2005", Splitter.SplitTextByDelimiter(" ", QuoteStyle.Csv), {"2005.OBS", "2005.FLAG"}),
    #"Changed Type 2005" = Table.TransformColumnTypes(#"Split Column by Delimiter 2005",{{"2005.OBS", Int64.Type}, {"2005.FLAG", type text}},"en-150"),
    #"Split Column by Delimiter 2006" = Table.SplitColumn(#"Changed Type 2005", "2006", Splitter.SplitTextByDelimiter(" ", QuoteStyle.Csv), {"2006.OBS", "2006.FLAG"}),
    #"Changed Type 2006" = Table.TransformColumnTypes(#"Split Column by Delimiter 2006",{{"2006.OBS", Int64.Type}, {"2006.FLAG", type text}},"en-150"),
    #"Split Column by Delimiter 2007" = Table.SplitColumn(#"Changed Type 2006", "2007", Splitter.SplitTextByDelimiter(" ", QuoteStyle.Csv), {"2007.OBS", "2007.FLAG"}),
    #"Changed Type 2007" = Table.TransformColumnTypes(#"Split Column by Delimiter 2007",{{"2007.OBS", Int64.Type}, {"2007.FLAG", type text}},"en-150"),
    #"Split Column by Delimiter 2008" = Table.SplitColumn(#"Changed Type 2007", "2008", Splitter.SplitTextByDelimiter(" ", QuoteStyle.Csv), {"2008.OBS", "2008.FLAG"}),
    #"Changed Type 2008" = Table.TransformColumnTypes(#"Split Column by Delimiter 2008",{{"2008.OBS", Int64.Type}, {"2008.FLAG", type text}},"en-150"),
    #"Split Column by Delimiter 2009" = Table.SplitColumn(#"Changed Type 2008", "2009", Splitter.SplitTextByDelimiter(" ", QuoteStyle.Csv), {"2009.OBS", "2009.FLAG"}),
    #"Changed Type 2009" = Table.TransformColumnTypes(#"Split Column by Delimiter 2009",{{"2009.OBS", Int64.Type}, {"2009.FLAG", type text}},"en-150"),
    #"Split Column by Delimiter 2010" = Table.SplitColumn(#"Changed Type 2009", "2010", Splitter.SplitTextByDelimiter(" ", QuoteStyle.Csv), {"2010.OBS", "2010.FLAG"}),
    #"Changed Type 2010" = Table.TransformColumnTypes(#"Split Column by Delimiter 2010",{{"2010.OBS", Int64.Type}, {"2010.FLAG", type text}},"en-150"),
    #"Split Column by Delimiter 2014" = Table.SplitColumn(#"Changed Type 2010", "2014", Splitter.SplitTextByDelimiter(" ", QuoteStyle.Csv), {"2014.OBS", "2014.FLAG"}),
    #"Changed Type 2014" = Table.TransformColumnTypes(#"Split Column by Delimiter 2014",{{"2014.OBS", Int64.Type}, {"2014.FLAG", type text}},"en-150")
in
    #"Changed Type 2014"

Starting from TSV data

General information

It is possible to build an API link to download TSV 2.0 data for a dataset via the Databrowser download options menu.

In following example a complete dataset is used so the URL syntax is simple.

Step by step basic example 

Working with the ISOC_CI_ID_H dataset download link on SDMX 3.0 API with parameter to retrieve TSV data

https://ec.europa.eu/eurostat/api/dissemination/sdmx/3.0/data/dataflow/ESTAT/ISOC_CI_ID_H/1.0?format=TSV&compress=false 

Starting from a blank worksheet, it is possible to load the data for the Data menu with the From Web menu item  

It opens a window where to enter the URL

This load the Power Query Editor with an initial Source  step configured 

= Web.Contents("https://ec.europa.eu/eurostat/api/dissemination/sdmx/3.0/data/dataflow/ESTAT/ISOC_CI_ID_H/1.0?format=TSV&compress=false ")

While this guide try to minimize the need to edit the formula as much as possible, for daily use it is recommended to use compressed input by

  • removing the &compress=false parameter in the URL
  • wrapping the Web.Contents into a Binary.Decompress() method for GZIP uncompression
= Binary.Decompress(Web.Contents("https://ec.europa.eu/eurostat/api/dissemination/sdmx/3.0/data/dataflow/ESTAT/ISOC_CI_ID_H/1.0?format=TSV"),Compression.GZip)


(lightbulb) It is recommended to change the query name to use the dataset code


Right click on the file icon named ec.europa.eu and select CSV option

This automatically apply the following modification 

  • Modify the Source step to split by TAB

    = Csv.Document(Web.Contents("https://ec.europa.eu/eurostat/api/dissemination/sdmx/3.0/data/dataflow/ESTAT/ISOC_CI_ID_H/1.0?format=TSV&compress=false"),[Delimiter="	", Columns=11, Encoding=65001, QuoteStyle=QuoteStyle.None])
  • Add a Changed Type step

    = Table.TransformColumnTypes(Source,{{"Column1", type text}, {"Column2", type text}, {"Column3", type text}, {"Column4", type text}, {"Column5", type text}, {"Column6", type text}, {"Column7", type text}, {"Column8", type text}, {"Column9", type text}, {"Column10", type text}, {"Column11", type text}},"en-150")

From this point, data is already usable if it is ok to work with a single time series columns, otherwise the first column can be split by comma

As you may have noticed Excel did not detect the first line as an header row

  • Delete the Changed Type step(s) from the Applied Steps list 
  • in the Transform Ribbon, click on Use First Row as Header(s) 

Finally it can be loaded into excel via the Close and Load button.

The Excel workbook can be further customised to your need, the data source can be refreshed via the Refresh button of Excel.

Query can be reopened to apply further configuration.

Complete result of this step as visible in the Advanced Editor

(lightbulb) This code can be copy pasted 

let
    Source = Csv.Document(Web.Contents("https://ec.europa.eu/eurostat/api/dissemination/sdmx/3.0/data/dataflow/ESTAT/ISOC_CI_ID_H/1.0?format=TSV&compress=false"),[Delimiter="	", Columns=11, Encoding=65001, QuoteStyle=QuoteStyle.None]),
    #"Split Column by Delimiter" = Table.SplitColumn(Source, "Column1", Splitter.SplitTextByDelimiter(",", QuoteStyle.Csv), {"Column1.1", "Column1.2", "Column1.3", "Column1.4", "Column1.5"}),
    #"Promoted Headers" = Table.PromoteHeaders(#"Split Column by Delimiter", [PromoteAllScalars=true]),
    #"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"freq", type text}, {"indic_is", type text}, {"unit", type text}, {"hhtyp", type text}, {"geo\TIME_PERIOD", type text}, {"2002 ", type text}, {"2003 ", type text}, {"2004 ", type text}, {"2005 ", type text}, {"2006 ", type text}, {"2007 ", type text}, {"2008 ", type text}, {"2009 ", type text}, {"2010 ", type text}, {"2014 ", type text}},"en-150")
in
    #"Changed Type"

TSV - separating value and flags

As done in the SDMX-CSV example it could be possible to edit each data column to split further by space.

let
    Source = Csv.Document(Web.Contents("https://ec.europa.eu/eurostat/api/dissemination/sdmx/3.0/data/dataflow/ESTAT/ISOC_CI_ID_H/1.0?format=TSV&compress=false"),[Delimiter="	", Columns=11, Encoding=65001, QuoteStyle=QuoteStyle.None]),
    #"Split Column by Delimiter" = Table.SplitColumn(Source, "Column1", Splitter.SplitTextByDelimiter(",", QuoteStyle.Csv), {"Column1.1", "Column1.2", "Column1.3", "Column1.4", "Column1.5"}),
    #"Promoted Headers" = Table.PromoteHeaders(#"Split Column by Delimiter", [PromoteAllScalars=true]),
    #"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"freq", type text}, {"indic_is", type text}, {"unit", type text}, {"hhtyp", type text}, {"geo\TIME_PERIOD", type text}, {"2002 ", type text}, {"2003 ", type text}, {"2004 ", type text}, {"2005 ", type text}, {"2006 ", type text}, {"2007 ", type text}, {"2008 ", type text}, {"2009 ", type text}, {"2010 ", type text}, {"2014 ", type text}},"en-150"),
    #"Split Column by Delimiter 2002" = Table.SplitColumn(#"Changed Type", "2002 ", Splitter.SplitTextByDelimiter(" ", QuoteStyle.Csv), {"2002.OBS", "2002.FLAG"}),
    #"Changed Type 2002" = Table.TransformColumnTypes(#"Split Column by Delimiter 2002",{{"2002.OBS", type text}, {"2002.FLAG", type text}},"en-150"),
    #"Split Column by Delimiter 2003" = Table.SplitColumn(#"Changed Type 2002", "2003 ", Splitter.SplitTextByDelimiter(" ", QuoteStyle.Csv), {"2003.OBS", "2003.FLAG"}),
    #"Changed Type 2003" = Table.TransformColumnTypes(#"Split Column by Delimiter 2003",{{"2003.OBS", type text}, {"2003.FLAG", type text}},"en-150"),
    #"Split Column by Delimiter 2004" = Table.SplitColumn(#"Changed Type 2003", "2004 ", Splitter.SplitTextByDelimiter(" ", QuoteStyle.Csv), {"2004.OBS", "2004.FLAG"}),
    #"Changed Type 2004" = Table.TransformColumnTypes(#"Split Column by Delimiter 2004",{{"2004.OBS", type text}, {"2004.FLAG", type text}},"en-150"),
    #"Split Column by Delimiter 2005" = Table.SplitColumn(#"Changed Type 2004", "2005 ", Splitter.SplitTextByDelimiter(" ", QuoteStyle.Csv), {"2005.OBS", "2005.FLAG"}),
    #"Changed Type 2005" = Table.TransformColumnTypes(#"Split Column by Delimiter 2005",{{"2005.OBS", type text}, {"2005.FLAG", type text}},"en-150"),
    #"Split Column by Delimiter 2006" = Table.SplitColumn(#"Changed Type 2005", "2006 ", Splitter.SplitTextByDelimiter(" ", QuoteStyle.Csv), {"2006.OBS", "2006.FLAG"}),
    #"Changed Type 2006" = Table.TransformColumnTypes(#"Split Column by Delimiter 2006",{{"2006.OBS", type text}, {"2006.FLAG", type text}},"en-150"),
    #"Split Column by Delimiter 2007" = Table.SplitColumn(#"Changed Type 2006", "2007 ", Splitter.SplitTextByDelimiter(" ", QuoteStyle.Csv), {"2007.OBS", "2007.FLAG"}),
    #"Changed Type 2007" = Table.TransformColumnTypes(#"Split Column by Delimiter 2007",{{"2007.OBS", type text}, {"2007.FLAG", type text}},"en-150"),
    #"Split Column by Delimiter 2008" = Table.SplitColumn(#"Changed Type 2007", "2008 ", Splitter.SplitTextByDelimiter(" ", QuoteStyle.Csv), {"2008.OBS", "2008.FLAG"}),
    #"Changed Type 2008" = Table.TransformColumnTypes(#"Split Column by Delimiter 2008",{{"2008.OBS", type text}, {"2008.FLAG", type text}},"en-150"),
    #"Split Column by Delimiter 2009" = Table.SplitColumn(#"Changed Type 2008", "2009 ", Splitter.SplitTextByDelimiter(" ", QuoteStyle.Csv), {"2009.OBS", "2009.FLAG"}),
    #"Changed Type 2009" = Table.TransformColumnTypes(#"Split Column by Delimiter 2009",{{"2009.OBS", type text}, {"2009.FLAG", type text}},"en-150"),
    #"Split Column by Delimiter 2010" = Table.SplitColumn(#"Changed Type 2009", "2010 ", Splitter.SplitTextByDelimiter(" ", QuoteStyle.Csv), {"2010.OBS", "2010.FLAG"}),
    #"Changed Type 2010" = Table.TransformColumnTypes(#"Split Column by Delimiter 2010",{{"2010.OBS", type text}, {"2010.FLAG", type text}},"en-150"),
    #"Split Column by Delimiter 2014" = Table.SplitColumn(#"Changed Type 2010", "2014 ", Splitter.SplitTextByDelimiter(" ", QuoteStyle.Csv), {"2014.OBS", "2014.FLAG"}),
    #"Changed Type 2014" = Table.TransformColumnTypes(#"Split Column by Delimiter 2014",{{"2014.OBS", type text}, {"2014.FLAG", type text}},"en-150")
in
    #"Changed Type 2014"


An alternative solution is to use the advanced editor to load content as TEXT in the first step and use a more advanced function to split by multiple delimiters in one go

let
    Source = Table.FromColumns({Lines.FromBinary(Web.Contents("https://ec.europa.eu/eurostat/api/dissemination/sdmx/2.1/data/ISOC_CI_ID_H/?format=TSV"), null, null, 65001)}),
    #"Split Column by Delimiter" = Table.SplitColumn(Source, "Column1", Splitter.SplitTextByAnyDelimiter({",", " ", "#(tab)"}, QuoteStyle.Csv)),
    #"Promoted Headers" = Table.PromoteHeaders(#"Split Column by Delimiter", [PromoteAllScalars=true])
in
    #"Promoted Headers"

TSV - looking up dimension position labels

Recommended way to add labelling on top of TSV data is to load it from the accompanying codelists. Power Query make this quite easy by allowing to combine several queries

First step is to define several queries to load the data and the necessary codelists

Then the data table must be combined with the FREQ dimension, the result of this combination must be combined with INDIC_IS dimension etc

To combine two queries, a new query must be created from Power Query Editor and select the column in each table to use to JOIN them. The Join Kind must be Left Outer to execute the lookup properly 

This will add a new column at the end of the table that should be edited to only display Column 2 of the codelist content to render the labels.

When all labels columns are created they can be moved to be next to the code column

All these steps are summarized in the following workbook and codeblock


// data
let
    Source = Table.FromColumns({Lines.FromBinary(Web.Contents("https://ec.europa.eu/eurostat/api/dissemination/sdmx/2.1/data/ISOC_CI_ID_H/?format=TSV"), null, null, 65001)}),
    #"Split Column by Delimiter" = Table.SplitColumn(Source, "Column1", Splitter.SplitTextByAnyDelimiter({",", "#(tab)"}, QuoteStyle.Csv)),
    #"Promoted Headers" = Table.PromoteHeaders(#"Split Column by Delimiter", [PromoteAllScalars=true])
in
    #"Promoted Headers"

// FREQ
let
    Source = Csv.Document(Web.Contents("https://ec.europa.eu/eurostat/api/dissemination/sdmx/2.1/codelist/ESTAT/FREQ?format=TSV"),[Delimiter="	", Columns=2, Encoding=65001, QuoteStyle=QuoteStyle.None]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}, {"Column2", type text}},"en-150")
in
    #"Changed Type"

// INDIC_IS
let
    Source = Csv.Document(Web.Contents("https://ec.europa.eu/eurostat/api/dissemination/sdmx/2.1/codelist/ESTAT/INDIC_IS?format=TSV"),[Delimiter="	", Columns=2, Encoding=65001, QuoteStyle=QuoteStyle.None]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}, {"Column2", type text}},"en-150")
in
    #"Changed Type"

// UNIT
let
    Source = Csv.Document(Web.Contents("https://ec.europa.eu/eurostat/api/dissemination/sdmx/2.1/codelist/ESTAT/UNIT?format=TSV"),[Delimiter="	", Columns=2, Encoding=65001, QuoteStyle=QuoteStyle.None]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}, {"Column2", type text}},"en-150")
in
    #"Changed Type"

// HHTYP
let
    Source = Csv.Document(Web.Contents("https://ec.europa.eu/eurostat/api/dissemination/sdmx/2.1/codelist/ESTAT/HHTYP?format=TSV"),[Delimiter="	", Columns=2, Encoding=65001, QuoteStyle=QuoteStyle.None]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}, {"Column2", type text}},"en-150")
in
    #"Changed Type"

// GEO
let
    Source = Csv.Document(Web.Contents("https://ec.europa.eu/eurostat/api/dissemination/sdmx/2.1/codelist/ESTAT/GEO?format=TSV"),[Delimiter="	", Columns=2, Encoding=65001, QuoteStyle=QuoteStyle.None]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}, {"Column2", type text}},"en-150")
in
    #"Changed Type"

// data+FREQ
let
    Source = Table.NestedJoin(data, {"freq"}, FREQ, {"Column1"}, "FREQ.1", JoinKind.LeftOuter),
    #"Expanded FREQ.1" = Table.ExpandTableColumn(Source, "FREQ.1", {"Column2"}, {"FREQ.1.Column2"})
in
    #"Expanded FREQ.1"

// data+FREQ+INDIC_IS
let
    Source = Table.NestedJoin(#"data+FREQ", {"indic_is"}, INDIC_IS, {"Column1"}, "INDIC_IS.1", JoinKind.LeftOuter),
    #"Expanded INDIC_IS.1" = Table.ExpandTableColumn(Source, "INDIC_IS.1", {"Column2"}, {"INDIC_IS.1.Column2"})
in
    #"Expanded INDIC_IS.1"

// data+FREQ+INDIC_IS+UNIT
let
    Source = Table.NestedJoin(#"data+FREQ+INDIC_IS", {"unit"}, UNIT, {"Column1"}, "UNIT.1", JoinKind.LeftOuter),
    #"Expanded UNIT.1" = Table.ExpandTableColumn(Source, "UNIT.1", {"Column2"}, {"UNIT.1.Column2"})
in
    #"Expanded UNIT.1"

// data+FREQ+INDIC_IS+UNIT+HHTYP
let
    Source = Table.NestedJoin(#"data+FREQ+INDIC_IS+UNIT", {"hhtyp"}, HHTYP, {"Column1"}, "HHTYP.1", JoinKind.LeftOuter),
    #"Expanded HHTYP.1" = Table.ExpandTableColumn(Source, "HHTYP.1", {"Column2"}, {"HHTYP.1.Column2"})
in
    #"Expanded HHTYP.1"

// data+LABELS
let
    Source = Table.NestedJoin(#"data+FREQ+INDIC_IS+UNIT+HHTYP", {"geo\TIME_PERIOD"}, GEO, {"Column1"}, "GEO", JoinKind.LeftOuter),
    #"Expanded GEO" = Table.ExpandTableColumn(Source, "GEO", {"Column2"}, {"GEO.Column2"}),
    #"Reordered Columns" = Table.ReorderColumns(#"Expanded GEO",{"freq", "FREQ.1.Column2", "indic_is", "INDIC_IS.1.Column2", "unit", "UNIT.1.Column2", "hhtyp", "HHTYP.1.Column2", "geo\TIME_PERIOD", "GEO.Column2", "2002 ", "2003 ", "2004 ", "2005 ", "2006 ", "2007 ", "2008 ", "2009 ", "2010 ", "2014 "})
in
    #"Reordered Columns"

Share data queries

Option 1 : Copy several queries between workbooks and share workbook files

Starting from a blank worksheet, it is possible to re-use existing queries from another workbook


Create a new worksheet

Reveal the Queries & Connections  

Right-click under the queries pane and click Paste

This will automatically load the last query into the current worksheet 


Option 2 : Export and share ODC files

When a query is completed, it is possible to select the final query to be exported as an ODC file as the one below

TSV-ISOC-WITH-LABELS.odc

This file can be simply double-clicked on windows and should open in Excel directly.

Otherwise it can be loaded as an existing connection

This menu automatically list connections present in the "My Data Sources" user folder C:\Users\user_id\Documents\My Data Sources

Files in another location can be opened with the Browser for more button.

More information on ODC files from Microsoft documentation


Lookup dataset metadata

As TSV format does not include metadata of the dataset, an additional query to the JSON Dataflow can be executed to retrieve the dataset label and the annotations holding the metadata.

A new query can be defined as below and the result loaded in a dedicated worksheet.

let
    Source = Json.Document(Web.Contents("https://ec.europa.eu/eurostat/api/dissemination/sdmx/3.0/structure/dataflow/ESTAT/ISOC_CI_ID_H?format=JSON&lang=en&compress=false")),
    #"Converted to Table" = Record.ToTable(Source),
    #"Pivoted Column" = Table.Pivot(#"Converted to Table", List.Distinct(#"Converted to Table"[Name]), "Name", "Value"),
    #"Expanded extension" = Table.ExpandRecordColumn(#"Pivoted Column", "extension", {"lang", "id", "agencyId", "version", "datastructure", "annotation"}, {"extension.lang", "extension.id", "extension.agencyId", "extension.version", "extension.datastructure", "extension.annotation"}),
    #"Expanded extension.annotation" = Table.ExpandListColumn(#"Expanded extension", "extension.annotation"),
    #"Expanded extension.annotation1" = Table.ExpandRecordColumn(#"Expanded extension.annotation", "extension.annotation", {"type", "title", "date", "href"}, {"extension.annotation.type", "extension.annotation.title", "extension.annotation.date", "extension.annotation.href"})
in
    #"Expanded extension.annotation1"




  • No labels
_