Introduction
With Power Query, Excel can perform ETL tasks on external data (ETL = Extract, transform, load)
About Power Query (quote from Microsoft documention)
With Power Query (known as Get & Transform in Excel), you can import or connect to external data, and then shape that data, for example remove a column, change a data type, or merge tables, in ways that meet your needs. Then, you can load your query into Excel to create charts and reports. Periodically, you can refresh the data to make it up to date. Power Query is available on three Excel applications, Excel for Windows, Excel for Mac and Excel for the Web.
The API textual data formats (SDMX-CSV or TSV) work well as a Web data source to such ETL tool.
About JSON format
JSON data provided by Eurostat is following JSON-STAT format (https://json-stat.org/format ) that is tailored for Web visualization as it requires a toolkit for its processing, the main toolkit is in Javascript, and there is none for Excel.
Therefore such JSON data are not directly usable within Excel and is out of scope of our support.
This guide propose a few examples of getting started with loading data into Excel.
About data refresh
Connecting directly to the API will ensure having freshest available data, as content can be fully refreshed and the defined transformation pipeline in Excel Power Query will be applied again.
Further in this page, there is an example on how the refresh date can be visible.
In case you need to work, not with the lastest data, but a specific data extraction. it is recommended to work with files you downloaded from the API and organised in your storage that can eventually be loaded CSV File within Excel.
In this case the extraction date could be set as the file modified date as provided by the operating system. Please see below on how to retrieve this information from Excel Power Query
Read file modified date time
Please create first a blank query to hold the path to the csv file on your disk
- Name = FilePath
- Value= C:\Test Folder\Source File.csv
Then a second query with the following setup to extract the modified timestamp
- Name = Date Time Modified
Formula
let //Get the file path FilePathString = FilePath, //Get length of file path FilePathLength = Text.Length(FilePathString), //Get last slash position LastSlash = Text.PositionOf(FilePathString,"\",Occurrence.Last), //Get the folder path FolderPath = Text.Start(FilePathString,LastSlash + 1), //Get the file name FileName = Text.End(FilePathString,FilePathLength - LastSlash - 1), //Get the date modified DateModified = Folder.Contents(FolderPath){[Name=FileName]}[Date modified] in DateModified
Starting from SDMX-CSV 2.0 data
General information
It is possible to build an API link to download SDMX-CSV-2.0 data for a dataset via the Databrowser download options menu.
In following example a complete dataset is used so the URL syntax is simple.
Step by step basic example
Working with the ISOC_CI_ID_H dataset download link on SDMX 3.0 API with parameter to retrieve SDMX-CSV 2.0 data
Starting from a blank worksheet, it is possible to load the data for the Data menu with the From Web menu item
It opens a window where to enter the URL
This load the Power Query Editor with an initial Source step configured
= Web.Contents("https://ec.europa.eu/eurostat/api/dissemination/sdmx/3.0/data/dataflow/ESTAT/ISOC_CI_ID_H/1.0?format=csvdata&formatVersion=2.0&compress=false")
While this guide try to minimize the need to edit the formula as much as possible, for daily use it is recommended to use compressed input by
- removing the &compress=false parameter in the URL
- wrapping the Web.Contents into a Binary.Decompress() method for GZIP uncompression
= Binary.Decompress(Web.Contents("https://ec.europa.eu/eurostat/api/dissemination/sdmx/3.0/data/dataflow/ESTAT/ISOC_CI_ID_H/1.0?format=csvdata&formatVersion=2.0"),Compression.GZip)
It is recommended to change the query name to use the dataset code
Right click on the file icon named ec.europa.eu and select CSV option
This automatically apply the following modification
Modify the Source step to split by comma Split by Comma
= Csv.Document(Web.Contents("https://ec.europa.eu/eurostat/api/dissemination/sdmx/3.0/data/dataflow/ESTAT/ISOC_CI_ID_H/1.0?format=csvdata&formatVersion=2.0&compress=false"),[Delimiter=",", Columns=10, Encoding=1252, QuoteStyle=QuoteStyle.None])
Add a Promote headers step
= Table.PromoteHeaders(Source, [PromoteAllScalars=true])
Add a Changed Type step
= Table.TransformColumnTypes(#"Promoted Headers",{{"STRUCTURE", type text}, {"STRUCTURE_ID", type text}, {"freq", type text}, {"indic_is", type text}, {"unit", type text}, {"hhtyp", type text}, {"geo", type text}, {"TIME_PERIOD", Int64.Type}, {"OBS_VALUE", Int64.Type}, {"OBS_FLAG", type text}})
in the Changed Type step, converting to number is choosing a number type based on the first value read, so it is possible that Integer type would be used by default
In case your data report error it means it contains decimal number that your Excel regional settings cannot interpret as such
You must ensure that type number is defined for OBS_VALUE and the English Europe is used as regional settings to use dot as a separator
Changed Type step formula should be modified to be similar to the following :
= Table.TransformColumnTypes(#"Promoted Headers",{{"STRUCTURE", type text}, {"STRUCTURE_ID", type text}, {"freq", type text}, [...OTHER_DIMENSIONS...], {"TIME_PERIOD", Int64.Type}, {"OBS_VALUE", type number}, {"OBS_FLAG", type text}}, "en-150")
From this point, data is already usable and can be loaded into excel via the Close and Load button.
The Excel workbook can be further customised to your need, the data source can be refreshed via the Refresh button of Excel.
Query can be reopened to apply further configuration, you may decide to remove the unnecessary first column as you may save your excel workbook with the code of the dataset in the filename.
Complete result of this step as visible in the Advanced Editor
This code can be copy pasted
let Source = Binary.Decompress(Web.Contents("https://ec.europa.eu/eurostat/api/dissemination/sdmx/3.0/data/dataflow/ESTAT/ISOC_CI_ID_H/1.0?format=csvdata&formatVersion=2.0"),Compression.GZip), #"Imported CSV" = Csv.Document(Source,[Delimiter=",", Columns=10, Encoding=1252, QuoteStyle=QuoteStyle.None]), #"Promoted Headers" = Table.PromoteHeaders(#"Imported CSV", [PromoteAllScalars=true]), #"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"STRUCTURE", type text}, {"STRUCTURE_ID", type text}, {"freq", type text}, {"indic_is", type text}, {"unit", type text}, {"hhtyp", type text}, {"geo", type text}, {"TIME_PERIOD", Int64.Type}, {"OBS_VALUE", Int64.Type}, {"OBS_FLAG", type text}}, "en-150"), #"Removed Columns" = Table.RemoveColumns(#"Changed Type",{"STRUCTURE", "STRUCTURE_ID"}) in #"Removed Columns"
Preparing a query from an example code block in this page
Starting from a blank worksheet, in the Data ribbon menu, in Get Data > Launch Power Query Editor
In New Source menu > Other Sources > Blank Query
Finally opening the Advanced Editor
=>
Then simply paste an example of a query definition taken from this page to have it loaded into excel.
Extension on the example : working with labels and last update date from the data file
There are 2 options to enable on the initial data query that could enpower further data rendering in Excel
- &labels=name => Adding this option will add an extra column with the label for each corresponding codified column
- an extra column with the dataset label will be added next to the dataset identifier
- an extra column with the dimension label as header and the position label as values will be added for each dimension
- &returnLastUpdateDate=true => Adding this option will add a last column names "LAST UPDATE" with the timestamp of the data last update
- By design in SDMX there could be a different date per time-series or observation, for Eurostat this date is the same for the complete dataset but the value will be repeated on each row
If you edit the URL in the Source step with options adding extra column it is best to restart from zero as some steps will not be complete (especially the Source and Changed Type one)
For example the number or columns to render is a number that is set during the very fist CSV transformation
It is now possible to get rid of unnecessary columns to keep the columns of importance either
- during the data query definition by removing the column
- by hiding them after data is loaded into Excel worksheet
Example : building a summary sheet on the dataset from the first row of data
Having loaded the data and renamed the sheet as data, then created a Summary sheet.
It is possible to reference content from the data sheet with the standard syntax sheetname!$COLUMN_LETTER$ROW_NUMBER
These columns could then be hidden in the data sheet to not hinder data viewing
Information on Date Time Refreshed
Information about when data was last refreshed is available from the Queries & Connections sidebar as illustrated below
Refreshing content can be done from here or from the Data ribbon.
To add the refresh time information to your summary sheet, the best option is to define a new Query with the following setup
- Name = Date Time Refreshed
- Formula = DateTime.LocalNow()
Then load this query into the information sheet you are building
Example : building a series key column instead of having one column per dimension code
In case you want to visualise and filter the observations for a one or more time-series.
It could be useful to have a series keys value available.
To do so, please ensure your query is opened in Powery Query Editor
Then select the coded dimension columns and select the Merge Columns in the Transform ribbon and select dot character as separator, this would using using the series key in SDMX REST API calls
This column can be moved by Drag and Drop before the others, this action is recorded as a ReoderColumns action in the query definition
Complete result of this step as visible in the Advanced Editor
let Source = Binary.Decompress(Web.Contents("https://ec.europa.eu/eurostat/api/dissemination/sdmx/3.0/data/dataflow/ESTAT/ISOC_CI_ID_H/1.0?format=csvdata&formatVersion=2.0&labels=name&returnLastUpdateDate=true"),Compression.GZip), #"Imported CSV" = Csv.Document(Source,[Delimiter=",", Columns=20, Encoding=1252, QuoteStyle=QuoteStyle.None]), #"Promoted Headers" = Table.PromoteHeaders(#"Imported CSV", [PromoteAllScalars=true]), #"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"STRUCTURE", type text}, {"STRUCTURE_ID", type text}, {"STRUCTURE_NAME", type text}, {"freq", type text}, {"Time frequency", type text}, {"indic_is", type text}, {"Information society indicator", type text}, {"unit", type text}, {"Unit of measure", type text}, {"hhtyp", type text}, {"Type of household", type text}, {"geo", type text}, {"Geopolitical entity (reporting)", type text}, {"TIME_PERIOD", Int64.Type}, {"Time", type text}, {"OBS_VALUE", Int64.Type}, {"Observation value", type text}, {"OBS_FLAG", type text}, {"Observation status (Flag)", type text}, {"LAST UPDATE", type datetime}}, "en-150"), #"Removed Columns" = Table.RemoveColumns(#"Changed Type",{"STRUCTURE", "Time", "Observation value", "Observation status (Flag)"}), #"Merged Columns" = Table.CombineColumns(#"Removed Columns",{"freq", "indic_is", "unit", "hhtyp", "geo"},Combiner.CombineTextByDelimiter(".", QuoteStyle.None),"Series Keys"), #"Reordered Columns" = Table.ReorderColumns(#"Merged Columns",{"STRUCTURE_ID", "STRUCTURE_NAME", "Series Keys", "Time frequency", "Information society indicator", "Unit of measure", "Type of household", "Geopolitical entity (reporting)", "TIME_PERIOD", "OBS_VALUE", "OBS_FLAG", "LAST UPDATE"}) in #"Reordered Columns"
Extension on the example : pivoting data
As the SDMX-CSV display one statistical observation per line, it can be necessary to build more column from another dimension.
TIME_PERIOD is the most common dimension to PIVOT one
Starting from the basic example, of having the data loaded as CSV columns, to apply pivoting, the following steps are necessary
Merge OBS_FLAG column with OBS_VALUE column separated by a space.
Select TIME_PERIOD column and "Pivot column", then select previously generated column and advanced option to not aggregate
This operation can take a few minutes.
As you may have spotted the columns are not ordered
You need to modify the formula to include a List.Sort to control the order of the column
= Table.Pivot(Table.TransformColumnTypes(#"Merged Columns", {{"TIME_PERIOD", type text}}, "fr-BE"), List.Sort(List.Distinct(Table.TransformColumnTypes(#"Merged Columns", {{"TIME_PERIOD", type text}}, "fr-BE")[TIME_PERIOD]),Order.Ascending), "TIME_PERIOD", "Obs")
or with Descending option
= Table.Pivot(Table.TransformColumnTypes(#"Merged Columns", {{"TIME_PERIOD", type text}}, "fr-BE"), List.Sort(List.Distinct(Table.TransformColumnTypes(#"Merged Columns", {{"TIME_PERIOD", type text}}, "fr-BE")[TIME_PERIOD]),Order.Descending), "TIME_PERIOD", "Obs")
In case you prefer to have value and attributes in separated column, you could split back the column based on the space separator
As this operation must be done on each column, it may be easier to manipulate the advanced query text than use the interface
let Source = Binary.Decompress(Web.Contents("https://ec.europa.eu/eurostat/api/dissemination/sdmx/3.0/data/dataflow/ESTAT/ISOC_CI_ID_H/1.0?format=csvdata&formatVersion=2.0"),Compression.GZip), #"Imported CSV" = Csv.Document(Source,[Delimiter=",", Columns=10, Encoding=1252, QuoteStyle=QuoteStyle.None]), #"Promoted Headers" = Table.PromoteHeaders(#"Imported CSV", [PromoteAllScalars=true]), #"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"STRUCTURE", type text}, {"STRUCTURE_ID", type text}, {"freq", type text}, {"indic_is", type text}, {"unit", type text}, {"hhtyp", type text}, {"geo", type text}, {"TIME_PERIOD", Int64.Type}, {"OBS_VALUE", Int64.Type}, {"OBS_FLAG", type text}},"en-150"), #"Removed Columns" = Table.RemoveColumns(#"Changed Type",{"STRUCTURE", "STRUCTURE_ID"}), #"Merged Columns" = Table.CombineColumns(Table.TransformColumnTypes(#"Removed Columns", {{"OBS_VALUE", type text}}, "fr-BE"),{"OBS_VALUE", "OBS_FLAG"},Combiner.CombineTextByDelimiter(" ", QuoteStyle.None),"Obs"), #"Pivoted Column" = Table.Pivot(Table.TransformColumnTypes(#"Merged Columns", {{"TIME_PERIOD", type text}}, "fr-BE"), List.Sort(List.Distinct(Table.TransformColumnTypes(#"Merged Columns", {{"TIME_PERIOD", type text}}, "fr-BE")[TIME_PERIOD]),Order.Ascending), "TIME_PERIOD", "Obs"), #"Split Column by Delimiter 2002" = Table.SplitColumn(#"Pivoted Column", "2002", Splitter.SplitTextByDelimiter(" ", QuoteStyle.Csv), {"2002.OBS", "2002.FLAG"}), #"Changed Type 2002" = Table.TransformColumnTypes(#"Split Column by Delimiter 2002",{{"2002.OBS", Int64.Type}, {"2002.FLAG", type text}},"en-150"), #"Split Column by Delimiter 2003" = Table.SplitColumn(#"Changed Type 2002", "2003", Splitter.SplitTextByDelimiter(" ", QuoteStyle.Csv), {"2003.OBS", "2003.FLAG"}), #"Changed Type 2003" = Table.TransformColumnTypes(#"Split Column by Delimiter 2003",{{"2003.OBS", Int64.Type}, {"2003.FLAG", type text}},"en-150"), #"Split Column by Delimiter 2004" = Table.SplitColumn(#"Changed Type 2003", "2004", Splitter.SplitTextByDelimiter(" ", QuoteStyle.Csv), {"2004.OBS", "2004.FLAG"}), #"Changed Type 2004" = Table.TransformColumnTypes(#"Split Column by Delimiter 2004",{{"2004.OBS", Int64.Type}, {"2004.FLAG", type text}},"en-150"), #"Split Column by Delimiter 2005" = Table.SplitColumn(#"Changed Type 2004", "2005", Splitter.SplitTextByDelimiter(" ", QuoteStyle.Csv), {"2005.OBS", "2005.FLAG"}), #"Changed Type 2005" = Table.TransformColumnTypes(#"Split Column by Delimiter 2005",{{"2005.OBS", Int64.Type}, {"2005.FLAG", type text}},"en-150"), #"Split Column by Delimiter 2006" = Table.SplitColumn(#"Changed Type 2005", "2006", Splitter.SplitTextByDelimiter(" ", QuoteStyle.Csv), {"2006.OBS", "2006.FLAG"}), #"Changed Type 2006" = Table.TransformColumnTypes(#"Split Column by Delimiter 2006",{{"2006.OBS", Int64.Type}, {"2006.FLAG", type text}},"en-150"), #"Split Column by Delimiter 2007" = Table.SplitColumn(#"Changed Type 2006", "2007", Splitter.SplitTextByDelimiter(" ", QuoteStyle.Csv), {"2007.OBS", "2007.FLAG"}), #"Changed Type 2007" = Table.TransformColumnTypes(#"Split Column by Delimiter 2007",{{"2007.OBS", Int64.Type}, {"2007.FLAG", type text}},"en-150"), #"Split Column by Delimiter 2008" = Table.SplitColumn(#"Changed Type 2007", "2008", Splitter.SplitTextByDelimiter(" ", QuoteStyle.Csv), {"2008.OBS", "2008.FLAG"}), #"Changed Type 2008" = Table.TransformColumnTypes(#"Split Column by Delimiter 2008",{{"2008.OBS", Int64.Type}, {"2008.FLAG", type text}},"en-150"), #"Split Column by Delimiter 2009" = Table.SplitColumn(#"Changed Type 2008", "2009", Splitter.SplitTextByDelimiter(" ", QuoteStyle.Csv), {"2009.OBS", "2009.FLAG"}), #"Changed Type 2009" = Table.TransformColumnTypes(#"Split Column by Delimiter 2009",{{"2009.OBS", Int64.Type}, {"2009.FLAG", type text}},"en-150"), #"Split Column by Delimiter 2010" = Table.SplitColumn(#"Changed Type 2009", "2010", Splitter.SplitTextByDelimiter(" ", QuoteStyle.Csv), {"2010.OBS", "2010.FLAG"}), #"Changed Type 2010" = Table.TransformColumnTypes(#"Split Column by Delimiter 2010",{{"2010.OBS", Int64.Type}, {"2010.FLAG", type text}},"en-150"), #"Split Column by Delimiter 2014" = Table.SplitColumn(#"Changed Type 2010", "2014", Splitter.SplitTextByDelimiter(" ", QuoteStyle.Csv), {"2014.OBS", "2014.FLAG"}), #"Changed Type 2014" = Table.TransformColumnTypes(#"Split Column by Delimiter 2014",{{"2014.OBS", Int64.Type}, {"2014.FLAG", type text}},"en-150") in #"Changed Type 2014"
Starting from TSV data
General information
It is possible to build an API link to download TSV 2.0 data for a dataset via the Databrowser download options menu.
In following example a complete dataset is used so the URL syntax is simple.
Step by step basic example
Working with the ISOC_CI_ID_H dataset download link on SDMX 3.0 API with parameter to retrieve TSV data
Starting from a blank worksheet, it is possible to load the data for the Data menu with the From Web menu item
It opens a window where to enter the URL
This load the Power Query Editor with an initial Source step configured
= Web.Contents("https://ec.europa.eu/eurostat/api/dissemination/sdmx/3.0/data/dataflow/ESTAT/ISOC_CI_ID_H/1.0?format=TSV&compress=false ")
While this guide try to minimize the need to edit the formula as much as possible, for daily use it is recommended to use compressed input by
- removing the &compress=false parameter in the URL
- wrapping the Web.Contents into a Binary.Decompress() method for GZIP uncompression
= Binary.Decompress(Web.Contents("https://ec.europa.eu/eurostat/api/dissemination/sdmx/3.0/data/dataflow/ESTAT/ISOC_CI_ID_H/1.0?format=TSV"),Compression.GZip)
It is recommended to change the query name to use the dataset code
Right click on the file icon named ec.europa.eu and select CSV option
This automatically apply the following modification
Modify the Source step to split by TAB
= Csv.Document(Web.Contents("https://ec.europa.eu/eurostat/api/dissemination/sdmx/3.0/data/dataflow/ESTAT/ISOC_CI_ID_H/1.0?format=TSV&compress=false"),[Delimiter=" ", Columns=11, Encoding=65001, QuoteStyle=QuoteStyle.None])
Add a Changed Type step
= Table.TransformColumnTypes(Source,{{"Column1", type text}, {"Column2", type text}, {"Column3", type text}, {"Column4", type text}, {"Column5", type text}, {"Column6", type text}, {"Column7", type text}, {"Column8", type text}, {"Column9", type text}, {"Column10", type text}, {"Column11", type text}},"en-150")
From this point, data is already usable if it is ok to work with a single time series columns, otherwise the first column can be split by comma
As you may have noticed Excel did not detect the first line as an header row
- Delete the Changed Type step(s) from the Applied Steps list
- in the Transform Ribbon, click on Use First Row as Header(s)
Finally it can be loaded into excel via the Close and Load button.
The Excel workbook can be further customised to your need, the data source can be refreshed via the Refresh button of Excel.
Query can be reopened to apply further configuration.
Complete result of this step as visible in the Advanced Editor
This code can be copy pasted
let Source = Csv.Document(Web.Contents("https://ec.europa.eu/eurostat/api/dissemination/sdmx/3.0/data/dataflow/ESTAT/ISOC_CI_ID_H/1.0?format=TSV&compress=false"),[Delimiter="#(tab)", Columns=11, Encoding=65001, QuoteStyle=QuoteStyle.None]), #"Split Column by Delimiter" = Table.SplitColumn(Source, "Column1", Splitter.SplitTextByDelimiter(",", QuoteStyle.Csv), {"Column1.1", "Column1.2", "Column1.3", "Column1.4", "Column1.5"}), #"Promoted Headers" = Table.PromoteHeaders(#"Split Column by Delimiter", [PromoteAllScalars=true]), #"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"freq", type text}, {"indic_is", type text}, {"unit", type text}, {"hhtyp", type text}, {"geo\TIME_PERIOD", type text}, {"2002 ", type text}, {"2003 ", type text}, {"2004 ", type text}, {"2005 ", type text}, {"2006 ", type text}, {"2007 ", type text}, {"2008 ", type text}, {"2009 ", type text}, {"2010 ", type text}, {"2014 ", type text}},"en-150") in #"Changed Type"
TSV - separating value and flags
As done in the SDMX-CSV example it could be possible to edit each data column to split further by space.
let Source = Csv.Document(Web.Contents("https://ec.europa.eu/eurostat/api/dissemination/sdmx/3.0/data/dataflow/ESTAT/ISOC_CI_ID_H/1.0?format=TSV&compress=false"),[Delimiter="#(tab)", Columns=11, Encoding=65001, QuoteStyle=QuoteStyle.None]), #"Split Column by Delimiter" = Table.SplitColumn(Source, "Column1", Splitter.SplitTextByDelimiter(",", QuoteStyle.Csv), {"Column1.1", "Column1.2", "Column1.3", "Column1.4", "Column1.5"}), #"Promoted Headers" = Table.PromoteHeaders(#"Split Column by Delimiter", [PromoteAllScalars=true]), #"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"freq", type text}, {"indic_is", type text}, {"unit", type text}, {"hhtyp", type text}, {"geo\TIME_PERIOD", type text}, {"2002 ", type text}, {"2003 ", type text}, {"2004 ", type text}, {"2005 ", type text}, {"2006 ", type text}, {"2007 ", type text}, {"2008 ", type text}, {"2009 ", type text}, {"2010 ", type text}, {"2014 ", type text}},"en-150"), #"Split Column by Delimiter 2002" = Table.SplitColumn(#"Changed Type", "2002 ", Splitter.SplitTextByDelimiter(" ", QuoteStyle.Csv), {"2002.OBS", "2002.FLAG"}), #"Changed Type 2002" = Table.TransformColumnTypes(#"Split Column by Delimiter 2002",{{"2002.OBS", type text}, {"2002.FLAG", type text}},"en-150"), #"Split Column by Delimiter 2003" = Table.SplitColumn(#"Changed Type 2002", "2003 ", Splitter.SplitTextByDelimiter(" ", QuoteStyle.Csv), {"2003.OBS", "2003.FLAG"}), #"Changed Type 2003" = Table.TransformColumnTypes(#"Split Column by Delimiter 2003",{{"2003.OBS", type text}, {"2003.FLAG", type text}},"en-150"), #"Split Column by Delimiter 2004" = Table.SplitColumn(#"Changed Type 2003", "2004 ", Splitter.SplitTextByDelimiter(" ", QuoteStyle.Csv), {"2004.OBS", "2004.FLAG"}), #"Changed Type 2004" = Table.TransformColumnTypes(#"Split Column by Delimiter 2004",{{"2004.OBS", type text}, {"2004.FLAG", type text}},"en-150"), #"Split Column by Delimiter 2005" = Table.SplitColumn(#"Changed Type 2004", "2005 ", Splitter.SplitTextByDelimiter(" ", QuoteStyle.Csv), {"2005.OBS", "2005.FLAG"}), #"Changed Type 2005" = Table.TransformColumnTypes(#"Split Column by Delimiter 2005",{{"2005.OBS", type text}, {"2005.FLAG", type text}},"en-150"), #"Split Column by Delimiter 2006" = Table.SplitColumn(#"Changed Type 2005", "2006 ", Splitter.SplitTextByDelimiter(" ", QuoteStyle.Csv), {"2006.OBS", "2006.FLAG"}), #"Changed Type 2006" = Table.TransformColumnTypes(#"Split Column by Delimiter 2006",{{"2006.OBS", type text}, {"2006.FLAG", type text}},"en-150"), #"Split Column by Delimiter 2007" = Table.SplitColumn(#"Changed Type 2006", "2007 ", Splitter.SplitTextByDelimiter(" ", QuoteStyle.Csv), {"2007.OBS", "2007.FLAG"}), #"Changed Type 2007" = Table.TransformColumnTypes(#"Split Column by Delimiter 2007",{{"2007.OBS", type text}, {"2007.FLAG", type text}},"en-150"), #"Split Column by Delimiter 2008" = Table.SplitColumn(#"Changed Type 2007", "2008 ", Splitter.SplitTextByDelimiter(" ", QuoteStyle.Csv), {"2008.OBS", "2008.FLAG"}), #"Changed Type 2008" = Table.TransformColumnTypes(#"Split Column by Delimiter 2008",{{"2008.OBS", type text}, {"2008.FLAG", type text}},"en-150"), #"Split Column by Delimiter 2009" = Table.SplitColumn(#"Changed Type 2008", "2009 ", Splitter.SplitTextByDelimiter(" ", QuoteStyle.Csv), {"2009.OBS", "2009.FLAG"}), #"Changed Type 2009" = Table.TransformColumnTypes(#"Split Column by Delimiter 2009",{{"2009.OBS", type text}, {"2009.FLAG", type text}},"en-150"), #"Split Column by Delimiter 2010" = Table.SplitColumn(#"Changed Type 2009", "2010 ", Splitter.SplitTextByDelimiter(" ", QuoteStyle.Csv), {"2010.OBS", "2010.FLAG"}), #"Changed Type 2010" = Table.TransformColumnTypes(#"Split Column by Delimiter 2010",{{"2010.OBS", type text}, {"2010.FLAG", type text}},"en-150"), #"Split Column by Delimiter 2014" = Table.SplitColumn(#"Changed Type 2010", "2014 ", Splitter.SplitTextByDelimiter(" ", QuoteStyle.Csv), {"2014.OBS", "2014.FLAG"}), #"Changed Type 2014" = Table.TransformColumnTypes(#"Split Column by Delimiter 2014",{{"2014.OBS", type text}, {"2014.FLAG", type text}},"en-150") in #"Changed Type 2014"
An alternative solution is to use the advanced editor to load content as TEXT in the first step and use a more advanced function to split by multiple delimiters in one go
let Source = Table.FromColumns({Lines.FromBinary(Web.Contents("https://ec.europa.eu/eurostat/api/dissemination/sdmx/2.1/data/ISOC_CI_ID_H/?format=TSV"), null, null, 65001)}), #"Split Column by Delimiter" = Table.SplitColumn(Source, "Column1", Splitter.SplitTextByAnyDelimiter({",", " ", "#(tab)"}, QuoteStyle.Csv)), #"Promoted Headers" = Table.PromoteHeaders(#"Split Column by Delimiter", [PromoteAllScalars=true]) in #"Promoted Headers"
TSV - looking up dimension position labels
Recommended way to add labelling on top of TSV data is to load it from the accompanying codelists. Power Query make this quite easy by allowing to combine several queries
First step is to define several queries to load the data and the necessary codelists
Then the data table must be combined with the FREQ dimension, the result of this combination must be combined with INDIC_IS dimension etc
To combine two queries, a new query must be created from Power Query Editor and select the column in each table to use to JOIN them. The Join Kind must be Left Outer to execute the lookup properly
This will add a new column at the end of the table that should be edited to only display Column 2 of the codelist content to render the labels.
When all labels columns are created they can be moved to be next to the code column
All these steps are summarized in the following workbook and codeblock
// data let Source = Table.FromColumns({Lines.FromBinary(Web.Contents("https://ec.europa.eu/eurostat/api/dissemination/sdmx/2.1/data/ISOC_CI_ID_H/?format=TSV"), null, null, 65001)}), #"Split Column by Delimiter" = Table.SplitColumn(Source, "Column1", Splitter.SplitTextByAnyDelimiter({",", "#(tab)"}, QuoteStyle.Csv)), #"Promoted Headers" = Table.PromoteHeaders(#"Split Column by Delimiter", [PromoteAllScalars=true]) in #"Promoted Headers" // FREQ let Source = Csv.Document(Web.Contents("https://ec.europa.eu/eurostat/api/dissemination/sdmx/2.1/codelist/ESTAT/FREQ?format=TSV"),[Delimiter="#(tab)", Columns=2, Encoding=65001, QuoteStyle=QuoteStyle.None]), #"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}, {"Column2", type text}},"en-150") in #"Changed Type" // INDIC_IS let Source = Csv.Document(Web.Contents("https://ec.europa.eu/eurostat/api/dissemination/sdmx/2.1/codelist/ESTAT/INDIC_IS?format=TSV"),[Delimiter="#(tab)", Columns=2, Encoding=65001, QuoteStyle=QuoteStyle.None]), #"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}, {"Column2", type text}},"en-150") in #"Changed Type" // UNIT let Source = Csv.Document(Web.Contents("https://ec.europa.eu/eurostat/api/dissemination/sdmx/2.1/codelist/ESTAT/UNIT?format=TSV"),[Delimiter="#(tab)", Columns=2, Encoding=65001, QuoteStyle=QuoteStyle.None]), #"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}, {"Column2", type text}},"en-150") in #"Changed Type" // HHTYP let Source = Csv.Document(Web.Contents("https://ec.europa.eu/eurostat/api/dissemination/sdmx/2.1/codelist/ESTAT/HHTYP?format=TSV"),[Delimiter="#(tab)", Columns=2, Encoding=65001, QuoteStyle=QuoteStyle.None]), #"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}, {"Column2", type text}},"en-150") in #"Changed Type" // GEO let Source = Csv.Document(Web.Contents("https://ec.europa.eu/eurostat/api/dissemination/sdmx/2.1/codelist/ESTAT/GEO?format=TSV"),[Delimiter="#(tab)", Columns=2, Encoding=65001, QuoteStyle=QuoteStyle.None]), #"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}, {"Column2", type text}},"en-150") in #"Changed Type" // data+FREQ let Source = Table.NestedJoin(data, {"freq"}, FREQ, {"Column1"}, "FREQ.1", JoinKind.LeftOuter), #"Expanded FREQ.1" = Table.ExpandTableColumn(Source, "FREQ.1", {"Column2"}, {"FREQ.1.Column2"}) in #"Expanded FREQ.1" // data+FREQ+INDIC_IS let Source = Table.NestedJoin(#"data+FREQ", {"indic_is"}, INDIC_IS, {"Column1"}, "INDIC_IS.1", JoinKind.LeftOuter), #"Expanded INDIC_IS.1" = Table.ExpandTableColumn(Source, "INDIC_IS.1", {"Column2"}, {"INDIC_IS.1.Column2"}) in #"Expanded INDIC_IS.1" // data+FREQ+INDIC_IS+UNIT let Source = Table.NestedJoin(#"data+FREQ+INDIC_IS", {"unit"}, UNIT, {"Column1"}, "UNIT.1", JoinKind.LeftOuter), #"Expanded UNIT.1" = Table.ExpandTableColumn(Source, "UNIT.1", {"Column2"}, {"UNIT.1.Column2"}) in #"Expanded UNIT.1" // data+FREQ+INDIC_IS+UNIT+HHTYP let Source = Table.NestedJoin(#"data+FREQ+INDIC_IS+UNIT", {"hhtyp"}, HHTYP, {"Column1"}, "HHTYP.1", JoinKind.LeftOuter), #"Expanded HHTYP.1" = Table.ExpandTableColumn(Source, "HHTYP.1", {"Column2"}, {"HHTYP.1.Column2"}) in #"Expanded HHTYP.1" // data+LABELS let Source = Table.NestedJoin(#"data+FREQ+INDIC_IS+UNIT+HHTYP", {"geo\TIME_PERIOD"}, GEO, {"Column1"}, "GEO", JoinKind.LeftOuter), #"Expanded GEO" = Table.ExpandTableColumn(Source, "GEO", {"Column2"}, {"GEO.Column2"}), #"Reordered Columns" = Table.ReorderColumns(#"Expanded GEO",{"freq", "FREQ.1.Column2", "indic_is", "INDIC_IS.1.Column2", "unit", "UNIT.1.Column2", "hhtyp", "HHTYP.1.Column2", "geo\TIME_PERIOD", "GEO.Column2", "2002 ", "2003 ", "2004 ", "2005 ", "2006 ", "2007 ", "2008 ", "2009 ", "2010 ", "2014 "}) in #"Reordered Columns"
Share data queries
Option 1 : Copy several queries between workbooks and share workbook files
Starting from a blank worksheet, it is possible to re-use existing queries from another workbook
Create a new worksheet
Reveal the Queries & Connections
Right-click under the queries pane and click Paste
This will automatically load the last query into the current worksheet
Option 2 : Export and share ODC files
When a query is completed, it is possible to select the final query to be exported as an ODC file as the one below
This file can be simply double-clicked on windows and should open in Excel directly.
Otherwise it can be loaded as an existing connection
This menu automatically list connections present in the "My Data Sources" user folder C:\Users\user_id\Documents\My Data Sources
Files in another location can be opened with the Browser for more button.
Lookup dataset metadata
As TSV format does not include metadata of the dataset, an additional query to the JSON Dataflow can be executed to retrieve the dataset label and the annotations holding the metadata.
A new query can be defined as below and the result loaded in a dedicated worksheet.
let Source = Json.Document(Web.Contents("https://ec.europa.eu/eurostat/api/dissemination/sdmx/3.0/structure/dataflow/ESTAT/ISOC_CI_ID_H?format=JSON&lang=en&compress=false")), #"Converted to Table" = Record.ToTable(Source), #"Pivoted Column" = Table.Pivot(#"Converted to Table", List.Distinct(#"Converted to Table"[Name]), "Name", "Value"), #"Expanded extension" = Table.ExpandRecordColumn(#"Pivoted Column", "extension", {"lang", "id", "agencyId", "version", "datastructure", "annotation"}, {"extension.lang", "extension.id", "extension.agencyId", "extension.version", "extension.datastructure", "extension.annotation"}), #"Expanded extension.annotation" = Table.ExpandListColumn(#"Expanded extension", "extension.annotation"), #"Expanded extension.annotation1" = Table.ExpandRecordColumn(#"Expanded extension.annotation", "extension.annotation", {"type", "title", "date", "href"}, {"extension.annotation.type", "extension.annotation.title", "extension.annotation.date", "extension.annotation.href"}) in #"Expanded extension.annotation1"