...

If the requested filtered data would be too large to be prepared, a client error code 413 is returned with a suggestion to apply more filtering to the request.

...

More details on asynchronous trigger and thresholds

When a data request is initiated, the system first checks whether the exact same request was already performed previously and, if so, looks up the data directly in an internal cache and returns it as the response.
If the data is not cached, it needs to be extracted, and the system estimates the related "extraction cost" in terms of the potential number of data cells returned.
To compute this cost, the system resolves the number of positions matched by each dimension filter.

As an example, if a dataset has 3 dimensions with 5, 10 and 20 positions available respectively, the dataset cardinality is 5 x 10 x 20 = 1000 cells.
An extraction request asking for:

  • 3 positions for the first dimension
  • 2 positions for the second dimension
  • no filtering for the third dimension

will potentially match 3 x 2 x 20 = 120 cells, which is also the estimated cost of this request.
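The arithmetic above can be written as a short Python sketch. The function name `estimate_cost` and the convention of using `None` for an unfiltered dimension are illustrative choices, not part of the API.

```python
from math import prod

def estimate_cost(available, requested):
    """Estimate how many cells a request may match.

    available: number of positions available per dimension, in dimension order.
    requested: number of positions selected per dimension; None means the
               dimension is not filtered, so all its positions count.
    """
    return prod(a if r is None else r for a, r in zip(available, requested))

available = [5, 10, 20]
print(prod(available))                          # dataset cardinality: 1000
print(estimate_cost(available, [3, 2, None]))   # estimated cost: 120
```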

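Whether the data is delivered synchronously or asynchronously depends on factors such as the complexity of the query and the volume of data (number of cells) to be returned:

  • if the data is cached -> the data is returned synchronously
  • if the data has to be extracted, the "cost" of the request is estimated and:
    • if the cost is within the synchronous limit -> the data is returned synchronously
    • if the cost is above the synchronous limit but below the maximum extraction limit -> the request is processed asynchronously
    • if the cost is above the maximum extraction limit -> an error is returned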

In order to know how many positions are available for the dimensions of a dataset, the API provides an SDMX endpoint which returns the SDMX data constraints artefact for the specified dataset.

Taking the Eurostat Comext dataset DS-045409 as an example, its data constraints can be retrieved using:
https://ec.europa.eu/eurostat/api/comext/dissemination/sdmx/2.1/contentconstraint/estat/DS-045409
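As a rough illustration, the positions per dimension could be counted from such a response with Python's standard XML parser. This sketch assumes the constraint lists each dimension as a `KeyValue` element with an `id` attribute and one `Value` child per position, as in the usual SDMX-ML 2.1 cube region layout; the actual response may express some dimensions differently (for example time periods), so the counts should be checked against the real document. The sample XML is hand-written and heavily trimmed, not an actual API response.

```python
import xml.etree.ElementTree as ET

def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts in front of tag names."""
    return tag.rsplit("}", 1)[-1]

def count_positions(constraint_xml):
    """Return {dimension id: number of listed values} for each KeyValue element."""
    root = ET.fromstring(constraint_xml)
    counts = {}
    for elem in root.iter():
        if local_name(elem.tag) == "KeyValue":
            values = [child for child in elem if local_name(child.tag) == "Value"]
            counts[elem.get("id")] = len(values)
    return counts

# Hand-written, trimmed sample for illustration only (assumed layout).
SAMPLE = """<Structure xmlns:c="http://www.sdmx.org/resources/sdmxml/schemas/v2_1/common">
  <CubeRegion include="true">
    <c:KeyValue id="freq"><c:Value>A</c:Value><c:Value>M</c:Value></c:KeyValue>
    <c:KeyValue id="flow"><c:Value>1</c:Value><c:Value>2</c:Value></c:KeyValue>
  </CubeRegion>
</Structure>"""

print(count_positions(SAMPLE))  # {'freq': 2, 'flow': 2}
```

To run it against the live endpoint, the response body could be read with `urllib.request.urlopen(url).read()` and passed to `count_positions`.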

In this dataset, the dimensions have the following number of positions:

  • freq has 2 positions
  • reporter has 33 positions
  • partner has 282 positions
  • product has 40321 positions
  • flow has 2 positions
  • time_period has 468 positions (36 years and 432 months)
  • indicators has 3 positions

The dataset cardinality is then: 2 x 33 x 282 x 40321 x 2 x 468 x 3 = 2 107 276 101 216 cells.

Example queries

1 - Query in range for asynchronous extraction

The following query would be considered within limits and processed by the system:

http://ec.europa.eu/eurostat/api/comext/dissemination/sdmx/2.1/data/DS-045409/A.DK.US..1.SUPPLEMENTARY_QUANTITY?format=SDMX_2.1_STRUCTURED

This query matches the following positions:

  • freq -> 1 position ("A")
  • reporter -> 1 position ("DK")
  • partner -> 1 position ("US")
  • product -> 40321 positions (there is no filter on this dimension)
  • flow -> 1 position ("1")
  • time_period -> 36 positions (there is no explicit filter on this dimension but the system will only return yearly data)
  • indicators -> 1 position ("SUPPLEMENTARY_QUANTITY")

Estimated cost: 1 x 1 x 1 x 40321 x 1 x 36 x 1 = 1 451 556 cells, which is above the synchronous limit but below the maximum extraction limit, so this request is processed asynchronously.
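The same estimate can be derived directly from the key part of the query URL. The sketch below is an approximation of the reasoning in this section, not the system's actual implementation. It assumes the key lists the dimensions in the order freq, reporter, partner, product, flow, indicators, that an empty segment means "no filter", and that several values in one segment are joined with "+" as in standard SDMX REST keys. The frequency code "M" for monthly data is also an assumption; only "A" appears in the examples.

```python
from math import prod

# Positions per key dimension for DS-045409, in the assumed key order.
KEY_DIMENSIONS = [("freq", 2), ("reporter", 33), ("partner", 282),
                  ("product", 40321), ("flow", 2), ("indicators", 3)]

# time_period positions per frequency: 36 years, 432 months ("M" is assumed).
TIME_PERIODS = {"A": 36, "M": 432}

def estimate_cost_from_key(key):
    """Estimate the number of cells matched by an SDMX data key."""
    parts = key.split(".")
    if len(parts) != len(KEY_DIMENSIONS):
        raise ValueError(f"expected {len(KEY_DIMENSIONS)} key segments, got {len(parts)}")
    # An empty segment matches every position of that dimension.
    counts = [len(part.split("+")) if part else total
              for part, (_, total) in zip(parts, KEY_DIMENSIONS)]
    # Only the time periods of the requested frequencies are counted.
    freqs = parts[0].split("+") if parts[0] else list(TIME_PERIODS)
    time_count = sum(TIME_PERIODS[f] for f in freqs)
    return prod(counts) * time_count

print(estimate_cost_from_key("A.DK.US..1.SUPPLEMENTARY_QUANTITY"))  # 1451556
```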

2 - Query above range for asynchronous extraction

The following query would be considered off limits and not processed by the system:

https://ec.europa.eu/eurostat/api/comext/dissemination/sdmx/2.1/data/DS-045409/A.PT...2.QUANTITY_IN_100KG?format=SDMX_2.1_STRUCTURED

This query matches the following positions:

  • freq -> 1 position ("A")
  • reporter -> 1 position ("PT")
  • partner -> 282 positions (no filter is set on this dimension)
  • product -> 40321 positions (no filter is set on this dimension)
  • flow -> 1 position ("2")
  • time_period -> 36 positions (there is no explicit filter on this dimension but the system will only return yearly data as the frequency requested is annual)
  • indicators -> 1 position ("QUANTITY_IN_100KG")

Estimated cost: 1 x 1 x 282 x 40321 x 1 x 36 x 1 = 409 338 792 cells, which is above the maximum extraction limit of 5 000 000 cells, so an error is returned.
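The outcome of the two example queries can be summarised in a small decision function. This is only a sketch: the value of the synchronous limit is not given in this section, so it is passed in as a parameter and the value used below is a placeholder, not the real threshold. How costs exactly equal to a limit are treated is also an assumption.

```python
MAX_EXTRACTION_LIMIT = 5_000_000  # cells, as stated above

def delivery_mode(cost, sync_limit, cached=False):
    """Return how a request with the given estimated cost would be handled."""
    if cached:
        return "synchronous"      # cached data is returned directly
    if cost > MAX_EXTRACTION_LIMIT:
        return "rejected"         # too large: an error is returned
    if cost > sync_limit:
        return "asynchronous"
    return "synchronous"

# Placeholder only: the real synchronous limit is not stated in this section.
PLACEHOLDER_SYNC_LIMIT = 1_000_000

print(delivery_mode(1_451_556, PLACEHOLDER_SYNC_LIMIT))    # asynchronous
print(delivery_mode(409_338_792, PLACEHOLDER_SYNC_LIMIT))  # rejected
```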

...