
Purpose

Web archiving preserves web content for future generations and keeps it accessible to the public, even if it is no longer available on the original website.

Rules

All public websites belonging to European institutions, bodies and agencies must be archived. Although the Publications Office archives websites on its own initiative, site owners must inform the web archiving service when new websites are created, so that they can be included in the EU web archive.

When are websites archived?

Regular archiving

Websites belonging to the EU institutions, agencies and bodies are archived at least four times per year. In general, the scope of regular archiving is limited to websites hosted on the europa.eu domain and subdomains. External websites may be included only if duly justified.

Ad hoc and final archiving

If you intend to take a website offline or change it substantially, the website can be archived on an ad hoc basis at the request of the website owner.

In principle, only requests to archive websites in the europa.eu domain or subdomains will be accepted. For websites or pages outside the europa.eu domain, the requester should duly justify that:

  • the long-term value of the content justifies its preservation
  • it has significant long-term political, legal, information, use, research, social, cultural, historical, or artistic value
  • the content aligns with the values, mission and mandate of the EU institutions
  • the EU institution’s stakeholders and/or the public in general will be affected if this digital heritage is not preserved
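
As an illustration only, the europa.eu domain rule above can be pre-checked before sending a request. The sketch below is not a tool provided by the web archiving service; the function name `is_europa_eu` is hypothetical, and the check assumes the URL is given with its scheme (`https://` or `http://`).

```python
from urllib.parse import urlsplit


def is_europa_eu(url: str) -> bool:
    """Return True if the URL's host is europa.eu or one of its subdomains.

    Assumption: the URL includes a scheme; without one, urlsplit() finds no
    hostname and the function returns False.
    """
    host = urlsplit(url).hostname or ""
    # Match the bare domain or any host ending in ".europa.eu"; the leading
    # dot stops look-alike domains such as "noteuropa.eu" from matching.
    return host == "europa.eu" or host.endswith(".europa.eu")


if __name__ == "__main__":
    for candidate in ("https://ec.europa.eu/info/index_en",
                      "https://example.org/europa.eu"):
        print(candidate, is_europa_eu(candidate))
```

A URL that fails this check is not automatically refused, but the request would need the justification described in the list above.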

In principle, all static web content is archived. Embedded social media accounts and databases behind websites are currently not archived.

Ad hoc and final archives respond to specific needs at a specific point in time:

- keeping a thematic record or collection of parts of a website (e.g. relating to COVID-19).

- preserving a final record of the content of a website which is to be taken offline or changed substantially.

When websites or webpages are due to be removed, website owners must request archiving at least one month beforehand in order to preserve a final version in the EU web archive. Send an email to the web archiving service at the EU Publications Office to request final archiving.

Where to find the archive

The EU web archive is available here: https://archive-it.org/home/euwebarchive

Links to archived content are structured as follows:

https://wayback.archive-it.org/12090/*/URL of the website you want to consult

For example:

https://wayback.archive-it.org/12090/*/https://ec.europa.eu/info/index_en takes you to the calendar page, where you can see the different dates on which the English version of the Commission homepage has been captured.

To view the most recent archive, simply look for the last available date.
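
The link structure above can be sketched as a small helper. This is a minimal illustration: the prefix is copied from the pattern quoted above, while the function name `calendar_url` is hypothetical and not part of any official tooling.

```python
# Prefix copied from the link pattern quoted in this section.
WAYBACK_PREFIX = "https://wayback.archive-it.org/12090"


def calendar_url(site_url: str) -> str:
    """Build the calendar-page link for a website in the EU web archive.

    The "*" segment asks for the overview of all capture dates rather than
    one specific capture.
    """
    return f"{WAYBACK_PREFIX}/*/{site_url}"


if __name__ == "__main__":
    print(calendar_url("https://ec.europa.eu/info/index_en"))
```

Called with the Commission homepage address, the helper returns the same link as the example above.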

Web archive content

The Publications Office checks the archives regularly. However, feedback from website owners is extremely useful, for example on missing content, on whether the archive is displayed in all available languages, or on whether sub-sites have been omitted from the archive.

In archive terminology, archived URLs are called ‘seeds’.

All pages with a URL starting with the same root as the seed will be archived.

E.g. for the seed www.webpage.eu/environment/:

- www.webpage.eu/environment/clima will be archived,

- but www.water.webpage.eu or www.webpage.eu/weather are out of scope.
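
The seed rule above amounts to a prefix comparison, which can be sketched as follows. This is a simplified illustration under stated assumptions, not the crawler's actual scoping logic: the function names `normalise` and `in_scope` are hypothetical, and the only normalisation applied is dropping the `http://` or `https://` scheme.

```python
def normalise(url: str) -> str:
    """Drop the scheme so that seed and URL are compared on host and path."""
    for scheme in ("https://", "http://"):
        if url.startswith(scheme):
            return url[len(scheme):]
    return url


def in_scope(seed: str, url: str) -> bool:
    """Return True if the URL starts with the same root as the seed."""
    return normalise(url).startswith(normalise(seed))


if __name__ == "__main__":
    seed = "www.webpage.eu/environment/"
    for url in ("www.webpage.eu/environment/clima",
                "www.water.webpage.eu",
                "www.webpage.eu/weather"):
        print(url, in_scope(seed, url))
```

With the seed from the example, only www.webpage.eu/environment/clima is reported as in scope; the other two addresses do not share the seed's root.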

Some types of content are excluded from web archives:

  • Databases and some types of dynamic content that depend heavily on human interaction.
    This means that searches will not work, nor will links based on search queries.
  • Social media. Some embedded content may appear in the archive. However, do not expect all social media content to be included.
  • External links and documents out of scope.
    • The crawl captures all URLs discovered as part of the website, as explained above.

Preparing sites for archiving

Before revamping or taking all or part of your website offline, you may want to archive it one last time. Prepare your website for archiving by removing all content and files that have no future value (historical, legal, political, research, cultural). Remove any content that is:

  • protected by intellectual property rights (e.g. copyright)
  • confidential or private
  • affected by data protection rules.

The following guidelines can help you to prepare your site for archiving:

Preparing sites for archiving.pdf

Users can navigate archived sites like a live website. However, archiving with a crawler has some technical limitations and as a result certain features may not work, such as:

  • the original website’s built-in search;
  • content that can only be reached after logging in;
  • certain navigational elements, e.g. drop-down menus, pagination, tick boxes and some maps;
  • Flash animations and games, streaming media and embedded social media;
  • complex JavaScript;
  • POST functionality.

Making a web archiving request

Archiving workflow

1. Regular archiving of living websites

| What | How | Who | When |
|---|---|---|---|
| Archiving request | Send an email to OP-WEB-PRESERVATION | Website owner | Upon establishment of a new EC and/or DG website |
| Analysis of request | | OP WP team | |
| Approval/rejection of request | Email with justification of conclusions to website owner and Comm Europa Management | OP WP team | |
| For accepted requests | | | |
| Regular crawling | Remote crawling | OP WP team | At least four times per year |
| Quality control | Visual/manual check of quality of the crawl, and feedback to OP WP team | Website owner | Upon invitation, sent by OP WP team, or any time |
| Patching | If needed and if possible: improving the quality of the archived version | OP WP team | Upon reception of website owner’s feedback on quality or as a part of the regular quality control |
| Acceptance/rejection of crawl | Email to OP-WEB-PRESERVATION | Website owner | |
| Publication/takedown of crawl | | OP WP team | |



2. Ad hoc archiving of websites that are to be taken offline or changed substantially

| What | How | Who | When |
|---|---|---|---|
| Clean-up of website | See preparing sites for offline preservation checklist | | |
| Archiving request | Send an email to OP-WEB-PRESERVATION | Website owner | At least 1 month before the site will be taken offline/changed |
| Analysis of request | | OP WP team | |
| Approval/rejection of request | Email with justification of conclusions to website owner and Comm Europa Management | OP WP team | Maximum 1 week after reception of CEM approval |
| For accepted requests | | | |
| Planning | Discussion of deadlines and crawl specifications | OP WP team and website owner | Upon approval of the request |
| Crawling | Archiving following crawl specifications | OP WP team | According to planning agreed with website owner |
| Quality control | Regular quality control by OP WP team; visual check of quality of the crawl, and feedback to OP WP team | OP WP team; website owner | Upon invitation, sent by OP WP team |
| Patching | If needed and if possible: improving the quality of the archived version | OP WP team | Upon reception of website owner’s feedback on quality or as a part of the regular quality control |
| Acceptance | Email to OP-WEB-PRESERVATION | Website owner | |
| Publication | | OP WP team | |
| Redirections (if desired) | | | |

Takedown policy

Under certain circumstances, it may be necessary to hide pages in the web archive from public view.

Anyone can submit a reasoned takedown request via email to OP-WEB-PRESERVATION.

Takedown will be considered only if the page:

  • includes one of the following types of content:
    • personal data as processed by EU institutions, bodies, offices and agencies
    • copyright protected material for which the necessary rights are not held
    • defamatory or obscene material or messages
    • content which may cause serious and real administrative difficulties to the website owner
  • was published in good faith, but due to a change in circumstances its takedown is now considered appropriate
  • was published in error and takedown is necessary to correct the mistake.

Legal information

© European Union, 2019

The Publications Office carries out web archiving to preserve EU websites. Most of the archived content of websites in the EU web archive (EUWA) is under EU (or EU institutions, agencies or bodies) copyright. Ownership and copyright of websites in the EUWA remain the responsibility of the website owners.

Unless otherwise stated, material obtained from the EUWA may be freely reproduced. This general principle can be subject to conditions, which may be specified in individual copyright notices. It does not apply to photographs, videos, pieces of music or other material subject to intellectual property rights of third parties (non-EU). In such cases, permission to use the material must be sought directly from the copyright holders. The Publications Office does not guarantee that all third-party content is appropriately marked.

All logos and trademarks are excluded from the abovementioned permission.

Any queries regarding the above should be addressed by email to ...

See also the Privacy statement.

Contact and support

Need further assistance on this topic? Please

...