Data

What are the OpenURL data?

All OpenURL requests made by end users via the Router at openurl.ac.uk are logged and, subject to the metadata included by the referring service, provide a record of the article that the user was attempting to find via their local resolver. The Router redirects each request to the appropriate local resolver for the user, so although the Router logs every request, the ultimate outcome (whether the end user obtained a copy of the article, and from where) is unknown to the Router.
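To make "the metadata included by the referring service" concrete, here is a minimal sketch of pulling the bibliographic keys out of an OpenURL 1.0 request in key/encoded-value form, where the article being sought is described by `rft.` keys. The host is the one named above, but the path and all citation values are invented for illustration, and `referent_metadata` is a hypothetical helper, not part of the Router.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical request: the path and the citation values are made up.
EXAMPLE = (
    "http://openurl.ac.uk/?url_ver=Z39.88-2004"
    "&rft_val_fmt=info:ofi/fmt:kev:mtx:journal"
    "&rft.atitle=An+example+article"
    "&rft.jtitle=Journal+of+Examples"
    "&rft.issn=1234-5678&rft.date=2010"
)


def referent_metadata(openurl: str) -> dict[str, str]:
    """Return the rft.* keys (the item being sought) with the prefix removed."""
    query = urlsplit(openurl).query
    pairs = parse_qs(query)  # decodes the values; each comes back as a list
    return {
        key[len("rft."):]: values[0]
        for key, values in pairs.items()
        if key.startswith("rft.")
    }


metadata = referent_metadata(EXAMPLE)
print(metadata)
```

A referring service that sends fewer `rft.` keys simply yields a smaller dictionary, which is why the amount of bibliographic detail differs from one logged request to the next.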

The OpenURL Router supports various types of requests other than links direct to local resolvers. These include the "lookup" requests (registry searches) and requests for the preferred button image to be used for each resolver. These requests are all logged, but they are not OpenURL requests and do not contain bibliographic metadata.

The OpenURL Router Data are thus data about traffic flowing through the UK OpenURL Router, and are sometimes known as activity data. The data are made publicly available so that other service providers may use them.

What data are made available?

Before being made available in any form, the data are anonymised to remove anything that may identify an individual institution or person.

The anonymised data file is then made available 'as is'; we identify this as Level 1. It includes resolver redirect requests and those "lookup" requests where no institution is identified. It excludes the button requests, as these identify an institution. It may be used by anybody for any purpose that they believe will be useful, such as analysis or the creation of services for UK Higher and Further Education.

The Level 2 data file contains data that have been processed further: all extraneous data are removed, leaving only redirection data. EDINA uses these data as the basis of a prototype recommender service for UK HE and makes them available for others to use.

We distinguish three levels of data as follows:

Level | What's this? | What has been processed? | Is it available?
0 | Original Log File Data | No processing (contains identifiable IP addresses and institutions) | No
1 | Anonymised Data | IP addresses are encrypted using an algorithm and institutional identifiers are anonymised | Coming soon
2 | Anonymised Redirect Data | A subset of the Level 1 data, containing only entries that redirect to a resolver | Yes
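The table says only that IP addresses are "encrypted using an algorithm"; the algorithm itself is not described here. Purely as an illustration of how an identifier can be replaced by a consistent but non-identifying token, the sketch below uses a keyed hash (HMAC-SHA256). That choice, the key, the token length and the function name `pseudonymise` are all assumptions, not a description of what EDINA does.

```python
import hashlib
import hmac


def pseudonymise(value: str, secret_key: bytes) -> str:
    """Map an identifier to a stable token; the same input always gives the same token."""
    digest = hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]  # shortened for readability (an arbitrary choice)


key = b"replace-with-a-private-random-key"  # hypothetical key, kept secret in practice

# 192.0.2.x addresses are reserved for documentation examples.
token_a = pseudonymise("192.0.2.10", key)
token_b = pseudonymise("192.0.2.10", key)
token_c = pseudonymise("192.0.2.11", key)

print(token_a == token_b)  # repeat requests from one address stay linkable
print(token_a != token_c)  # different addresses get different tokens
```

A scheme of this kind keeps requests from the same source grouped together, which is what analysis or a recommender needs, without publishing the address itself.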

What licence applies to use of the data?

The data are made available under the Open Data Commons (ODC) Public Domain Dedication and Licence (http://www.opendatacommons.org/licenses/pddl/1-0/) and the ODC Attribution Sharealike Community Norms (http://www.opendatacommons.org/norms/odc-by-sa/). The licence was selected to maximise possible uses for the data. Please read the terms of the PDDL and the Attribution Sharealike Community Norms before using the data. The PDDL effectively dedicates IPR to the public domain, meaning that you can use the data as you please. The community norms ask that you respect the goodwill of those making the data available; the specific requests are set out in the norms linked above.

How often are these data updated?

The data files are updated monthly to cover the previous calendar month. The data file for any single month is available for download as a flat file. There will also be one flat file covering a full calendar year (starting 01 January), with data added each calendar month. Be aware that these files can be very large: monthly files vary from 125 MB to 600 MB, and annual files are over 1 GB. You can find the data files in the OpenURL Router Data section.

What exactly are these data?

The data captured vary from request to request, since different users enter different information into requests. In some cases very few data are captured.
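Because the fields present differ from request to request, a useful first check on any file is how often each column is actually filled. The following is a minimal sketch only: the three-column rows are invented, columns are addressed by position because the real column layout is not reproduced here, and it assumes fields are tab-separated and unquoted.

```python
import csv
import io
from collections import Counter

# Made-up rows standing in for a data file; empty strings mark values
# that a request did not supply.
SAMPLE = (
    "2011-04-01\tExample Journal\t1234-5678\n"
    "2011-04-01\t\t\n"
    "2011-04-02\tAnother Journal\t\n"
)


def fill_counts(lines):
    """Count, per column position, how many rows hold a non-empty value."""
    filled = Counter()
    total = 0
    # QUOTE_NONE treats quote characters as ordinary text (an assumption).
    for row in csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE):
        total += 1
        for position, value in enumerate(row):
            if value.strip():
                filled[position] += 1
    return total, filled


total, filled = fill_counts(io.StringIO(SAMPLE))
for position in sorted(filled):
    print(f"column {position}: {filled[position]}/{total} rows filled")
```

The same function accepts an open file handle in place of the `io.StringIO` object, so it can be pointed at a sample file directly.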

Log-specific request data (based on OpenURL Router log entries):

Request-specific data (based on the OpenURL standard):

As the OpenURL Router Data files are large, sample files are available containing a subset of the data for initial analysis.

Sample OpenURL Router Data

EDINA offers no guarantee that the data are accurate, current or fit for any specific use. By using the data, the user accepts that they cannot rely on them and that any arrangements made between the user and any other person is entirely at their sole risk and responsibility.

The Service Provider makes no statement about the suitability of the data. All warranties, terms and conditions in this regard, including all warranties, terms and conditions implied by statute or otherwise, of satisfactory quality and fitness for purpose are excluded to the fullest extent permitted by law.

The Service Provider further excludes to the fullest extent permissible by law all liability for damages and direct, indirect, or consequential loss (all three of which terms include pure economic loss, loss of profits, loss of business, business interruption, depletion of goodwill and like loss) or otherwise incurred by a user or any third party and arising out of or in any way connected with the use of the data whether based on contract, tort, strict liability or otherwise.

The data are made available under the Open Data Commons (ODC) Public Domain Dedication and Licence (http://www.opendatacommons.org/licenses/pddl/1-0/) and the ODC Attribution Sharealike Community Norms (http://www.opendatacommons.org/norms/odc-by-sa/). Please read the terms of the PDDL and the Attribution Sharealike Community Norms before using the data.

The sample files are made available in two forms for initial analysis. The first is tab-delimited CSV, the format in which the full data files are made available. The second is formatted for display in spreadsheet software, such as MS Excel, so that you can gain an understanding of the type of data available.

Note that the full data files will only be available in CSV format.
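Since the full files are tab-delimited and can run to hundreds of megabytes, it is usually better to read them a row at a time than to open them in a spreadsheet. The sketch below is one way to do that; the helper names, the UTF-8 encoding and the assumption that fields are unquoted are illustrative guesses, and the demonstration runs on a tiny temporary file of made-up rows.

```python
import csv
import os
import tempfile


def iter_rows(path):
    """Yield one row at a time so the whole file never sits in memory."""
    with open(path, newline="", encoding="utf-8") as handle:
        yield from csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)


def count_rows(path):
    """Count rows without keeping any of them."""
    return sum(1 for _ in iter_rows(path))


# Demonstration file with invented content.
with tempfile.NamedTemporaryFile(
    "w", suffix=".csv", delete=False, newline="", encoding="utf-8"
) as tmp:
    tmp.write("a\tb\tc\n\t\tx\n")
    demo_path = tmp.name

row_count = count_rows(demo_path)

rows = iter_rows(demo_path)
first_row = next(rows)
rows.close()  # closes the underlying file before it is deleted

os.remove(demo_path)
print(row_count, first_row)
```

Replacing `demo_path` with the path of a downloaded monthly file gives a quick row count before any heavier analysis.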

How can the data be used?

As the data are made available under an open licence, there is a great deal that could be done with them, alone or in combination with other data sets. Several people have already created interesting applications and visualisations, and have described ways to manipulate the large files. With thanks to their authors, the examples we are aware of are:

Manipulating the large(ish) files

Creating visualisations of the data

Analysis of the data

If you use the data, please let us know at edina@ed.ac.uk.

OpenURL Router Data

The disclaimer and licence terms set out above under Sample OpenURL Router Data apply equally to the full data files.
