Public Domain Calculators at Europeana

Guest - May 12, 2010 in COMMUNIA, External, OKF Projects, Open Knowledge Foundation, Public Domain, Public Domain Works, Technical, WG Public Domain, Working Groups

The following guest post is from Christina Angelopoulos at the Institute for Information Law (IViR) and Maarten Zeinstra at Nederland Kennisland who are working on building a series of Public Domain Calculators as part of the Europeana project. Both are also members of the Open Knowledge Foundation’s Working Group on the Public Domain.

Europeana Logo

Over the past few months the Institute for Information Law (IViR) of the University of Amsterdam and Nederland Kennisland have been collaborating on the preparation of a set of six Public Domain Helper Tools as part of the EuropeanaConnect project. The Tools are intended to help Europeana data providers determine whether or not a certain work or other subject matter vested with copyright or neighbouring rights (related rights) has fallen into the public domain and can therefore be freely copied or re-used, by functioning as a simple interface between the user and the often complex set of national rules governing the term of protection. The issue is of significance for Europeana, as contributing organisations will be expected to clearly mark the material in their collections as being in the public domain, through the attachment of a Europeana Public Domain Licence, whenever possible.

The Tools are based on six National Flowcharts (Decision Trees) built by IViR on the basis of research into the duration of the protection of subject matter in which copyright or neighbouring rights subsist in six European jurisdictions (the Czech Republic, France, Italy, the Netherlands, Spain and the United Kingdom). By means of a series of simple yes-or-no questions, the Flowcharts are intended to guide the user through all important issues relevant to the determination of the public domain status of a given item.
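
By way of illustration, the yes-or-no logic behind such a flowchart can be sketched in a few lines of code. This is a deliberately simplified toy, not the actual tool: it assumes only the EU baseline term of life plus 70 years and ignores every national subtlety the Flowcharts have to handle.

```python
# Toy sketch of one branch of a public domain flowchart.
# Assumes only the EU baseline term (author's life + 70 years);
# function name and interface are invented for this example.

def is_public_domain(year_of_author_death, current_year=2010):
    """Return True if the baseline copyright term has expired,
    False if not, or None if the death year is unknown.

    The term runs until 70 years after the end of the year of the
    author's death, so a work enters the public domain on 1 January
    of the 71st year.
    """
    if year_of_author_death is None:
        return None  # unknown facts -> no automatic conclusion
    return current_year > year_of_author_death + 70

print(is_public_domain(1900))  # died 1900 -> True in 2010
print(is_public_domain(1950))  # died 1950 -> False in 2010
```

The real Flowcharts replace this single comparison with dozens of questions covering joint authorship, wartime extensions, transitional provisions and so on.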

Researching Copyright Law

The first step in the construction of the flowcharts was the careful study of the EU Term Directive, which attempts to harmonise the rules on the term of protection of copyright and neighbouring rights across EU Member States. The rules of the Directive were integrated by IViR into a set of Generic Skeleton European Flowcharts. Given the essential role that the Term Directive has played in shaping national laws on the duration of protection, these generic charts functioned as the prototype for the six National Flowcharts. An initial version of the Generic European Flowchart, as well as the National Flowcharts for the Netherlands and the United Kingdom, was put together with the help of the Open Knowledge Foundation at a Communia workshop in November 2009.

Further information necessary for the refinement of these charts as well as the assembly of the remaining four National Flowcharts was collected either through the collaboration of National Legal Experts contacted by IViR (Czech Republic, Italy and Spain) or independently through IViR’s in-house expertise (EU, France, the Netherlands and the UK).

Both the Generic European Flowcharts and the National Flowcharts have been split into two categories: one dedicated to the rules governing the duration of copyright and the sui generis database right and one dedicated to the rules governing neighbouring rights. Although this division was made for the sake of usability and in accordance with the different subject matter of these categories of rights (works of copyright and unoriginal databases on the one hand and performances, phonograms, films and broadcasts on the other), the two types of flowcharts are intended to be viewed as connected and should be applied jointly if a comprehensive conclusion as to the public domain status of an examined item is to be reached (in fact the final conclusion in each directs the user to the application of the other). This is due to the fact that, although the protected subject matter of these two categories of rights differs, they may not be entirely unrelated. For example, it does not suffice to examine whether the rights of the author of a musical work have expired; it may also be necessary to investigate whether the rights of the performer of the work or of the producer of the phonogram onto which the work has been fixated have also expired, in order to reach an accurate conclusion as to whether or not a certain item in a collection may be copied or re-used.
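
The joint application of the two types of flowchart described above boils down to a conjunction: an item is only freely reusable once copyright and every applicable neighbouring right have expired. A minimal sketch, with invented names:

```python
# Toy illustration of why the two flowcharts must be applied jointly:
# a comprehensive conclusion requires that copyright AND every applicable
# neighbouring right (performer, phonogram producer, etc.) have expired.
# Function and parameter names are invented for this example.

def freely_reusable(copyright_expired, related_rights_expired):
    """related_rights_expired: a list with one expiry flag per applicable
    neighbouring right; an empty list means no such rights subsist."""
    return copyright_expired and all(related_rights_expired)

# A sound recording of a musical work: the author's rights have expired,
# but the performer's and producer's rights must be checked too.
print(freely_reusable(True, [True, True]))   # all expired -> reusable
print(freely_reusable(True, [True, False]))  # producer's right remains
```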

Legal Complexities

A variety of legal complexities surfaced during the research into the topic. Condensing the complex rules that govern the term of protection in the examined jurisdictions into a user-friendly tool presented a substantial challenge. One of the most perplexing issues was that of the first question to be asked. Rather than engage in complicated descriptions of the scope of the subject matter protected by copyright and related rights, IViR decided to avoid this can of worms. Instead, the flowchart’s starting point is provided by the question “is the work an unoriginal database?” However, this solution seems unsatisfactory and further thought is being put into an alternative approach.

Other difficult legal issues encountered include the following:

  • Term of protection vis-à-vis third countries
  • Term of protection of works of joint authorship and collective works
  • The term of protection (or lack thereof) for moral rights
  • Application of new terms and transitional provisions
  • Copyright protection of critical and scientific publications and of non-original photographs
  • Copyright protection of official acts of public authorities and other works of public origin (e.g. legislative texts, political speeches, works of traditional folklore)
  • Copyright protection of translations, adaptations and typographical arrangements
  • Copyright protection of computer-generated works

On the national level, areas of uncertainty related to such matters as the British provisions on the protection of films (no distinction is made under British law between the audiovisual or cinematographic work and its first fixation, contrary to the system applied on the EU level) or exceptional extensions to the term of protection, such as that granted in France due to World Wars I and II or in the UK to J.M. Barrie’s “Peter Pan”.

Web-based Public Domain Calculators

Once the Flowcharts had been prepared, they were translated into code by IViR’s colleagues at Kennisland, resulting in the current set of six web-based Public Domain Helper Tools.

Technically, the flowcharts needed to be translated into a format that computers can read. For this project Kennisland chose an Extensible Markup Language (XML) approach to describing the questions in the flowcharts and the relations between them. The resulting XML documents are both human- and computer-readable. Using XML documents also allowed Kennisland to keep the decision structure separate from the actual programming language, which makes maintenance of both content and code easier.

Kennisland then needed to build an XML reader that could translate the structures and questions of these XML files into a questionnaire, or apply a set of data to the available questions so as to make the automatic calculation of large datasets possible. For the EuropeanaConnect project Kennisland developed two such XML readers. The first translates the XML schemas into a graphical user interface tool (this can be found at EuropeanaLabs), and the second can potentially determine the status of a work automatically; it resides in the Public Domain Works project’s Mercurial repository on KnowledgeForge. Both of these applications are open source and we encourage people to download, modify and work on these tools.
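
To make the approach concrete, here is an invented miniature of the idea: the decision structure lives in an XML document, and a small reader walks it against a set of pre-collected answers, as the batch reader for large datasets would. The element and attribute names are made up for this sketch; the actual EuropeanaConnect schema is richer.

```python
import xml.etree.ElementTree as ET

# Invented example of a flowchart expressed in XML: each <question> has
# yes/no transitions leading to further questions or to <result> nodes.
FLOWCHART_XML = """\
<flowchart jurisdiction="NL">
  <question id="q1" text="Is the work an unoriginal database?">
    <yes goto="r1"/>
    <no goto="q2"/>
  </question>
  <question id="q2" text="Did the author die more than 70 years ago?">
    <yes goto="r2"/>
    <no goto="r3"/>
  </question>
  <result id="r1" status="check the sui generis database right"/>
  <result id="r2" status="public domain"/>
  <result id="r3" status="protected"/>
</flowchart>
"""

def evaluate(xml_text, answers):
    """Walk the flowchart using a dict of yes/no answers keyed by
    question id, returning the status of the result node reached."""
    root = ET.fromstring(xml_text)
    questions = {q.get("id"): q for q in root.findall("question")}
    results = {r.get("id"): r.get("status") for r in root.findall("result")}
    node = "q1"
    while node in questions:
        branch = "yes" if answers[node] else "no"
        node = questions[node].find(branch).get("goto")
    return results[node]

print(evaluate(FLOWCHART_XML, {"q1": False, "q2": True}))  # public domain
```

Because the questions and transitions live entirely in the XML, the same reader can drive an interactive questionnaire or a batch run over a catalogue of records, which is the separation of content and code the project aimed for.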

It should be noted that, as part of Kennisland’s collaboration with the Open Knowledge Foundation, Kennisland is currently assisting in the development of a base XML schema for automatically determining the rights status of a work using bibliographic information. Unfortunately, this information alone is usually not enough for automatic identification at a European level. This is due to the many international treaties that have accumulated over the years; the rules change depending, for example, on whether an author was born in a country party to the Berne Convention, an EU Member State or a third country.

It should of course also be noted that there is a limit to the extent to which an electronic tool can replace a case-by-case assessment of the public domain status of a copyrighted work or other protected subject matter in complicated legal situations. The Tools are accordingly accompanied by a disclaimer indicating that they cannot offer an absolute guarantee of legal certainty.

Further fine-tuning is necessary before the Helper Tools are ready to be deployed. For the moment test versions of the electronic Tools can be found here. We invite readers to try these beta tools and give us feedback on the pd-discuss list!

Note from the authors: if the construction process for the Flowcharts has highlighted one thing, it is the bewildering complexity of the current rules governing the term of protection for copyright and related rights. Despite the Term Directive’s attempts at creating a level playing field, national legislative idiosyncrasies are still going strong in the post-harmonisation era – a single European term of protection remains very much a chimera. Nor are the relevant rules simple at the level of the individual Member States. In countries such as the UK and France in particular, the term of protection currently operates under confusing entanglements of rules and exceptions that make confident calculation of the term almost impossible for a copyright layperson and difficult even for experts.

PD Calculators

Generic copyright flowchart by Christina Angelopoulos. PDF version available from Public Domain Calculators wiki page

CKAN 0.11 Released

Rufus Pollock - February 12, 2010 in CKAN, OKF Projects, Open Knowledge Foundation, Releases, Technical

We are pleased to announce the release of version 0.11 of the CKAN software, our open source registry of open data.

CKAN tag cloud

This is our biggest release so far (55 tickets) with lots of new features and improvements. This release also saw a major new production deployment, with the CKAN software powering a site that had its public launch on Jan 21st!

Main highlights (for a full listing of tickets please see the trac milestone):

  • Package Resource object (multiple download urls per package): each package can have multiple ‘resources’ (urls) with each resource having additional metadata such as format, description and hash (#88, #89, #229)
  • “Full-text” searching of packages (#187)
  • Semantic web integration: RDFization of all data plus integration with an online RDF store (e.g. a Talis store) (#90 #163)
  • Package ratings (#77 #194)
  • i18n: we now have translations into German and French, with live deployments (#202)
  • Package diffs available in package history (#173)
  • Minor:
    • Package undelete (#21, #126)
    • Automated CKAN deployment via Fabric (#213)
    • Listings are sorted alphabetically (#195)
    • Add extras to rest api and to ckanclient (#158 #166)
  • Infrastructural:
    • Change to UUIDs for revisions and all domain objects
    • Improved search performance and better pagination
    • Significantly improved performance in API and WUI via judicious caching
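
As a hedged illustration of working with the REST API mentioned above: package metadata, including the new multi-resource listings, can be fetched over HTTP as JSON. The endpoint path reflects the CKAN API of this era, and the instance URL and package name below are placeholders; current deployments may differ.

```python
import json
import urllib.request

# Sketch of reading package metadata from a CKAN instance's REST API.
# The /api/rest/package/<name> path and the example URL are assumptions
# about a deployment of this era, not guaranteed for any given instance.

def package_url(base_url, name):
    """Build the REST URL for a named package."""
    return f"{base_url}/api/rest/package/{name}"

def get_package(base_url, name):
    """Fetch a package's metadata (including its 'resources' list,
    with per-resource format, description and hash) as a dict."""
    with urllib.request.urlopen(package_url(base_url, name)) as resp:
        return json.load(resp)

# e.g. get_package("http://ckan.net", "some-package")["resources"]
print(package_url("http://ckan.net", "some-package"))
```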

New report on sharing aid information is now open for comments

Jonathan Gray - September 21, 2009 in News, OKF Projects, Open Data, Open Knowledge Definition, Open Knowledge Foundation, Open Standards, Technical, WG Development

We’re pleased to announce the publication of a new report, Unlocking the potential of aid information. The report, by the Open Knowledge Foundation and Aidinfo, looks at how to make information related to international development (i) legally open, (ii) technically open and (iii) easy to find.

The report and relevant background information can be found at:

It aims to inform the development of a new platform for publishing and sharing aid information:

The International Aid Transparency Initiative (IATI) aims to improve the availability and accessibility of aid information by designing common standards for the publication of information about aid. It is not about creating another database of aid activities, but about creating a platform that will enable existing databases – and potential new services – to access this aid information and create compelling applications providing more detailed, timely, and accessible information about aid.

The idea of openness is crucial to creating this platform and achieving transparency. Information must be openly available, with as few restrictions as possible on how it is accessed and used. To this end, we need to design a technical architecture that enables information to be published and accessed in an open way.

There are three main recommendations in the report, which are as follows:

  • Recommendation 1 – Aid information should be legally open. The standard should require a core set of standard licenses for publishing aid information. It should require that either:
    • (i) information is published under one of a small number of recommended options:
      • Licenses for content: Creative Commons Attribution or Attribution Sharealike license
      • Legal tools for data: Open Data Commons Public Domain Dedication and License (PDDL), Open Data Commons Open Database License (ODbL) or Creative Commons CC0
    • or that (ii) information is published using a license/legal tool that is compliant with a standard such as the Open Knowledge Definition.
  • Recommendation 2 – Aid information should be technically open. The standard should require that raw data is made available in bulk (not just via an API or web interface) with any relevant schema information and either:
    • (i) in one of a small number of recommended formats:
      • Text: HTML, ODF, TXT, XML
      • Data: CSV, XML, RDF/XML
    • or (ii) in a format:
      • (a) which is machine readable and
      • (b) for which the specification is publicly and freely available and usable
  • Recommendation 3 – Aid information should be easily findable. The standard should require that aid organisations add their knowledge assets to a registry with some basic metadata describing the information.
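
A compliance check against Recommendation 1 could be as simple as membership in a list of recommended options, with an escape hatch for licenses separately verified as conformant with a standard such as the Open Knowledge Definition. The license identifiers and function below are invented for illustration:

```python
# Hypothetical helper for Recommendation 1: is the dataset's license one
# of the recommended open options? Identifiers are invented shorthand.
RECOMMENDED_LICENSES = {
    "CC-BY", "CC-BY-SA",       # licenses for content
    "PDDL", "ODbL", "CC0",     # legal tools for data
}

def is_legally_open(license_id, extra_conformant=()):
    """extra_conformant: license ids separately verified as conformant
    with a standard such as the Open Knowledge Definition (option ii)."""
    return license_id in RECOMMENDED_LICENSES or license_id in extra_conformant

print(is_legally_open("ODbL"))      # recommended data license
print(is_legally_open("CC-BY-NC"))  # non-commercial clause -> not open
```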

We are now welcoming comments on the report until Sunday 1st November 2009. To submit comments you can:

  1. Directly annotate the documents with your comments:
  2. Submit your comments for discussion on the open development mailing list.
  3. Email your comments to info at okfn dot org.

CKAN 0.9 Released

Guest - August 13, 2009 in CKAN, News, OKF Projects, Open Knowledge Foundation, Releases, Technical

We are pleased to announce the release of CKAN version 0.9! CKAN is the Comprehensive Knowledge Archive Network, a registry of open knowledge packages and projects.

Changes include:

  • Add version attribute for package
  • Fix purge to use new version of Versioned Domain Model (vdm) (0.4)
  • Link to changed packages when listing revision
  • Show most recently registered or updated packages on front page
  • Bookmarklet to enable easy package registration on CKAN
  • Usability improvements (package search and creation on front page)
  • Use external list of licenses from license repository
  • Convert from py.test to nosetests

There are now over 560 packages in the registry – which means that on average we’ve been adding a package a day since version 0.8 was released in May!

Open Data and the Semantic Web Workshop, London, 13th November 2009

Jonathan Gray - August 5, 2009 in Events, Open Data, Open Knowledge Foundation, Technical

Linking Open Data cloud

We’re currently organising a workshop on ‘open data and the semantic web’, which will take place in London this autumn. Details are as follows:

  • When: Friday 13th November 2009, 1000-1800
  • Where: London Knowledge Lab, 23-29 Emerald Street, London, WC1N 3QS. (See map)
  • Wiki:
  • Participation: Attendance is free. If you are planning to come along please add your name to the wiki.
  • Microbloggers: See notices on and Twitter

Further details:

Semantic web technologists and advocates are increasingly beginning to see the value of ‘open data’ for the data web. Tim Berners-Lee has spoken about the importance of open data, and being able to access raw data in easy to use formats, and the Linking Open Data project demonstrates what can be done by linking together a rich variety of publicly re-usable datasets.

This informal, hands-on workshop will bring together researchers, technologists, and people interested in open data and the semantic web from both public and private sector organisations for a day of talks and discussions.

Themes will include:

  • Linking Open Data
  • Legal tools for open data
  • Finding open data

goes live!

Jonathan Gray - May 22, 2009 in News, Open Data, Policy, Technical

The US government’s new site (which we blogged about last month) is now live!

A selection of core datasets is currently available – from information about World Copper Smelters to results from the Residential Energy Consumption Survey. Raw data is available in XML, Text/CSV, KML/KMZ, Feeds, XLS, or ESRI Shapefile formats. As well as exploring and downloading the data that is available, you can also suggest datasets that you’d like to see added!

From the site:

As a priority Open Government Initiative for President Obama’s administration, the site increases the ability of the public to easily find, download, and use datasets that are generated and held by the Federal Government. It provides descriptions of the Federal datasets (metadata), information about how to access the datasets, and tools that leverage government datasets. The data catalogs will continue to grow as datasets are added. Federal, Executive Branch data are included in the first version.

The launch is a major milestone in the Obama administration’s Open Government Initiative. To mark the occasion, Sunlight Labs, Google, O’Reilly Media, and TechWeb have launched Apps for America 2 – inviting proposals for open source mashups, visualisations or other innovative re-uses of the site’s material.

You can watch a video of Vivek Kundra, the US’s CIO, talking about the launch on YouTube.

Great news for open government data – and open data in general!

Launch of Open Data Grid

Jonathan Gray - May 13, 2009 in News, OKF Projects, Open Data, Open Knowledge Foundation, Technical

storage facility 37 by sevensixfive

In the last couple of months we’ve had several threads on the okfn-discuss list about distributed storage for open data (see here and here).

Last month we started a distributed storage project, aiming to provide distributed storage infrastructure for OKF and other open knowledge projects.

After researching various technical options, we’ve launched an Open Data Grid based on Allmydata’s open-source “Tahoe” system at:

Anyone can store open data on the grid, or start running a storage node. For more details see the readme. If you’d like to comment on the service feel free to post on the okfn-discuss list!

CKAN 0.8 Released

Rufus Pollock - May 12, 2009 in CKAN, News, OKF Projects, Technical

A new release of CKAN is now out together with a new, and substantially improved versioned domain model library. Changes include:

  • View information about package history (ticket:53)
  • Basic datapkg integration (ticket:57)
  • Show information about package openness using icons (ticket:56)
  • One-stage package create/registration (r437)
  • Reinstate package attribute validation (r437)
  • Upgrade to vdm 0.4

The CKAN code is available from:

The data is available from:

We now have over 500 packages — an almost 100% increase in the last six months. If you come across a large dataset or substantial collection, please consider registering it on CKAN!

5th Communia Workshop: Post-Event Information + Statement

Jonathan Gray - April 23, 2009 in COMMUNIA, Events, News, Open Data, Open Knowledge Foundation, Policy, Talks, Technical

Participants at 5th COMMUNIA Workshop

The 5th Communia Workshop took place last month at the London School of Economics. It brought together researchers, policy-makers, stakeholders and representatives from across Europe, the United States and Australia for two days of talks and discussions about reusing public sector content and data.

In the afternoon of the first day, participants co-drafted a simple statement. If you support the statement, we encourage you to sign – regardless of whether or not you attended the workshop:

In addition many of the speakers made suggestions for policy recommendations, which are available at:

Documentation, including audio, video and slides, will be published at:

Material published so far includes:

You can also see:
