A report from an EU meeting on the “goals and requirements for a pan-European data portal” is now online (PDF).
The meeting took place in Luxembourg last month. Participants included Nigel Shadbolt, one of four members of the UK Government’s Public Sector Transparency Board, and Jose Manuel Alonso, co-lead of the eGovernment Interest Group at W3C.
I was invited on behalf of the Open Knowledge Foundation to discuss our work on the CKAN project, both as part of data.gov.uk and as part of the LOD2 project, which will bring together open data from local, regional and national public bodies across Europe.
From the introduction:
On the 3rd of November 2010 the European Commission organised in Luxembourg a technical workshop on the goals and requirements for a possible pan-European data portal. Experts with practical experience in their respective countries were invited to share their experiences and ideas.
The experts consider that such a portal would add value to existing regional and national initiatives by improving transparency on issues of EU-wide interest, providing evidence for better policy making, improving the efficiency of data-dependent administrative and business processes and stimulating economic development through EU-wide reuse of data.
Several issues of legal, technical and socio-political nature must be addressed for such a portal to function effectively, among them the need for high level political support, the systematic adoption of reuse-friendly data licences, the promotion of established data standards for maximal interoperability and the organic involvement of European software developers and data-literate citizens.
A pan-European portal should be able to expand rapidly in breadth (thus fostering the interest of the public with large numbers of relevant datasets) while at the same time also showing the value of deeper data integration, starting from a core set of statistical, financial, geospatial data of high quality. Agile prototyping and development models are recommended, given the extremely fast pace at which data initiatives are developing in Europe.
A small working group should be created to drive the issue forward and meet regularly to identify more precisely technical requirements. The group should connect with other open data stakeholder groups established at the national or European level and contribute to the definition of European datasets, government open data conferences and software development competitions, with first results visible and publicised by mid-2011.
The report identifies several reasons for developing a pan-European data portal:
A) For European citizens
- Single point of access on European information
- Enabling services for citizens that live at country borders and/or work abroad
- Knowledge of successful open government data initiatives in some Member States can drive further initiatives in other Member States
B) For administrations
- Improvement of interoperability across processes thanks to greater availability of data
- Improved comparability of EU 27 information and data
- Reduction in administrative costs
- Avoiding / cutting existing costs of re-publication of official information
- More efficiency in servicing Freedom of Information requests
- Involvement of European citizens (crowd sourcing approach) can have positive effects on transparency and quality of data.
C) For economic development
- Planning and monitoring resource for companies operating across EU borders
- Driving the European innovation process
- Driving force for European economy (information technology, new location based services, analyzing services et al)
- Harmonisation of standards and guidelines for open government data across Europe
It also highlighted the value of open licenses, which allow anyone to reuse the data for any purpose:
The participants of the workshop furthermore identified appropriate data licensing at the source as the conceptual precondition for any value to be extracted by data reuse (developers will not reuse data if it is not clear that they have the right to do so). This appears to be mostly an issue of educating data publishers on the selection of an appropriate license. There may be however contexts in which this might turn out to be a legislative issue, to be considered in the context of the review of the Public Sector Information Directive. There was also consensus on the fact that a clear licensing policy should be created and enforced on a pan-European data portal so as to maximise the opportunity for data reuse.
The report concludes:
[…] participants agreed that a pan-European data portal with the characteristics described above would add value to open data initiatives from the Member States. Such an initiative should be pursued without delay in order to exploit the current momentum of open government data initiatives across Europe.
It is fantastic to see such interest in open government data from the European Commission, and we look forward to following further developments with great interest.
If you’re interested in keeping in touch with the Open Knowledge Foundation’s work in this area you can follow:
- lod2.okfn.org – which we are using as a place for working notes and thoughts in relation to our work on the LOD2 project
- the euopendata and open-government mailing lists
Dr. Jonathan Gray is Lecturer in Critical Infrastructure Studies at the Department of Digital Humanities, King’s College London, where he is currently writing a book on data worlds. He is also Cofounder of the Public Data Lab; and Research Associate at the Digital Methods Initiative (University of Amsterdam) and the médialab (Sciences Po, Paris). More about his work can be found at jonathangray.org and he tweets at @jwyg.
I think there is a tendency to get rather hung up on data portals. In the end the key principle is “publish once”, to an Internet endpoint that can be discovered.
The simplest form of discovery would be through a search engine, but anyone is also free to collect these endpoints into a catalogue, e.g. in support of a data portal built around a particular theme and/or facet – environmental + Open Data.
The thing to be avoided is where the act of publishing involves creating discovery metadata at a given point of access, as is the case with data.gov.uk today.
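As a rough illustration of the “publish once” idea, the sketch below treats each dataset as an endpoint that carries its own discovery metadata, and builds a themed catalogue simply by filtering those endpoints. The `Endpoint` class, the `build_catalogue` function, the example URLs, themes and licence labels are all hypothetical; they are assumptions made for the example and do not describe how data.gov.uk or any existing portal works.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Endpoint:
    """Discovery metadata a publisher exposes once, at its own endpoint (hypothetical model)."""
    url: str
    title: str
    themes: frozenset
    licence: str


def build_catalogue(endpoints, required_themes):
    """Return the endpoints whose themes include every required theme/facet."""
    required = frozenset(required_themes)
    return [e for e in endpoints if required <= e.themes]


# Hypothetical endpoints, each published once by its owner.
published = [
    Endpoint("https://example.org/data/air-quality", "Air quality readings",
             frozenset({"environment", "open-data"}), "CC-BY"),
    Endpoint("https://example.org/data/bus-timetables", "Bus timetables",
             frozenset({"transport", "open-data"}), "OGL"),
    Endpoint("https://example.org/data/river-levels", "River levels",
             frozenset({"environment"}), "restricted"),
]

# A themed portal (environmental + Open Data) is just a view over the endpoints.
environment_open = build_catalogue(published, {"environment", "open-data"})
for entry in environment_open:
    print(entry.title, entry.url)
```

The point of the sketch is that the catalogue holds no metadata of its own: anyone can assemble a different view from the same endpoints without the publisher doing any extra work.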
Having read the report in more detail, a few additional points:
1) In terms of challenges, the need is not to identify a “portal architecture”, but rather an overall (distributed) system architecture (‘blueprint’) across a number of design domains. The portal is just a point of access onto this system.
2) A model/approach for the harmonisation of datasets across the EC has already been established through the EC INSPIRE Directive. This relates to the publishing of harmonised, interoperable geo-referenced data.
INSPIRE achieves harmonisation through a “Generic Conceptual Model” and the development of thematic data specifications against this model. Individual datasets are then aligned to these specifications, either through direct conversion, or the use of transformation web services.
It would seem sensible to build on this approach within the EC.
3) Again under INSPIRE, the EC is already establishing an authoritative registry of conceptual models for certain themes, e.g. Transport.
Given the relevance of INSPIRE, I would strongly recommend that INSPIRE is represented on the proposed working group, i.e. someone from the JRC INSPIRE team.
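To make the alignment step in point 2 a little more concrete, here is a minimal sketch of converting a publisher's record to a shared thematic specification. It is not INSPIRE's actual data model or tooling: the `TRANSPORT_SPEC` fields, the `UK_MAPPING` table, the `align` function and the sample record are all invented for illustration, assuming a specification can be reduced to a set of typed target fields.

```python
# Hypothetical thematic specification: target field name -> expected type.
TRANSPORT_SPEC = {"road_id": str, "length_m": float, "country": str}

# Hypothetical publisher mapping: target field -> (source field, converter).
UK_MAPPING = {
    "road_id": ("ref", str),
    "length_m": ("length_km", lambda km: float(km) * 1000.0),
    "country": ("ctry", str.upper),
}


def align(record, mapping, spec):
    """Convert one source record into the shape required by the specification."""
    aligned = {}
    for target, expected_type in spec.items():
        source, convert = mapping[target]
        value = convert(record[source])
        if not isinstance(value, expected_type):
            raise TypeError(f"{target} should be {expected_type.__name__}")
        aligned[target] = value
    return aligned


# A source record in the publisher's own (hypothetical) schema.
uk_record = {"ref": "A1", "length_km": "2.5", "ctry": "gb"}
aligned = align(uk_record, UK_MAPPING, TRANSPORT_SPEC)
print(aligned)
```

The same `align` function could sit behind a transformation web service or be run once as a direct conversion; only the mapping differs per publisher, while the specification stays shared.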