Good news for open data: Protocol for Implementing Open Access Data, Open Data Commons PDDL and CCZero
December 17th, 2007
Last night Science Commons announced the release of the Protocol for Implementing Open Access Data:
The Protocol is a method for ensuring that scientific databases can be legally integrated with one another. The Protocol is built on the public domain status of data in many countries (including the United States) and provides legal certainty to both data deposit and data use. The protocol is not a license or legal tool in itself, but instead a methodology for a) creating such legal tools and b) marking data already in the public domain for machine-assisted discovery.
As well as working closely with the Open Knowledge Foundation, Talis and Jordan Hatcher, Science Commons have spent the last year consulting widely with international geospatial and biodiversity scientific communities. They’ve also made sure that the protocol is conformant with the Open Knowledge Definition:
We are also pleased to announce that the Open Knowledge Foundation has certified the Protocol as conforming to the Open Knowledge Definition. We think it’s important to avoid legal fragmentation at the early stages, and that one way to avoid that fragmentation is to work with the existing thought leaders like the OKF.
Also, Jordan Hatcher has just released a draft of the Public Domain Dedication & Licence (PDDL) and an accompanying document on open data community norms. This is also conformant with the Open Knowledge Definition:
The current draft PDDL is compliant with the newly released Science Commons draft protocol for the “Open Access Data Mark” and with the Open Knowledge Foundation’s Open Definition.
Furthermore, Creative Commons have recently made public a new protocol called CCZero, which will be released in January. CCZero will allow people to:
(a) ASSERT that a work has no legal restrictions attached to it, OR
(b) WAIVE any rights associated with a work so it has no legal restrictions attached to it,
and
(c) “SIGN” the assertion or waiver.
All of this is fantastic news for open data!
Keeping “Open” Libre
November 20th, 2007
Last week I attended the Jornadas gvSIG, the developer/user gathering for the open source GIS project supported by the regional government in Valencia. There seems to be a very supportive climate towards free software and open licensed data in Spain. I was impressed to hear people from commercial consultancies and local government information and infrastructure departments talking so strongly about software libre and the need to compartir el conocimiento, where tecnologia proprietaria has no place in a proyecto cooperativo. Government is increasingly moving toward an explicit Creative Commons based open licensing approach to public data and its Spatial Data Infrastructure - census data, political and administrative shapes, street networks and aerial imagery - all kinds of geographic information, open and libre.
Our household only knows Indo-European languages, but we can’t think of a language other than English in which the distinction between libre (free) and gratis (free) isn’t made explicitly. Talk of datos libres or freie daten has both rhetorical strength and public plausibility in a way that free, in English, does not. The term “open source software” originally came about as a softening of the term “free software”, in an attempt to introduce a non-radical plausibility. Free and Open Source software can be essentially the same thing, under a different name, open licensed in the same way.
In the last few weeks I’ve heard of Google’s launch of “OpenSocial” and its bootstrapping of the “Open Handset Alliance”. The latter, certainly, is based on patent/license-encumbered hardware and does not offer an “Open Platform” that will run on more truly libre telephony hardware platforms such as OpenMoko. How libre is “open”, in these cases? How libre can a system be that relies on data formats and hardware recipes that require royalties and/or membership of a consortium in order to use them?
In such circumstances I am very glad of an effort like opendefinition.org, which attempts to describe a yardstick by which the libre qualities of open data, data services, data formats and works can be assessed. I hope that, in helping to keep the definition of usefully “open” clear, this may help to keep open free.
How to Develop Geodata Domain Models
April 20th, 2007
Jo Walsh (who’s also a member of the Open Knowledge Foundation) has written a great post over on the mappinghacks blog about the development of a new data model for OpenStreetMap. Though focused on the issue of modelling geodata, the points she raises, particularly in relation to ‘Audit’ functionality (change tracking, versioning etc), are applicable to many other areas of open knowledge development and I strongly recommend reading the original post in full.
Copyright not applicable to geodata?
April 1st, 2007
Over the last couple of weeks, I’ve heard new questions and opinions about open licensing of geographic information, coming from several different directions. Specifically:
- Local and regional authorities in Italy and in New Zealand among others, have been looking into whether it is appropriate to use a Creative Commons license for geodata.
- Richard Fairhurst of the OpenStreetmap project is attempting to find out whether database right, rather than CC-style copyright, is a potential option for open licensing its data.
- Chris Holmes, having submitted repackaged public domain data with service configurations (the whole lot under a CC-SA license) to the OSGeo geodata repository, has been seeking informal legal advice from Science Commons, the data licensing arm of Creative Commons.
Chris’s email to the osgeo/geodata list offers some context and the conclusion that copyright-based licenses are inapplicable to geographic information in its state as a “collection of facts”. CC, by this reading, just does not apply to geographic data (though it may apply to a rendered map as a creative expression of the underlying facts). In using a copyright-based license for open data, we risk imposing constraints that are new and unenforceable.
… the Science Commons initiative is about getting science data more available, which unlike geospatial data is something that traditionally has been available for all, only published papers about the data were under copyright. So they would be very hesitant to create a regime for data licensing that would make it easier for people to put more restrictions on their data. They are launching a ‘facts are free’ campaign soon to get across to the world that one can’t copyright scientific data.
The Science Commons FAQ on databases and copyright goes into more detail on to-CC-or-not-to-CC for “factual” information. It mentions that the Creative Commons licenses specific to Belgium and the Netherlands include the database right, but other territory-specific European CC licenses do not. If a Belgium-specific OpenStreetmap clone were to use this license, could it be safely recombined with the global body of open geodata, or not?
Richard pointed to this cogent paper going through some of the relevant case law for database right as it impacts geodata. I’m reminded of James Boyle’s classic FT article on the European Commission’s assessment of the negative economic impact of database right in Europe.
- What does all this imply about the use of Crown Copyright to cover state-collected geographic information in the UK, Canada and elsewhere?
- If a CC- or GPL- derived, copyright-based license is not apt for geodata, what standard forms of “click-use contract” can be recommended now to state agencies looking to provide open access to geodata?
Open Knowledge 1.0 Nearly Here
March 8th, 2007
Open Knowledge 1.0, which takes place on Saturday March the 17th at Limehouse Town Hall in London, is now just over a week away. While there are still some places left we are nearing capacity so, if you would like to come, we advise you to register as soon as possible via: http://www.okfn.org/okcon/register/
Open Knowledge 1.0
- When: Saturday 17th March 2007, 11am until 6:30pm (Doors open at 1030)
- Where: Limehouse Town Hall, 646 Commercial Road, London, E14 7HA.
- Programme: http://www.okfn.org/okcon/programme/
- Registration: http://www.okfn.org/okcon/register/
- Wiki: http://okfn.org/wiki/okcon/
On the 17th March 2007 the first all-day Open Knowledge event is taking place in London. This event will bring together individuals and groups from across the open knowledge spectrum and includes panels on open media, open geodata and open scientific and civic information.
The event is open to all but we encourage you to register because space is limited. A small entrance fee of £10 is planned to help pay for costs but concessions are available.
Speakers
Open Scientific and Civic Data
- Tim Hubbard, leader of the Human Genome Analysis Group at the Sanger Institute
- Peter Murray-Rust, Professor in the Unilever Centre for Molecular Science Informatics at Cambridge University
- John Sheridan, Head of e-Services at the Office of Public Sector Information
Geodata and Civic Information
- Ed Parsons, until recently CTO of the Ordnance Survey
- Steve Coast, founder of Open Street Map
- Charles Arthur, freeourdata.org.uk and Technology Editor of the Guardian
Open Media
- Paula Ledieu, formerly Director of the BBC’s Creative Archive project and now Managing Director and Director of Open Media for Magic Lantern Productions
- Susana Noguero and Olivier Schulbaum of Platoniq
- Zoe Young of http://www.transmission.cc/
Open Space
Lightning talks and mini-presentations. See: http://okfn.org/wiki/okcon/
Theme: Atomisation and Commercial Opportunity
Discussions of ‘Open Knowledge’ often end with licensing wars: legal arguments, technicalities, and ethics. While those debates rage on, Open Knowledge 1.0 will concentrate on two pragmatic and often-overlooked aspects of Open Knowledge: atomisation and commercial possibility.
Atomisation on a large scale (such as in the Debian ‘apt’ packaging system) has allowed large software projects to employ an amazing degree of decentralised, collaborative and incremental development. But what other kinds of knowledge can be atomised? What are the opportunities and problems of this approach for forms of knowledge other than Software?
Atomisation also holds a key to commercial opportunity: unrestricted access to an ever-changing, atomised landscape of knowledge creates commercial opportunities that are not available with proprietary approaches. What examples are there of commercial systems that function with Open Knowledge, and how can those systems be shared?
Bringing together open threads from Science, Geodata, Civic Information and Media, Open Knowledge 1.0 is an opportunity for people and projects to meet, talk and plan things.
Opening Up Ancient Geodata: The Barrington Atlas II
January 22nd, 2007
I’ve written previously about the Barrington Atlas of the Ancient World which took 12 years to produce (1988-2000). It’s a wonderful example of interdisciplinary collaboration using, as it did, the talents of a multitude of classical scholars as well as many cartographers. In that earlier post I pointed out that, unfortunately, none of the underlying geodata was available (only the images) and that the atlas had a fairly hefty price tag, even though much of the work was ‘up-front’ funded and digital distribution would be practically zero-cost.
However, it seems that, thanks to a generous grant from the National Endowment for the Humanities, the Ancient World Mapping Center at UNC (http://www.unc.edu/awmc/) have been able to start a project entitled Pleiades to:
provide on-line access to all information about Greek and Roman geography assembled by the Classical Atlas Project for the Barrington Atlas of the Greek and Roman World (R. Talbert, ed., Princeton, 2000). Pleiades will also enable large-scale collaboration in order to maintain and diversify this dataset. Combining open-content approaches (like those used by Wikipedia) with academic-style editorial review, Pleiades will enable anyone, from university professors to casual students of antiquity, to suggest updates to geographic names, descriptive essays, bibliographic references and geographic coordinates.
This is fantastic news and I do hope that all the geodata, both that already in the atlas and that which will be contributed, will be made open by attaching an appropriate open license. It would be a real shame if this were one of those classic cases: “you’re free to read this (at least as long as this website stays around) but not to reuse or redistribute it”.
The Tragedy of the Enclosed Lands
January 13th, 2007
Normally I don’t like to blog things solely in order to propagate them, adding no commentary. But The Tragedy of the Enclosed Lands is about the most lucid, personal and usefully social explanation of the consequences of the lack of open access to geodata in the UK, that I have ever read online.
There are thousands of applications like this, waiting to be fulfilled. Where is the real justification for a proprietary geodata access policy?
INSPIRE: Where Next?
November 24th, 2006
The OKF has been very actively involved in the publicgeodata campaign on the INSPIRE directive. Now that it appears a compromise between all of the parties (the European Commission, Council and Parliament) has been reached, it is natural to ask ourselves both: Where next? and How did we do?
Where Next
The immediate point to make here is that, on the issues we care about, the compromise allows national law makers to exercise a lot of discretion in how they implement the Directive. From our point of view this means there’s plenty to fight for at the national level, as INSPIRE will need to be ‘transposed’ into each national law. Any optionality is another chance to obtain more ‘open’ legislation as well as an opportunity to make the case for the social and commercial benefits of open geodata.
How Did We Do
All in all I think the campaign has been a tremendous success. Ok, so we didn’t manage to achieve a total u-turn in European geodata policy but
(a) We can do that next time
(b) Though not perfect, INSPIRE is an improvement over the status quo (non-open/unfree geodata is currently the norm across Europe)
What we did achieve with a campaign that was zero-budget, entirely dependent on spare bits of volunteers’ time, and only started when the directive was already at second reading was:
- A petition that was signed by over 7000 citizens from across the EU
- Letters to MEPs and national ministries making the case for open geodata along with personal contact with many of the parties involved (MEPs, civil servants etc). This will stand us in good stead in the future and likely had some impact on the compromise that was eventually reached.
- The dissemination and analysis of a large amount of information about what was happening (particular credit to you here Benjamin Henrion)
- Link ups with other campaigning groups such as the UK’s freeourdata
So, all in all, I think there is plenty to be proud of, which should give us heart as we prepare ourselves to take the campaign on to the national level.
Mashing up is hard to do
November 2nd, 2006
Mashups, what. Two or more data sources or works combined to become a new data source or work. A media cultural term (cf Steinski) now applied to web applications; comparable to the tradition of data overlay in cartography and analysis mapping. Mashup is also a curious marketing phenomenon.
Interoperability, what. The exchange and reuse of data and code from machine to machine without human intervention or interpolation, by means of translatable or common interfaces and description languages. The metaphor is lego bricks with interoperable bumps and notches.
Business model, what. A license to extract profit from an information imbalance. Or, a means of sustaining and rewarding participation in a growing activity.
The UK Geospatial Mashups event at the Ordnance Survey on the 20th October was a meeting of the tribes, but I found it a surprisingly business-oriented affair; Mikel Maron captured the general tone in his writeup. In the closing panel session there was much talk of business models based on the ability to “mash up” different sources of information from online services. There really seemed to be an acceptance that the open and collaborative production of data and knowledge allows for the provision of better services and creation of more value.
Every question about business models had an unspoken coda and seemed like a half-finished sentence. “How can we find a business model for this (that is based on providing open access to data)?” “How can we pay for the cost of collection and maintenance (of a body of data that is openly and freely available)?”.
For me, the answer to that question is contained in another question asked that day, by a local council representative in the audience. They are obliged to buy a UK-wide license for postcode geocoding and could in theory just use the 25-per-day “free” lookups on the Post Office’s online service in their own online applications. The local council doesn’t need a National Spatial Data Infrastructure to exist.
Yet through their existing knowledge base about addresses, postcodes, locations and the fact that they manage a lot of public services which collect often literal “drive-by data” about the locality - social workers on visits, refuse collection services, disabled transport services - the local council is in a good position to collect and maintain its own body of changes. A positive analogy is a solar-powered household or neighborhood selling its electricity surplus back to the National Grid. The current situation equates to giving the surplus away and then having to buy back a subscription to it.
There’s already at least one local authority contributing recommended bicycle route data, which isn’t on centralised maps, to OpenStreetmap. A piece by Charles Arthur in today’s Guardian, From postcodes to roads, we can collect it ourselves, talks of a new, commercially-based open mapping project as well as two complementary efforts to build a copyright-free postcode location database.
The best way to make the case is to demonstrate the power of collaborative mapping, and how the willingness and dedication of just a few ordinary citizens can produce tremendously useful public resources, free for reuse and redistribution. The OpenStreetmap project and its expanding community are demonstrating this approach very well. At the UK Geospatial Mashups event, Sean Phelan, founder of Multimap, said: “I am completely convinced by the OpenStreetmap presentation that OpenStreetmap is viable”. (The quality of the editing environment in JOSM and the presentation quality of Osmarender both seemed very compelling to the audience.) I hope all these efforts will go on to demonstrate that data quality really can be had in a free, peer-based form where a lot of contributions are artefacts of other work.
The social practice for the open and collaborative production of geodata is there already, whether or not there is a visible “business model” accompanying it; if money is needed to support knowledge-generating activities directly, then some business maintenance model has to be found, but perhaps support for them can be a byproduct of other business…
Collaborative and public geodata
August 20th, 2006
Chris Holmes’ words on “Why isn’t collaborative geodata a big deal already” got me thinking about how some properties of the world can be observed - like street shapes and names - while others can’t, and have to be transmitted - like postal codes and administrative boundaries. A GPS unit and a lot of goodwill will get you some way, but there are a lot of missing pieces.
In the US people don’t appreciate the wealth of data they have; in Europe people don’t realise quite how much they can’t get done. Collaborative mapping is a middle way that has yet really to catch on - there’s either no pressure for it or not enough reference data to act as a framework for it. I hope the current interest in “data mash-ups” in the UK prefigures a movement towards that middle way.
At a European level the legislative discussion over the public right to explore and reuse state-collected geodata continues, with a final vote in Parliament expected “in the early autumn”. Public Geodata is sending another Open Letter, to Ministers in the Council about their viewpoint on the INSPIRE Directive establishing a framework for European spatial data infrastructure going into the conciliation process before Third Reading in Parliament.
Technically, a lot of the “data infrastructure” problem has been about uncertainty in discovery / search / exchange protocols - no shared understanding of base metadata models. I hope the recent work being done at OSGeo on Simple Catalog Interfaces can feed into this usefully somehow, along with the tile distribution project further up the stack, in making these “SDI” interfaces and concepts genuinely more usable by citizen developers and potential contributors and ground-truthers.
