
You are browsing the archive for Open Geodata.

Ordnance Survey to open up UK geospatial data

Jonathan Gray - November 19, 2009 in News, Open Data, Open Geodata, Open Government Data, Policy

In a press release earlier this week, it was announced that there will be moves to open up geospatial data produced by the Ordnance Survey:

The Prime Minister and Communities Secretary John Denham will today announce that the public will have more access to Ordnance Survey maps from next year, as part of a Government drive to open up data to improve transparency.

While in the past Ordnance Survey have made limited data available with restrictions on how it can be used (such as via the OpenSpace API), it looks like the new material will be open as in the Open Knowledge Definition, meaning it can be used for any purpose, including commercial:

Data relating to electoral and local authority boundaries as well as postcode areas would be released for free re-use, including commercially. Mid-scale digital mapping information would also be released in the same way.

At the Open Knowledge Foundation, we believe there is a plethora of social and economic benefits to making data open. In a similar vein, the release says:

Making public data available also enables people to reuse it in different and more imaginative ways than may have originally been intended. Estimates suggest that this could generate as much as a billion pounds for the UK economy.

For example developers might use this information alongside other Government data about transport, health or education, for services that generate economic and social value.

Openness of data is as important for local government as it is for national government – making people more connected to their community and giving them the tools to demand action on issues that matter. Releasing council records in re-usable form could mean that citizens can find out everything from the council accounts to the number of streetlights and community wardens, to when the rubbish is collected and the hedges trimmed.

The news was also reported by the BBC and the Guardian.

The relevant Ordnance Survey data is set to be released in April 2010. Very exciting news!

Open Knowledge Conference (OKCon) 2010: Call for Proposals

Jonathan Gray - November 10, 2009 in Events, News, OKCon, Open Access, Open Data, Open Geodata, Open Government Data, Open Knowledge Definition, Open Knowledge Foundation, Open Science

The Open Knowledge Conference (OKCon) 2010 Call for Proposals is now open!

We would be grateful for help in circulating the call to relevant lists and communities! You can reuse or point to:

Open Knowledge Conference (OKCon) 2010: Call for Proposals

Introduction

OKCon, now in its fifth year, is the interdisciplinary conference that brings together individuals from across the open knowledge spectrum for a day of presentations and workshops.

Open knowledge promises significant social and economic benefits in a wide range of areas from governance to science, culture to technology. Opening up access to content and data can radically increase access and reuse, improving transparency, fostering innovation and increasing societal welfare.

This is a time of great change. In addition to high profile initiatives such as Wikipedia, OpenStreetMap and the Human Genome Project, there is enormous growth among open knowledge projects and communities at all levels. Moreover, in the last year, governments across the world have begun opening up huge amounts of their data.

And it doesn’t stop there. In academia, open access to both publications and data has been gathering momentum, and similar calls to open up learning materials have been heard in education. Furthermore, this gathering flood of open data and content is the creator and driver of massive technological change. How can we make this data available, how can we connect it together, how can we use it to collaborate and share our work?

Join us to discuss all of this and more!

Topics

We welcome proposals on any aspect of creating, publishing or reusing content or data that is open in accordance with opendefinition.org. Topics include but are not limited to:

Technology

  • Semantic Web and Linked Data in relation to open knowledge
  • Platforms, methods and tools for creating, sharing and curating open knowledge
  • Light-weight, adaptive interaction models
  • Open, decentralized social network applications
  • Open geospatial data

Law, Society and Democracy

  • Open Licensing, Legal Tools and the Public Domain
  • Open government data and content (public sector information)
  • Open knowledge and international development
  • Opening up access to the law

Culture and Education

  • Open educational tools and resources
  • Business models for open content
  • Incentives and rewards for open-knowledge contributors
  • Open textbooks
  • Public domain digitisation initiatives

Science and Research

  • Opening up scientific data
  • Supporting scientific workflows with open knowledge models
  • Open models for scientific innovation, funding and publication (‘open-access’)
  • Tools for analysing and visualizing open data
  • Open knowledge in the humanities

Important Dates

  • Submission deadline: January 31st 2010
  • Notification of acceptance: March 1st
  • Camera-ready papers due: March 31st
  • OKCon: April 24th 2010

Submission Details

We are accepting three types of submissions:

  1. Full papers of 5-10 pages describing novel strategies, tools, services or best-practices related to open knowledge.
  2. Extended talk abstracts of 2-4 pages focusing on novel ideas, ongoing work and upcoming research challenges.
  3. Proposals for short talks and demonstrations.

OKCon will implement an open submission and reviewing process. To make a submission visit:

Depending on the assessment of the submissions by the programme committee and external reviewers, submissions will be accepted either as full, short or lightning/poster presentations.

Proceedings of OKCon will be published at CEUR-WS.org. If you want your submission to be included in the conference proceedings, you will need to prepare a manuscript according to the LNCS Style.

Programme Committee

  • Sören Auer, AKSW/Universität Leipzig
  • Christopher Corbin, UK Advisory Board on Public Sector Information (APPSI)
  • Adnan Hadzi and Andrea Rota, Department of Media and Communications, Goldsmiths College, University of London
  • Claudia Müller-Birn, Carnegie Mellon University
  • Peter Murray-Rust, University of Cambridge
  • Rufus Pollock, Open Knowledge Foundation and Emmanuel College, University of Cambridge
  • John Wilbanks, Science Commons

OpenFlights data released under Open Database License (ODbL)

Jonathan Gray - October 14, 2009 in Exemplars, External, News, Open Data, Open Data Commons, Open Geodata

OpenFlights

OpenFlights is a site for “flight logging, mapping, stats and sharing”.

We’re very pleased to hear they’ve just released their data under the Open Database License (ODbL):

One of OpenFlights’ most popular features is our dynamic airport and airline route mapping, and today, we’re proud to release the underlying data in an easy-to-use form, up to date for October 2009. Behold 56749 routes between 3310 airports on 669 airlines spanning the globe.

The data can be downloaded from our Data page and is free to use under the Open Database License.
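As a rough illustration of how a route dump like this might be explored, the sketch below counts routes, distinct airports and distinct airlines from comma-separated rows. The column layout (airline code, source airport, destination airport) and the sample rows are assumptions made for this example, not the documented OpenFlights format; check the Data page for the real layout before using it.

```python
import csv
import io

# Hypothetical sample rows: airline code, source airport, destination airport.
# The real OpenFlights files may use a different column layout.
SAMPLE = """BA,LHR,JFK
BA,JFK,LHR
AF,CDG,JFK
AF,CDG,LHR
"""


def summarise_routes(text):
    """Count routes, distinct airports and distinct airlines in CSV text."""
    routes = 0
    airports = set()
    airlines = set()
    for row in csv.reader(io.StringIO(text)):
        if len(row) < 3:
            continue  # skip blank or malformed rows
        airline, source, dest = row[0], row[1], row[2]
        routes += 1
        airlines.add(airline)
        airports.update((source, dest))
    return {
        "routes": routes,
        "airports": len(airports),
        "airlines": len(airlines),
    }


print(summarise_routes(SAMPLE))
```

Run against the full download, with the column indices adjusted to the real layout, the same function should give figures comparable to the totals quoted above.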

See also the OpenFlights package on CKAN:

Where is the nearest bus stop? UK Department for Transport adds NaPTAN data to Open Street Map

Jonathan Gray - August 20, 2009 in Exemplars, External, News, Open Data, Open Geodata, Open Government Data

Bus stop by Ti.mo on Flickr

The UK’s Department for Transport (DfT) has recently released data from the National Public Transport Access Node (NaPTAN) database to be put on Open Street Map (OSM).

As it says on the NaPTAN website:

> NaPTAN provides a unique identifier for every point of access to public transport in the UK, together with meaningful text descriptions of the stop point and its location.

The NaPTAN page on the Open Street Map wiki says the data contains:

> [...] details of some 350,000 public transport access points in Great Britain including bus stops, railway stations, tram stops and ferry terminals. This data includes a name, geocode, official code and other information useful to the project. The data set includes both the physical points of access to transport (Platforms, bus stops, airport gateways, etc), the interchanges (Stations, Airports, Ports, Clusters, etc) and the entrances to the interchanges from the street or public thoroughfare [...]

While the main NaPTAN database has restrictions on commercial use (see the naptan package page on CKAN, added at our Workshop on Public Information last autumn), under a special arrangement with the DfT and Traveline, the Open Street Map Foundation has been given access to the database to import useful and relevant data to Open Street Map to be made open under the terms of their license, the Creative Commons Attribution Sharealike license.

An email from Roger Slevin at the DfT earlier this year allays the concern that Ordnance Survey may claim rights in the data:

> I am conscious that some concern has been expressed about whether OS has any rights to the NaPTAN data (or NPTG) – and I can assure the OSM community that the Department for Transport has been assured by Ordnance Survey that they do not claim any rights over NaPTAN location data – and it is a matter of record that Department for Transport is the owner of the NPTG database. Both NaPTAN and NPTG are maintained by DfT as national databases, collating data from all local transport authorities in England, Wales and Scotland.

Further details of the import agreement are available at:

The NaPTAN data is currently being converted to OSM format, imported by county and merged with OSM data. The first county, West Midlands, was uploaded at the end of March and data for Greater London was uploaded on Monday. It is planned to have data for the whole of the UK by the end of the year. This will mean that Open Street Map should have bus stops and other public transport points for the whole of Great Britain!

A full list of NaPTAN data added to OSM is available at:

We’ve added a package page to CKAN at:

If you are based in the UK and interested in helping out – you can check that the data in your local area is correct, as there are some ghost stops in the data, and duplicates where transport access points were previously added to OSM!
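One way to spot likely duplicates is to flag imported stops that fall within a few metres of a stop already mapped in OSM. The sketch below does this with the haversine great-circle distance. The record layout, the 25 m threshold and the sample coordinates are illustrative assumptions, not part of the actual NaPTAN import tooling.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def likely_duplicates(imported, existing, threshold_m=25.0):
    """Pair each imported stop with any existing stop within threshold_m."""
    pairs = []
    for imp in imported:
        for ex in existing:
            d = haversine_m(imp["lat"], imp["lon"], ex["lat"], ex["lon"])
            if d <= threshold_m:
                pairs.append((imp["name"], ex["name"], round(d, 1)))
    return pairs


# Hypothetical records, made up for this example.
imported_stops = [
    {"name": "Stop A (import)", "lat": 51.5000, "lon": -0.1000},
    {"name": "Stop B (import)", "lat": 51.5100, "lon": -0.1200},
]
existing_stops = [
    {"name": "Stop A (mapped)", "lat": 51.5001, "lon": -0.1000},
    {"name": "Stop C (mapped)", "lat": 51.5200, "lon": -0.1500},
]

for pair in likely_duplicates(imported_stops, existing_stops):
    print(pair)
```

The nested loop is fine for checking a single neighbourhood. A county-sized import would want a spatial index instead, and any flagged pair still needs a human check on the ground.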

This is excellent news – and big kudos to the DfT for donating the data! We hope that other departments consider following suit and adding their geodata to OSM!

The import was supported by Ideas in Transit, which is “a five-year project that applies User Innovation to the transport challenges faced by individuals and society”. For more on their Open Street Map related activity, you can see the Ideas in Transit page on the OSM wiki.

Detail of OSM showing transport access points

Open Plaques: open data about UK heritage sites

Jonathan Gray - August 11, 2009 in Exemplars, External, Open Data, Open Geodata, Public Domain

Open Plaques

Open Plaques is a project to find and document all the UK’s blue heritage plaques, which commemorate sites where famous events occurred, or with a connection to notable historical figures.

There are currently over 1700 plaques, which can be browsed by area, by person, by role or by organisation. Though the project is currently in alpha the idea is that anyone will be able to add or edit plaques, and display photos uploaded to Flickr. We hope there will be participation from local history groups, schools and so on!

On the data page, they state that all the data is in the public domain:

We consider the data to be Public Domain, and make no claims of copyright over either the data we’ve collected ourselves, nor the value we’ve added to existing data. That said, we can accept no liability for any issues that may arise over the re-use of this data, and you’re advised to make your own assessment. If you do re-use the data, we’d love it if you could acknowledge Open Plaques, and link back to us – however you are under no obligation to do so.

As there is a field for the plaques’ coordinates, you can view the locations of the plaques on Open Street Map in a given region, for example in Worthing, Bath or Birmingham.

Open Plaques screenshot


Open Data: Openness and Licensing

Rufus Pollock - February 2, 2009 in Ideas and musings, Open Data, Open Definition, Open Geodata, Open Knowledge Definition, Open Science, Policy

Why does this matter?

Why bother about openness and licensing for data? After all they don’t matter in themselves: what we really care about are things like the progress of human knowledge or the freedom to understand and share.

However, open data is crucial to progress on these more fundamental items. It’s crucial because open data is so much easier to break-up and recombine, to use and reuse. We therefore want people to have incentives to make their data open and for open data to be easily usable and reusable — i.e. for open data to form a ‘commons’.

A good definition of openness acts as a standard that ensures different open datasets are ‘interoperable’ and therefore do form a commons. Licensing is important because it reduces uncertainty. Without a license you don’t know where you, as a user, stand: when are you allowed to use this data? Are you allowed to give it to others? To distribute your own changes?

Together, a definition of openness plus a set of conformant licenses delivers clarity and simplicity. Not only is interoperability ensured but people can know at a glance, and without having to go through a whole lot of legalese, what they are free to do. (For more see this article and this post).

Thus, licensing and definitions are important even though they are only a small part of the overall picture. If we get them wrong they will keep on getting in the way of everything else. If we get them right we can stop worrying about them and focus our full energies on other things.

Background

Over the last couple of years there has been substantial discussion about the licensing (or not) of (open) data and what ‘open’ should mean. In this debate there are two distinct, but related, strands:

  1. Some people have argued that licensing is inappropriate (or unnecessary) for data.
  2. Disagreement about what ‘open’ should mean. Specifically: does openness allow for attribution and share-alike ‘requirements’ or should ‘open’ data mean ‘public domain’ data?

These points are related because arguments for the inappropriateness of licensing data usually go along the lines: data equates to facts over which no monopoly IP rights can or should be granted; as such all data is automatically in the public domain and hence there is nothing to license (and worse ‘licensing’ amounts to an attempt to ‘enclose’ the public domain).

However, even those who think that open data can/should only be public domain data still agree that it is reasonable and/or necessary to have some set of community ‘rules’ or ‘norms’ governing usage of data. Therefore, the question of what requirements should be allowed for ‘open’ data is a common one, whatever one’s stance on the PD question.

Of course, even with agreement on requirements, there is still the question of whether these should be ‘enforced’ through a license or via community norms. To summarize, the three main questions are:

Qu 1: Is it important to license?

Qu 2: What ‘restrictive’ requirements are compatible with openness? In particular does ‘open’ equate to PD only or are attribution and share-alike ‘requirements’ permitted?

Qu 3: Community norms or licenses? Should ‘community norms’ or license terms be used in order to encode requirements such as attribution and share-alike?

Below I look at each of these in turn, laying out, as I see it, the current consensus and expressing my own view.

Question 1: Is it Important to License?

The simple answer here is yes. Whether one likes it or not there are a whole bunch of jurisdictions where there are IP rights in data(bases). Note that this does not imply any monopoly rights in any facts that data represents.

Thus, even if you just want your data to be in the ‘public domain’, you need to apply a license — or something very closely resembling a license. (A suitable example is the Open Data Commons Public Domain Dedication and License).

Question 2: What Should Openness Allow?

Despite the sometimes heated discussion, there is, in fact, broad agreement: openness means freedom to use and reuse data in any way you wish. The only debate is over what, if any, conditions can be imposed when allowing use and reuse. In particular, following the example of the software and content domains, the following two items have been proposed as permissible exceptions to the basic rule of ‘allow everything’:

  1. Requirement of attribution (in a non-burdensome manner)
  2. Requirement to share-alike (a reuser of share-alike material must, when making publicly available their own material, make it openly available under a similar share-alike license)

Attribution

Everyone agrees that requiring attribution is OK. Furthermore, it is also now generally accepted that having this requirement in a license is not a problem.

(In the original Protocol for Implementing Open Access Data attribution was alleged to be problematic due to a potential for ‘attribution stacking’. However, these concerns appear to have been allayed. To my mind, it was never clear why data needed to be different: code and content both have plenty of examples of projects with many contributors, much reuse and an attribution requirement).

Share-Alike

Share-alike provisions are more controversial. It has been argued that share-alike conditions are problematic because of the potential for incompatibility between two share-alike licenses (or community norms). At the same time share-alike may provide an important incentive for individuals and communities to make their data openly available since it provides some assurance that this data will remain open. Thus, any evaluation comes down to the balance between:

  1. The costs, if any, of allowing share-alike in terms of e.g. complexity and compatibility.

  2. The benefits, if any, that share-alike provides by encouraging the creation of open data in the first place and in ensuring subsequent ‘sharing back’ by those who build upon that data.

In my view the benefits are substantial while the costs are not. Incompatibility can largely be avoided by only ‘approving’ share-alike licenses that are compatible. At the same time, share-alike enshrines a principle that is important to many communities in the code and content spheres and the same seems true of data (consider e.g. Open Street Map).

(Aside: it is important to emphasize that permitting share-alike does not mean it must be used. In fact, a particular community could recommend against using share-alike as, for example, the Python community does for code hoping to make it into its standard library.)

Question 3: Licenses versus Community Norms

Even if a basic license is used it can be argued that any ‘requirements’ for attribution or share-alike should not be in a license but in ‘community norms’. So which is best?

In my view, when making available data, licenses are much better than community norms. Why?

  1. A license is always needed even if you are taking a PD approach. So ‘norms’ don’t obviate the need to license.
  2. A license is able to encode ‘norms’ both formally and informally (for example, in a preamble — cf. the GPL).
  3. A license is likely to elicit at least as much, and almost certainly more, conformity with its provisions than community norms. This is especially true outside of the community. The future is likely to see a much more mixed data landscape, whether in science or elsewhere, with many ‘non-community’ (non-academic) users in business and among ordinary citizens. (Note also that for these groups the simplicity and formality of a license makes it superior to ‘norms’ in almost every respect: transparency, certainty etc.)
    • If there are concerns that, in some jurisdictions, the absence of ‘data’ rights make e.g. share-alike provisions unenforceable nothing is lost by using a license: the license de facto reverts to the status of a community norm and any concerns regarding “false expectations” can easily be dealt with by a simple warning.

Flexibility: some have argued that ‘norms’ are more ‘flexible’ than licenses. I’m not clear what this really means:

  • Flexible = not enforceable. Perhaps true but I am unclear why this is an advantage (even to a user it is easy to comply with the open license)
  • Flexible = leeway around the edges. For example I won’t get in trouble if I don’t attribute quite right. But this is true of licenses too: it is very unlikely anyone gets sued for a minor error in attribution and even with share-alike no court is likely to award damages for a mistake made in good faith — especially if it can be easily corrected.
  • Flexible = fuzzy. Fuzziness does not seem an attractive property when sharing data — both sharer and sharee want clarity.
  • Flexible = easily changed. Allowing major changes is a serious problem both for licensors and licensees (certainty and clarity would disappear). For minor changes licenses are just as good.

Thus, in every respect I can think of, licenses are superior to community norms when making available open data.

Conclusion

Summarizing the conclusions from the above discussion we have:

Qu 0: Does this matter?

Yes. A good definition of openness and the use of some form of licensing is crucial to a healthy future for the open data community (and that will include pretty much everyone …).

Qu 1: Is it important to license?

Ans: A ‘license’ is always necessary — even if you advocate a PD-only approach. There is too much variation (and uncertainty) about what the IP situation is across the world to just go with the default. All providers of data should apply some kind of license or PD dedication.

Qu 2: What ‘restrictive’ requirements are compatible with openness? In particular does ‘open’ equate to PD only or are attribution and share-alike ‘requirements’ permitted?

Ans: Both attribution and share-alike should be permitted. Attribution is widely agreed to be acceptable. The second, ‘share-alike’ is more controversial, but in my view should be allowed: there is no reason to break with the precedent set in code and content domains and its benefits seem substantial while costs are minimal if licenses are correctly managed.

Qu 3: Community norms or licenses?

Ans: Use licenses when making available data. Licenses provide all the benefits of community norms in terms of explicitly encoding the preferences of a community. At the same time they deliver greater clarity and transparency and, in many jurisdictions, provide a legal enforceability which norms do not with regard to requirements of attribution or share-alike.

Colophon

This essay comes out of ongoing discussions over the last few years with a large assortment of communities and individuals. The primary motivation for sitting down and pulling the threads together came out of reading Michael Nielsen’s post on The role of open licensing in open science (+ thread) and recent emails with John Wilbanks of Science Commons on the Open Definition coord list.

Related work and earlier discussion on this matter include:

Good news for open data: Protocol for Implementing Open Access Data, Open Data Commons PDDL and CCZero

Jonathan Gray - December 17, 2007 in External, News, Open Access, Open Data, Open Geodata, Open Knowledge Definition, Open Knowledge Foundation

Last night Science Commons announced the release of the Protocol for Implementing Open Access Data:

The Protocol is a method for ensuring that scientific databases can be legally integrated with one another. The Protocol is built on the public domain status of data in many countries (including the United States) and provides legal certainty to both data deposit and data use. The protocol is not a license or legal tool in itself, but instead a methodology for a) creating such legal tools and b) marking data already in the public domain for machine-assisted discovery.

As well as working closely with the Open Knowledge Foundation, Talis and Jordan Hatcher, Science Commons have spent the last year consulting widely with international geospatial and biodiversity scientific communities. They’ve also made sure that the protocol is conformant with the Open Knowledge Definition:

We are also pleased to announce that the Open Knowledge Foundation has certified the Protocol as conforming to the Open Knowledge Definition. We think it’s important to avoid legal fragmentation at the early stages, and that one way to avoid that fragmentation is to work with the existing thought leaders like the OKF.

Also, Jordan Hatcher has just released a draft of the Public Domain Dedication & Licence (PDDL) and an accompanying document on open data community norms. This is also conformant with the Open Knowledge Definition:

The current draft PDDL is compliant with the newly released Science Commons draft protocol for the “Open Access Data Mark” and with the Open Knowledge Foundation’s Open Definition.

Furthermore, Creative Commons have recently made public a new protocol called CCZero which will be released in January. CCZero will allow people to:

(a) ASSERT that a work has no legal restrictions attached to it, OR
(b) WAIVE any rights associated with a work so it has no legal restrictions attached to it,
and
(c) “SIGN” the assertion or waiver.

All of this is fantastic news for open data!

Keeping “Open” Libre

jwalsh - November 20, 2007 in Ideas and musings, Open Geodata, Open Knowledge Definition, Open/Closed

Last week I attended the Jornadas gvSIG, the developer/user gathering for the open source GIS project supported by the regional government in Valencia. There seems to be a very supportive climate towards free software and open licensed data in Spain. I was impressed to hear people from commercial consultancies and local government information and infrastructure departments talking so strongly about software libre and the need to compartir el conocimiento, where tecnologia proprietaria has no place in a proyecto cooperativo. Government is increasingly moving toward an explicit Creative Commons based open licensing approach to public data and its Spatial Data Infrastructure – census data, political and administrative shapes, street networks and aerial imagery – all kinds of geographic information, open and libre.

Our household only knows about Indo-European languages, but can’t think of another language than English where a distinction between libre (free) and gratis (free) isn’t explicitly made. Talk of datos libres or freie daten has both rhetorical strength and public plausibility in a way in which free, in English, hasn’t. The term “open source software” originally came about as a softening of the term “free software”, in an attempt to introduce a non-radical plausibility. Free and Open Source software can be essentially the same thing, under a different name, open licensed in the same way.

In the last few weeks I’ve heard of Google’s launch of “OpenSocial” and its bootstrapping of the “Open Handset Alliance”. The latter, certainly, is based on patent/license-encumbered hardware and not offering an “Open Platform” that will run on more truly libre telephony hardware platforms such as OpenMoko. How libre is “open”, in these cases? How libre can a system be that relies on data formats and hardware recipes that require royalties and/or membership of a consortium in order to use it?

In such circumstances I am very glad of an effort like opendefinition.org, which attempts to describe a yardstick by which the libre qualities of open data, data services, data formats and works can be assessed. I hope that, in helping to keep the definition of usefully “open” clear, this may help to keep open free.

How to Develop Geodata Domain Models

Rufus Pollock - April 20, 2007 in External, Open Geodata

Jo Walsh (who’s also a member of the Open Knowledge Foundation) has written a great post over on the mappinghacks blog about the development of a new data model for OpenStreetMap. Though focused on the issue of modelling geodata, the points she raises, particularly in relation to ‘Audit’ functionality (change tracking, versioning etc), are applicable to many other areas of open knowledge development and I strongly recommend reading the original post in full.

Copyright not applicable to geodata?

jwalsh - April 1, 2007 in Ideas and musings, Open Geodata

Over the last couple of weeks, I’ve heard new questions and opinions about open licensing of geographic information, coming from several different directions. Specifically:

  • Local and regional authorities in Italy and in New Zealand among others, have been looking into whether it is appropriate to use a Creative Commons license for geodata.
  • Richard Fairhurst of the OpenStreetMap project is attempting to find out whether database right, rather than CC-style copyright, is a potential option for open licensing its data.
  • Chris Holmes, having submitted repackaged public domain data with service configurations, the lot under a CC-SA license, to the OSGeo geodata repository, has been seeking informal legal advice from Science Commons, the data licensing arm of Creative Commons.

Chris’s email to the osgeo/geodata list offers some context and the conclusion that copyright-based licenses are inapplicable to geographic information in its state as a “collection of facts”. CC, by this reading, just does not apply to geographic data (though it may apply to a rendered map as a creative expression of the underlying facts). In using a copyright-based license for open data, we risk imposing constraints that are new and unenforceable.

… the Science Commons initiative is about getting science data more available, which unlike geospatial data is something that traditionally has been available for all, only published papers about the data were under copyright. So they would be very hesitant to create a regime for data licensing that would make it easier for people to put more restrictions on their data. They are launching a ‘facts are free’ campaign soon to get across to the world that one can’t copyright scientific data.

The Science Commons FAQ on databases and copyright goes into more detail on to-CC-or-not-to-CC for “factual” information. It mentions that the Creative Commons licenses specific to Belgium and the Netherlands include the database right, but other territory-specific European CC licenses do not. If a Belgium-specific OpenStreetMap clone were to use this license, could it be safely recombined with the global body of open geodata, or not?

Richard pointed to this cogent paper going through some of the relevant case law for database right as it impacts geodata. I’m reminded of James Boyle’s classic FT article on the European Commission’s assessment of the negative economic impact of database right in Europe.

  • What does all this imply about the use of Crown Copyright to cover state-collected geographic information in the UK, Canada and elsewhere?
  • If a CC- or GPL- derived, copyright-based license is not apt for geodata, what standard forms of “click-use contract” can be recommended now to state agencies looking to provide open access to geodata?