Open Data: a means to an end, not an end in itself

September 15, 2011 in Ideas and musings, Open Data

The following is a post by Rufus Pollock, co-Founder of the Open Knowledge Foundation.

In almost all the talks I give about open data or content, I aim, at least once, to make a statement along these lines:

“Openness for data and content is not an end in itself, it’s a means to an end”

This, of course, raises the question: if open data is a means and not an end in itself, what are the real ends that we are seeking?

The real ends are the improved creation, processing and use of information for the purpose of bettering our lives and the world around us: finding a better way to travel to work, understanding and addressing climate change, finding better ways to cure and prevent disease, deciding who to vote for. The list goes on and on because it includes almost anything where information, and more specifically digital information, is or could be important.

Now, there are many things that contribute to us improving the “creation, processing and use of information” but the following are especially important (and interlink):

  1. Scalability — i.e. dealing with larger and larger amounts of information
  2. Improved tools, techniques and processes for handling that information
  3. Wide access to the raw data and content

(I’d also add a fourth item: to create, process and use information in a collaborative, distributed and decentralized manner that puts ‘information power’, meaning the power to access, understand and utilize information, in the hands of the many rather than concentrating it in the hands of the few. However, I have left this out as it could be argued that this is not a requirement for improvement but an additional, and separate, desideratum.)

It is at this point that openness enters: openness — both of data and of tools — is central to making rapid progress in each of these areas:

  1. Scalability: successful ‘data scaling’ requires componentization, that is, breaking material up into maintainable chunks (components) that can be recombined. However, without openness, componentization cannot function: recombining components rapidly becomes impossible because of the need to check and clear rights with so many different sources of data (and because of incompatibilities between the conditions imposed by different sources).

  2. Tools, techniques and processes: open data makes it much easier to develop and share tools, techniques and processes for working with data. Moreover, without open data the application of those tools can be severely limited.

  3. Wider access to the material: given the vast amount of material becoming available, we’re going to want as many people as possible (and not just ‘professionals’) to be able to access, experiment with and redistribute that data as easily as possible. Remember the many minds principle: the best thing to do with your data will be thought of by someone else.

Summing Up

Open data, then, is a means to an end, not an end in itself. Openness is important to the extent that it helps us do something “useful”, not because it is valuable in and of itself.

I think it’s important to emphasize this point because, as the open data movement grows, we need to be clear that open data is not some magic potion that, on its own, will automatically solve problems. Fundamentally, to be useful, data (open or otherwise) needs to be used: it needs individuals and institutions to analyze it and to act on that analysis, it needs companies and communities to build apps and services with it, and it needs tools and processes developed to facilitate those activities.

This is not to underestimate the value of openness: as argued above, it is central to making significant progress in “doing useful stuff”, but we must also avoid the trap of confusing means with ends, and thereby neglecting the many other changes that are needed if open data is to deliver full value.

