Open Data Going Mainstream?

April 10, 2008 in Musings, Open Data, Open Knowledge

Bret Taylor’s recent post entitled “We Need a Wikipedia for Data” has been garnering a lot of attention around the blogosphere. While his suggestions are not particularly novel, the post and the attention it has received are, I think, indicative of the growing interest in (open) data and its importance for the development of related services and products.

While I am generally in agreement with Bret’s arguments, there are a few differences worth raising. First, Bret appears to favour some kind of centralized repository that everyone can read from and write to:

To this end, I think we should create a Wikipedia for data: a global database for all of these important data sources to which we all contribute and that anyone can use.

As readers of this blog will know, we’re sceptical of this ‘one ring to rule them all’ approach. It is also important to distinguish between finding material, parsing it, and plugging it together, three issues that got rather run together in the surrounding discussion. As I wrote in a comment to Bret’s post:

There seem to be several distinct issues you (and your commenters) are concerned with:

  1. Discoverability of datasets. For this you want a registry of some kind and this is exactly what the Comprehensive Knowledge Archive Network (CKAN) is designed to do. …

  2. ‘Developing’ data particularly using many contributors and a versioning (wiki-like) model. This seems a general problem and one which I wrote about in this post on the collaborative development of data back in February last year. Since then various projects have launched or developed which attempt to address this issue, even if only partially (e.g. Freebase, Swivel, Numbrary, http://www.openeconomics.net …). This then leads into:

  3. Componentizing data so that one can easily plug different datasets together rather than having to aggregate data in one big place (crudely: ‘One Ring to Rule them All’ vs. ‘Small Pieces, Loosely Joined’). After all, it seems unlikely that any one organization, however large, can hold ‘all the data’, and in any case doing so would negate the benefits of having ‘many minds’ working on a problem. It is our hope that CKAN would start to facilitate the kind of packaging that one frequently observes in software but is, as yet, fairly rare for knowledge (data/content/…). More on this can be found in this blog post on componentization plus the slides from our presentation at XTech.
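To make the difference between a registry (point 1) and componentized data (point 3) a little more concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the `Package` and `Registry` classes, the `join` helper, the field names and the sample rows are hypothetical, and none of it reflects CKAN's actual schema or API. The point is only that the registry indexes small, independently maintained packages, while combining them happens afterwards, on a shared key.

```python
from dataclasses import dataclass, field


@dataclass
class Package:
    """Hypothetical metadata record for one dataset (not CKAN's real schema)."""
    name: str
    tags: set[str]
    key: str  # column that other packages can be joined on
    rows: list[dict] = field(default_factory=list)


class Registry:
    """Toy in-memory registry: it indexes packages, it does not merge them."""

    def __init__(self) -> None:
        self._packages: dict[str, Package] = {}

    def register(self, package: Package) -> None:
        self._packages[package.name] = package

    def search(self, tag: str) -> list[str]:
        """Discoverability: names of all packages carrying the given tag."""
        return sorted(n for n, p in self._packages.items() if tag in p.tags)

    def get(self, name: str) -> Package:
        return self._packages[name]


def join(left: Package, right: Package) -> list[dict]:
    """Componentization: plug two independent packages together on a shared key."""
    if left.key != right.key:
        raise ValueError("packages do not share a join key")
    lookup = {row[right.key]: row for row in right.rows}
    # Keep only rows of `left` that have a matching row in `right`.
    return [
        {**row, **lookup[row[left.key]]}
        for row in left.rows
        if row[left.key] in lookup
    ]


# Made-up sample data, maintained as two separate small packages.
registry = Registry()
registry.register(Package(
    "population", {"demographics", "country"}, "iso",
    [{"iso": "AA", "population": 100}, {"iso": "BB", "population": 250}],
))
registry.register(Package(
    "area", {"geography", "country"}, "iso",
    [{"iso": "AA", "area_km2": 10}],
))

found = registry.search("country")
combined = join(registry.get("population"), registry.get("area"))
print(found)
print(combined)
```

Neither package needs to know about the other, and the registry never holds a merged copy of the data; that is the ‘Small Pieces, Loosely Joined’ idea in miniature.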

To conclude, I definitely agree about the importance of having more open data and making it easier to find and use, though I’m hoping that it will take a more decentralized and componentized form than simply a ‘Wikipedia’ for data. More important than any of the details, though, is that this kind of interest from a wider audience indicates that issues of data openness and production are going mainstream, something we as a community should strongly welcome.


2 responses to Open Data Going Mainstream?

  1. JP said on April 11, 2008

    Dealipedia, the business deal wiki, currently has almost 20,000 transactions on record and offers a free daily newsletter roundup of recent M&A, VC investment, IPO and bankruptcy deals.

    http://www.dealipedia.com/ http://www.dealipedia.com/newsletter_subscribe.php

  2. You might have a look at the Dataverse Network (http://thedata.org), which does much of what you’re thinking about.

    Gary King

    p.s. Thanks to Christian Zimmermann for the reference!
