The following post is by Tony Hirst, who has been working with Rufus Pollock of the Open Knowledge Foundation to create http://GetTheData.org/, a new question and answer site for data-related questions.

Where can I find a list of airports with their locations? Where can I find historical weather data? How do I find the county from a postcode or a state from a zipcode? How do I find a book title from its ISBN? What are the best tools for scraping data from websites? Is there a way to get RDF Linked Data in a format that you can use?

These sorts of questions sound familiar? With increasing amounts of data available, it can still be hard to:

  • Find the data you want;
  • Query a datasource to return just the data you want;
  • Get the data from a datasource in a particular format;
  • Convert data from one format to another (Excel to RDF, for example, or CSV to JSON);
  • Get data into a representation that means it can be easily visualised using a pre-existing tool.
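
To give a flavour of the format conversion case, here is a minimal sketch in Python of turning CSV into JSON using only the standard library. The function name `csv_to_json` and the sample rows are made up for illustration, and the sketch assumes the CSV has a header row; real data will usually need more care (encodings, missing values, type conversion).

```python
import csv
import io
import json


def csv_to_json(csv_text: str) -> str:
    """Convert CSV text with a header row into a JSON array of objects."""
    # DictReader uses the first row as the keys for every following row.
    reader = csv.DictReader(io.StringIO(csv_text))
    return json.dumps(list(reader), indent=2)


# Illustrative sample data only, not taken from any particular dataset.
sample = "code,name\nLHR,London Heathrow\nJFK,New York JFK\n"
print(csv_to_json(sample))
```

Note that every value comes out as a string; converting numeric columns to numbers is left as a separate step.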

In some cases the data will exist in a queryable and machine-readable form somewhere, if only you knew where to look. In other cases, you might have found a data source but lack the query-writing expertise to get hold of just the data you want in a format you can make use of. Or maybe you know the data is in a Linked Data store on data.gov.uk, but you just can’t figure out how to get it out?

This is where GetTheData.org comes in. Get The Data arose out of a conversation between myself and Rufus Pollock at the end of last year, which resulted in Rufus setting up the site now known as http://getthedata.org.

The idea behind the site is to field questions and answers relating to the practicalities of working with public open data: from discovering data sets, to combining data from different sources in appropriate ways, getting data into formats you can happily work with, or that will play nicely with visualisation or analysis tools you already have, and so on.

At the moment, the site is in its startup/bootstrapping phase, although there is already some handy information up there. What we need now are your questions and answers …

So, if you publish data via some sort of API or queryable interface, why not consider posting self-answered questions using examples from your FAQ?

If you’re running a hackday, why not use GetTheData.org to post questions arising in the scoping of the hacks, tweet a link to each question to your event backchannel, and give remote participants a chance to contribute back, at the same time adding to the online legacy of your event?

If you’re looking for data as part of a research project, but can’t find it or can’t get it in an appropriate form that lets you link it to another data set, post a question to GetTheData.

If you want to do some graphical analysis on a data set, but don’t know what tool to use, or how to get the data in the right format for a particular tool, that’d be a good question to ask too.

Which is to say: if you want to GetTheData, but can’t for whatever reason, just ask GetTheData.org.
