Open Data Search: finding useful datasets, worldwide
March 16, 2011 in CKAN, LOD2, Open Government Data, Technical, WG Open Government Data
The following post is from Friedrich Lindenberg, who is a developer at the Open Knowledge Foundation working on CKAN, PublicData.eu and Open Spending.
Recently, hardly a week has gone by without an announcement of a new local, regional or national open data initiative – including ever more extensive catalogues of data that is being opened up (CKAN alone now runs in 20 or more places). While this is great news for those of us interested in re-using the data, it also means it becomes increasingly hard to keep a good overview of what kinds of data are available for which places. To get a better overview we’ve now started a meta search engine for open data, opendatasearch.org.
opendatasearch.org is a global version of the prototype publicdata.eu site we announced in January: it’s an aggregator for datasets, providing a simple and unified search interface to all of the catalogues it covers. At the moment, this includes all known instances of the CKAN software, the Sunlight Foundation’s National Data Catalog (and with it a large number of US-based data sources), the World Bank data catalogue, Sweden’s DCat-enabled OpenGov.se and Nexedi’s Data Publica portal. We’ve also put up search.ckan.net, which provides access to a combined index of the CKAN instances only.
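To make the aggregator idea concrete, here is a minimal sketch of a unified search over records pulled from several catalogues. It is not the opendatasearch.org code: the catalogue names, field names (`title`, `description`, `tags`) and the naive substring matching are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetRecord:
    """One dataset description, tagged with the catalogue it came from."""
    catalogue: str
    title: str
    description: str = ""
    tags: tuple = ()


def build_index(catalogues):
    """Flatten {catalogue_name: [raw dicts]} into a single list of records."""
    records = []
    for name, entries in catalogues.items():
        for entry in entries:
            records.append(DatasetRecord(
                catalogue=name,
                title=entry.get("title", ""),
                description=entry.get("description", ""),
                tags=tuple(entry.get("tags", ())),
            ))
    return records


def search(index, query):
    """Return records in which every query term occurs in title, description or tags."""
    terms = query.lower().split()
    hits = []
    for record in index:
        haystack = " ".join((record.title, record.description, *record.tags)).lower()
        if all(term in haystack for term in terms):
            hits.append(record)
    return hits


# Hypothetical sample data standing in for two harvested catalogues.
catalogues = {
    "ckan-example": [
        {"title": "Public spending 2010",
         "description": "Departmental spending",
         "tags": ["finance"]},
    ],
    "worldbank-example": [
        {"title": "World development indicators",
         "tags": ["economy", "finance"]},
    ],
}

index = build_index(catalogues)
finance_hits = search(index, "finance")
spending_hits = search(index, "spending 2010")
```

Keeping the source catalogue on each record means a search result can always point back to where the dataset was originally listed, which matters once results from many catalogues are mixed together.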
Behind the scenes, opendatasearch.org is a web spider with a twist: all collected data is converted to DCat, DERI/W3C’s RDF-based ontology for dataset descriptions. While this convention is still in early development, it’s interesting to see how well different kinds of catalogues can already be expressed in it (the harvested data can be found here). By harvesting a growing set of existing dataset descriptions, we hope to gather a comprehensive picture of the dataset properties that are widely used and that should be represented in a common format. Our goal with this is to establish some degree of interoperability between different data catalogues, leading to a federated catalogue architecture for Europe and perhaps beyond.
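As a rough illustration of what converting a harvested record to DCat can look like, the sketch below maps a CKAN-style dictionary to N-Triples using only the standard library. The input field names (`name`, `title`, `notes`, `tags`, `license_url`), the `example.org` base URI and the choice of properties are assumptions made for this example. It also uses the namespace of the W3C DCAT vocabulary, which may differ from the one used in the early draft the site worked with.

```python
# Namespaces: W3C DCAT and Dublin Core terms (assumed here, see the note above).
DCAT = "http://www.w3.org/ns/dcat#"
DCT = "http://purl.org/dc/terms/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def _iri(value):
    """Wrap an IRI in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def _literal(value):
    """Quote a plain string literal, escaping backslashes, quotes and line breaks."""
    escaped = (value.replace("\\", "\\\\")
                    .replace('"', '\\"')
                    .replace("\n", "\\n")
                    .replace("\r", "\\r"))
    return f'"{escaped}"'


def to_dcat_ntriples(record, base_uri):
    """Serialize one harvested record (a plain dict) as DCAT-flavoured N-Triples."""
    subject = _iri(base_uri + record["name"])
    triples = [(subject, _iri(RDF_TYPE), _iri(DCAT + "Dataset"))]
    if record.get("title"):
        triples.append((subject, _iri(DCT + "title"), _literal(record["title"])))
    if record.get("notes"):
        triples.append((subject, _iri(DCT + "description"), _literal(record["notes"])))
    for tag in record.get("tags", []):
        triples.append((subject, _iri(DCAT + "keyword"), _literal(tag)))
    if record.get("license_url"):
        triples.append((subject, _iri(DCT + "license"), _iri(record["license_url"])))
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


# Hypothetical record, shaped loosely like a CKAN package.
record = {
    "name": "uk-spending",
    "title": 'Spending "over 25k"',
    "tags": ["finance", "uk"],
    "license_url": "http://example.org/licenses/open",
}
ntriples = to_dcat_ntriples(record, "http://example.org/dataset/")
```

Each source catalogue would need its own mapping from native fields to these properties. Fields with no obvious DCat counterpart are exactly the ones that show where the shared format still has gaps.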
These standardization concerns aside, we want to make opendatasearch.org useful on its own. For the immediate future this means adding support for more filter options, including licenses (and their compliance with open data principles), the languages used in the metadata and in the data itself, and the geographic scope of the collected information. This, of course, is an open source development effort, and we’d be glad to welcome those interested in contributing comments, catalogue data or functionality on the ckan-discuss mailing list!
Open Knowledge Foundation Blog 