New open data from London Datastore
January 11th, 2010
As you may well have seen, last Thursday the Greater London Authority announced the new London Datastore:
From the press release:
The Mayor of London will unveil plans for the capital’s first open data project which will see large amounts of previously unavailable information from City Hall released online.
Similar to the hugely successful ‘Apps for the Democracy’ project in the United States the Mayor will be joined by President Barack Obama’s Chief Technology Officer Aneesh Chopra and Linda Cureton, Chief Information Officer, NASA during a rare live web link up with the world’s largest electronics show, CES, in Las Vegas.
The Mayor of London, Boris Johnson, also announced that there will be £200k from Channel Four’s 4IP to encourage people to make new useful services based on the data - which is excellent news!
Picking up from our international round up of open data on cities from last autumn, we’ve updated the package page on CKAN:
Is the data open? Though they don’t use a license or legal tool to make the data open, their Terms and Conditions appear to make the data open as in the Open Knowledge Definition. Nevertheless it would be good if they made this more explicit by using a legal tool such as the PDDL, ODbL, or CC0!
What data have they released? Speaking with Chris Taggart of Openly Local last week, we expected a fair few datasets to be sliced from existing sources such as the Office for National Statistics. But as Chris notes on his blog, it looks like there are plans to open up a lot more data, including new data from Transport for London!
For more see:
- London unveils digital datastore, BBC News
- Boris Johnson to launch London ‘Datastore’ with hundreds of sets of data, Guardian
- London opens up data with online scheme, Financial Times
- Las Vegas industry event – or London data store launch?, Brian Hoadley
- Tinkering With Timetric – London Datastore Borough Population Data from Tony Hirst, Open University
- The GLA and open data: did he really say that?, Chris Taggart, Openly Local
Opening up UK local spending data
January 5th, 2010
Just before Christmas, the UK Government announced a new report on Making local public expenditure data public, and the development of Local Spending Reports. The report outlines government plans to publish lots more information on where UK public money is spent at local level:
It is critical [...] that information on public expenditure should be clear, accessible and useful. We believe that spending by local authorities and other public bodies should be as transparent to delivery partners and local people as it can be. We want to make it easier for citizens to look right across all the local services in an area and spot evidence of duplication or waste, and hold providers to account.
The Government is therefore committed to the broader provision of local information, and Local Spending Reports sit within those plans as a major part of the central government offer to the information set available on local services and local places.
They suggest that opening up local government data on where public money is spent may encourage innovation in representing this data - and specifically cite the Open Knowledge Foundation’s Where Does My Money Go project, as well as Openly Local.
We are aware of the many excellent websites springing up which are providing innovative ways of looking at available public data and presenting it in very user-friendly formats; for example www.openlylocal.com/ (information on local councillors and council meetings) and Open Knowledge Foundation’s free interactive online tool for showing where UK public spending goes at www.wheredoesmymoneygo.org/prototype/. This latter tool enables the public to explore data on UK public spending over the past six years in an intuitive way using an array of maps, timelines and graphs.
On New Year’s Eve there was an announcement about making existing UK local spending reports more detailed, more comprehensive and easier to query:
Local Spending Reports provide information about how public money is being spent in local areas including money going to police and fire services, transport and health.
This is all crucial information enabling people to see how their taxes are being put to use. But at the moment if people want to see not only what is being spent but what that money is delivering they would need to trawl through an array of different data, reports and statistics.
John Denham is clear that improving the quantity and quality of data in the public domain will not only increase transparency but will also be key to improving efficiency and securing better value for money.
Changes are therefore being proposed to improve the way that local spending reports are produced and presented. At the moment they exist as a series of excel spreadsheets. From next summer they will be published online in a clear and user friendly format that will enable the data to be easily interrogated.
The first UK local spending report (for 2006-7) was published last year. Details of this are available at:
Today Chris Taggart republished the first spending report in a form which should make it easier to understand. Chris is our resident local data expert at Where Does My Money Go?, founder of OpenlyLocal and an invited expert on the UK Government’s Local Public Data Panel. In a blog post about the new release, Chris writes:
The first of those is now online, and it’s a good one, the 2006-07 Local Spending Report for England, published in April 2009. What is this? In a nutshell it lists the spending by category for every council in England at the time of the report (there have been a couple of new ones since then).
Now this report has been available to download online if you knew it existed, as a pretty nasty and unwieldy spreadsheet (in fact the recent report to Parliament, Making local public expenditure data public and the development of Local Spending Reports, even has several backhanded references to the inaccessibility of it).
However, unless you enjoy playing with spreadsheets (and at the very minimum know how to unhide hidden sheets and read complex formulae), it’s not much use to you. Much more helpful, I think, is an accessible table you can drill down for more details.
He’s done a great first pass at making this data easier to understand. To see what he’s done so far see:
While this is currently experimental, in the future he plans to make it easy to export data in XML/JSON as well as to create more sophisticated visual representations of the data.
For anyone who is interested, we’ve also started a CKAN group for collecting data on UK public finance:
We’ve also started a ‘reading list’ for key official documents and secondary sources on UK Government finance on the OKF wiki:
If you’re interested in helping out with any of this - please get in touch!
Open Knowledge Foundation Newsletter No. 13
December 23rd, 2009
Welcome to the thirteenth Open Knowledge Foundation newsletter! For a plain text version for email, please see:
Microblog version:
- RT @jwyg: Open Knowledge Foundation @okfn Newsletter No. 13: http://bit.ly/7CeAfN
OPEN KNOWLEDGE FOUNDATION NEWSLETTER NO. 13
Contents:
- Season’s Greetings from the Open Knowledge Foundation!
- Where Does My Money Go? Prototype Launched
- Data.gov.uk is using OKF’s CKAN software
- Open Knowledge Conference (OKCon) 2010 Call for Proposals
- OKF at Chaos Communication Congress in Berlin
- After the Open Data and Semantic Web workshop
- Documentation from the Public Domain Calculators meeting
- Climate Change, Climate Sceptics and Open Data
- New board and advisory board members
- Macedonian translation of the Open Knowledge Definition (OKD)
- Other news in brief
- Thanks to our volunteers!
- Support the Open Knowledge Foundation
- Further information
To support the OKF see: http://www.okfn.org/support
SEASON’S GREETINGS FROM THE OPEN KNOWLEDGE FOUNDATION!
A big Merry Christmas from the Open Knowledge Foundation to all our friends and supporters. In the festive spirit, we’ve put together a few images, texts and audio recordings from various open knowledge projects for your delectation. See you again in 2010!
WHERE DOES MY MONEY GO? PROTOTYPE LAUNCHED
In mid December we had the first full release of our Where Does My Money Go? prototype. The project aims to promote transparency and citizen engagement through the analysis and visualisation of information about UK public spending. It was a winner of the Cabinet Office’s Show Us A Better Way competition, and we were very pleased to publicly release the first stage of the project. Where Does My Money Go? received coverage from the BBC and the Guardian newspaper, as well as in national press in Germany, Italy and Poland.
Tom Watson MP commented on the release:
Where Does My Money Go represents another milestone in the UK’s transparency movement. We know that transparency changes individual and institutional behaviour and this new tool will have a big impact on the way the public sector is held to account by UK citizens.
As well as being a great public benefit, Where Does My Money Go is also an immensely complicated tool to code and design. I applaud the team behind the project for their commitment and hard work. They’re leading the way in transparency and making a difference for the country.
DATA.GOV.UK IS USING OKF’S CKAN SOFTWARE
The UK Government’s public sector data site launched in private beta in October. It’s using the Open Knowledge Foundation’s CKAN, an open source registry for open data, as its backend for storing information about public datasets. Over 1000 existing data sets from 7 departments are all brought together for the first time in a reusable form. There’s been quite a lot of excitement about this in the developer community and in the media, and we’re very much looking forward to the launch!
OPEN KNOWLEDGE CONFERENCE (OKCON) 2010 CALL FOR PROPOSALS
OKCon, now in its fifth year, is the interdisciplinary conference that brings together individuals from across the open knowledge spectrum for a day of presentations and workshops - ‘from sonnets to statistics, genes to geodata’. The Call for Proposals for OKCon 2010 is now open. We welcome proposals on any aspect of creating, publishing or reusing content or data that is open in accordance with opendefinition.org.
OKF AT CHAOS COMMUNICATION CONGRESS IN BERLIN
Several of us from the Open Knowledge Foundation will be at the Chaos Communication Congress in Berlin after Christmas. The 26th Chaos Communication Congress takes place from December 27th to December 30th 2009. OKF Director Rufus Pollock will give a talk on ‘CKAN, apt-get for the Debian of Data’. If you’re planning to attend, we’d love to hear from you. Just send us a message.
AFTER THE OPEN DATA AND SEMANTIC WEB WORKSHOP
In November we had a workshop on Open Data and the Semantic Web in London. The event brought together key people from the semantic web community - including developers, academics, and representatives from the UK Government, the BBC, and other public bodies. There were some excellent talks, demos and discussions - and documentation is now online! At the event we also launched a new Linking Open Data Group on CKAN, our open source registry of open data.
As a result of discussions we had at the workshop, we now have two new volunteer positions at the Open Knowledge Foundation:
- An Editor for the Linking Open Data Group on CKAN to help keep the collection of datasets up to date with the latest offerings from the LOD/semantic web community!
- A Linking Open Data/Open Data Commons Community Liaison. Open Data Commons are looking for a member of the LOD/Semantic Web community to join their Open Data Commons Advisory Council, with the role of exchanging information between the two communities.
If you’re interested in either of these positions - please get in touch!
DOCUMENTATION FROM THE PUBLIC DOMAIN CALCULATORS MEETING
In November we also had a meeting at the University of Cambridge about building a set of Public Domain Calculators for countries across Europe. The public domain calculators will help to determine whether or not a given work is in copyright in a given jurisdiction.
We started out by reviewing existing work on the calculators. We then put together first drafts of diagrams representing copyright law in the Netherlands and the United Kingdom. We also started work on a tutorial to help others get started building public domain flow diagrams for other countries. Finally we shot some footage for a micro-short film introducing the project - so watch this space!
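As a rough illustration of the kind of rule a public domain calculator encodes, here is a minimal Python sketch. It assumes a much simplified ‘life plus 70 years’ term, running to the end of the calendar year, which is the general rule for authored works in EU member states such as the UK and the Netherlands. The function names and the example author are hypothetical, and a real calculator has to deal with many exceptions (anonymous works, joint authorship, transitional provisions and so on).

```python
from datetime import date

# Assumption: a simplified "life plus 70 years" rule, with the term running
# to the end of the 70th calendar year after the author's death.
TERM_YEARS = 70


def public_domain_from(death_year: int, term_years: int = TERM_YEARS) -> date:
    """Return the first day on which the work is out of copyright."""
    return date(death_year + term_years + 1, 1, 1)


def is_public_domain(death_year: int, on: date, term_years: int = TERM_YEARS) -> bool:
    """True if, under the simplified rule, the work is public domain on `on`."""
    return on >= public_domain_from(death_year, term_years)


# Hypothetical example: an author who died in 1939.
print(public_domain_from(1939))                     # 2010-01-01
print(is_public_domain(1939, date(2009, 12, 31)))   # False
print(is_public_domain(1939, date(2010, 1, 1)))     # True
```

Each branch in a flow diagram for a given jurisdiction would become a further condition around a core check like this one.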
CLIMATE CHANGE, CLIMATE SCEPTICS AND OPEN DATA
To mark the UN Climate talks in Copenhagen, we launched a Climate Data Group on CKAN - a collaboration between the Open Knowledge Foundation, Clear Climate Code and the scientists at Real Climate. Environmental data is an excellent case of where sharing is the key to scaling. Research institutions must share data with each other in order to build up as detailed a picture as possible of the climate, incorporating as much evidence as possible from around the world. As much of this research is publicly funded, and due to increasing public interest, there are now strong arguments for extending this sharing from sharing between research institutions to sharing with the public.
By better documenting existing open environmental data, we hope to make some small contribution to laying the groundwork for the shared picture about the state of our climate that we currently need!
NEW BOARD AND ADVISORY BOARD MEMBERS!
In the past month several new faces have joined the OKF’s Board and Advisory Board, namely:
- Dr. Ian Brown of the Oxford Internet Institute,
- Glyn Moody, a technology writer and expert on all things open,
- Mark Surman, Executive Director at the Mozilla Foundation and one of the founders of Open Everything.
We hope you’ll join us in giving them a big welcome!
- Dr Ian Brown joins OKF Board of Directors
- Glyn Moody and Mark Surman join OKF Advisory Board!
- Open Knowledge Foundation - People
MACEDONIAN TRANSLATION OF THE OPEN KNOWLEDGE DEFINITION (OKD)
There is now a Macedonian translation of the Open Knowledge Definition (OKD) thanks to Ljube Babunski. The OKD provides a clear set of standards for making content and data open - whether this is open government data, open geospatial data or open data in science. If you’d like to translate the Definition into another language, or if you’ve already done so, please let us know!
OTHER NEWS IN BRIEF
- Large collection of German texts opened up!
- Visualizar ’09
- Interview with Jordan Hatcher on legal tools for open data
- US Government announces more open government data!
- UK Government announces lots of new open data!
- Looking for a design guru to give the Open Knowledge Foundation a makeover!
- Featured Project: MusicBrainz
- Which works fall into the public domain in 2010?
- Ordnance Survey to open up UK geospatial data
- KForge v0.17 Released
- Slides from Open Data Session at ISWC 2009
- Open data on cities: an international round up
- New CKAN features!
- Latest Developments on Open Shakespeare (v0.8)
- Conservatives Pledge to Open 20 Most Socially Useful Datasets
- OpenFlights data released under Open Database License (ODbL)
- Opening up e-Government in Europe: accessibility, transparency and the ‘right to reuse’
- Ernest Marples UK postcode site has been taken down
- Australian government releases open data for MashupAustralia competition
THANKS TO OKF VOLUNTEERS!
As usual, a big thank you to our volunteers and to our extended virtual community for all of their valuable input!
FURTHER INFORMATION
If you would like to know more about what we are up to, please take a look at our active projects page.
If you are interested in participating in any of the OKF’s projects, please see our participate page, or join the OKF discuss list.
For further news and comments, see our blog:
You can follow us on Identi.ca or Twitter at:
The Open Knowledge Foundation is a not-for-profit organization. It is incorporated in the United Kingdom as a company limited by guarantee with company number 5133759. The registered office is 37 Panton Street, Cambridge, CB2 1HL, UK.
OKF talking at Chaos Communication Congress in Berlin
December 22nd, 2009
Several of us from the Open Knowledge Foundation will be at the Chaos Communication Congress in Berlin after Christmas. The 26th Chaos Communication Congress takes place from December 27th to December 30th 2009. OKF Director Rufus Pollock will give a talk on ‘CKAN: apt-get for the Debian of Data’. If you’re planning to attend, we’d love to hear from you. Just pop us a message!
We’ve started a page for the OKF on the CCC wiki:
In the next few months we’re also going to be working on a German version of CKAN, our open source registry for open data, as well as setting up OKF Germany.
Visualizar ’09
December 17th, 2009
The project presentations from last month’s Visualizar seminar have now been posted online. This annual event brought together creative teams from a range of disciplines, with the objective of delivering workable presentations using freely available data resources. The theme for 2009 was Public Data - Data In Public. I was fortunate enough to attend on behalf of the Open Knowledge Foundation: you can find my presentation here [pdf].
The event opened with two days of lectures, papers and debate around the areas of public data re-use and visualization. Highlights included a review of works by the Sunlight Foundation, and an exciting presentation of green lifestyle applications in development for Helsinki’s City as Living Factory of Ecology project, as well as stimulating presentations by Ben Cerveny of Stamen Design, and artist Aaron Koblin.
Project presentations took the form of a market in which initiators laid out their ideas in order to recruit collaborators. During the next two days and the following weekend, working groups came together around each of the concepts, to be developed over the ensuing fortnight.

Outcomes
Experience shows that it’s very difficult to judge at the outset which of these projects will deliver the most interesting or usable results, when success depends on many diverse factors. So I was thrilled to see that two of the projects that seemed particularly viable at the start of the seminar yielded such interesting results.
The team working on Madrid’s Kultur-O-Meter produced a detailed poster showing how the city’s cultural budget is distributed. With so much emphasis on the interactive, it’s refreshing to see how a very simple static model can be used to present detailed information concisely and elegantly. I particularly like how the design shows very clearly where the uncertainties lie. Accompanying their presentation is an account of the challenging process of data gathering and analysis (in Spanish).
Piotr Adamczyk of the Metropolitan Museum of Art developed a timeline framework for exploring the museum’s extensive collection, which could be a wonderful resource for visitors and curators when it’s done. Props also go to the team behind New Political Interfaces for their fun and well-designed toy for visualizing political discourse online.
Liz Turner is founder of visualization studio iconomical and designer of Where Does My Money Go?
Interview with Jordan Hatcher on legal tools for open data
December 15th, 2009
The Open Knowledge Foundation’s Jordan Hatcher was recently interviewed by the Semantic Web Company about Why we can’t use the same open licensing approach for databases as we do for content and software:
Legal certainty is crucial when it comes to build business around new technologies. The Open Knowledge Foundation has started to tackle this problem with respect to Linked Data. Tassilo Pellegrini spoke to the Open Content Lawyer Jordan S. Hatcher about licensing issues in Open Data and got some practical advice to get started on a complex but crucial topic.
Where Does My Money Go? Prototype Launched
December 11th, 2009
We’re very pleased to announce the first full release of our Where Does My Money Go? prototype. This is now online at:
Tom Watson MP commented on the new release:
Where Does My Money Go represents another milestone in the UK’s transparency movement. We know that transparency changes individual and institutional behaviour and this new tool will have a big impact on the way the public sector is held to account by UK citizens.
As well as being a great public benefit, Where Does My Money Go is also an immensely complicated tool to code and design. I applaud the team behind the project for their commitment and hard work. They’re leading the way in transparency and making a difference for the country.
Our press release below contains more background information on the new prototype. For all you microbloggers out there, here is a 136-character version of the project announcement:
- RT @jwyg: New visualisations of #ukgov spending! See @okfn’s Where Does My Money Go? #wdmmg: http://bit.ly/4N2p5Y + http://bit.ly/8O4WEq

Press Release
Now more than ever, UK taxpayers will be wondering where public funds are being spent - not least because of the long shadow cast by the financial crisis and last week’s announcements of an estimated £850 billion price tag for bailing out UK banks. Yesterday’s pre-budget report also raises questions about spending cutbacks and how public money is being allocated across different key areas.
However, closing the loop between ordinary citizens and the paper-trail of government receipts is no mean feat. Relevant documents and datasets are scattered around numerous government websites - and, once located, spending figures often require background knowledge to interpret and can be hard to put into context. In the UK there is no equivalent to the US Federal Funding Accountability and Transparency Act, which requires official bodies to publish figures on spending in a single place. There were proposals for similar legislation in 2007, but these were never approved.
On Friday 11th December the Open Knowledge Foundation will launch a free interactive online tool for showing where UK public spending goes. The Where Does My Money Go? project allows the public to explore data on UK public spending over the past 6 years in an intuitive way using an array of maps, timelines and graphs. By means of the tool, anyone can make sense of information on public spending in ways which were not previously possible.
For example, while playing around with the tool, we noticed:
- Total public spending as a percentage of gross domestic product this year increased to levels not seen since the recession of 1992.
- Healthcare spending in real terms under New Labour has almost doubled since they came to power in 1997. Education spending has increased by 75%.
- The UK spends more on old age than on education. The amount of money spent to support those in retirement is £87bn compared to the £82bn on the whole of education.
- £665 was spent in Northern Ireland on housing and amenities for every man, woman and child in 2008-9, compared to £413 in London. Spending per capita in Britain’s capital on housing, transport and public order and safety all exceeded the national average by over 60%.
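The comparisons in the last two observations can be checked with some simple arithmetic. The short Python sketch below uses only the figures quoted above; the variable names are ours and purely illustrative.

```python
# Figures quoted in the observations above.
old_age_bn = 87       # spending to support those in retirement, in £bn
education_bn = 82     # spending on the whole of education, in £bn
ni_housing = 665      # housing and amenities per head, Northern Ireland, 2008-9 (£)
london_housing = 413  # housing and amenities per head, London, 2008-9 (£)

# Absolute gap between old age and education spending.
gap_bn = old_age_bn - education_bn

# How many times more Northern Ireland spent per head than London.
ratio = ni_housing / london_housing

print(f"Old age spending exceeds education spending by £{gap_bn}bn")
print(f"Northern Ireland spent {ratio:.2f} times as much per head as London")
print(f"That is about {100 * (ratio - 1):.0f}% more per head")
```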
Notes to editors
The Open Knowledge Foundation is a not-for-profit organisation dedicated to improving the way knowledge is shared. The Where Does My Money Go? project was a winner of the Cabinet Office’s Show Us A Better Way competition. The project benefits from an advisory group which includes leading transparency advocates and information visualisation experts. The prototype was conceived by the Open Knowledge Foundation and developed with data visualisation specialists iconomical, based in Amsterdam. The Foundation is also currently working with the UK Government on the technology behind the new data.gov.uk site.
Currently the Where Does My Money Go prototype is based on data from HM Treasury - but the project team is working to collect, aggregate and incorporate much more fine-grained information, including on local spending. On Monday Gordon Brown announced plans to publish much more detailed information on public spending in a more systematic way as part of the Smarter Government initiative.

US Government announces more open government data!
December 8th, 2009

So far it’s been a good week for open government data (and it’s only Tuesday)! After yesterday’s announcement from UK Prime Minister Gordon Brown, today the US Government’s Chief Technology Officer Aneesh Chopra and Chief Information Officer Vivek Kundra gave a live webcast from the White House to announce the release of the new Open Government Directive as well as a progress report on the state of the Open Government Initiative.
So - what’s new in the world of open government data in the US? Well, first of all, the announcement blog says we can expect lots more data in the coming weeks and months:
We are also publishing online never-before-available data about federal spending and research. At Data.gov, for instance, what started as 47 data sets from a small group of federal agencies has grown into more than 118,000 today – with thousands more ready to be released starting this week.
For example, the open government progress report explicitly mentions (p. 7):
- New nutrition data: “a database of the 1,000 most commonly eaten foods”
- New data on FOIA requests: in machine readable format with “detailed statistics on the number and disposition of FOIA requests, including response times, volume of requests, and personnel costs”
- New data on Federal Advisory Committees: “12 years of Committee data [...] including 11,430 individual committee records detailing $3.24 billion in related spending for 77,740 meetings and 11,317 reports.”
- New data on energy: including detailed timely information on the “amount of raw energy generated through hydropower”
- New data on patents: details of 7,000,000 patents
- New data on migration: IRS Statistics of Migration Data, information on “migration patterns of tax return filers moving from county-to-county or state-to-state” and “raw data on the volume of applications to the United States Customs and Immigration Service field offices”
- New data on health and care: including on veterans facilities.
- New data on education: including information on recipients of Federal financial assistance
The Directive also sets lots of milestones for the release of new data from US Federal Government agencies across the board. For example:
Within 45 days: Each agency shall publish at Data.gov at least three new, high-value data sets.
Within 60 days: Each agency shall create an open-government web page to serve as the gateway for agency activities related to the Open Government Directive.
Finally the Directive reiterates previous policies, such as a “presumption in favor of openness”, commitment to publish in a timely manner, and so on. We were particularly pleased to see a focus on legal and technical reusability. It looks like the new data will be raw and machine readable as well as compliant with the Open Knowledge Definition!
To the extent practicable and subject to valid restrictions, agencies should publish information online in an open format that can be retrieved, downloaded, indexed, and searched by commonly used web search applications. An open format is one that is platform independent, machine readable, and made available to the public without restrictions that would impede the re-use of that information.
UK Government announces lots of new open data!
December 7th, 2009

This morning UK Prime Minister Gordon Brown announced plans to open up lots more UK Government data! His speech describes plans to put much more detailed information online under open licenses in 2010.
This includes:
- public services performance data - including on crime, hospitals and schools
- new transport data
- geospatial data from Ordnance Survey (as we recently blogged about)
We are very pleased that it looks like the new datasets will be:
- Released in raw form - as OKF Director Rufus Pollock first blogged about two years ago last month, and as alluded to by Sir Tim Berners-Lee at TED.
- Released in a way which is compliant with the Open Knowledge Definition - i.e. free for anyone to use for any purpose, including commercial use.
We’re also very proud that the new data.gov.uk site, the official registry of UK Government open datasets, is powered by CKAN (as we announced a couple of months ago). If you’re interested in following the latest developments on this as they happen, please join the official mailing list.
The new Putting the Frontline First: Smarter Government initiative gives further detail on how the new data will be published. In particular, section 1.3. Radically opening up data and promoting transparency, gives a set of “public data principles”, which are as follows:
‘Public data’ are ‘government-held non-personal data that are collected or generated in the course of public service delivery’.
Our public data principles state that:
- Public data will be published in reusable, machine-readable form
- Public data will be available and easy to find through a single easy to use online access point (http://www.data.gov.uk/)
- Public data will be published using open standards and following the recommendations of the World Wide Web Consortium
- Any ‘raw’ dataset will be represented in linked data form
- More public data will be released under an open licence which enables free reuse, including commercial reuse
- Data underlying the Government’s own websites will be published in reusable form for others to use
- Personal, classified, commercially sensitive and third-party data will continue to be protected.
This is fantastic news - and we’ve highlighted key parts of the Prime Minister’s speech below:
Information is the key. An informed citizen is a powerful citizen.
[...] We are determined to be among the first governments in the world to open up public information in a way that is far more accessible to the general public.
So I am grateful to Sir Tim Berners-Lee and Professor Nigel Shadbolt for leading a project to ‘make public data public’.
This has enormous potential. Already more than 1,000 active users of the internet have registered their interest in working with government on this, and we have so far made around 1,100 datasets accessible to them.
And there are many hundreds more that can be opened up - not only from central government but also from local councils, the NHS, police and education authorities.
[...] In this way people will no longer be passive recipients of services but, through dialogue and engagement, active participants - shaping, controlling and determining what is best for them.
And I can announce today that we will actively publish all public services performance data online during 2010 completing the process by 2011. Crime data, hospital costs and parts of the national pupil database will go on line in 2010. We will use this data to benchmark the best and the worst and drive better value for money.
It will have a direct effect on how we allocate resources. We will introduce next year NHS tariffs based on best practice on the ground not average price. And we will be benchmarking the whole of the prison and probation system by 2011.
And we will give our frontline services greater freedoms and flexibilities to respond innovatively to this data, reducing the number of ring fenced budgets, rationalising different central funding projects and joining-up capital funding within a local area.
Releasing data can and must unleash the innovation and entrepreneurship at which Britain excels - one of the most powerful forces of change we can harness.
When, for example, figures on London’s most dangerous roads for cyclists were published, an online map detailing where accidents happened was produced almost immediately to help cyclists avoid blackspots and reduce the numbers injured.
And after data on dentists went live, an iPhone application was created to show people where the nearest surgery was to their current location.
And from April next year Ordnance Survey will open up information about administrative boundaries, postcode areas and mid-scale mapping.
All of this will be available for free commercial re-use, enabling people for the first time to take the material and easily turn it into applications, like FixMyStreet or the Postcode Paper.
And I can further announce today that, again from next April, we will also release public transport data hitherto inaccessible or expensive and release significant underlying data for weather forecasts for free download and re-use.
We are currently working on a new project which will map open government data initiatives from around the world. We are also working on a guidance document for opening up government data, and starting a new working group on open government data to promote technical and legal standards, as well as to help document what open government data is out there. If you’re interested in any of this, we’d love to hear from you!
Climate Change, Climate Sceptics and Open Data
December 5th, 2009

With the United Nations Climate Change Conference in Copenhagen starting on Monday, it is of vital importance that there is consensus on the scientific evidence about climate change, in order to inform debates about the best course of action for the international community. Sharing the same basic picture about the climate, global warming and the impact of human sources of carbon dioxide (regardless of the details of this picture, regardless of differences in opinion about the most appropriate course of action in response to it) is surely a critical prerequisite to effective and fruitful negotiations.
The recent release of illegally obtained emails from the University of East Anglia’s Climatic Research Unit (so-called ‘Climategate’) and the subsequent accusations of secrecy and malpractice from climate change sceptics have provoked debate in the media about the openness and availability of datasets related to climate change.
Partly in response to accusations of secrecy and falsification of key datasets from sceptics, the UK Met Office announced today they will be publishing new climate datasets. Earlier the Telegraph reported:
Sceptics alleged that emails stolen from the Climatic Research Unit at the university show scientists were willing to manipulate data to show global warming.
They also complain that the raw data for the climate models was not made available to the public.
To try to restore public confidence the Met Office is talking to other meteorological organisations around the world about recreating the model using the same raw data but more modern computers.
The whole process will also use any new information and be more open to the public.
This evening, the BBC reported:
Meanwhile, the Met Office said it would publish all the data from weather stations worldwide, which it said proved climate change was caused by humans.
Its database is a main source of analysis for the IPCC.
It has written to 188 countries for permission to publish the material, dating back 160 years from more than 1,000 weather stations.
As UEA said in an announcement from the end of November, over 95% of the CRU climate data is already available and permission to publish the remaining data will have to be sought from each of the relevant National Meteorological Services (NMSs) around the world on a case by case basis. Professor Davies of UEA suggests that the reasons for this are partly commercial:
We are grateful for the necessary support of the Met Office in requesting the permissions for releasing the information but understand that responses may take several months and that some countries may refuse permission due to the economic value of the data.
An editorial piece in Nature from a couple of days ago suggests:
Researchers are barred from publicly releasing meteorological data from many countries owing to contractual restrictions. Moreover, in countries such as Germany, France and the United Kingdom, the national meteorological services will provide data sets only when researchers specifically request them, and only after a significant delay. The lack of standard formats can also make it hard to compare and integrate data from different sources. Every aspect of this situation needs to change: if the current episode does not spur meteorological services to improve researchers’ ease of access, governments should force them to do so.
Mike Hulme of UEA and Jerome Ravetz of Oxford University argue in a recent BBC article that climate scientists will have to become better at engaging the public in their research:
While there will always be a unique function for expert scientific reviewers to play in authenticating knowledge, this need not exclude other interested and motivated citizens from being active.
These demands for more openness in science are intensified by the embedding of the internet and Web 2.0 media as central features of many people’s social exchanges.
In particular they suggest that scientists should respond to demands that:
- To be validated, knowledge must also be subject to the scrutiny of an extended community of citizens who have legitimate stakes in the significance of what is being claimed
- And to be empowered for use in public deliberation and policy-making, knowledge must be fully exposed to the proliferating new communication media by which such extended peer scrutiny takes place.
Roger Pielke, Professor of Environmental Studies at the University of Colorado, argues in a recent interview in the Washington Post that:
More openness, more transparency, more diversity, and more attention to the social construction of expertise is needed.
While it is important to remember, as Cameron Neylon notes, that proper interpretation of climate change data requires significant background knowledge and a thorough grounding in relevant scientific literature and tools, it is nevertheless clear that there is increasing demand from interested non-experts to access and reuse climate data. The Times recently published two pieces analysing and refuting a climate change sceptic’s interpretation of the publicly available HADCRU data. Another blogger points out that public environmental datasets allow non-expert members of the public to explore the evidence and draw different conclusions about climate change - and argues that the peer review process will act as a quality filter for their research.
In response to the demand for data, Real Climate (who were also hacked, and who provide two excellent posts on the CRU hack and background context) have published a very useful list of public climate datasets as well as a blog post asking the climate science community for further suggestions.
All of this interest in public sources of climate data reminded us of our Open Environmental Data project which we started two years ago this autumn. The project aimed to answer the question:
- What environmental data is out there, and how open is it?
It also aimed to document legislation and policy relevant to environmental data in different jurisdictions.
We have picked up this work again by starting a climate data group on CKAN, our open source registry of open data:
We have started to go through available public sources of climate data, looking at:
- Whether datasets are open as in the Open Knowledge Definition - i.e. whether they explicitly say that they can be used by anyone, for any purpose, without restriction (except perhaps attribution, integrity or sharealike requirements).
- Whether or not there are facilities to download raw data in bulk - i.e. whether they easily allow users to directly download all the data in open, machine readable formats.
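As a rough illustration of how these two checks might be recorded for each source, here is a minimal Python sketch. The `DataSource` fields and the `assess` function are hypothetical names chosen for this example, not part of CKAN, and the two example entries are made up.

```python
from dataclasses import dataclass


# Hypothetical record for a catalogued data source. The field names are
# illustrative assumptions and do not reflect CKAN's actual schema.
@dataclass
class DataSource:
    name: str
    explicitly_open: bool  # states that anyone may reuse it for any purpose
    bulk_download: bool    # raw data can be downloaded in bulk, machine readable


def assess(source: DataSource) -> str:
    """Summarise the two checks (openness, bulk access) as a short label."""
    if source.explicitly_open and source.bulk_download:
        return "open, bulk download available"
    if source.explicitly_open:
        return "open, no bulk download"
    if source.bulk_download:
        return "bulk download available, not explicitly open"
    return "neither explicitly open nor available in bulk"


# Made-up entries, purely to show the labels.
sources = [
    DataSource("example-station-records", True, True),
    DataSource("example-model-output", False, True),
]

for s in sources:
    print(f"{s.name}: {assess(s)}")
```

Keeping the two checks as separate fields makes it easy to see which sources are accessible in practice but still lack an explicit statement of openness.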
Environmental data is an excellent case of where sharing is the key to scaling. Research institutions must share data with each other in order to build up as detailed a picture as possible, incorporating as much evidence as possible from around the world. As much of this research is publicly funded, and due to increasing public interest, there are now strong arguments for extending this sharing from sharing between research institutions to sharing with the public.
Furthermore, often access is not enough. Datasets need to be combined with other datasets, or reused in visual representations. Hence there are arguments for making data open as in the Open Knowledge Definition, which means that anyone can reuse and redistribute it for any purpose. This also allows for innovation in the ways in which the data can be presented to the public by third parties, including not-for-profit organisations and companies - such as through the creation of new web services to allow the data to be explored.
There are currently 38 data sources listed, over half of which are fully open. However many datasets are still not explicitly legally open, and many of them have restrictions on how they can be reused. There are still plenty of datasets to add! We’ve been in touch with the folks at Real Climate, and they’ve been supportive of the project and encouraged us to reuse and build on their list of data sources.
In order to mark the occasion of the Copenhagen Conference, over the next few weeks we will be continuing to add publicly available climate data to CKAN. By better documenting existing open environmental data, we hope to make some small contribution to laying the groundwork for the shared picture about the state of our climate that we currently need.
If you are interested in contributing to the climate data group, please either drop us a line or get stuck in and register a package!
