Climate Change, Climate Sceptics and Open Data

A radiant turret lit by the midsummer midnight sun by the State Library of New South Wales collection on Flickr

With the United Nations Climate Change Conference in Copenhagen starting on Monday, it is of vital importance that there is consensus on the scientific evidence about climate change, in order to inform debates about the best course of action for the international community. Sharing the same basic picture about the climate, global warming and the impact of human sources of carbon dioxide (regardless of the details of this picture, regardless of differences in opinion about the most appropriate course of action in response to it) is surely a critical prerequisite to effective and fruitful negotiations.

The recent release of illegally obtained emails from the University of East Anglia’s Climatic Research Unit (so-called ‘Climategate’) and the subsequent accusations of secrecy and malpractice from climate change sceptics have provoked debate in the media about the openness and availability of datasets related to climate change.

Partly in response to accusations of secrecy and falsification of key datasets from sceptics, the UK Met Office announced today they will be publishing new climate datasets. Earlier the Telegraph reported:

Sceptics alleged that emails stolen from the Climatic Research Unit at the university show scientists were willing to manipulate data to show global warming.

They also complain that the raw data for the climate models was not made available to the public.

To try to restore public confidence the Met Office is talking to other meteorological organisations around the world about recreating the model using the same raw data but more modern computers.

The whole process will also use any new information and be more open to the public.

This evening, the BBC reported:

Meanwhile, the Met Office said it would publish all the data from weather stations worldwide, which it said proved climate change was caused by humans.

Its database is a main source of analysis for the IPCC.

It has written to 188 countries for permission to publish the material, dating back 160 years from more than 1,000 weather stations.

As UEA said in an announcement from the end of November, over 95% of the CRU climate data is already available and permission to publish the remaining data will have to be sought from each of the relevant National Meteorological Services (NMSs) around the world on a case-by-case basis. Professor Davies of UEA suggests there are partly commercial reasons for this:

We are grateful for the necessary support of the Met Office in requesting the permissions for releasing the information but understand that responses may take several months and that some countries may refuse permission due to the economic value of the data.

An editorial piece in Nature from a couple of days ago suggests:

Researchers are barred from publicly releasing meteorological data from many countries owing to contractual restrictions. Moreover, in countries such as Germany, France and the United Kingdom, the national meteorological services will provide data sets only when researchers specifically request them, and only after a significant delay. The lack of standard formats can also make it hard to compare and integrate data from different sources. Every aspect of this situation needs to change: if the current episode does not spur meteorological services to improve researchers’ ease of access, governments should force them to do so.

Mike Hulme of UEA and Jerome Ravetz of Oxford University argue in a recent BBC article that climate scientists will have to become better at engaging the public in their research:

While there will always be a unique function for expert scientific reviewers to play in authenticating knowledge, this need not exclude other interested and motivated citizens from being active.

These demands for more openness in science are intensified by the embedding of the internet and Web 2.0 media as central features of many people’s social exchanges.

In particular they suggest that scientists should respond to demands that:

To be validated, knowledge must also be subject to the scrutiny of an extended community of citizens who have legitimate stakes in the significance of what is being claimed
And to be empowered for use in public deliberation and policy-making, knowledge must be fully exposed to the proliferating new communication media by which such extended peer scrutiny takes place.

Roger Pielke, Professor of Environmental Studies at the University of Colorado, argues in a recent interview in the Washington Post that:

More openness, more transparency, more diversity, and more attention to the social construction of expertise is needed.

While it is important to remember, as Cameron Neylon notes, that proper interpretation of climate change data requires significant background knowledge and a thorough grounding in relevant scientific literature and tools, it is nevertheless clear that there is an increasing demand from interested non-experts to access and reuse climate data. The Times recently published two pieces analysing and refuting a climate change sceptic’s interpretation of the publicly available HADCRU data. Another blogger points out that public environmental datasets allow non-expert members of the public to explore the evidence and draw different conclusions about climate change – and argues that the peer review process will act as a quality filter for their research.

In response to the demand for data, Real Climate (who were also hacked, and who provide two excellent posts on the CRU hack and background context) have published a very useful list of public climate datasets as well as a blog post asking the climate science community for further suggestions.

All of this interest in public sources of climate data reminded us of our Open Environmental Data project, which we started two years ago this autumn. The project aimed to answer the question:

What environmental data is out there, and how open is it?

It also aimed to document legislation and policy relevant to environmental data in different jurisdictions.

We have picked up this work again by starting a climate data group on CKAN, our open source registry of open data.

We have started to go through available public sources of climate data, looking at:
Whether datasets are open as in the Open Knowledge Definition – i.e. whether they explicitly say that they can be used by anyone, for any purpose, without restriction (except perhaps attribution, integrity or sharealike requirements).
Whether or not there are facilities to download raw data in bulk – i.e. whether they easily allow users to directly download all the data in open, machine readable formats.
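
As a rough illustration of how these two questions might be recorded for each data source, here is a minimal Python sketch. It is not how CKAN stores or checks anything: the `DataSource` record, its field names, the example entries and the list of machine-readable formats are all hypothetical, chosen only to mirror the two criteria above.

```python
from dataclasses import dataclass, field

# Conditions the criteria above treat as compatible with openness.
ALLOWED_CONDITIONS = {"attribution", "integrity", "sharealike"}

# Hypothetical list of formats counted as open and machine readable.
MACHINE_READABLE = {"csv", "json", "netcdf"}


@dataclass
class DataSource:
    """Hypothetical record for one climate data source."""
    name: str
    explicit_terms: bool  # does the source explicitly state its reuse terms?
    conditions: set = field(default_factory=set)    # conditions placed on reuse
    bulk_formats: set = field(default_factory=set)  # formats offered in bulk


def is_open(source):
    """Explicit terms, and no conditions beyond the allowed ones."""
    return source.explicit_terms and source.conditions <= ALLOWED_CONDITIONS


def has_bulk_download(source):
    """At least one bulk download in a machine-readable format."""
    return bool(source.bulk_formats & MACHINE_READABLE)


# Made-up entries for illustration only, not real datasets.
sources = [
    DataSource("example-a", True, {"attribution"}, {"csv"}),
    DataSource("example-b", True, {"non-commercial"}, {"pdf"}),
    DataSource("example-c", False, set(), {"netcdf"}),
]

for s in sources:
    print(s.name, is_open(s), has_bulk_download(s))
```

Note that the third example counts as not open even though it places no conditions on reuse: under the first criterion, a source has to say explicitly that the data can be reused.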

Environmental data is an excellent case of where sharing is the key to scaling. Research institutions must share data with each other in order to build up as detailed a picture as possible, incorporating as much evidence as possible from around the world. As much of this research is publicly funded, and due to increasing public interest, there are now strong arguments for extending this sharing from sharing between research institutions to sharing with the public.

Furthermore, access is often not enough. Datasets need to be combined with other datasets, or reused in visual representations. Hence there are arguments for making data open as in the Open Knowledge Definition, which means that anyone can reuse and redistribute it for any purpose. This also allows for innovation in the ways in which the data can be presented to the public by third parties, including not-for-profit organisations and companies – such as through the creation of new web services to allow the data to be explored.

There are currently 38 data sources listed, over half of which are fully open. However, many datasets are still not explicitly legally open, and many of them have restrictions on how they can be reused. There are still plenty of datasets to add! We’ve been in touch with the folks at Real Climate, and they’ve been supportive of the project and encouraged us to reuse and build on their list of data sources.

In order to mark the occasion of the Copenhagen Conference, over the next few weeks we will be continuing to add publicly available climate data to CKAN. By better documenting existing open environmental data, we hope to make some small contribution to laying the groundwork for the shared picture about the state of our climate that we currently need.

If you are interested in contributing to the climate data group – please either drop us a line, or get stuck in and register a package!

Jonathan Gray


Dr. Jonathan Gray is Lecturer in Critical Infrastructure Studies at the Department of Digital Humanities, King’s College London, where he is currently writing a book on data worlds. He is also Cofounder of the Public Data Lab; and Research Associate at the Digital Methods Initiative (University of Amsterdam) and the médialab (Sciences Po, Paris). More about his work can be found at jonathangray.org and he tweets at @jwyg.

5 thoughts on “Climate Change, Climate Sceptics and Open Data”

William says:

December 6, 2009 at 11:50

Just added a couple of packages – including a very good collection of data sets from Environment Canada. What tag do these need to show up on http://ckan.net/group/climatedata? “climate” doesn’t seem to do it.

Also, minor feature request: on the web page, display a count of packages so we don’t have to use our fingers :P

Good job!

Cheers,
-w
Rufus Pollock says:

December 7, 2009 at 12:26

@William: to show up in the group they need to be explicitly added to the group by a group “admin” (groups are moderated unlike tags …). If you’d like to be a group admin just let us know your openid and we’ll make you one :)

Re. feature request for counts: completely agree and we’re looking to put that in there as soon as possible: http://knowledgeforge.net/ckan/trac/ticket/203
Jane Sawyer says:

May 5, 2011 at 10:15

How to collate this data now on Twitter? I’m working on a curation solution to add relevant global warming themed posts and blogs on my live stream