You are browsing the archive for Open Data.

Exploring openness and the Open Definition

Laura James - October 7, 2013 in Featured, Open Data, Open Definition, Open Knowledge Definition

We’ve set out the basics of what open data means, so here we explore the Open Definition in more detail, including the importance of bulk access to open information, commercial use of open data, machine-readability, and what conditions can be imposed by a data provider.

Commercial Use

A key element of the definition is that commercial use of open data is allowed – there should be no restrictions on commercial or for-profit use of open data.

In the full Open Definition, this is included as “No Discrimination Against Fields of Endeavor — The license must not restrict anyone from making use of the work in a specific field of endeavor. For example, it may not restrict the work from being used in a business, or from being used for genetic research.”

The major intention of this clause is to prohibit license traps that prevent open material from being used commercially; we want commercial users to join our community, not feel excluded from it.

Examples of commercial open data business models

It may seem odd that companies can make money from open data. Business models in this area are still being invented and explored but here are a couple of options to help illustrate why commercial use is a vital aspect of openness.

You can use an open data set to create a high capacity, reliable API which others can access and build apps and websites with, and to charge for access to that API – as long as a free bulk download is also available. (An API is a way for different pieces of software or different computers to connect and exchange information; most applications and apps use APIs to access data via the internet, such as the latest news or maps or prices for products.)

Businesses can also offer services around data improvement and cleaning; for example, taking several sets of open data, combining them and enhancing them (by creating consistent naming for items within the data, say, or connecting two different datasets to generate new insights).
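As a rough illustration of the data improvement idea above, here is a minimal Python sketch that gives two small datasets consistent town names and then connects them. The datasets, column names and helper functions are all made up for the example; real cleaning work is usually much messier.

```python
import csv
import io

# Two tiny made-up datasets that name the same towns inconsistently.
POPULATION_CSV = """town,population
 St. Albans ,82000
CAMBRIDGE,124000
"""

LIBRARIES_CSV = """Town,libraries
st albans,3
Cambridge,5
"""


def normalise(name):
    """Lower-case, drop full stops and collapse whitespace so names match."""
    return " ".join(name.replace(".", "").lower().split())


def load(text, key_column):
    """Read CSV text into a dict keyed by the normalised town name."""
    rows = csv.DictReader(io.StringIO(text))
    return {normalise(row[key_column]): row for row in rows}


def combine(population_text, libraries_text):
    """Join the two datasets on town name and derive a new figure."""
    population = load(population_text, "town")
    libraries = load(libraries_text, "Town")
    combined = {}
    # Only towns present in both datasets can be connected.
    for town in population.keys() & libraries.keys():
        people = int(population[town]["population"])
        count = int(libraries[town]["libraries"])
        combined[town] = {
            "population": people,
            "libraries": count,
            "people_per_library": people / count,
        }
    return combined
```

Calling `combine(POPULATION_CSV, LIBRARIES_CSV)` returns one entry per town, including a "people per library" figure that neither dataset held on its own.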

(Note that charging for data licensing is not an option here – charging for access to the data means it is not open data! This business model is often talked about in the context of personal information or datasets which have been compiled by a business. These are perfectly fine business models for data but they aren’t open data.)

Attribution, “Integrity” and Share-alike

Whilst the Open Definition permits very few conditions to be placed on how someone can use open data it does allow a few specific exceptions:

  • Attribution: an open data provider may require attribution (i.e. that you credit them in an appropriate way). This can be important in allowing open data providers to receive credit for their work, and for downstream users to know where data came from.
  • Integrity: an open data provider may require that a user of the data makes it clear if the data has been changed. This can be very relevant for governments, for example, who wish to ensure that people do not claim data is official if it has been modified.
  • Share-alike: an open data provider may impose a share-alike licence, requiring that any new datasets created using their data are also shared as open data.

Machine-readability and bulk access

Data can be provided in many ways, and this can have a significant impact on how easy it is to use. The Open Definition requires that data be both machine-readable and available in “bulk” so that it is not too difficult to put to use.

Data is machine-readable if it can be easily processed by a computer. This does not just mean that it’s digital, but that it is in a digital structure that is appropriate for the relevant processing. For example, consider a PDF document containing tables of data. These are digital, but computers will struggle to extract the information from the PDF (even though it is very human readable!). The equivalent tables in a format such as a spreadsheet would be machine-readable. Read more about machine-readability in the open data glossary.
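To make the contrast concrete, here is a small Python sketch of how little work a machine-readable table needs. The budget figures and column names are invented for the example, and the table is held in a string so the snippet is self-contained. Getting the same totals out of a table inside a PDF would typically need layout-specific extraction code first.

```python
import csv
import io

# Made-up budget table, as it might look when a spreadsheet is saved as CSV.
BUDGET_CSV = """department,year,spend_gbp
Health,2012,1200
Education,2012,900
Health,2013,1300
"""


def total_spend_by_department(csv_text):
    """Sum the spend column for each department."""
    totals = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        department = row["department"]
        totals[department] = totals.get(department, 0) + int(row["spend_gbp"])
    return totals
```

Because every value sits in a named column, `total_spend_by_department(BUDGET_CSV)` can add up the figures without guessing where the table starts or ends.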

Data is available in bulk if you can download or access the whole dataset easily. It is not available in bulk if you are limited to getting just a few elements of the data at a time – imagine, for example, trying to access a dataset of all the towns in the world one country at a time.

APIs versus Bulk

Providing data through an API is great – and often more convenient for many of the things one might want to do with data than bulk access, such as presenting some useful information in a mobile app.

However, the Open Definition requires bulk access rather than an API. There are two main reasons for this:

  • Bulk access allows you to build an API (if you want to!). If you need all the data, using an API to get it can be difficult or inefficient. For example, think about Twitter: using their API to download all the tweets would be very hard and slow. Thus, bulk access is the only way to guarantee full access to the data for everyone. Once bulk access is available, anyone else can build an API which will help others use the data. You can also use bulk data to create interesting new things such as search indexes and complex visualisations.
  • Bulk access is significantly cheaper than providing an API. Today you can store gigabytes of data for less than a dollar a month; but running even a basic API can cost much more, and running a proper API that supports high demand can be very expensive.
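As a sketch of the first point, here is how someone holding a complete bulk copy of a dataset could build their own lookup service on top of it. The town records, field names and functions below are hypothetical; a real service would load the bulk file from disk and sit behind a web server.

```python
from collections import defaultdict

# Stand-in for a bulk download: every record is available at once.
TOWNS = [
    {"name": "Geneva", "country": "CH"},
    {"name": "Cambridge", "country": "GB"},
    {"name": "Cambridge", "country": "US"},
]


def build_index(records):
    """Map each lower-cased town name to the records that share it."""
    index = defaultdict(list)
    for record in records:
        index[record["name"].lower()].append(record)
    return index


def lookup(index, name):
    """The kind of query an API built on the bulk data could answer."""
    return [record["country"] for record in index.get(name.lower(), [])]
```

With the whole dataset in hand the index is built in a single pass. Building the same index through a rate-limited API would mean requesting the records a few at a time.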

So having an API is not a requirement for data to be open – although of course it is great if one is available.

Moreover, it is perfectly fine for someone to charge for access to open data through an API – as long as they also provide the data for free in bulk. (Strictly speaking, the requirement isn’t that the bulk data is available for free but that the charge is no more than the extra cost of reproduction. For online downloads, that’s very close to free!) This makes sense: open data must be free but open data services (such as an API) can be charged for.

(It’s worth considering what this means for real-time data, where new information is being generated all the time, such as live traffic information. The answer here depends somewhat on the situation, but for open real-time data one would imagine a combination of bulk download access, and some way to get rapid or regular updates. For example, you might provide a stream of the latest updates which is available all the time, and a bulk download of a complete day’s data every night.)
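One way to picture that combination is the Python sketch below. It treats a list of timestamped traffic readings (all invented, as are the field and function names) as the source, and offers both a "latest updates" view and a per-day grouping that could be published as a nightly bulk file.

```python
from collections import defaultdict
from datetime import datetime

# Made-up real-time readings with ISO 8601 timestamps.
UPDATES = [
    {"time": "2013-10-06T23:58:00", "road": "A14", "speed_kph": 60},
    {"time": "2013-10-07T00:03:00", "road": "A14", "speed_kph": 45},
    {"time": "2013-10-07T08:15:00", "road": "M11", "speed_kph": 30},
]


def latest(updates, count=1):
    """Return the most recent readings, as a live feed might."""
    ordered = sorted(updates, key=lambda update: update["time"])
    return ordered[-count:]


def daily_dumps(updates):
    """Group readings by calendar day, one group per nightly bulk file."""
    dumps = defaultdict(list)
    for update in updates:
        day = datetime.fromisoformat(update["time"]).date().isoformat()
        dumps[day].append(update)
    return dict(dumps)
```

Here `latest(UPDATES)` plays the role of the always-available stream, while each entry returned by `daily_dumps(UPDATES)` holds what would go into one day's complete download.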

Licensing and the public domain

Generally, when we want to know whether a dataset is legally open, we check to see whether it is available under an open licence (or that it’s in the public domain by means of a “dedication”).

However, it is important to note that it is not always clear whether there are any exclusive, intellectual-property-style rights in the data such as copyright or sui-generis database rights (for example, this may depend on your jurisdiction). You can read more about this complex issue in the Open Definition legal overview of rights in data. If there aren’t exclusive rights in the data, then it would automatically be in the public domain, and putting it online would be sufficient to make it open.

However, since this is an area where things are not very clear, it is generally recommended to apply an appropriate open license – that way, if there are exclusive rights you’ve licensed them, and if there aren’t any rights you’ve not done any harm (the data was already in the public domain!).

More about openness coming soon

In coming days we’ll post more on the theme of explaining openness, including the relationship of the Open Definition to specific sets of principles for openness – such as the Sunlight Foundation’s 10 principles and Tim Berners-Lee’s 5 star system, why having a shared and agreed definition of open data is so important, and how one can go about “doing open data”.

Next Steps on “Follow the Money” – from OKCon to the Open Government Partnership Summit

Jonathan Gray - October 4, 2013 in OKCon, Open Data, Open Government Data, Public Money

The following post is from Alan Hudson, Policy Director (Transparency & Accountability) at ONE and Jonathan Gray, Director of Policy and Ideas at the Open Knowledge Foundation.

Last month we announced the Open Knowledge Foundation and ONE’s plans to support and strengthen the community of activists and advocacy organisations working to enable citizens to follow the money and hold decision-makers to account for the use of public money.

A few weeks ago at OKCon 2013 we had a brainstorming session with a group of leading financial transparency and open data organisations to define next steps for the collaboration.

We had an excellent turnout including many of the key organisations promoting financial transparency such as Development Initiatives, Publish What You Fund, Publish What You Pay, the Revenue Watch Institute, the Sunlight Foundation, the Transparency and Accountability Initiative, and Transparency International.

Participants in the session shared their experience of trying to follow the money – the challenges and opportunities – and explored how we might collectively join the dots between various efforts to promote transparency. We talked about creating better data standards so information is easier to connect and compare, sharing resources and information about the flow of public money, and how to ensure that transparency initiatives meet the needs of campaigners pushing for change.

The top two priorities identified were as follows. First, mapping the ‘Follow the Money’ space to get a better sense of who is doing what to follow flows of public money from revenue to results, across different sectors and in different countries around the world. Second, doing much more to understand what citizens and civil society organisations need to help them to follow the money and collecting use-cases of how joining the transparency dots will help.

We’re currently planning ‘Follow the Money’ activities around the Open Government Partnership Summit in London on 31st October to 1st November, where we will continue the conversation – in particular focusing on the needs of campaigners in developing countries.

If you or your organisation are interested in joining us to Follow the Money, you can get in touch via the following form.

Defining Open Data

Laura James - October 3, 2013 in Featured, Open Data, Open Definition, Open Knowledge Definition

Open data is data that can be freely used, shared and built-on by anyone, anywhere, for any purpose. This is the summary of the full Open Definition which the Open Knowledge Foundation created in 2005 to provide both a succinct explanation and a detailed definition of open data.

As the open data movement grows, and even more governments and organisations sign up to open data, it becomes ever more important that there is a clear and agreed definition for what “open data” means if we are to realise the full benefits of openness, and avoid the risks of creating incompatibility between projects and splintering the community.

Open can apply to information from any source and about any topic. Anyone can release their data under an open licence for free use by and benefit to the public. Although we may think mostly about government and public sector bodies releasing public information such as budgets or maps, or researchers sharing their results data and publications, any organisation can open information (corporations, universities, NGOs, startups, charities, community groups and individuals).

Read more about different kinds of data in our one page introduction to open data

There is open information in transport, science, products, education, sustainability, maps, legislation, libraries, economics, culture, development, business, design, finance …. So the explanation of what open means applies to all of these information sources and types. Open may also apply both to data – big data and small data – or to content, like images, text and music!

So here we set out clearly what open means, and why this agreed definition is vital for us to collaborate, share and scale as open data and open content grow and reach new communities.

What is Open?

The full Open Definition provides a precise definition of what open data is. There are 2 important elements to openness:

  • Legal openness: you must be allowed to get the data legally, to build on it, and to share it. Legal openness is usually provided by applying an appropriate (open) license which allows for free access to and reuse of the data, or by placing data into the public domain.
  • Technical openness: there should be no technical barriers to using that data. For example, providing data as printouts on paper (or as tables in PDF documents) makes the information extremely difficult to work with. So the Open Definition has various requirements for “technical openness,” such as requiring that data be machine readable and available in bulk.

There are a few key aspects of open which the Open Definition explains in detail. Open Data is useable by anyone, regardless of who they are, where they are, or what they want to do with the data; there must be no restriction on who can use it, and commercial use is fine too.

Open data must be available in bulk (so it’s easy to work with) and it should be available free of charge, or at least at no more than a reasonable reproduction cost. The information should be digital, preferably available by downloading through the internet, and easily processed by a computer too (otherwise users can’t fully exploit the power of data – that it can be combined together to create new insights).

Open Data must permit people to use it, re-use it, and redistribute it, including intermixing with other datasets and distributing the results.

The Open Definition generally doesn’t allow conditions to be placed on how people can use Open Data, but it does permit a data provider to require that data users credit them in some appropriate way, make it clear if the data has been changed, or that any new datasets created using their data are also shared as open data.

There are 3 important principles behind this definition of open, which are why Open Data is so powerful:

  • Availability and Access: that people can get the data
  • Re-use and Redistribution: that people can reuse and share the data
  • Universal Participation: that anyone can use the data

Governance of the Open Definition

Since 2007, the Open Definition has been governed by an Advisory Council. This is the group formally responsible for maintaining and developing the Definition and associated material. Its mission is to take forward Open Definition work for the general benefit of the open knowledge community, and it has specific responsibility for deciding on what licences comply with the Open Definition.

The Council is a community-run body. New members of the Council can be appointed at any time by agreement of the existing members of the Advisory Council, and are selected for demonstrated knowledge and competence in the areas of work of the Council.

The Advisory Council operates in the open and anyone can join the mailing list.

About the Open Definition

The Open Definition was created in 2005 by the Open Knowledge Foundation with input from many people. The Definition was based directly on the Open Source Definition from the Open Source Initiative and we were able to reuse most of these well-established principles and practices that the free and open source community had developed for software, and apply them to data and content.

Thanks to the efforts of many translators in the community, the Open Definition is available in 30+ languages.

More about openness coming soon

In coming days we’ll post more on the theme of explaining openness, including a more detailed exploration of the Open Definition, the relationship of the Open Definition to specific sets of principles for openness – such as the Sunlight Foundation’s 10 principles and Tim Berners-Lee’s 5 star system, why having a shared and agreed definition of open data is so important, and how one can go about “doing open data”.

Open Data Training at the Open Knowledge Foundation

Laura James - September 26, 2013 in Business, CKAN, Featured, Open Data, Open Government Data, Open Knowledge Foundation, Our Work, School of Data, Technical, Training

We’re delighted to announce today the launch of a new portfolio of open data training programs.

For many years the Open Knowledge Foundation has been working — both formally and informally — with governments, civil society organisations and others to provide this kind of advice and training. Today marks the first time we’ve brought it all together in one place with a clear structure.

These training programs are designed for two main groups of people interested in open data:

  1. Those within government and other organisations seeking a short introduction to open data – what it is, why to “do” open data, what the challenges are, and how to get started with an open data project or policy.

  2. The growing group of those specialising in open data, perhaps as policy experts, open data program managers, technology specialists, and so on, generally within government or other organisations. Here we offer more in-depth training including detailed material on how to run an open data program or project, and also a technical course for those deploying or maintaining open data portals.

Our training programs are designed and delivered by our team of open data experts with many years of experience creating, maintaining and supporting open data projects around the world.

Please contact us for details on any of these courses, or if you’d be interested in discussing a custom program tailored to your needs.

Our Open Data Training Programs

Open Data Introduction

Who is this for?

This course is a short introduction to open data for anyone and is perfectly suited to teams from diverse functions across organisations who are thinking about or adopting open data for the first time.

Topics covered

Everything you need to understand and start working in this exciting new area: what is open data, why should institutions open data, what are the benefits and opportunities to doing so, and of course how you can get started with an open data policy or project.

This is a one day course to help you and your team get started with open data.

Photo by Victor1558

Administrative Open Data Management

Who is this for?

Those specialising in open data, whether as policy experts, open data program managers or in similar roles in government, the civil service and other organisations. This course is specifically for non-technical staff who are responsible for managing Open Data programs in their organisation. Such activities typically include implementing an Open Data strategy, designing/launching an Open Data portal, coordinating publication processes, preparing data for publication, and fostering data re-use.

Topics covered

Basics of Open Data (legal, managerial, technical); Success factors for the design and execution of an Open Data program; Overview of the technology landscape; Success factors for community re-use.

Open Data Portal Technology

Who is this for?

Those specialising in open data, whether as software or data experts, open data delivery managers or in similar roles in government, the civil service and other organisations. This course is for technical staff who are responsible for maintaining or running an enterprise Open Data portal. Such activities typically include deployment, system administration and hosting, site theming, development of custom extensions and applications, ETL procedures, data conversions, and data life-cycle management.

Topics covered

Basics of Open Data, publication process, and technology landscape; architecture and core functionality of a modern Open Data Management System (CKAN used as example). Deployment, administration and customisation; deploying extensions; integration; geospatial and other special capabilities; engaging with the CKAN community.

Custom training

We can offer training programs tailored to your specific needs, for your organisation, data domain, or locale. Get in touch today to discuss your requirements!

Working with data

We also run the School of Data, which helps civil society organisations, journalists and citizens learn the skills they need to use data effectively, through both online and in-person “learning through doing” workshops. The School of Data runs data-driven investigations and explorations, and data clinics and workshops from “What is Data” up to advanced visualisation and data handling. As well as general training and materials, we offer topic-specific and custom courses and workshops. Please contact schoolofdata@okfn.org to find out more.

As with all of our work, all relevant materials will be openly licensed, and we encourage others (in the global Open Knowledge Foundation network and beyond) to use and build on them.

What’s the point of open data?

Martin Tisne - September 17, 2013 in Access to Information, Open Data, Open Government Data

I’ve been puzzling for a while how the open data community can help the many great groups that have been fighting for transparency of key money flows for the past decade and more. I think one answer may be that open data helps us go beyond simply making information available. If done well, it can help us make it accessible and relevant to people, which has been the holy grail for transparency advocates for a long time.

The transparency community has focused too much on just getting information out there (making information available). But what’s the point of having information available if it’s not accessible? What’s the use of public reports that are only nominally ‘public’ because they languish in filing cabinets or ‘PDF deserts’ hidden within an obscure website?

If we can get this information more accessible, we can then work to increase participation and help people use it. This for me is what open data people are talking about when they talk about open formats. Machine readability and open formats matter because they are tools to increase access. I’ve seen too many techies talk about ‘open formats’ and activists’ eyes glaze over. But I think we’re both talking about the same thing we hold dear: improving access to vital data for all.

Likewise, it’s the connections between the datasets that are powerful and interesting. You may not care so much to know where most people under 15 years old live in your country, but if you’re told that those that live close to a nuclear waste disposal site happen to have the highest cancer rates, then it becomes seriously relevant. Same as above, techies often talk about technical data standards and get quizzical/skeptical – at best – looks in exchange. But technical data standards are the fuel that allows policy wonks to compare datasets, which creates relevant data. Connecting the dots makes it policy relevant – without data, you can’t make policy.

[availability of data] => [accessibility of data] => [comparability of data]

[availability of data] => [open formats] => [data standards]

Follow the Money groups do amazing work: extractives’ transparency advocates campaigning for vital releases of information on oil, gas, mining revenues into the hundreds of millions of dollars. Groups looking at curbing illicit flows of funds out of desperately poor countries via shell companies and phantom firms. Activists who scrutinize budgets, everything from big ticket national budget allocations, all the way down to very local issues like your local school spending on basic reading materials. And many more.

Together, these groups share one big thing in common – they are all seeking to follow the money. In other words, they are all trying to understand how money either gets in to government coffers, or how it fails to get there, and then how and whether it is spent for the good of the many, rather than the few lining their pockets.

To succeed, we all need data that’s not only public (e.g. public registries of beneficial ownership) but also accessible (in open formats) and comparable to other money flows.

Let’s work together to make it happen.

The following guest post from Martin Tisné was first published on his personal blog.

If you’re at OKCon 2013 and interested in joining the Open Knowledge Foundation and ONE to follow the money, you can come to our session on this topic in Geneva, on Wednesday 18th September, 10:30-11:30 (in Room 8, Floor 2 at the Centre International de Conférences Genève – CICG). Due to limited space, if you’re interested in joining us please email followthemoney@okcon.org.

“Follow the Money” with ONE and the Open Knowledge Foundation

Jonathan Gray - September 12, 2013 in Featured, OKCon, Open Data, Public Money

The following post is from Alan Hudson, Policy Director (Transparency & Accountability) at ONE and Jonathan Gray, Director of Policy and Ideas at the Open Knowledge Foundation.

We want to see a world in which citizens are able to hold decision-makers to account for the use of public money, using information about where it comes from, how it’s spent and what results it delivers, to drive improvements in service delivery and accelerate progress against poverty.

To this end, ONE and the Open Knowledge Foundation are excited to share the news about our plans to support and strengthen the community of activists and advocacy organisations pushing for the transparency that is needed if citizens around the world are to be able to follow the money.

The Challenge: Building a Better Connected Global Financial Transparency Movement

The number of organisations and initiatives working to enhance transparency about the use of public money is growing.

There are various focal points for that activity, covering different stages of the flow of public money – from resource availability (tax, aid, extractives and illicit financial flows), to resource allocation (budgets and contracts) to results (inc. in particular sectors).

This focused work is essential, but following the money requires that people can track public money throughout the flow of resources.

Put simply, there is a need to smash the silos that too often separate various transparency initiatives around the world, focusing on different aspects of financial transparency.

Furthermore, there is a need for the emerging fiscal transparency movement to ensure that transparency gains are translated into improved accountability and service delivery.

To enable this, we need to make sure that the data that is made available as a result of transparency wins is usable, used and proves to be useful.

And, we need to join the dots – creating a better connected global fiscal transparency movement that supports more effective collaboration between organisations and individuals working in this space.

To help to join the dots, in the first instance we plan to do four things:

  • Firstly, we will identify and bring together organisations and individuals that are keen, and have the capacity, to work together to join the dots in the fiscal transparency space, to start talking about ways in which we might be able to work together more effectively.
  • Secondly, we will work with those organisations to develop a shared vision and a set of principles that are key to achieving that vision, with input from a network of organisations who are committed to promoting them around the world and across different sectors.
  • Thirdly, we will develop a campaign to promote the principles that need to be in place to support citizens’ efforts to follow the money.
  • Finally, we will identify opportunities for specific activities that participating organisations might pursue. These might be at the international level (e.g. through the G20), in the north (e.g. as regards EU Anti-Money Laundering legislation), in the south (e.g. through in-country Follow the Money campaigns), or, better still, across multiple levels using local learning to influence international policy processes.

What’s Next?

We’re holding a session to discuss plans for the Follow the Money initiative at OKCon 2013 in Geneva, on Wednesday 18th September, 10:30-11:30 (in Room 8, Floor 2 at the Centre International de Conférences Genève – CICG). Due to limited space, if you’re interested in joining us please email followthemoney@okcon.org.

We’re also planning various activities around the Open Government Partnership Summit in the UK later this autumn – so watch this space!

If you or your organisation are interested in joining us, you can get in touch via the following form.

An Open Letter on the UK’s Proposed Lobbying Bill

Jonathan Gray - September 9, 2013 in Access to Information, Featured, Open Data, Open Government Data, Policy

The following is an open letter to the Prime Minister and Deputy Prime Minister about the UK’s proposed Lobbying Bill, initiated by the Open Knowledge Foundation and signed by organisations working for greater government transparency and openness in the UK and around the world. A version of the letter was printed in today’s edition of The Independent newspaper.

For more about our position on this topic, you can read our recent blog post on the importance of lobbyist registers. For press enquiries please contact press@okfn.org.

The Lobbying Bill will be a missed opportunity for government openness unless crucial changes are made


Rt Hon David Cameron MP
Rt Hon Nick Clegg MP
Houses of Parliament
London
SW1A 0AA

Cc: Andrew Lansley CBE MP (Leader of the House of Commons),
Francis Maude MP (Minister for the Cabinet Office),
Chloe Smith MP (Minister for Political and Constitutional Reform),
Graham Allen MP (Chair of Political and Constitutional Reform Committee).

6th September 2013

Dear Prime Minister and Deputy Prime Minister,

We, the undersigned, strongly urge government to pause and redraft the proposed Lobbying Bill so that it will provide citizens with a genuine opportunity to scrutinise the activities of lobbyists in the UK.

The current version of the lobbyist register would only cover a small fraction of active lobbyists, leaving the public in the dark about the rest of the UK’s £2 billion lobbying industry. It will also not reveal any meaningful information on their activities.

We think a decent lobbyist register – which says who is lobbying whom, what they are lobbying for and how much they are spending – should be an essential part of the UK government’s openness agenda, and a key measure to ensure that lobbying is transparent and effectively regulated.

Crucially it should not just be restricted to consultant lobbyists, but should also include in-house lobbyists, big consultancies who offer a range of services, and other entities which offer lobbying services such as think tanks.

Furthermore we think it is essential the UK’s lobbyist register is published as machine-readable open data so that its contents can be analysed, connected with other information sources, and republished.

The UK has been a pioneer in opening up its public data and has a major opportunity to be a world leader in government openness at the Open Government Partnership Summit in the UK this autumn, following on from its success in putting open data at the top of the agenda at the G8 with the Open Data Charter.

However, if the Lobbying Bill goes ahead as it is without further changes, then it will be a significant missed opportunity for government openness in the UK, and a major blow to the government’s aspiration to be – in the words of the Prime Minister – “the most open and transparent government in the world”.

Signed,

The world needs better lobbyist registers – but the UK’s proposed lobbying bill won’t help

Jonathan Gray - September 4, 2013 in Featured, Open Data, Open Government Data, Policy

Lobbyist registers are supposed to enable citizens to find out who is lobbying whom for what, and how much they are spending in the process.

They are supposed to help to safeguard against big money having an unfair influence in politics – ultimately to ensure that political decisions are based on argument, evidence and democratic deliberation, and not bought with cash from the highest bidder.

We think lobbyist registers are an essential part of government transparency, and that every country in the world ought to have one.

Furthermore we think it is essential that lobbyist registers are published as open data so that their contents can be easily analysed, queried, and connected with other information sources.
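
As a rough illustration of what machine-readable publication makes possible, the sketch below loads a tiny, entirely made-up register in CSV form and totals declared spending per lobbying target. The column names (lobbyist, client, target, issue, spend_gbp) and every row are hypothetical; no real register is assumed to use this layout.

```python
import csv
import io
from collections import defaultdict

# Hypothetical register extract: names, columns and figures are invented.
REGISTER_CSV = """lobbyist,client,target,issue,spend_gbp
Acme Public Affairs,Example Energy Ltd,Department A,energy,12000
Acme Public Affairs,Example Bank plc,Department B,tax,8000
Example Think Tank,Example Energy Ltd,Department A,environment,5000
"""

def spend_by_target(csv_text):
    """Return the total declared spend for each lobbying target."""
    totals = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row["target"]] += int(row["spend_gbp"])
    return dict(totals)

totals = spend_by_target(REGISTER_CSV)
```

The same few lines would be far harder to write against a register published only as scanned documents or free text, which is the practical case for open, structured publication.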

As we’re increasingly seeing corporations and special interest groups lobbying across borders, we’d like to track how big money is shaping discussion and decisions about issues that matter – from energy and the environment to tax and trade – in countries around the world.

We think that this kind of inquiry is essential for democracies to function.

While the UK is a world leader in opening up its public data, unfortunately the proposed Lobbying Bill in its current form will not deliver the lobbyist register that the UK needs.

Aside from widespread concerns that it will have a “chilling effect on civil society and its freedom of expression”, the bill contains major loopholes and omissions which mean that it will not deliver real or meaningful transparency around lobbying in the UK.

Firstly, the bill would only apply to a fraction of the UK’s £2 billion lobbying industry. It would only require disclosures from those whose main business is lobbying. Hence it would not cover companies who have in-house lobbyists, big lobbying consultancies who offer a range of services, and other entities which offer lobbying services such as think tanks, law firms or management consultancies. And for those whose main business is lobbying it only covers those who lobby the highest echelons of government – not special advisers or mid-level civil servants.

Secondly, the bill would require lobbyists to disclose very little information about their activities. Essentially it asks lobbyists for a list of their clients and nothing at all about which issues they lobby on, which departments they target, or how much they are paid.

We at the Open Knowledge Foundation sincerely hope that the proposed bill will be revised to address these and other limitations.

If the bill goes ahead as it is, then it will be a significant missed opportunity for government openness in the UK, and a major blow to the government’s aspiration to be – in the words of the Prime Minister – “the most open and transparent government in the world”.

If you’d like to read more you can take a look at SpinWatch’s analysis. While MPs voted for a second reading last night, there’s still time to ask them to reconsider the bill. If you’re based in the UK you can write to your MP either via SpinWatch’s form or with your own message at WriteToThem.

How can open data lead to better data quality?

Jonathan Gray - September 3, 2013 in Featured, Open Data, Open Government Data, Policy, WG Open Government Data

Open data can be freely used by anyone – which means that data users can help to fix, enrich or flag problems with the data, leading to improvements in its quality.

The Open Knowledge Foundation is currently looking to collect the best examples and stories we can find about how open data can lead to better data.

We’re particularly interested in hearing stories about how open government data has been checked, corrected and enhanced by citizens, civil society groups and others.

So far we have some really great examples – including:

  • Russian open data advocates trawling for errors in over 20 million procurement documents leading to fixes from the Treasury
  • OpenStreetMap volunteers correcting the locations of 18,000 bus stops in the UK and over 1,800 street names in Denmark
  • Data quality reports from the OpenSpending project leading to rapid improvements in the quality of UK government expenditure data

You can see the full list in progress at: http://bit.ly/opendata-betterdata

If you know of any more good examples, please send them our way and we’ll add them to the list.

We hope this will become a powerful piece of evidence that we can use to encourage public bodies and other data publishers to open up.

Map showing history of edits to OpenStreetMap in London by Mapbox.

This initiative started life on the Open Knowledge Foundation’s open-government mailing list, which we encourage you to join if you are interested in open government data and how it can be used to increase accountability around the world.

Open Data Privacy

Laura James - August 27, 2013 in Featured, Ideas and musings, Open Data, Open Data and My Data, Open Government Data, Privacy

“yes, the government should open other people’s data”

Traditionally, the Open Knowledge Foundation has worked to open non-personal data – things like publicly-funded research papers, government spending data, and so on. Where individual data was a part of some shared dataset, such as a census, great amounts of thought and effort had gone into ensuring that individual privacy was protected and that the aggregate data released was a shared, communal asset.

But times change. Increasing amounts of data are collected by governments and corporations, vast quantities of it about individuals (whether or not they realise that it is happening). The risks to privacy through data collection and sharing are probably greater than they have ever been. Data analytics – whether of “big” or “small” data – has the potential to provide unprecedented insight; however, some of that insight may come at the cost of personal privacy, as separate datasets are connected and correlated.

Both open data and big data are hot topics right now, and at such times it is tempting for organisations to get involved in such topics without necessarily thinking through all the issues. The intersection of big data and open data is somewhat worrying, as the temptation to combine the economic benefits of open data with the current growth potential of big data may lead to privacy concerns being disregarded. Privacy International are right to draw attention to this in their recent article on data for development, but of course other domains are affected too.

Today, we’d like to suggest some terms to help the growing discussion about open data and privacy.

Our Data is data with no personal element, and a clear sense of shared ownership. Some examples would be where the buses run in my city, what the government decides to spend my tax money on, how the national census is structured and the aggregate data resulting from it. At the Open Knowledge Foundation, our default position is that our data should be open data – it is a shared asset we can and should all benefit from.

My Data is information about me personally, where I am identified in some way, regardless of who collects it. It should not be made open or public by others without my direct permission – but it should be “open” to me (I should have access to data about me in a useable form, and the right to share it myself however I wish, if I choose to do so).

Transformed Data is information about individuals, where some effort has been made to anonymise or aggregate the data to remove individually identified elements.
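
A minimal sketch of one such transformation, aggregation, is shown below. The records, field names and the `aggregate_by_area` helper are all hypothetical, and dropping identifiers and counting is only a starting point: it does not by itself guarantee that individuals cannot be re-identified.

```python
from collections import Counter

# Hypothetical individual-level records ("my data"); all values are invented.
records = [
    {"name": "Person A", "area": "North", "uses_bus": True},
    {"name": "Person B", "area": "North", "uses_bus": False},
    {"name": "Person C", "area": "South", "uses_bus": True},
    {"name": "Person D", "area": "North", "uses_bus": True},
]

def aggregate_by_area(rows):
    """Count bus users per area, discarding names and other row-level detail."""
    return dict(Counter(row["area"] for row in rows if row["uses_bus"]))

bus_users = aggregate_by_area(records)
```

Note that a count of 1, as for one of the areas here, can still point to a single person, which is why very small groups may need further treatment before anything is released.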

We propose that there should be some clear steps which need to be followed to confirm whether transformed data can be published openly as our data. A set of privacy principles for open data, setting out considerations that need to be made, would be a good start. These might include things like consulting key stakeholders including representatives of whatever group(s) the data is about and data privacy experts around how the data is transformed. For some datasets, it may not prove possible to transform them sufficiently such that a reasonable level of privacy can be maintained for citizens; these datasets simply should not be opened up. For others, it may be that further work on transformation is needed to achieve an acceptable standard of privacy before the data is fit to be released openly. Ensuring the risks are considered and managed before data release is essential. If the transformations provide sufficient privacy for the individuals concerned, and the principles have been adhered to, the data can be released as open data.

We note that some of “our data” will have personal elements. For instance, members of parliament have made a positive choice to enter the public sphere, and some information about them is therefore necessarily available to citizens. Data of this type should still be considered against the principles of open data privacy we propose before publication, although the standards compared against may be different given the public interest.

This is part of a series of posts exploring open data and privacy, which we feel is a very important issue. If you are interested in these matters, or would like to help develop privacy principles for open data, join the working group mailing list. We’d welcome suggestions and thoughts on the mailing list or in the comments below. You can also talk to us and the Open Rights Group, with whom we are working, at the Open Knowledge Conference and other events this autumn.
