An unprecedented amount of freely reusable government information is currently being released by public bodies around the globe. It is being consumed and reused by numerous stakeholders – including civic developers, data-literate citizens, data journalists, NGOs, researchers, and companies. There is a tremendous opportunity to create a thriving ecosystem of open data, in which numerous actors add value to a shared ‘commons’.
There is also a possibility that (whether through oversight or design) reusers consume government information without sharing back, leading to the creation of new data silos or restrictive licensing models.
For example, a university researcher might republish a dataset under a standard set of terms and conditions that prohibits republication and limits users to personal or research use. A company might apply its standard licensing agreement, which means that others can’t build on the republished datasets. A developer might put a collection of reformatted datasets online without an explicit license or legal notice – meaning others won’t know whether they are allowed to reuse them or not.
How can we try to ensure that open government data stays open? Perhaps we can try to promote ‘norms’ for reusers of open government data, to encourage them to contribute to a shared commons that everyone can benefit from. Here are a few suggestions:
- Open in, open out – If you pull open data into your website or system, then others should be able to pull it out as open data as well.
- If in doubt, let people know – If you are republishing open data, and you would like to keep it open, make sure to use an appropriate license or legal tool.
- Keep track of provenance and permissions – If you are mixing datasets from different sources, and want to make sure you don’t accidentally republish proprietary data under an open license, keep track of which parts are open and which are not. Rather than going for the lowest common denominator (e.g. ‘because not all of this is open, we’d better use a restrictive license’), record where each dataset comes from and let users know which ones they can and can’t reuse.
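The last suggestion, tracking provenance and permissions per dataset, could be sketched roughly as follows. This is only an illustration under stated assumptions: the `DatasetRecord` class, the `reuse_notice` helper, the example datasets, and the small `OPEN_LICENSES` allow-list are all hypothetical, and a real project would need to decide for itself which licenses count as open.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumption: an illustrative allow-list of SPDX-style identifiers treated as open.
OPEN_LICENSES = {"CC0-1.0", "CC-BY-4.0", "PDDL-1.0", "ODbL-1.0"}


@dataclass(frozen=True)
class DatasetRecord:
    """Provenance and permission details for one dataset in a mix (hypothetical)."""
    name: str
    source: str
    license_id: Optional[str]  # None: the source gave no explicit license

    @property
    def is_open(self) -> bool:
        # Unknown or missing licenses are treated as not open.
        return self.license_id in OPEN_LICENSES


def reuse_notice(records: List[DatasetRecord]) -> List[str]:
    """Build one line per dataset telling users whether they can reuse it."""
    notice = []
    for record in records:
        label = record.license_id or "no explicit license"
        status = "can reuse" if record.is_open else "cannot reuse"
        notice.append(
            f"{record.name} (source: {record.source}, license: {label}): {status}"
        )
    return notice


# Made-up example records, not real datasets.
records = [
    DatasetRecord("bus-stops", "city transport office", "CC-BY-4.0"),
    DatasetRecord("postcode-lookup", "commercial vendor", "proprietary"),
    DatasetRecord("reformatted-budgets", "independent developer", None),
]

for line in reuse_notice(records):
    print(line)
```

Keeping the license next to each dataset, rather than applying one license to the whole collection, is what lets the open parts stay open even when some of the mix is restricted or unlabeled.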
Can you think of any more? If so, let us know in a comment below or on our open-government list.
Dr. Jonathan Gray is Lecturer in Critical Infrastructure Studies at the Department of Digital Humanities, King’s College London, where he is currently writing a book on data worlds. He is also Cofounder of the Public Data Lab, and Research Associate at the Digital Methods Initiative (University of Amsterdam) and the médialab (Sciences Po, Paris). More about his work can be found at jonathangray.org and he tweets at @jwyg.