Defining Open Data
Open data is data that can be freely used, shared and built on by anyone, anywhere, for any purpose. This is the summary of the full Open Definition, which the Open Knowledge Foundation created in 2005 to provide both a succinct explanation and a detailed definition of open data.
As the open data movement grows, and even more governments and organisations sign up to open data, it becomes ever more important that there is a clear and agreed definition for what “open data” means if we are to realise the full benefits of openness, and avoid the risks of creating incompatibility between projects and splintering the community.
Open can apply to information from any source and about any topic. Anyone can release their data under an open licence for free use by, and for the benefit of, the public. Although we may think mostly about government and public sector bodies releasing public information such as budgets or maps, or researchers sharing their results data and publications, any organisation or individual can open information: corporations, universities, NGOs, startups, charities, community groups and private individuals alike.
There is open information in transport, science, products, education, sustainability, maps, legislation, libraries, economics, culture, development, business, design, finance and more. So the explanation of what open means applies to all of these information sources and types. Open may also apply both to data (big data and small data) and to content, like images, text and music!
So here we set out clearly what open means, and why this agreed definition is vital for us to collaborate, share and scale as open data and open content grow and reach new communities.
What is Open?
The full Open Definition provides a precise definition of what open data is. There are 2 important elements to openness:
- Legal openness: you must be allowed to get the data legally, to build on it, and to share it. Legal openness is usually provided by applying an appropriate (open) licence which allows for free access to and reuse of the data, or by placing data into the public domain.
- Technical openness: there should be no technical barriers to using that data. For example, providing data as printouts on paper (or as tables in PDF documents) makes the information extremely difficult to work with. So the Open Definition has various requirements for “technical openness,” such as requiring that data be machine readable and available in bulk.
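To make the two elements concrete, here is a minimal sketch of how a data catalogue might check a dataset against both. The `Dataset` record, its field names and the list of machine-readable formats are our own illustrative assumptions; they are not taken from the Open Definition, which should be consulted for the actual requirements.

```python
from dataclasses import dataclass

# Assumption for illustration only: a small sample of formats that a
# computer can process directly. The Open Definition does not prescribe
# this particular list.
MACHINE_READABLE_FORMATS = {"csv", "json", "xml"}


@dataclass
class Dataset:
    """Hypothetical catalogue record describing one dataset."""
    name: str
    licence_is_open: bool  # open licence applied, or data in the public domain
    file_format: str       # e.g. "csv" or "pdf"
    bulk_download: bool    # the whole dataset can be fetched in one go


def is_legally_open(ds: Dataset) -> bool:
    # Legal openness: people are allowed to get, build on and share the data.
    return ds.licence_is_open


def is_technically_open(ds: Dataset) -> bool:
    # Technical openness: machine readable and available in bulk.
    machine_readable = ds.file_format.lower() in MACHINE_READABLE_FORMATS
    return machine_readable and ds.bulk_download


def is_open(ds: Dataset) -> bool:
    # Both elements are needed; one without the other is not enough.
    return is_legally_open(ds) and is_technically_open(ds)


if __name__ == "__main__":
    budget_csv = Dataset("city-budget", True, "csv", True)
    budget_pdf = Dataset("city-budget-scan", True, "pdf", True)
    for ds in (budget_csv, budget_pdf):
        print(ds.name, "open" if is_open(ds) else "not open")
```

In this sketch an openly licensed budget published only as PDF tables fails the technical check, while the same budget published as a bulk CSV download passes both.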
There are a few key aspects of open which the Open Definition explains in detail. Open Data is usable by anyone, regardless of who they are, where they are, or what they want to do with the data; there must be no restriction on who can use it, and commercial use is permitted too.
Open data must be available in bulk (so it’s easy to work with) and it should be available free of charge, or at least at no more than a reasonable reproduction cost. The information should be digital, preferably available by downloading through the internet, and easily processed by a computer too (otherwise users can’t fully exploit the power of data, which is that it can be combined with other data to create new insights).
Open Data must permit people to use it, re-use it, and redistribute it, including intermixing with other datasets and distributing the results.
The Open Definition generally doesn’t allow conditions to be placed on how people can use Open Data, but it does permit a data provider to require that data users credit them in some appropriate way, that they make it clear if the data has been changed, or that any new datasets created using their data are also shared as open data.
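As a rough illustration of this rule, the sketch below flags any licence condition that falls outside the three permitted kinds named above. The condition labels (`"attribution"`, `"indicate-changes"`, `"share-alike"`, `"non-commercial"`) are hypothetical names we have chosen for this example; deciding whether a real licence complies is the job of the full Open Definition, not of a check like this.

```python
# Hypothetical labels for the three kinds of condition described above:
# crediting the provider, flagging changes, and sharing derived datasets
# as open data.
PERMITTED_CONDITIONS = {"attribution", "indicate-changes", "share-alike"}


def disallowed_conditions(conditions):
    """Return, in sorted order, the conditions that are not permitted."""
    return sorted(set(conditions) - PERMITTED_CONDITIONS)


def conditions_are_acceptable(conditions):
    """True when every condition is one of the permitted kinds."""
    return not disallowed_conditions(conditions)


if __name__ == "__main__":
    print(disallowed_conditions(["attribution", "share-alike"]))
    print(disallowed_conditions(["attribution", "non-commercial"]))
```

A licence that only asks for credit and share-alike passes, whereas one that adds a non-commercial restriction is flagged, which is consistent with commercial use being allowed for open data.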
There are 3 important principles behind this definition of open, which are why Open Data is so powerful:
- Availability and Access: that people can get the data
- Re-use and Redistribution: that people can reuse and share the data
- Universal Participation: that anyone can use the data
Governance of the Open Definition
Since 2007, the Open Definition has been governed by an Advisory Council. This is the group formally responsible for maintaining and developing the Definition and associated material. Its mission is to take forward Open Definition work for the general benefit of the open knowledge community, and it has specific responsibility for deciding which licences comply with the Open Definition.
The Council is a community-run body. New members of the Council can be appointed at any time by agreement of the existing members of the Advisory Council, and are selected for demonstrated knowledge and competence in the areas of work of the Council.
The Advisory Council operates in the open and anyone can join the mailing list.
About the Open Definition
The Open Definition was created in 2005 by the Open Knowledge Foundation with input from many people. The Definition was based directly on the Open Source Definition from the Open Source Initiative, so we were able to reuse most of the well-established principles and practices that the free and open source community had developed for software, and apply them to data and content.
Thanks to the efforts of many translators in the community, the Open Definition is available in 30+ languages.
More about openness coming soon
In coming days we’ll post more on the theme of explaining openness. This will include a more detailed exploration of the Open Definition; the relationship of the Open Definition to specific sets of principles for openness, such as the Sunlight Foundation’s 10 principles and Tim Berners-Lee’s 5 star system; why having a shared and agreed definition of open data is so important; and how one can go about “doing open data”.