The following guest post is from Regards Citoyens, a French organisation that promotes open data.
Three months ago, the French Prime Minister officially announced the creation of the EtaLab governmental team, dedicated to the future data.gouv.fr. On Friday May 27th, two official texts were published: a decree (fr) that defines new legal rules on data licensing, and a circular (fr) addressed to all administrative services, specifying the processes for releasing data on the future platform.
The good news is that the lobbyists in favour of fee-based data lost this battle: free-of-charge data will be the default rule for new datasets, as well as for currently fee-based data that is not registered as such by July 2012. Any decision to keep fees on a dataset will have to be reviewed by an official commission (whose board, unfortunately, gives disproportionate representation to private data sellers), duly justified, and made official by an individual decree from the Prime Minister. Every fee-based dataset will also have to be listed in an official repository of fee-based data. It is sadly still unclear whether this repository will be part of data.gouv.fr. Will real open data be intermixed with fee-based data on the platform?
All other data should be released freely under the conditions of a new licence, to be drafted and discussed over the next few months and released by August 24th. The circular states that this licence should follow free and open principles, even though its drafters have never been involved in writing such a licence before. We will actively follow this process in order to push for full compatibility with open data principles such as the Open Knowledge Definition.
Worse news is the exclusion from these rules of public administrations with commercial objectives (EPICs) and of public services delegated to private businesses, such as transport companies. The circular also gives administrations the possibility of writing their own licences to meet certain specific, unspecified needs: this might lead to a growing number of slightly incompatible licences, with the risk of some datasets being undermined by non-commercial clauses.
Finally, the circular mentions file formats. It is heartening to read the recommendation to use a range of open formats such as CSV, XML, KML or ODS. It is unfortunate, though, to see ODS mistakenly listed as a text document format, or to see the proprietary XLS format also proposed for spreadsheet documents.
To conclude, these official texts represent real progress and strongly encourage French administrations to open their data. But some serious risks remain, and open data must not be reduced to free-of-charge datasets: open formats and free licences have to be at the core of EtaLab’s coming work.
The following guest post is from Sverre Andreas Lunde-Danbolt who works for the Department for ICT and renewal in the Norwegian Ministry of Government Administration, Reform and Church Affairs, and who is a member of the OKF’s Working Group on Open Government Data
The Norwegian Ministry of Government Administration and Reform has just sent a draft version of a new Norwegian Licence for Open Data (NLOD) out for a formal hearing here in Norway (see the hearing documents (in Norwegian) and a blog post about the licence and the hearing (also in Norwegian)). After the hearing, we intend to recommend that all government agencies in Norway use this licence when they publish data.
Government agencies publishing data are not always very good at specifying the terms under which the information can be reused. In Norway, at least, the introduction of a new sui generis licence for each new dataset has become a predictable exercise. This is confusing to the reuser, adding an unnecessary layer of uncertainty and, in some cases, even impeding legitimate reuse.
The Ministry has therefore decided to establish one common licence. This will reduce the number of open data licences in Norway (one licence to rule them all). The licence is a rather straightforward attribution licence under Norwegian law. Its main purpose is to enable reuse in Norway, but to make sure data under NLOD can be combined with other data as well as reused internationally, the licence states clearly that it is compatible with the Open Government Licence (v1.0), the Creative Commons Attribution Licence (generic v1.0, v2.0, v2.5 and unported v3.0), and the Open Data Commons Attribution Licence (v1.0).
The most important details in the licence are the following:
- Personal data is not covered by the licence. This is the same as in the Open Government Licence.
- The reuser cannot distort the information or use it to mislead. The NLOD definition of this seems to be less restrictive than the one used in the Open Government Licence.
- NLOD specifies that the licensor can provide more information on the quality or delivery of the data, but that this kind of information is outside the scope of the licence. NLOD only covers the rights to use the information.
- Information licensed under NLOD will also be licensed under future versions of the licence, provided the licensor has not explicitly licensed the information under v1.0 only. This gives the Norwegian Government more sway over public sector information, and reduces the chances of data ending up as a kind of orphan work in the future.
What do you think?
The following post is by Mark MacGillivray, a member of the Open Knowledge Foundation Working Group on Open Bibliographic Data.
Last week the Open Biblio Principles were launched by the Open Knowledge Foundation’s Working Group on Open Bibliographic Data. The principles are the product of six months of development and discussion within the working group and the wider bibliographic community:
Producers of bibliographic data such as libraries, publishers, universities, scholars or social reference management communities have an important role in supporting the advance of humanity’s knowledge. For society to reap the full benefits from bibliographic endeavours, it is imperative that bibliographic data be made open — that is available for anyone to use and re-use freely for any purpose.
- When publishing bibliographic data make an explicit and robust license statement.
- Use a recognized waiver or license that is appropriate for data.
- If you want your data to be effectively used and added to by others it should be open as defined by the Open Definition (http://opendefinition.org/) – in particular non-commercial and other restrictive clauses should not be used.
- Where possible, explicitly place bibliographic data in the Public Domain via PDDL or CC0.
You can read the full version of the principles at: http://openbiblio.net/principles
Please help us spread the word, and the links, to individuals and organisations across the academic, library and publisher community.
Lastly, we are also working on alternative language versions so if you are interested in doing a translation please leave a comment or email mark [dot] macgillivrary [at] okfn [dot] org.
Last week, an article in the Wall Street Journal talked about the Open Data Partnership, which “will allow consumers to edit the interests, demographics and other profile information collected about them. It also will allow people to choose to not be tracked at all.” The article goes on to discuss data mining and privacy issues, which are hot topics in today’s digital world, where we all wonder just how much of our personal data is out there and how it’s being used. These are valid concerns being talked about in other, more appropriate fora. I, however, would like to address my personal pet peeve about the dilution of the term open data.
The Open Knowledge Definition says it this way, “A piece of content or data is open if you are free to use, reuse, and redistribute it — subject only, at most, to the requirement to attribute and share-alike.” Generally, this means that the data should be released in a format that is free of royalties and other IP restrictions. The problem is that an increasing number of people are using the term open data to mean publicly available data.
In the article, the CEO of the startup directing the Open Data Partnership says the goal is to “be more transparent and give consumers more control” of the data that is collected and shared. Providing a mechanism in which consumers can decide what information can be made available to advertisers is a laudable goal. However, this “open data” initiative focuses on what data is made available, when open data is really about how data is made available. This definitional shift is a problem, particularly for governments that are implementing data policies.
Simply put, all open data is publicly available. But not all publicly available data is open.
Open data does not mean that a government or other entity releases all of its data to the public. It would be unconscionable for the government to give out all of your private, personal data to anyone who asks for it. Rather, open data means that whatever data is released is done so in a specific way to allow the public to access it without having to pay fees or be unfairly restricted in its use.
In a previous article, I wrote about how the Massachusetts Bay Transit Authority (MBTA) opened up its transit data to software developers. Within two months, six new trip-planning applications for bus and train riders had been built at no cost to the MBTA. That’s the power of open data. It was data produced by the government which was released to the public in an open format (GTFS) for free, under a license that allowed for use and redistribution.
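Part of why developers could build on that release so quickly is the format itself: GTFS is nothing more than a set of plain CSV text files, readable with standard tools in any language. A minimal sketch of reading a GTFS `stops.txt` (the stop records below are invented for illustration, not real MBTA data):

```python
import csv
import io

# GTFS feeds are plain CSV files; stops.txt lists each stop with its
# coordinates. These rows are invented examples, not real MBTA records.
stops_txt = """stop_id,stop_name,stop_lat,stop_lon
stop-001,Example Square,42.3601,-71.0589
stop-002,Sample Street,42.3736,-71.1190
"""

# DictReader maps each row to a dict keyed by the header line,
# so no vendor tool or paid software is needed to reuse the data.
stops = list(csv.DictReader(io.StringIO(stops_txt)))

for stop in stops:
    print(stop["stop_name"], float(stop["stop_lat"]), float(stop["stop_lon"]))
```

That low barrier to entry, plus a license permitting reuse, is what turned one agency’s dataset into six free applications.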
Why does this matter? If open data is misunderstood as releasing any and all data to the public, people will become opposed to the concept due to their concerns about privacy. What we, as policy advocates, want to encourage is that the data that governments do and should publish is done so in a way to ensure equal public access by all citizens. In other words, you shouldn’t have to buy a particular vendor’s product in order to be able to open, use, or repurpose the data. You, as a taxpayer, have already paid for the collection of the data. You shouldn’t have to pay an additional fee to open it.
We’ve all seen, from the recent news about Wikileaks, that there are real privacy and/or security concerns with putting all the government’s data out there, but that is a separate issue and shouldn’t be confused with open data. Whether data should be made publicly available is where privacy concerns come into play. Once it has been determined that government data should be made public, then it should be done so in an open format.
Am I being nitpicky about the term? Maybe. But we’ve seen from other tech policy battles that good definitions are crucial to framing the debate.
Guest - October 19, 2010 in Interviews, Legal, Open Data, Open Data Commons, Open Definition, Open Government Data, Open Knowledge Definition, Open Knowledge Foundation, Public Domain, WG Open Licensing
Open Access journalist extraordinaire Richard Poynder recently interviewed the Open Knowledge Foundation’s Jordan Hatcher about data licensing, the public domain, and lots more. An excerpt is reproduced below. The full version is available on Richard’s website.
Over the past twenty years or so we have seen a rising tide of alternative copyright licences emerge — for software, music and most types of content. These include the Berkeley Software Distribution (BSD) licence, the General Public Licence (GPL), and the range of licences devised by Creative Commons (CC). More recently a number of open licences and “dedications” have also been developed to help people make data more freely available.
Why have these licences been developed? How do they differ from traditional copyright licences? And can we expect them to help or hinder reform of the traditional copyright system — which many now believe has got out of control? I discussed these and other questions in a recent email interview with Jordan Hatcher.
A UK-based Texas lawyer specialising in IT and intellectual property law, Jordan Hatcher is co-founder of OpenDataCommons.org, a board member of the Open Knowledge Foundation (OKF), and blogs under the name opencontentlawyer.
RP: Can you begin by saying something about yourself and your experience in the IP/copyright field?
JH: I’m a Texas lawyer living in the UK and focusing on IP and IT law. I concentrate on practical solutions and legal issues centred on the intersection of law and technology. While I like the entire field of IP, international IP and copyright are my favourite areas.
As to more formal qualifications, I have a BA in Radio/TV/Film, a JD in Law, and an LLM in Innovation, Technology and the Law. I’ve been on the team that helped bring Creative Commons licences to Scotland and have led, or been a team member on, a number of studies looking at open content licences and their use within universities and the cultural heritage sector.
I was formerly a researcher at the University of Edinburgh in IP/IT, and for the past 2.5 years have been providing IP strategy and IP due diligence services with a leading IP strategy consultancy in London.
I’m also the co-founder and principal legal drafter behind Open Data Commons, a project to provide legal tools for open data, and the Chair of the Advisory Council for the Open Definition. I sit on the board for the Open Knowledge Foundation.
RP: It might also help if you reminded us what role copyright is supposed to play in society, how that role has changed over time (assuming that you feel it has) and whether you think it plays the role that society assigned to it successfully today.
JH: Wow, that’s a big question, and one whose answer has changed quite a bit since the origin of copyright. As with most law, I take a utilitarian / legal realist view that the law is there to encourage a set of behaviours.
Copyright law is often described as being created to encourage more production and dissemination of works, and, like any law, it’s imperfect in its execution.
I think what’s most interesting about copyright history is the technology side (without trying to sound like a technological determinist!). As new and potentially disruptive technologies have come along and changed the balance — from the printing press all the way to digital technology — the way we have reacted has been fairly consistent: some try to hang on to the old model as others eagerly adopt the new model.
For those interested in learning more about copyright’s history, I highly recommend the work of Ronan Deazley, and suggest people look at the first sections in Patry on Copyright. They could also usefully read Patry’s Moral Panics and the Copyright Wars. Additionally, many historical materials on copyright are available on the website of a research project dedicated to the topic.
RP: In the past twenty years or so we have seen a number of alternative approaches to licensing content develop — most notably through the General Public Licence and the set of licences developed by the Creative Commons. Why do you think these licences have emerged, and what are the implications of their emergence in your view?
JH: I see free and open licence development as happening within three tranches, all related to a specific area of use.
1. FOSS for software. Alongside the GPL, there have been a number of licences developed since the birth of the movement (and continuing to today), all aimed at software. These licences work best for software and tend to fall over when applied to other areas.
2. Open licences and Public licences for content. These are aimed at content, such as video, images, music, and so on. Creative Commons is certainly the most popular, but definitely not the first. The birth of CC does however represent a watershed moment in thinking about open licensing for content.
I distinguish open licences from public licences here, mostly because Creative Commons is so popular. Open has so many meanings to people (as does “free”) that it is critical to define from a legal perspective what is meant when one says “open”. The Open Knowledge Definition does this, and states that “open” means users have the right to use, reuse, and redistribute the content with very few restrictions — only attribution and share-alike are allowed restrictions, and commercial use must specifically be allowed.
The Open Definition means that only two of the main six CC licences are open content licences — CC-BY and CC-BY-SA. The other four involve the No Derivatives (ND) restriction (thus prohibiting reuse) or the Non Commercial (NC) restriction. These four are what I refer to as “public licences”; in other words, they are licences provided for use by the general public.
Of course CC’s public domain tools, such as CC0, all meet the Open Definition as well because they have no restrictions on use, reuse, and redistribution.
I wrote about this in a bit more detail recently on my blog.
3. Open Data Licences. Databases are different from content and software — they are a little like both in what users want to do with them and how licensors want to protect them, but are different from software and content in both the legal rights that apply and how database creators want to use open data licences.
As a result, there’s a need for specific open data licences, which is why we founded Open Data Commons. Today we have three tools available. It’s a new area of open licensing and we’re all still trying to work out all the questions and implications.
RP: As you say, data needs to be treated differently from other types of content, and for this reason a number of specific licences have been developed — including the Public Domain Dedication Licence (PDDL), the Public Domain Dedication Certificate (PDDC) and Creative Commons Zero. Can you explain how these licences approach the issue of licensing data in an open way?
JH: The three you’ve mentioned are all aimed at placing work into the public domain. The public domain has a very specific meaning in a legal context: It means that there are no copyright or other IP rights over the work. This is the most open/free approach as the aim is to eliminate any restrictions from an IP perspective.
There are some rights that can be hard to eliminate, and of course patents may still be an issue depending on the context (but perhaps that’s a conversation for another time).
RP: Can you say something about these tools, and what they bring to the party?
JH: All three are tools to help increase the public domain and make it more known and accessible.
There’s some really exciting stuff going on with the public domain right now, including with PD calculators — tools to automatically determine whether a work is in the public domain. The great thing about work in the public domain is that it is completely legally interoperable, as it eliminates copyright restrictions.
See the rest of the interview on Open and Shut…
Let’s face it, we often have a definition problem.
It’s critical to distinguish “open licenses” from “public licenses” when discussing IP licensing, especially online — mostly because Creative Commons is so popular and as a result has muddied the waters a bit.
Open has so many meanings to people (same of course as with “free software” or free cultural works) that it is critical to define from a legal perspective what is meant when one says “open”. The Open Knowledge Definition does this, and states that “open” means users have the right to use, reuse, and redistribute the content with very few restrictions — only attribution and share-alike restrictions are ok, and commercial use must specifically be allowed.
The Open Definition means that only two out of the main six CC licenses are open content licenses — CC-BY and CC-BY-SA. The other four involve one of the two non-open license elements: the No Derivatives (ND) restriction (thus prohibiting reuse) or the Non Commercial (NC) restriction. These four are “public licenses”; in other words, they are licenses provided for use by the general public.
Of course CC’s public domain tools, such as CC0, all meet the Open Definition as well because they have no restrictions on use, reuse, and redistribution.
The Open Data Commons legal tools, including the PDDL, the ODbL and the ODC Attribution License, all comply with the Open Definition, and so are all open public licenses.
I haven’t done a full survey, but the majority of open licenses (in terms of popularity) probably also fit the definition of public licenses, as open license authors tend to draft licenses for public consumption (and these tend to be the most used ones, naturally). Many open licenses aren’t public licenses though — mainly those drafted for specific use by a specific licensor, such as a government or business. So the UK government’s new Open Government License isn’t a public license, because it’s not meant to be used without alteration by other governments; but provided it meets the Open Definition, it would be an open license.
A simple Venn Diagram might be:
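The two overlapping circles can also be sketched in code. The license lists below are illustrative only, drawn from the examples above, and assume (per the previous paragraph) that the UK OGL meets the Open Definition:

```python
# Two overlapping sets, following the classification in the post:
# "open" = meets the Open Definition; "public" = drafted for use by
# the general public. Membership below is illustrative, not exhaustive.
open_licenses = {"CC-BY", "CC-BY-SA", "CC0", "PDDL", "ODbL",
                 "ODC-Attribution", "UK-OGL"}
public_licenses = {"CC-BY", "CC-BY-SA", "CC-BY-NC", "CC-BY-ND", "CC0",
                   "PDDL", "ODbL", "ODC-Attribution"}

# The overlap: licenses that are both open and public, e.g. CC-BY and
# the Open Data Commons tools.
both = open_licenses & public_licenses
# Open but not public: bespoke licenses for a single licensor, like the OGL.
open_only = open_licenses - public_licenses
# Public but not open: CC licenses carrying NC or ND elements.
public_only = public_licenses - open_licenses

print(sorted(both))
print(sorted(open_only))
print(sorted(public_only))
```

The point of the picture is simply that neither set contains the other: a license can sit in either circle alone, or in the overlap.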
A few weeks back we blogged about Russ Nelson’s proposals for the Open Source Initiative (OSI) to adopt the Open Knowledge Definition, our standard for openness in relation to content and data.
Russ has written back to us with some notes and questions from a session on this at OSCON:
Okay, so, as promised, here is my report on the “Open Data Definition” BOF held on Wednesday, July 21, at 7PM. There were about ten people present, which is a reasonable attendance, particularly when set against the Google Android Hands-on session at which they gave out free Nexus One phones.
Didn’t seem wise to me to start from scratch, especially given the good work done by the Open Knowledge Foundation on their Open Knowledge Definition: http://www.opendefinition.org/okd/. So we read through it section by section, by way of review. Here are the questions we arrived at (thanks to Skud aka Kirrily Robert for taking notes):
- What happens with data that’s not copyrightable?
  - What about data that consists of facts about the world, so that even a collection of it cannot be copyrighted, but the exact file format can be? Many sub-federal-level governments in the US have to publish facts on demand but claim a copyright on the formatting.
- What about data that’s not accessible as a whole, but only through an API?
- We’re thinking that OKD #9 should read “execution of an additional agreement” rather than “additional license”.
- Does OKD #4 apply to works distributed in a particular file format? Is a movie not open data if it’s encoded in a patent-encumbered codec? Does it become open data if it’s re-encoded?
- What constitutes onerous attribution in OKD #5? If you get open data from somebody, and they have an attribution page, is it sufficient for you to comply with the attribution requirement if you point to the attribution page?
This serves as an invitation to discuss these issues on the new list email@example.com . Send subscription requests to firstname.lastname@example.org . Unsubscribe by sending a request to email@example.com .
If these issues are successfully resolved, then this committee will recommend to the OSI board that the OKD should be adopted as OSI approved. If they can’t be resolved by, say, the end of 2010, then we will give up on trying. Either way, the intent is to lay down the list by the end of this year unless the participants desire otherwise.
So if you’d like to join the conversation, please join the list! We’ve also created an Etherpad to gather responses to some of these issues:
If you’d like to translate the Definition into another language, or if you’ve already done so, please get in touch on our discuss list, or on info at the OKF’s domain name (okfn dot org).
I’m running a BOF at OSCON on Wednesday night, July 21st at 7PM, with the declared purpose of adopting an Open Source Definition for Open Data. It’s safe to say that the OSD has been quite successful in laying out a set of criteria for what is, and what is not, Open Source. We should adopt a definition of Open Data, even if it means merely endorsing an existing one. Will you join me there?
Subsequently a bunch of people wrote to Russell letting him know about the Open Knowledge Definition that we created a few years ago:
The Open Knowledge Definition (OKD) sets out principles to define ‘openness’ in knowledge – that’s any kind of content or data ‘from sonnets to statistics, genes to geodata’. The definition can be summed up in the statement that “A piece of knowledge is open if you are free to use, reuse, and redistribute it — subject only, at most, to the requirement to attribute and share-alike.”
Russell suggested there was scope for the OSI to adopt the OKD, and emailed us a further blurb for the event:
Should the Open Source Initiative write its own definition of Open Data? Or is the Open Knowledge Foundation’s definition up to snuff? Come help us decide at OSCON next week. We have a BOF scheduled at 19:00 on 21 July 2010. We’ll present the results of our decision to the OSI for adoption at its next board meeting.
We’re excited at the prospect that the OKD might get adopted as an official open data definition by OSI, and would love to hear from folks who plan to attend the session!