You are browsing the archive for Open Knowledge Definition.

Update on Open Source Initiative’s adoption of the Open Knowledge Definition

August 4, 2010 in External, OKF, Open Data, Open Definition, Open Knowledge Definition

A few weeks back we blogged about Russ Nelson’s proposals for the Open Source Initiative (OSI) to adopt the Open Knowledge Definition, our standard for openness in relation to content and data.

Russ has written back to us with some notes and questions from a session on this at OSCON:

Okay, so, as promised, here is my report on the “Open Data Definition” BOF held on Wednesday, July 21, at 7PM. There were about ten people present, which is a reasonable attendance, particularly when set against the Google Android Hands-on session at which they gave out free Nexus One phones.

Didn’t seem wise to me to start from scratch, especially given the good work done by the Open Knowledge Foundation on their Open Knowledge Definition: http://www.opendefinition.org/okd/. So we read through it section by section, by way of review. Here are the questions we arrived at (thanks to Skud aka Kirrily Robert for taking notes):

  1. What happens with data that’s not copyrightable?
     1a. What about data that consists of facts about the world and thus even a collection of it cannot be copyrighted, but the exact file format can be copyrighted? Many sub-federal-level governments in the US have to publish facts on demand but claim a copyright on the formatting.
  2. What about data that’s not accessible as a whole, but only through an API?
  3. We’re thinking that OKD #9 should read “execution of an additional agreement” rather than “additional license”.
  4. Does OKD #4 apply to works distributed in a particular file format? Is a movie not open data if it’s encoded in a patent-encumbered codec? Does it become open data if it’s re-encoded?
  5. What constitutes onerous attribution in OKD #5? If you get open data from somebody, and they have an attribution page, is it sufficient for you to comply with the attribution requirement if you point to the attribution page?

This serves as an invitation to discuss these issues on the new list open-data@opensource.org . Send subscription requests to open-data-subscribe@opensource.org . Unsubscribe by sending a request to open-data-unsubscribe@opensource.org .

If these issues are successfully resolved, then this committee will recommend to the OSI board that the OKD should be adopted as OSI approved. If they can’t be resolved by, say, the end of 2010, then we will give up on trying. Either way, the intent is to lay down the list by the end of this year unless the participants desire otherwise.

So if you’d like to join the conversation, please join the list! We’ve also created an Etherpad to gather responses to some of these issues:

Belarusian translation of the Open Knowledge Definition (OKD)

July 28, 2010 in Open Data, Open Definition, Open Knowledge Definition

We’ve just added a Belarusian translation of the Open Knowledge Definition thanks to Patricia Clausnitzer!

If you’d like to translate the Definition into another language, or if you’ve already done so, please get in touch on our discuss list, or on info at the OKF’s domain name (okfn dot org).

Should the Open Source Initiative adopt the Open Knowledge Definition?

July 19, 2010 in Open Data, Open Definition, Open Knowledge Definition

Russ Nelson, License Approval Chair at the Open Source Initiative (OSI), recently proposed a session at OSCON about OSI adopting a definition for open data:

I’m running a BOF at OSCON on Wednesday night July 21st at 7PM, with the declared purpose of adopting an Open Source Definition for Open Data. Safe enough to say that the OSD has been quite successful in laying out a set of criteria for what is, and what is not, Open Source. We should adopt a definition Open Data, even if it means merely endorsing an existing one. Will you join me there?

Subsequently a bunch of people wrote to Russell letting him know about the Open Knowledge Definition that we created a few years ago:

The Open Knowledge Definition (OKD) sets out principles to define ‘openness’ in knowledge – that’s any kind of content or data ‘from sonnets to statistics, genes to geodata’. The definition can be summed up in the statement that “A piece of knowledge is open if you are free to use, reuse, and redistribute it — subject only, at most, to the requirement to attribute and share-alike.”

Russell suggested there was scope for the OSI to adopt the OKD, and emailed us a further blurb for the event:

Should the Open Source Initiative write its own definition of Open Data? Or is the Open Knowledge Foundation’s definition up to snuff? Come help us decide at OSCON next week. We have a BOF scheduled at 19:00 on 21 July 2010. We’ll present the results of our decision to the OSI for adoption at its next board meeting.

We’re excited at the prospect that the OKD might get adopted as an official open data definition by OSI, and would love to hear from folks who plan to attend the session!

Why Share-Alike Licenses are Open but Non-Commercial Ones Aren’t

June 24, 2010 in Ideas and musings, Open Data, Open Definition, Open Knowledge Definition

It is sometimes suggested that there isn’t a real difference in terms of “openness” between share-alike (SA) and non-commercial (NC) clauses — both being some restriction on what the user of that material can do, and, as such, a step away from openness.

This is not true. A meaningful distinction can be drawn between share-alike and non-commercial clauses (or any other clause that discriminates against a particular type of person or field of endeavour), with the former being “open” and the latter being not “open”.

This distinction is important. It explains, for example, why Open Data Commons should not provide NC licenses but will provide a share-alike one. It is also relevant to Creative Commons, whose set of licenses includes both share-alike and non-commercial options. As such, not all CC licenses are open and CC licenses are not all mutually compatible. This is something of an irony, as it means that Creative Commons provides a set of licenses that don’t, in fact, result in a commons.

What’s the Problem? Why Does This Matter?

> What’s the problem with NC licenses, aren’t “SA” licenses a step away from open too? And if we debate this, don’t we just end up having a pointless license holy war?

The distinction between NC and SA licenses isn’t about “holy war” but something very practical: license compatibility and the integrity of the “open” commons. The core of a “commons” of data (or code) is that one piece of “open” material contained therein can be freely intermixed with other “open” material.

This interoperability is absolutely key to realizing the main practical benefits of “openness” which is the ease of use and reuse — which, in turn, mean more and better stuff getting created and used.

The Open Knowledge/Data Definition functions as a “standard” to ensure interoperability just in the same way as normal tech standards operate (but in this case for licenses rather than for a piece of hardware or software). The aim is to ensure that any license which complies with the definition will be interoperable with any other such license meaning that data or content under the one license can be combined with data or content under the other license.

Share-alike or attribution requirements are allowed within the definition precisely because they do not break this interoperability (and may even help promote the commons by ensuring material is “shared back”). Non-commercial provisions are not permitted because they fundamentally break the commons, not only through being incompatible with other licenses but because they overtly discriminate against particular types of users. (I should emphasize here that the definition is directly following the line set out in the original open source definition …)

Thus, there is a meaningful distinction between attribution and share-alike requirements and others such as non-commercial (NC) requirements, and it is a distinction that merits describing share-alike licenses as open and non-commercial licenses as not open.

Isn’t It Just About Degree?

> Yes, NC and especially ND are more restrictive, but stating that NC licenses aren’t open is wrong – they’re just not as open.

This is incorrect.

To reiterate: it is a mistake to view the set of licenses as some continuous spectrum of ‘openness’ with PD at one end and full rights reserved at the other — with the implication that all licenses in between are more or less open.

There are significant discontinuities, and in particular we can meaningfully partition the set of licenses into open and not-open based on (a) their interoperability and (b) the freedom they provide to all persons (and companies) to use, reuse and redistribute.
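As a rough illustration of that partition, here is a minimal sketch. It assumes licenses can be modelled as sets of Creative Commons style clause codes (BY, SA, NC, ND); the `is_open` helper and the sample license names are hypothetical and are not part of the Open Knowledge Definition itself. The rule it encodes is the summary statement of the definition: at most attribution and share-alike requirements are allowed.

```python
# Clauses the Open Knowledge Definition summary allows: attribution and share-alike.
PERMITTED_CLAUSES = frozenset({"BY", "SA"})


def is_open(clauses):
    """Return True if a license imposes only attribution and/or share-alike."""
    return set(clauses) <= PERMITTED_CLAUSES


# Hypothetical sample licenses, described only by their clause codes.
licenses = {
    "public-domain": set(),
    "attribution": {"BY"},
    "attribution-sharealike": {"BY", "SA"},
    "attribution-noncommercial": {"BY", "NC"},
    "attribution-noderivs": {"BY", "ND"},
}

# The result is a two-way partition, not a sliding scale of "more or less open".
open_licenses = sorted(name for name, c in licenses.items() if is_open(c))
not_open_licenses = sorted(name for name, c in licenses.items() if not is_open(c))

print("open:", open_licenses)
print("not open:", not_open_licenses)
```

The check is a yes/no test on the clause set, which mirrors the point that there is a discontinuity rather than a spectrum.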

But You Can’t Trademark Openness …

> it’s annoying that someone claims to be releasing data openly, but it turns out to be NC and no-compete and a bunch of other stuff. It would be nice to say to them – “you can’t claim to be open because you don’t meet this definition”. But unfortunately it would probably be difficult to get the trademark on the word “open”

It’s quite right that you can’t trademark openness, and no-one should want to! However, we can make an effort as a community to have a clear shared meaning for “open” in relation to data and content along the lines of the Open Knowledge Definition, just as the open source definition has done for code. By insisting on this meaning we are doing something valuable: creating a standard and maintaining interoperability.

by lisa

Russian translation of the Open Knowledge Definition (OKD)

April 27, 2010 in OKF, Open Definition, Open Knowledge Definition

We’ve just added a Russian translation of the Open Knowledge Definition thanks to Maxim Dubinin.

If you’d like to translate the Definition into another language, or if you’ve already done so, please get in touch on our discuss list, or on info at the OKF’s domain name (okfn dot org).

by lisa

Norwegian translation of the Open Knowledge Definition (OKD)

April 22, 2010 in OKF, OKF Projects, Open Definition, Open Knowledge Definition

We are pleased to now have a Norwegian translation of the Open Knowledge Definition thanks to Svein-Magnus Sørensen, Harald Groven and Olav Anders Øvrebø.

If you’d like to translate the Definition into another language, or if you’ve already done so, please get in touch on our discuss list, or on info at the OKF’s domain name (okfn dot org).

by jwalsh

A free software model for open knowledge

March 17, 2010 in CKAN, datapkg, Events, OKF, OKF Projects, Open Data Commons, Open Knowledge Definition, Talks

Notes describing the talk on the work of the Open Knowledge Foundation given last week at Jornadas SIG Libre.

OKF activity graph

I was happily surprised to be asked to give this open knowledge talk at an open source software conference. But it makes sense – the free software movement has created the conditions in which an open data movement is possible. There is lots to learn from open source process, in both a technical and organisational sense.

In English we have one word “free” where Spanish, like most languages, has two, gratis and libre, signifying separately “free of cost” and “freedom to”. The Open Source Initiative coined Open Source as a branding or marketing exercise to avoid the primary meaning “free of cost”. So whenever I say “open” I want you to hear the word “libre”. [Later I was told that libre can have meaning in at least 15 different ways.]

The best way to talk about the work of the Open Knowledge Foundation is to look at its projects, which form an open knowledge stack similar to the OSGeo software stack.

Open Definition

The Open Knowledge Definition is based on the OSI Open Source Software Definition (which OSGeo uses as a reference for acceptable software licenses). No restrictions on field of endeavour – non-commercial-use licenses are not open as in the OKD. An open data license will pass the cake test.

Open Data Commons

Open Data Commons is run by Jordan Hatcher, who started work on the Open Database License with support from Talis, followed by extensive negotiation with the OpenStreetMap community. ODbL is a ShareAlike license for data that obviates the problems of inapplicability of copyright to facts, and greediness of the ShareAlike clause when it comes to use of maps in PDFs, etc.

PDDL is a license that implements the Science Commons protocol for open access data, explicitly placing it in the public domain.

The Panton Principles are four precepts for publishers of scientific research data who wish that data to be freely reusable. Being openly able to inspect, critique and re-analyse data is critical to the effectiveness of scientific research.

Open Data Grid

The Open Data Grid is a project in early incubation, based on the Tahoe distributed filesystem. It’s in need of development effort on Tahoe to really get going. The aim is to provide secure storage for open datasets around the edges of infrastructure that people are already running.

People are handwaving about the Cloud, but storage and backup are not problems that it is really meant to solve. People make different claims about the Cloud – cheaper, greener, more efficient, more flexible. Can we get these things in other ways?

There is a saying, “never underestimate the bandwidth of a truck full of DAT tapes”

Comprehensive Knowledge Archive Network (CKAN)

CKAN is inspired by free software package repositories such as Perl’s CPAN, R’s CRAN and Python’s PyPI. It provides a wiki-like interface to create minimal metadata for packages with a versioned domain model and HTTP API.
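To give a feel for what reading package metadata over an HTTP API can look like, here is a small sketch. It assumes the `package_show` call and JSON response envelope of the Action API found in later CKAN releases; the API available at the time of this talk may well have differed. The host name, the package name and the sample response are all made up for illustration, and no network request is made.

```python
import json
from urllib.parse import urlencode


def package_show_url(base_url, package_id):
    """Build the URL for a package_show request (Action API layout assumed)."""
    query = urlencode({"id": package_id})
    return f"{base_url.rstrip('/')}/api/3/action/package_show?{query}"


def extract_metadata(response_text):
    """Pull a few minimal metadata fields out of a JSON response body."""
    payload = json.loads(response_text)
    if not payload.get("success"):
        raise ValueError("the catalogue reported a failed request")
    result = payload["result"]
    return {key: result.get(key) for key in ("name", "title", "license_id")}


# Hypothetical catalogue host and package name.
url = package_show_url("https://ckan.example.org/", "example-climate-data")

# Hand-written stand-in for a response body; not real catalogue data.
sample_response = json.dumps({
    "success": True,
    "result": {
        "name": "example-climate-data",
        "title": "Example climate data",
        "license_id": "odc-pddl",
        "notes": "Illustrative record only.",
    },
})

metadata = extract_metadata(sample_response)
print(url)
print(metadata)
```

In real use the response text would come from an HTTP GET on the generated URL rather than from a hand-written sample.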

CKAN supports groups, which can curate a package namespace – e.g. climate data – and assess priorities for turning into fully installable packages.

CKAN’s open source code is being used in the data package catalogue for the data.gov.uk project, part of the Making Public Data Public effort in the UK.

datapkg

The Debian of Data – datapkg takes Debian’s apt tool as inspiration for fully automatable install of data packages, with dependencies between them. This is currently in usable alpha stage with a Python implementation.
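The core idea of installing data packages with dependencies can be sketched as a depth-first walk that installs each dependency before the package that needs it. This is not datapkg’s actual code or interface; the `install_order` function and the package names below are hypothetical, chosen only to show the apt-style ordering.

```python
def install_order(target, dependencies):
    """Return packages in an order where every dependency precedes its dependant."""
    order = []
    in_progress = set()  # packages currently being resolved, used to detect cycles
    done = set()

    def visit(name):
        if name in done:
            return
        if name in in_progress:
            raise ValueError(f"dependency cycle involving {name!r}")
        in_progress.add(name)
        for dep in dependencies.get(name, []):
            visit(dep)
        in_progress.discard(name)
        done.add(name)
        order.append(name)

    visit(target)
    return order


# Hypothetical data packages: crime statistics need boundaries and population data.
dependencies = {
    "uk-crime-by-region": ["uk-region-boundaries", "uk-population"],
    "uk-population": ["uk-region-boundaries"],
    "uk-region-boundaries": [],
}

plan = install_order("uk-crime-by-region", dependencies)
print(plan)
```

A real installer would also need to fetch, verify and unpack each package; the sketch covers only the ordering step.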

Where Does My Money Go?

The next challenge really is to bring the concerns and the solutions to a mainstream public. Agustín Lobo spoke of “a personal consciousness but not an institutional consciousness” when it comes to open source and open data. Media coverage and exemplary government implementations help to create this kind of consciousness.

Pressure for increased open access is coming from academia – for the research data underlying papers, for the right to data mine and correlate different sources, for library data open for re-use. Pressure is also coming from within museums, libraries and archives – memory institutions who want to increase exposure to their collections with new technology, and recognise that open data, linked to a network of resources, will work for sustainability and not against it.

The next generation of researchers, who are kids in school now, will grow up with an expectation that code and data are naturally open. It will be interesting to see what they make!

Meanwhile OpenStreetMap is feeding several startups, and more commercial presence in the open data space will be of benefit. It illustrates that one does not have to be proprietary to be commercial.

Now higher-profile government projects opening data are helping to bring open data into the mainstream. To what extent is open a fashionable position, and to what extent is open reflected throughout the way of working?

Open process; early release, public sharing of bugs, public discussion of plans – everything in Nat Torkington’s post on Truly Open Data. The opportunity to fail in public, to learn from others’ problems, and self-interestedly collaborate.


I had a great time at SIG Libre 10. Oscar Fonts’ talk on OpenSearch Geospatial interfaces to popular services has me itching to add an OpenSearch +Geo interface to CKAN, as well as to work on getting the apparent version skew in the Geo extensions resolved amicably.

Genís Roca spoke thought-provokingly on Retorno y rentabilidad (there isn’t really an equivalent English word – “rentability” – less exploitative or focused than profitability). Rentability, especially for online services, can come in ways that sustain an organisation predictably, and don’t involve fishing in the pockets of ultimate end-users.

Ivan Sanchez showed areas of OpenStreetMap Spain with a stunning level of detail, trees and fences, MasterMap-quality coverage. I’m inspired to pick up JOSM and Merkaartor to add building-level detail from out of copyright 1:500 Edinburgh town plans at the National Library of Scotland’s map services.

Agustín Lobo talked about the distributed work and cross-institutional support and benefit of the R project, and the impact of open source on open access to data in science. He mentioned a Nature open peer review experiment that was discarded – am thinking it wasn’t curated enough. The talk helped me to connect the OKF’s work to the rest of the Jornadas.

The shiny slides were made with prezi.com, which many people asked for details of – they should show embedded in the page, I hope. I stupidly forgot to put URLs on the slides, which is partly why I have written this blog.

UK Government announces lots of new open data!

December 7, 2009 in CKAN, News, Open Data, Open Definition, Open Government Data, Open Knowledge Definition, Policy

Smarter Government seminar by Downing Street on Flickr

This morning UK Prime Minister Gordon Brown announced plans to open up lots more UK Government data! His speech describes plans to put much more detailed information online under open licenses in 2010.

This includes:

  • public services performance data – including on crime, hospitals and schools
  • new transport data
  • geospatial data from Ordnance Survey (as we recently blogged about)

We are very pleased that it looks like the new datasets will be:

  1. Released in raw form – as OKF Director Rufus Pollock first blogged about two years ago last month, and alluded to by Sir Tim Berners-Lee at TED.
  2. Released in a way which is compliant with the Open Knowledge Definition – i.e. free for anyone to use for any purpose, including commercial use.

We’re also very proud that the new data.gov.uk site, the official registry of UK Government open datasets, is powered by CKAN (as we announced a couple of months ago). If you’re interested in following the latest developments as they happen, please join the official mailing list.

The new Putting the Frontline First: Smarter Government initiative gives further detail on how the new data will be published. In particular, section 1.3. Radically opening up data and promoting transparency, gives a set of “public data principles”, which are as follows:

> ‘Public data’ are ‘government-held non-personal data that are collected or generated in the course of public service delivery’.

> Our public data principles state that:

> * Public data will be published in reusable, machine-readable form
> * Public data will be available and easy to find through a single easy to use online access point (http://www.data.gov.uk/)
> * Public data will be published using open standards and following the recommendations of the World Wide Web Consortium
> * Any ‘raw’ dataset will be represented in linked data form
> * More public data will be released under an open licence which enables free reuse, including commercial reuse
> * Data underlying the Government’s own websites will be published in reusable form for others to use
> * Personal, classified, commercially sensitive and third-party data will continue to be protected.

This is fantastic news – and we’ve highlighted key parts of the Prime Minister’s speech below:

> Information is the key. An informed citizen is a powerful citizen.

> [...] We are determined to be among the first governments in the world to open up public information in a way that is far more accessible to the general public.

> So I am grateful to Sir Tim Berners-Lee and Professor Nigel Shadbolt for leading a project to ‘make public data public’.

> This has enormous potential. Already more than 1,000 active users of the internet have registered their interest in working with government on this, and we have so far made around 1,100 datasets accessible to them.

> And there are many hundreds more that can be opened up – not only from central government but also from local councils, the NHS, police and education authorities.

> [...] In this way people will no longer be passive recipients of services but, through dialogue and engagement, active participants – shaping, controlling and determining what is best for them.

> And I can announce today that we will actively publish all public services performance data online during 2010 completing the process by 2011. Crime data, hospital costs and parts of the national pupil database will go on line in 2010. We will use this data to benchmark the best and the worst and drive better value for money.

> It will have a direct effect on how we allocate resources. We will introduce next year NHS tariffs based on best practice on the ground not average price. And we will be benchmarking the whole of the prison and probation system by 2011.

> And we will give our frontline services greater freedoms and flexibilities to respond innovatively to this data, reducing the number of ring fenced budgets, rationalising different central funding projects and joining-up capital funding within a local area.

> Releasing data can and must unleash the innovation and entrepreneurship at which Britain excels – one of the most powerful forces of change we can harness.

> When, for example, figures on London’s most dangerous roads for cyclists were published, an online map detailing where accidents happened was produced almost immediately to help cyclists avoid blackspots and reduce the numbers injured.

> And after data on dentists went live, an iphone application was created to show people where the nearest surgery was to their current location.

> And from April next year ordnance survey will open up information about administrative boundaries, postcode areas and mid-scale mapping.

> All of this will be available for free commercial re-use, enabling people for the first time to take the material and easily turn it into applications, like fix my street or the postcode paper.

> And I can further announce today that, again from next April, we will also release public transport data hitherto inaccessible or expensive and release significant underlying data for weather forecasts for free download and re-use.

We are currently working on a new project which will map open government data initiatives from around the world. We are also working on a guidance document for opening up government data, and starting a new working group on open government data to promote technical and legal standards, as well as to help document what open government data is out there. If you’re interested in any of this, we’d love to hear from you!

Open Knowledge Conference (OKCon) 2010: Call for Proposals

November 10, 2009 in Events, News, OKCon, OKF, Open Access, Open Data, Open Geodata, Open Government Data, Open Knowledge Definition, Open Science

The Open Knowledge Conference (OKCon) 2010 Call for Proposals is now open!

We would be grateful for help in circulating the call to relevant lists and communities! You can reuse or point to:

Open Knowledge Conference (OKCon) 2010: Call for Proposals

Introduction

OKCon, now in its fifth year, is the interdisciplinary conference that brings together individuals from across the open knowledge spectrum for a day of presentations and workshops.

Open knowledge promises significant social and economic benefits in a wide range of areas from governance to science, culture to technology. Opening up access to content and data can radically increase access and reuse, improving transparency, fostering innovation and increasing societal welfare.

This is a time of great change. In addition to high profile initiatives such as Wikipedia, OpenStreetMap and the Human Genome Project, there is enormous growth among open knowledge projects and communities at all levels. Moreover, in the last year, governments across the world have begun opening up huge amounts of their data.

And it doesn’t stop there. In academia, open access to both publications and data has been gathering momentum, and similar calls to open up learning materials have been heard in education. Furthermore this gathering flood of open data and content is the creator and driver of massive technological change. How can we make this data available, how can we connect it together, how can we use it to collaborate and share our work?

Join us to discuss all of this and more!

Topics

We welcome proposals on any aspect of creating, publishing or reusing content or data that is open in accordance with opendefinition.org. Topics include but are not limited to:

Technology

  • Semantic Web and Linked Data in relation to open knowledge
  • Platforms, methods and tools for creating, sharing and curating open knowledge
  • Light-weight, adaptive interaction models
  • Open, decentralized social network applications
  • Open geospatial data

Law, Society and Democracy

  • Open Licensing, Legal Tools and the Public Domain
  • Open government data and content (public sector information)
  • Open knowledge and international development
  • Opening up access to the law

Culture and Education

  • Open educational tools and resources
  • Business models for open content
  • Incentives and rewards for open-knowledge contributors
  • Open textbooks
  • Public domain digitisation initiatives

Science and Research

  • Opening up scientific data
  • Supporting scientific workflows with open knowledge models
  • Open models for scientific innovation, funding and publication (‘open-access’)
  • Tools for analysing and visualizing open data
  • Open knowledge in the humanities

Important Dates

  • Submission deadline: January 31st 2010
  • Notification of acceptance: March 1st
  • Camera-ready papers due: March 31st
  • OKCon: April 24th 2010

Submission Details

We are accepting three types of submissions:

  1. Full papers of 5-10 pages describing novel strategies, tools, services or best-practices related to open knowledge.
  2. Extended talk abstracts of 2-4 pages focusing on novel ideas, ongoing work and upcoming research challenges.
  3. Proposals for short talks and demonstrations.

OKCon will implement an open submission and reviewing process. To make a submission visit:

Depending on the assessment of the submissions by the programme committee and external reviewers, submissions will be accepted either as full, short or lightning/poster presentations.

Proceedings of OKCon will be published at CEUR-WS.org. If you want your submission to be included in the conference proceedings, you have to prepare a manuscript of your submission according to the LNCS Style.

Programme Committee

  • Sören Auer, AKSW/Universität Leipzig
  • Christopher Corbin, UK Advisory Board on Public Sector Information (APPSI)
  • Adnan Hadzi and Andrea Rota, Department of Media and Communications, Goldsmiths College, University of London
  • Claudia Müller-Birn, Carnegie Mellon University
  • Peter Murray-Rust, University of Cambridge
  • Rufus Pollock, Open Knowledge Foundation and Emmanuel College, University of Cambridge
  • John Wilbanks, Science Commons

Macedonian translation of the Open Knowledge Definition (OKD)

September 30, 2009 in News, Open Data, Open Knowledge Definition

We’ve just added a Macedonian translation of the Open Knowledge Definition thanks to Ljube Babunski.

If you’d like to translate the Definition into another language, or if you’ve already done so, please get in touch on our discuss list, or on info at the OKF’s domain name (okfn dot org).
