Last week the Foundation’s legal expert, Jordan Hatcher, was at the iSemantic conference in Graz to give a session on open data licensing (especially for linked data). Here are the slides:

Last week I attended the Data-driven journalism event in Amsterdam (which we blogged about here), run by the European Journalism Centre (who interviewed me here).

My slides from the event are now up here:

Below are some lovely lofi graphical notes from Anna Lena Schiller:

It was a very well organised event and there were lots of interesting presentations and discussions. While many there were sold on the value of public bodies opening up datasets for others to use, there were more reservations about news organisations sharing datasets with each other and with the public. To address this, I’d like to start a brief document called:

  • Why should journalists and media organisations consider opening up their data?

The document would refer to existing success stories (such as the Guardian Datablog datasets, NYT Linked Data, …), compelling reasons, evidence, etc. and would appeal to enlightened self-interest. I’ve started some very preliminary notes at:

I hope this is something we will be able to discuss and add to at the data journalism event in Berlin later this week!

We’re delighted to announce a meetup on Data Journalism in Berlin in September organised by the Open Knowledge Foundation and Georgi Kobilarov at Uberblic Labs. Details are as follows:

  • When? 1st September 2010
  • Where? Fjord Office, Friedrichstrasse 210, Berlin
  • Register? You can register here!

Speakers will include:

  • Martin Belam, The Guardian
  • Jonathan Gray, The Open Knowledge Foundation
  • Christian Heise, ZEIT Online
  • Gerd Kamp, Deutsche Presse Agentur
  • Georgi Kobilarov, Uberblic Labs
  • John O’Donovan, BBC News
  • Tom Scott, BBC Earth
  • Ole Wintermann, Bertelsmann Foundation

From the blurb:

Data Journalism and the new and exciting possibilities that the Web of Data opens up for creators and consumers of news and media online will be the topic of this first meetup.

We have a brilliant lineup of speakers from media organisations like the BBC, The Guardian, the Deutsche Presse Agentur, the Bertelsmann Foundation coming to Berlin and talking about data journalism and the latest developments and projects in this field, and our friends from ZEIT Online will join the discussion.

The event takes place at the office of our friends at Fjord in the heart of Berlin. Starting at 2pm, you’ll hear talks followed by a panel discussion and an open space for working groups, and when the official programme ends at 7pm we’ll of course have drinks with all of you.

Language of all talks at the event will be English, but don’t be surprised to hear a bit of German here and there in conversations.

The Open Knowledge Foundation Working Group on EU Open Data is organising a session on linked data and open data at the ICT2010 event in Brussels later this year.

  • Where? T 003, Brussels Expo
  • When? 11:00-12:30 CET, 28th September 2010

From the blurb:

This networking session will discuss how public access to government data – crucial for an open and transparent society – can be improved.

This session has been proposed by IT professionals, scientists and government representatives organised – under the auspices of the Open Knowledge Foundation – as the Working Group on EU Open Data. It aims to establish a forum for networking and exchanging ideas with regard to publishing and linking governmental data, identifying technological developments and showcasing successful cases of linked governmental data. Developments in linked data could help further integrate information published by regional, national and European public administrations. The session is thematically relevant to a number of pillars within the Framework Programme as well as the Competitiveness and Innovation Programme.

Coordinator: Sören AUER (Universität Leipzig, AKSW, Institute for Computer Science, Germany)

The Telegraph asks Open Knowledge Foundation Director Rufus Pollock and Chris Taggart of OpenlyLocal which UK government datasets they’d like to see opened up next…

We are delighted to announce the launch of the Open Knowledge Foundation Germany, which took place yesterday at the Leipzig Semantic Web Day.

The OKF Germany chapter will be dedicated to promoting all forms of open knowledge in Germany — including open government data, open data in science, and the public domain.

Work is already underway to create a citizen-driven registry of open data (with a particular focus on government information), which can be seen at:

OKF Germany has attracted a distinguished board of academics, technologists, journalists, librarians, scientists and policy experts. These include:

In addition to the launch of the OKF chapter there were plenty of interesting discussions about the state of open knowledge in Germany. Some key highlights:

  • A keynote from Prof Rainer Kuhlen, talking about the “bigger picture” of information society, the right of access to information and the issues with restrictive copyright licensing and the artificial scarcity of “immaterial goods”.
  • A talk from Prof Jörn von Lucke, talking about the potential innovation that can be brought about by open government data and greater transparency and participation.
  • A talk from Jonathan Gray on the Open Knowledge Foundation, open government data in the UK, and the future of open data.
  • Matthias Spielkamp talking about the importance of free licenses for sustainable open data strategies.
  • Daniel Kinzler talking about structured data in the Wikipedia project, with a sneak preview of how Wikipedia will look in about one year (exciting!)
  • A panel discussion on the state of open government data in Germany with representatives from the public sector, the scientific community, the private sector and civil society. This discussion highlighted the obstacles and opportunities in German politics for helping government and the public sector to become more open and transparent, and the initiatives and prototypes that could be developed to underline the value of reusing public sector information.

This is fantastic news and we look forward to following the activities of the new chapter — and doing everything we can to support it! If you’re interested in getting involved, please say hello on the mailing list.

For further information see:

It’s April, and in the UK the sun has, at last, been sighted! To add to the cheer, the Open Knowledge Foundation’s 5th Open Knowledge Conference (OKCon) takes place in ten days’ time on Saturday 24th April in London.

Tickets for OKCon 2010 are selling rapidly, so those who’d like to ensure their place should register now:

http://www.okfn.org/okcon/register/

The event will see a whole host of individuals descend on London for a full day of sessions and workshops spanning the Open Knowledge spectrum including:

  • State of the Nation Keynotes
    • Matthias Schindler, Wikimedia (Germany) on Bibliographic Data and the Public Domain
    • Glyn Moody, on the Post-Analogue World
    • Peter Murray-Rust, on Recent Developments in Open Science
    • Chris Taggart, on Open Local Government Data
    • Sören Auer, on Linked Open Data
    • Jordan Hatcher, on Open Licensing for Data
  • Ideas and Culture with talks on analyzing Dickens Letters and Making the Physical from the Digital
  • Open Bibliographic Information with talks on the Itinerant Poetry Library and the Journal Commons
  • Community Driven Research with talks on Climate data and Open Archaeology
  • Civic Information with talks on Using Open Government Data to Profile Politicians and the Straight Choice
  • Open Government Data and PSI in the EU which looks at the current state of play in France, Norway, Germany, the UK and elsewhere
  • Tools with talks on Large-scale data handling and revisioning with the Genome, Ontowiki, CKAN and more
  • Open Data and the Semantic Web with talks about South Korean DBpedia and Thesaurus Management Tool ‘PoolParty’

We’re also delighted to have a wide variety of short and lightning talks:

http://wiki.okfn.org/okcon/2010/lightning

And we’ve still got space for more, so if you’re interested in giving a lightning talk, sign up on that wiki page.

Full Programme information for OKCon 2010 is available at:

http://www.okfn.org/okcon/programme

More information:

We look forward to seeing people in a sunny London in April and making OKCon 2010 an event to remember!

Notes describing the talk on the work of the Open Knowledge Foundation given last week at Jornadas SIG Libre.

OKF activity graph

I was happily surprised to be asked to give this open knowledge talk at an open source software conference. But it makes sense - the free software movement has created the conditions in which an open data movement is possible. There is lots to learn from open source process, in both a technical and organisational sense.

In English we have one word, “free”, where Spanish, like most languages, has two, gratis and libre, signifying separately “free of cost” and “freedom to”. The Open Source Initiative coined “Open Source” as a branding or marketing exercise to avoid the primary meaning “free of cost”. So whenever I say “open” I want you to hear the word “libre”. [Later I was told that libre can have meaning in at least 15 different ways.]

The best way to talk about the work of the Open Knowledge Foundation is to look at its projects, which form an open knowledge stack similar to the OSGeo software stack.

Open Definition

The Open Knowledge Definition is based on the OSI Open Source Software Definition (which OSGeo uses as a reference for acceptable software licenses). No restrictions on field of endeavour - non-commercial-use licenses are not open as in the OKD. An open data license will pass the cake test.

Open Data Commons

Open Data Commons is run by Jordan Hatcher, who started work on the Open Database License with support from Talis, followed by extensive negotiation with the OpenStreetMap community. ODbL is a ShareAlike license for data that obviates the problems of the inapplicability of copyright to facts, and of the greediness of the ShareAlike clause when it comes to use of maps in PDFs, etc.

PDDL is a license that implements the Science Commons protocol for open access data, explicitly placing it in the public domain.

The Panton Principles are four precepts for publishers of scientific research data who wish that data to be freely reusable. Being openly able to inspect, critique and re-analyse data is critical to the effectiveness of scientific research.

Open Data Grid

The Open Data Grid is a project in early incubation, based on the Tahoe distributed filesystem. It’s in need of development effort on Tahoe to really get going. The aim is to provide secure storage for open datasets around the edges of infrastructure that people are already running.

People are handwaving about the Cloud, but storage and backup are not problems that it is really meant to solve. People make different claims about the Cloud - cheaper, greener, more efficient, more flexible. Can we get these things in other ways?

There is a saying, “never underestimate the bandwidth of a truck full of DAT tapes”

Comprehensive Knowledge Archive Network (CKAN)

CKAN is inspired by free software package repositories: Perl’s CPAN, R’s CRAN, Python’s PyPI. It provides a wiki-like interface for creating minimal metadata for packages, with a versioned domain model and an HTTP API.
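To make the idea of minimal, versioned package metadata a little more concrete, here is a rough sketch in Python. It is not CKAN’s code or schema: the class names, the field names and the example package are all made up for illustration. The only point is that every edit is kept as a new revision, wiki-style, and the latest one can be serialised the way an HTTP API might return it.

```python
import json
from dataclasses import asdict, dataclass, field, replace


@dataclass(frozen=True)
class PackageMetadata:
    """Hypothetical minimal record; field names are illustrative, not CKAN's schema."""
    name: str
    title: str
    url: str
    license: str
    tags: tuple = ()


@dataclass
class VersionedRegistry:
    """Hypothetical registry that keeps every revision of every package."""
    revisions: dict = field(default_factory=dict)

    def save(self, record: PackageMetadata) -> int:
        """Store a new revision and return its 1-based revision number."""
        history = self.revisions.setdefault(record.name, [])
        history.append(record)
        return len(history)

    def latest(self, name: str) -> PackageMetadata:
        """Return the most recent revision of a package."""
        return self.revisions[name][-1]

    def at_revision(self, name: str, revision: int) -> PackageMetadata:
        """Return a package as it looked at a given 1-based revision."""
        return self.revisions[name][revision - 1]

    def to_json(self, name: str) -> str:
        """Serialise the latest revision as JSON text."""
        return json.dumps(asdict(self.latest(name)), sort_keys=True)


# Made-up example package: first saved without tags, then edited to add one.
registry = VersionedRegistry()
first = PackageMetadata(
    name="example-climate-data",
    title="Example climate data",
    url="http://example.org/climate.csv",
    license="odc-pddl",
)
registry.save(first)
registry.save(replace(first, tags=("climate",)))

print(registry.to_json("example-climate-data"))
```

Because old revisions are never overwritten, a mistaken edit can always be compared against, or rolled back to, an earlier version.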

CKAN supports groups, which can curate a package namespace - e.g. climate data - and assess priorities for turning into fully installable packages.

CKAN’s open source code is being used in the data package catalogue for the data.gov.uk project, part of the Making Public Data Public effort in the UK.

datapkg

The Debian of Data: datapkg takes Debian’s apt tool as inspiration for fully automatable installation of data packages, with dependencies between them. It is currently at a usable alpha stage, with a Python implementation.
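The apt-style idea of installing a data package together with everything it depends on can be sketched in a few lines of Python. This is not datapkg’s implementation: the package names and the `install_order` helper are hypothetical, and the sketch leans on the standard library’s `graphlib` module (Python 3.9 or later) to order packages so that dependencies always come first.

```python
from graphlib import TopologicalSorter

# Hypothetical data packages, each mapped to the packages it depends on.
DEPENDENCIES = {
    "uk-spending-by-region": {"uk-region-boundaries", "uk-spending-raw"},
    "uk-region-boundaries": set(),
    "uk-spending-raw": {"currency-codes"},
    "currency-codes": set(),
}


def install_order(target, dependencies):
    """Return the target and everything it needs, dependencies first."""
    # Walk the graph from the target to collect only the packages required.
    needed = {}
    stack = [target]
    while stack:
        name = stack.pop()
        if name not in needed:
            needed[name] = dependencies[name]
            stack.extend(dependencies[name])
    # TopologicalSorter yields each package after all of its dependencies,
    # and raises graphlib.CycleError if the dependencies are circular.
    return list(TopologicalSorter(needed).static_order())


order = install_order("uk-spending-by-region", DEPENDENCIES)
for package in order:
    print("would install", package)
```

A real tool would also need to fetch and verify each package; the ordering step is just the part that makes an install fully automatable.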

Where Does My Money Go?

The next challenge really is to bring the concerns and the solutions to a mainstream public. Agustín Lobo spoke of “a personal consciousness but not an institutional consciousness” when it comes to open source and open data. Media coverage and exemplary government implementations help to create this kind of consciousness.

Pressure for increased open access is coming from academia - for the research data underlying papers, for the right to data mine and correlate different sources, for library data open for re-use. Pressure is also coming from within museums, libraries and archives - memory institutions who want to increase exposure to their collections with new technology, and recognise that open data, linked to a network of resources, will work for sustainability and not against it.

The next generation of researchers, who are kids in school now, will grow up with an expectation that code and data are naturally open. It will be interesting to see what they make!

Meanwhile OpenStreetMap is feeding several startups, and more commercial presence in the open data space will be of benefit. It illustrates that one does not have to be proprietary to be commercial.

Now higher-profile government projects opening data are helping to bring it into the mainstream. To what extent is open a fashionable position, and to what extent is open reflected throughout the way of working?

Open process; early release, public sharing of bugs, public discussion of plans - everything in Nat Torkington’s post on Truly Open Data. The opportunity to fail in public, to learn from others’ problems, and self-interestedly collaborate.


I had a great time at SIG Libre 10. Oscar Fonts’ talk on OpenSearch Geospatial interfaces to popular services has me itching to add an OpenSearch +Geo interface to CKAN, as well as to work on getting the apparent version skew in the Geo extensions resolved amicably.

Genís Roca spoke thought-provokingly on Retorno y rentabilidad (there isn’t really an equivalent English word; “rentability” is less exploitative, or less narrowly focused, than profitability). Rentability, especially for online services, can come in ways that sustain an organisation predictably, and don’t involve fishing in the pockets of ultimate end-users.

Ivan Sanchez showed areas of OpenStreetMap Spain with a stunning level of detail: trees and fences, MasterMap-quality coverage. I’m inspired to pick up JOSM and Merkaartor to add building-level detail from the out-of-copyright 1:500 Edinburgh town plans at the National Library of Scotland’s map services.

Agustín Lobo talked about the distributed work and the cross-institutional support and benefit of the R project, and the impact of open source on open access to data in science. He mentioned a Nature open peer review experiment that was discarded; I suspect it wasn’t curated enough. The talk helped me to connect the OKF’s work to the rest of the Jornadas.

The shiny slides were made with prezi.com, which many people asked for details of; they should show embedded in this page, I hope. I stupidly forgot to put URLs on the slides, which is partly why I have written this blog post.

This Thursday (11th March) I’m speaking at the Forum Virium’s Open Up the City event in Helsinki.

This year their focus is on “open data, design, interfaces and innovation” and I’m speaking under the title “Open Data: What, Why, How?”. It looks like this will be a very interesting event and it’s also a chance to catch up with the very active open data people in Finland!

Update: I’ve now posted the slides from my talk Open (Public) Data: What, Why, How? (AKA: Open Data for a Read/Write City).

In creating the talk I also put together a logo to go with the slogan that Petri Kola and I came up with at the brainstorming session yesterday. It’s hereby dedicated to the Public Domain so please feel free to use and re-use:

Open Data for a Read/Write City

Update: ideas from a brainstorming session at a Forum Virium workshop on Wednesday:

  • Slogan: open data for the read/write city
    • Open data isn’t just about building better “read” services and governance. It is about building better “write” services and institutions which support real participatory activity by citizens. To give a very simple and concrete example: open access to public transport timetables is great, but I also want to be able to send back information (like the actual arrival time of that bus or train, how clean it was, whether the bus stop has moved …).
  • Top services to work on with open (local government) data: transport, spending, food, legislation/voting, education, health.

The IV Jornadas de SIG Libre is taking place this week, from the 10th to the 12th of March, in Girona, Spain. This is the premier Spanish F/OSS GIS event and OKFNer Jo Walsh will be speaking:

http://www.sigte.udg.edu/jornadassiglibre/keynotes