
Introducing “A free, libre and open Glossary”

Guest - September 3, 2013 in Open/Closed

The following guest post is by Chris Sakkas.

A few months ago, we ran into a problem at work. ‘Let’s open source this,’ my boss said, and then ran a conventional brainstorming session. I am constantly frustrated by people misusing terms like free, libre and open that have well-established definitions. I decided to spend an afternoon writing the first draft of a glossary that would explain in depth what these words mean and their relationship to one another. My hope is that if someone read the glossary from start to finish, they would never again confuse crowdsourcing with open source, or freeware with free software.

Here’s the summary:

  • A free/libre/open work is one that can be shared and adapted by any person for any purpose, without infringing copyright.
  • A crowdsourced work is one that was solicited from the community, rather than internally or by conventional contracting.
  • Freeware describes software that is free of cost to download.
  • Free software is free/libre/open, but might cost money to buy.

The glossary is a community collaboration, but I’ve also released it as an ODT and PDF in a fixed form. The advantage of this is that it is proofread, verified and able to be cited. However, it also continues as a living document that you are welcome to contribute to.

The research that I needed to do to write the glossary made me more sympathetic to those who blur or misuse the terms. While the big concepts – open knowledge, open source, free software, free cultural works – are clearly defined, they are not quite synonyms. What is free, libre and open has been filtered through the expectations of the drafters: the Open Knowledge Foundation’s Open Definition requires a work to be open access to qualify as open knowledge; the Definition of Free Cultural Works requires the work to be in a free format to qualify as a free cultural work. It’s thus possible to have a free cultural work that is not open knowledge, and vice versa; it’s also not unusual for a work to be under a free cultural licence/open knowledge licence but be neither a free cultural work nor open knowledge.

The community response to my first draft was interesting and useful. I experienced early and sustained criticism for hosting the file on Google Drive, which is neither free, libre nor open. I also learned first-hand the power of the carrot over the stick: I ignored people who simply criticised the use of Google Drive, but transferred the glossary to an Etherpad when someone suggested it.

If this sounds of interest to you, jump in and check it out!


Chris Sakkas is the admin of the FOSsil Bank wiki and the Living Libre blog and Twitter feed.

We Need an Open Database of Clinical Trials

Jonathan Gray - February 5, 2013 in Access to Information, Campaigning, Featured, Open Data, Open Science, Open/Closed, Policy

The award-winning science writer and physician Ben Goldacre recently launched a major campaign to open up the results of clinical trials.

The AllTrials initiative calls for all clinical trials to be reported and for the “full methods and the results” of each trial to be published.

Currently, negative results are poorly recorded and positive results are overhyped, leading to what Goldacre calls ‘research fraud’, misleading doctors about the drugs they are prescribing and misleading patients about the drugs they are taking.

The Open Knowledge Foundation is an organisational supporter of AllTrials, and we encourage you to sign and share the petition if you have not already done so:

There have been some big wins in the past 48 hours. The lead legislator for a new EU Clinical Trials Regulation recently came out in favour of transparency for clinical trials. Today GlaxoSmithKline announced their support for the campaign, which, as Goldacre says, is “huge, and internationally huge”.

As well as continuing to push for stronger policies and practices that support the release of information about clinical trials, we would like to see a public repository of reports and results that doctors, patients and researchers can access and add to. We need an open database of clinical trials.

Over the past few days we’ve been corresponding with Ben and others on the AllTrials team about how we might be able to work together to create such a database – building on the prototyping work that was presented at last year’s Strata event.

In the meantime, you can watch the TED talk if you haven’t seen it already – and help us to make some noise about the petition!

“Carbon dioxide data is not on the world’s dashboard” says Hans Rosling

Jonathan Gray - January 21, 2013 in Featured, Interviews, OKFest, Open Data, Open Government Data, Open/Closed, WG Sustainability, Working Groups

Professor Hans Rosling, co-founder and chairman of the Gapminder Foundation and Advisory Board Member at the Open Knowledge Foundation, received a standing ovation for his keynote at OKFestival in Helsinki in September in which he urged open data advocates to demand CO2 data from governments around the world.

Following on from this, the Open Knowledge Foundation’s Jonathan Gray interviewed Professor Rosling about CO2 data and his ideas about how better data-driven advocacy and reportage might help to mobilise citizens and pressure governments to act to avert catastrophic changes in the world’s climate.

Hello Professor Rosling!

Hi.

Thank you for taking the time to talk to us. Is it okay if we jump straight into it?

Yes! I’m just going to get myself a banana and some ginger cake.

Good idea.

Just so you know: if I sound strange, it’s because I’ve got this ginger cake.

A very sensible idea. So in your talk in Helsinki you said you’d like to see more CO2 data opened up. Can you say a bit more about this?

In order to get access to public statistics, first the microdata must be collected, then it must be compiled into useful indicators, and then these indicators must be published. The amount of coal one factory burnt during one year is microdata. The emission of carbon dioxide per year per person in one country is an indicator. Microdata and indicators are very very different numbers. CO2 emissions data is often compiled with great delays. The collection is based on already existing microdata from several sources, which civil servants compile and convert into carbon dioxide emissions.

Let’s compare this with calculating GDP per capita, which also requires an amazing amount of collection of microdata, which has to be compiled and converted and so on. That is done every quarter for each country. And it is swiftly published. It guides economic policy. It is like a speedometer. You know when you drive your car you have to check your speed all the time. The speed is shown on the dashboard.

Carbon dioxide is not on the dashboard at all. It’s like something you get with several years’ delay, when you are back from the trip. It seems that governments don’t want to get it swiftly. And when they publish it finally, they publish it as total emissions per country. They don’t want to show emissions per person, because then the rich countries stand out as worse polluters than China and India. So it is not just an issue about open data. We must push for change in the whole way in which emissions data is handled and compiled.

You also said that you’d like to see more data-driven advocacy and reportage. Can you tell us what kind of thing you are thinking of?

Basically everyone admits that the basic vision of the green movement is correct. Everyone agrees on that. By continuing to exploit natural resources for short term benefits you will cause a lot of harm. You have to understand the long-term impact. Businesses have to be regulated. Everyone agrees.

Now, how much should we regulate? Which risks are worse, climate or nuclear? How should we judge the bad effects of having nuclear electricity? The bad effects of coal production? These are difficult political judgments. I don’t want to interfere with these political judgments, but people should know the orders of magnitude involved, the changes, what is needed to avoid certain consequences. But that data is not even compiled fast enough, and the activists do not protest, because it seems they do not need data?

Let’s take one example. In Sweden we have data from the energy authority. They say: “energy produced from nuclear”. Then they include two outputs. One is the electricity that goes out into the lines and that lights the house that I’m sitting in. The other is the warm waste water that goes back into the sea. That is also energy, they say. It is actually like a fraud to pretend that that is energy production. Nobody gets any benefit from it. On the contrary, they are changing the ecology of the sea. But they get away with it because it is all counted as energy produced.

We need to be able to see the energy supply for human activity from each source and how it changes over time. The people who are now involved in producing solar and wind produce very nice reports on how production increases each year. Many get the impression that we have 10, 20, 30% of our energy from solar and wind. But even with fast growth from almost zero, solar and wind are still almost nothing. The news reports mostly neglect to explain the difference between the percentage growth of solar and wind energy and their share of total energy supply.

People who are too much into data and into handling data may not understand how the main misconceptions come about. Most people are so surprised when I show them total energy production in the world on one graph. They can’t see solar because it hasn’t reached one pixel yet.

So this isn’t of course just about having more data, but about having more data literate discussion and debate – ultimately about improving public understanding?

It’s like that basic rule in nutrition: Food that is not eaten has no nutritional value. Data which is not understood has no value.

It is interesting that you use the term data literacy. Actually I think it is presentation skills we are talking about. Because if you don’t adapt your way of presenting to the way that people understand it, then you won’t get it through. You must prepare the food in a way that makes people want to eat it. The dream that you will train the entire population to the level of about one semester of university statistics: that’s wrong. Statisticians often think that they will teach the public to understand data the way they do, but instead they should turn data into Donald Duck animations and make the story interesting. Otherwise you will never ever make it. Remember, you are fighting with Britney Spears and tabloid newspapers. My biggest success in life was December 2010 in the YouTube entertainment category in the United Kingdom. I had the most views that month. And I beat Lady Gaga with statistics.

Amazing.

Just the fact that the guy in the BBC in charge of uploading the trailer put me under ‘entertainment’ was a success. No-one thought of putting a trailer for a statistics documentary under entertainment.

That’s what we do at Gapminder. We try to present data in a way that makes people want to consume it. It’s a bit like being a chef in a restaurant. I don’t grow the crop. The statisticians are like the farmers that produce the food. Open data provides free access to potatoes, tomatoes and eggs and whatever it is. We are preparing it and making delicious food. If you really want people to read it, you have to make data as easy to consume as fish and chips. Do not expect people to become statistically literate! Turn data into understandable animations.

My impression is that some of the best applications of open data that we find are when we get access to data in a specific area, which is highly organized. One of my favorite applications in Sweden is a train timetable app. I can check all the commuter train departures from Stockholm to Uppsala, including the last change of platform and whether there is a delay. I can choose how to transfer quickly from the underground to the train to get home fastest. The government owns the rails and every train reports their arrival and departure continuously. This data is publicly available as open data. Then a designer made an app and made the data very easy for me to understand and use.

But to create an app which shows the determinants of unemployment in the different counties of Sweden? No-one can do that because that is a great analytical research task. You have to take data from very many different sources and make predictions. I saw a presentation about this yesterday at the Institute for Future Studies. The PowerPoint graphics were ugly, but the analysis was beautiful. In this case the researchers need a designer to make their findings understandable to the broad public, and together they could build an app that would predict unemployment month by month.

The CDIAC publish CO2 data for the atmosphere and the ocean, and they publish national and global emissions data. The UNFCCC publish national greenhouse gas inventories. What are the key datasets that you’d like to get hold of that are currently hard to get, and who currently holds these?

I have no coherent CO2 dataset for the world beyond 2008 at present. I want to have this data until last year, at least. I would also welcome half-year data, but I understand this can be difficult because carbon dioxide emissions vary with transport, heating or cooling of houses over the seasons of the year. So just give me the past year’s data in March. And in April/May for all countries in the world. Then we can hold governments accountable for what happens year by year.

Let me tell you a bit about what happens in Sweden. The National Natural Protection Agency gets the data from the Energy Department and from other public sources. Then they give these datasets to consultants at the University of Agriculture and the Meteorological Authority. Then the consultants work on these datasets for half a year. They compile them, the administrators look through them and they publish them in mid-December, when Swedes start to get obsessed about Christmas. So that means that there was a delay of eleven and a half months.

So I started to criticize that. My cutting line came when I was with the Minister of Environment and she was going to Durban. And I said “But you are going to Durban with eleven and a half months of constipation. What if all of this shit comes out on stage? That would be embarrassing wouldn’t it?”. Because I knew that in 2010 carbon dioxide emissions had increased by 10%. But she only published that after coming back from Durban. So that became a political issue on TV. And then the government promised to make it available earlier. So in 2012 we got CO2 data by mid-October, and in 2013 we’re going to get it in April.

Fantastic.

But actually ridiculing is the only way that worked. That’s how we liberated the World Bank’s data. I ridiculed the President of the World Bank at an international meeting. People were laughing. That became too much.

The governments in the rich countries don’t want the world to see emissions per capita. They want to publish emissions per country. This is very convenient for Germany, UK, not to mention Denmark and Norway. Then they can say the big emission countries are China and India. It is so stupid to look at total emissions per country. This allows small countries to emit as much as they want because they are just not big enough to matter. Norway hasn’t reduced their emissions for the last forty years. Instead they spend their aid money to help Brazil to replant rainforest. At the same time Brazil lends 200 times more money to the United States of America to help them consume more and emit more carbon dioxide into the atmosphere. Just to put these numbers up makes a very strong case. But I need to have timely carbon dioxide emission data. But not even climate activists ask for this. Perhaps it is because they are not really governing countries. The right wing politicians need data on economic growth, the left wing need data on unemployment but the greens don’t yet seem to need data in the same way.

As well as issues getting hold of data at a national level, are there international agencies that hold data that you can’t get hold of?

It is like a reflection. If you can’t get data from the countries for eleven and a half months, why the heck should the UN or the World Bank compile it faster? Think of your household. There are things you do daily, that you need swiftly. Breakfast for your kids. Then, you know, repainting the house. I didn’t do it last year, so why should I do it this year? The whole system just becomes slow. If politicians are not in a hurry to get data for their own country, they are not in a hurry to compare their data to other countries. They just do not want this data to be seen during their election period.

So really what you’re saying that you’d recommend is stronger political pressure through ridicule on different national agencies?

Yes. Or sit outside and protest. Do a Greenpeace action on them.

Can you think of datasets about carbon dioxide emissions which aren’t currently being collected, but which you think should be collected?

Yes. In a very cunning way China, South Africa and Russia like to be placed in the developing world and they don’t publish CO2 data very rapidly because they know it will be turned against them in international negotiations. They are not in a hurry. The Kyoto Protocol at least made it compulsory for the richest countries to report their data because they had committed to decrease their emissions. But every country should do this. Everyone should be able to know how much coal each country consumed, how much oil they consumed, and so on, and from that data have a calculation made of how much CO2 each country emitted last year.

It is strange that the best country to do this – and it is painful for a Swede to accept this – is the United States. CDIAC. Federal agencies in the US are very good on data and they take on the whole world. CDIAC make estimates for the rest of the world. Another US agency I really like is the National Snow and Ice Data Centre in Denver, Colorado. They give us 24-hour updates on the polar sea ice area. That’s really useful. They are also highly professional. In the US the data producers are far away from political manipulation. When you see the use of fossil fuels in the world there is only one distinct dip. That dip could be attributed to the best environmental politician ever. The dip in CO2 emissions took place in 2008. George W. Bush, Greenspan and the Lehman Brothers decreased CO2 emissions by inducing a financial crisis. It was the most significant reduction in the use of fossil fuels in modern history.

I say this to put things into proportion. So far it is only financial downturns that have had an effect on the emission of greenhouse gases. The whole of environmental policy hasn’t yet had any such dramatic effect. I checked this with Al Gore personally. I asked him “Can I make this joke? That Bush was better for the climate than you were?”. “Do that!”, he said, “You’re correct.” Once we show this data people can see that the economic downturn has so far had the most forceful effect on CO2 emissions.

If you could have all of the CO2 and climate data in the world, what would you do with it?

We’re going to make teaching materials for high schools and colleges. We will cover the main aspects of global change so that we produce a coherent data-driven worldview, which starts with population, and then covers money, energy, living standards, food, education, health, security, and a few other major aspects of human life. And for each dimension we will pick a few indicators. Instead of doing Gapminder World with the bubbles that can display hundreds of indicators we plan a few small apps where you get a selected few indicators but can drill down. Start with world, world regions, countries, subnational level, sometimes you split male and female, sometimes counties, sometimes you split income groups. And we’re trying to make this in a coherent graphic and color scheme, so that we really can convey an upgraded world view.

Very very simple and beautiful but with very few jokes. Just straightforward understanding. And for climate impact we will relate to the economy. To relate to the number of people at different economic levels, how much energy they use and then drill down into the type of energy they use and how that energy source mix affects the carbon dioxide emissions. And make trends forward. We will rely on the official and most credible trend forecast for population, one, two or more for energy and economic trends etc. But we will not go into what needs to be done. Or how should it be achieved. We will stay away from politics. We will stay away from all data which is under debate. Just use data with good consensus, so that we create a basic worldview. Users can then benefit from an upgraded world view when thinking and debating about the future. That’s our idea. If we provide the very basic worldview, others will create more precise data in each area, and break it down into details.

A group of people inspired by your talk in Helsinki are currently starting a working group dedicated to opening up and reusing CO2 data. What advice would you give them and what would you suggest that they focus on?

Put me in contact with them! We can just go for one indicator: carbon dioxide emission per person per year. Swift reporting. Just that.

Thank you very much Professor Rosling.

Thank you.

If you want help to liberate, analyse or communicate carbon emissions data in your country, you can join the OKFN’s Open Sustainability Working Group.

Did Gale Cengage just liberate all of their public domain content? Sadly not…

Jonathan Gray - January 9, 2013 in Featured, Free Culture, Legal, Open Access, Open/Closed, Public Domain, WG Public Domain

Earlier today we received a strange and intriguing press release from a certain ‘Marmaduke Robida’ claiming to be ‘Director for Public Domain Content’ at Gale Cengage’s UK premises in Andover. Said the press release:

Gale, part of Cengage Learning, is thrilled to announce that all its public domain content will be freely accessible on the open web. “On this Public Domain Day, we are proud to have taken such a ground-breaking decision. As a common good, the Public Domain content we have digitized has to be accessible to everyone” said Marmaduke Robida, Director for Public Domain Content, Gale.

Hundreds of thousands of digitized books coming from some of the world’s most prestigious libraries and belonging to top-rated products highly appreciated by the academic community such as “Nineteenth Century Collection Online”, “Eighteenth Century Collection Online”, “Sabin America”, “Making of the Modern World” and two major digitized historical newspaper collections (The Times and the Illustrated London news) are now accessible from a dedicated websit. The other Gale digital collections will be progressively added to this web site throughout 2013 so all Public Domain content will be freely accessible by 2014. All the images are or will be available under the Public Domain Mark 1.0 license and can be reused for any purpose.

Gale’s global strategy is inspired by the recommandations issued by the European reflection group “Comite des sages” and the Public Domain manifesto. For Public Domain content, Gale decided to move to a freemium business model : all the content is freely accessible through basic tools (Public Domain Downloader, URL lists, …), but additional services are charged for. “We are confident that there still is a market for our products. Our state-of-art research platforms offer high quality services and added value which universities or research libraries are ready to pay for” said Robida.

A specific campaign targeted to national and academic libraries for promoting the usage of Public Domain Mark for digitized content will be launched in 2013. “We are ready to help the libraries that have a digitization programme fulfill their initial mission : make the knowledge accessible to everyone. We also hope that our competitors will follow the same way in the near future. Public Domain should not be enclosed by paywalls or dubious licensing terms” said Robida.

The press release linked to a website which proudly proclaimed:

All Public Domain content to be freely available online. Gale Digital Collections has changed the nature of research forever by providing a wealth of rare, formerly inaccessible historical content from the world’s most prestigious libraries. In january 2013, Gale has taken a ground-breaking decision and chosen to offer this content to all the academic community, and beyond to mankind, to which it belongs

This was met with astonishment by members of our public domain discussion list, many of whom suspected that the news might well be too good to be true. The somewhat mysterious, yet ever-helpful Marmaduke attempted to allay these concerns on the list, commenting:

I acknowledge this decision might seem a bit disorientating. As you may know, Gale is already familiar to give access freely to some of its content [...], but for Public Domain content we have decided to move to the next degree by putting the content under the Public Domain Mark.

Several brave people had a go at testing out the so-called ‘Public Domain Downloader’ and said that it did indeed appear to provide access to digitised images of public domain texts – in spite of concerns in the Twittersphere that the software might well be malware (in case of any ambiguity, we certainly do not suggest that you try this at home!).

I quickly fired off an email to Cengage’s Director of Media and Public Relations to see if they had any comments. A few hours later a reply came back:

This is NOT an authorized Cengage Learning press release or website – our website appears to have been illegally cloned in violation of U.S. copyright and trademark laws. Our Legal department is in the process of trying to have the site taken down as a result. We saw that you made this information available via your listserv and realize that you may not have been aware of the validity of the site at the time, but ask that you now remove the post and/or alert the listserv subscribers to the fact that this is an illegal site and that any downloads would be in violation of copyright laws.

Sadly the reformed Gale Cengage – the Gale Cengage opposed to paywalls, restrictive licensing and clickwrap agreements on public domain material from public collections, the Gale Cengage supportive of the Public Domain Manifesto and dedicated to liberating public domain content for everyone to enjoy – was just a hoax, a phantasm. At least this imaginary, illicit doppelgänger Gale gives a fleeting glimpse of a parallel world in which one of the biggest gatekeepers turned into one of the biggest liberators overnight. One can only hope that Gale Cengage and their staff might – in the midst of their legal wrangling – be inspired by this uncanny vision of the good public domain stewards that they could one day become. If only for a moment.

Let’s defend Open Formats for Public Sector Information in Europe!

Regards Citoyens - December 3, 2012 in Access to Information, Campaigning, Open Data, Open Government Data, Open Standards, Open/Closed, Policy, WG EU Open Data, WG Open Government Data

Following some remarks from Richard Swetenham from the European Commission, we made a few changes relative to the trialogue process and the coming steps: the trialogue will start its meetings on 17th December and it is therefore already very useful to call on our governments to support Open Formats!

When we work on building all these amazing collaborative tools for democratic transparency all over the world, all of us, Open Data users and producers, struggle with the incredibly frustrating closed or unexploitable formats in which public data is unfortunately so often released: XLS, PDF, DOC, JPG, completely misformatted tables, and so on.

The EU PSI directive revision is a chance to push for a clear Open Formats definition!

As part of Neelie Kroes’s Digital Agenda, the European Commission recently proposed a revision of the Public Sector Information (PSI) Directive widening the scope of the existing directive to encourage public bodies to open up the data they produce as part of their own activities.

The revision will be discussed at the European Parliament (EP), and this is the citizen’s chance to advocate for a clear definition of the Open Formats under which public sector information (PSI) should be released.

We believe at Regards Citoyens that having a proper definition of Open Formats within the EU PSI directive revision would be a fantastic help to citizens and contribute to economic innovation. We believe such a definition can be summed up in two simple rules inspired by the Open Knowledge Foundation’s OpenDefinition principles:

  • being platform independent and machine-readable without any legal, financial or technical restriction;
  • being the result of an openly developed process in which all users can actually take part in the evolution of the specifications.

Those are the principles we advocated in a policy note on Open Formats, which we published last week and sent individually to all Members of the European Parliament (MEPs) from the committee voting on the revision of the PSI directive last Thursday.

Good news: the first rule was adopted! But the second one was not. How did that work?

ITRE vote on Nov 29th: what happened and how?

A meeting at the European Parliament (CC-BY-ND EPP Group)

The European parliamentary process first involves a main committee in charge of preparing the debates before the plenary session, in our case the Industry, Research and Energy committee (ITRE). Its members met on 29th November around 10am to vote on the PSI revision amongst other files.

MEPs can propose amendments to the revision beforehand, but, to speed up the process, the European Parliament works with what are called “compromise amendments” (CAs): the committee chooses a rapporteur to lead the file in its name and each political group appoints a “shadow rapporteur” to work together with the main rapporteur. They all study the proposed amendments together and try to sum them up in a few consensual ones called CAs, leading MEPs to withdraw some amendments when they consider their concerns met. During the committee meeting, both kinds of amendment are voted on in accordance with a predefined voting list indicating the rapporteur’s recommendations.

Regarding Open Formats, everything relied on a proposition to add to the directive‘s 2nd article a paragraph providing a clear definition of what an Open Format actually is. The rapporteurs’ work led to a pretty good compromise amendment 18, which speaks pretty much for itself:

« An open format is one that is platform independent, machine readable, and made available to the public without legal, technical or financial restrictions that would impede the re-use of that information. »

This amendment was adopted, meaning this change will be proposed as a new amendment to all MEPs during the plenary debate. Given that it has the support of the rapporteur in the name of the responsible committee, it stands a good chance of being carried.

Regarding the open development process condition, MEP Amelia Andersdotter, shadow rapporteur for the European Parliament’s Greens group, maintained her amendment 65 and adapted it to this new definition:

« "open format" means that the format’s specification is maintained by a not-for-profit organisation the membership of which is not contingent on membership fees; its ongoing development occurs on the basis of an open decision-making procedure available to all interested parties; the format specification document is available freely; the intellectual property of the standard is made irrevocably available on a royalty-free basis. »

Even though it was also recommended for approval by the main rapporteur, unfortunately the ALDE and EPP groups were not ready to support it yet and it was rejected.

Watching the 12 seconds during which the Open Formats issues were voted on is a strange experience for anyone not familiar with the European Parliament. Since most of the actual debate happens beforehand between the different rapporteurs, the committee meeting mainly consists of a succession of raised-hand vote calls, which are occasionally checked electronically. Therefore, there are no public individual votes or records of these discussions available, and the vote happens very quickly.

What next? Can we do anything?

Now that the ITRE committee has voted, its report should soon be made available online.

As the European institutions work as a tripartite organisation, the text adopted by the ITRE committee will now be transferred to both the European Commission and the Council for approval. This includes a trialogue procedure in which a consensus towards a common text must be reached. This is an occasion to call on our respective national governments to push in favor of Open Formats in order to maintain and improve the definition which the EP has already adopted.

The text which comes out of the tripartite debate will be discussed in plenary session, planned at the moment for 11th March 2013. Until noon on the Wednesday preceding the plenary, MEPs will still have the possibility to propose new amendments to be voted on at plenary: they can do so either as a whole political group, or as a group of at least 40 different MEPs from any groups.

Possible next steps to advocate Open Formats could therefore be the following:

  • Call on our national governments to push in favor of Open Formats;
  • Keep up-to-date with documents and procedures from the European Parliament: ParlTrack offers e-mail alerts on the dossier;
  • Whenever the window for proposing new amendments ahead of the plenary debate opens, we should contact our respective national MEPs from all political groups and urge them to propose amendments requiring Open Formats to be based on an open development process. Having multiple amendments coming from different political groups would certainly help MEPs realize this is not a partisan issue;
  • When the deadline for proposing amendments is reached, we should call on our MEPs by email, phone calls or Tweets to vote for such amendments and possibly against some opposed ones. In order to allow anyone to easily and freely phone their MEPs, we’re thinking about reusing La Quadrature du Net‘s excellent PiPhone tool for EU citizen advocacy.

In any case, contacting MEPs to raise concerns about Open Formats policies can of course always be useful, before and after the plenary debates. Policy papers, amendment proposals, explanatory documents, blogposts, open letters, a petition, tweets, … It can all help!

To conclude, we would like to stress once again that Regards Citoyens is an entirely voluntary organisation without much prior experience with the European Parliament. This means help and expertise are much appreciated! Let’s all get ready to defend Open Formats for European Open Data in a few weeks!

Regards Citoyens — CC-BY-SA

Following Money and Influence in the EU: The Open Interests Europe Hackathon

Liliana Bounegru - November 29, 2012 in Data Journalism, Events, Featured, Open Data, Open Government Data, Open/Closed, Sprint / Hackday

This blog post is cross-posted from the Data-driven Journalism Blog.

Making sense of massive datasets that document the processes of lobbying and public procurement at European Union level is not an easy task. Yet a group of 25 journalists, developers, graphic designers and activists worked together at the Open Interests Europe hackathon last weekend to create tools and maps that make it easier for citizens and journalists to see how lobbyists try to influence European policies and to understand how governments award contracts for public services. The hackathon was organised by the European Journalism Centre and the Open Knowledge Foundation with support from Knight-Mozilla OpenNews.

At the Google Campus Cafe in London, one group dived into European lobbying data made available via an API: api.lobbyfacts.eu. Created by a group of five NGOs – Corporate Europe Observatory, Friends of the Earth Europe, Lobby Control, Tactical Tech and the Open Knowledge Foundation – the API gives access to up-to-date, structured information about persons and organisations registered as lobbyists in the EU Transparency Register. The API is part of lobbyfacts.eu, a website that aims to make it easy for anyone to track lobbyists and their influence at European Union level, due to launch in January 2013.
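
For readers curious what querying such an API might look like in practice, here is a minimal sketch in Python. The endpoint path, parameters and field names below are assumptions for illustration only – consult api.lobbyfacts.eu for the actual interface.

```python
# Minimal sketch of querying the lobbyfacts.eu API.
# The endpoint path, parameters and field names are assumptions, not the documented interface.
import requests

BASE_URL = "http://api.lobbyfacts.eu"  # base URL mentioned in the article


def fetch_organisations(limit=10):
    # Assumed endpoint and query parameters, for illustration only.
    response = requests.get(f"{BASE_URL}/api/1/organisation",
                            params={"limit": limit}, timeout=30)
    response.raise_for_status()
    return response.json()


if __name__ == "__main__":
    data = fetch_organisations()
    # Print whichever identifying fields the register returns for each organisation.
    for org in data.get("result", []):
        print(org.get("name"), org.get("turnover"))
```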

One of the projects created with the lobby register data is a map showing the locations of the offices of lobby firms, where the size of each bubble corresponds to the firm’s turnover. Built by Friedrich Lindenberg, the map overlays this data on a Stamen Design base map using Leafletjs.

Screenshot of api.lobbyfacts.eu/map showing locations of lobbying firms across Europe
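
The original map was built directly in the browser with Leafletjs; as a rough Python-side illustration of the same idea, the sketch below uses folium, a Python wrapper around Leaflet, with placeholder records standing in for the real register data.

```python
# Rough illustration of a turnover-scaled bubble map using folium (a Python wrapper
# around Leaflet). The firm records below are placeholders, not real register data.
import folium

# Placeholder records: (name, latitude, longitude, turnover in EUR)
firms = [
    ("Example Lobby Firm A", 50.8467, 4.3525, 2_000_000),
    ("Example Lobby Firm B", 48.8566, 2.3522, 500_000),
]

m = folium.Map(location=[50.0, 10.0], zoom_start=4)
for name, lat, lon, turnover in firms:
    folium.CircleMarker(
        location=[lat, lon],
        radius=max(3, turnover / 200_000),  # scale the bubble by turnover
        popup=f"{name}: {turnover:,} EUR",
        fill=True,
    ).add_to(m)
m.save("lobby_firms_map.html")
```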

Other teams focused on data analysis, comparing the data from the EU Transparency Register with that of the Register of Expert Groups. Interesting leads for possible further investigative work resulted from the comparison of the figures reported by lobby firms in the Transparency Register with those collected by the National Bank of Belgium. “Some companies underreported massively to the National Bank of Belgium and some of them were making themselves look bigger in the Transparency Register,” said Eric Wesselius, leader of the lobby transparency challenge and co-founder of Corporate Europe Observatory. Wesselius’ organisation will continue investigations in this area.

A second group of journalists and graphic designers led by Jack Thurston, an activist involved in Fishsubsidy.org, discussed how fish subsidy data could be used for finding journalistic stories and explored various ways in which the unintended consequences of the EU fish subsidies programme, such as overfishing, could be compellingly presented to the general public.  

Sketch for interactive graphic showing fishing vessels, their trajectory and the subsidies they receive, made by graphic designer Helene Sears

A third group looked into European public procurement data. “Public procurement is an area that is underreported by journalists,” said data journalist Anders Pedersen, founder of OpenTED. “9-25% of the GDP in the EU is procurement – highest in the Netherlands where it is around 35%. It’s a real issue in times of austerity who provides our services,” he added.

Several scrapers were built to access the data relating to winners of contracts and the values of these contracts from the EU publication TED (Tenders Electronic Daily). A map of public procurement contracts by awarding city was created using Google Fusion Tables by geocoding the original CSV file, enriched with OpenStreetMap.

Screenshot of map of public procurement contracts by Benjamin Simatos and Martin Stabe
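
As a rough illustration of the geocoding step described above, the sketch below looks up awarding cities against OpenStreetMap’s Nominatim service in Python. The input file and column names are assumptions; the hackathon team’s actual pipeline used Google Fusion Tables.

```python
# Rough sketch: geocode awarding cities from a TED-style CSV using OpenStreetMap's
# Nominatim service. The file name and column names are assumptions for illustration.
import csv
import time

import requests

NOMINATIM_URL = "https://nominatim.openstreetmap.org/search"


def geocode(city):
    # Nominatim's public API expects a descriptive User-Agent and modest request rates.
    params = {"q": city, "format": "json", "limit": 1}
    headers = {"User-Agent": "procurement-map-sketch"}
    results = requests.get(NOMINATIM_URL, params=params, headers=headers, timeout=30).json()
    if results:
        return results[0]["lat"], results[0]["lon"]
    return "", ""


with open("ted_contracts.csv", newline="", encoding="utf-8") as infile, \
     open("ted_contracts_geocoded.csv", "w", newline="", encoding="utf-8") as outfile:
    reader = csv.DictReader(infile)
    writer = csv.DictWriter(outfile, fieldnames=reader.fieldnames + ["lat", "lon"])
    writer.writeheader()
    for row in reader:
        row["lat"], row["lon"] = geocode(row["awarding_city"])  # assumed column name
        writer.writerow(row)
        time.sleep(1)  # be polite to the public Nominatim service
```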

Pedersen’s long-term goal is to create an interface and an API for EU public procurement data and to publish some more visualisations. “A lot of the work that got done here [at the hackathon] we would not have gotten done in the next months maybe. It really helped us push far ahead in terms of ideas and in terms of getting stuff done.”

 

Photo of participants at the hackathon by Mehdi Guiraud.

US Doctor Data to be “Open Eventually”

Theodora Middleton - November 9, 2012 in Access to Information, Open Government Data, Open Science, Open/Closed, WG Open Licensing

Here’s an interesting project using slightly unorthodox means to get data out into the open: crowdfunding the purchase of US healthcare data for subsequent open release.

The company behind the project is NotOnly Dev, a Health IT software incubator who describe themselves as a “not-only-for-profit” company. Earlier this year they released a Doctor Social Graph, which displays the connections between doctors, hospitals and other healthcare organizations in the US. The dataset used is billed as “THE data set that any academic, scientist, or health policy junkie could ever want to conduct almost any study.”

They say:

Our goal is to empower the patient, make the system transparent and accountable, and release this data to the people who can use it to revitalize our health system.

They’d acquired the data for that project through a whole load of FOI requests which they gradually stitched together – but to take the project to the next level they wanted to buy the State Medical Board report data for every state in the US. This data usually includes information on the doctor’s medical school, information about board certification and information on disciplinary actions against the doctor. In combination with what they’ve already done through FOI requests, this data has massive potential.

Yesterday the crowdfunding drive ended, and the project has raised well over its target of $15,000. So now they can get hold of the data, clean it up and release it back into the community openly.

But not right away.

The company have developed the idea of an “Open Source Eventually” license. The data, once purchased, will remain exclusive for another six months. Thus the incentive for funders is that they will have exclusive access rights during that period, before the data goes open (under a CC BY-SA 3.0 license).

Here’s Fred Trotter talking about the idea:

NotOnly Dev reckon the “Open Source Eventually” license is “the perfect compromise” for securing the funds necessary to get expensive data out into the public. What do you think?

Reputation Factor in Economic Publishing

Daniel Scott - November 1, 2012 in External, Open Access, Open Economics, Open/Closed, WG Economics


“The big problem in economics is that it really matters in which journals you publish, so the reputation factor is a big hindrance in getting open access journals up and going”. Can the accepted norms of scholarly publishing be successfully challenged?

This quotation is a line from the correspondence about writing this blogpost for the OKFN. The invitation came to write for the Open Economics Working Group, hence the focus on economics, but in reality the same situation pertains across pretty much any scholarly discipline you can mention. From the funding bodies down through faculty departments and academic librarians to individual researchers, an enormous worldwide system of research measurement has grown up that conflates the quality of research output with the publications in which it appears. Journals that receive a Thomson ISI ranking and high impact factors are perceived as the holy grail and, as is being witnessed currently in the UK during the Research Excellence Framework (REF) process, these carry tremendous weight when it comes to research fund awards.


Earlier this year, I attended a meeting with a Head of School at a Russell Group university, in response to an email that I had sent with information about Social Sciences Directory, the ‘gold’ open access publication that I was then in the first weeks of setting up. Buoyed by their acceptance to meet, I was optimistic that there would be interest and support for the idea of breaking the shackles of existing ranked journals and their subscription paywall barriers. I believed then – and still believe now – that if one or two senior university administrators had the courage to say, “We don’t care about the rankings. We will support alternative publishing solutions as a matter of principle”, then it would create a snowball effect and expedite the break up of the current monopolistic, archaic system. However, I was rapidly disabused. The faculty in the meeting listened politely and then stated categorically that they would never consider publishing in a start up venture such as Social Sciences Directory because of the requirements of the REF. The gist of it was, “We know subscription journals are restrictive and expensive, but that is what is required and we are not going to rock the boat”.

I left feeling deflated, though not entirely surprised. I realised some time ago that the notion of profit & loss, or cost control, or budgetary management, was simply anathema to many academic administrators and that trying to present an alternative model as a good thing because it is a better deal for taxpayers is an argument that is likely to founder on the rocks of the requirements of the funding and ranking systems, if not apathy and intransigence. A few years ago, whilst working as a sales manager in subscription publishing, I attended a conference of business school deans and directors. (This in itself was unusual, as most conferences that I attended were for librarians – ALA, UKSG, IFLA and the like – as the ‘customer’ in a subscription sense is usually the university library). During a breakout session, a game of one-upmanship began between three deans, as they waxed lyrical about the overseas campuses they were opening, the international exchanges of staff and students they had fixed up, the new campus buildings that were under construction, and so on.

Eventually, I asked the fairly reasonable question whether these costly ventures were being undertaken with a strategic view that they would eventually recoup their costs and were designed to help make their schools self-funding. Or indeed, whether education and research are of such importance for the greater good of all that they should be viewed as investments. The discomfort was palpable. One of the deans even strongly denied that this is a question of money. That the deans of business schools should take this view was an eye-opening insight into the general academic attitude towards state funding. It is an attitude that is wrong because ultimately, of course, it is entirely about the money. The great irony was that this conversation took place in September 2008, with the collapse of Lehman Brothers and the full force of the Global Financial Crisis (GFC) soon to impact gravely on the global higher education and research sector. A system that for years had been awash with money had allowed all manner of poor practices to take effect, in which many different actors were complicit. Publishers had seized on the opportunity to expand output massively and charge vast fees for access; faculty had demanded that their libraries subscribe to key journals, regardless of cost; libraries and consortia had agreed to publishers’ demands because they had the money to do so; and the funding bodies had built journal metrics into the measurement for future financing. No wonder, then, that neither academia nor publishers could or would take the great leap forward that is required to bring about change, even after the GFC had made it patently clear that the ongoing subscription model is ultimately unsustainable. Change needs to be imposed, as the British government bravely did in July with the decision to adopt the recommendations of the Finch Report.

However, this brings us back to the central issue and the quotation in the title. For now, the funding mechanisms are the same and the requirement to publish in journals with a reputation is still paramount. Until now, arguments against open access publishing have tended to focus on quality issues. The argument goes that the premier (subscription) journals take the best submissions and then there is a cascade downwards through second tier journals (which may or may not be subscription-based) until you get to a pile of leftover papers that can only be published by the author paying a fee to some sort of piratical publisher. This does not stand up to much scrutiny. Plenty of subscription-based journals are average and have been churned out by publishers looking to beef up their portfolios and justify charging ever-larger sums. Good research gets unnecessarily dumped by leading journals because they adhere to review policies dating from the print age when limited pagination forced them to be highly selective. Other academics, as we have seen at Social Sciences Directory, have chosen to publish and review beyond the established means because they believe in finding and helping alternatives. My point is that good research exists outside the ‘top’ journals. It is just a question of finding it.

So, after all this, do I believe that the “big hindrance” of reputation can be overcome? Yes, but only through planning and mandate. Here is what I believe should happen:

  1. The sheer number of journals is overwhelming and, in actuality, at odds with modern user behaviour which generally accesses content online and uses a keyword search to find information. Who needs journals? What you want is a large collection of articles that are well indexed and easily searchable, and freely available. This will enable the threads of inter-disciplinary research to spread much more effectively. It will increase usage and reduce cost-per-download (increasingly the metrics that librarians use to measure the return on investment of journals and databases), whilst helping to increase citation and impact.
  2. Ensure quality control of peer review by setting guidelines and adhering to them.
  3. De-couple the link between publishing and tenure & department funding.
  4. In many cases, universities will have subscribed to a particular journal for years and will therefore have access to a substantial back catalogue. This has often been supplemented by the purchase of digitised archives, as publishers cottoned on to other sources of revenue which happened to chime with librarians’ preferences to complete online collections and take advantage of non-repeatable purchases. Many publishers also sell their content to aggregators, who agree to an embargo period so that the publisher can also sell the most up-to-date research directly. Although the axe has fallen on many print subscriptions, some departments and individuals still prefer having a copy on their shelves (even though they could print off a PDF from the web version and have the same thing, minus the cover). So, aside from libraries often paying more than once for the same content, they will have complete collections up to a given point in time. University administrators need to take the bold decision to change, to pick an end date as a ‘cut off’ after which they will publicly state that they are switching to new policies in support of OA. This will allow funds to be freed up and used to pay for institutional memberships, article processing fees, institutional repositories – whatever the choice may be. Editors, authors and reviewers will be encouraged to offer their services elsewhere, which will in turn rapidly build the reputation of new publications.

Scholarly publishing is being subjected to a classic confrontation between tradition and modernity. For me, it is inevitable that modernity will win out and that the norms will be successfully challenged.

This post is also available on the Open Economics blog. If you’re interested in the issues raised, join our Open Economics or our Open Access lists to discuss them further!

Making a Real Commons: Creative Commons should Drop the Non-Commercial and No-Derivatives Licenses

Rufus Pollock - October 4, 2012 in Featured, Free Culture, Open Content, Open Data, Open Definition, Open Standards, Open/Closed, WG Open Licensing

Students for Free Culture recently published two excellent pieces about why Creative Commons should drop their Non-Commercial and No-Derivatives license variants:

As the first post says:

Over the past several years, Creative Commons has increasingly recommended free culture licenses over non-free ones. Now that the drafting process for version 4.0 of their license set is in full gear, this is “a once-in-a-decade-or-more opportunity” to deprecate the proprietary NonCommercial and NoDerivatives clauses. This is the best chance we have to dramatically shift the direction of Creative Commons to be fully aligned with the definition of free cultural works by preventing the inheritance of these proprietary clauses in CC 4.0’s final release.

After reiterating some of the most common criticisms and objections against the NC and ND restrictions (if you are not familiar with these then they are worth reading up on), the post continues:

Most importantly, though, is that both clauses do not actually contribute to a shared commons. They oppose it.

This is a crucial point and one that I and others at the Open Knowledge Foundation have made time and time again. Simply: the Creative Commons licenses do not make a commons.

As I wrote on my personal blog last year:

Ironically, despite its name, Creative Commons, or more precisely its licenses, do not produce a commons. The CC licenses are not mutually compatible, for example, material with a CC Attribution-Sharealike (by-sa) license cannot be intermixed with material licensed with any of the CC NonCommercial licenses (e.g. Attribution-NonCommercial, Attribution-Sharealike-Noncommercial).

Given that a) the majority of CC licenses in use are ‘non-commercial’ and b) there is also large usage of ShareAlike (e.g. Wikipedia), this is an issue that affects a large set of ‘Creative Commons’ material.

Unfortunately, the presence of the word ‘Commons’ in CC’s name and the prominence of ‘remix’ in the advocacy around CC tend to make people think, falsely, that all CC licenses are in some way similar or substitutable.

The NC and ND licenses prevent CC licensed works forming a unified open digital commons that everyone is free to use, reuse and redistribute.

Perhaps if Creative Commons were instead called ‘Creative Choice’ and it were clearer that only a subset of the licenses (namely CC0, CC-BY and CC-BY-SA) contribute to the development of a genuine, unified, interoperable commons, then this would not be so problematic. But the fact that CC appears to promote such a commons (which in fact it does not) ultimately has a detrimental effect on the growth and development of the open digital commons.

As the Free Culture blog puts it:

Creative Commons could have moved towards being a highly-flexible modular licensing platform that enabled rightsholders to fine-tune the exact rights they wished to grant on their works, but there’s a reason that didn’t happen. We would be left with a plethora of incompatible puddles of culture. Copyright already gives rightsholders all of the power. Creative Commons tries to offer a few simple options not merely to make the lives of rightsholders easier, but to do so towards the ends of creating a commons.

Whilst Free Culture are focused on “content”, the situation is, if anything, more serious for data, where combination and reuse are central and therefore interoperability (and the resulting open commons) is especially important.

We therefore believe this is the time for Creative Commons to either retire the NC and ND license variants, or spin them off into a separate entity which does not purport to promote or advance a digital commons (e.g. ‘Creative Choice’).

Please consider joining us and Students for Free Culture in the call to Creative Commons to make the necessary changes:

New open source “publishing-house-in-a-box” makes it easier for scholars to publish open access monographs

Jonathan Gray - September 20, 2012 in Open Access, Open/Closed

Today the Public Knowledge Project (PKP) released a new piece of software called the Open Monograph Press. As it says in their press release:

OMP is an open source software platform for managing the editorial workflow required to see monographs, edited volumes, and scholarly editions through internal and external review, editing, cataloguing, production, and publication. OMP will operate, as well, as a press website with catalog, distribution, and sales capacities.

Why does this matter? Several years ago the PKP launched a project called Open Journal Systems, which helps users with the management, editing and publication of journals.

The project was founded in response to the rising costs associated with running and managing journals – which ranged from tens of thousands to millions of dollars, excluding peer review. Substantial amounts of these costs were administrative in nature – including software and system costs. One study suggested that the average cost of publishing an article, “excluding noncash peer review costs”, was around $3,800 (for more details see this paper).

Now the open source Open Journal Systems software helps thousands of scholars around the world (over 11,500 as of December 2011) to edit and publish journals themselves – dramatically reducing the cost of starting and maintaining a scholarly journal.

The UK government’s recently announced plans to open up publicly funded research have had a lukewarm response from some quarters, partly because millions of pounds of taxpayers’ money that would otherwise be spent on research has been earmarked to pay major publishers processing fees of £2,000 per article.

Open Journal Systems enables scholars to start and run academic journals by themselves – doing peer review, editorial, and publication as part of their academic roles and cutting administrative costs. This means that more research can be made freely available on the web, and potentially frees up cash that might otherwise have been spent on article publication fees to subsidise more research, or more academic jobs.

Hopefully the newly launched Open Monograph Press will have a similar impact in cutting costs associated with publishing scholarly monographs, and will encourage the publication of more open access monographs on the web.

If you’re interested in the OKFN’s open access activities, then you can follow our open access mailing list.
