
Treasures from the Public Domain in New Essays Book

Adam Green - November 12, 2015 in Featured, Public Domain, Public Domain Review


Open Knowledge project The Public Domain Review is very proud to announce the launch of its second book of selected essays! For nearly five years now we’ve been diligently trawling the rich waters of the public domain, bringing to the surface all sorts of goodness from various openly licensed archives of historical material: from the Library of Congress to the Rijksmuseum, from Wikimedia Commons to the wonderful Internet Archive. We’ve also been showcasing, each fortnight, new writing on a selection of these public domain works, and this new book picks out our very best offerings from 2014.

All manner of oft-overlooked histories are explored in the book. We learn of the strange skeletal tableaux of Frederik Ruysch, pay a visit to Humphry Davy high on laughing gas, and peruse the pages of the first ever picture book for children (which includes the excellent table of Latin animal sounds pictured below). There are also fireworks in art, petty pirates on trial, brainwashing machines, truth-revealing diseases, synesthetic auras, Byronic vampires, and Charles Darwin’s photograph collection of asylum patients. Together the fifteen illustrated essays chart a wonderfully curious course through the last five hundred years of history — from sea serpents of the 16th-century deep to early-20th-century Ouija literature — taking us on a journey through some of the darker, stranger, and altogether more intriguing corners of the past.

Order by 18th November to benefit from a special reduced price and delivery in time for Christmas

If you want to get the book in time for Christmas (and we do think it’d make an excellent gift for that history-loving relative or friend!), then please make sure to order before midnight on Wednesday 18th November. Orders placed before this date will also benefit from a special reduced price!

Please visit the dedicated page on The Public Domain Review site to learn more and also buy the book!

Double-page spread (full bleed!), showing a magnificent 18th-century print of a fireworks display at The Hague – from our essay on how artists have responded to the challenge of depicting fireworks through the ages.

Join the School of Data team: Technical Trainer wanted

Open Knowledge - November 9, 2015 in Featured, Jobs, School of Data


The mission of Open Knowledge International is to open up all essential public interest information and see it utilized to create insight that drives change. To this end we work to create a global movement for open knowledge, supporting a network of leaders and local groups around the world; we facilitate coordination and knowledge sharing within the movement; we build collaboration with other change-making organisations both within our space and outside; and, finally, we prototype and provide a home for pioneering products.

A decade after its foundation, Open Knowledge International is ready for its next phase of development. We started as an organisation that led the push to open up existing datasets – and today most of the major open data portals run on CKAN, an open source software product that we first developed.

Today, it is not only about opening up data; it is about making sure that this data is usable, useful and – most importantly – used to improve people’s lives. Our current projects (School of Data, OpenSpending, OpenTrials, and many more) all aim to give people access to data, the knowledge to understand it, and the power to use it in their everyday lives.

The School of Data is growing in size and scope, and to support this project – alongside our partners – we are looking for an enthusiastic Technical Trainer (flexible location, part time).

School of Data is a network of data literacy practitioners, both organisations and individuals, implementing training and other data literacy activities in their respective countries and regions. Members of the School of Data work to empower civil society organizations (CSOs), journalists, governments and citizens with the skills they need to use data effectively in their efforts to create better, more equitable and more sustainable societies. Over the past four years, School of Data has succeeded in developing and sustaining a thriving and active network of data literacy practitioners in partnership with our implementing partners across Europe, Latin America, Asia and Africa.

Our local implementing partners are Social TIC, Code for Africa, Metamorphosis, and several Open Knowledge chapters around the world. Together, we have produced dozens of lessons and hands-on tutorials on how to work with data published online, benefitting thousands of people around the world. Over 4500 people have attended our tailored training events, and our network has mentored dozens of organisations to become tech-savvy and data-driven. Our methodologies and approach for delivering hands-on data training and data literacy skills – such as the data expedition – have now been replicated in various formats by organisations around the world.

One of our flagship initiatives, the School of Data Fellowship Programme, was first piloted in 2013 and has now successfully supported 26 fellows in 25 countries to provide long-term data support to CSOs in their communities. School of Data coordination team members are also regularly invited to provide local support to fellows’ projects and to organisations that want to become more data-savvy.

In order to give fellows a solid point of reference for content development and training resources, and to have a point person providing capacity-building support for our members and partners around the world, School of Data is now hiring an outstanding trainer/consultant who is familiar with every step of the Data Pipeline and with School of Data’s training methodology, to act as the go-to person for all things content and training across the School of Data network.


The hired professional will have three main objectives:

  • Technical Trainer & Data Wrangler: represent School of Data in training activities around the world, either supporting local members through our Training Dispatch or delivering the training themselves;
  • Data Pipeline & Training Consultant: give support for members and fellows regarding training (planning, agenda, content) and curriculum development using School of Data’s Data Pipeline;
  • Curriculum development: work closely with the Programme Manager & Coordination team to steer School of Data’s curriculum development, updating and refreshing our resources as novel techniques and tools arise.

Terms of Reference

  • Attend regular (weekly) planning calls with School of Data Coordination Team;
  • Work with current and future School of Data funders and partners on data-literacy related activities in an assortment of areas: Extractive Industries, Natural Disasters, Health, Transportation, Elections, etc.;
  • Be available to organise and run in-person data-literacy training events around the world, sometimes at short notice (agenda, content planning, identifying data sources, etc.);
  • Provide reports of training events and of support given to members and partners of the School of Data network;
  • Work closely with all School of Data Fellows around the world to aid them in their content development and training events planning & delivery;
  • Write for the School of Data blog about curriculum and training events;
  • Take ownership of the development of curriculum for School of Data and support training events of the School of Data network;
  • Work with Fellows and other School of Data Members to design and develop their skillshare curriculum;
  • Coordinate support for the Fellows when they do their trainings;
  • Mentor Fellows including monthly point person calls, providing feedback on blog posts and curriculum & general troubleshooting;
  • The position reports to School of Data’s Programme Manager and will work closely with other members of the project delivery team;
  • This part-time role is paid by the hour. You will be compensated with a market salary, in line with the parameters of a non-profit organisation;
  • We offer employment contracts to residents of the UK with valid permits, and service contracts to overseas residents.


Deliverables

  • A lightweight monthly report of performed activities with Fellows and members of the network;
  • A final narrative report at the end of the first period (6 months) summarising performed activities;
  • Map the current School of Data curriculum to diagnose potential areas of improvement and to update;
  • Plan and suggest a curriculum development & training delivery toolkit for Fellows and members of the network


Requirements

  • Be self-motivated and autonomous;
  • Fluency in written and spoken English (Spanish & French are a plus);
  • Reliable internet connection;
  • Outstanding presentation and communication skills;
  • Proven experience running and planning training events;
  • Proven experience developing curriculum around data-related topics;
  • Experience working remotely with workmates in multiple timezones is a plus;
  • Experience in project management;
  • Major in Journalism, Computer Science, or related field is a plus

We strive for diversity in our team and encourage applicants from the Global South and from minorities.


Duration

Six months to one year: from November 2015 (as soon as possible) to April 2016, with the possibility to extend until October 2016 and beyond, at 10–12 days per month (8 hours/day).

Application Process

Interested? Then send us a motivational letter and a one page CV via

Please indicate your current country of residence, as well as your salary expectations (in GBP) and your earliest availability.

Early application is encouraged, as we are looking to fill the position as soon as possible. The vacancy will close when we find a suitable candidate.

Interviews will be conducted on a rolling basis and may be requested on short notice.

If you have any questions, please direct them to jobs [at]

Your input needed: final review of 2015 Global Open Data Index

Mor Rubinstein - October 28, 2015 in Global Open Data Index, open knowledge

We’re now in the final stretch for the 2015 Global Open Data Index, and will be publishing the results in the very near future! Because the Index is a community-driven measurement tool, this year we have incorporated feedback we’ve received over the past several years to make it more useful as an instrument for civil society — particularly around what data should be measured and what attributes are important for each dataset.

As a crowdsourced survey, we have taken extra steps to ensure the measurement instrument is more reliable. We are aware that there is no perfect measurement that can be applied globally, but we aim to be as accurate as we possibly can. We have documented our processes year to year and inevitably not everything has been perfect, but by engaging in this process of experimentation and trial and error we hope the Global Open Data Index will continue to evolve as an innovative, grassroots, global tool for civil society to measure the state of open data.

The journey this year was long, but productive. Here is a recap of the steps we have taken in the long road to publishing the 2015 Index:

  1. Global consultation on new datasets — We sought your opinions and ideas for new themes that are important for civil society and should be added to the Index. As a result of this initiative, we have added four new datasets to this year’s Index: government procurement tenders, water quality, land ownership and weather forecast.
  2. Consultation on methodology — The Index team refined the definitions of the datasets based on feedback from open data advocates, researchers and communities from around the world. We have tightened the definitions of the datasets to allow for greater accuracy and comparability.
  3. Submissions phase — The crowdsourced phase where submissions are made to the Index with the help of the great Index community and our new local Index coordinators.
  4. Quality Assurance of the data — We added a preliminary QA stage this year to conduct a systematic review of the license and machine-readable questions — the two attributes that have given past submitters the most trouble.
  5. Thematic review with experts — This year, instead of having complete submissions reviewed by country or regional reviewers, we deployed expert thematic reviewers. Thematic reviewers assessed the submissions of all entries for a given dataset, making sure that we are comparing the right datasets to one another across all 120 places included in this year’s Index and that entries complied with the new definitions we created for each dataset.

Now, we are in the final phase of assessing the submissions for this year’s Index. After conducting a lengthy review phase, we seek your help to understand if we have evaluated the submissions correctly before finalizing the Index and publishing this year’s scores. In the next two weeks, from today until November 6, we will open the Index again to your comments. We encourage everyone to comment on the Index, civil society and governments alike.

Before you comment on a submission, note that we allowed thematic reviewers to apply their own logic to their review based on their expertise and assessment of the entire body of submissions across all places. This logic was grounded in the published definitions for each dataset, but allowed for some subjective flexibility in order to maintain a consistent review and account for the challenges faced by submitters, particularly in the cases of the datasets that were added this year and those with substantial changes to their definitions. Please read this section carefully before commenting on submissions. Note two things:

  1. After careful consideration, we’ve omitted two datasets from the final scoring of the 2015 Index — public transport and health performance. We omitted public transport because 45 countries (37% of the Index sample) do not have a national-level public transport system, which does not allow an equal comparison between places. We omitted health performance data because we asked for two different datasets but could only record one of them faithfully in the Index system, which made it almost impossible to score these entries as a unified submission. In both cases we will review the data and make it available for further investigation, and will see how we can make adjustments and incorporate these important datasets into future indexes.
  2. In some places, our reviewers could not complete their evaluation and needed more information. We would appreciate your help in providing more information on any of these submissions. Any entry that displays a number ‘1’ in an orange circle needs further attention.

Here is a summary of the reviewers’ approaches to evaluating submissions for each dataset included in the 2015 Index:

Government Budget

Reviewer: Mor Rubinstein

The stated description of the Government Budget dataset is as follows:

National government budget at a high level. This category is looking at budgets, or the planned government expenditure for the upcoming year, and not the actual expenditure. To satisfy this category, the following minimum criteria must be met:

  • Planned budget divided by government department and sub-department
  • Updated once a year

The budget should include descriptions regarding the different budget sections.

Submissions that included data for both department AND sub-department/program were accepted. Submissions that included only department-level data were not accepted. Additionally, budget speeches that did not include detailed data about the estimated expenditures for the coming year were not accepted. Only datasets from an official source (e.g. the Ministry of Finance or equivalent agency) were accepted.

Government Spending

Reviewer: Tryggvi Björgvinsson

The stated description of the Government Spending dataset is as follows:

Records of actual (past) national government spending at a detailed transactional level; a database of contracts awarded or similar will not be considered sufficient. This data category refers to detailed ongoing data on actual expenditure. Data submitted in this category should meet the following minimum criteria:

  • Individual records of transactions
  • Date of the transaction
  • Government office which had the transaction
  • Name of vendor
  • Amount of the transaction
  • Updated on a monthly basis

Submissions that included aggregate data or simply procurement contracts (results of calls for tenders) were not accepted. In cases where aggregate data or procurement data was submitted or the submitter claimed that the data did not exist, an attempt was made to locate transactional data with a simple Google search and/or via IBP’s Open Budget Survey. If data was available for the previous year (or applicable recent budget cycle) the submission was adjusted accordingly and accepted.

Election Results

Reviewer: Kamil Gregor

The stated description of the Election Results dataset is as follows:

This data category requires results by constituency / district for all major national electoral contests. To satisfy this category, the following minimum criteria must be met:

  • Results for all major electoral contests
  • Number of registered voters
  • Number of invalid votes
  • Number of spoiled ballots
  • All data should be reported at the level of the polling station

Submissions that did not show the data at polling station level were omitted and marked as ‘Data does not exist’, even if votes are not counted at polling station level as a matter of policy. The reason for this is that the polling station level is the most granular level, which allows monitoring for election fraud.

Company Register

Reviewer: Rebecca Sentance

The stated description of the Company Register dataset is as follows:

List of registered (limited liability) companies. Submissions in this data category do not need to include detailed financial data such as balance sheets, etc. To satisfy this category, the following minimum criteria must be met:

  • Name of company
  • Unique identifier of the company
  • Company address
  • Updated at least once a month

Data was marked as unsure whether it exists when the submitted dataset did not contain an address or a company ID. If the submission referenced a relevant government website that does not indicate the data exists, or if there is no evidence of which government body would hold the data, the submission was changed to ‘data does not exist’. If it is clear that a governmental body collects company data, but there is no way of knowing what it consists of, where it is held, or how to access it, and no indication that it would fulfil our requirements, the submission was also marked as ‘data does not exist’.

Based on the definition, it was decided that a company register that is freely searchable by the public but requires entering a search term (a search application) does not count as free or publicly accessible. However, a company register that can be browsed through page by page does present all of the data and is the type of dataset required for acceptance.

National Statistics

Reviewer: Zach Christensen

The stated description of the National Statistics dataset is as follows:

Key national statistics such as demographic and economic indicators (GDP, unemployment, population, etc). To satisfy this category, the following minimum criteria must be met:

  • GDP for the whole country, updated at least quarterly
  • Unemployment statistics, updated at least monthly
  • Population, updated at least once a year

For each submission, the reviewer checked for national accounts, unemployment, and population data as required by the description. It was found that most countries don’t have these data for the last year and very few had quarterly GDP figures or monthly unemployment figures. Submissions were only marked as ‘data does not exist’ if they did not have any national statistics more recent than 2010.


Legislation

Reviewer: Kamil Gregor

The stated description of the Legislation dataset is as follows:

This data category requires all national laws and statutes to be available online, although it is not a requirement that information on legislative behaviour (e.g. voting records) is available. To satisfy this category, the following minimum criteria must be met:

  • Content of the law / status
  • If applicable, all relevant amendments to the law
  • Date of last amendment
  • Data should be updated at least quarterly

Submissions were reviewed to ensure the data met the criteria. Regularity of updating was assessed based on the date of the most recently submitted data.

Pollutant Emissions

Reviewer: Yaron Michl

The stated description of the Pollutant Emissions dataset is as follows:

Aggregate data about the emission of air pollutants, especially those potentially harmful to human health (although it is not a requirement to include information on greenhouse gas emissions). Aggregate means national-level or available for at least three major cities. In order to satisfy the minimum requirements for this category, data must be available for the following pollutants and meet the following minimum criteria:

  • Particulate matter (PM) levels
  • Sulphur oxides (SOx)
  • Nitrogen oxides (NOx)
  • Volatile organic compounds (VOCs)
  • Carbon monoxide (CO)
  • Updated at least once a week
  • Measured either at a national level by regions or at least in 3 big cities

VOCs is a generic designation for many organic chemicals; when measuring VOCs it is therefore possible to measure any one of a number of compounds, such as benzene or MTBE. Because of this discrepancy, and because VOCs are rarely measured at a national level (see this link), measurements of volatile organic compounds were ultimately not considered part of the data requirements.

Carbon monoxide (CO) and nitrogen oxides (NOx) were also not considered a requirement because their main source is usually transportation.

In addition, some countries publish air pollution data using an Air Quality Index, a formula that translates air quality data into numbers and colours to help citizens understand when to take action to protect their health. Submissions that relied on an Air Quality Index were considered not to exist because the index is not raw data.
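
To make concrete why an Air Quality Index value is derived rather than raw data: such indices map a measured pollutant concentration onto an index band by linear interpolation. The sketch below is purely illustrative and uses 24-hour PM2.5 breakpoints following the US EPA convention; other countries define their own breakpoints, band names and colours, which is precisely why the underlying measurements, not the index, were required.

```python
# Illustrative only: 24-hour PM2.5 breakpoints (µg/m³) mapped to AQI bands,
# following the US EPA convention. Other countries use different breakpoints.
PM25_BREAKPOINTS = [
    (0.0, 12.0, 0, 50),        # Good
    (12.1, 35.4, 51, 100),     # Moderate
    (35.5, 55.4, 101, 150),    # Unhealthy for sensitive groups
    (55.5, 150.4, 151, 200),   # Unhealthy
    (150.5, 250.4, 201, 300),  # Very unhealthy
    (250.5, 350.4, 301, 400),  # Hazardous
    (350.5, 500.4, 401, 500),  # Hazardous
]

def pm25_aqi(concentration):
    """Map a raw PM2.5 concentration onto the index by linear
    interpolation within its breakpoint band."""
    for c_lo, c_hi, i_lo, i_hi in PM25_BREAKPOINTS:
        if c_lo <= concentration <= c_hi:
            return round((i_hi - i_lo) / (c_hi - c_lo) * (concentration - c_lo) + i_lo)
    raise ValueError("concentration is outside the index scale")

print(pm25_aqi(40.0))  # about 112: an index value, not the raw 40 µg/m³ reading
```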

Government Procurement Tenders

Reviewer: Georg Neumann

The stated description of the Government Procurement Tenders dataset is as follows:

All tenders and awards of the national/federal government, aggregated by office. Monitoring tenders can help new groups to participate in tenders and increase government compliance. Data submitted in this category must be aggregated by office, updated at least monthly and satisfy the following minimum criteria:

Tenders:
  • Tender name
  • Tender description
  • Tender status

Awards:
  • Award title
  • Award description
  • Value of the award
  • Supplier’s name

Quality of published information varied strongly and was not evaluated here. As long as the minimum information was available the data was said to exist for a given place.

Thresholds for the publication of this information vary strongly by country. For all EU countries, tenders above a specific amount, detailed here, need to be published. This allowed all EU submissions to qualify as publishing open procurement data, even though some countries, such as Germany, do not publish the award value for contracts below those thresholds, and others have closed systems for accessing specific information on contracts awarded.

In other countries not all sectors of government publish tenders and awards data. Submissions were evaluated to ensure that the main government tenders and contracts were made public, notwithstanding that data from certain ministries may have been missing.

Water Quality

Reviewer: Nisha Thompson

The stated description of the Water Quality dataset is as follows:

Data, measured at the water source, on the quality of water is essential for both the delivery of services and the prevention of diseases. In order to satisfy the minimum requirements for this category, data should be available on the levels of the following by water source and be updated at least weekly:

  • Fecal coliform
  • Arsenic
  • Fluoride levels
  • Nitrates
  • TDS (total dissolved solids)

If a country treats or distributes water, then there will be data on water quality, because all water treatment requires quality checks. Even though water quality is a local responsibility in most countries, very few countries have a completely decentralized system. Usually the central government has a monitoring role, through the Environmental Protection Agency, the Ministry of the Environment or the Ministry of Public Health. If there is a monitoring role, the data does exist; if monitoring is completely decentralized, as in the UK, the submission was marked as ‘does not exist’ because the data is not aggregated. If data was not available daily or weekly it wasn’t considered timely.

In some cases, all the parameters were accounted for except TDS. Even though TDS is a standard parameter, some countries only collect conductivity, which can be used to estimate TDS. In such cases, the submission was approved as is.
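
As a rough, purely illustrative note on that conversion (not part of the Index methodology): TDS is commonly estimated from electrical conductivity with an empirical factor, typically somewhere around 0.5–0.7 depending on the water’s ionic composition. The factor of 0.65 below is an assumption.

```python
def tds_from_conductivity(conductivity_us_per_cm, factor=0.65):
    """Estimate total dissolved solids (mg/L) from electrical
    conductivity (µS/cm) using an assumed empirical factor."""
    return conductivity_us_per_cm * factor

# A reading of 500 µS/cm corresponds to roughly 325 mg/L TDS
# with the assumed factor of 0.65.
print(tds_from_conductivity(500))
```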

Land Ownership

Reviewer: Codrina Maria Ilie

The stated description of the Land Ownership dataset is as follows:

Cadastre showing land ownership data on a map, including all metadata on the land. Cadastre data submitted in this category must include the following characteristics:

  • Land borders
  • Land owner’s name
  • Land size
  • National level
  • Updated yearly

For various reasons, the land owner’s name attribute was widely unmet and, as such, lack of this data was not considered a factor in evaluating these submissions. This dataset depends on well-kept historical records (not always the case), on legislation (which can fluctuate), on very expensive activities that a government must carry out in order to keep the data up to date, and on the complexity of the data itself (sometimes the data that makes up a national cadastre is registered in different registries or systems), so a first-year indexing exercise must not be considered exhaustive.


Weather

Reviewers: Neal Bastek & Stephen Gates

The stated description of the Weather dataset is as follows:

5-day forecast of temperature, precipitation and wind, as well as recorded data for temperature, wind and precipitation for the past year. In order to satisfy the minimum requirements for this category, data submitted should meet the following criteria:

  • 5-day forecast of temperature, updated daily
  • 5-day forecast of wind, updated daily
  • 5-day forecast of precipitation, updated daily
  • Historical temperature data for the past year

Based on a general assessment of the submissions, the minimum threshold for claiming the data existed was set at forecast data for today plus two days (three days in total), with a qualitative allowance made for arid regions substituting humidity data for precipitation data. The threshold for inclusion could also be met with four-day forecasts that include temperature and precipitation data, and/or a generic statement using text or descriptive icons about conditions (e.g. windy, stormy, partly cloudy, sunny, fair, etc.).


Location

Reviewer: Codrina Maria Ilie

The stated description of the Location dataset is as follows:

A database of postcodes/zipcodes and the corresponding spatial locations in terms of a latitude and a longitude (or similar coordinates in an openly published national coordinate system). If a postcode/zipcode system does not exist in the country, please submit a dataset of administrative borders. Data submitted in this category must satisfy the following minimum conditions:

Zipcodes:
  • Address
  • Coordinates (latitude, longitude)
  • National level
  • Updated once a year

Administrative boundaries:
  • Border polygons
  • Name of polygon (city, neighbourhood)
  • National level
  • Updated once a year

In cases in which a country has not adopted a postcode system, the location dataset is considered to be administrative boundaries. The Universal Postal Union’s Postal Addressing System was used to identify the structure of a postcode for a given place. This tool proved particularly useful in identifying countries that do not use a postcode system.

In situations where countries only had a postcode search service, whether by postcode or address, the data was said not to exist. If the postcodes were not geocoded, submissions did not meet the Index requirements, owing to the difficulty of geocoding such a dataset. On the other hand, if the postcode system mapped onto the smallest administrative boundary and that boundary was officially available, the geocoded postcodes could be obtained easily enough that the data was marked as ‘does exist’ for that submission.

National Map

Reviewer: Gil Zaretzer

The stated description of the National Map dataset is as follows:

This data category requires a high-level national map. To satisfy this category, the following minimum criteria must be met:

  • Scale of 1:250,000 (1 cm = 2.5 km)
  • Markings of national roads
  • National borders
  • Markings of streams, rivers, lakes and mountains
  • Updated at least once a year

Only submissions from an official source, with original data, were considered. A link to Google Maps, which was often provided, does not satisfy the criteria for these submissions.

In cases where no link was provided in the submission, entries were marked as “unsure” if there was any indication that the data exists but is not available online, e.g. a national mapping service without a website.

Event Guide, 2015 Open Data Index

nealbastek - September 6, 2015 in Global Open Data Index, Open Data, Open Data Census, Open Data Index, open knowledge, WG Open Government Data

Getting together at a public event can be a fun way to contribute to the 2015 Global Open Data Index. It can also be a great way to engage and organize people locally around open data. Here are some guidelines and tips for hosting an event in support of the 2015 Index and getting the most out of it.

Hosting an event around the Global Open Data Index is an excellent opportunity to spread the word about open data in your community and country, not to mention a chance to make a contribution to this year’s Index. Ideally, your event would focus broadly on open data themes, possibly even identifying the status of all 15 key datasets and completing the survey. Set a reasonable goal for yourself based on the audience you think you can attract. You may choose not to make a submission at your event at all, and just discuss the state of open data in your country; that’s fine too.

It may make sense to host an event focused around one or more of the datasets. For instance, if you can organize people around government spending issues, host a party focused on the budget, spending, and procurement tender datasets. If you can organize people around environmental issues, focus on the pollutant emissions and water quality datasets. Choose whichever path you wish, but it’s good to establish a focused agenda, a clear set of goals and outcomes for any event you plan.

We believe the datasets included in the survey represent a solid baseline of open data for any nation and any citizenry; you should be prepared to make this case to the participants at your events. You don’t have to be an expert yourself, or even have topical experts on hand to discuss or contribute to the survey. Any group of interested and motivated citizens can contribute to a successful event. Meet people where they are, and help them understand why this work is important in your community and country. It will set a good tone for your event by helping participants realize they are part of a global effort and that the outcomes of their work will be a valuable national asset.

Ahmed Maawy, who hosted an event in Kenya around the 2014 Index, sums up the value of the Index with these key points that you can use to set the stage for your event:

  • It defines a benchmark to assess how healthy and helpful our open datasets are.
  • It allows us to make comparisons between different countries.
  • It allows us to assess what countries are doing right and what countries are doing wrong, and to learn from each other.
  • It provides a standard framework that helps us identify what we need to do, or even how to implement or make use of open data in our countries, and to identify what we are strong at and what we are weak at.

What to do at an Open Data Index event

It’s great to start your event with an open discussion so you can gauge the experience in the room and how much time you should spend educating and discussing introductory materials. You might not even get around to making a contribution, and that’s ok. Introducing the Index in any way will put your group on the right path.

If you’re hosting an event with mostly newcomers, it’s always a good idea to look to the Open Definition and the Open Data Handbook for inspiration and basic information.

  • If your group is more experienced, everything you need to contribute to the survey can be found in this year’s Index contribution tutorial.
  • If you’re actively contributing at an event, we recommend splitting into teams, assigning one or more datasets to each group and having them use the tutorial as a guide. There can only be one submission per dataset, so be sure not to have teams working on the same one.
  • Pair more experienced people with less experienced people so teams can better rely on themselves to answer questions and solve problems.

More practical tips can be found at the 2015 Open Data Index Event Guide.

Photo credits: Ahmed Maawy

Global Open Data Index 2015 is open for submissions

Mor Rubinstein - August 25, 2015 in Featured, Global Open Data Index, open knowledge

The Global Open Data Index measures and benchmarks the openness of government data around the world, and then presents this information in a way that is easy to understand and easy to use. Each year the open data community and Open Knowledge produce an annual ranking of countries, peer reviewed by our network of local open data experts. The Index was launched in 2012 as a tool to track the state of open data around the world: more and more governments were beginning to set up open data portals and make commitments to release open government data, and we wanted to know whether those commitments were really translating into the release of actual data.

The Index focuses on 15 key datasets that are essential for transparency and accountability (such as election results and government spending data), and those vital for providing critical services to citizens (such as maps and water quality). Today, we are pleased to announce that we are collecting submissions for the 2015 Index!

The Global Open Data Index tracks whether this data is actually released in a way that is accessible to citizens, media and civil society, and is unique in that it crowdsources its survey results from the global open data community. Crowdsourcing this data provides a tool for communities around the world to learn more about the open data available in their respective countries, and ensures that the results reflect the experience of civil society in finding open information, rather than accepting government claims of openness. Furthermore, the Global Open Data Index is not only a benchmarking tool, it also plays a foundational role in sustaining the open government data community around the world. If, for example, the government of a country does publish a dataset, but this is not clear to the public and it cannot be found through a simple search, then the data can easily be overlooked. Governments and open data practitioners can review the Index results to locate the data, see how accessible the data appears to citizens, and, in the case that improvements are necessary, advocate for making the data truly open.



Methodology and Dataset Updates

After four years of leading this global civil society assessment of the state of open data around the world, we have learned a few things and have updated both the datasets we are evaluating and the methodology of the Index itself to reflect these learnings! One of the major changes has been to run a massive consultation of the open data community to determine the datasets that we should be tracking. As a result of this consultation, we have added five datasets to the 2015 Index. This year, in addition to the ten datasets we evaluated last year, we will also be evaluating the release of water quality data, procurement data, health performance data, weather data and land ownership data. If you are interested in learning more about the consultation and its results, you can read more on our blog!

How can I contribute?

2015 Index contributions open today! We have done our best to make contributing to the Index as easy as possible. Check out the contribution tutorial in English and Spanish, ask questions in the discussion forum, reach out on twitter (#GODI15) or speak to one of our 10 regional community leads! There are countless ways to get help so please do not hesitate to ask! We would love for you to be involved. Follow #GODI15 on Twitter for more updates.

Important Dates

The Index team is hitting the road! We will be talking to people about the Index at the African Open Data Conference in Tanzania next week and will also be running Index sessions at both AbreLATAM and ConDatos in two weeks! Mor and Katelyn will be on the ground so please feel free to reach out!

Contributions will be open from August 25th, 2015 through September 20th, 2015. After the 20th of September we will begin the arduous peer review process! If you are interested in getting involved in the review, please do not hesitate to contact us. Finally, we will be launching the final version of the 2015 Global Open Data Index Ranking at the OGP Summit in Mexico in late October! This will be your opportunity to talk to us about the results and what that means in terms of the national action plans and commitments that governments are making! We are looking forward to a lively discussion!

The 2015 Global Open Data Index is around the corner – these are the new datasets we are adding to it!

Mor Rubinstein - August 20, 2015 in Global Open Data Index

After two months, 82 ideas for datasets, 386 voters, thirteen civil society organisation consultations and very active discussions on the Index forum, we have finally arrived at a consensus on which datasets will be included in the 2015 Global Open Data Index (GODI).

This year, as part of our objective to ensure that the Global Open Data index is more than a simple measurement tool, we started a discussion with the open data community and our partners in civil society to help us determine which datasets are of high social and democratic value and should be assessed in the 2015 Index. We believe that by making the choice of datasets a collaborative decision, we will be able to raise awareness of and start a conversation around the datasets required for the Index to truly become a civil society audit of the open data revolution. The process included a global survey, a civil society consultation and a forum discussion (read more in a previous blog post about the process).

The community had some wonderful suggestions, making deciding on fifteen datasets no easy task. To narrow down the selection, we started by eliminating the datasets that were not suitable for global analysis. For example, some datasets are collected at the city level and therefore cannot easily be compared at a national level. Secondly, we looked to see if there was a global standard that would allow us to easily compare between countries (such as UN requirements for countries, etc.). Finally, we tried to find a balance between financial datasets, environmental datasets, geographical datasets and datasets pertaining to the quality of public services. We consulted with experts from different fields and refined our definitions before finally choosing the following datasets:

  1. Government procurement data (past and present tenders) – This dataset is crucial for monitoring government contracts, be it to expose corruption or to ensure the efficient use of public funds. Furthermore, when combined with budget and spending data, contracting data helps to provide a full and coherent picture of public finance. We will be looking at both tenders and awards.
  2. Water quality – Water is life and it belongs to all of us. Since it is such an important and basic building block of society, having access to data on drinking water may assist us not only in monitoring safe drinking water but also in helping to provide it everywhere.
  3. Weather forecast – Weather forecast data is not only one of the most commonly used datasets in mobile and web applications, it is also of fundamental importance for agriculture and disaster relief. Having both weather predictions and historical weather data helps not only to improve quality of life, but also to monitor climate change. As such, through the Index, we will measure whether governments openly publish both data on the 5-day forecast and historical figures.
  4. Land ownership – Land ownership data can help citizens understand urban planning and development as well as assisting in legal disputes over land. In order to assess this category, we are using national cadastres, maps showing the land registry.
  5. Health performance data – While this was one of the most popular datasets requested during the consultation, it was challenging to define what would be the best dataset(s) to assess health performance (see the forum discussion). We decided to use this category as an opportunity to test ideas about what to evaluate. After numerous discussions and debates, we decided that this year we would use the following as proxy indicators of health performance:
      Location of public hospitals and clinics.
      Data on infectious disease rates in a country.
    That being said, we are actively seeking and would greatly appreciate your feedback! Please use the country level comment section to suggest any other datasets that you encounter that might also be a good measure of health performance (for example, from number of beds to budgets). This feedback will help us to learn and define this data category even better for next year’s Index.

2015 Global Open Data Index



In addition to the new datasets, we refined the definitions of some of the existing datasets, using our new dataset definition guidelines. These were written both to produce a more accurate measurement and to create more clarity about what we are looking for with each dataset. The guidelines suggest at least 3 key data characteristics for each dataset, define how often each dataset needs to be updated in order to be considered timely, and suggest the level of aggregation acceptable for each dataset. The following datasets were changed in order to meet the guidelines:

Election results – Data should be reported at the polling station level so as to allow civil society to better monitor election results and uncover false reporting. In addition, we added indicators such as number of registered voters, number of invalid votes and number of spoiled ballots.

National map – In addition to the scale of 1:250,000, we added features such as markings of national roads, national borders, and markings of streams, rivers, lakes and mountains.

Pollutant emissions – We defined the specific pollutants that should be included in the datasets.

National Statistics – GDP, unemployment and populations have been selected as the indicators that must be reported.

Public Transport – We refined the definition so that it examines only national-level services (as opposed to city-level ones). We are also not looking for real-time data, but for timetables.

Location datasets (previously Postcodes) – Postcode data is incredibly valuable for all kinds of business and civic activity; however, 60 countries in the world do not have a postcode system and as such, this dataset has been problematic in the past. For these countries, we have suggested examining a different dataset, administrative boundaries. While it is not as specific as postcodes, administrative boundaries can help to enrich different datasets and create better geographical analysis.

Adding datasets and changing definitions has been part of ongoing iterations and improvements that we have done to the Index this year. While it has been a challenge, we are hoping that these improvements help to create a more fair and accurate assessment of open data progress globally. Your feedback plays an essential role in shaping and improving the Index going forward, please do share it with us.

The full descriptions of this year’s datasets can be found here.

Beauty behind the scenes

Tryggvi Björgvinsson - August 5, 2015 in CKAN, OKF Sweden, Open Data, open knowledge

Good things can often go unnoticed, especially if they’re not immediately visible. Last month the government of Sweden, through Vinnova, released a revamped version of its open data portal, Öppnadata.se. The portal still runs on CKAN, the open data management system. It even has the same visual feel, but the principles behind the portal are completely different. The main idea behind the new version of Öppnadata.se is automation. Open Knowledge teamed up with the Swedish company Metasolutions to build and deliver an automated open data portal.

Responsive design

In modern web development, one aspect of website automation called responsive design has become very popular. With this technique the website automatically adjusts its presentation depending on the screen size; that is, it knows how best to present the content on different screen sizes. Öppnadata.se got a slight facelift in terms of tweaks to its appearance, but the big news on that front is that it now has a responsive design. The portal looks different if you access it on a mobile phone than if you visit it on a desktop, but the content is still the same.

These changes were contributed to CKAN. They are now a part of the CKAN core web application as of version 2.3. This means everyone can now have responsive data portals as long as they use a recent version of CKAN.

Screenshots: the new Öppnadata.se (top) and the old Öppnadata.se (bottom).

Data catalogs

Perhaps the biggest innovation in Öppnadata.se is how the automation process works for adding new datasets to the catalog. Normally with CKAN, data publishers log in and create or update their datasets on the CKAN site. CKAN has for a long time also supported something called harvesting, where a CKAN instance goes out, fetches new datasets and makes them available. That’s a form of automation, but it depends on specific software being used, or on special harvesters for each source. So harvesting from one CKAN instance to another is simple. Harvesting from a specific geospatial data source is simple. Automatically harvesting from something you don’t know about and that doesn’t exist yet is hard.

That’s the reality Öppnadata.se faces. Only a minority of public organisations and municipalities in Sweden publish open data at the moment, so most public entities have not yet decided which software or solution they will use to publish open data.

To tackle this problem, Öppnadata.se relies on an open standard from the World Wide Web Consortium called DCAT (Data Catalog Vocabulary). The standard describes how to publish a list of datasets, and it allows Swedish public bodies to pick whatever solution they like to publish datasets, as long as one of its outputs conforms to DCAT.

Öppnadata.se actually uses a DCAT application profile which was specially created for Sweden by Metasolutions and defines in more detail what to expect, for example that Öppnadata.se expects to find dataset classifications according to the Eurovoc classification system.

Thanks to this effort, significant improvements have been made to CKAN’s support for RDF and DCAT. They include application profiles (like the Swedish one) for harvesting and for exposing DCAT metadata in different formats. A CKAN instance can now automatically harvest datasets from a range of DCAT sources, which is exactly what Öppnadata.se does. The CKAN support also makes it easy for Swedish public bodies who use CKAN to automatically expose their datasets correctly, so that they can be automatically harvested by Öppnadata.se. For more information have a look at the CKAN DCAT extension documentation.
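
For readers curious what DCAT harvesting looks like at its very simplest, here is a minimal sketch, not the ckanext-dcat code itself, that fetches a remote DCAT catalogue with the rdflib library and lists the datasets it describes. The catalogue URL is a placeholder.

```python
# Minimal sketch of reading a DCAT catalogue with rdflib (assumed installed).
# The URL below is a placeholder for any endpoint serving DCAT as RDF.
import rdflib
from rdflib.namespace import Namespace, RDF

DCAT = Namespace("http://www.w3.org/ns/dcat#")
DCT = Namespace("http://purl.org/dc/terms/")

def list_datasets(catalog_url):
    """Fetch a DCAT catalogue and yield (dataset URI, title) pairs."""
    graph = rdflib.Graph()
    graph.parse(catalog_url)  # rdflib works out the RDF serialisation
    for dataset in graph.subjects(RDF.type, DCAT.Dataset):
        yield str(dataset), str(graph.value(dataset, DCT.title))

if __name__ == "__main__":
    for uri, title in list_datasets("https://example.org/dcat/catalog.rdf"):
        print(title, "->", uri)
```

A real harvester such as the CKAN DCAT extension of course does much more than this, for example mapping each dataset’s metadata onto CKAN fields, honouring application profiles and updating existing records on subsequent runs.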

Dead or alive

The Web is decentralised and always changing. A link to a webpage that worked yesterday might not work today because the page was moved. When automatically adding external links, for example links to resources for a dataset, you run the risk of adding links to resources that no longer exist.

To counter that, Öppnadata.se uses a CKAN extension called Dead or alive. It may not be the best name, but that’s what it does: it checks whether a link is dead or alive. The checking itself is performed by an external service called deadoralive. The extension just serves the set of links for the external service to check. In this way dead links are automatically marked as broken, and the system administrators of Öppnadata.se can find problematic public bodies and notify them that they need to update their DCAT catalog (this is not automatic because nobody likes spam).
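
For the curious, the core of such a check is simple. Here is a minimal sketch of a link checker in Python using the requests library; it illustrates the idea only and is not the actual deadoralive service.

```python
# Minimal sketch of a dead-link check using the requests library
# (assumed installed); not the deadoralive service itself.
import requests

def is_alive(url, timeout=10):
    """Return True if the URL answers with a non-error status code."""
    try:
        response = requests.head(url, allow_redirects=True, timeout=timeout)
        if response.status_code >= 400:
            # Some servers reject HEAD requests, so fall back to a GET.
            response = requests.get(url, stream=True, timeout=timeout)
        return response.status_code < 400
    except requests.RequestException:
        return False

# Hypothetical resource links for illustration.
for link in ["https://example.org/data.csv", "https://example.org/missing"]:
    print(link, "is alive" if is_alive(link) else "is dead")
```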

These are only the automation highlights of the new Öppnadata.se. Other changes were made that have little to do with automation but are still not immediately visible, so a lot of the portal’s beauty happens behind the scenes. That’s also the case for other open data portals. You might just visit your open data portal to get some open data, without realising the amount of effort and coordination it takes to get that data to you.

Image of Swedish flag by Allie_Caulfield on Flickr (cc-by)

This post has been republished from the CKAN blog.

Just Released: “Where Does Europe’s Money Go? A Guide to EU Budget Data Sources”

Jonathan Gray - July 2, 2015 in Data Journalism, Featured, open knowledge, Open Spending, Policy, Research, Where Does My Money Go

The EU has committed to spending €959,988 million between 2014 and 2020. This money is disbursed through over 80 funds and programmes that are managed by over 100 different authorities. Where does this money come from? How is it allocated? And how is it spent?

Today we are delighted to announce the release of “Where Does Europe’s Money Go? A Guide to EU Budget Data Sources”, which aims to help civil society groups, journalists and others to navigate the vast landscape of documents and datasets in order to “follow the money” in the EU. The guide also suggests steps that institutions should take in order to enable greater democratic oversight of EU public finances. It was undertaken by Open Knowledge with support from the Adessium Foundation.

Where Does Europe's Money Go?

As we have seen from projects like Farm Subsidy and journalistic collaborations around the EU Structural Funds it can be very difficult and time-consuming to put together all of the different pieces needed to understand flows of EU money.

Groups of journalists on these projects have spent many months requesting, scraping, cleaning and assembling data to get an overview of just a handful of the many different funds and programmes through which EU money is spent. The analysis of this data has led to many dozens of news stories, and in some cases even criminal investigations.

Better data, documentation, advocacy and journalism around EU public money is vital to addressing the “democratic deficit” in EU fiscal policy. To this end, we make the following recommendations to EU institutions and civil society organisations:

  1. Establish a single central point of reference for data and documents about EU revenue, budgeting and expenditure, and ensure all the information there is kept up to date. At the same time, ensure all EU budget data are available from the EU open data portal as open data.
  2. Create an open dataset with key details about each EU fund, including name of the fund, heading, policy, type of management, implementing authorities, link to information on beneficiaries, link to legal basis in Eur-Lex and link to regulation in Eur-Lex.
  3. Extend the Financial Transparency System to all EU funds by integrating or federating detailed expenditure data from Member States, non-EU members and international organisations. Data on beneficiaries should include, where relevant, a unique European company identifier and, when the project is co-financed, the exact amount of EU funding received and the total amount of the project.
  4. Clarify and harmonise the legal framework regarding transparency rules for the beneficiaries of EU funds.
  5. Support and strengthen funding for civil society groups and journalists working on EU public finances.
  6. Conduct a more detailed assessment of beneficiary data availability for all EU funds and for all implementing authorities – e.g., through a dedicated “open data audit”.
  7. Build a stronger central base of evidence about the uses and users of EU fiscal data – including data projects, investigative journalism projects and data users in the media and civil society.

Our intention is that the material in this report will become a living resource that we can continue to expand and update. If you have any comments or suggestions, we’d love to hear from you.

If you are interested in learning more about Open Knowledge’s other initiatives around open data and financial transparency you can explore the Where Does My Money Go? project, the OpenSpending project, read our other previous guides and reports or join the Follow the Money network.

Where Does Europe’s Money Go - A Guide to EU Budget Data Sources

Become a Friend of The Public Domain Review

Adam Green - June 25, 2015 in Featured, Featured Project, Free Culture, Open GLAM, open knowledge, Public Domain, Public Domain Review

Open Knowledge project The Public Domain Review launches a major new fundraising drive, encouraging people to become Friends of the site by giving an annual donation.

For those not yet in the know, The Public Domain Review is a project dedicated to protecting and celebrating, in all its richness and variety, the cultural public domain. In particular, our focus is on the digital copies of public domain works, the mission being to facilitate the appreciation, use and growth of a digital cultural commons which is open for everyone.

We create collections of openly licensed works comprising highlights from a variety of galleries, libraries, archives, and museums, many of which also contribute to our popular Curator’s Choice series (including The British Library, Rijksmuseum, and The Getty). We also host a fortnightly essay series in which top academics and authors write about interesting and unusual public domain works which are available online.

Founded in 2011, the site has gone from strength to strength. In its four-plus years it has seen contributions from the likes of Jack Zipes, Frank Delaney, and Julian Barnes – and garnered praise from such media luminaries as The Paris Review, which called us “one of their favourite journals”, and The Guardian, which hailed us as a “model of digital curation”.

This is all very exciting but we need your help to continue the project into the future.

We are currently only bringing in around half of the base minimum required – the amount we need in order to tick along in a healthy manner. (And around a third of our ideal goal, which would allow us to pay contributors). So it is of urgent importance that we increase our donations if we want the project to continue.

Hence the launch of a brand new fundraising model through which we hope to make The Public Domain Review sustainable and able to continue into the future: introducing “Friends of The Public Domain Review”.

Image 1: one of the eight postcards included in the inaugural postcard set. The theme is “Flight” and the set will be sent out to all Friends donating $30/£20/€27.50 or more before 8th July.

What is it?

This new model revolves around building a group of loyal PDR (Public Domain Review) supporters – the “Friends” – each of whom makes an annual donation to the project. This club of patrons will form the beating heart of the site, creating a bedrock of support vital to the project’s survival.

How can one become a Friend?

There is no fixed yearly cost to become a Friend – any annual donation will qualify you – but there is a guide price of $60 a year (£40/€55).

Are there any perks of being a Friend?

Yes! Any donation above $30 will make you eligible to receive our exclusive twice-a-year “postcard set” – 8 beautiful postcards curated around a theme, with a textual insert. Friends will also be honoured in a special section of the site and on a dedicated page in all PDR Press publications. They will also get first refusal in all future limited edition PDR Press creations, and receive a special end of year letter from the Editor.

How do I make my donation?

We’ve worked hard to make it as easy as possible to donate. You no longer have to use PayPal, but can instead donate using your credit or debit card directly on the PDR site.

For more info, and to make your donation, visit:

Become a Friend before 8th July to receive the inaugural postcard set upon the theme of “Flight”

Image 2: one of the eight postcards included in the inaugural postcard set. The theme is “Flight” and the set will be sent out to all Friends donating $30/£20/€27.50 or more before 8th July.

What should we include in the Global Open Data Index? From reference data to civil society audit.

Mor Rubinstein - June 18, 2015 in Global Open Data Index

Three years ago we decided to begin to systematically track the state of open data around the world. We wanted to know which countries were the strongest and which national governments were lagging behind in releasing the key datasets as open data so that we could better understand the gaps and work with our global community to advocate for these to be addressed.

In order to do this, we created the Global Open Data Index, which was a global civil society collaboration to map the state of open data in countries around the world. The result was more than just a benchmark. Governments started to use the Index as a reference to inform their priorities on open data. Civil society actors began to use it as a tool to teach newcomers about open data and as advocacy mechanism to encourage governments to improve their performance in releasing key datasets.

Three years on we want the Global Open Data Index to become much more than a measurement tool. We would like it to become a civil society audit of the data revolution. As a tool driven by campaigners, researchers and advocacy organisations, it can help us, as a movement, determine the topics and issues we want to promote and to track progress on them together. This will mean going beyond a “baseline” of reference datasets which are widely held to be important. We would like the Index to include more datasets which are critical for democratic accountability but which may be more ambitious than what is made available by many governments today.

The 10 datasets we have now and their score in France

To do this, we are today opening a consultation on what themes and datasets civil society think should be included in the Global Open Data Index. We want you to help us decide on the priority datasets that we should be tracking and advocating to have opened up. We want to work with our global network to collaboratively determine the datasets that are most important to obtaining progress on different issues – from democratic accountability, to stronger action on climate change, to tackling tax avoidance and tax evasion.

Drawing inspiration from our chapter Open Knowledge Belgium, which ran its own local open data census, we decided to conduct a public consultation. This public consultation will be divided into two parts:

Crowdsourced Survey – Using WikiSurvey, a platform inspired by the viral “kitten war” voting sites (and as we all know, anything inspired by viral kittens cannot be bad), we are interested in which datasets you think are most important. The platform is simple: just choose, of each pair of datasets, the one that you see as a higher priority to include in the Global Open Data Index. Can’t find a dataset that you think is important? Add your own idea to the pool. You do not have a vote limit, so vote as much as you want and shape the Index. SUBMIT YOUR DATA NOW

Our Wiki Survey


Focused consultation with civil society organisations – This survey will be sent to a group of NGOs working on a variety of issues, to find out which specific datasets they think are needed and how they can be used. We will add ideas from the survey to the general pool as they come in. Want to answer the survey as well? You can find it here.

This public consultation will be open for the next 10 days and will close on June 28th. At the end of the process we will analyse the results and share them with you.

We hope that this new process that we are starting today will lead to an even better Index. If you have thoughts about the process, please share them with us on our new forum thread on this topic:
