

Launching Spending Stories: How much is it really?

Anders Pedersen - November 21, 2013 in Data Journalism, Featured, Open Spending, Releases

Spending Stories logo

Spending Stories is a new way to put spending figures in their proper perspective. Developed by the Open Knowledge Foundation and Journalism++ with funding from the Knight Foundation, Spending Stories is an app that helps citizens and journalists understand and compare amounts in stories from the news.

When we hear that the UK’s school meals programme costs £6 million, what does that really mean? It means, for one thing, that it costs about a fifth of the annual spending on the monarchy.

Spending Stories draws out comparisons between amounts of money, giving users a context in which to understand how money is being spent across society while referencing the original news stories.
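
To make the idea concrete, here is a minimal sketch of the kind of comparison involved; the story titles, amounts and phrasing below are illustrative assumptions, not the app’s actual code or figures.

```python
# Hypothetical sketch of the kind of comparison Spending Stories draws out.
# The story titles and amounts below are illustrative, not the app's real data.

STORIES = [
    {"title": "annual spending on the monarchy", "amount": 30_000_000},
    {"title": "the UK school meals programme", "amount": 6_000_000},
]


def compare(amount, stories=STORIES):
    """Return human-readable comparisons of `amount` against known stories."""
    phrases = []
    for story in stories:
        ratio = amount / story["amount"]
        if ratio >= 1:
            phrases.append(f"about {ratio:.1f}x {story['title']}")
        else:
            phrases.append(f"about 1/{round(1 / ratio)} of {story['title']}")
    return phrases


print(compare(6_000_000))
# ['about 1/5 of annual spending on the monarchy',
#  'about 1.0x the UK school meals programme']
```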

Users can enter a figure into Spending Stories and get a scale visualisation showing how it compares with spending stories from the app’s database.

£700,000: scale visualisation

The app displays the big picture, and users can then click through to a card visualisation that shows how the amount relates to specific stories.

£700,000: card visualisation

Users can filter stories to show only amounts that relate to their interests, for example aid or energy.

Filtering stories

If users find news stories of interest, they can contribute these to the database in three easy steps and share them.

Contribute new data

Due to the good availability of UK spending data in OpenSpending, this first release of Spending Stories focuses on the UK. Spending Stories is, however, an open source project and can easily be forked and translated into other languages.

We hope to help new Spending Stories sites launch and grow with new features and local news stories. At launch, we are already in touch with Open Knowledge Foundation Japan about a potential deployment of Spending Stories in Japanese.

If you would like to know more about the options for setting up a local Spending Stories site, get in touch.

Campaigners challenge Cameron to keep promise to tackle company secrecy

Open Knowledge - October 24, 2013 in Featured, Releases, Transparency

The UK government must use the Open Government Partnership summit in London next week to end the secrecy surrounding who really owns millions of UK companies, campaigners said today. Discussions are underway right now at the highest levels of government and campaigners are expecting a decision to be made by the end of this week.

At the G8 summit earlier this year, David Cameron promised “to push for more transparency on who owns companies”. Failure to do so would be a massive missed opportunity to stop tax evasion, money laundering and other forms of crime and corruption, and would seriously undermine UK government claims to lead the world on government openness and accountability, according to a coalition of non-governmental organisations including the Tax Justice Network and the Financial Transparency Coalition.

Laura James, CEO of the Open Knowledge Foundation, said:

“Increasing transparency around company ownership was a key commitment at the UK G8 and is part of the legacy on which this government will be judged. Cameron got a lot of credit for leading on this issue, which could make a massive difference in the fight against corruption and financial crime. Not announcing plans this week to mandate public registries of who really owns companies in the UK would be a missed opportunity and a failure of leadership.”

Campaigners warned that a private registry of ultimate or ‘beneficial’ company ownership that was only accessible to tax authorities and law enforcement agencies would incur the same costs as a public registry, but would bring none of the benefits. They argue that without broader scrutiny from the media, civil society, businesses and the public, errant companies would have much less incentive to change their behaviour.

Many companies have expressed their support for establishing a public registry – including over 20,000 business owners who signed an open letter organised by Avaaz earlier this year – as well as others via organisations such as the European Banking Federation and the Institute of Directors.

Chris Taggart, Co-Founder & CEO, OpenCorporates said:

“This isn’t just about tackling crime and corruption, it’s also about good business. Companies need to know who they are dealing with for markets to function effectively. In a globalised world, with transnational corporations increasingly dominant, ownership transparency is also critical for democracy. Only those with something to hide should oppose this reform.”

Joseph Stead, Senior Adviser on Economic Justice at Christian Aid said:

“Phantom firms registered in developed countries, like the UK and its overseas territories, facilitate huge outflows of illicit money from developing countries. Ending the secrecy will help stop the outflows and ensure developing countries have the resources to provide for essential public services such as health and education.”

Robert Palmer, Banks and Corruption Campaign Leader at Global Witness said:

“Global Witness’ investigations have shown repeatedly how anonymous shell companies are the getaway cars for crime and corruption. A public register of who owns and controls companies would make it much harder for the British financial system to be abused in this way.”

Richard Murphy of Tax Research UK said:

“This issue is vital to tax justice and to closing the tax gap in the UK. Over 300,000 companies quite literally disappear from official records each year because our tax and company authorities do not know how to contact them. Knowing who the real owners of these companies are will help ensure all businesses pay the tax they owe – which will benefit everyone.”

/ Ends

Contact: Rachel Baird, Christian Aid, 0207 523 2446; Robert Palmer, Global Witness, 07545 645406; Chris Taggart, OpenCorporates, 0771 306 7285; Amy Barry, Open Knowledge Foundation, 07980 664397

Signatories: Christian Aid, Global Witness, Financial Transparency Coalition, OpenCorporates, Open Knowledge Foundation, Tax Justice Network, Tax Research LLP

Notes to editors: The annual summit for the Open Government Partnership will take place in London on 31st October to 1st November. More details at: http://www.opengovpartnership.org/

U.S. government’s data portal relaunched on CKAN

Irina Bolychevsky - May 23, 2013 in CKAN, Featured, News, Releases

Today, we are excited to announce that our work with the US Federal Government (data.gov) has gone live at catalog.data.gov! You can also read the announcement from the data.gov blog with their description of the new catalog.

Catalog.Data.gov

The Open Knowledge Foundation’s Services team, which deploys CKAN, has been working hard on a new unified catalog to replace the numerous previously existing catalogs of data.gov. All geospatial and raw data is federated into a single portal where data from different portals, sources and catalogs is displayed in a beautiful standardized user interface, allowing users to search, filter and facet through thousands of datasets.

This is a key part of the U.S. meeting its newly announced Open Data Policy and marks data.gov’s first major step into open source. All the code is available on GitHub, and data.gov plans to make its CKAN / Drupal set-up reusable for others as part of OGPL.

As one of the first major production sites to launch with the shiny new CKAN 2.0, data.gov takes advantage of the much improved information architecture, templating and distributed scalable authorization model. CKAN provides data.gov with a web interface for over 200 publishing organizations to manage their members, harvest sources and datasets – supporting the requirements outlined in Project Open Data. This means that agencies can maintain their data sources individually, schedule regular refreshes of the metadata into the central repository and manage an approval workflow.

There have been many additions to CKAN’s geospatial functionality, most notably a fast and elegant geospatial search:

Geospatial search filter

We have added robust support for harvesting FGDC and ISO 19139 documents from WAFs, single spatial documents, CSW endpoints, ArcGIS portals, Z39.50 sources and ESRI Geoportal Servers, as well as other CKAN catalogs. This is available for re-use as part of our harvesting and spatial extensions.

Most importantly, this is a big move towards greater accessibility and engagement with re-users. Not only is metadata displayed through a browsable web interface (instead of as XML files), but there is now also a comprehensive CKAN API with access to all web functionality, including search queries and downloads, which respects user and publisher permission settings. Users can preview the data in graphic previews as well as exploring Web Map Services, whilst the dataset page provides context, browsable tags, dataset extent, and maintainers.

Web Map Service
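
As a rough illustration of how re-users can work with the new API, the sketch below queries the standard CKAN search endpoint on catalog.data.gov and applies the `ext_bbox` bounding-box filter provided by CKAN’s spatial extension; the query terms and bounding box are made up, and the spatial parameter assumes the portal has geospatial search enabled.

```python
# Sketch of a search against the CKAN Action API exposed by catalog.data.gov.
# The query and bounding box are made up; the ext_bbox parameter assumes the
# portal's ckanext-spatial geospatial search is enabled.
import requests

BASE = "https://catalog.data.gov/api/3/action"

response = requests.get(
    f"{BASE}/package_search",
    params={
        "q": "water quality",          # free-text search
        "rows": 5,                     # number of datasets to return
        "ext_bbox": "-125,32,-114,42", # min lon, min lat, max lon, max lat
    },
)
response.raise_for_status()
result = response.json()["result"]

print(result["count"], "matching datasets")
for dataset in result["results"]:
    print("-", dataset["title"])
```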

As data.gov invites users to get involved and provide feedback, we would also like to say that we are really excited about CKAN’s future. We have a very active mailing list, new documentation for installing CKAN and ways to contribute to the code for anyone wanting to join the CKAN community.

If you’re launching a CKAN portal soon or have one we don’t know about, let us know and we’ll make sure to add you to our wall of awesome!

Announcing CKAN 2.0

Mark Wainwright - May 10, 2013 in CKAN, Featured, Featured Project, News, OKF Projects, Open Data, Open Government Data, Releases, Technical

CKAN is a powerful, open source, open data management platform used by governments and organizations around the world – including the UK and US government open data portals – to make large collections of data accessible.

Today we are very happy and excited to announce the final release of CKAN 2.0. This is the most significant piece of CKAN news since the project began, and represents months of hectic work by the team and other contributors since before the release of version 1.8 last October, and of the 2.0 beta in February. Thank you to the many CKAN users for your patience – we think you’ll agree it’s been worth the wait.

[Screenshot: Front page]

CKAN 2.0 is a significant improvement on 1.x versions for data users, programmers, and publishers. Enormous thanks are due to the many users, data publishers, and others in the data community, who have submitted comments, code contributions and bug reports, and helped to get CKAN to where it is. Thanks also to OKF clients who have supported bespoke work in various areas that has become part of the core code. These include data.gov, the US government open data portal, which will be re-launched using CKAN 2.0 in a few weeks. Let’s look at the main changes in version 2.0. If you are in a hurry to see it in action, head on over to demo.ckan.org, where you can try it out.

Summary

CKAN 2.0 introduces a new sleek default design, and easier theming to build custom sites. It has a completely redesigned authorisation system enabling different departments or bodies to control their own workflow. It has more built-in previews, and publishers can add custom previews for their favourite file types. News feeds and activity streams enable users to keep up with changes or new datasets in areas of interest. A new version of the API enables other applications to have full access to all the capabilities of CKAN. And there are many other smaller changes and bug fixes.

Design and theming

The first thing that previous CKAN users will notice is the greatly improved page design. For the first time, CKAN’s look and feel has been carefully designed from the ground up by experienced professionals in web and information design. This has affected not only the visual appearance but many aspects of the information architecture, from the ‘breadcrumb trail’ navigation on each page, to the appearance and position of buttons and links to make their function as transparent as possible.

[Screenshot: dataset page]

Under the surface, an even more radical change has affected how pages are themed in CKAN. Themes are implemented using templates, and the old templating system has been replaced with the newer and more flexible Jinja2. This makes it much easier for developers to theme their CKAN instance to fit in with the overall theme or branding of their web presence.

Authorisation and workflow: introducing CKAN ‘Organizations’

Another major change affects how users are authorised to create, publish and update datasets. In CKAN 1.x, authorisation was granted to individual users for each dataset. This could be augmented with a ‘publisher mode’ to provide group-level access to datasets. A greatly expanded version of this mode, called ‘Organizations’, is now the default system of authorisation in CKAN. This is much more in line with how most CKAN sites are actually used.

[Screenshot: Organizations page]

Organizations make it possible for individual departments, bodies, groups, etc., to publish their own data in CKAN, and to have control over their own publishing workflow. Different users can have different roles within an Organization, with different authorisations. Linked to this is the possibility for each dataset to have different statuses, reflecting their progress through the workflow, and to be public or private. In the default set-up, Organization user roles include Members (who can read the Organization’s private datasets), Editors (who can add, edit and publish datasets) and Admins (who can add and change roles for users).

More previews

In addition to the existing image previews and table, graph and map previews for spreadsheet data, CKAN 2.0 includes previews for PDF files (shown below), HTML (in an iframe), and JSON. Additionally there is a new plugin extension point that makes it possible to add custom previews for different data types, as described in this recent blog post.

[Screenshot: PDF preview]
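
To give a flavour of what such a custom preview can look like, here is a minimal sketch of a plugin against the IResourcePreview extension point described in that post; the interface and method names follow the CKAN 2.x documentation of the time and may differ in other releases, and the GeoJSON preview itself is a made-up example.

```python
# Illustrative sketch of a custom preview plugin for CKAN 2.0. The
# IResourcePreview interface and method names follow the CKAN 2.x docs of the
# time; verify them against your CKAN version before relying on this.
import ckan.plugins as p


class GeoJSONPreview(p.SingletonPlugin):
    """Hypothetical plugin offering a preview for GeoJSON resources."""

    p.implements(p.IResourcePreview)

    def can_preview(self, data_dict):
        # Offer a preview only for resources whose format looks like GeoJSON.
        resource = data_dict["resource"]
        return resource.get("format", "").lower() == "geojson"

    def setup_template_variables(self, context, data_dict):
        # Values added to the template context here become available to the
        # preview template below; nothing extra is needed for this sketch.
        pass

    def preview_template(self, context, data_dict):
        # Template (shipped with the extension) that renders the preview,
        # e.g. by embedding a small map widget.
        return "geojson_preview.html"
```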

News feeds and activity streams

CKAN 2.0 provides users with ways to see when new data or changes are made in areas that they are interested in. Users can ‘follow’ datasets, Organizations, or groups (curated collections of datasets). A user’s personalised dashboard includes a news feed showing activity from the followed items – new datasets, revised metadata and changes or additions to dataset resources. If there are new entries in your news feed since you last read it, a small flag shows the number of new items, and you can opt to receive notifications of them via e-mail.

Each dataset, Organization etc also has an ‘activity stream’, enabling users to see a summary of its recent history.

[Screenshot: News feed]

Programming with CKAN: meet version 3 of the API

CKAN’s powerful application programming interface (API) makes it possible for other machines and programs to automatically read, search and update datasets. CKAN’s API was previously designed according to REST principles. RESTful APIs are deservedly popular as a way to expose a clean interface to certain views on a collection of data. However, for CKAN we felt it would be better to give applications full access to CKAN’s own internal machinery.

A new version of the API – version 3, trialled in beta in CKAN 1.8 – replaced the REST design with remote procedure calls, enabling applications or programmers to call the same procedures that CKAN’s own code uses to implement its user interface. Anything that is possible via the user interface, and a good deal more, is therefore possible through the API. This proved popular and stable, and so, with minor tweaks, it is now the recommended API. Old versions of the API will continue to be provided for backward compatibility.
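
To make the RPC style concrete, here is a brief sketch of calling two version 3 actions over HTTP; the portal URL, dataset names, organization and API key are placeholders.

```python
# Sketch of the RPC-style version 3 API: every call names an "action" and
# passes a JSON payload, mirroring the functions CKAN uses internally.
# The portal URL, dataset names, organization and API key are placeholders.
import requests

BASE = "http://demo.ckan.org/api/3/action"
API_KEY = "your-api-key"  # only needed for actions that modify data

# Read: fetch the full metadata for one dataset.
shown = requests.get(f"{BASE}/package_show", params={"id": "example-dataset"})
print(shown.json()["result"]["title"])

# Write: create a new dataset (requires authorisation on the portal).
created = requests.post(
    f"{BASE}/package_create",
    json={
        "name": "my-new-dataset",
        "title": "My new dataset",
        "owner_org": "my-organization",
    },
    headers={"Authorization": API_KEY},
)
print(created.json()["success"])
```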

Documentation, documentation, documentation

CKAN comes with installation and administration documentation which we try to keep complete and up-to-date, and the major changes in the rest of CKAN have required a similarly concerted effort on the docs, which have been overhauled for 2.0. It’s great when we hear that others have implemented their own installation of CKAN – something that’s been happening more and more lately – and we hope to see even more of this. CKAN is a large and complex system to deploy, and work on improving the docs continues: version 2.1 will be another step forward. Where people do run into problems, help remains available as usual on the community mailing lists.

… And more

There are many other minor changes and bug fixes in CKAN 2.0. For a full list, see the CKAN changelog.

Installing

To install your own CKAN, or to upgrade an existing installation, you can install it as a package on Ubuntu 12.04 or do a source installation. Full installation and configuration instructions are at docs.ckan.org.

Try it out

You can try out the main features at demo.ckan.org. Please let us know what you think!

Opening up the wisdom of crowds for science

Francois Grey - April 22, 2013 in Featured, News, Open Data, Open Science, Our Work, PyBossa, Releases

We are excited to announce the official launch of Crowdcrafting.org, an open source software platform – powered by our PyBossa technology – for developing and sharing projects that rely on the help of thousands of online volunteers.

crowdcrafting logo

At a workshop on Citizen Cyberscience held this week at the University of Geneva, a novel open source software platform called Crowdcrafting was officially launched. The platform, which has already attracted thousands of participants during several months of testing, enables the rapid development of online citizen science applications by both amateur and professional scientists.

Applications already running on Crowdcrafting range from classifying images of magnetic molecules to analyzing tweets about natural disasters. During the testing phase, some 50 new applications have been created, with over 50 more under development. The Crowdcrafting platform is hosted by the University of Geneva and is a joint initiative of the Open Knowledge Foundation and the Citizen Cyberscience Centre, a Geneva-based partnership co-founded by the university. The Sloan Foundation has recently awarded a grant to this joint initiative for the further development of the Crowdcrafting platform.

Crowdcrafting fills a valuable niche in the broad spectrum of online citizen science. There are already many citizen science projects that use online volunteers to achieve breakthrough results, in fields as diverse as proteomics and astronomy. These projects often involve hundreds of thousands of dedicated volunteers over many years. The objective of Crowdcrafting is to make it quick and easy for professional scientists as well as amateurs to design and launch their own online citizen science projects. This enables even relatively small projects to get started, which may require the effort of just a hundred volunteers for only a few weeks. Such initiatives may be small on the scale of most online social networks, but they still correspond to many man-years of scientific effort achieved in a short time and at low cost.
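
As a rough sketch of what launching a small project can involve, the example below creates an application and a handful of tasks on a PyBossa server over its JSON API; the endpoint paths, field names and API key are assumptions based on the PyBossa API of the time and should be checked against the current documentation.

```python
# Hedged sketch: setting up a small project on a PyBossa server (such as
# Crowdcrafting) through its JSON API. The endpoint paths, field names and
# API key are assumptions to verify against the PyBossa documentation.
import requests

SERVER = "http://crowdcrafting.org"
API_KEY = "your-api-key"  # shown on your PyBossa account page

# Create the application that volunteers will contribute to.
app = requests.post(
    f"{SERVER}/api/app",
    params={"api_key": API_KEY},
    json={
        "name": "Storm Tweet Classification",
        "short_name": "storm_tweets",
        "description": "Classify tweets about a tropical storm.",
    },
).json()

# Add one task per item that needs classifying.
tweets = ["Flooding reported near the river", "Power is back on our street"]
for text in tweets:
    requests.post(
        f"{SERVER}/api/task",
        params={"api_key": API_KEY},
        json={"app_id": app["id"], "info": {"text": text}},
    )
```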

“By emphasizing openness and simplicity, Crowdcrafting is lowering the threshold in investment and expertise needed to develop online citizen science projects”, says Guillemette Bolens, Deputy Rector for Research at the University of Geneva. “As a result, dozens of projects are under development, many of them in the digital humanities and data journalism, some of them created by university students, others still by people outside of academia.”

An example occurred after the tropical storm that wreaked havoc in the Philippines late last year. A volunteer initiative called Digital Humanitarian Network used Crowdcrafting to launch a project called Philippines Typhoon. This enabled online volunteers to classify thousands of tweets about the impact of the storm, in order to more rapidly filter information that could be vital to first responders. “We are excited about how Crowdcrafting is assisting the digital volunteer community worldwide in responding to natural disasters,” says Francesco Pisano, Director of Research at UNITAR.

“Crowdcrafting is also enabling the general public to contribute in a direct way to fundamental science,” says Gabriel Aeppli, Director of the London Centre for Nanotechnology (LCN), a joint venture between UCL and Imperial College. A case in point is the project Feynman’s Flowers, set up by researchers at LCN. In this project, volunteers use Crowdcrafting to measure the orientation of magnetic molecules on a crystalline surface. This is part of a fundamental research effort aimed at creating novel nanoscale storage systems for the emerging field of quantum computing.

Commenting on the underlying technology, Rufus Pollock, founder of the Open Knowledge Foundation, said, “Crowdcrafting is powered by the open-source PyBossa software, developed by ourselves in collaboration with the Citizen Cyberscience Centre. Its aim is to make it quick and easy to do “crowdsourcing for good” – getting volunteers to help out with tasks such as image classification, transcription and geocoding in relation to scientific and humanitarian projects”. The Shuttleworth Foundation and the Open Society Foundations funded much of the early development work for this technology.

Francois Grey, coordinator of the Citizen Cyberscience Centre, says, “Our goal now, with support from the Sloan Foundation, is to integrate other apps for data collection, processing and storage, to make Crowdcrafting an open-source ecosystem for building a new generation of browser-based citizen science projects.”

For further information about Crowdcrafting, see Crowdcrafting.org.

UK Departmental Government Spending – Improving the Quality of Reporting

Lucy Chambers - September 13, 2012 in Featured, Open Spending, Releases

Continuing in their mission to make spending data more accessible and comprehensible, the Spending Stories team and the Data.gov.uk team are today releasing a reporting tool that will help journalists and analysts pick the freshest and best departmental spending data to work with when exploring UK central government expenditure.

Spending data is juicy for journalists – why does it get neglected?

Many reasons. One key one is that the shelf-life of a spending dataset is pretty short from a journalist’s point of view: if they have to wait six months or even a year for the spending data they need for a story to be released, then chances are the sniff of the story they were wanting to write will have gone stale.

Journalists, campaigners and activists need access to well-structured, machine-readable and timely data from national as well as sub-national administrations. At OpenSpending, we’re often contacted by journalists with story ideas, or they approach us with a lead. The stumbling block for them is either a lack of information or, worse, data that they can’t use because they are not sure of its completeness. The problem is thus like that of a tree falling in a forest: if a transaction is missing from a list, does that mean there was no transaction for that amount on that date, or does it mean that the transaction simply was not reported?

These distinctions are important for anyone trying to understand the data, and up to now they have been pretty tricky to answer. To make this a little easier, today we are announcing an automatic reporting tool for spending data (available both on data.gov.uk and on OpenSpending), the result of a collaboration between data.gov.uk and us to increase the visibility of the spending data and to make it easier to browse the substantial volume of datasets that make up the reporting of government expenditure on data.gov.uk.

The tool lists departments registered as data publishers on data.gov.uk and details how closely they have followed the HM Treasury reporting guidelines. It will also make the whole of the reported data available for search and analysis both on data.gov.uk and on the OpenSpending site.

The tool is useful both to those using the data and to those within government responsible for ensuring that departments report on time. It helps to check:

  1. Quality of the data (i.e. adherence to HMT reporting guidelines, well-structured data)
  2. Status of reporting (i.e. how complete the reports are and whether any reporting period is missing)

Why was this possible?

Having all of these datasets organised under a single catalogue at data.gov.uk in a simple spreadsheet format, combined with the data.gov.uk team’s work in making the necessary metadata available, enabled the OpenSpending team to set up an extraction system that cleans the data on a regular basis. The team then cleaned over 6,000 column names to bring them into compliance with HMT guidance.

How does it work?

The report generator highlights in red those departments that are registered as a publisher on data.gov.uk but have failed to publish any information on their spending; in yellow those that have published data which cannot be interpreted as spending data (e.g. in PDF format or not complying with the template provided by HMT); and in green those departments whose records have been updated as regularly as the publication requirements demand (the latest data must have been published within the last month).
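
In code terms, the classification amounts to a few checks per department. The sketch below is illustrative only, restating the rules from this post rather than reproducing the tool’s actual implementation; the data layout and the handling of stale-but-parseable data are assumptions.

```python
# Illustrative sketch of the red/yellow/green classification applied to each
# department registered as a publisher on data.gov.uk. The data layout and the
# handling of stale-but-parseable data are assumptions; the one-month freshness
# rule is restated from this post.
from datetime import date, timedelta


def classify(department):
    """Return 'red', 'yellow' or 'green' for a department's spending reports.

    `department` is expected to look like:
        {"reports": [{"parseable": True, "published": date(2012, 8, 15)}]}
    """
    reports = department.get("reports", [])
    if not reports:
        return "red"     # registered as a publisher but nothing published

    usable = [r for r in reports if r["parseable"]]
    if not usable:
        return "yellow"  # published, but not interpretable as spending data
                         # (e.g. PDFs or files not following the HMT template)

    latest = max(r["published"] for r in usable)
    if date.today() - latest <= timedelta(days=31):
        return "green"   # reporting is up to date (within the last month)
    return "yellow"      # parseable but stale: treated as yellow here


print(classify({"reports": []}))  # -> 'red'
```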

The first stage of this release deals with central departments, which are obliged to report all spending over £25,000. Subsequent stages, to follow soon after, will monitor local councils and other government bodies, which have different reporting requirements. The interface will be useful both inside and outside government, helping to ensure that transparency regulations are met and to better understand where gaps in the data may affect the completeness of the picture offered by government data.

Interested in more regular updates from the Spending Stories team? Join the discussion via the OpenSpending mailing list.

Translators needed!

David Read - February 10, 2012 in CKAN, Join us, OKF Projects, Open Data, Our Work, Releases

Do you speak another language apart from English? Have you got a little bit of spare time over the next week?

CKAN 1.6 is set to be released in one week’s time, and all the new features need translating. Can you help us complete the translations in time? If you can spend 15 minutes filling in the gaps using the Transifex website, then not only will community CKANs in your country benefit (e.g. Czech, Swedish, French etc), but so will the international CKANs run in your language (e.g. thedatahub.org, datacatalogs.org, publicdata.eu)!

These are the languages and how complete the translations are:

https://www.transifex.net/projects/p/ckan/resource/1-6/

Serbian 83%
Finnish 83%
Norwegian 83%
Portuguese 83%
Italian 83%
Catalan 83%
French 83%
Polish 82%
Czech 82%
German 80%
Spanish 76%
Swedish 74%
Hungarian 58%
Albanian 43%
Dutch 37%
Bulgarian 37%
Greek 27%
Slovenian 23%

It’s easy to do some translating!

First-timers will need to set up an account:

  1. Log in with a Transifex/Facebook/Twitter/Google account here.

  2. Choose a CKAN language team: https://www.transifex.net/projects/p/ckan/teams/

  3. Click “Join this team”

  4. Wait for me or another admin to approve you

Now to translate:

  1. https://www.transifex.net/projects/p/ckan/resource/1-6/

  2. Click on your language

  3. Press “Translate”.

Every day this week I’ll put the translations up on thedatahub.org for you to see the results. Please help make this open data catalogue readable by as many people as possible!

Corruption-busting data releases in Croatia

Theodora Middleton - December 2, 2011 in Open Data, Releases, WG EU Open Data, WG Open Government Data

The following post is by Theodora Middleton, the OKF’s blog editor.

Government transparency has been making the headlines over in Croatia, thanks to the amazing work of Marko Rakar, Croatia’s leading transparency expert. He has secured the release of all the public procurement data for government spending, dating back to July 1st 2009, in a fully searchable format. The database includes about 58,000 individual contracts totalling 80 billion Croatian kuna (or about $15–16 billion) and covering more than 13,000 companies, allowing you to see which agency ordered what goods and services, and who received each contract. Marko Rakar has made a name for himself in Croatia and internationally for his corruption-busting data releases, including his release of the country’s fraud-ridden voter files, and the release of a database of war veterans that also demonstrated massive fraud – for which he was arrested (although charges were later dropped).

As reported on the TechPresident website:

A search by contractors shows their overall procurement record (how many contracts, what type of contract, what amount of money is involved and to whom was sold goods and services). The database, which is modeled on FedSpending.gov but in some ways more detailed, also allows a user to see how dependent a company is on government contracts.
This information was theoretically already available on official government websites, but it was essentially useless. Customer names and supplier names aren’t shown on the same page; you can’t search by other criteria; and the data only goes back two months, after which it is removed. Rakar worked with colleagues to collect all the data going back 27 months, and then machine-processed and indexed them. So far, in its first two days, the site has had 56,000 unique visits (for a nation of two million internet users).

The new release is already bearing fruit, as Marko reported to TechPresident:

We have found a number of companies which appear to be founded only to service a single government contract. Journalists have already found a number of companies which have a number of multimillion contracts and are at the same time huge donors to the ruling party. We have found a horse farm which bid on and won a contract to lay underground power cable, we have found a company which is related to the Speaker of the House which reports unusually high profit rates (50% and above) worth millions (both in Croatian and US currency) and which primarily deals with advertising in public spaces (schools, hospitals and similar). We have found one company which belonged to the Minister of Interior which also received multimillion security related contracts with the government (while he is still in the office).

OpenSpending v0.10 released

Lucy Chambers - September 20, 2011 in Open Spending, Releases

This post is by Martin Keegan, project lead on OpenSpending.

We’ve released v0.10 of the OpenSpending code, and made it live on http://openspending.org/

Changes in v0.10:

  • Data loading has been separated from the main web application. Web-based and command-line tools for data wranglers to load and reload datasets now reside in separate code repositories, and there has been significant reorganisation of the resulting source trees

  • More tests. Test coverage and organisation have been improved, and considerably more of the tests pass

  • Model overhaul. The integration between Python and MongoDB has been effectively replaced

  • Removed dependency on celery. Long-running import tasks used to use a third-party subsystem called celery, which proved an administration and reliability hassle. It has been replaced by our own code.

  • Command line interface tidied up

  • Data-wrangler workflow improvements. Dropping datasets is now supported, as are CLI and WUI tools for tagging CKAN packages for use with OpenSpending
