You are browsing the archive for OKF Projects.

Second Open Economics International Workshop

June 5, 2013 in Events, Featured, Open Data, Open Economics, WG Economics, Workshop

Next week, on June 11-12, at the MIT Sloan School of Management, the Open Economics Working Group of the Open Knowledge Foundation will gather about 40 economics professors, social scientists, research data professionals, funders, publishers and journal editors for the second Open Economics International Workshop.

The event will follow up on the first workshop, held in Cambridge, UK, and will conclude by agreeing on a statement of the Open Economics principles. Speakers include Eric von Hippel, T Wilson Professor of Innovation Management and Professor of Engineering Systems at MIT; Shaida Badiee, Director of the Development Data Group at the World Bank and champion for the Open Data Initiative; Micah Altman, Director of Research and Head of the Program on Information Science for the MIT Libraries; and Philip E. Bourne, Professor at the University of California San Diego and Associate Director of the RCSB Protein Data Bank.

The workshop will address topics including:

  • Research data sharing: how and where to share economics and social science research data, enforce data management plans, and promote better data management and data use
  • Open and collaborative research: how to create incentives for economists and social scientists to share their research data and methods openly with the academic community
  • Transparent economics: how to achieve greater involvement of the public in the research agenda of economics and social science

The knowledge sharing in economics session will invite a discussion between Joshua Gans, Jeffrey S. Skoll Chair of Technical Innovation and Entrepreneurship at the Rotman School of Management at the University of Toronto and Co-Director of the Research Program on the Economics of Knowledge Contribution and Distribution, John Rust, Professor of Economics at Georgetown University and co-founder of EconJobMarket.org, Gert Wagner, Professor of Economics at the Berlin University of Technology (TUB) and Chairman of the German Census Commission and German Council for Social and Economic Data as well as Daniel Feenberg, Research Associate in the Public Economics program and Director of Information Technology at the National Bureau of Economic Research.

The session on research data sharing will be chaired by Thomas Bourke, Economics Librarian at the European University Institute, and will discuss the efficient sharing of data and how to create and enforce reward structures for researchers who produce and share high quality data, gathering experts from the field including Mercè Crosas, Director of Data Science at the Institute for Quantitative Social Science (IQSS) at Harvard University, Amy Pienta, Acquisitions Director at the Inter-university Consortium for Political and Social Research (ICPSR), Joan Starr, Chair of the Metadata Working Group of DataCite as well as Brian Hole, the founder of the open access academic publisher Ubiquity Press.

Benjamin Mako Hill, researcher and PhD candidate at MIT and the Berkman Center for Internet and Society at Harvard University, will chair the session on the evolving evidence base of social science. The session will highlight examples of how economists can broaden their perspective on collecting and using data through different means (mobile data collection, the web or crowd-sourcing) and will also consider how to engage the broader community and do more transparent economic research and decision-making. Speakers include Amparo Ballivian, Lead Economist working with the Development Data Group of the World Bank, Michael P. McDonald, Associate Professor at George Mason University and co-principal investigator on the Public Mapping Project and Pablo de Pedraza, Professor at the University of Salamanca and Chair of Webdatanet.

The morning session on June 12 will gather different stakeholders to discuss how to share responsibility and how to pursue joint action. It will be chaired by Mireille van Eechoud, Professor of Information Law at IViR, and will include short statements by Daniel Goroff, Vice President and Program Director at the Alfred P. Sloan Foundation, Nikos Askitas, Head of Data and Technology at the Institute for the Study of Labor (IZA), Carson Christiano, Head of CEGA’s partnership development efforts and coordinator of the Berkeley Initiative for Transparency in the Social Sciences (BITSS) and Jean Roth, the Data Specialist at the National Bureau of Economic Research.

At the end of the workshop the Working Group will discuss the future plans of the project and gather feedback on possible initiatives for translating discussions into concrete action plans. Slides and audio will be available on the website after the workshop. If you have any questions please contact economics [at] okfn.org

Data Expedition story: Why garment retailers need to do more in Bangladesh

June 4, 2013 in School of Data

This post is cross-posted from the School of Data blog

On May 25-26 almost 50 participants from several teams set out on a data expedition to map the garment factories. This is a report from the team comprised of Roy Keyes, Naomi Colvin, Sybern, Bhanupriya Rao and Daniela Mattern. The team used a crowdsourced database on garment factories to expose questionable standards and highlight the need for open supplier lists from all retailers. The article concludes that major retailers like Wal-Mart maintain high levels of opacity around their supply chain and audit standards, which are detrimental to improving working standards in the garment industry.

Not the first time! When the Rana Plaza collapsed, killing 1127 people and injuring over 2500 of its 5000 workforce, it shocked the world and shone an instant light on the working conditions of the garment factories in Bangladesh. While it may have been the worst disaster of our times, it is by no means the first in Bangladesh, where fires due to faulty electrics and short-circuits or building collapses due to structural and maintenance issues are commonplace. Just 8 days later, another fire broke out in one of the Tung Hai group’s factories, killing 8 people. The fire in the Tazreen garment factory in November 2012, which killed 100 people, should have acted as a wake-up call to take health and safety issues seriously. But all it did was lull the government, retailers and the Bangladesh Garment Manufacturers and Exporters Association (BGMEA) into deeper slumber after dubbing it as arson.

Holier-than-thou? The Rana Plaza tragedy seemed like a rude awakening, one that shone a spotlight on the appalling conditions that Human Rights Watch and others have warned about for many years in sweat shops. There was an instant rush by Western retailers, who source a major chunk of their ready-made garments from Bangladesh, to appear to be doing the right thing: to be holier-than-thou. Wal-Mart was quick to release a list of 250 factories that it blacklisted from its supplier list in what appears to be a PR exercise, without any transparency around their audit findings or the exact reasons for the blacklist except for a vague statement that the ‘violations could relate to safety issues, social issues, unauthorized subcontracting or other requirements established by our set of Standards for Suppliers’. Suffice it to say that H&M still sources from eleven and Varner-Gruppen from two of the factories. In the absence of transparent data on their methods of audit and their findings, simply blacklisting companies is not very helpful. Wal-Mart’s blacklist consists of large textile groups such as Akh Fashions, Hop Lun and Mohammadi Group that own several factories and supply to several big western retailers. MJ Group – whose subsidiary, Columbia Garments, is on the Wal-Mart list – lists Replay, New Yorker, C&A, Esprit, GAP, Old Navy and Macys alongside H&M as customers on its website.

Sustainability and Ethical codes: The essential point being missed in the rush to appear holier-than-thou is compliance with ethical standards initiatives that rely largely on a multi-stakeholder model. Worldwide Responsible Accredited Production (WRAP) is one such accreditation initiative, which has released a list of 194 factories in Bangladesh that meet its standards. That these certified factories constitute a mere 3% of all factories in Bangladesh gives us an insight into how far the industry has to go as far as certification is concerned. Interestingly, 22 of the Wal-Mart blacklisted factories feature on this list. While Wal-Mart was quick to disclose a blacklist in a bid to appear responsible, it would do well to disclose all its suppliers in the interests of transparency and responsible sourcing.

H&M has been much more transparent here, not just disclosing a list of its worldwide suppliers, but also spelling out its stringent audit policy. Only one H&M factory was both WRAP certified and on the Wal-Mart blacklist. And the story is a bit more encouraging because 15% of H&M’s suppliers in Bangladesh are WRAP accredited. Brands like Puma (10%) and Varner-Gruppen (15%) show some good signs of sourcing from accredited suppliers, as opposed to Timberland and Nike, none of whose suppliers are WRAP accredited. While by no means adequate, it does show that some retailers are better at sourcing ethically than others.

Table: Which retailers use WRAP Certified factories?

Retailer          Factories in Bangladesh    WRAP Certified    % WRAP Certified
H&M               164                        24                15
Levi’s            13                         1                 8
Nike              6                          0                 0
Puma              10                         1                 10
Timberland        5                          0                 0
Varner-Gruppen    46                         7                 15

Source: Crowdsourced garment factory list
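
The percentage column in the table is each retailer's WRAP-certified factory count divided by its total number of factories in Bangladesh, rounded to the nearest whole number. A minimal Python sketch that reproduces the column from the counts (the names `SUPPLIERS` and `wrap_share` are illustrative and not part of the expedition's own tooling):

```python
# Counts copied from the table: (factories in Bangladesh, WRAP certified).
SUPPLIERS = {
    "H&M": (164, 24),
    "Levi's": (13, 1),
    "Nike": (6, 0),
    "Puma": (10, 1),
    "Timberland": (5, 0),
    "Varner-Gruppen": (46, 7),
}


def wrap_share(total, certified):
    """Return the WRAP-certified share as a whole-number percentage."""
    return round(100 * certified / total)


shares = {name: wrap_share(total, cert) for name, (total, cert) in SUPPLIERS.items()}

for name, pct in shares.items():
    print(f"{name}: {pct}%")
```

Rounding to whole numbers matches the table; with such small factory counts (Puma has ten, Nike six), a single certification moves a retailer's percentage a long way.
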
The blacklist from Wal-Mart is pretty rich considering that, along with Gap, it has refused to sign the Accord on Fire and Building Safety in Bangladesh, instead preferring to rely on its own codes and audits. H&M was the first retailer, followed by 31 others, to sign the agreement, which includes provisions for independent safety inspections, mandatory repairs and renovations, a commitment to pay for them, and a role for workers and their unions in making garment factories in Bangladesh safe. The accord is a watershed moment because it is a multilateral initiative driven by retailers and the global unions IndustriALL and UNI, in alliance with Clean Clothes Campaign and Worker Rights Consortium.

It certainly could be the last! In the aftermath of the Wal-Mart blacklist, other retailers like H&M have rushed in to rethink their sourcing policy and look at new supply chains in Africa and Latin America. While any rethink is welcome, it needs to be in the area of more responsible auditing and greater transparency in supply chains, not just of primary suppliers, but of secondary ones, where there is astounding opacity. A great step forward for western retailers like H&M would be to make public their factory-wise audit findings for greater accountability. Simply moving supply chains and tolerating the same conditions will not see the end of tragedies such as the Rana Plaza. There needs to be timely and better audit data, supplier data down to the last link in the supply chain, and greater commitment to multi-stakeholder processes such as the fire safety accord. This could be the beginning of a long-term political engagement on workers’ safety and better wages and working conditions. This also means that Rana Plaza could be the last in the list of terrible tragedies.

Data Expedition: Tax Avoidance and Evasion – 6th June

May 24, 2013 in Open Spending, School of Data, Spending Stories

Tax expedition

Want to dig deep into tax avoidance and evasion? We have gathered a wide range of data on this sensitive topic and for one afternoon we’ll guide you through some of the key decisions to think about when writing a story on the topic. With tax evasion and tax avoidance currently such a hot topic in the media, it’s crucial that people can understand the difference between the two terms as well as the mechanisms by which they happen.

When: Thursday June 6th – 12:00 BST to 17:00 BST – link to your timezone

We’ll be looking for projects such as:

  • Exploring the tax avoidance schemes used by Apple, Google, Amazon, or Starbucks

  • Looking at data gathered by tax collection authorities and patterns of avoidance that emerge from that dataset

  • Creating a “most wanted” list of tax evaders for future research

  • Your project here!

Sign up here for the Data Expedition!

Please note that limited space is available. For more information about the Data Expedition format, we encourage you to read this article.

How can I participate?

To get involved either:

  • Lead a team! (Up to 6 hours) Are you able to help to coordinate a team on the day? This involves helping your team to understand the options and the research that has been conducted, and starting a discussion about the choice of story and how to construct a plan for making the story happen. The School of Data team will hold a specific hangout for team leads on Monday 3rd June at 12:00 BST to prepare for Thursday’s activities. Please email schoolofdata [at] okfn.org if you are interested in getting involved.

  • Offer an expert introduction! (Up to one hour) We’re looking for experts who understand the loopholes or tactics used by companies in different countries to offer quick introductions from 5-30 mins long to get the expedition started.

  • Join us as a participant on the day! (3-6 hours) You will need to be prepared to brainstorm ideas with others in your group and ultimately explain your choice of story. There will be two roles you can take on the day – either getting stuck into the data (analyst) or writing (storyteller).

Aims of the expedition

We will aim to give people:

  • A clear understanding of the difference between tax evasion and tax avoidance
  • A key understanding of a few schemes via which people engage in them
  • Perhaps also a few story ideas!

How to get involved

Please make sure you are registered here and that you select “Tax Avoidance/Evasion” in the “I’m Interested in…” section. Please note: you will need to be available for at least 3 hours during the expedition period and spaces will be limited, so preference will be given to those who can definitely commit to the expedition. Spaces will be confirmed shortly before the expedition.

Stay up to date with the latest data expeditions

Want to be informed any time there is a new data expedition? Join the School of Data announcement list to get notifications of the expeditions as soon as they are announced!

IRS: Turn Over A New Leaf, Open Up Data

May 24, 2013 in Data Journalism, Open Spending, Spending Stories

The following post is co-authored by Stefaan Verhulst and Beth Noveck. It is cross-posted from Forbes.com. If you’d like to learn more about tax data, check out our data expedition on tax evasion and avoidance on the 6th June!

The core task for Danny Werfel, the new acting commissioner of the U.S. Internal Revenue Service (IRS), is to repair the agency’s tarnished reputation and achieve greater efficacy and fairness in IRS investigations. Mr. Werfel can show true leadership by restructuring how the IRS handles its tax-exempt enforcement processes.

People filing tax forms at the IRS in 1920.

One of Mr. Werfel’s first actions on the job should be the immediate implementation of the groundbreaking Presidential Executive Order and Open Data policy, released last week, that requires data captured and generated by the government to be made available in open, machine-readable formats. Doing so will make the IRS a beacon to other agencies in how to use open data to screen for wrongdoing and strengthen law enforcement.

By sharing readily available IRS data on tax-exempt organizations, encouraging Congress to pass a budget proposal that mandates release of all tax-exempt returns in a machine-readable format, and increasing the transparency of its own processes, the agency can begin to turn the page on this scandal and help rebuild trust and partnership between government and its citizens.

Every year in the United States approximately 1.5 million registered tax-exempt organizations file a version of the “Form 990” with the IRS and state tax authorities. The 990 collects details on the financial, governance and organizational structure of America’s universities, hospitals, foundations, and charities to the end of ensuring that they are deserving of tax exempt status. We are missing an opportunity to analyze this data so that decisions about whom to investigate can be based on evidence rather than conjecture, on patterns rather than prejudice.

Currently, hundreds of thousands of the largest tax-exempt organizations are required to file their returns electronically. The IRS should release this data in bulk as a free database immediately. If the IRS were to make these 990 data available in a form that could be easily downloaded and processed by computer programs for visualization and statistical analysis, researchers could quickly do more extensive, in-depth empirical research to better understand the sector and spot fraud, waste and abuse more systematically. Knowing who runs a nonprofit can help detect fraud. Attorneys General have occasionally found the same person collecting full-time salaries from several different nonprofits.
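
To illustrate the kind of check that bulk 990 data would make routine, here is a minimal Python sketch that flags anyone listed as working full time at more than one organization. The records, the field names and the 35-hour threshold are all invented for illustration; they are not the IRS schema or real filings.

```python
from collections import defaultdict

# Hypothetical, hand-made records standing in for rows extracted from
# Form 990 filings. Field names are illustrative, not the IRS schema.
FILINGS = [
    {"organization": "Example Charity A", "officer": "Pat Example", "hours_per_week": 40},
    {"organization": "Example Charity B", "officer": "pat example ", "hours_per_week": 40},
    {"organization": "Example Charity C", "officer": "Sam Sample", "hours_per_week": 40},
    {"organization": "Example Charity D", "officer": "Sam Sample", "hours_per_week": 5},
]

FULL_TIME_HOURS = 35  # assumed threshold for "full time"


def officers_with_multiple_full_time_roles(filings, min_hours=FULL_TIME_HOURS):
    """Map each officer to the organizations reporting them as full time,
    keeping only officers who appear at more than one organization."""
    roles = defaultdict(set)
    for row in filings:
        if row["hours_per_week"] >= min_hours:
            # Normalize case and whitespace so trivial spelling variants match.
            roles[row["officer"].strip().lower()].add(row["organization"])
    return {name: sorted(orgs) for name, orgs in roles.items() if len(orgs) > 1}


flagged = officers_with_multiple_full_time_roles(FILINGS)
print(flagged)
```

Matching on a normalized name is deliberately crude. A real analysis would need to handle common names and spelling variants, and a match would only be a lead for investigators, not evidence of fraud.
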

Check out the guide on tax avoidance and evasion from OpenSpending to find out more about how to follow the money.

While the IRS is using robo-audits, catching large evasions still happens mainly by happenstance. With open data, they could be detected, first, through computer analysis. By using technology to expand the regulator’s toolkit, it becomes possible to target limited enforcement resources to where problems really are. The Securities and Exchange Commission has, for instance, developed an improved capacity to detect and prevent insider trading more effectively by making public information computable and easier to mine. In addition, open data creates the means for government and citizens to collaborate on spotting problems. As the adage goes, with many eyes, all bugs are shallow.

Similarly, Form 990 requires charities to disclose loans to or from current and former officers. Making these and other transactions that correlate with instances of fraud openly searchable would save government resources at the state and federal levels.

With a 990 database, it would also be easier to run queries to understand which executives receive the highest compensation. By combining 990 and other data, such as lobbying data, it might become possible to spot impermissible political activities.

President Obama’s 2014 budget calls for requiring all tax exempts to file electronically, and also requires that the IRS make these already public returns available in a timely, machine-readable format. These data would create a corpus of open, computable information that could be used to understand where nonprofits are providing services and where there are gaps. Enabling more people and organizations to analyze, visualize, and mash up the data would create a large public community that is interested in the nonprofit sector and can collaborate to find ways to improve it.

In sum, the data that the IRS collect about nonprofit organizations present a great opportunity to learn about the sector and make it more effective.

Making IRS data open won’t solve every problem; the recent scandal has proven that the IRS must be more transparent about both the information it collects and how it manages that information. A commitment on day one to share the data it collects in a machine-readable manner would show true leadership by Mr. Werfel and help solidify the Obama administration’s legacy as an open government.


Stefaan G. Verhulst is the Chief Research and Development Officer of the Governance Laboratory @NYU (GovLab) where he is responsible for building a research foundation on how to transform governance using advances in science and technology.

Beth Noveck is Founder and Director of the Governance Laboratory. She served in the White House as the first United States Deputy Chief Technology Officer and founder of the White House Open Government Initiative (2009-2011). She was appointed senior advisor for Open Government to the UK Prime Minister David Cameron. She is the author of “Wiki Government: How Technology Can Make Government Better, Democracy Stronger and Citizens More Powerful.”

U.S. government’s data portal relaunched on CKAN

May 23, 2013 in CKAN, Featured, News, Releases

Today, we are excited to announce that our work with the US Federal Government (data.gov) has gone live at catalog.data.gov! You can also read the announcement from the data.gov blog with their description of the new catalog.

Catalog.Data.gov

The Open Knowledge Foundation’s Services team, which deploys CKAN, have been working hard on a new unified catalog to replace the numerous previously existing catalogs of data.gov. All geospatial and raw data is federated into a single portal where data from different portals, sources and catalogs is displayed in a beautiful standardized user interface allowing users to search, filter and facet through thousands of datasets.

This is a key part of the U.S. meeting their newly announced Open Data Policy and marks data.gov’s first major step into open source. All the code is available on GitHub and data.gov plan to make their CKAN / Drupal set-up reusable for others as part of OGPL.

As one of the first major production sites to launch with the shiny new CKAN 2.0, data.gov takes advantage of the much improved information architecture, templating and distributed scalable authorization model. CKAN provides data.gov with a web interface for over 200 publishing organizations to manage their members, harvest sources and datasets – supporting requirements being outlined in Project Open Data. This means that agencies can maintain their data sources individually, schedule regular refreshes of the metadata into the central repository and manage an approval workflow.

There have been many additions to CKAN’s geospatial functionality, most notably a fast and elegant geospatial search:

[Screenshot: Geospatial search filter]

We have added robust support for harvesting FGDC and ISO 19139 documents from WAFs, single spatial documents, CSW endpoints, ArcGIS portals, Z39.50 sources, ESRI Geoportal Servers as well as other CKAN catalogs. This is available for re-use as part of our harvesting and spatial extensions.

Most importantly, this is a big move towards greater accessibility and engagement with re-users. Not only is metadata displayed through a browsable web interface (instead of XML files), but there is now a comprehensive CKAN API with access to all web functionality, including search queries and downloads, which respects user and publisher permission settings. Users can preview the data in graphic previews as well as explore Web Map Services, whilst the dataset page provides context, browsable tags, dataset extent, and maintainers.
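
As a rough illustration of a search query through the API, the Python sketch below builds a URL for CKAN's `package_search` action and reads dataset titles out of the JSON it returns. The `https` scheme, the helper names and the sample response are assumptions made for this example; the sample is made up so that the sketch runs without network access.

```python
import json
from urllib.parse import urlencode

BASE_URL = "https://catalog.data.gov"  # portal named in the post; scheme assumed


def package_search_url(query, rows=5, base_url=BASE_URL):
    """Build a URL for CKAN's package_search action."""
    params = urlencode({"q": query, "rows": rows})
    return f"{base_url}/api/3/action/package_search?{params}"


def dataset_titles(response_text):
    """Extract dataset titles from a package_search JSON response."""
    payload = json.loads(response_text)
    if not payload.get("success"):
        raise ValueError("CKAN reported an unsuccessful request")
    return [dataset["title"] for dataset in payload["result"]["results"]]


# A trimmed, made-up response in the shape CKAN returns (not real catalog data).
SAMPLE_RESPONSE = json.dumps({
    "success": True,
    "result": {
        "count": 2,
        "results": [
            {"name": "dataset-a", "title": "Dataset A"},
            {"name": "dataset-b", "title": "Dataset B"},
        ],
    },
})

url = package_search_url("water quality")
titles = dataset_titles(SAMPLE_RESPONSE)
print(url)
print(titles)
```

To query a live portal, the text passed to `dataset_titles` would come from fetching `url`, for example with `urllib.request.urlopen(url).read().decode("utf-8")`.
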

[Screenshot: Web Map Service]

As data.gov invites users to get involved and provide feedback, we would also like to say that we are really excited about CKAN’s future. We have a very active mailing list, new documentation for installing CKAN and ways to contribute to the code for anyone wanting to join the CKAN community.

If you’re launching a CKAN portal soon or have one we don’t know about, let us know and we’ll make sure to add you to our wall of awesome!

Data Expedition: Mapping the garment factories

May 20, 2013 in Events, School of Data

Women sewing at long tables next to tall windows in a garment factory.

The horrific factory collapse at Rana Plaza in Dhaka has brought the business practices of global garment brands, as well as their thousands of suppliers, into the spotlight.

At School of Data we noted that corrupt and missing data were part of the story. Data on building permits in Bangladesh is largely unavailable due to lack of state inspections. However, after years of pressure on global apparel brands from labor activists, the publishing of garment factory supplier lists is becoming increasingly standardized. We’re asking you to join us in mapping the data on garment factories.

Data Expedition: Mapping the garment factories 

When: Saturday May 25 – 12:00 BST to May 26 18:00 BST - link to your timezone

We’ll be looking for projects such as:

  • Mapping garment factories locally and globally

  • Exploring the global supply chain of garment export and imports

  • Mapping the ownership of local factories and global brands with open company data

  • Finding stories and patterns in the connections between global brands and local garment factories

Sign up here for the Data Expedition!

Please note that limited space is available. For more information about the Data Expedition format, we encourage you to read this article.

Before the Data Expedition – Help us build an open garment factory supply list

Before heading out on this important expedition, we’ll need to gather as much data as possible on garment factories. Labor activists and campaigners typically articulate the data in terms of “supplier lists.” Some brands, such as Nike, provide a list of all factories in their supplier network via Excel and JSON downloads, while others, such as Levi-Strauss, only offer lists in PDF format. In order to prepare a solid dataset for the Data Expedition, we’re asking you to help locate, clean, and merge the supplier lists from across garment brands into one comprehensive Open Garment Factory List.
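
For anyone wondering what "clean and merge" might look like in practice, here is a minimal Python sketch, assuming each brand's list has already been converted into rows with a factory name and a country. The brand names, factory names and field names are placeholders invented for the example, not entries from any real supplier list.

```python
import re


def normalize_name(name):
    """Lower-case a factory name and collapse punctuation and whitespace."""
    return re.sub(r"[^a-z0-9]+", " ", name.lower()).strip()


def merge_supplier_lists(lists_by_brand):
    """Merge per-brand factory lists into one list, recording which brands
    source from each factory. Factories match on normalized name and country."""
    merged = {}
    for brand, factories in lists_by_brand.items():
        for factory in factories:
            key = (normalize_name(factory["name"]), factory["country"].strip().lower())
            entry = merged.setdefault(key, {
                "name": factory["name"].strip(),
                "country": factory["country"].strip(),
                "brands": set(),
            })
            entry["brands"].add(brand)
    return sorted(merged.values(), key=lambda entry: entry["name"].lower())


# Placeholder input: two hypothetical brands that share one factory,
# spelled differently in each list.
LISTS = {
    "Brand A": [{"name": "Example Garments Ltd.", "country": "Bangladesh"}],
    "Brand B": [
        {"name": "EXAMPLE GARMENTS LTD", "country": "bangladesh"},
        {"name": "Sample Textiles", "country": "Bangladesh"},
    ],
}

merged = merge_supplier_lists(LISTS)
for entry in merged:
    print(entry["name"], sorted(entry["brands"]))
```

Name normalization this simple will miss many real duplicates (abbreviations, transliterations, units of the same group), so a manual review pass would still be needed before treating the merged list as authoritative.
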

Begin today by adding to the Open Garment Factory List and join us for a Google Hangout on Thursday, 23 May at 19:00 CET, where we’ll be engaging in joint data collection.

Announcing CKAN 2.0

May 10, 2013 in CKAN, Featured, Featured Project, News, OKF Projects, Open Data, Open Government Data, Releases, Technical

CKAN is a powerful, open source, open data management platform, used by governments and organizations around the world to make large collections of data accessible, including the UK and US government open data portals.

Today we are very happy and excited to announce the final release of CKAN 2.0. This is the most significant piece of CKAN news since the project began, and represents months of hectic work by the team and other contributors since before the release of version 1.8 last October, and of the 2.0 beta in February. Thank you to the many CKAN users for your patience – we think you’ll agree it’s been worth the wait.

[Screenshot: Front page]

CKAN 2.0 is a significant improvement on 1.x versions for data users, programmers, and publishers. Enormous thanks are due to the many users, data publishers, and others in the data community, who have submitted comments, code contributions and bug reports, and helped to get CKAN to where it is. Thanks also to OKF clients who have supported bespoke work in various areas that has become part of the core code. These include data.gov, the US government open data portal, which will be re-launched using CKAN 2.0 in a few weeks. Let’s look at the main changes in version 2.0. If you are in a hurry to see it in action, head on over to demo.ckan.org, where you can try it out.

Summary

CKAN 2.0 introduces a new sleek default design, and easier theming to build custom sites. It has a completely redesigned authorisation system enabling different departments or bodies to control their own workflow. It has more built-in previews, and publishers can add custom previews for their favourite file types. News feeds and activity streams enable users to keep up with changes or new datasets in areas of interest. A new version of the API enables other applications to have full access to all the capabilities of CKAN. And there are many other smaller changes and bug fixes.

Design and theming

The first thing that previous CKAN users will notice is the greatly improved page design. For the first time, CKAN’s look and feel has been carefully designed from the ground up by experienced professionals in web and information design. This has affected not only the visual appearance but many aspects of the information architecture, from the ‘breadcrumb trail’ navigation on each page, to the appearance and position of buttons and links to make their function as transparent as possible.

[Screenshot: dataset page]

Under the surface, an even more radical change has affected how pages are themed in CKAN. Themes are implemented using templates, and the old templating system has been replaced with the newer and more flexible Jinja2. This makes it much easier for developers to theme their CKAN instance to fit in with the overall theme or branding of their web presence.

Authorisation and workflow: introducing CKAN ‘Organizations’

Another major change affects how users are authorised to create, publish and update datasets. In CKAN 1.x, authorisation was granted to individual users for each dataset. This could be augmented with a ‘publisher mode’ to provide group-level access to datasets. A greatly expanded version of this mode, called ‘Organizations’, is now the default system of authorisation in CKAN. This is much more in line with how most CKAN sites are actually used.

[Screenshot: Organizations page]

Organizations make it possible for individual departments, bodies, groups, etc, to publish their own data in CKAN, and to have control over their own publishing workflow. Different users can have different roles within an Organization, with different authorisations. Linked to this is the possibility for each dataset to have different statuses, reflecting their progress through the workflow, and to be public or private. In the default set-up, Organization user roles include Members (who can read the Organization’s private datasets), Editors (who can add, edit and publish datasets) and Admins (who can add and change roles for users).
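
The default roles can be pictured as a small permission table. The Python sketch below is only an illustrative model of the description above, not CKAN's actual authorisation code; in particular it assumes that roles are cumulative (Editors can do what Members can, Admins what Editors can), and the action names are invented for the example.

```python
# Capabilities per role, paraphrased from the description of the default
# set-up. Cumulative roles are an assumption of this sketch.
ROLE_PERMISSIONS = {
    "member": {"read_private"},
    "editor": {"read_private", "add", "edit", "publish"},
    "admin": {"read_private", "add", "edit", "publish", "manage_roles"},
}


def can(role, action):
    """Return True if the given Organization role allows the action."""
    return action in ROLE_PERMISSIONS.get(role, set())


def can_view(dataset, role=None):
    """Public datasets are visible to everyone; private ones require a role
    in the owning Organization that grants read access."""
    return not dataset["private"] or can(role, "read_private")


print(can("editor", "publish"))        # an Editor may publish
print(can("member", "edit"))           # a Member may not edit
print(can_view({"private": True}))     # anonymous users cannot see private data
```

Keeping the rules in one table makes it easy to see at a glance what each role may do, which is the point of moving from per-dataset grants to Organization-level roles.
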

More previews

In addition to the existing image previews and table, graph and map previews for spreadsheet data, CKAN 2.0 includes previews for PDF files (shown below), HTML (in an iframe), and JSON. Additionally there is a new plugin extension point that makes it possible to add custom previews for different data types, as described in this recent blog post.

[Screenshot: PDF preview]

News feeds and activity streams

CKAN 2.0 provides users with ways to see when new data is added or changes are made in areas that they are interested in. Users can ‘follow’ datasets, Organizations, or groups (curated collections of datasets). A user’s personalised dashboard includes a news feed showing activity from the followed items – new datasets, revised metadata and changes or additions to dataset resources. If there are entries in your news feed since you last read it, a small flag shows the number of new items, and you can opt to receive notifications of them via e-mail.

Each dataset, Organization etc also has an ‘activity stream’, enabling users to see a summary of its recent history.

[Screenshot: News feed]

Programming with CKAN: meet version 3 of the API

CKAN’s powerful application programming interface (API) makes it possible for other machines and programs to automatically read, search and update datasets. CKAN’s API was previously designed according to REST principles. RESTful APIs are deservedly popular as a way to expose a clean interface to certain views on a collection of data. However, for CKAN we felt it would be better to give applications full access to CKAN’s own internal machinery.

A new version of the API – version 3 – trialled in beta in CKAN 1.8, replaced the REST design with remote procedure calls, enabling applications or programmers to call the same procedures as CKAN’s own code uses to implement its user interface. Anything that is possible via the user interface, and a good deal more, is therefore possible through the API. This proved popular and stable, and so, with minor tweaks, it is now the recommended API. Old versions of the API will continue to be provided for backward compatibility.
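
As a sketch of what a remote procedure call looks like from the client side, the Python below prepares (but does not send) a request to a version 3 action: the action name goes in the URL and its parameters travel as a JSON body. The helper name `build_action_request` and the dataset id `example-dataset` are hypothetical; passing an API key in the `Authorization` header reflects our understanding of how CKAN authenticates API calls and should be checked against the documentation for your version.

```python
import json
from urllib.request import Request


def build_action_request(base_url, action, params, api_key=None):
    """Prepare a POST request that calls a CKAN API v3 action with JSON parameters."""
    headers = {"Content-Type": "application/json"}
    if api_key:
        # Assumed: the user's API key is sent in the Authorization header.
        headers["Authorization"] = api_key
    body = json.dumps(params).encode("utf-8")
    return Request(
        f"{base_url}/api/3/action/{action}",
        data=body,
        headers=headers,
        method="POST",
    )


# "example-dataset" is a placeholder id, not a dataset known to exist on the demo site.
request = build_action_request(
    "http://demo.ckan.org", "package_show", {"id": "example-dataset"}
)
print(request.get_method(), request.full_url)
```

Sending the request with `urllib.request.urlopen(request)` would return a JSON document with a `success` flag and, on success, the action's return value under `result`.
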

Documentation, documentation, documentation

CKAN comes with installation and administration documentation which we try to keep complete and up-to-date. The major changes in the rest of CKAN have required a similarly concerted effort on the documentation. It’s great when we hear that others have implemented their own installation of CKAN, something that’s been increasing lately, and we hope to see even more of this. The docs have therefore been overhauled for 2.0. CKAN is a large and complex system to deploy, and work on improving the docs continues: version 2.1 will be another step forward. Where people do run into problems, help remains available as usual on the community mailing lists.

… And more

There are many other minor changes and bug fixes in CKAN 2.0. For a full list, see the CKAN changelog.

Installing

To install your own CKAN, or to upgrade an existing installation, you can install it as a package on Ubuntu 12.04 or do a source installation. Full installation and configuration instructions are at docs.ckan.org.

Try it out

You can try out the main features at demo.ckan.org. Please let us know what you think!

Follow the Money, Follow the Data

May 3, 2013 in Ideas and musings, Open Data, Open Government Data, Open Spending

The following guest post from Martin Tisné was first published on his personal blog.

Money tunnel by RambergMediaImages, CC-BY-SA on Flickr

Some thoughts which I hope may be helpful in advance of the ‘follow the data’ hack day this weekend:

The open data sector has quite successfully focused on socially-relevant information: fixing potholes a la http://www.fixmystreet.com/, adopting fire hydrants a la http://adoptahydrant.org/. My sense is that the next frontier will be to free the data that can enable citizens, NGOs and journalists to hold their governments to account. What this will likely mean is engaging in issues such as data on extractives’ transparency, government contracting, political finance, budgeting etc. So far, these are not the bread and butter of the open data movement (which isn’t to say there aren’t great initiatives like http://openspending.org/). But they should be:

At its heart, this agenda revolves around ‘following the money’. Without knowing the ‘total resource flow’:

  • Parents’ associations cannot question the lack of textbooks in their schools by interrogating the school’s budget
  • Healthcare groups cannot access data related to local spending on doctors and nurses
  • Great orgs such as Open Knowledge Foundation or BudgIT cannot get the data they need for their interpretative tools (e.g. budget tracking tool)
  • Investigative journalists cannot access the data they need to pursue a story

Our field has sought to ‘follow the money’ for over two decades, but in practice we still lack the fundamental ability to trace funding flows from A to Z, across the revenue chain. We should be able to get to what aid transparency experts call ‘traceability’ (the ability to trace aid funds from the donor down to the project level) for all, or at least most, fiscal flows.

Open data enables this to happen. This is exciting: it’s about enabling ‘follow the money’ to happen at scale. Up until now, instances of ‘following the money’ have been isolated, the fruit of hard work by investigative journalists.

If we can ensure that data on revenues (extractives, aid, tax etc), expenditures (from planning to allocation to spending to auditing), and results (service delivery data) is timely, accessible, comparable and comprehensive, we will have gone a long way to helping ‘follow the money’ efforts reach the scale they deserve.

Follow the Money is a pretty tangible concept (if you disagree, please let me know!) – it helps demonstrate how government funds buy specific outcomes, and how/whether resources are siphoned away. We now need to make it a reality.

The Public Domain Review is Saved!

May 2, 2013 in OKF Projects, Public Domain, Public Domain Review

At 12:00pm BST today, as midnight struck over the Pacific island of American Samoa and the 1st of May truly ended all over the world, so did end the inaugural Public Domain Review Fundraiser. In 58 days, with the help of 676 wonderful supporters, we managed to leapfrog our target of $20,000 and raise an amazing $22,070, ca. £14,200 / €16,800. Thank you all so much, we’ve been really blown away by your amazing generosity.

We saw donations come in from all over the world, and the Tote Bags have been sent out to homes far and wide across 6 of the 7 continents on the planet (still missing that ever elusive Antarctica). There weren’t just offers of monetary support – a few people also pledged their skills and time. We’ve had a very kind offer to build a PDR App for Android which is currently in progress, and also a printmaker interested in partnering up to do some prints for us using an old Victorian letterpress. There are also other interesting collaborations currently being discussed – all to be revealed soon!

We have lots of really exciting things lined up for the future, and thanks to all the incredible generosity we’ve seen we can make them happen. Amongst others, we have coming soon a brand new monthly feature – “Guest Curator of the Month” – in which an invited curator shall do a guest post focusing on works in their institution’s openly licensed digital collections: the British Library, Rijksmuseum and others are on board already. In addition to improving the website with new features like these, part of the work we’ll also be doing is, of course, trying to secure additional funding, which we’ll be very much focusing on over the next few months.

All in all, very exciting times ahead. And, again, a huge thank you to all who donated!

And in case you missed it, here’s the super-extended version of the fundraising film: aptly retitled “SAVED!” and with a new happy ending!


Open Budget Oakland and OpenSpending

April 29, 2013 in Open Spending, Tools in Use

From small beginnings in a hackathon, here’s a great story from Oakland of how OpenSpending can be deployed to improve civic engagement on a local level.

[Image: Open Budget Oakland]

The beta version of Open Budget Oakland went public last week with the release of our mayor’s proposed budget for the next two years. Her announcement was made Wednesday afternoon and by evening we had visualized and made available for discussion three levels of spending data. Within the week, the site was starting to help people make sense of the budget, and City Council invited us to present the site at their next meeting as they begin the budget process.

While still only an outline of the resource that we plan to build, we’re seeing the first glimpses of people gaining a better understanding of the city’s budget, asking questions, and sharing ideas.

From hackathon to civic collaboration

At a hackathon in Oakland last July, Shawn McDougal, a university math teacher and community organizer, pitched an idea: we need an app to help people understand our city’s budget, to see where our money comes from and how it’s spent, and to enable people to share and discuss their own budget priorities. The idea grabbed people’s attention. By the end of the day, about ten of us had copy/pasted budget data from a 350-page PDF and made an almost-interactive pie chart with big aspirations. For a one-day project it was enough to win the grand prize. It also provided a realization: accessing our city’s budget data isn’t easy, and once you have data, it isn’t immediately clear how it can be shared in a way that helps people.

We met weekly to dig deeper, drawn in varying degrees to the coding challenge, to open data, and to how a budget app could support more engaged democracy, in particular processes like participatory budgeting. But an initially slow process with the city proved too long for most of our data-hungry volunteer developers, who gradually went back to their day jobs. In pursuit of coders and better communication with the city, we joined with OpenOakland, a Code for America brigade that meets every Tuesday at City Hall. This is a forum where residents and city officials collaborate to improve access to public data and build civic apps. Here we were able to connect with Bradley Johnson, a city budget analyst who now works with us to access and interpret the data.

While OpenOakland connected us to City Hall and local programmers, what we were envisioning wasn’t going to happen over a few evening hack sessions. In researching how a non-coder non-budget-analyst like myself could build a database and visualize the city’s budget, I found OpenSpending and realized, with envy and relief, that what we wanted to build had already been started. The OpenSpending community helped us assess which tools would work best for our particular vision, and developers provided support as we customized code to allow people to comment on and share various views of the budget.

Building a conversation

Early in the design process we agreed that both visualization and conversation are necessary for either to be meaningful. Simply seeing the budget, while absolutely necessary, will not in itself lead to civic engagement or empower people to advocate for different budget priorities. People need a means to ask questions, to share insights, to connect with the people who decide and communicate the budget.

In the coming weeks, we’re adding discussion forums and voting mechanisms, and considering ways to connect people’s questions to answers, whether related to how the budget impacts local communities, open data, visualization, or the idiosyncrasies of the Oakland budget process. We’re also encouraging city officials to participate in discussions on the site, and sharing visualizations with journalists when they can help tell their story.

Now that we have a basic model — visualization, conversation, sharing — the hardest work is to make it relevant and useful to people’s real lives. It means listening to the many communities of Oakland to learn how exactly the budget matters to them, working with the budget office to make that data available, and recruiting people to help build the tool they want to see.

Please create an account to get started.
