You are browsing the archive for WG Economics.

Working Group Stories: Open Design & Hardware, Open Economics

Katelyn Rogers - September 12, 2013 in WG Economics, WG Open Design & Hardware

Last week we published updates from the Public Domain, Open Sustainability and Open Education Working Groups. Look out for more stories next month, to keep up-to-date with the buzzing brilliance of the Network.

Open Design and Hardware Working Group

The Open Design and Hardware working group has celebrated its one-year anniversary and is busy working on new projects and plans for grandeur.

The Working Group is promoting the development of new artistic practices through projects that encourage members to actually produce new works. Initiatives like the Chuff-a-thon competition, a competition to design a Chuff (the OKF mascot), will be done entirely through open design. This competition will be promoted through a series of Physical Mashup workshops organised in collaboration with the Public Domain Remix (the first of these workshops will take place in Paris in the middle of October 2014). The group is also planning the creation of a new digital art journal, Miðli, featuring the work of group members. The journal will be a collaborative digital art magazine that examines the emergent tensions between creativity and openness across physical, machine and social spectrums. Details and planning found here.

Finally (and most excitingly!) the working group is applying for an art grant at Burning Man 2014 in order to disseminate the values of Open Design, Open Hardware, and Open Technology in the midst of the Nevada desert…! To get involved, join the mailing list and follow @ODandH!

Open Economics Working Group

The Open Economics working group has been busy building apps, drafting principles and helping to define what open data means for the economics profession.

Over the course of the last year and a half, the main focus has been defining open data in economics, and this has led to the working group becoming the central point of reference in this field. The Open Economics Principles were released last month, and they have already received widespread endorsements across the economics community, including institutional endorsement from the World Bank’s Data Development Group! In June this year the second of two international workshops organised by the Working Group took place at MIT Sloan in the U.S., with senior academics, funders, data professionals and students. Projects and flagship initiatives were showcased, and participants discussed incentives and reward structures that would encourage economists to share more data.

The Open Economics working group will continue to work with a growing community of graduate students and early career professionals, as we strive to inspire a generation of economic researchers for whom sharing data and code is a natural part of their profession. Please support our work by endorsing the Principles.

Introducing the Open Economics Principles

Velichka Dimitrova - August 7, 2013 in Featured, Open Economics, WG Economics

The Open Economics Working Group would like to introduce the Open Economics Principles, a Statement on Openness of Economic Data and Code. A year and a half ago the Open Economics project began with a mission of becoming a central point of reference and support for those interested in open economic data. In the process of identifying examples and ongoing barriers for opening up data and code for the economics profession, we saw the need to present a statement on the guiding principles of transparency and accountability in economics that would enable replication and scholarly debate as well as access to knowledge as a public good.

We wrote the Statement on the Open Economics Principles during our First and Second Open Economics International Workshops, receiving feedback from our Advisory Panel and community with the aim to emphasise the importance of having open access to data and code by default and address some of the issues around the roles of researchers, journal editors, funders and information professionals.

Second Open Economics International Workshop, June 11-12, 2013

Read the statement below and follow this link to endorse the Principles.

Open Economics Principles

Statement on Openness of Economic Data and Code

Economic research is based on building on, reusing and openly criticising the published body of economic knowledge. Furthermore, empirical economic research and data play a central role for policy-making in many important areas of our economies and societies.

Openness enables and underpins scholarly enquiry and debate, and is crucial in ensuring the reproducibility of economic research and analysis. Thus, for economics to function effectively, and for society to reap the full benefits from economic research, it is therefore essential that economic research results, data and analysis be openly and freely available, wherever possible.

  1. Open by default: by default data in its different stages and formats, program code, experimental instructions and metadata – all of the evidence used by economists to support underlying claims – should be open as per the Open Definition[1], free for anyone to use, reuse and redistribute. Specifically open material should be publicly available and licensed with an appropriate open licence[2].
  2. Privacy and confidentiality: We recognise that there are often cases where for reasons of privacy, national security and commercial confidentiality the full data cannot be made openly available. In such cases researchers should share analysis under the least restrictive terms consistent with legal requirements, abiding by the research ethics and guidelines of their community. This should include opening up non-sensitive data, summary data, metadata and code, and facilitating access if the owner of the original data grants other researchers permission to use the data.
  3. Reward structures and data citation: recognizing the importance of data and code to the discipline, reward structures should be established in order to recognise these scholarly contributions with appropriate credit and citation in an acknowledgement that producing data and code with the documentation that make them reusable by others requires a significant commitment of time and resources. At minimum, all data necessary to understand, assess, or extend conclusions in scholarly work should be cited. Acknowledgements of research funding, traditionally limited to publications, could be extended to research data and contribution of data curators should be recognised.
  4. Data availability: Investigators should share their data by the time of publication of initial results of analyses of the data, except in compelling circumstances. Data relevant to public policy should be shared as quickly and widely as possible. Funders, journals and their editorial boards should put in place and enforce data availability policies requiring data, code and any other relevant information to be made openly available as soon as possible and at latest upon publication. Data should be in a machine-readable format, with well-documented instructions, and distributed through institutions that have demonstrated the capability to provide long-term stewardship and access. This will enable other researchers to replicate empirical results.
  5. Publicly funded data should be open: publicly funded research work that generates or uses data should ensure that the data is open, free to use, reuse and redistribute under an open licence – and specifically, it should not be kept unavailable or sold under a proprietary licence. Funding agencies and organizations disbursing public funds have a central role to play and should establish policies and mandates that support these principles, including appropriate costs for long-term data availability in the funding of research and the evaluation of such policies[3], and independent funding for systematic evaluation of open data policies and use.
  6. Usable and discoverable: as simply making data available may not be sufficient for reusing it, data publishers and repository managers should endeavour to also make the data usable and discoverable by others; for example, documentation, the use of standard code lists, etc., all help make data more interoperable and reusable and submission of the data to standard registries and of common metadata enable greater discoverability.

See Reasons and Background and a link to endorsing the Principles:


2. Open licences for code are those conformant with the Open Source Definition see and open licences for data should be conformant with the open definition, see

3. A good example of an important positive development in this direction from the United States is

Second Open Economics International Workshop

Velichka Dimitrova - June 5, 2013 in Events, Featured, Open Data, Open Economics, WG Economics, Workshop

Next week, on June 11-12, at the MIT Sloan School of Management, the Open Economics Working Group of the Open Knowledge Foundation will gather about 40 economics professors, social scientists, research data professionals, funders, publishers and journal editors for the second Open Economics International Workshop.

The event will follow up on the first workshop held in Cambridge UK and will conclude with agreeing a statement on the Open Economics principles. Some of the speakers include Eric von Hippel, T Wilson Professor of Innovation Management and also Professor of Engineering Systems at MIT, Shaida Badiee, Director of the Development Data Group at the World Bank and champion for the Open Data Initiative, Micah Altman, Director of Research and Head of the Program on Information Science for the MIT Libraries as well as Philip E. Bourne, Professor at the University of California San Diego and Associate Director of the RCSB Protein Data Bank.

The workshop will address topics including:

  • Research data sharing: how and where to share economics and social science research data, enforce data management plans, promote better data management and data use
  • Open and collaborative research: how to create incentives for economists and social scientists to share their research data and methods openly with the academic community
  • Transparent economics: how to achieve greater involvement of the public in the research agenda of economics and social science

The knowledge sharing in economics session will invite a discussion between Joshua Gans, Jeffrey S. Skoll Chair of Technical Innovation and Entrepreneurship at the Rotman School of Management at the University of Toronto and Co-Director of the Research Program on the Economics of Knowledge Contribution and Distribution, John Rust, Professor of Economics at Georgetown University and co-founder of, Gert Wagner, Professor of Economics at the Berlin University of Technology (TUB) and Chairman of the German Census Commission and German Council for Social and Economic Data as well as Daniel Feenberg, Research Associate in the Public Economics program and Director of Information Technology at the National Bureau of Economic Research.

The session on research data sharing will be chaired by Thomas Bourke, Economics Librarian at the European University Institute, and will discuss the efficient sharing of data and how to create and enforce reward structures for researchers who produce and share high quality data, gathering experts from the field including Mercè Crosas, Director of Data Science at the Institute for Quantitative Social Science (IQSS) at Harvard University, Amy Pienta, Acquisitions Director at the Inter-university Consortium for Political and Social Research (ICPSR), Joan Starr, Chair of the Metadata Working Group of DataCite as well as Brian Hole, the founder of the open access academic publisher Ubiquity Press.

Benjamin Mako Hill, researcher and PhD Candidate at MIT and the Berkman Center for Internet and Society at Harvard University, will chair the session on the evolving evidence base of social science. The session will highlight examples of how economists can broaden their perspective on collecting and using data through different means: through mobile data collection, through the web or through crowd-sourcing. It will also consider how to engage the broader community and do more transparent economic research and decision-making. Speakers include Amparo Ballivian, Lead Economist working with the Development Data Group of the World Bank, Michael P. McDonald, Associate Professor at George Mason University and co-principal investigator on the Public Mapping Project, and Pablo de Pedraza, Professor at the University of Salamanca and Chair of Webdatanet.

The morning session on June 12 will gather different stakeholders to discuss how to share responsibility and how to pursue joint action. It will be chaired by Mireille van Eechoud, Professor of Information Law at IViR, and will include short statements by Daniel Goroff, Vice President and Program Director at the Alfred P. Sloan Foundation, Nikos Askitas, Head of Data and Technology at the Institute for the Study of Labor (IZA), Carson Christiano, who leads CEGA’s partnership development efforts and coordinates the Berkeley Initiative for Transparency in the Social Sciences (BITSS), and Jean Roth, the Data Specialist at the National Bureau of Economic Research.

At the end of the workshop the Working Group will discuss the future plans of the project and gather feedback on possible initiatives for translating discussions into concrete action plans. Slides and audio will be available on the website after the workshop. If you have any questions please contact economics [at]

Reinhart-Rogoff Revisited: Why we need open data in economics

Velichka Dimitrova - April 22, 2013 in Open Data, Open Economics, WG Economics


This blog post is cross-posted from the Open Economics Blog.

Another economics scandal made the news last week. Harvard Kennedy School professor Carmen Reinhart and Harvard University professor Kenneth Rogoff argued in their 2010 NBER paper that economic growth slows down when the debt/GDP ratio exceeds the threshold of 90 percent of GDP. These results were also published in one of the most prestigious economics journals – the American Economic Review (AER) – and had a powerful resonance in a period of serious economic and public policy turmoil when governments around the world slashed spending in order to decrease the public deficit and stimulate economic growth.

Yet they were proven wrong. Thomas Herndon, Michael Ash and Robert Pollin from the University of Massachusetts (UMass) tried to replicate the results of Reinhart and Rogoff and criticised them on three grounds:

  • Coding errors: due to a spreadsheet error five countries were excluded completely from the sample, resulting in significant errors in the average real GDP growth and the debt/GDP ratio in several categories
  • Selective exclusion of available data and data gaps: Reinhart and Rogoff exclude Australia (1946-1950), New Zealand (1946-1949) and Canada (1946-1950). This exclusion is alone responsible for a significant reduction of the estimated real GDP growth in the highest public debt/GDP category
  • Unconventional weighting of summary statistics: the authors do not discuss their decision to weight equally by country rather than by country-year, which could be arbitrary and ignores the issue of serial correlation.
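
To see why the weighting choice matters, here is a minimal sketch in Python. The growth figures and country labels are invented for illustration and are not taken from either paper; the function names are our own. It compares averaging by country (each country counts once, however many years it spent in a debt category) with averaging by country-year (each observed year counts once).

```python
from statistics import mean

# Hypothetical growth rates (percent) for the years each country spent in
# one debt/GDP category. Illustrative numbers only, not data from the papers.
growth_by_country = {
    "A": [2.5, 2.4, 2.6, 2.5, 2.5, 2.4, 2.6, 2.5, 2.5, 2.5],  # ten years
    "B": [-7.6],                                              # a single year
}

def country_weighted_mean(data):
    """Average each country's years first, then average the country means."""
    return mean(mean(years) for years in data.values())

def country_year_weighted_mean(data):
    """Pool every country-year observation and take one overall average."""
    return mean(x for years in data.values() for x in years)

if __name__ == "__main__":
    print("weighted by country:     ", round(country_weighted_mean(growth_by_country), 2))
    print("weighted by country-year:", round(country_year_weighted_mean(growth_by_country), 2))
```

With these made-up numbers, one bad year in country B drags the country-weighted average down to about -2.55 percent, while the country-year average stays at about 1.58 percent. Neither scheme is automatically the right one, which is why the choice needs to be stated and justified.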

The implications of these results are that countries with high levels of public debt experience only “modestly diminished” average GDP growth rates and as the UMass authors show there is a wide range of GDP growth performances at every level of public debt among the twenty advanced economies in the survey of Reinhart and Rogoff. Even if the negative trend is still visible in the results of the UMass researchers, the data fits the trend very poorly: “low debt and poor growth, and high debt and strong growth, are both reasonably common outcomes.”

Source: Herndon, T., Ash, M. & Pollin, R., “Does High Public Debt Consistently Stifle Economic Growth? A Critique of Reinhart and Rogoff”, Political Economy Research Institute at University of Massachusetts Amherst, Working Paper Series, April 2013.

What makes it even more compelling news is that it is all a tale from the state of Massachusetts: distinguished Harvard professors (#1 university in the US) challenged by empiricists from the less known UMass (#97 university in the US). Moreover, despite the excellent AER data availability policy – which acts as a role model for other journals in economics – the AER has failed to enforce it and make the data and code of Reinhart and Rogoff available to other researchers.

Coding errors happen, yet the greater research misconduct was not allowing other researchers to review and replicate the results by making the data openly available. Had the data and code been made available upon publication in 2010, it might not have taken three years to prove wrong these results, which may have pushed public policy around the world towards stricter austerity measures. Sharing research data means a possibility to replicate and discuss, enabling the scrutiny of research findings as well as improvement and validation of research methods through more scientific enquiry and debate.

Get in Touch

The Open Economics Working Group advocates the release of datasets and code, along with published academic articles, and provides practical assistance to researchers who would like to do so. Get in touch if you would like to learn more by writing us at economics [at] and signing up to our mailing list.


Herndon, T., Ash, M. & Pollin, R., “Does High Public Debt Consistently Stifle Economic Growth? A Critique of Reinhart and Rogoff”, Political Economy Research Institute at University of Massachusetts Amherst, Working Paper Series, April 2013: Link to paper | Link to data and code

Releasing the Automated Game Play Datasets

Velichka Dimitrova - March 7, 2013 in Open Economics, Our Work, WG Economics


This blog post is cross-posted from the Open Economics Blog.

We are very happy to announce that the Open Economics Working Group is releasing the datasets of the research project “Small Artificial Human Agents for Virtual Economies”, implemented by Professor David Levine and Professor Yixin Chen at Washington University in St. Louis and funded by the National Science Foundation [See dedicated webpage].

The authors of the study have given their permission to publish their data online. We hope that through making this data available online we will aid researchers working in this field. This initiative is motivated by our belief that in order for economic research to be reliable and trusted, it should be possible to reproduce research findings – which is difficult or even impossible without the availability of the data and code. Making material openly available reduces to a minimum the barriers for doing reproducible research.

If you would like to know more, or would like help releasing research data in your field, please contact us at: economics [at]

Project Background

An important requirement for developing better economic policy recommendations is improving the way we validate theories. Originally economics depended on field data from surveys and laboratory experiments. An alternative method of validating theories is through the use of artificial or virtual economies. If a virtual world is an adequate description of a real economy, then a good economic theory ought to be able to predict outcomes in that setting.

An artificial environment offers enormous advantages over the field and laboratory: complete control – for example, over risk aversion and social preferences – and great speed in creating economies and validating theories. In economics, the use of virtual economies can potentially enable us to deal with heterogeneity, with small frictions, and with expectations that are backward looking rather than determined in equilibrium. These are difficult or impractical to combine in existing calibrations or Monte Carlo simulations.

The goal of this project is to build artificial agents by developing computer programs that act like human beings in the laboratory. We focus on the simplest type of problem of interest to economists: one-shot, two-player, simultaneous-move games. The best known of these is the “Prisoner’s Dilemma”, a much-studied scenario in game theory which explores the circumstances of cooperation between two people. In its classic form, the model suggests that two agents who are fully self-interested and rational would always betray each other, even though the best outcome overall would be if they cooperated. However, laboratory humans show a tendency towards cooperation. Our challenge is therefore developing artificial agents who share this bias to the same degree as their human counterparts.
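
The logic of the one-shot Prisoner’s Dilemma can be sketched in a few lines of Python. The payoff numbers below are a common textbook choice and are an assumption on our part, as are the function names; none of this is drawn from the project’s datasets or code. The sketch checks that betrayal (“defect”) is the best reply to either move, even though mutual cooperation yields the higher joint payoff.

```python
# Hypothetical payoffs (higher is better), chosen for illustration only.
# PAYOFFS[(my_move, other_move)] is the payoff to the player choosing my_move.
PAYOFFS = {
    ("cooperate", "cooperate"): 3,
    ("cooperate", "defect"): 0,
    ("defect", "cooperate"): 5,
    ("defect", "defect"): 1,
}
MOVES = ("cooperate", "defect")

def best_response(other_move):
    """Return the move that maximises my payoff against a fixed opposing move."""
    return max(MOVES, key=lambda my_move: PAYOFFS[(my_move, other_move)])

def dominant_strategy():
    """Return a move that is the best response to every opposing move, if one exists."""
    responses = {best_response(other) for other in MOVES}
    return responses.pop() if len(responses) == 1 else None

def joint_payoff(move_a, move_b):
    """Sum of both players' payoffs for a pair of moves."""
    return PAYOFFS[(move_a, move_b)] + PAYOFFS[(move_b, move_a)]

if __name__ == "__main__":
    print("dominant strategy:", dominant_strategy())
    print("joint payoff, both cooperate:", joint_payoff("cooperate", "cooperate"))
    print("joint payoff, both defect:   ", joint_payoff("defect", "defect"))
```

A purely self-interested program would always play the dominant strategy. An artificial agent that mimics laboratory subjects would instead have to cooperate at roughly the rate people do, which is the bias the project sets out to reproduce.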

There is a wide variety of existing published data on laboratory behavior that will be the primary testing ground for the computer programs. As the project progresses, the programs will be challenged to see if they adapt themselves to changes in the rules in the same ways as human agents: for example, if payments are changed in a certain way, the computer programs will play differently: do people do the same? In some cases we may be able to answer these questions with data from existing studies; in others we will need to conduct our own experimental studies.

Find the full list of available datasets here

Preregistration in the Social Sciences: A Controversy and Available Resources

James Monogan - February 20, 2013 in Open Economics, WG Economics

This blog post is cross-posted from the Open Economics Blog.

For years now, the practice of preregistering clinical trials has worked to reduce publication bias dramatically (Drummond Rennie offers more details). Trying to build on this trend for transparency, the Open Knowledge Foundation, which runs the Open Economics Working Group, has expressed support for All Trials Registered, All Results Reported. This initiative argues that all clinical trial results should be reported because the spread of this free information will reduce bad treatment decisions in the future and allow others to find missed opportunities for good treatments. The idea of preregistration, therefore, has proved valuable for the medical profession.

In a similar push for openness, a debate now is emerging about the merits of preregistration in the social sciences. Specifically, could social scientific disciplines benefit from investigators’ committing themselves to a research design before the observation of their outcome variable? The winter 2013 issue of Political Analysis takes up this issue with a symposium on research registration, wherein two articles make a case in favor of preregistration, and three responses offer alternate views on this controversy.

There has been a trend for transparency in social research: Many journals now require authors to release public replication data as a condition for publication. Additionally, public funding agencies such as the U.S. National Science Foundation require public release of data as a condition for funding. This push for additional transparency allows for other researchers to conduct secondary analyses that may build on past results and also allows empirical findings to be subjected to scrutiny as new theory, data, and methods emerge. Preregistering a research design is a natural next step in this transparency process as it would allow readers, including other scholars, to gain a sense of how the project was developed and how the researcher made tough design choices.

Another advantage of preregistering a research design is it can curb the prospects of publication bias. Gerber & Malhotra observe that papers produced in print tend to have a higher rate of positive results in hypothesis tests than should be expected. Registration has the potential to curb publication bias, or at least its negative consequences. Even if committing oneself to a research design does not change the prospect for publishing an article in the traditional format, it would signal to the larger audience that a study was developed and that a publication never emerged. This would allow the scholarly community at large to investigate further, perhaps reanalyze data that were not published in print, and if nothing else get a sense of how preponderant null findings are for commonly-tested hypotheses. Also, if more researchers tie their hands in a registration phase, then there is less room for activities that might push a result over a common significance threshold.

To illustrate how preregistration can be useful, my article in this issue of Political Analysis analyzes the effect of Republican candidates’ position on the immigration issue on their share of the two-party vote in 2010 elections for the U.S. House of Representatives. In this analysis, I hypothesized that Republican candidates may have been able to garner additional electoral support by taking a harsh stand on the issue. I designed my model to estimate the effect on vote share of taking a harsher stand on immigration, holding the propensity of taking a harsh stand constant. This propensity was based on other factors known to shape election outcomes, such as district ideology, incumbency, campaign finances, and previous vote share. I crafted my design before votes were counted in the 2010 election and publicly posted it to the Society for Political Methodology’s website as a way of committing myself to this design.


In the figure, the horizontal axis represents values that the propensity scores for harsh rhetoric could take. The tick marks along the base of the graph indicate actual values in the data of the propensity for harsh rhetoric. The vertical axis represents the expected change in the proportion of the two-party vote for moving from a welcoming position to a hostile position. The figure shows a solid black line, which indicates my estimate of the effect of a Republican’s taking a harsh stand on immigration on his or her proportion of the two-party vote. The two dashed black lines indicate the uncertainty in this estimate of the treatment effect. As can be seen, the estimated effects come with considerable uncertainty, and I can never reject the prospect of a zero effect.

However, a determined researcher could have tried alternate specifications until a discernible result emerged. The figure also shows a red line representing the estimated treatment effect from a simpler model that also omits the effect of how liberal or conservative the district is. The dotted red lines represent the uncertainty in this estimate. As can be seen, this reports a uniform treatment effect of 0.079 that is discernible from zero. After “fishing” with the model specification, a researcher could have manufactured a result suggesting that Republican candidates could boost their share of the vote by 7.9 percentage points by moving from a welcoming to a hostile stand on immigration! Such a result would be misleading because it overlooks district ideology. Whenever investigators commit themselves to a research design, this reduces the prospect of fishing after observing the outcome variable.
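
The mechanism behind this kind of misleading result can be sketched with a toy example in Python. To be clear about the assumptions: the data below are invented, the model is a plain linear regression rather than the propensity-score design used in the article, and the function names are hypothetical. In the toy data, vote share depends only on district ideology, yet candidates in more conservative districts are more likely to take a harsh stand, so leaving ideology out manufactures an apparent effect.

```python
from statistics import mean

def slope(x, y):
    """Ordinary least squares slope from regressing y on x (with an intercept)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def residuals(x, y):
    """Residuals of y after regressing it on x (with an intercept)."""
    b = slope(x, y)
    a = mean(y) - b * mean(x)
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]

def controlled_slope(treatment, control, outcome):
    """Coefficient on treatment when control is also in the regression.

    Uses the Frisch-Waugh-Lovell result: partial the control out of the
    treatment, then regress the outcome on what is left.
    """
    return slope(residuals(control, treatment), outcome)

# Invented districts: ideology runs from 0 (liberal) to 1 (conservative).
ideology = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0, 0.3, 0.7]
harsh = [0, 0, 1, 0, 1, 1, 0, 1]  # 1 = harsh stand on immigration
# Vote share depends on ideology only; the true effect of a harsh stand is zero.
vote_share = [0.40 + 0.10 * z for z in ideology]

if __name__ == "__main__":
    print("omitting ideology:   ", round(slope(harsh, vote_share), 3))
    print("controlling ideology:", round(controlled_slope(harsh, ideology, vote_share), 3))
```

In this toy setup the specification that omits ideology reports an effect of about 0.045, while the one that controls for ideology recovers the true effect of zero. A preregistered design fixes which of the two is reported before the outcome is known.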

I hope to have illustrated the usefulness of preregistration and hope the idea will spread. Currently, though, there is no comprehensive study registry in the social sciences. However, several proto-registries are available to researchers. All of these registries offer the opportunity for self-registration, wherein the scholar can commit himself or herself to a design as a later signal to readers, reviewers, and editors.

In particular, any researcher from any discipline who is interested in self-registering a study is welcome to take advantage of the Political Science Registered Studies Dataverse. This dataverse is a fully-automated resource that allows researchers to upload design information, pre-outcome data, and any preliminary code. Uploaded designs will be publicized via a variety of free media. List members are welcome to subscribe to any of these announcement services, which are linked in the header of the dataverse page.

Besides this automated system, there are also a few other proto-registries of note:

  • The EGAP (Experiments in Governance and Politics) website has a registration tool that now accepts and posts detailed preanalysis plans. In instances when designs are sensitive, EGAP offers the service of accepting and archiving sensitive plans with an agreed trigger for posting them publicly.
  • J-PAL: The Abdul Latif Jameel Poverty Action Lab has been hosting a hypothesis registry since 2009. This registry is for pre-analysis plans of researchers working on randomized controlled trials, which may be submitted before data analysis begins.
  • The American Political Science Association’s Experimental Research Section hosts a registry for experiments at its website. (Please note, however, that the website currently is down for maintenance.)

Open Research Data Handbook Sprint

Velichka Dimitrova - February 15, 2013 in Open Access, Open Content, Open Data, Open Economics, Open Science, Open Standards, Our Work, WG Economics

On February 15-16 we are updating the Open Research Data Handbook to include more detail on sharing research data from scientific work, and to remix the book for different disciplines and settings. We’re doing this through an open book sprint. The sprint will happen at the Open Data Institute, 65 Clifton Street, London EC2A 4JE.

The Friday lunch seminar will be streamed through the Open Economics Bambuser channel. If you would like to participate, please see the Online Participation Hub for links to documents and programme updates. You can follow this event on the IRC channel #okfn-rbook and on Twitter with the hashtags #openresearch and #okfnrbook.

The Open Research Data Handbook aims to provide an introduction to the processes, tools and other areas that researchers need to consider to make their research data openly available.

Join us for a book sprint to develop the current draft, and explore ways to remix it for different disciplines and contexts.

Who it is for:

  • Researchers interested in carrying out their work in more open ways
  • Experts on sharing research and research data
  • Writers and copy editors
  • Web developers and designers to help present the handbook online
  • Anyone else interested in taking part in an intense and collaborative weekend of action

What will happen:

The main sprint will take place on Friday and Saturday. After initial discussions we’ll divide into open space groups to focus on research, writing and editing for different chapters of the handbook, developing a range of content including How To guidance, stories of impact, collections of links and decision tools.

A group will also look at digital tools for presenting the handbook online, including ways to easily tag content for different audiences and remix the guide for different contexts.


Where: 65 Clifton Street, EC2A 4JE (3rd floor – the Open Data Institute)

Friday, February 15th

  • 13:00 – 13:30: Arrival and sushi lunch
  • 13:30 – 14:30: Open research data seminar with Steven Hill, Head of Open Data Dialogue at RCUK.
  • 14:30 – 17:30: Working in teams

Saturday, February 16th

  • 10:00 – 10:30: Arrival and coffee
  • 10:30 – 11:30: Introducing open research lightning talks (your space to present your project on research data)
  • 11:30 – 13:30: Working in teams
  • 13:30 – 14:30: Lunch
  • 14:30 – 17:30: Working in teams
  • 17:30 – 18:30: Reporting back

As many have already registered for online participation, we will broadcast the lunch seminar through the Open Economics Bambuser channel. Please drop by the IRC channel #okfn-rbook


OKF Open Science Working Group – creators of the current Open Research Data Handbook
OKF Open Economics Working Group – exploring economics aspects of open research
Open Data Research Network – exploring a remix of the handbook to support open social science research in a new global research network, focussed on research in the Global South.
Open Data Institute – hosting the event

Sovereign Credit Risk: An Open Database

Marc Joffe - January 31, 2013 in External, Featured, Open Data, Open Economics, WG Economics

This blog post is cross-posted from the Open Economics Blog. Sign up to the Open Economics mailing list for regular updates.

Throughout the Eurozone, credit rating agencies have been under attack for their lack of transparency and for their pro-cyclical sovereign rating actions. In the humble belief that the crowd can outperform the credit rating oracles, we are introducing an open database of historical sovereign risk data. It is available online, where community members can both view and edit the data. Once the quality of this data is sufficient, the data set can be used to create unbiased, transparent models of sovereign credit risk.

The database contains central government revenue, expenditure, public debt and interest costs from the 19th century through 2011 – along with crisis indicators taken from Reinhart and Rogoff’s public database.


Why This Database?

Prior to the appearance of This Time is Different, discussions of sovereign credit more often revolved around political and trade-related factors. Reinhart and Rogoff have more appropriately focused the discussion on debt sustainability. As with individual and corporate debt, government debt becomes more risky as a government’s debt burden increases. While intuitively obvious, this truth too often gets lost among the multitude of criteria listed by rating agencies and within the politically charged fiscal policy debate.

In addition to emphasizing the importance of debt sustainability, Reinhart and Rogoff showed the virtues of considering a longer history of sovereign debt crises. As they state in their preface:

“Above all, our emphasis is on looking at long spans of history to catch sight of ’rare’ events that are all too often forgotten, although they turn out to be far more common and similar than people seem to think. Indeed, analysts, policy makers, and even academic economists have an unfortunate tendency to view recent experience through the narrow window opened by standard data sets, typically based on a narrow range of experience in terms of countries and time periods. A large fraction of the academic and policy literature on debt and default draws conclusions on data collected since 1980, in no small part because such data are the most readily accessible. This approach would be fine except for the fact that financial crises have much longer cycles, and a data set that covers twenty-five years simply cannot give one an adequate perspective…”

Reinhart and Rogoff greatly advanced what had been an innumerate conversation about public debt, by compiling, analyzing and promulgating a database containing a long time series of sovereign data. Their metric for analyzing debt sustainability – the ratio of general government debt to GDP – has now become a central focus of analysis.

We see this as a mixed blessing. While the general government debt to GDP ratio properly relates sovereign debt to the ability of the underlying economy to support it, the metric has three important limitations.

First, the use of a general government indicator can be misleading. General government debt refers to the aggregate borrowing of the sovereign and the country’s state, provincial and local governments. If a highly indebted local government – like Jefferson County, Alabama, USA – can default without being bailed out by the central government, it is hard to see why that local issuer’s debt should be included in the numerator of a sovereign risk metric. A counter to this argument is that the United States is almost unique in that it doesn’t guarantee sub-sovereign debts. But, clearly neither the rating agencies nor the market believe that these guarantees are ironclad: otherwise all sub-sovereign debt would carry the sovereign rating and there would be no spread between sovereign and sub-sovereign bonds – other than perhaps a small differential to accommodate liquidity concerns and transaction costs.

Second, governments vary in their ability to harvest tax revenue from their economic base. For example, the Greek and US governments are less capable of realizing revenue from a given amount of economic activity than a Scandinavian sovereign. Widespread tax evasion (as in Greece) or political barriers to tax increases (as in the US) can limit a government’s ability to raise revenue. Thus, government revenue may be a better metric than GDP for gauging a sovereign’s ability to service its debt.

Finally, the stock of debt is not the best measure of its burden. Countries that face comparatively low interest rates can sustain higher levels of debt. For example, the United Kingdom avoided default despite a debt/GDP ratio of roughly 250% at the end of World War II. The amount of interest a sovereign must pay on its debt each year may thus be a better indicator of debt burden.

Our new database attempts to address these concerns by layering central government revenue, expenditure and interest data on top of the statistics Reinhart and Rogoff previously published.
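To make the contrast between these metrics concrete, here is a minimal Python sketch. The `FiscalYear` record, its field names and the two sets of figures are hypothetical illustrations, not values or field names taken from the database; the point is only that a sovereign with the lower debt/GDP ratio can still carry the heavier interest burden relative to revenue.

```python
from dataclasses import dataclass


@dataclass
class FiscalYear:
    """One country-year of fiscal data (hypothetical layout, same currency unit throughout)."""
    gdp: float
    debt: float
    revenue: float
    interest: float


def debt_to_gdp(fy: FiscalYear) -> float:
    # The stock-of-debt metric discussed above.
    return fy.debt / fy.gdp


def debt_to_revenue(fy: FiscalYear) -> float:
    # Replaces GDP with the revenue the government actually collects.
    return fy.debt / fy.revenue


def interest_to_revenue(fy: FiscalYear) -> float:
    # Share of annual revenue consumed by interest payments.
    return fy.interest / fy.revenue


# Made-up figures for illustration only.
low_rate_borrower = FiscalYear(gdp=1000.0, debt=1200.0, revenue=400.0, interest=24.0)
high_rate_borrower = FiscalYear(gdp=1000.0, debt=800.0, revenue=200.0, interest=64.0)

for name, fy in [("low-rate", low_rate_borrower), ("high-rate", high_rate_borrower)]:
    print(
        f"{name}: debt/GDP={debt_to_gdp(fy):.2f}, "
        f"debt/revenue={debt_to_revenue(fy):.2f}, "
        f"interest/revenue={interest_to_revenue(fy):.2f}"
    )
```

In this toy comparison the first borrower looks worse on debt/GDP, while the second looks worse on both revenue-based ratios.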

A Public Resource Requiring Public Input

Unlike many financial data sets, this compilation is being offered free of charge and without a registration requirement. It is offered in the hope that it, too, will advance our understanding of sovereign credit risk.

The database contains a large number of data points and we have made efforts to quality control the information. That said, there are substantial gaps, inconsistencies and inaccuracies in the data we are publishing.

Our goal in releasing the database is to encourage a mass collaboration process directed at enhancing the information. Just as Wikipedia articles asymptotically approach perfection through participation by the crowd, we hope that this database can be cleansed by its user community. There are tens of thousands of economists, historians, fiscal researchers and concerned citizens around the world who are capable of improving this data, and we hope that they will find us.

To encourage participation, we have added Wiki-style capabilities to the user interface. Users who wish to make changes can log in with an OpenID and edit individual data points. They can also enter comments to explain their changes. User changes are stored in an audit trail, which moderators will periodically review – accepting only those that can be verified while rolling back others.

This design leverages the trigger functionality of MySQL to build a database audit trail that moderators can view and edit. We have thus married the collaborative strengths of a Wiki to the structure of a relational database. Maintaining a consistent structure is crucial for a dataset like this because it must ultimately be analyzed by a statistical tool such as R.
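The trigger-based audit trail can be sketched roughly as follows. This is not the project's code: it swaps in SQLite (through Python's built-in `sqlite3` module) for MySQL so that it runs without a server, and the table names, column names and helper functions are all assumptions made for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL instance described above.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical schema: one row per country/year/series data point.
    CREATE TABLE data_point (
        id      INTEGER PRIMARY KEY,
        country TEXT NOT NULL,
        year    INTEGER NOT NULL,
        series  TEXT NOT NULL,
        value   REAL
    );

    -- Hypothetical audit table that moderators would review.
    CREATE TABLE audit_trail (
        id            INTEGER PRIMARY KEY AUTOINCREMENT,
        data_point_id INTEGER NOT NULL,
        old_value     REAL,
        new_value     REAL,
        changed_at    TEXT DEFAULT CURRENT_TIMESTAMP,
        status        TEXT DEFAULT 'pending'
    );

    -- The trigger records every change to a value, whoever makes it.
    CREATE TRIGGER log_value_change
    AFTER UPDATE OF value ON data_point
    FOR EACH ROW
    WHEN OLD.value IS NOT NEW.value
    BEGIN
        INSERT INTO audit_trail (data_point_id, old_value, new_value)
        VALUES (OLD.id, OLD.value, NEW.value);
    END;
    """
)


def edit_value(conn, point_id, new_value):
    """A user edit: a plain UPDATE; the trigger writes the audit row."""
    conn.execute("UPDATE data_point SET value = ? WHERE id = ?", (new_value, point_id))


def roll_back(conn, audit_id):
    """A moderator rejects an edit: restore the old value and mark the audit row.

    The restoring UPDATE fires the trigger too, so the rollback is itself logged.
    """
    point_id, old_value = conn.execute(
        "SELECT data_point_id, old_value FROM audit_trail WHERE id = ?", (audit_id,)
    ).fetchone()
    conn.execute("UPDATE data_point SET value = ? WHERE id = ?", (old_value, point_id))
    conn.execute("UPDATE audit_trail SET status = 'rolled_back' WHERE id = ?", (audit_id,))
```

Because the logging lives in the database rather than in application code, every edit path ends up in the same audit table, which is what lets the data keep a fixed relational structure while still being edited Wiki-style.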

The unique approach to editing database fields Wiki-style was developed by my colleague, Vadim Ivlev. Vadim will contribute the underlying Python, JavaScript and MySQL code to a public GitHub repository in a few days.

Implications for Sovereign Ratings

Once the dataset reaches an acceptable quality level, it can be used to support logit or probit analysis of sovereign defaults. Our belief – based on case study evidence at the sovereign level and statistical modeling of US sub-sovereigns – is that the ratio of interest expense to revenue and annual revenue change are statistically significant predictors of default. We await confirmation or refutation of this thesis from the data set. If statistically significant indicators are found, it will be possible to build a predictive model of sovereign default that could be hosted by our partners at Wikirating. The result, we hope, will be a credible, transparent and collaborative alternative to the credit ratings status quo.
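For readers unfamiliar with logit analysis, the following is a minimal pure-Python sketch of the idea, using the two candidate predictors named above. The observations are made-up toy numbers, not drawn from the database, and the plain gradient-ascent fit is a stand-in for what a statistical tool such as R would do; the function names are likewise hypothetical.

```python
import math


def sigmoid(z):
    # Logistic function mapping any real number into (0, 1).
    return 1.0 / (1.0 + math.exp(-z))


def default_probability(coeffs, interest_to_revenue, revenue_change):
    """Logit model: P(default) = sigmoid(b0 + b1 * interest/revenue + b2 * revenue change)."""
    b0, b1, b2 = coeffs
    return sigmoid(b0 + b1 * interest_to_revenue + b2 * revenue_change)


def fit_logit(rows, labels, steps=2000, learning_rate=0.5):
    """Fit the three coefficients by gradient ascent on the average log-likelihood."""
    coeffs = [0.0, 0.0, 0.0]
    n = len(rows)
    for _ in range(steps):
        grad = [0.0, 0.0, 0.0]
        for (x1, x2), y in zip(rows, labels):
            err = y - default_probability(coeffs, x1, x2)
            grad[0] += err
            grad[1] += err * x1
            grad[2] += err * x2
        coeffs = [c + learning_rate * g / n for c, g in zip(coeffs, grad)]
    return coeffs


# Made-up toy observations: (interest/revenue, annual revenue change); 1 = default.
TOY_ROWS = [
    (0.05, 0.04), (0.08, 0.02), (0.10, 0.03), (0.30, -0.03),
    (0.12, -0.02), (0.35, -0.10), (0.40, -0.08), (0.25, 0.01),
]
TOY_LABELS = [0, 0, 0, 0, 0, 1, 1, 1]

coeffs = fit_logit(TOY_ROWS, TOY_LABELS)
print("fitted coefficients:", coeffs)
print("P(default) at interest/revenue=0.20, flat revenue:",
      default_probability(coeffs, 0.20, 0.0))
```

With real data, the interesting question is whether the fitted coefficients on the two predictors are statistically distinguishable from zero, which this sketch does not test.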

Sources and Acknowledgements

Aside from the data set provided by Reinhart and Rogoff, we also relied heavily upon the Center for Financial Stability’s Historical Financial Statistics. The goal of HFS is “to be a source of comprehensive, authoritative, easy-to-use macroeconomic data stretching back several centuries.” This ambitious effort includes data on exchange rates, prices, interest rates, national income accounts and population in addition to government finance statistics. Kurt Schuler, the project leader for HFS, generously offered numerous suggestions about data sources as well as connections to other researchers who gave us advice.

Other key international data sources used in compiling the database were:

  • International Monetary Fund’s Government Finance Statistics
  • Eurostat
  • UN Statistical Yearbook
  • League of Nations’ Statistical Yearbook
  • B. R. Mitchell’s International Historical Statistics, Various Editions, London: Palgrave Macmillan.
  • Almanach de Gotha
  • The Statesman’s Year Book
  • Corporation of Foreign Bondholders Annual Reports
  • Statistical Abstract for the Principal and Other Foreign Countries
  • For several countries, we were able to obtain nation-specific time series from finance ministry or national statistical service websites.

We would also like to thank Dr. John Gerring of Boston University and Co-Director of the CLIO World Tables project, for sharing data and providing further leads as well as Dr. Joshua Greene, author of Public Finance: An International Perspective, for alerting us to the IMF Library in Washington, DC.

A number of researchers and developers played valuable roles in compiling the data and placing it online. We would especially like to thank Charles Tian, T. Wayne Pugh, Amir Muhammed, Anshul Gupta and Vadim Ivlev, as well as Karthick Palaniappan and his colleagues at H-Garb Informatix in Chennai, India for their contributions.

Finally, we would like to thank the National University of Singapore’s Risk Management Institute for the generous grant that made this work possible.

First Open Economics International Workshop Recap

Velichka Dimitrova - January 28, 2013 in Access to Information, Events, Featured, Open Access, Open Data, Open Economics, Open Standards, Our Work, WG Economics, Workshop

The first Open Economics International Workshop gathered 40 academic economists, data publishers and funders of economics research, researchers and practitioners for a two-day event at Emmanuel College in Cambridge, UK. The aim of the workshop was to build an understanding around the value of open data and open tools for the Economics profession and the obstacles to opening up information, as well as the role of greater openness of the academy. This event was organised by the Open Knowledge Foundation and the Centre for Intellectual Property and Information Law and was supported by the Alfred P. Sloan Foundation. Audio and slides are available at the event’s webpage.


Setting the Scene

The Setting the Scene session was about giving a bit of context to “Open Economics” in the knowledge society, seeing also examples from outside of the discipline and discussing reproducible research. Rufus Pollock (Open Knowledge Foundation) emphasised that there is necessary change and substantial potential for economics: 1) open “core” economic data outside the academy, 2) open as default for data in the academy, 3) a real growth in citizen economics and outside participation. Daniel Goroff (Alfred P. Sloan Foundation) drew attention to the work of the Alfred P. Sloan Foundation in emphasising the importance of knowledge and its use for making decisions and data and knowledge as a non-rival, non-excludable public good. Tim Hubbard (Wellcome Trust Sanger Institute) spoke about the potential of large-scale data collection around individuals for improving healthcare and how centralised global repositories work in the field of bioinformatics. Victoria Stodden (Columbia University / RunMyCode) stressed the importance of reproducibility for economic research and as an essential part of scientific methodology and presented the RunMyCode project.

Open Data in Economics

The Open Data in Economics session was chaired by Christian Zimmermann (Federal Reserve Bank of St. Louis / RePEc) and covered several projects and ideas from various institutions. The session examined examples of open data in Economics and sought to discover whether these examples are sustainable and can be implemented in other contexts: whether the right incentives exist. Paul David (Stanford University / SIEPR) characterised the open science system as a system which is better than any other in the rapid accumulation of reliable knowledge, whereas the proprietary systems are very good at extracting the rent from the existing knowledge. A balance between these two systems should be established so that they can work within the same organisational system since separately they are distinctly suboptimal. Johannes Kiess (World Bank) underlined that having the data available is often not enough: “It is really important to teach people how to understand these datasets: data journalists, NGOs, citizens, coders, etc.”. The World Bank has implemented projects to incentivise the use of the data and is helping countries to open up their data. For economists, he mentioned, having a valuable dataset to publish on is an important asset, and there are therefore not sufficient incentives for sharing.

Eustáquio J. Reis (Institute of Applied Economic Research – Ipea) related his experience of establishing the Ipea statistical database and other projects for historical data series and data digitalisation in Brazil. He shared that the culture of the economics community is not a culture of collaboration where people willingly share or support and encourage data curation. Sven Vlaeminck (ZBW – Leibniz Information Centre for Economics) spoke about the EDaWaX project, which conducted a study of the data availability of economics journals and will establish a publication-related data archive for an economics journal in Germany.

Legal, Cultural and other Barriers to Information Sharing in Economics

The session presented different impediments to the disclosure of data in economics from the perspective of two lawyers and two economists. Lionel Bently (University of Cambridge / CIPIL) drew attention to the fact that there is a whole range of different legal mechanisms which operate to restrict the dissemination of information, yet on the other hand there is also a range of mechanisms which help to make information available. Lionel questioned whether the open data standard would always be the optimal way to produce high quality economic research or whether there is also a place for modulated/intermediate positions where data is available only on conditions, or only in certain parts or for certain forms of use. Mireille van Eechoud (Institute for Information Law) described the EU Public Sector Information Directive – the most generic document related to open government data – and progress made in opening up information published by the government. Mireille also pointed out that legal norms have only limited value if you don’t have the internalised, cultural attitudes and structures in place that really make more access to information work.

David Newbery (University of Cambridge) presented an example from the electricity markets and insisted that for a good supply of data, informed demand is needed, coming from regulators who are charged to monitor markets, detect abuse, uphold fair competition and defend consumers. John Rust (Georgetown University) said that the government is an important provider of data which is otherwise too costly to collect, yet a number of issues exist including confidentiality, excessive bureaucratic caution and the public finance crisis. There are also a lot of opportunities for research in the private sector, where some part of the data can be made available (redacting confidential information), and the public non-profit sector can also have a tremendous role as a force to organise markets for the better, set standards and focus on targeted domains.

Current Data Deposits and Releases – Mandating Open Data?

The session was chaired by Daniel Goroff (Alfred P. Sloan Foundation) and brought together funders and publishers to discuss their role in requiring data from economic research to be publicly available and the importance of dissemination for publishing.

Albert Bravo-Biosca (NESTA) emphasised that mandating open data begins much earlier in the process, where funders can encourage the collection of particular data by the government which is the basis for research and can also act as an intermediary for the release of open data by the private sector. Open data is interesting but it is even more interesting when it is appropriately linked and combined with other data, and there is value in examples and case studies for demonstrating benefits. There should, however, be caution, as opening up some data might result in less data being collected.

Toby Green (OECD Publishing) made a point of the difference between posting and publishing, where making content available does not always mean that it would be accessible, discoverable, usable and understandable. In his view, the challenge is to build up an audience by putting content where people would find it, which is very costly as proper dissemination is expensive. Nancy Lutz (National Science Foundation) explained the scope and workings of the NSF and the data management plans required from all economists who are applying for funding. Creating and maintaining data infrastructure and compliance with the data management policy might eventually mean that there would be less funding for other economic research.

Trends of Greater Participation and Growing Horizons in Economics

Chris Taggart (OpenCorporates) chaired the session, which introduced different ways of participating and using data, different audiences and contributors. He stressed that data is being collected in new ways and by different communities; that access to data can be an enormous privilege and can generate data gravities with very unequal access and power to make use of and to generate more data; and that analysis is sometimes being done in new and unexpected ways and by unexpected contributors. Michael McDonald (George Mason University) related how the highly politicised process of drawing up district lines in the U.S. (also called Gerrymandering) could be done in a much more transparent way through an open-source re-districting process with meaningful participation allowing for an open conversation about public policy. Michael also underlined the importance of common data formats and told a cautionary tale about a group of academics misusing open data with a political agenda to encourage a storyline that a candidate would win a particular state.

Hans-Peter Brunner (Asian Development Bank) shared a vision about how open data and open analysis can aid in decision-making about investments in infrastructure, connectivity and policy. Simulated models about investments can demonstrate different scenarios according to investment priorities and crowd-sourced ideas. Hans-Peter asked for feedback and input on how to make data and code available. Perry Walker (new economics foundation) spoke about the conversation and that a good conversation has to be designed as it usually doesn’t happen by accident. Rufus Pollock (Open Knowledge Foundation) concluded with examples about citizen economics and the growth of contributions from the wider public, particularly through volunteer computing and volunteer thinking as a way of getting engaged in research.

During two sessions, the workshop participants also worked on a Statement on the Open Economics Principles, which will be revised with further input from the community and made public at the second Open Economics workshop, taking place on 11-12 June in Cambridge, MA.

Open Research Data Handbook Sprint – 15-16 February

Velichka Dimitrova - January 16, 2013 in Events, Featured, Open Data Handbook, Open Economics, Open Science, Open Standards, Sprint / Hackday, WG Development, WG Economics, WG Open Bibliographic Data, WG Open Data in Science

On February 15-16, the Open Research Data Handbook Sprint will happen at the Open Data Institute, 65 Clifton Street, London EC2A 4JE.

The Open Research Data Handbook aims to provide an introduction to the processes, tools and other areas that researchers need to consider to make their research data openly available.

Join us for a book sprint to develop the current draft, and explore ways to remix it for different disciplines and contexts.

Who it is for:

  • Researchers interested in carrying out their work in more open ways
  • Experts on sharing research and research data
  • Writers and copy editors
  • Web developers and designers to help present the handbook online
  • Anyone else interested in taking part in an intense and collaborative weekend of action

Register at Eventbrite

What will happen:

The main sprint will take place on Friday and Saturday. After initial discussions we’ll divide into open space groups to focus on research, writing and editing for different chapters of the handbook, developing a range of content including How To guidance, stories of impact, collections of links and decision tools.

A group will also look at digital tools for presenting the handbook online, including ways to easily tag content for different audiences and remix the guide for different contexts.


Week before & after:

  • Calling for online contributions and reviews


Friday, February 15th:

  • Seminar or bring your own lunch on open research data.
  • From 2pm: planning and initial work on the handbook in small teams (optional)


Saturday, February 16th:

  • 10:00 – 10:30: Arrival and coffee
  • 10:30 – 11:30: Introducing open research – lightning talks
  • 11:30 – 13:30: Forming teams and starting sprint. Groups on:
    • Writing chapters
    • Decision tools
    • Building website & framework for book
    • Remixing guide for particular contexts
  • 13:30 – 14:30: Lunch
  • 14:30 – 16:30: Working in teams
  • 17:30 – 18:30: Report back
  • 18:30 – …… : Pub


OKF Open Science Working Group – creators of the current Open Research Data Handbook
OKF Open Economics Working Group – exploring economics aspects of open research
Open Data Research Network – exploring a remix of the handbook to support open social science research in a new global research network, focussed on research in the Global South.
Open Data Institute – hosting the event
