This is a guest post by Ivan Begtin, Ambassador for Open Knowledge in Russia and co-founder of the Russian Local Group.
Dear friends, the end of 2014 and the beginning of 2015 have been marked by an event that is terrific news for everyone interested in working with open data, taking part in challenges for app developers, or following the Open Data Movement in general. I’m also sure, by the way, that history lovers will find it particularly fascinating to get involved.
On 23 December 2014, the Russian Ministry of Finance, together with the NGO Infoculture, launched BudgetApps, an app developers’ challenge based on the open data that the Ministry of Finance has published over the past several years. There are a number of datasets, including budget data, registries of audit organisations, public debt, the national reserve and many other kinds of data.
As it happens, I have joined the jury, so I won’t be able to participate. Still, let me provide some details about the initiative.
All the published data can be found on the Ministry website. Many budget datasets are also available on the Single Web Portal of the Russian Federation Budget System, including the budget structure in CSV format, the data itself, reference books and many other instructive details. Data on all official institutions are published at bus.gov.ru. This resource is particularly interesting because it contains indicators, budgets, statutes and numerous other characteristics of each state organisation and municipal institution in Russia. Such data would be invaluable for anyone considering a regional data-based project.
One of the challenge requirements is that submitted projects must be based on the data published by the Ministry of Finance. However, this does not mean that participants cannot use data from other sources alongside the Ministry data. In fact, app developers are expected to combine several data sources in their projects.
To my mind, participants should not even restrict themselves to machine-readable data, because human-readable data are also available and can be converted to open data formats.
Many potential participants know how to write parsers on their own. For those who have never done so, there are great resources such as ScraperWiki, which can help with scraping web pages. There are also various tools for analysing Excel files or extracting spreadsheets from PDF documents (for instance, PDFtables, ABBYY FineReader or other ABBYY services).
Moreover, other websites of the Ministry of Finance hold a lot of interesting information that can be converted to data, including news items that have recently become especially relevant for the Russian audience.
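As a small illustration of what such a parser might involve, here is a minimal sketch in Python. It assumes a semicolon-delimited export in which amounts are written the way people often type them in Russian documents, with spaces as thousands separators and a decimal comma. The column names `category` and `amount`, the helper names and the sample rows are all hypothetical and do not come from any Ministry dataset.

```python
import csv
import io


def parse_amount(text):
    """Convert a human-formatted number such as '1 234 567,8' to a float."""
    # Drop ordinary and non-breaking spaces, then switch to a decimal point.
    cleaned = text.replace("\u00a0", "").replace(" ", "").replace(",", ".")
    return float(cleaned)


def normalise_rows(raw_text, delimiter=";"):
    """Read delimited text and return a list of dicts with numeric amounts."""
    reader = csv.DictReader(io.StringIO(raw_text), delimiter=delimiter)
    rows = []
    for row in reader:
        rows.append({
            "category": row["category"].strip(),
            "amount": parse_amount(row["amount"]),
        })
    return rows


# Invented sample for illustration only; these are not real budget figures.
SAMPLE = "category;amount\nEducation; 1 234,5\nHealth;2 000\n"

if __name__ == "__main__":
    for row in normalise_rows(SAMPLE):
        print(row)
```

Real files will probably need more care (encodings, merged header rows, footnote markers inside cells), but the basic idea of cleaning each cell before converting it stays the same.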
One large and important strand of open data work has long been missing in Russia: publishing open historical data that are kept in archives as large paper reference volumes containing myriad tables. Such data are practically indispensable when we turn to history, check facts or create projects devoted to a particular event.
The time has come at last. Any day now the first scanned budgets of the Russian Empire and the Soviet Union will be openly published. A bit later, but also in the near future, the rest of the existing budgets of the Russian Empire, the Soviet Union, and the Russian Soviet Federated Socialist Republic will be published as well.
These scanned copies are gradually being converted to machine-readable formats such as Excel and CSV. The data reconstructed from the reference books are available both as raw data and as initially processed, ordered data. We created these ordered, normalised versions to make it easier for developers to use them in visualisations and other projects. A number of such datasets have already been published openly. It is also worth mentioning that a considerable number of scanned budget reference books (from both the Russian Empire and the USSR) have already been published online by Historical Materials, a Russian-language grassroots project launched by a group of statisticians, historians and other enthusiasts.
Here are the historical machine-readable datasets published so far:
- Full public catalogue of budget spending and income for 1912
- Budgets for 1912-1915 by spending categories
- Budgets for 1912-1915 by agencies
- Reconstructed functional structure of the USSR public budget spending in 1937-1940 based on a statistics digest known as USSR Public Budget (parts 1 and 2)
- Reconstructed functional structure of the USSR public budget spending in 1941-1945 based on a statistics digest known as USSR Public Budget (parts 1 and 2)
- Reconstructed functional structure of the USSR public budget spending in 1946-1950 based on a statistics digest known as USSR Public Budget (parts 1 and 2)
I find this part of the challenge particularly inspiring. If I were not on the jury, I would create my own project based on historical budget data. Actually, I may well do something like that after the challenge is over (unless somebody does it first).
Many more data sources can be used alongside the Ministry data. Here are some of them:
- Hub of Open Data is a non-governmental registry of open data created and supported by the NGO Infoculture. It contains over 5,000 datasets, but all of them are unofficial.
- Russian Federal Treasury data
- Web services provided by the Central Bank of Russia offer a good deal of fascinating data on Russian finance.
- Single Open Data Portal of Russia may come in handy as an aggregator of numerous datasets, both Russian and external.
- The World Bank data include some information on Russia as well.
- The UN data.
What can be done?
First and foremost, there are great opportunities for projects that make public finance easier to understand. Among other things, these could be visual demos of how the budget (or public debt, or some particular area of finance) is structured.
Second, lots of projects could be launched based on the data on official institutions at bus.gov.ru. For instance, it could be a comparative registry of all hospitals in Russia. Or a project comparing all state universities. Or a map of available public services. Or a visualisation of budgets of Moscow State University (or any other Russian state university for that matter).
As for the historical data, a simple visualisation comparing the current situation with the past would be a good start. This could be a challenging and fascinating problem to solve.
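One possible starting point for such a comparison, offered only as a sketch, is to look at shares rather than absolute amounts. Budgets from different eras are denominated in different money and are hard to compare directly, whereas the share of each category in total spending can be placed side by side. The function name, the record layout `(period, category, amount)` and all the figures below are invented for illustration.

```python
from collections import defaultdict


def spending_shares(records):
    """Return {period: {category: share of that period's total spending}}."""
    # First pass: total spending per period.
    totals = defaultdict(float)
    for period, _category, amount in records:
        totals[period] += amount

    # Second pass: each category's share of its period total.
    shares = defaultdict(dict)
    for period, category, amount in records:
        total = totals[period]
        shares[period][category] = amount / total if total else 0.0
    return {period: dict(cats) for period, cats in shares.items()}


# Invented figures for illustration only; not taken from any real budget.
RECORDS = [
    ("past", "A", 30.0),
    ("past", "B", 10.0),
    ("present", "A", 200.0),
    ("present", "B", 600.0),
]

if __name__ == "__main__":
    for period, cats in spending_shares(RECORDS).items():
        print(period, cats)
```

The resulting dictionary can be passed to any charting tool. The harder part in practice is likely to be mapping historical spending categories onto modern ones, which this sketch does not attempt.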
Why is this important?
BudgetApps is a great way of promoting open data among app developers, as well as data journalists. There are good reasons to participate. First, there are many sources of data that give talented and creative developers a good opportunity to implement their ambitious ideas. Second, the winners will receive considerable cash prizes. And last, but not least, the most interesting and promising projects will be featured on the Ministry of Finance website, which is good promotion for any worthy project. Considerable amounts of data have become available. It’s time now for a wider audience to become aware of what they are good for.