Show me the (quality) data!
Show me your data!
Put it online!
Make it re-useable and accessible!
That’s the rallying cry of many in the Open Data movement. Few, at this point, seem to be demanding: make sure your data is credible, robust and of high quality! Why is this important? It is true that there is value in making a range of data sets available to stimulate interest in data use. At the same time, there is a real risk that the Open Data momentum could be derailed if out-of-date or inaccurate data sets made available by governments are used for economic forecasting, developmental planning or attempts to hold a government to account. Imagine a CSO trying to measure a country’s progress toward a developmental goal based on 10-year-old poverty data. Think it’s an unrealistic scenario? Think again.
In Kenya, the most recent household poverty data available was compiled in 2005-06. This data has now been released through the Open Data portal. How useful is it? How can NGOs use it to argue for change, or to measure the government’s delivery of its development goals? How can the Government develop economic policies or make resource allocations based on this data?
Most data is, or should be, drawn from records, and if the records aren’t reliable, the data won’t be reliable. Records integrity is based on proper management of the information from the time it is created until it ceases to have value.
Where reliable records cannot be accessed, openness is unachievable. When record keeping is poor, ordinary citizens are the losers. Poorly managed records tend to be incomplete, difficult to locate, and hard to authenticate; they can be easily manipulated, deleted, fragmented or lost. They undermine Open Government initiatives and result in inaccurate or incomplete data and information, which in turn can lead to the misunderstanding and misuse of information, cover-up of fraud, skewed findings and statistics, misguided policy and misplaced funding, all with serious consequences for citizens’ lives. When records are of poor quality, the delivery of justice is impaired, human rights cannot be protected, government services are compromised, and civil society cannot hold governments to account.
Paper-based records, which are still used extensively, are in many cases not well managed, while the rapid introduction of ICT systems across governments has not addressed the challenges of protecting the integrity of the digital information that these systems generate.
Take the example of paper record keeping in the Burundi Supreme Court, where records over seven years old were found to be in abysmal condition. They were poorly stored in a basement, where they were exposed to rain and dust; shelves had collapsed, and the records were heaped in indiscriminate piles. A stray dog even managed to make its way into the basement and ripped up some records to have a litter of puppies on. Imagine trying to generate judiciary statistics covering a 10-year period to measure the number of rulings, or the fairness of the trials. Try using these records to generate the data needed to determine how accountable a court is, or how transparent its rulings and processes are.
And digital record-keeping is not immune either: work in Sierra Leone on civil servants’ and teachers’ records demonstrates how misleading employment data can be if the records used to generate it are badly kept. Once accurate records had been created and provided as the basis for verifying actual teachers against the payrolls, it was determined that ‘ghost workers’ – people claiming the pay of employees who were dead, non-existent or beyond retirement age – accounted for approximately 14% of the civil service payroll and approximately 25% of the teachers’ payroll. The discovery will save the government millions of dollars annually and enable accurate human resource planning. Openness, transparency and accountability in relation to employment data would not have been meaningful before the records controls were introduced.
Ultimately, Open Data will need to be credible.
It is important to move beyond the idea that simply publishing data sets will foster momentum and interest in the use of data for accountability or economic growth. More thought must be given to the integrity of the records that provide the basis for the data, and the means of tracing the data back to the source evidence. Governments must be held to account for what they publish, and we, too, must be accountable for the information that we encourage them to provide.