Show me the (quality) data!
Show me your data!
Put it online!
Make it re-useable and accessible!
That’s the rallying cry of many in the Open Data movement. Few, at this point, seem to be demanding: make
sure your data is credible, robust and of high quality! Why is this important? It is true that
there is value in making a range of data sets available to stimulate interest in data use. At
the same time, there is a real risk that the Open Data momentum could be derailed if out-
of-date or inaccurate data sets made available by governments are used for economic
forecasting, developmental planning or attempts to hold a government to account.
Imagine a CSO trying to measure a country’s progress toward a developmental goal based
on 10-year-old poverty data. Think it’s an unrealistic scenario? Think again.
In Kenya, the most recent household poverty data available was compiled in 2005-06. This
data has now been released through the Open Data portal. How useful is it? How can NGOs
use it to argue for effecting changes? To measure government development goal delivery?
How can the Government develop economic policies or make resource allocations based on it?
Most data is, or should be, drawn from records, and if the records aren’t reliable, the data
won’t be reliable. Records integrity is based on proper management of the information from
the time it is created until it ceases to have value.
Where reliable records cannot be
accessed, openness is unachievable. When record keeping is poor, ordinary citizens are
the losers. Poorly managed records tend to be incomplete, difficult to locate, and hard to
authenticate; they can be easily manipulated, deleted, fragmented or lost. They undermine
Open Government initiatives and result in inaccurate or incomplete data and information,
which in turn can lead to the misunderstanding and misuse of information, cover-up of
fraud, skewed findings and statistics, misguided policy and misplaced funding, all with
serious consequences for citizens’ lives. When records are of poor quality, the delivery of justice is impaired,
human rights cannot be protected, government services are compromised, and civil society
cannot hold governments to account.
Paper-based records, which are still used extensively, are
not well managed in many cases, while the rapid introduction of ICT systems
has not addressed the challenges of protecting the integrity of the digital information that
these systems generate.
Take the example of paper record
keeping in the Burundi Supreme Court, where records over seven years old were found to be in
abysmal condition. They were poorly stored in a basement, where they
were exposed to rain and dust; shelves had collapsed, and the records were heaped
in indiscriminate piles. A stray dog even managed to make its way into the basement and
ripped up some records to have a litter of puppies on. Imagine trying to generate judiciary statistics covering a 10-year
period to measure the number of rulings, or the fairness of the trials. Try using these records to
generate the data needed to determine how accountable a court is, or
how transparent its rulings and processes are.
And digital record-keeping is not immune either: work in Sierra Leone on civil servants’
and teachers’ records demonstrates how misleading employment data can be if the
records used to generate it are badly kept. Once accurate records had been
created and provided as the basis for verifying actual teachers against the payrolls, it was
determined that ‘ghost workers’ – people claiming the pay for dead or non-existent
people beyond retirement age – accounted for approximately 14% of the civil service payroll
and approximately 25% of the teachers’ payroll. The discovery will save the government millions of dollars
annually and enable accurate human resource planning. Openness, transparency and
accountability in relation to employment data would not have been meaningful before the
records controls were introduced.
Ultimately, Open Data will need to be credible.
It is important to move beyond the idea
that simply publishing data sets will foster momentum and interest in the use of data for
accountability or economic growth. More thought must be given to the integrity of
the records that provide the basis for the data, and the means of tracing the data back to the
source evidence. Governments must be held to account for what they publish, and we, too,
must be accountable for the information that we encourage them to provide.