A few weeks ago we took part in a conversation about Data Governance at the Statistical Conference of the Americas. This brief post follows up on some of the concepts and ideas we shared in that panel.
Transparency as a Holistic Approach
As citizens and members of Civil Society, one of the biggest Data Governance challenges we face in the coming years is Transparency. Not transparency as it is commonly understood (Governments providing access to their data), but transparency as a holistic approach to the methods behind opening data.
What do we mean by transparency as a holistic approach?
It means that not only the data should be accessible, but also the processes that led to the production of that data (where was it taken from, and how?) and the analysis and interpretation of the data itself (how do we arrive at this metric, value or conclusion?).
Issues with the lack of a holistic approach to Transparency on Data Governance
Let us clarify with an example.
During Bolivia's 2019 national elections, a report by the Organization of American States concluded: “it is statistically unlikely that Morales would have obtained the 10% difference to avoid a second round.” Given the political context in Bolivia, this statement had tremendous implications.
To have a healthy debate about it, the first questions to ask are: “How did you arrive at that conclusion? Can you show me the data and the statistical processes used, so that I can validate what you are saying?” But the information provided in the preliminary report was not enough to start answering those questions. So there was no possibility, at that point in time, of debating the conclusion. The only option available was to trust what the authors said and what was written in a PDF report.
A healthy public debate requires transparent data and processes. We should have the capacity (and the tools) to review and assess every process that leads to the conclusions printed in PDF reports. This concept is not new: the scientific process is based on peer review and reproducible research, so good Data Governance should include them as well.
It is time for Governments, Consultancy Firms and Organisations to open both their processes and their data, so that their conclusions and assessments do not rest on reputation alone. We need the capacity to challenge and review the conclusions printed in long PDF reports.
What’s our proposal?
Healthy public debates should take place with clear and reproducible analysis that allows others to follow the same steps and arrive at the same conclusions.
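To make the idea concrete, here is a minimal sketch of what a reproducible analysis could look like in Python. Everything in it is an assumption made for illustration: the tally rows are made up (they are not real election data), and the names `TALLY_SHEETS`, `data_fingerprint` and `margin_in_points` are hypothetical. The point is only that the input data, a fingerprint of that data, and the calculation are all published together, so anyone can rerun the steps and check the result.

```python
import hashlib
import json

# Hypothetical, made-up tally rows for illustration only (not real election data).
TALLY_SHEETS = [
    {"sheet": "A-001", "candidate_x": 120, "candidate_y": 95, "others": 35},
    {"sheet": "A-002", "candidate_x": 80, "candidate_y": 110, "others": 10},
    {"sheet": "B-001", "candidate_x": 150, "candidate_y": 70, "others": 30},
]


def data_fingerprint(rows):
    """Return a SHA-256 hash of the input so readers can confirm they hold the same data."""
    # Serialise with sorted keys and fixed separators so the hash is stable.
    canonical = json.dumps(rows, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def margin_in_points(rows):
    """Return the lead of candidate_x over candidate_y, in percentage points of all votes."""
    x = sum(r["candidate_x"] for r in rows)
    y = sum(r["candidate_y"] for r in rows)
    total = x + y + sum(r["others"] for r in rows)
    return 100.0 * (x - y) / total


if __name__ == "__main__":
    print("input fingerprint:", data_fingerprint(TALLY_SHEETS))
    print("margin (points):", round(margin_in_points(TALLY_SHEETS), 2))
```

A reader who disagrees with the published margin can compare fingerprints first (are we looking at the same data?) and then inspect the calculation itself, instead of having to rely on the author's reputation.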
We believe in open and transparent information and we build tools to make it easier for everyone. We recently launched Livemark, a data presentation framework for Python that generates static sites from extended Markdown with interactive charts, tables and scripts.
Livemark is an excellent tool, not only for communicating reports and assessments but for doing so in a clear and transparent way. It allows journalists and researchers to display the analysis and statistical processes used to reach the conclusions they present.
A more open and transparent future requires open and transparent communication. Livemark, alongside the rest of our Frictionless Framework, is our proposal for getting there.