Csv,conf – the best community conference for data makers was back for its version 8 at the end of May in Puebla, Mexico. I have been lucky to be one of the conference co-organisers since version 7, but this year I was also a speaker, giving a talk about Open Data Editor (ODE), the application we are developing at Open Knowledge Foundation. With ODE we want to make all (or at least most) Frictionless functionalities available to a non-tech audience, so people can validate their datasets or generate metadata and schemas without having to write a single line of code. Our final goal? To increase data portability and interoperability for all.

It was not the first time ODE was going to csv,conf. Last year my colleague Evgeny Karev presented the alpha release of the application (which did not have a name at the time). Of course, there was no point in just presenting the beta one year later (or at least I did not find it interesting enough), so I started thinking: what is new this year? How can I talk about ODE in a way that is not just another product presentation, but is actually useful to the community?

And that’s when it hit me: our journey with ODE has not been a straightforward one. It never is, for any tech product. But we never talk about the mistakes we make along the way. We don’t do it for our products and we don’t do it for any other aspect of our lives. We only communicate selectively about the success stories. Because that’s what funders want to hear, but also because, ultimately, that’s what we want to believe. But the truth is that technological progress (and any progress whatsoever, actually!) is made of failures and mistakes, just as much as it is made of success stories. And I personally believe there is much more to learn from mistakes and pitfalls than from success stories, but we hardly ever hear of them. So why not share our tormented journey with the community?

First of all, it’s worth mentioning that our journey did not start with ODE. At the Open Knowledge Foundation we have decades of experience working iteratively with open data communities and have been engaging with issues around data interoperability, publication workflows, and analysis. We have been developing the Frictionless Data toolkit (which ODE uses under the hood), with its data and metadata standards and its software implementations, for more than 15 years now. So we are not starting from scratch. Actually, we made a first attempt at a “Frictionless Application” for a non-tech audience back in 2018, when we released try.goodtables.io, a web-hosted application for data validation. Quite quickly, users raised concerns about data privacy, and we had internal concerns about the hosting.

All of this to say, when we started developing ODE we already knew we wanted it to be a desktop application, with files stored locally, and usable even offline. We knew that because of the try.goodtables.io experience. We knew that because we had made mistakes and learnt from them. That of course did not prevent us from hitting other bumps in the road.

To start with, we did not establish an iterative process with our target community. We had numerous conversations with the GLAM community to understand their problems, but we never checked with them whether the solutions we were building were actually solving those problems. During the development we turned to our usual communities instead: very solid and proactive open data communities, but with very technical backgrounds. It was very natural for us to do so, because those are the communities we always interact with. It was not an intentional choice, more of a habit I would say. The outcome was that, for example, when the alpha was released, people still needed to run a few commands on the command line to install the application. Not the end of the world, but it meant that the alpha was not accessible to a non-tech audience, and therefore none of the feedback we received came from the target audience.

We also decided to include a lot of features in the beta release, maybe a bit too many for a beta. That helped give potential funders an understanding of ODE’s potential, but it also made the beta a bit more buggy than expected for end-users. One of the features we decided to include was an AI plug-in, initially intended to help people generate data stories, maps and reports. To our great surprise, the AI plug-in generated far more scepticism than excitement, despite the fact that it was never intended to share the data with the AI (something we probably did not make clear enough).

So what did we learn from all that and what takeaways can we share?

  1. Put a lot of effort into identifying your target audience, and establish a relationship with them.
  2. Try to be driven by simplicity, utility and pragmatism as much as possible, a real mantra for us from the Frictionless project, and one of the pillars of our Tech We Want campaign at Open Knowledge Foundation. You want to disrupt existing approaches and processes as little as possible.
  3. Don’t be seduced too quickly by the latest tech trends. Keep a critical eye on them: they don’t only bring benefits, but also problems.

     At the end of the day the problem ODE is trying to solve is an old problem that has probably existed since the beginning of tabulation: data is messy. It is often hard to find, archived in difficult-to-use formats, poorly structured and/or incomplete. This is still a reality despite the incredible technological progress we have witnessed in the last decades.

One last thing that I want to mention is that obviously we were not aware of all those mistakes while we were making them, and it is only with time that we started recognising them. So don’t be hard on yourself, and try to give yourself the space and the perspective to actually see your mistakes.

We were very lucky to find a generous funder who believed in our idea. Thanks to the support of the Patrick J. McGovern Foundation, we have assembled an incredible team in recent months, which focused on user research first and is now working on a stable version to be released at the end of the year. On top of the development itself, we will dedicate ourselves to increasing and simplifying adoption, with tutorials, workshops, pilot projects, and even a free online course.

Do you want to know more about ODE? Go and read the blog posts we have published in recent months about the journey of the application.

If you are interested in my talk, my slides are on Zenodo, and the talk was actually recorded, so you can watch it if you want:

Open Data Editor: The tormented journey of an app

Presented by: Sara Petti (EN) Do you remember the Frictionless Data desktop application we presented last year in Buenos Aires? We are back with a name (Open Data Editor) and a beta version. In this talk we will share the journey that has taken us from the alpha, presented last year at csv,conf, to the beta released in October 2023, and the current work-in-progress v1.

Please get in touch if you would like to help. Our roadmap is openly available on GitHub. A great way to help us, especially if you don’t have a technical background, is to fill out this feedback form (which only takes a few minutes).

On a final note, some years ago I was profoundly touched by Melanie Stefan’s paper A CV of failures (which appears in a journal I will not mention because I don’t believe in paywalls and I think access to knowledge should be free – so try to find it on SciHub if possible). I surely had that paper in the back of my mind when outlining this talk and I want to acknowledge that. I would definitely encourage you to go and read it.