Earlier this year OKI announced new funding from The Alfred P. Sloan Foundation to explore “Frictionless Data for Reproducible Research”. Over the next three years we will be working closely with researchers to support the way they are using data with the Frictionless Data software and tools. The project is delighted to announce that Lilly Winfree has come on board as Product Manager to work with research communities on a series of focussed pilots in the research space and to help us develop focussed training and support for researchers.
Data practices in scientific research are transforming as researchers face a reproducibility revolution: there is a growing push to make research data more open, leading to more transparent and reproducible science. I’m really excited to join the team at OKI, whose mission of creating a world where knowledge creates power for the many, not the few, really resonates with me and my desire to make science more open.
During my grad school years as a neuroscience researcher, I was often frustrated with “closed” practices (inaccessible data, poorly documented methods, paywalled articles) and I became an advocate for open science and open data. While investigating brain injury in fruit flies (yes, fruit fly brains are actually quite similar to human brains!), I taught myself coding to analyse and visualise my research data. After my PhD research, I worked on integrating open biological data with the Monarch Initiative, and delved into the open data licensing world with the Reusable Data Project. I am excited to take my passion for open data and join OKI to work on the Frictionless Data project, where I will get to go back to my scientific research roots and work with researchers to make their data more open, shareable, and reproducible.
Most people who use data know the frustrations of missing values, unknown variables, and confusing schemas (just to name a few). This “friction” in data can lead to massive amounts of time being spent on data cleaning, with little time left for analysis. The Frictionless Data for Reproducible Research project will build upon years of work at OKI focused on making data more structured, discoverable, and usable. The core of Frictionless Data is the data preparation and validation stages, and the team has created specifications and tooling centered around these steps. For instance, the Data Package Creator packages tabular data with its machine-readable metadata, allowing users to understand the data structure, the meaning of values, how the data was created, and the license. Users can also validate the structure and content of their data with Goodtables, which reduces errors and increases data quality. By creating specifications and tooling and promoting best practices, we are aiming to make data more open and more easily shareable among people and between various tools.
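To make the idea of packaging data with its metadata a little more concrete, here is a minimal sketch in Python. It builds a small descriptor in the shape of a Data Package (a name, a license, and one tabular resource with a Table Schema) and then runs a few hand-written checks on a CSV sample. The dataset name, file name, column names, and the `check_table` helper are all hypothetical and exist only for illustration; this is not the Goodtables API, just a stand-in for the kind of structure and content checks a validator performs.

```python
import csv
import io
import json

# Hypothetical descriptor: the dataset, file, and column names are made up.
descriptor = {
    "name": "fly-brain-injury",
    "licenses": [{"name": "CC-BY-4.0"}],
    "resources": [
        {
            "name": "measurements",
            "path": "measurements.csv",
            "schema": {
                "fields": [
                    {"name": "fly_id", "type": "string",
                     "constraints": {"required": True}},
                    {"name": "age_days", "type": "integer"},
                    {"name": "survival_rate", "type": "number"},
                ],
                "missingValues": ["", "NA"],
            },
        }
    ],
}

# The descriptor is plain JSON, so it can travel alongside the data file.
descriptor_json = json.dumps(descriptor, indent=2)

# Only the three field types used above are handled in this sketch.
CASTERS = {"string": str, "integer": int, "number": float}


def check_table(csv_text, schema):
    """Return a list of human-readable problems found in csv_text."""
    fields = schema["fields"]
    missing = set(schema.get("missingValues", [""]))
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    expected = [field["name"] for field in fields]
    if header != expected:
        return [f"header {header} does not match schema {expected}"]

    problems = []
    for row_number, row in enumerate(reader, start=2):
        if len(row) != len(fields):
            problems.append(
                f"row {row_number}: expected {len(fields)} cells, got {len(row)}"
            )
            continue
        for field, cell in zip(fields, row):
            if cell in missing:
                # A missing value is only a problem for required fields.
                if field.get("constraints", {}).get("required"):
                    problems.append(
                        f"row {row_number}: '{field['name']}' is required"
                    )
                continue
            try:
                CASTERS[field["type"]](cell)
            except ValueError:
                problems.append(
                    f"row {row_number}: '{field['name']}' value {cell!r} "
                    f"is not a valid {field['type']}"
                )
    return problems


sample = "fly_id,age_days,survival_rate\nf01,7,0.92\n,ten,NA\n"
schema = descriptor["resources"][0]["schema"]
for problem in check_table(sample, schema):
    print(problem)
```

In this sample, the second data row is flagged twice: once for the missing required identifier and once for an age that is not an integer, while the `NA` survival rate passes because the schema declares it as a missing value. Catching issues like these before analysis is the kind of friction the tooling is meant to remove.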
For the next stage of the project, I will be working with organisations and researchers on pilots aimed at reducing the friction in scientists’ data. I will be building a network of researchers interested in open data and open science, and running training sessions and workshops on using the Frictionless Data tools and specs. Importantly, I will work with researchers to integrate these tools and specs into their current workflows, to help shorten the time between experiment → data → analysis → insight. Ultimately, we are aiming to make science more open, efficient, and reproducible.
Are you a researcher interested in making your data more open? Do you work in a research-related organisation and want to collaborate on a pilot? Are you an open source developer looking to build upon the Frictionless Data tools? We’d love to chat with you! We are eager to work with scientists from all disciplines. If you are interested, connect with the project team on the public Gitter channel, join our community chat, or email Lilly at email@example.com!