In June 2019, we launched the Frictionless Data Tool Fund to facilitate reproducible data workflows in research contexts. Our four Tool Fund grantees are now at the halfway point of their projects and have made great progress. Read on to learn more about these projects, their next steps, and how you can contribute.
Stephan Max: Data Package tools for Google Sheets
Stephan’s Tool Fund work is focused on creating an add-on for Google Sheets that allows Data Package import and export. With this tool, researchers (and other data wranglers) who use Google Sheets will be able to quickly and easily incorporate Data Packages into their existing data processing workflows. Recently, Stephan created a prototype that you can test at the project’s GitHub repository by following the steps outlined in the README file: https://github.com/frictionlessdata/googlesheets-datapackage-tools. Next steps for Stephan’s project include enhancing the user interface and adding more information, such as licensing options, to the export step. If you try the prototype, please leave Stephan feedback as an issue in the repository.
João Peschanski and team: Neuroscience Experiments System (NES)
To improve the way neuroscience experimental data and metadata are shared, João and the team at the Research, Innovation and Dissemination Center for Neuromathematics (RIDC NeuroMat) are working on implementing Data Packages in their Neuroscience Experiments System (NES). NES is an open-source tool for data collection that stores large amounts of data in a structured way. It aims to assist neuroscience research laboratories in routine experimental procedures. During the Tool Fund, João and team have created a Data Package export module within NES that follows the Frictionless specifications for data and metadata interoperability. The export includes a JSON descriptor file (a datapackage.json file) with information about how the experiment was performed, with the goal of increasing reproducibility. Next steps for the team include more testing and gathering feedback, followed by a public release. The NES GitHub repository can be seen here: https://github.com/neuromat/nes.
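For readers who have not seen a datapackage.json file before, here is a minimal sketch of what building such a descriptor could look like in Python. It is not taken from NES: the function name, the package name, the file path, and the field names are all hypothetical, and a real NES export would carry far more experiment metadata. It only illustrates the basic shape of a descriptor, which is a named package with a list of resources, each pointing to a data file and optionally describing its columns.

```python
import json


def build_descriptor(name, csv_path, fields, title=None):
    """Return a minimal Data Package descriptor as a plain dict.

    `fields` is a list of (column name, Table Schema type) pairs.
    This helper is hypothetical and is not part of NES.
    """
    descriptor = {
        "name": name,
        "resources": [
            {
                "name": "participants",
                "path": csv_path,
                "format": "csv",
                "schema": {
                    "fields": [{"name": n, "type": t} for n, t in fields]
                },
            }
        ],
    }
    if title:
        descriptor["title"] = title
    return descriptor


# Hypothetical example values, chosen only for illustration.
descriptor = build_descriptor(
    name="example-experiment",
    csv_path="data/participants.csv",
    fields=[("participant_id", "string"), ("age", "integer")],
    title="Example experiment (hypothetical)",
)

# Serialize the descriptor as it would appear in datapackage.json.
descriptor_json = json.dumps(descriptor, indent=2)
print(descriptor_json)
```

Because the descriptor is plain JSON, it can be read by any tool or person without special software, which is a large part of why it helps with reproducibility.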
André Heughebaert: DarwinCore Archive Data Package support
Inspired by his work with the Global Biodiversity Information Facility (GBIF), André is converting DarwinCore Archives into Data Packages for his Tool Fund project. DarwinCore is a standard for describing biological diversity that is intended to increase the interoperability of biological data. André has recently completed a first release of the tool, which appends datapackage.json and README.md files, containing the data descriptors and human-readable metadata, to the DarwinCore archive. This release supports all standard DarwinCore terms and has been tested with several use cases. You can read more about Frictionless DarwinCore and see all of the use cases André tested for the beta release in the repo’s README file. If you want to test or contribute to this Tool Fund project, please open an issue in the repository.
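To make the idea of appending files to an archive concrete, here is a rough sketch using only Python's standard library. It is not André's implementation: the function name, the stand-in archive contents, and the descriptor values are hypothetical, and it assumes the DarwinCore archive is an ordinary zip file. The real tool also derives the descriptor from the archive's own metadata, which this sketch does not attempt.

```python
import io
import json
import zipfile


def append_package_files(archive, descriptor, readme_text):
    """Add datapackage.json and README.md to an existing zip archive.

    `archive` may be a file path or a seekable file-like object.
    Hypothetical helper, shown for illustration only.
    """
    with zipfile.ZipFile(archive, mode="a") as zf:
        zf.writestr("datapackage.json", json.dumps(descriptor, indent=2))
        zf.writestr("README.md", readme_text)


# Build a tiny stand-in archive in memory (contents are made up).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, mode="w") as zf:
    zf.writestr("occurrence.txt", "id\tscientificName\n1\tPuma concolor\n")
    zf.writestr("meta.xml", "<archive/>")

# A hand-written descriptor for the stand-in data file.
descriptor = {
    "name": "example-dwca",
    "resources": [{"name": "occurrence", "path": "occurrence.txt"}],
}
append_package_files(buffer, descriptor, "# Example archive\n")

# List the archive contents to confirm the two files were added.
with zipfile.ZipFile(buffer) as zf:
    names = zf.namelist()
print(names)
```

Appending rather than repackaging leaves the original files in place, so the archive should remain usable by existing DarwinCore tools while also being readable as a Data Package.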
Shelby Switzer and Greg Bloom: Open Referral Human Services data package support
Shelby’s Tool Fund work is building out Data Package support for Open Referral’s Human Service Data Specification (HSDS) and Human Service Data API Suite (HSDA). Open Referral develops data standards and open source tools for health, human, and social services. For the Tool Fund, Shelby has been developing the HSDS-Transformer, which takes raw data, transforms it to HSDS format, and then packages it as a Data Package within a zip file, so users can work with tidily packaged data. For example, Shelby and the Open Referral team have been working with 2-1-1 in Miami-Dade, Florida, to help transform their resource directory database and share it with their partners in a more sustainable fashion. Next steps for Shelby include creating a UI for the HSDS-Transformer so that anyone can access HSDS-compliant Data Packages. Shelby will also be contributing to the improvement of the datapackage Ruby gem during this project.
Lilly is the Product Manager for the Frictionless Data for Reproducible Research project. She has her PhD in neuroscience from Oregon Health and Science University, where she researched brain injury in fruit flies and became an advocate for open science and open data. Lilly believes that the future of research is open, and is using Frictionless Data tooling within the researcher community to make science more reproducible.