Data Curator is a simple desktop editor to help describe, validate, and share usable open data.
Open data producers are increasingly focusing on improving open data so it can be easily used to create insight and drive positive change. Open data is more likely to be used if data consumers can:
- understand the structure and quality of the data
- understand why and how the data was collected
- look up the meaning of codes used in the data
- access the data in an open machine-readable format
- know how the data is licensed and how it can be reused
Data Curator enables open data producers to define all this information, and validate the data, prior to publishing it on the Internet. The data is published as a Tabular Data Package following the Frictionless Data specification. This allows open data consumers to read the data using Frictionless Data applications and software libraries.
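As a rough illustration of what a Tabular Data Package descriptor can look like, the sketch below builds a minimal `datapackage.json` in Python. The dataset name, file path, field names and license are hypothetical and chosen only for this example; the descriptor that Data Curator exports will contain whatever you define in the application.

```python
import json

# Hypothetical example dataset: every name, path and field below is illustrative.
descriptor = {
    "profile": "tabular-data-package",
    "name": "example-dataset",
    "title": "Example dataset",
    "licenses": [
        {"name": "CC-BY-4.0", "title": "Creative Commons Attribution 4.0"}
    ],
    "resources": [
        {
            "profile": "tabular-data-resource",
            "name": "sites",
            "path": "data/sites.csv",
            # The schema describes the structure of the table, column by column.
            "schema": {
                "fields": [
                    {"name": "site_id", "type": "integer"},
                    {"name": "site_name", "type": "string"},
                ]
            },
        }
    ],
}

# Serialise the descriptor as it would appear in a datapackage.json file.
datapackage_json = json.dumps(descriptor, indent=2)
print(datapackage_json)
```

Because the descriptor is plain JSON, data consumers can read it with any JSON parser as well as with the Frictionless Data libraries.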
“We need to make it easy to manage data throughout its lifecycle and ensure it can be easily and reliably retrieved by people who want to reuse and repurpose it. We developed Data Curator to help publishers define certain characteristics to improve data and metadata quality” – Dallas Stower, Assistant Director-General, Digital Platforms and Data, Queensland Government – Project Sponsor
Data Curator allows you to create data from scratch or open an Excel or CSV file. Data Curator requires that each column of data is given a type (e.g. text, number). Data can be defined further using a format (e.g. text may be a URL or email). Constraints can be applied to data values (e.g. required, unique, minimum value). This definition process can be accelerated by using the Guess feature, which guesses the data types and formats for all columns.
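To give a feel for how guessing types and formats might work, here is a minimal sketch in Python. It is not Data Curator's actual algorithm: the function names and the simple rules (try integer, then number, otherwise text, then look for an email or URL pattern) are assumptions made for illustration.

```python
import re

# Deliberately simple email pattern, good enough for a sketch only.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def guess_type(values):
    """Return the narrowest type name that every value can be cast to."""
    for type_name, cast in (("integer", int), ("number", float)):
        try:
            for value in values:
                cast(value)
        except ValueError:
            continue  # at least one value failed, so try the next type
        return type_name
    return "string"


def guess_format(values):
    """Return a format name for a column of text values."""
    if all(EMAIL_PATTERN.match(v) for v in values):
        return "email"
    if all(v.startswith(("http://", "https://")) for v in values):
        return "uri"
    return "default"


def guess_field(name, raw_values):
    """Build a field description for one column, ignoring empty cells."""
    values = [v.strip() for v in raw_values if v.strip()]
    if not values:
        return {"name": name, "type": "string", "format": "default"}
    field_type = guess_type(values)
    field_format = guess_format(values) if field_type == "string" else "default"
    return {"name": name, "type": field_type, "format": field_format}


print(guess_field("age", ["34", "", "27"]))
print(guess_field("contact", ["jo@example.org", "sam@example.org"]))
```

A guess like this is only a starting point; the column definitions should still be reviewed and corrected by someone who knows the data.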
Data can be validated against the column type, format and constraints to identify and correct errors. If it’s not appropriate to correct the errors, they can be added to the provenance information to help people understand why and how the data was collected and determine if it is fit for their purpose.
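The kind of checks involved can be sketched in a few lines of Python. This example validates one column of integer values against the required, unique and minimum constraints; the function name, the error messages and the sample values are hypothetical and do not reflect how Data Curator reports errors.

```python
def validate_column(values, constraints):
    """Check a column of raw text values that should be integers.

    Returns a list of (row_number, message) tuples, one per error found.
    """
    errors = []
    seen = set()
    for row_number, raw in enumerate(values, start=1):
        if raw == "":
            if constraints.get("required"):
                errors.append((row_number, "required value is missing"))
            continue
        try:
            value = int(raw)
        except ValueError:
            errors.append((row_number, "not an integer"))
            continue
        if "minimum" in constraints and value < constraints["minimum"]:
            errors.append((row_number, "below minimum"))
        if constraints.get("unique"):
            if value in seen:
                errors.append((row_number, "duplicate value"))
            seen.add(value)
    return errors


# Hypothetical column with one problem of each kind.
column = ["1", "", "x", "-3", "1"]
for row_number, message in validate_column(
    column, {"required": True, "unique": True, "minimum": 0}
):
    print(f"row {row_number}: {message}")
```

Reporting the row number with each error makes it straightforward to either correct the value or record the issue in the provenance information.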
Often a set of codes used in the data is defined in another table. Data Curator lets you validate data across tables. This is really useful if you want to share a set of standard codes across different datasets or organisations.
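Validating across tables amounts to checking that every code used in one table appears in the table that defines the codes (the Table Schema specification describes this relationship with its `foreignKeys` property). The sketch below shows the idea in plain Python; the table contents, column names and function name are made up for illustration.

```python
def check_foreign_key(rows, field, reference_rows, reference_field):
    """Return (row_number, value) for each value missing from the reference table."""
    valid_codes = {row[reference_field] for row in reference_rows}
    return [
        (row_number, row[field])
        for row_number, row in enumerate(rows, start=1)
        if row[field] not in valid_codes
    ]


# Hypothetical lookup table that defines the standard codes.
states = [
    {"code": "QLD", "name": "Queensland"},
    {"code": "NSW", "name": "New South Wales"},
]

# Hypothetical data table that uses those codes.
offices = [
    {"office": "Brisbane", "state_code": "QLD"},
    {"office": "Sydney", "state_code": "NSW"},
    {"office": "Cairns", "state_code": "QL"},  # typo: not a defined code
]

print(check_foreign_key(offices, "state_code", states, "code"))
```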
Data Curator lets you save data as a comma, semicolon, or tab separated value file. After you’ve applied an open license to the data, you can export a data package containing the data, its description, and provenance information. The data package can then be published to the Internet. Some open data platforms support uploading, displaying, and downloading data packages. Open data consumers can then confidently access and use quality open data.
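The three separators differ only in the delimiter character placed between values. A small Python sketch using the standard `csv` module shows the same rows written each way; the sample rows and the `to_delimited` helper are hypothetical.

```python
import csv
import io


def to_delimited(rows, delimiter):
    """Write rows to a string using the given single-character delimiter."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=delimiter, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical table: a header row followed by one data row.
rows = [["id", "name"], ["1", "Brisbane"]]

for label, delimiter in (("comma", ","), ("semicolon", ";"), ("tab", "\t")):
    print(label)
    print(to_delimited(rows, delimiter))
```

Whichever delimiter is chosen, consumers need to know which one was used, so it should be stated alongside the data.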
Download Data Curator for Windows or macOS.
Learn more about Data Curator and Frictionless Data.
Who made Data Curator?
Data Curator was made possible with funding and guidance from the Queensland Government.
The project was led by Stephen Gates from the ODI Australian Network. Software development was made possible by Gavin Kennedy and Matt Mulholland from the Queensland Cyber Infrastructure Foundation (QCIF).
Data Curator uses the Frictionless Data software libraries maintained by Open Knowledge International. Data Curator started life as Comma Chameleon, an experiment by the Open Data Institute.
Stephen is a business technology leader with over 30 years of experience. He has worked in Financial Services, Insurance, and National, State and Local Government, developing strategies and architectures that empower business through the innovative use of information technology. He holds a Bachelor of Science and a Graduate Certificate in Spatial Science Technology.
Stephen is passionate about driving improvement in the quality of open data. He has contributed to many open data projects, including Data Curator, Frictionless Data, Open Data Certificates, Open Data Pathway, the Open Data Charter Measurement Guide, and the Open Definition, and he established Australia’s Open Data Census.