Back in July of this year, a crowd of coders, scientists and new media artists gathered in Berlin for the Open Science Workshop at OKCon. One of the projects to come out of this gathering was the Data Digitizer, a tool for transcribing documents and tables that are not currently machine-readable. Suggested applications ranged from transcribing Brazilian census data to entering tables from economics articles, so that results can be compared across multiple articles that examine the same variables.
The project is still ongoing, with the code up on GitHub. You can also find an Etherpad that details what was proposed and achieved in the first session of work on the Data Digitizer. Check out how far it’s got with a little demo that is up and running here.