Since the previous post we’ve succeeded in using Tesseract, and
we now have a nice plain-text version of the EB entry on Shakespeare:

http://knowledgeforge.net/shakespeare/svn/trunk/shksprdata/ancillary/britannica-11th.txt

What we now need to do is ‘proof’ this to correct the OCR errors. This
kind of thing is perfect for distributed volunteers, so if you’d like to
help out, just step up and start correcting one of the sections.
To make it especially easy for people to make edits, the text has been put in a temporary location on the Open Knowledge Foundation wiki (only the first five pages for the time being):

http://wiki.okfn.org/p/Open_Shakespeare/Britannica


Rufus Pollock is Founder and President of Open Knowledge.

2 thoughts on “Proof-Editing Shakespeare Entry from Encyclopaedia Britannica 11th Edition”

  1. jean: thanks for the suggestions. We haven’t considered Mechanical Turk so far because it costs money (we’re pro bono publico and volunteer-based).

    We’ve definitely been considering pgdp.net (see discussions on the mailing list). However, for the time being, given that the whole piece is only 30 pages, we thought it better just to ‘put it in a wiki’ and do it on a volunteer basis rather than go through the pgdp.net process. As I said, though, we’ve been considering pgdp and may submit there.
