Open dictionary databases: an overview

Open dictionaries are excellent examples of open knowledge projects. Whether monolingual or bilingual, and whether dealing with definitions, etymology, translation or pronunciation – they are often large, collaborative undertakings.

Dictionary databases have a wide variety of potential applications – from education and research to machine translation and integration with software applications and services.

We’ve listed several [open](http://www.opendefinition.org) dictionary projects and packages on CKAN:

*

These include:

*
* Currently offers 69 bilingual dictionaries released under the GPL.
*
* Currently includes over 308 dictionary files in various languages published in XML format. All material is under the GPL.
*
* Offers a variety of dictionaries with over 20 different language pairs. Material is under the GPL, the GFDL and the Creative Commons Attribution-Sharealike license.
*
* The Wikimedia Foundation’s dictionary project – currently including [over 5 million entries in over 170 languages](http://meta.wikimedia.org/wiki/Wiktionary/Table).
*
* A project to build a basic public domain dictionary for children.
*
* Scans of the first several volumes of the [Oxford English Dictionary](http://en.wikipedia.org/wiki/Oxford_English_Dictionary) (the portion which has fallen into the public domain). It would be great to have a machine-readable version of this!
*
* A German-English dictionary with over 216,000 entries. Under the GPL.
*
* A Welsh-English, English-Welsh dictionary with over 13,000 entries. Under the GPL.
*
* A Japanese-Multilingual dictionary available under a Creative Commons Attribution-Sharealike license.
*
* A set of thesauri in 8 different languages under the GPL.

We’d like to start using tags that correspond to the [ISO 639-2 codes for the representation of names of languages](http://ckan.net/package/read/iso-639-2), such as:

*
*

If you know of any other open dictionary projects, we’d love to hear about them! You can either drop us a line on the [okfn-discuss list](http://lists.okfn.org/cgi-bin/mailman/listinfo/okfn-discuss), or add packages directly to CKAN:

*

12 thoughts on “Open dictionary databases: an overview”

  1. Hoi,
    You may want to check out OmegaWiki.org. It provides lexical information like all the others. The difference is that the user interface can be switched to another language, and it will then show the same information in that language, depending on the availability of translations.

  2. Jean C: thanks for the pointer. Unfortunately it looks like Macmillan’s “Open Dictionary” isn’t open — at least not in any way we mean by that term.

    Their “open” means letting you give them information for free (by submitting word suggestions) but getting nothing back — as the terms and conditions make quite clear (emphasis added):

    Unless otherwise indicated, this Web Site and its contents are the property of Macmillan Publishers Limited, … The copyright in the material contained on this Web Site belongs to Macmillan or its licensors. … Reproduction of material on this Web Site is prohibited unless express permission is given by Macmillan.


    You may not redistribute any of the Content of this Web Site without the prior authorisation of Macmillan or create a database in electronic form or manually by downloading and storing any content.

    To my mind this is a clear abuse of the term open and more than a little exploitative: you do work for them for free and they don’t even promise to give you credit, let alone permission to use the material you helped create. Such potential for abuse of the “open” label is a major reason we created the open definition, where open content and data is clearly defined as material that anyone is free to use, reuse and redistribute without restriction.

  3. From what I see, there are a number of different positions on this. You only have to browse the various Internet forums for that to become starkly plain. Yet the trouble is that many people don’t appear to look that deeply into this.
