Over the past week or so there has been a flurry of posts about ‘strong’ and ‘weak’ open access, including the following:

Peter Suber and Stevan Harnad both agree:

The term “open access” is now widely used in at least two senses. For some, “OA” literature is digital, online, and free of charge. It removes price barriers but not permission barriers. For others, “OA” literature is digital, online, free of charge, and free of unnecessary copyright and licensing restrictions. It removes both price barriers and permission barriers. It allows reuse rights which exceed fair use.

There are two good reasons why our central term became ambiguous. Most of our success stories deliver OA in the first sense, while the major public statements from Budapest, Bethesda, and Berlin (together, the BBB definition of OA) describe OA in the second sense.

As you know, Stevan Harnad and I have differed about which sense of the term to prefer –he favoring the first and I the second. What you may not know is that he and I agree on nearly all questions of substance and strategy, and that these differences were mostly about the label. While it may seem that we were at an impasse about the label, we have in fact agreed on a solution which may please everyone. At least it pleases us.

We have agreed to use the term “weak OA” for the removal of price barriers alone and “strong OA” for the removal of both price and permission barriers. To me, the new terms are a distinct improvement upon the previous state of ambiguity because they label one of those species weak and the other strong. To Stevan, the new terms are an improvement because they make clear that weak OA is still a kind of OA.

On this new terminology, the BBB definition describes one kind of strong OA. A typical funder or university mandate provides weak OA. Many OA journals provide strong OA, but many others provide weak OA.

Furthermore, Peter Suber adds:

As soon as we move beyond the removal of price barriers to the removal of permission barriers, we enter the range of strong OA. Hence, an article with a CC-NC license is strong OA because it allows some copying and redistribution beyond fair use (even if it doesn’t allow all copying and redistribution). My own preference is still for the CC-BY license, but we shouldn’t speak as if CC-NC were not strong OA or as if there were just one kind of strong OA.

According to this schema, a cost-free publication counts as weak open access, and a publication licensed under a CC-NC license counts as strong open access. Stevan Harnad agrees with the distinction but argues for ‘value-neutral’ terms to describe it, proposing ‘basic’ and ‘full’.
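The distinction can be written down as a small decision rule. The following is only a sketch of the terminology as described above; the `Publication` record and its field names are hypothetical and do not come from any real schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Publication:
    """Hypothetical record; field names are illustrative only."""
    free_of_charge: bool         # price barriers removed
    reuse_beyond_fair_use: bool  # at least some permission barriers removed


def classify_open_access(pub: Publication) -> str:
    """Return 'strong', 'weak' or 'not OA' under the weak/strong terminology."""
    if not pub.free_of_charge:
        return "not OA"
    if pub.reuse_beyond_fair_use:
        return "strong"
    return "weak"


# A cost-free publication with no extra reuse rights.
cost_free_only = Publication(free_of_charge=True, reuse_beyond_fair_use=False)
# A cost-free article under a CC-NC license (some reuse beyond fair use).
cc_nc_article = Publication(free_of_charge=True, reuse_beyond_fair_use=True)

print(classify_open_access(cost_free_only))
print(classify_open_access(cc_nc_article))
```

Note that the rule deliberately does not rank the different kinds of strong OA: a CC-NC article and a CC-BY article both come out as ‘strong’, which is exactly the point Peter Suber makes.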

It’s worth adding to this discussion that there is also Open Definition compliant open access, which I understand is equivalent to BBB open access and which is more permissive than ‘strong’ or ‘full’ open access. As we blogged a couple of weeks back, anything with the SPARC Europe Seal will be open access in this sense.

As Peter Murray-Rust comments:

Open Source has the OSI which determines whether or not a given licence is OS. Open Knowledge after only a short time of volunteers has the OKF and has an agreed definition and a list of conformant licences.

Scholarly publications, as literary works, constitute knowledge and hence are covered by the OKD. A journal, monograph or any other publication can still be ‘open as in the OKD’ as with other forms of knowledge. Debates about open access aside, demarcating between knowledge that is ‘open’ and ‘closed’ is precisely what the OKD is there for!

It will be interesting to see what emerges as the new classificatory scheme for open access, and where OKD compliant publications sit on the spectrum. Perhaps these will be called ‘OKD/BBB compliant open access’ journals, or suchlike.

The first Open Knowledge London meetup will take place this Wednesday at the London Knowledge Lab. The meetup should be a great opportunity for informal discussion of open knowledge projects and issues. If you’d like to participate or present, please add details to the wiki page!

SPARC Europe (Scholarly Publishing and Academic Resources Coalition) and the Directory of Open Access Journals (DOAJ) have just announced a new SPARC Europe Seal for Open Access Journals.

In order for journals to be approved, they must use a Creative Commons Attribution license - which is compliant with the Open Knowledge Definition. It is great to see growing support for making scholarly publications fully open!

The announcement - which includes comments from OKF advisory board members John Wilbanks and Peter Suber - is reproduced below.

Growing numbers of peer-reviewed research journals are opening-up their content online, removing access barriers and allowing all interested readers the opportunity of reading the papers online, with over 3300 such journals listed in the DOAJ, hosted by Lund University Libraries in Sweden.

However, the maximum benefit from this wonderful resource is not being realised as confusion surrounds the use and reuse of material published in such journals. Increasingly, researchers wish to mine large segments of the literature to discover new, unimagined connections and relationships. Librarians wish to host material locally for preservation purposes. Greater clarity will bring benefits to authors, users, and journals.

In order for open access journals to be even more useful and thus receive more exposure and provide more value to the research community it is very important that open access journals offer standardized, easily retrievable information about what kinds of reuse are allowed. Therefore, we are advising that all journals provide clear and unambiguous statements regarding the copyright statement of the papers they publish. To qualify for the SPARC Europe Seal a journal must use the Creative Commons By (CC-BY) license which is the most user-friendly license and corresponds to the ethos of the Budapest Open Access Initiative.

The second strand of the Seal is that journals should provide metadata for all their articles to the DOAJ, who will then make the metadata OAI-compliant. This will increase the visibility of the papers and allow OAI-harvesters to include details of the journal articles in their services.

‘We want to build on the great work already done by the publishers of many open access journals and improve the standards of open access titles,’ said David Prosser, Director of SPARC Europe. ‘Working with the DOAJ means that we can provide help and guidance to journals who wish to move beyond the first step of free access to full open access and our long-term aim is to ensure that all journals listed in the DOAJ can attain the standards expressed within the Seal.’

‘Improving the standards of the rapidly increasing numbers of open access and contributing to the widest possible visibility, dissemination and readership of the journals is very much in line with our mission,’ said Lars Björnshauge, Director of Libraries at Lund University. ‘We are very happy to see the enormous usage of the DOAJ and the support from our membership.’

‘Legal certainty is essential to the emergence of an internet that supports research. The proliferation of license terms forces researchers to act like lawyers, and slows innovative educational and scientific uses of the scholarly canon’ said John Wilbanks, Executive Director of Science Commons. ‘Using a seal to reward the journals who choose to adopt policies that ensure users’ rights to innovate is a great idea. It builds on a culture of trust rather than a culture of control, and it will make it easy to find the open access journals with the best policies.’

‘This is an excellent program with two important recommendations. CC-BY licenses make OA journals more useful, and interoperable metadata make them more discoverable. The recommendations are easy to adopt and will accelerate research, facilitate preservation, and make OA journal policies more open and more predictable for users. I hope all OA journals will adopt them – not to get the Seal from SPARC Europe and the DOAJ, but for the same reasons that moved these organizations to launch the program: to make OA journals more visible and useful than they already are,’ said Peter Suber, Open Access Advocate & Author of Open Access News.

Dr. Paolo D’Iorio recently invited me to attend the first meeting of an EU funded Working Group “devoted to analyzing the current debate on the legal, economic and social conditions for setting-up open scholarly communities on the web”. The meeting was part of COST:

COST – European Cooperation in the field of Scientific and Technical Research – is one of the longest-running European instruments supporting cooperation among scientists and researchers across Europe. COST is also the first and widest European intergovernmental network for coordination of nationally funded research activities.

Action 32, of which Dr. D’Iorio is Chair, is called “Open Scholarly Communities on the Web” and has two aims:

  • to create a digital infrastructure for collaborative humanities research on the Web; and
  • to establish and foster the growth of Scholarly Communities that will provide feedback to the IT developers regarding the needs and expectations of humanities researchers and will serve as a core group of early adopters.

Talks included:

  • Paolo D’Iorio (CNRS-ITEM, Paris), How to build a Scholarly Community on the Web
  • Maria Chiara Pievatolo (University of Pisa), Copyright in Europe. History and perspectives
  • Thomas Margoni (University of Trento), How to access primary sources in Europe. The legal framework
  • Annaïg Mahé (URFIST, Paris), The market for SSH Journals in Europe
  • Jennie Grimshaw (British Library), Negotiating spaghetti junction: legal constraints on archiving government e-documents in the UK
  • Christine Madsen (OII, Oxford), The significance of “marketing” digital collections: the case of Harvard
  • Yann Moulier Boutang (Professeur de sciences Economiques - Université de Technologie de Compiègne, Directeur adjoint de Laboratoire de l’Unité de Recherche EA 22 23), Economic model(s) of Scholarly Communities: Open Source or Creative commons?
  • Francesca Di Donato (University of Pisa), The evaluation of science. From peer review to open peer review
  • Eric Meyer and Ralph Schroeder (OII, Oxford), Open Access and Online Visibility in the Age of e-Research

Notes and comments

  • For many humanities subjects, having something like the public domain calculators would help to facilitate the growth of open resources for scholarly communities built on works in which the copyright has expired.
  • Paolo’s presentation of Nietzsche Source and the Discovery project gave a compelling vision of how communities might grow around a resource for corpus based scholarship - with users having their own virtual workspace with annotations and notes that could be shared with other users. The ‘Scholarsource’ system would have stable URLs to support accurate citation, and robust ontologies to facilitate exploration of the material. Licensing that permits re-distribution is also a good preservation strategy.
  • The term ‘open’ was often not used in the sense of the Open Knowledge Definition. Several projects used licenses with non-commercial restrictions. While some participants assumed that scholars and institutions would often prefer that their work was not exploited commercially - it would be great if public domain sources such as documents, images and records, could be published under an open license. An approach which recommended open licensing for material that had not been enhanced (scans, text files …) could help to stimulate the growth of a commons that would encourage greater experimentation and collaboration than one which restricted certain kinds of re-use (cf. 7. and 8. in the OKD).
  • The importance of a close working relationship between scholarly communities and technologists. It is crucial that technical development is informed by the needs and working practices of researchers. This is something we’ve been thinking about in relation to Open Shakespeare and Open Milton. Open licensing allows developers to experiment with scholarly material to develop new tools and applications that could be of unanticipated value (e.g. semantic approaches, text analysis or visualisation).
  • Legal, technological and social obstacles to building open scholarly communities. We have various legal mechanisms and emerging technologies to facilitate such communities. Sometimes the hardest parts are social: growing the user base, increasing participation and so on. Value and limits of the ‘build it and they will come’ approach.
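To make the first note above more concrete, here is a minimal sketch of the kind of check a public domain calculator performs. It assumes a flat ‘life of the author plus 70 years’ term running to the end of the calendar year; real rules vary by jurisdiction and by type of work, and the function names are hypothetical rather than taken from any existing calculator.

```python
from datetime import date

# Assumption: a flat "life plus 70 years" term, with protection running
# to the end of the calendar year. Real rules differ between jurisdictions.
TERM_YEARS = 70


def public_domain_year(author_death_year: int, term_years: int = TERM_YEARS) -> int:
    """First calendar year in which the work is out of copyright under the assumed rule."""
    return author_death_year + term_years + 1


def is_public_domain(author_death_year: int, on: date, term_years: int = TERM_YEARS) -> bool:
    """True if, on the given date, the assumed term has already expired."""
    return on.year >= public_domain_year(author_death_year, term_years)


# Nietzsche died in 1900.
print(public_domain_year(1900))
print(is_public_domain(1900, date(2008, 4, 10)))
```

Even a rule this crude shows why such a tool is useful for corpus based projects: the answer depends only on a date that is usually already in the catalogue record.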

Make Textbooks Affordable, a campaign composed of Student Associations and Public Interest Research Groups from across the US, yesterday released a statement in support of open textbooks signed by 1000 academics. From the press release:

Open textbooks are complete, reviewed textbooks written by academics that can be used online at no cost and printed for a small cost. What sets them apart from conventional textbooks is their open license, which allows instructors and students flexibility to use, customize and print the textbook. Open textbooks are already used at some of the nation’s most prestigious institutions - including Harvard, Caltech and Yale - and the nation’s largest institutions - including the California community colleges and the Arizona State University system.

“Open textbooks are comparable, affordable and flexible alternatives to traditional expensive textbooks,” said Professor Linda Bisson, Chair of the Enology and Viticulture Department at the University of California, Davis. “Not only do they save students money, but they provide instructors with a high-quality textbook that they can customize to meet their needs.”

Textbooks cost students an average of $900 per year, which is a quarter of tuition at an average four-year public university and nearly three-quarters of tuition at a community college, according to a study conducted by the Government Accountability Office (GAO).

“Textbooks can price students out of higher education. With costs rising faster than inflation and tuition, some students are faced with the difficult choice to drop out, take on additional debt, or undercut their own learning by not purchasing textbooks,” said Nicole Allen, Textbooks Advocate for The Student PIRGs.

Research conducted by The Student PIRGs identifies publisher tactics as the primary cause of escalating prices. Bundling textbooks with unnecessary supplements forces students to purchase items they do not need; unnecessary new editions undermine the used book market; and withholding critical price information keeps faculty in the dark.

“As faculty members, our top priority is to choose the textbook that is best for our students. We share concerns about affordability, and face similar frustrations with publisher practices,” said Sandra Schroeder, Chair of the American Federation of Teachers Higher Education Program and Policy Council. “Open textbooks and other affordable options, when appropriate for a course, are a win-win for everyone.”

On the What are Open Textbooks? page, they mention our Open Text Book project, and the Open Knowledge Definition - which is great to see! It’s good that they emphasise the importance of licensing that permits people to “reproduce, customize, or distribute” as well as access.

However while they allude to Creative Commons licenses - they don’t explicitly distinguish between those licenses which are open (Creative Commons Attribution and Attribution Sharealike), and those which are not (Creative Commons licenses with No Derivatives or Non-commercial options).

While the latter do afford people more choice about what can be done with their work, they create problems with interoperability and do not serve well as the basis of an ecosystem of textbooks and textbook content that may be built upon, modified and redistributed without restriction. For example, publishers may not have the incentive to add value to existing content if they would be unable to re-distribute this in a commercial context.
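The distinction between the two groups of Creative Commons licenses is mechanical enough to sketch in a few lines. The function below is an illustration only: it encodes the rule described above (Attribution and Attribution Sharealike are open, anything with a Non-commercial or No Derivatives element is not), and the name `is_open_cc_license` is hypothetical, not part of any official tool.

```python
# License element codes as used in Creative Commons license names:
# by = Attribution, sa = ShareAlike, nc = NonCommercial, nd = NoDerivatives.
RESTRICTIVE_ELEMENTS = {"nc", "nd"}


def is_open_cc_license(code: str) -> bool:
    """True if a CC license code such as 'CC-BY-SA' has no NC or ND element.

    Encodes only the distinction drawn in the text; it is not an
    official conformance check.
    """
    elements = set(code.lower().split("-")) - {"cc"}
    return "by" in elements and not (elements & RESTRICTIVE_ELEMENTS)


for code in ["CC-BY", "CC-BY-SA", "CC-BY-NC", "CC-BY-ND", "CC-BY-NC-SA"]:
    print(code, is_open_cc_license(code))
```

Of the five codes listed, only the first two pass the check.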

Nevertheless, it’s fantastic to see growing support for open textbooks!

Open Data Going Mainstream?

April 10th, 2008

Bret Taylor’s recent post entitled “We Need a Wikipedia for Data” has been garnering a lot of attention around the blogosphere. While his suggestions are not particularly novel, the post and the attention it has garnered are, I think, indicative of the growing interest in (open) data and its importance for the development of related services and products.

While generally in agreement with Bret’s arguments, there are a few differences that are worth raising. First, Bret appears to favour some kind of centralized repository that everyone can read from and write to:

To this end, I think we should create a Wikipedia for data: a global database for all of these important data sources to which we all contribute and that anyone can use.

As readers of this blog will know, we’re sceptical of this ‘one ring to rule them all’ approach. In this regard, it is also important to distinguish between finding material, parsing it, and plugging it together: issues that got rather run together in the surrounding discussion. As I wrote in a comment to Bret’s post:

There seem to be several distinct issues you (and your commenters) are concerned with:

1. Discoverability of datasets. For this you want a registry of some kind and this is exactly what the Comprehensive Knowledge Archive Network (CKAN) is designed to do. …

2. ‘Developing’ data particularly using many contributors and a versioning (wiki-like) model. This seems a general problem and one which I wrote about in this post on the collaborative development of data back in February last year. Since then various projects have launched or developed which attempt to address this issue, even if only partially (e.g. Freebase, Swivel, Numbrary, http://www.openeconomics.net …). This then leads into:

3. Componentizing data so that one can easily plug different datasets together rather than having to aggregate data together in one big place (crudely: ‘One Ring to Rule them All’ vs. ‘Small Pieces, Loosely Joined’). After all it seems unlikely that any one organization, however large, can hold ‘all the data’, and in any case doing so would negate the benefits of having ‘many minds’ working on a problem. It is our hope that CKAN would start to facilitate the kind of packaging that one frequently observes in software but is, as yet, fairly rare for knowledge (data/content/…). More on this can be found in this blog post on componentization plus the slides from our presentation at XTech.

To conclude, I definitely agree about the importance of having more open data and making it easier to find and use though I’m hoping that it will take a more decentralized and componentized form than simply a ‘wikipedia’ for data. More important though than any details is the fact that this kind of interest from a wider audience indicates that issues of data openness and production are going mainstream — something we as a community should strongly welcome.

OKCon 2008

We’re pleased to announce that audio, images and slides from OKCon 2008 are now available at the Post-Event Information page.

Most of the material can be obtained from the OKF subversion repository.

If you’ve blogged the event or have pictures or the like, please let us know and we’ll post a link from the Post-Event page. We are also able to host any further documentation in the repository.

Many thanks to all of you who came to speak, present and participate! We had a great day and very much enjoyed the talks, demos and conversations that took place throughout the day.

We’ve now set up a wiki page for local Open Knowledge groups - to arrange meetups, forums and other activities.

In addition to the Cambridge group, which has been around for a few years, we are in the process of creating groups in London and Oxford. If you’d like to get involved in any of these, or you’d like to set up your own local group - don’t hesitate to get in touch!

Our second annual Open Knowledge Conference (OKCon) is taking place tomorrow. Like last year, the event will bring together individuals and groups from across the open knowledge spectrum for a day of seminars and workshops. Though we’re nearing capacity, there are still a few places left for last minute registrants!

Details

Speakers

Session 1 (1045-1200): Transport and Environment

  • Gavin Starks (AMEE and dgen)
  • Tom Steinberg (MySociety)
  • Dr Muki Haklay (Department of Civil, Environmental and Geomatic Engineering, University College London)

Session 2 (1200-1315): Visualization and Analysis

  • Liz Turner (Freelance Designer and Visualizer Extraordinaire)
  • Gael Varoquaux (Mayavi2 - the next Generation Visualization Toolkit)
  • Martin Albrecht (SAGE the Open Source Mathematics Engine)

Session 3 (1415-1530): Education and Academia

  • Erik Duval (ARIADNE)
  • Lisa Petrides (OER Commons)
  • Dr Martin Brett (Cambridge University History Department and the Ivo Project)

Open Space

  • 1540-1640 (Room 1): Open Media
  • 1540-1640 (Room 2): Remixing, Peer Production and Open Knowledge
  • 1645-1745 (Room 1): Law, Licensing and Policy
  • 1645-1745 (Room 2): Versioning, Packaging, and Structuring Open Material
  • 1750-1830 (Room 1): Kept free for spontaneous contributions and breakout sessions

A more detailed schedule can be found at the Open Space wiki page

Theme

‘Open Knowledge’ is material that others are free to access, reuse or re-distribute and may be anything from sonnets to statistics, genes to geodata. In recent years we’ve seen the growth of successful open knowledge projects - from peer reviewed journals to community edited encyclopaedias - but what impact can open licensing have in education, research and commerce? Is sharing the key to scaling? What kinds of business models are available to open knowledge distributors and how is open knowledge applied in different institutional and professional contexts?

Furthermore, there now exist large and growing amounts of open material but what kinds of tools are available to analyse and represent it? How can we sort, search, store it to maximise its visibility and reusability?

We’ve also witnessed in the last few years the rise of web-based services — from social networking sites to online spreadsheet packages. While we have definitions for open software and open knowledge, what is an open service and what kinds of new services can be built using open knowledge?

Organizers

OKCon is organized by the Open Knowledge Foundation in partnership with the LSE Information Systems and Innovation Group.

Given the public role of libraries and the fact that bibliographic metadata (i.e. the material in library catalogues) doesn’t seem that exciting from a commercial point of view you might think that, of all the types of data out there, it would be bibliographic data that would be the most open. You might even think, given the public-spiritedness of librarians, that this is the kind of area where not only could it be openly available but it would be openly available (in nice little bzip or gzipped dumps …).

In fact the situation is quite the opposite. Most libraries appear to implicitly or explicitly exert rights over their data with some libraries licensing access to their catalogue data for substantial sums of money. The following lists some of the examples (both closed and open) that we know of:

  • Library of Congress: public domain in the US (or at least free) but copyrighted outside the US. See [1] and comments in the fred2.0 readme which state:

    These data are works of the United States Government and as such are not subject to copyright within the United States. (17 U.S.C §105).

    The Library of Congress has copyrighted these data for use outside the United States. Contact the LC for permission prior to use or distribution of this data outside the United States. [http://www.loc.gov/cds/mds.html]

  • fred2.0 (fred2.0 CKAN package): an excellent example of the effort to make material available but unfortunately has the same restrictions as Library of Congress (from which the material is sourced).
  • British Library: closed (and apparently gets sold for substantial sums).
  • OCLC/Worldcat: closed. See the OCLC CKAN page.
  • Barton/Simile: semi-open. Sourced from OCLC. Originally taken down but now back under CC non-commercial. See [1] for further discussion.
  • OpenLibrary: in theory open (though no formal license or dump as yet and some material may have been sourced from LoC making it suspect outside of the US)
  • isbndb.com: not really fully bibliographic data and status uncertain (see isbndb.com CKAN page)
  • LibraryThing: closed. Does not seem to make data available and source would likely make this problematic (from the about page):

    LibraryThing uses Amazon and libraries that provide open access to their collections with the Z39.50 protocol. The protocol is used by a variety of desktop programs, notably bibliographic software like EndNote. LibraryThing appears to be the first mainstream web use.

As we continue to search for open sources of bibliographic data we’d love to hear from anyone who knows of examples not already on this list.

[1] http://www.bookism.org/open/2007/04/02/open-data-what-would-kilgour-think/

Yesterday Creative Commons announced that their Attribution and Attribution Sharealike licenses will feature a seal of approval and link to Freedom Defined - the Definition of Free Cultural Works. We’ve been in touch with Freedom Defined since May 2006 (we blogged about the project last year) as their aims are so similar to those of opendefinition.org and the Open Knowledge Definition.

While there was discussion last year of merging the two projects, it now looks as though they will remain complementary - with Freedom Defined focusing on cultural works, and with the Open Knowledge Definition retaining a broader conception of ‘knowledge’ that includes data (see e.g. Good news for open data).

Mike Linksvayer of Creative Commons comments:

This added signaling is part of an ongoing effort to distinguish among the range of Creative Commons licenses — never say the Creative Commons license, as there is no such thing. Our license deeds have always communicated the distinct properties of each license with icons and brief descriptions.

This is great news and will hopefully contribute to the strengthening of a more robust sense of free culture/open knowledge within the plethora of liberal licensing options that are now available!