
Let’s fix EU copyright!

Stefan Wehrmeyer - January 18, 2014 in Featured, Open Content

Today is Day 6 of Copyright Week, organised by EFF, looking at Getting Copyright Right.

Fix copyright

The European Commission is currently holding a Public Consultation on the review of the EU copyright rules – and they’re looking for your input.

Unfortunately, the consultation documents provided by the European Commission are difficult to fill out: rather than encouraging online participation, the Commission has provided forms to be printed out, and the questions are hard to understand.

So, with the support of Creativity for Copyright, and the help of a number of other people contributing code, translations and guides, a site has been set up: http://youcan.fixcopyright.eu/. It provides a simple web form and contextual information that helps people understand the current problems with copyright.

You can complete the form in a number of different languages, and after you finish it, the completed document will be sent to you by email.

Your input is crucial to pushing them in the right direction – make your voice heard now! http://youcan.fixcopyright.eu/

PDF Liberation Hackathon – January 18-19

Guest - December 19, 2013 in Events, Featured, Open Access, Open Content, Sprint / Hackday

This guest blog post has been written by Marc Joffe, of Public Sector Credit Solutions.


Open government data is valuable only to the extent that it can be used cost-effectively. When governments provide “open data” in the form of voluminous PDFs they offer the appearance of openness without its benefits. In this situation, the open government movement has two options: demand machine readable data or hack the PDFs – using technology to liberate the interesting data from them. The two approaches are complementary; we can pursue both at the same time.

When it comes to liberating data from PDFs, advanced technologies are available but expensive. In my previous life as a technology manager at a financial firm, I was given the opportunity to purchase a sophisticated PDF extraction tool for USD 200,000 – not counting annual maintenance and implementation consulting costs.

This amount is beyond the reach of just about every startup and non-profit in the open data world. It is also beyond the means of most media organizations, so lowering the cost of PDF extraction is also a priority for journalists. The data journalism community has responded by developing software to harvest usable information from PDFs. Tabula, a tool written by Knight-Mozilla OpenNews Fellow Manuel Aristarán, extracts data from PDF tables in a form that can be readily imported to a spreadsheet – if the PDF was “printed” from a computer application. Introduced earlier this year, Tabula continues to evolve thanks to the volunteer efforts of Manuel, with help from OpenNews Fellow Mike Tigas and New York Times interactive developer Jeremy Merrill. Meanwhile, DocHive, a tool whose continuing development is being funded by a Knight Foundation grant, addresses PDFs that were created by scanning paper documents. DocHive is a project of Raleigh Public Record and is led by Charles and Edward Duncan.

These open source tools join a number of commercial offerings such as Able2Extract and ABBYY Fine Reader that extract data from PDFs. A more comprehensive list of open source and commercial resources is available here.
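To give a concrete sense of what working with these tools looks like in practice, here is a minimal batch-extraction sketch, assuming Tabula's engine via the tabula-py Python wrapper; the folder name and options are invented for the example, not taken from this post:

```python
import glob
import tabula  # tabula-py, a Python wrapper around Tabula's extraction engine (needs Java)

# Pull every table out of every PDF in a folder and dump each one to CSV.
for path in glob.glob("disclosures/*.pdf"):
    # lattice=True suits PDFs "printed" from applications, where table cells
    # are delimited by ruled lines; scanned paper documents need OCR first.
    tables = tabula.read_pdf(path, pages="all", multiple_tables=True, lattice=True)
    for i, table in enumerate(tables):
        table.to_csv(f"{path}.table{i}.csv", index=False)
```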

Unfortunately, the free and low cost tools available to data journalists and transparency advocates have limitations that hinder their ability to handle large scale tasks. If, like me, you want to submit hundreds of PDFs to a software tool, press “Go” and see large volumes of cleanly formatted data, you are out of luck. These limits reduce our ability to analyze and report on Parliamentary/Congressional financial disclosures, campaign contribution records and government budgets – which often arrive in volume, in PDF form.

PDF hacking has uses outside the government transparency / data journalism nexus. As Peter Murray-Rust has argued, the progress of science is being retarded because valuable data are “jailed” within PDF journal articles. For this reason, Dr. Murray-Rust and several colleagues have been developing AMI – a tool that leverages Apache PDFBox to mine usable content from scientific documents.

Whether your motive is to improve government, lower the cost of data journalism or free scientific data, you are welcome to join The PDF Liberation Hackathon on January 18-19, 2014 – sponsored by The Sunlight Foundation, Knight-Mozilla OpenNews and others. We’ll have hack sites at the NYU-Poly Incubator in New York, Chicago Community Trust, Sunlight’s Washington DC office and at RallyPad in San Francisco (one or two locations will have an opening social on the evening of the 17th). Developers can also join remotely because we will publish a number of clearly specified PDF extraction challenges before the hackathon.

Participants can work on one of the pre-specified challenges or choose their own PDF extraction projects. Ideally, hackathon teams will use (and hopefully improve upon) open source tools to meet the hacking challenges, but they will also be allowed to embed commercial tools into their projects as long as their licensing cost is less than $1000 and an unlimited trial is available.

Prizes of up to $500 will be awarded to winning entries. To receive a prize, a team must publish their source code on a GitHub public repository. To join the hackathon in DC or remotely, please sign up at Eventbrite; to hack with us in SF, please sign up via this Meetup. Signup links for New York and Chicago will be posted here. Please also complete our Google Form survey.

The PDF Liberation Hackathon is going to be a great opportunity to advance the state of the art when it comes to harvesting data from public documents. I hope you can join us.

What We Hope the Digital Public Library of America Will Become

Jonathan Gray - April 17, 2013 in Bibliographic, Featured, Free Culture, Open Content, Open GLAM, Open Humanities, Policy, Public Domain

Tomorrow is the official launch date for the Digital Public Library of America (DPLA).

If you’ve been following it, you’ll know that it has the long term aim of realising “a large-scale digital public library that will make the cultural and scientific record available to all”.

More specifically, Robert Darnton, Director of the Harvard University Library and one of the DPLA’s leading advocates to date, recently wrote in the New York Review of Books that the DPLA aims to:

make the holdings of America’s research libraries, archives, and museums available to all Americans—and eventually to everyone in the world—online and free of charge

What will this practically mean? How will the DPLA translate this broad mission into action? And to what extent will they be aligned with other initiatives to encourage cultural heritage institutions to open up their holdings, like our own OpenGLAM or Wikimedia’s GLAM-WIKI?

Here are a few of our thoughts on what we hope the DPLA will become.

A force for open metadata

The DPLA is initially focusing its efforts on making existing digital collections from across the US searchable and browsable from a single website.

Much like Europe’s digital library, Europeana, this will involve collecting information about works from a variety of institutions and linking to digital copies of these works that are spread across the web. A super-catalogue, if you will, that includes information about and links to copies of all the things in all the other catalogues.

Happily, we’ve already heard that the DPLA is releasing all of the data about cultural works that it collects under the CC0 legal tool – meaning that anyone can use, share or build on this information without restriction.

We hope they continue to proactively encourage institutions to explicitly open up metadata about their works, and to release this as machine-readable raw data.

Back in 2007, we – along with the late Aaron Swartz – urged the Library of Congress to play a leading role in opening up information about cultural works. So we’re pleased that it looks like DPLA could take on the mantle.

But what about the digital copies themselves?

A force for an open digital public domain

The DPLA has spoken about using fair use provisions to increase access to copyrighted materials, and has even intimated that they might want to try to change or challenge the state of the law to grant further exceptions or limitations to copyright for educational or noncommercial purposes (trying to succeed where Google Books failed). All of this is highly laudable.

But what about works which have fallen out of copyright and entered the public domain?

Just as they are doing with metadata about works, we hope that the DPLA takes a principled approach to digital copies of works which have entered the public domain, encouraging institutions to publish these without legal or technical restrictions.

We hope they become proactive evangelists for a digital public domain which is open as in the Open Definition, meaning that digital copies of books, paintings, recordings, films and other artefacts are free for anyone to use and share – without restrictive clickwrap agreements, digital rights management technologies or digital watermarks to impose ownership and inhibit further use or sharing.

The Europeana Public Domain Charter, in part based on and inspired by the Public Domain Manifesto, might serve as a model here. In particular, the DPLA might take inspiration from the following sections:

What is in the Public Domain needs to remain in the Public Domain. Exclusive control over Public Domain works cannot be re-established by claiming exclusive rights in technical reproductions of the works, or by using technical and/or contractual measures to limit access to technical reproductions of such works. Works that are in the Public Domain in analogue form continue to be in the Public Domain once they have been digitised.

The lawful user of a digital copy of a Public Domain work should be free to (re-)use, copy and modify the work. Public Domain status of a work guarantees the right to re-use, modify and make reproductions and this must not be limited through technical and/or contractual measures. When a work has entered the Public Domain there is no longer a legal basis to impose restrictions on the use of that work.

The DPLA could create their own principles or recommendations for the digital publication of public domain works (perhaps recommending legal tools like the Creative Commons Public Domain Mark) as well as ensuring that new content that they digitise is explicitly marked as open.

Speaking at our OpenGLAM US launch last month, Emily Gore, the DPLA’s Director for Content, said that this is definitely something that they’d be thinking about over the coming months. We hope they adopt a strong and principled position in favour of openness, and help to raise awareness amongst institutions and the general public about the importance of a digital public domain which is open for everyone.

A force for collaboration around the cultural commons

Open knowledge isn’t just about stuff being able to freely move around on networks of computers and devices. It is also about people.

We think there is a significant opportunity to involve students, scholars, artists, developers, designers and the general public in the curation and re-presentation of our cultural and historical past.

Rather than just having vast pools of information about works from US collections, wouldn’t it be great if there were hand-picked anthologies of works by Emerson or Dickinson curated by leading scholars? Or collections of songs or paintings relating to a specific region, chosen by knowledgeable local historians who know about allusions and references that others might miss?

An ‘open by default’ approach would enable use and engagement with digital content that breathes a life into it that it might not otherwise have – from new useful and interesting websites, mobile applications or digital humanities projects, to creative remixing or screenings of out of copyright films with new live soundtracks (like Air’s magical reworking of Georges Méliès’s 1902 film Le Voyage Dans La Lune).

We hope that the DPLA takes a proactive approach to encouraging the use of the digital material that it federates, to ensure that it is as impactful and valuable to as many people as possible.

Open and the “Next Great Copyright Act”

Mike Linksvayer - March 20, 2013 in Legal, Open Content, Public Domain

Director of the U.S. Copyright Office Maria Pallante is expected to call today for updates to U.S. copyright law. Her brief written testimony is already available and a longer speech given two weeks ago (titled “The Next Great Copyright Act”) provides additional flavor.

Substantial changes to copyright will take years to play out in the U.S., and similarly around the world. If Open is to impact how copyright and other knowledge regulation plays out over the next years, we must assert how and why, and develop our strategies for making it so. Statements like Pallante’s provide not-to-be-missed opportunities to contextualize and explain the importance of Open to the world.

While Pallante’s calls are at best a mixed bag, two items offer glimmers of hope and are useful for illustrating both the value and strategy of Open:

Congress also may need to apply fresh eyes to the next great copyright act to ensure that the copyright law remains relevant and functional. This may require some bold adjustments to the general framework. You may want to consider alleviating some of the pressure and gridlock brought about by the long copyright term — for example, by reverting works to the public domain after a period of life plus fifty years unless heirs or successors register their interests with the Copyright Office.

50 years with an option for more is far from anything that might be considered optimal — OKF’s Rufus Pollock has estimated 15 years and others less, even before accounting for values achieved through openness such as freedom and equality — and is a dangerous place to start new debate, considering that Disney lobbyists have not yet weighed in.

But any possibility of mitigating the heretofore relentless march of copyright term extension – and, by implication, any appreciation of the value of the public domain – is welcome, and an opportunity.

Some of the most compelling work by the Open community involves making public domain works accessible, and celebrating our bounty. Compelling for culture — and critical for policy. What better way to make the case for expanding and protecting the public domain than to demonstrate and increase the value of works that are free of copyright restriction even now? Well, we have to talk about our work in those terms, loudly!

Pallante:

And in compelling circumstances, you may wish to reverse the general principle of copyright law that copyright owners should grant prior approval for the reproduction and dissemination of their works — for example, by requiring copyright owners to object or “opt out” in order to prevent certain uses, whether paid or unpaid, by educational institutions or libraries.

Openly licensed works — those that all are free to use, reuse, and redistribute subject only, at most, to the requirement to attribute and/or share-alike — unambiguously permit such uses, right now, and are increasingly becoming expected and even mandated where public funding is provided or public benefit is a primary goal. What better way to make the case for liberal policy where public funding or benefit is at stake than to promote and demonstrate the value of Open works now? Again, we have to talk about our usual pro-openness work’s relevance to policy, loudly!

But open licensing is opt-in (even when mandated, it is as if a group opted-in, still leaving default policy for everyone else), ultimately limiting its impact. We shouldn’t shy away from that reality — indeed it is a key reason open licensing can be, if we make it so, a harbinger of better default policy, but not at all a substitute for better default policy.

When positioning Open in the context of broader copyright and other information regulation debates, we shouldn’t be content merely to address, from an Open perspective, the points made in those debates. We must also raise additional issues that arise from the experience of Open movements: a knowledge commons requires protection and promotion.

Private enclosure of public domain and Open works, e.g. through “copyfraud”, might be addressed through policy. Ensuring the public’s right to audit, understand, replicate, and modify data and tools such as software and designs for research and hardware might be addressed through policy. In fact, we know these can be addressed through policy, as demonstrated for decades on an opt-in basis through copyleft, one of the signal innovations of our movements.

Although over 25 years old (starting with free software), open licenses and the amazing projects that use them (that run the Internet, and are making governments more transparent, bit by bit, and so much more) have played almost no explicit role in debates about default copyright policy. Hopefully you’re beginning to think that we can change that — with little or no alteration of our existing Open activities, as we mainly need to appreciate just how provocative and potent those are, and tell the public, especially the policy world.

Ultimately, we can shift the centrality of “copyright policy” to that of “open policy” — what information regulation is best for the knowledge commons — in the service of all humanity’s yearning for freedom, equality, and well-governed institutions.

Boundless Learning demands a jury trial

Theodora Middleton - February 15, 2013 in External, Open Content, Open Textbooks

We’ve been following the case of Boundless Learning on the OKF blog (see here and here), in which the world’s most prominent producer of Open Access textbooks online is being sued by the world’s biggest producers of physical, copyrighted textbooks. In the latest twist to the tale, Boundless have filed their answer, requesting a trial by jury.

The publishers who are pursuing Boundless – Pearson, Cengage and Macmillan’s Bedford, Freeman & Worth – do not allege that any of their content has been plagiarised, or claim copyright on any of the facts or ideas in their books (since it is impossible to claim copyright on such things). Instead they allege that the “selection, coordination and arrangement” of the unprotectable elements has been pilfered.

Boundless counter that following the same basic order in textbooks “is necessitated by the subject matter and standard in these fields” – a claim which they believe will be borne out at trial over the coming months.

In their press release they say:

At a time when textbook prices have risen at three times the rate of inflation, Boundless is well along the way to turning around this escalation by offering equivalent quality, openly-licensed educational materials online at dramatically lower costs … Boundless will vigorously deny the overly broad and legally flawed allegations made by the publishers … Boundless is confident that it will become evident that its digital textbooks do not violate copyright or any other rights of the plaintiffs.

Boundless have been at the forefront of challenging the oligopoly of the big textbook publishers, and the outcome of this case will have implications for everyone in the sector. Boundless seem confident that a jury of peers will agree that their efforts are a development in the right direction. The rapidly-expanding world of Open Online Education is watching with bated breath.

Open Research Data Handbook Sprint

Velichka Dimitrova - February 15, 2013 in Open Access, Open Content, Open Data, Open Economics, Open Science, Open Standards, Our Work, WG Economics

On February 15-16 we are updating the Open Research Data Handbook to include more detail on sharing research data from scientific work, and to remix the book for different disciplines and settings. We’re doing this through an open book sprint. The sprint will happen at the Open Data Institute, 65 Clifton Street, London EC2A 4JE.

The Friday lunch seminar will be streamed through the Open Economics Bambuser channel. If you would like to participate, please see the Online Participation Hub for links to documents and programme updates. You can follow this event in the IRC channel #okfn-rbook and on Twitter with the hashtags #openresearch and #okfnrbook.

The Open Research Data Handbook aims to provide an introduction to the processes, tools and other areas that researchers need to consider to make their research data openly available.

Join us for a book sprint to develop the current draft, and explore ways to remix it for different disciplines and contexts.

Who it is for:

  • Researchers interested in carrying out their work in more open ways
  • Experts on sharing research and research data
  • Writers and copy editors
  • Web developers and designers to help present the handbook online
  • Anyone else interested in taking part in an intense and collaborative weekend of action

What will happen:

The main sprint will take place on Friday and Saturday. After initial discussions we’ll divide into open space groups to focus on research, writing and editing for different chapters of the handbook, developing a range of content including How To guidance, stories of impact, collections of links and decision tools.

A group will also look at digital tools for presenting the handbook online, including ways to easily tag content for different audiences and remix the guide for different contexts.

Agenda:

Where: 65 Clifton Street, EC2A 4JE (3rd floor – the Open Data Institute)

Friday, February 15th

  • 13:00 – 13:30: Arrival and sushi lunch
  • 13:30 – 14:30: Open research data seminar with Steven Hill, Head of Open Data Dialogue at RCUK.
  • 14:30 – 17:30: Working in teams

Saturday, February 16th

  • 10:00 – 10:30: Arrival and coffee
  • 10:30 – 11:30: Introducing open research lightning talks (your space to present your project on research data)
  • 11:30 – 13:30: Working in teams
  • 13:30 – 14:30: Lunch
  • 14:30 – 17:30: Working in teams
  • 17:30 – 18:30: Reporting back

As many of you have already registered for online participation, we will broadcast the lunch seminar through the Open Economics Bambuser channel. Please also drop by the IRC channel #okfn-rbook.

Partners:

  • OKF Open Science Working Group – creators of the current Open Research Data Handbook
  • OKF Open Economics Working Group – exploring economic aspects of open research
  • Open Data Research Network – exploring a remix of the handbook to support open social science research in a new global research network, focussed on research in the Global South
  • Open Data Institute – hosting the event

Version Variation Visualisation

Tom Cheesman - February 8, 2013 in Featured Project, Open Content, Public Domain, WG Linguistics

In 2010, I had a long paper about the history of German translations of Othello rejected by a prestigious journal. The reviewer wrote: “The Shakespeare Industry doesn’t need more information about world Shakespeare. We need navigational aids.”

About the same time, David Berry turned me on to Digital Humanities. I got a team together (credits) and we’ve built some cool new tools.

All culturally important works are translated over and over again. The differences are interesting. Different versions of Othello reflect changing, contested ideas about race, gender and sexuality, political power, and so on, over the centuries, right up to the present day. Hence any one translation is just one snapshot from its local place and moment in time, just one interpretation, and what’s interesting and productive is the variation, the diversity.

But with print culture tools, you need a superhuman memory, a huge desk and ideally several assistants, to leaf backwards and forwards in all the copies, so you can compare and contrast. And when you present your findings, the minutiae of differences can be boring, and your findings can’t be verified. How do you know I haven’t just picked quotations that support my argument?

But with digital tools, multiple translations become material people can easily work and play with, research and create with, and we can begin to use them in totally new ways.

Recent work

We’ve had funding from UK research councils and Swansea University to digitize 37 German versions of Othello (1766-2010) and build these prototype tools. There you can try out our purpose-built database and tools for freely segmenting and aligning multiple versions; our timemap of versions; our parallel text navigation tool which uses aligned segment attributes for navigation; and most innovative of all: the tool we call ‘Eddy and Viv’. This lets you compare all the different translations of any segment (with help from machine translation), and it also lets you read the whole translated text in a new way, through the variety of translations. You don’t need to know the translating language.

This is a radical new idea (more details on our platform). Eddy and Viv are algorithms: Eddy calculates how much each translation of each segment differs in wording from others, then Viv maps the variation in the results of that analysis back onto the translated text segments.

This means you can now read Shakespeare in English, while seeing how much all the translators disagree about how to interpret each speech or line, or even each word. It’s a new way of reading a literary work through translators’ collective eyes, identifying hotspots of variation. And you don’t have to be a linguist.
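For readers who like to see the idea in code, here is a toy sketch of the Eddy/Viv logic as described above; it is not the project's actual implementation, and the word-overlap metric and the German renderings are invented purely for illustration:

```python
from typing import List, Tuple

def eddy_scores(translations: List[str]) -> List[float]:
    """Toy Eddy: for one segment, score each translation by its average
    bag-of-words dissimilarity (1 - Jaccard overlap) from the others."""
    bags = [set(t.lower().split()) for t in translations]
    scores = []
    for i, a in enumerate(bags):
        others = [b for j, b in enumerate(bags) if j != i]
        dists = [1 - len(a & b) / len(a | b) for b in others if a | b]
        scores.append(sum(dists) / len(dists) if dists else 0.0)
    return scores

def viv_map(segments: List[str],
            translations_per_segment: List[List[str]]) -> List[Tuple[str, float]]:
    """Toy Viv: attach each segment's mean Eddy score back onto the base
    text, so hotspots of translator disagreement stand out."""
    return [
        (seg, sum(eddy_scores(ts)) / len(ts))
        for seg, ts in zip(segments, translations_per_segment)
    ]

# Example: one Othello line with three invented German renderings.
segments = ["Put out the light, and then put out the light."]
translations = [[
    "Loesch aus das Licht, und dann loesch aus das Licht.",
    "Das Licht gedaempft, und dann ihr Licht gedaempft.",
    "Erst dieses Licht geloescht, und dann das ihre.",
]]
for seg, score in viv_map(segments, translations):
    print(f"{score:.2f}  {seg}")
```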

Future plans and possible application to collections of public domain texts

I am a linguist, so I’m interested in finding new ways to overcome language barriers, but equally I’m interested in getting people interested in learning languages. Eddy and Viv have that double effect. So these are not just research tools: we want to make a cultural difference.

We’re applying for further funding. We envisage an open library of versions of all sorts of works, and a toolsuite supporting experimental and collaborative approaches to understanding the differences, using visualizations for navigation, exploration and comparison, and creating presentations for research and education.

The tools will work with any languages, any kinds of text. The scope is vast, from fairy tales to philosophical classics. You can also investigate versions in just one language – say, different editions of an encyclopedia, or different versions of a song lyric. It should be possible to push the approach beyond text, to audio and video, too.

Shakespeare is a good starting point, because the translations are so numerous, increasing all the time, and the differences are so intriguing. But a few people have started testing our tools on other materials, such as Jan Rybicki with Polish translations of Joseph Conrad’s work. If we can demonstrate the value, and simplify the tasks involved, people will start on other ‘great works’ – Aristotle, Buddhist scripture, Confucius, Dante (as in Caroline Bergvall’s amazing sound work ‘Via’), Dostoyevski, Dumas…

Many translations of transculturally important works are in the public domain. Most are not, yet. So copyright is a key issue for us. We hope that as the project grows, more copyright owners will be willing to grant more access. And of course we support reducing copyright restrictions.

Tim Hutchings, who works on digital scriptures, asked me recently: “Would it be possible to create a platform that allowed non-linguist readers to appreciate the differences in tone and meaning between versions in different languages? … without needing to be fluent in all of those languages.” – Why not, with imaginative combinations of various digital tools for machine translation, linguistic analysis, sentiment analysis, visualization and not least: connecting people.

Boundless Releases All Its Textbooks Under Open License

Sam Leon - January 24, 2013 in News, Open Access, Open Content

News just in that Boundless, the open source digital textbook provider, is releasing all of its 18 open source textbooks under a Creative Commons Attribution-ShareAlike license.

We covered the progress of this brilliant initiative mid-way through last year. Boundless leverages open content on the web, whether that’s information on Wikipedia or digital copies of public domain artworks, to produce textbooks that are free for everyone to access.

Boundless provides an alternative to traditional textbooks that are out of reach for many, given their often hefty price tags. We’re really excited to see that the company is now making all its content available under such a permissive license, which will maximise re-use of this material while making sure that those who have spent time compiling and writing these resources are attributed.

The range of textbooks on Boundless is continually increasing. You can learn about everything from accounting and biology to sociology and economics. Many of the textbooks are supplemented by other useful learning aids such as flash cards and quizzes. What is more, these textbooks don’t go out-of-date when a new discovery is made or practices change within a given domain.

To find out more about what Boundless do, visit their website.

Sita’s free: Landmark copyleft animated film is now licensed CC0

Sarah Stierch - January 19, 2013 in Free Culture, Open Content, Public Domain, Public Domain Works

Sit back and relax, Sita… you’re free!

This past Friday, American cartoonist, animator, and free culture activist Nina Paley announced she was releasing her landmark animated film Sita Sings the Blues under a Creative Commons CC0 license. Sita Sings the Blues is quite possibly the most famous animated film to be released under an open license. The 82-minute film, which is an autobiographical story mixed with an adaptation of the Ramayana, was released in 2008 under a Creative Commons Attribution-ShareAlike license.

Paley, a well-known copyleft and free licensing advocate, found inspiration for releasing Sita in recent life events. The day after learning about the death of internet activist and computer programmer Aaron Swartz, Paley was asked by the National Film Board of Canada (NFB) to provide permissions for filmmaker Chris Landreth to “refer” to Sita Sings the Blues in an upcoming film. Challenges with NFB lawyers reminded Paley of the challenges Swartz faced in relation to his “freeing” of JSTOR documents. “I couldn’t bear to enable more bad lawyers, more bad decisions, more copyright bullshit, by doing unpaid paperwork for a corrupt and stupid system. I just couldn’t,” Paley explained on her blog. She refused to sign the paperwork, and the NFB requested that Landreth remove any mentions of Sita in his film.

“CC-0 is as close as I can come to a public vow of legal nonviolence,” Paley states, channeling her frequent frustration with film industry lawyers and copyright. In a copyleft community where participants are often challenged on which license is the best option, Paley took this chance to try to find out: “I honestly have not been able to determine which Free license is “better,” and switching to CC-0 may help answer that question.”

Sita can now sing the blues (or perhaps something happier, since she is as free as it can get) without having to file paperwork ever again.

The Digital Public Library of America moving forward

Kenny Whitebloom - November 6, 2012 in Bibliographic, External, Open Content, Open Data, Open GLAM

A fuller version of this post is available on the Open GLAM blog

The Digital Public Library of America (DPLA) is an ambitious project to build a national digital library platform for the United States that will make the cultural and scientific record available, free to all Americans. Hosted by the Berkman Center for Internet & Society at Harvard University, the DPLA is an international community of over 1,200 volunteers and participants from public and research libraries, academia, all levels of government, publishing, cultural organizations, the creative community, and private industry devoted to building a free, open, and growing national resource.

Here’s an outline of some of the key developments in the DPLA planning initiative. For more information on the Digital Public Library of America, including ways in which you can participate, please visit http://dp.la.

Content

In the fall of 2012, the DPLA received funding from the National Endowment for the Humanities, the Institute of Museum and Library Services, and the Knight Foundation to support our Digital Hubs Pilot Project. This funding enabled us to develop the DPLA’s content infrastructure, including implementation of state and regional digital service pilot projects. Under the Hubs Pilot, the DPLA plans to connect existing state infrastructure to create a national system of state (or, in some cases, regional) service hubs.

The service hubs identified for the pilot are:

  • Mountain West Digital Library (Utah, Nevada and Arizona)
  • Digital Commonwealth (Massachusetts)
  • Digital Library of Georgia
  • Kentucky Digital Library
  • Minnesota Digital Library
  • South Carolina Digital Library

In addition to these service hubs, organizations with large digital collections that are going to make those collections available via the DPLA will become content hubs. We have identified the National Archives and Records Administration, the Smithsonian Institution, and Harvard University as some of the first potential content hubs in the Digital Hubs Pilot Project.

Here’s our director for content, Emily Gore, to give you a full overview:

Technical Development

The technical development of the Digital Public Library of America is being conducted in a series of stages. The first stage (December 2011-April 2012) involved the initial development of a back-end metadata platform. The platform provides information and services openly and to all without restriction by way of open source code.

We’re now on stage two: integrating continued development of the back-end platform, complete with open APIs, with new work on a prototype front end. It’s important to note that this front-end will serve as a gesture toward the possibilities of a fully built-out DPLA, providing but one interface for users to interact with the millions of records contained in the DPLA platform.

Development of the back-end platform — conducted publicly, with all code published on GitHub under a GNU Affero General Public License — continues so that others can develop additional user interfaces and means of using the data and metadata in the DPLA over time, which continues to be a key design principle for the project overall.
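To give a flavour of what “open APIs” means here in practice, the following is a minimal client-side sketch of searching an items endpoint; the URL, api_key parameter and response field names are assumptions for illustration rather than details taken from this post:

```python
import requests

API_BASE = "https://api.dp.la/v2"   # assumed public endpoint
API_KEY = "YOUR_API_KEY"            # hypothetical key requested from the DPLA

def search_items(query, page_size=10):
    """Query the (assumed) items endpoint and return matching records."""
    resp = requests.get(
        f"{API_BASE}/items",
        params={"q": query, "page_size": page_size, "api_key": API_KEY},
    )
    resp.raise_for_status()
    return resp.json().get("docs", [])

# Print titles and links for a sample search.
for item in search_items("Emily Dickinson"):
    resource = item.get("sourceResource", {})
    print(resource.get("title"), "-", item.get("isShownAt"))
```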

Events

We’ve been hosting a whole load of events, from our large public events like the DPLA Midwest last month in Chicago, to smaller more intimate hackathons. These events have brought together a wide range of stakeholders — librarians, technologists, creators, students, government leaders, and others – and have proved exciting and fruitful moments in driving the project forward.

On November 8-9, 2012, the DPLA will convene its first “Appfest” Hackathon at the Chattanooga Public Library in Chattanooga, TN. The Appfest is an informal, open call for both ideas and functional examples of creative and engaging ways to use the content and metadata in the DPLA back-end platform. We’re looking for web and mobile apps, data visualization hacks, dashboard widgets that might spice up an end-user’s homepage, or a medley of all of these. There are no strict boundaries on the types of submissions accepted, except that they be open source. You can check out some of the apps that might be built at the upcoming hackathon on the Appfest wiki page.

The DPLA remains an extremely ambitious project, and we encourage anyone with an interest in open knowledge and the democratization of information to participate in one form or another. If you have any questions about the project or ways to get involved, please feel free to email me at kwhitebloom[at]cyber.law.harvard.edu.
