New Open Knowledge Initiative on the Future of Open Access in the Humanities and Social Sciences

Jonathan Gray - October 21, 2014 in OKF Projects, Open Access, Open Humanities, Open Research, WG Humanities

To coincide with Open Access Week, Open Knowledge is launching a new initiative focusing on the future of open access in the humanities and social sciences.

The Future of Scholarship project aims to build a stronger, better connected network of people interested in open access in the humanities and social sciences. It will serve as a central point of reference for leading voices, examples, practical advice and critical debate about the future of humanities and social sciences scholarship on the web.

If you’d like to join us and hear about new resources and developments in this area, please leave us your details and we’ll be in touch.

For now we’ll leave you with some thoughts on why open access to humanities and social science scholarship matters:

“Open access is important because it can give power and resources back to academics and universities; because it rightly makes research more widely and publicly available; and because, like it or not, it’s beginning and this is our brief chance to shape its future so that it benefits all of us in the humanities and social sciences” – Robert Eaglestone, Professor of Contemporary Literature and Thought, Royal Holloway, University of London.

*

“For scholars, open access is the most important movement of our times. It offers an unprecedented opportunity to open up our research to the world, irrespective of readers’ geographical, institutional or financial limitations. We cannot falter in pursuing a fair academic landscape that facilitates such a shift, without transferring prohibitive costs onto scholars themselves in order to maintain unsustainable levels of profit for some parts of the commercial publishing industry.” – Dr Caroline Edwards, Lecturer in Modern & Contemporary Literature, Birkbeck, University of London and Co-Founder of the Open Library of Humanities

*

“If you write to be read, to encourage critical thinking and to educate, then why wouldn’t you disseminate your work as far as possible? Open access is the answer.” – Martin Eve, Co-Founder of the Open Library of Humanities and Lecturer, University of Lincoln.

*

“Our open access monograph The History Manifesto argues for breaking down the barriers between academics and wider publics: open-access publication achieved that. The impact was immediate, global and uniquely gratifying – a chance to inject ideas straight into the bloodstream of civic discussion around the world. Kudos to Cambridge University Press for supporting innovation!” – David Armitage, Professor and Chair of the Department of History, Harvard University and co-author of The History Manifesto

*

“Technology allows for efficient worldwide dissemination of research and scholarship. But closed distribution models can get in the way. Open access helps to fulfill the promise of the digital age. It benefits the public by making knowledge freely available to everyone, not hidden behind paywalls. It also benefits authors by maximizing the impact and dissemination of their work.” – Jennifer Jenkins, Senior Lecturing Fellow and Director, Center for the Study of the Public Domain, Duke University

*

“Unhappy with your current democracy providers? Work for political and institutional change by making your research open access and joining the struggle for the democratization of democracy” – Gary Hall, co-founder of Open Humanities Press and Professor of Media and Performing Arts, Coventry University

Celebrating Open Access Week by highlighting community projects!

Christian Villum - October 20, 2014 in Featured, Open Access

This week is Open Access Week all around the world, and from Open Knowledge’s side we are following up on last year’s tradition by putting together a blog post series to highlight great Open Access projects and activities in communities around the world. Every day this week will feature a new writer and activity.

Open Access Week, a global event now entering its eighth year, is an opportunity for the academic and research community to continue to learn about the potential benefits of Open Access, to share what they’ve learned, and to help inspire wider participation in helping to make Open Access a new norm in scholarship and research.

This past year has seen lots of great progress, and with the Open Knowledge blog we want to help amplify the amazing work done in communities around the world:

  • Tuesday, Jonathan Gray from Open Knowledge: “Open Knowledge work on Open Access in humanities and social sciences”
  • Wednesday, David Carroll from Open Access Button: “Launching the New Open Access Button”
  • Thursday, Alma Swan from SPARC Europe: “Open Access and the humanities: on our travels round the UK”
  • Friday, Jenny Molloy from Open Science working group: “OK activities in open access to science”
  • Saturday, Kshitiz Khanal from Open Knowledge Nepal: “Combining Open Science, Open Access, and Collaborative Research”
  • Sunday, Denis Parfenov from Open Knowledge Ireland: “Open Access: Case of Ireland”

We’re hoping that this series can inspire even more work around Open Access in the year to come and that our community will use this week to get involved both locally and globally. A good first step is to sign up at http://www.openaccessweek.org for access to a plethora of support resources, and to connect with the worldwide Open Access Week community. Another way to connect is to join the Open Access working group.

Open Access Week is an invaluable chance to connect the global momentum toward open sharing with the advancement of policy changes on the local level. Universities, colleges, research institutes, funding agencies, libraries, and think tanks use Open Access Week as a platform to host faculty votes on campus open-access policies, to issue reports on the societal and economic benefits of Open Access, to commit new funds in support of open-access publication, and more. Let’s add to their brilliant work this week!

Support Diego Gomez, Join the Global Open Access Movement

Christian Villum - October 1, 2014 in Open Access

This is a post put together based on great contributions on the blogs of the Electronic Frontier Foundation (Adi Kamdar & Maira Sutton), Creative Commons (Timothy Vollmer) and the Open Access Button project (David Carroll).

Join the global Open Access movement!

In July the Electronic Frontier Foundation (EFF) wrote about the predicament that Colombian student Diego Gomez found himself in after he shared a research article online. Gomez is a graduate student in conservation and wildlife management at a small university. He has generally poor access to many of the resources and databases that would help him conduct his research. Paltry access to useful materials combined with a natural culture of sharing amongst researchers prompted Gomez to share a paper on Scribd so that he and others could access it for their work. The practice of learning and sharing under less-than-ideal circumstances could land Diego in prison.

Facing 4-8 years in prison for sharing an article

The EFF reports that upon learning of this unauthorized sharing, the author of the research article filed a criminal complaint against Gomez. The charges lodged against Diego could put him in prison for 4-8 years. The trial has started, and the court will need to take into account several factors, including whether there was any malicious intent behind the action and whether there was any actual harm to the economic rights of the author.

Academics and students send and post articles online like this every day – it is simply the norm in scholarly communication. And yet inflexible digital policies, paired with senseless and outdated practices, have led to extreme cases like Diego’s. People who experience massive access barriers to existing research – most often hefty paywalls – often have no choice but to find and share relevant papers through colleagues in their network. The Internet has certainly enabled this kind of information sharing at an unprecedented speed and scale, but we are still far from reaching its full capacity.

Let’s stand together to support Diego Gomez and promote Open Access worldwide.

Help Diego Gomez and join academics and users in fighting outdated laws and practices that keep valuable research locked up for no good reason. If open access were the default for scholarly communication, cases like Diego’s would become obsolete. Academic research would be free to access and available under an open license that would legally enable the kind of sharing that is so crucial for enabling scientific progress.

We at Open Knowledge have joined as signatories of the petition in support of Diego alongside prominent organisations such as the Electronic Frontier Foundation, Creative Commons, Open Access Button, Internet Archive, Public Knowledge, and the Right to Research Coalition. Sign your support for Diego to express your support for open access as the default for scientific and scholarly publishing, so researchers like Diego don’t risk severe penalties for helping colleagues access the research they need:

[Click here to sign the petition]

Sign-on statement: “Scientific and scholarly progress relies upon the exchange of ideas and research. We all benefit when research is shared widely, freely, and openly. I support an Open Access system for academic publishing that makes research free for anyone to read and re-use; one that is inclusive of all and doesn’t force researchers like Diego Gomez to risk severe penalties for helping colleagues access the research they need.”

Newsflash! OKFestival Programme Launches

Beatrice Martini - June 4, 2014 in Events, Free Culture, Join us, Network, News, OKFest, OKFestival, Open Access, Open Data, Open Development, Open Economics, Open Education, Open GLAM, Open Government Data, Open Humanities, Open Knowledge Foundation, Open Knowledge Foundation Local Groups, Open Research, Open Science, Open Spending, Open Standards, Panton Fellows, Privacy, Public Domain, Training, Transparency, Working Groups

At last, it’s here!

Check out the details of the OKFestival 2014 programme – including session descriptions, times and facilitator bios here!

We’re using a tool called Sched to display the programme this year and it has several great features. Firstly, it gives individual session organisers the ability to update the details on the session they’re organising; this includes the option to add slides or other useful material. If you’re one of the facilitators we’ll be emailing you to give you access this week.

Sched also enables every user to create their own personalised programme to include the sessions they’re planning to attend. We’ve also colour-coded the programme to help you when choosing which conversations you want to follow: the Knowledge stream is blue, the Tools stream is red and the Society stream is green. You’ll also notice that there are a bunch of sessions in purple which correspond to the opening evening of the festival when we’re hosting an Open Knowledge Fair. We’ll be providing more details on what to expect from that shortly!

Another way to search the programme is by the subject of the session – find these listed on the right hand side of the main schedule – just click on any of them to see a list of sessions relevant to that subject.

As you check out the individual session pages, you’ll see that we’ve created etherpads for each session where notes can be taken and shared, so don’t forget to keep an eye on those too. And finally, to make the conversations even easier to follow from afar using social media, we’re encouraging session organisers to create individual hashtags for their sessions. You’ll find these listed on each session page.

We received over 300 session suggestions this year – the most yet for any event we’ve organised – and we’ve done our best to fit in as many as we can. There are 66 sessions packed into 2.5 days, plus 4 keynotes and 2 fireside chats. We’ve also made space for an unconference over the 2 core days of the festival, so if you missed out on submitting a proposal, there’s still a chance to present your ideas at the event: come ready to pitch! Finally, the Open Knowledge Fair has added a further 20 demos – and counting – to the lineup and is a great opportunity to hear about more projects. The Programme is full to bursting, and while some time slots may still change a little, we hope you’ll dive right in and start getting excited about July!

We think you’ll agree that Open Knowledge Festival 2014 is shaping up to be an action-packed few days – so if you’ve not bought your ticket yet, do so now! Come join us for what will be a memorable 2014 Festival!

See you in Berlin! Your OKFestival 2014 Team

All-star wrap-up of a month of Open Knowledge events all around the world – April 2014

Beatrice Martini - May 23, 2014 in Community Stories, Events, Featured, Meetups, OKF France, OKF Greece, OKF Italy, OKF Switzerland, OKFN France, Open Access, Open Data, Open Data Index, Open Government Data, Open Knowledge Foundation Local Groups, Sprint / Hackday, Workshop

Last month we asked the Open Knowledge community to start sharing more details about the events we all run, to discover how many people are rocking Open Knowledge events all around the world! The community has been great at responding to the call and now we’re glad to feature some of the April events we got reports (and pictures and videos!) from.

The winners of the Apps4Greece award have been announced! Check out the winning apps, aiming to improve the functionality of cities, businesses, services and develop entrepreneurship and innovation.

Organised by Open Knowledge France after the Paris Open Government Conference (April 24-25), during which France announced it is joining the Open Government Partnership – and gathering more than 50 people! Featuring Open Knowledge founder Rufus Pollock and discussions about the state of Open Data in France, the Open Data Index, the French version of School of Data, Ecole des Données (congratulations!), and more.

  • Open Access Days in Egypt (Cairo, Egypt – April 27-28): Open Knowledge Egypt, among many other organizations and researchers, participated in the 2-day event, driven by the aim to promote open access to researchers in Egypt and the Middle East and plant a seed for future initiatives.

We’re so looking forward to hearing everything about your upcoming events! Some juicy ones in the pipeline:

So, what are you waiting for? It’s time to share your stories for next month’s global roundup! Please submit your blogposts about your May events to the Community Tumblr (details about how/where here) by June 4 in order to be featured in our all-star monthly wrap-up to be published in June on the main Open Knowledge blog and channels! Thank you! We’re looking forward to hearing from you!

Skillshares and Stories: Upcoming Community Sessions

Heather Leson - April 3, 2014 in CKAN, Events, Network, OKF Brazil, OKF Projects, Open Access, Open Knowledge Foundation Local Groups, School of Data

We’re excited to share with you a few upcoming Community Sessions from the School of Data, CKAN, Open Knowledge Brazil, and Open Access. As we mentioned earlier this week, we aim to connect you to each other. Join us for the following events!

What is a Community Session? These online events can take a number of forms: a scheduled IRC chat, a community Google Hangout, a technical sprint or a hackpad editathon. The goal is to connect the community to learn and share their stories and skills.

We held our first Community Session yesterday (see our Wiki Community Session notes). The remaining April events will be online via G+. These sessions will be public Hangouts on Air. The video will be available on the Open Knowledge YouTube Channel after the event. Questions are welcome via Twitter and G+.

All these sessions are Wednesdays at 10:30 – 11:30 am ET/ 14:30 – 15:30 UTC.

Mapping with Ketty and Ali: a School of Data Skillshare (April 9, 2014)

Making a basic map from spreadsheet data: We’ll explore tools like QGIS (a free and open-source Geographic Information System) and TileMill (a tool to design beautiful interactive web maps). Our guest trainers are Ketty Adoch and Ali Rebaie.

To join the Mapping with Ketty and Ali Session on April 9, 2014

Q & A with Open Knowledge Brazil Chapter featuring Everton (Tom) Zanella Alvarenga (April 16, 2014)

Around the world, local groups, Chapters, projects, working groups and individuals connect to Open Knowledge. We want to share your stories.

In this Community Session, we will feature Everton (Tom) Zanella Alvarenga, Executive Director.

Open Knowledge Foundation Brazil is a newish Chapter. Tom will share his experiences growing a chapter and community in Brazil. We aim to connect you to community members around the world. We will also open up the conversation to all things Community. Share your best practices.

Join us on April 16, 2014 via G+

Take a CKAN Tour (April 23, 2014)

This week we will give an overview and tour of CKAN – the leading open source open data platform used by the national governments of the US, UK, Brazil, Canada, Australia, France, Germany, Austria and many more. This session will cover why data portals are useful, what they provide and showcase examples and best practices from CKAN’s varied user base! Our special guest is Irina Bolychevsky, Services Director (Open Knowledge Foundation).

Learn and share your CKAN stories on April 23, 2014

(Note: We will share more details about the April 30th Open Access session soon!)

Knowledge Creation to Diffusion: The Conflict in India

Guest - February 28, 2014 in Open Access, Open Research, Open Science

This is a guest post by Ranjit Goswami, Dean (Academics) and (Officiating) Director of Institute of Management Technology (IMT), Nagpur, India. Ranjit also volunteers as one of the Indian Country Editors for the Open Data Census.

Developing nations, India in particular, increasingly face a challenge in prioritizing their goals. One question that becomes increasingly relevant in this context, in the present age of open knowledge, is the role of subscription journals in the dissemination and diffusion of knowledge in a developing society. Young Aaron Swartz from Harvard made an effort to change this, and it cost him his life. Most developed nations have realized that research funded by taxpayers’ money should be made freely available to taxpayers, but awareness of these issues is at quite pathetic levels in India, both at the policy level and among members of the academic community.

Before one looks at the problem, a contextual understanding is needed. Today, a lot of research is done globally, including some of it in India, and its importance in transforming nations and societies is increasingly getting its due recognition. The quantum of original, application-oriented research applicable specifically to the developing world is a small part of overall global research. Some of it is done locally in India too, in spite of two obvious constraints developing nations face: (1) lack of funds, and (2) lack of capability and/or capacity.

Tax-funded research should be freely available

This article argues that the outcomes of research done in India with Indian taxpayers’ money should be freely available to all Indians, for better diffusion. Unfortunately, the present practice is quite the opposite.

The lack of diffusion of knowledge becomes evident in the absence of any planned effort to make research done in the local context available on open platforms. Within the academic community in India, an older mindset persists in which research credit is given only for publishing papers in journals, often even in journals of questionable quality, and so faculty members are encouraged to publish in subscription journals. Open access journals are considered untouchables. Faculty members mostly do not keep a version of the publication freely accessible – be it on their own institute’s website, or in other formats online. More than 99% of Indian higher educational institutes do not have any open-access research content on their websites.

Simultaneously, a lot of academic scams get reported, more so from India, as measuring research contribution is a difficult task. Faculty members often fall prey to short-cuts encouraged by their institute’s research policy, in this age of mushrooming journals.

Facing academic challenges

India, in its journey to become an open knowledge society, faces diverse academic challenges. Experienced faculty members feel that making their course outlines available in the public domain would lead to others copying from them, whereas younger faculty members see subscription journal publishing as the only way to build a CV. The common, ill-founded perception is that top journals would not accept your paper if you make a version of it freely available. All of the above work against knowledge diffusion in a poor country like India. The Government of India has often talked about open course materials, but in most government-funded higher educational institutes, one seldom sees even a course outline in the public domain, let alone research output. The question therefore is: for publicly funded universities and institutes, why should any Indian user have to cough up large sums of money again to access their research output? And it is an open truth that – barring a very few universities and institutes – most Indian colleges, universities and research organizations, or even practitioners, cannot afford to subscribe to most well-known journal databases, or to buy individual articles from them.

It would not be wrong to say that, out of thirty-thousand-plus higher educational institutes, not even one per cent have library access comparable to institutes in developed nations. And academic research output, more so in social science areas, need not be used only for academic purposes. Practitioners – farmers, practicing doctors, would-be entrepreneurs, professional managers and many others – may benefit from access to this research, but unfortunately almost none of them would be ready or able to shell out $20+ for a few pages after viewing only the abstract, in a country where around 70% of people live below $2 a day income levels.

Ranking is given higher priority than societal benefit

Academic contribution in the public domain through open and useful knowledge, therefore, is a neglected area in India. Over the last few years, we have seen OECD nations, as well as China, increasingly encouraging open-access publishing by the academic community; yet India, in its obsession with university ranks where most institutes fare poorly, is in reverse gear. The director of one of India’s best institutes has suggested why such obsessions are ill-founded, but perceptions and practices are quite the opposite.

It is, therefore, not rare to see a researcher getting additional monetary rewards for publishing in top-category subscription journals, with no attempt whatsoever – be it from researcher, institute or policy-makers – to make a copy of that research available online, free of cost. The irony is that the additional reward money again comes from taxpayers.

Unfortunately, existing age-old policies and practices are appreciated by media and policy-makers alike, as the nation desperately wants to show the world that it publishes in subscription journals. The point here is: there is nothing wrong with publishing in journals, and it should be encouraged even more for top journals, but also make a copy freely available online to any of the billion-plus Indians who may need that paper.

Incentives to produce usable research

In the case of India, more so in its publicly funded academic and research institutes, we have neither been able to produce many top-category subscription-journal papers, nor have we been able to make whatever research output we generate freely available online. On the quality of management research, The Economist, in a recent article, stated that faculty members worldwide ‘have too little incentive to produce usable research. Oceans of papers with little genuine insight are published in obscure periodicals that no manager would ever dream of reading.’ This applies perfectly to India too. It is high time we looked at the real impact of management and social science research, rather than journal impact factors. Real impact is bigger when papers are openly accessible.

Developing and resource-deficit nations like India, which need open access the most, thereby lose out further in the present knowledge economy. It is time that the Government and the academic community recognize the problem, and ensure locally done research is not merely published for academic referencing, but made available for use to any other researcher or practitioner in India, free of cost.

Knowledge creation is important. Equally important is diffusion of that knowledge. In India, efforts and resources have been deployed on knowledge creation, without integrative thinking on its diffusion. In the age of the Internet and open access, this needs to change.

Prof. Ranjit Goswami is Dean (Academics) and (Officiating) Director of Institute of Management Technology (IMT), Nagpur – a leading private B-School in India. IMT also has campuses in Ghaziabad, Dubai and Hyderabad. He is on Twitter @RanjiGoswami

Copyright and Open Access 2014

Michelle Brook - January 15, 2014 in Featured, Open Access

This post is a guest post by Michelle Brook and Tom Olijhoek from the Open Knowledge Foundation Open Access Working Group.

This week has been proclaimed Copyright week by the EFF (Electronic Frontier Foundation) and today, Wednesday Jan 15, is Open Access Day 2014.

It is almost exactly one year since Aaron Swartz (http://en.wikipedia.org/wiki/Aaron_Swartz) died in the middle of his struggle for open knowledge, and it would be a good thing to make this week, and in particular Open Access Day, a recurring event in his honor.

The open access movement has gained momentum in the past year and too much has happened to list everything. Instead, let’s focus on a few key events and developments.

In 2013 the White House issued a directive stating that all publicly funded research should be made publicly available in repositories. The reaction of the scientific publishers has been to allow this, but under the condition that there is an embargo period of 6 months or 1 year. Many thought that this would be a necessary transition measure, but recently they have been proven very wrong in this assumption, because a powerful lobby of publishers is now demanding embargo periods of up to 3 years!

In our opinion any embargo period for making publications open access is the wrong thing to do: it is not in the interest of science or of society, and seems designed only to protect the rights of the publishers in order to maintain their profits. Any paper, especially in the Science, Technology, Engineering and Maths disciplines, refers to work done at least 1-2 years previously. Combined with the inherent fast pace of science, any embargo period – especially a prolonged one – will make sharing of the information less useful and less efficient by prolonging this time span further. Instead we should strive for zero-embargo publication and push for SHORTER review and handling times, which can sometimes be as long as 6 months!

We should remember that Open Access is not only about having information freely available to view. People should also be able to reuse the information freely with no restrictions other than the requirement to attribute. Instead of traditional copyright rules and property rights, open access publishers increasingly use a set of licenses developed by Creative Commons. These licenses provide a basic choice of rules for the usage of the work, in combination with the stringent demand for attribution of the work to the original author(s). In this way copyright remains (forever) with the author while allowing for unrestricted (or in other cases somewhat restricted) use of the information.

The original copyright rules that evolved around 1700 (Statute of Anne) were developed to protect the right of the owner of a work for a limited time (2x 14 years) in exchange for having the work enter the public domain after this period. So in a sense these rules were aimed at allowing information to be shared. Because information did not travel that fast in those days, this ‘embargo period’ was then considered enough. When, through technical advancements, information started to move more quickly, the copyright period was gradually extended to 70 years and more (Copyright, Designs and Patents Act 1988). However, in the process copyright ownership shifted from individual copyright to corporate copyright owned by publishing businesses. Copyright law no longer reflected the ultimate goal of sharing information after a short period of time, but instead took on a new role of defending business interests for as long as possible.

Today, thanks to the invention of the Internet, we see the making of a sharing economy. Many sharing communities exist already, but the community of sharing scientists is slow in coming. Although the internet was developed by scientists to exchange information, the public has been much quicker in seeing and using the possibilities for sharing ideas, goods and information. Sharing of scientific information is still in its infancy, not least because of the ongoing efforts of traditional publishers to shield information for as long as this is profitable, but open science communities have started to form all over the world. This can be seen in the rapid growth of the Open Knowledge Foundation, with over 40 local open knowledge communities worldwide, many more than only two years ago. It is also illustrated by the steady growth of older open access publishers like PLoS and BioMedCentral, as well as the very successful introductions of new journals like eLife and PeerJ.

Political and scientific support is also growing. The next European research program Horizon2020 aims at 100% open access for all publicly funded research. And a scientific society like the Max Planck Society has just held its tenth-anniversary Berlin conference on open access.

However, it is not only political and scientific support that is important. We want to have citizens, students, entrepreneurs, and everyone else who needs (specific) information to push for global open access to all academic literature. And we need your help to do this.

  • You can contact the Open Knowledge Foundation by registering on the website
  • You can subscribe to any of the mailing lists of the OKF, for instance the open access list, and take part in discussions
  • You can share your stories on difficulties or success with accessing information on the website WhoNeedsAccess
  • You can download the OpenAccessButton and start registering where you hit paywalls when trying to access information

Tom Olijhoek and Michelle Brook from the Open Access Working Group/ OKF

PDF Liberation Hackathon – January 18-19

Guest - December 19, 2013 in Events, Featured, Open Access, Open Content, Sprint / Hackday

This guest blog post has been written by Marc Joffe, of Public Sector Credit Solutions.

Open government data is valuable only to the extent that it can be used cost-effectively. When governments provide “open data” in the form of voluminous PDFs they offer the appearance of openness without its benefits. In this situation, the open government movement has two options: demand machine readable data or hack the PDFs – using technology to liberate the interesting data from them. The two approaches are complementary; we can pursue both at the same time.

When it comes to liberating data from PDFs, advanced technologies are available but expensive. In my previous life as a technology manager at a financial firm, I was given the opportunity to purchase a sophisticated PDF extraction tool for USD 200,000 – not counting annual maintenance and implementation consulting costs.

This amount is beyond the reach of just about every startup and non-profit in the open data world. It is also beyond the means of most media organizations, so lowering the cost of PDF extraction is also a priority for journalists. The data journalism community has responded by developing software to harvest usable information from PDFs. Tabula, a tool written by Knight-Mozilla OpenNews Fellow Manuel Aristarán, extracts data from PDF tables in a form that can be readily imported to a spreadsheet – if the PDF was “printed” from a computer application. Introduced earlier this year, Tabula continues to evolve thanks to the volunteer efforts of Manuel, with help from OpenNews Fellow Mike Tigas and New York Times interactive developer Jeremy Merrill. Meanwhile, DocHive, a tool whose continuing development is being funded by a Knight Foundation grant, addresses PDFs that were created by scanning paper documents. DocHive is a project of Raleigh Public Record and is led by Charles and Edward Duncan.

These open source tools join a number of commercial offerings, such as Able2Extract and ABBYY FineReader, that extract data from PDFs. A more comprehensive list of open source and commercial resources is available here.

Unfortunately, the free and low cost tools available to data journalists and transparency advocates have limitations that hinder their ability to handle large scale tasks. If, like me, you want to submit hundreds of PDFs to a software tool, press “Go” and see large volumes of cleanly formatted data, you are out of luck. These limits reduce our ability to analyze and report on Parliamentary/Congressional financial disclosures, campaign contribution records and government budgets – which often arrive in volume, in PDF form.
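As a rough illustration of the “submit hundreds of PDFs and press Go” workflow described above, here is a minimal Python sketch of a batch driver. It is not taken from any of the tools mentioned in this post: the `batch_extract` function and the `extract` callback are hypothetical names, and the callback stands in for whichever extraction tool you actually have. The sketch only shows the surrounding loop: walk a folder, write one CSV per table, and record failures instead of stopping.

```python
import csv
from pathlib import Path
from typing import Callable, Iterable, List

# A table is a list of rows; each row is a list of cell strings.
Table = List[List[str]]
# Hypothetical extractor signature: takes a PDF path, yields the tables found in it.
Extractor = Callable[[Path], Iterable[Table]]


def batch_extract(pdf_dir: Path, out_dir: Path, extract: Extractor) -> dict:
    """Run `extract` over every PDF in pdf_dir and write one CSV per table.

    Returns a summary: {"ok": [written CSV paths], "failed": [(pdf name, error)]}.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    summary = {"ok": [], "failed": []}
    for pdf in sorted(pdf_dir.glob("*.pdf")):
        try:
            for i, table in enumerate(extract(pdf)):
                out_path = out_dir / f"{pdf.stem}_table{i}.csv"
                with out_path.open("w", newline="", encoding="utf-8") as fh:
                    csv.writer(fh).writerows(table)
                summary["ok"].append(out_path)
        except Exception as exc:  # one bad PDF should not stop the whole batch
            summary["failed"].append((pdf.name, str(exc)))
    return summary
```

The hard part, of course, is the extractor itself; the driver simply makes it easy to see which documents in a large batch still need attention.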

PDF hacking has uses outside the government transparency / data journalism nexus. As Peter Murray-Rust has argued, the progress of science is being held back because valuable data are “jailed” within PDF journal articles. For this reason, Dr. Murray-Rust and several colleagues have been developing AMI – a tool that leverages Apache PDFBox to mine usable content from scientific documents.

Whether your motive is to improve government, lower the cost of data journalism or free scientific data, you are welcome to join The PDF Liberation Hackathon on January 18-19, 2014 – sponsored by The Sunlight Foundation, Knight-Mozilla OpenNews and others. We’ll have hack sites at the NYU-Poly Incubator in New York, Chicago Community Trust, Sunlight’s Washington DC office and at RallyPad in San Francisco (one or two locations will have an opening social on the evening of the 17th). Developers can also join remotely because we will publish a number of clearly specified PDF extraction challenges before the hackathon.

Participants can work on one of the pre-specified challenges or choose their own PDF extraction projects. Ideally, hackathon teams will use (and hopefully improve upon) open source tools to meet the hacking challenges, but they will also be allowed to embed commercial tools into their projects as long as their licensing cost is less than $1000 and an unlimited trial is available.

Prizes of up to $500 will be awarded to winning entries. To receive a prize, a team must publish its source code in a public GitHub repository. To join the hackathon in DC or remotely, please sign up at Eventbrite; to hack with us in SF, please sign up via this Meetup. Signup links for New York and Chicago will be posted here. Please also complete our Google Form survey.

The PDF Liberation Hackathon is going to be a great opportunity to advance the state of the art when it comes to harvesting data from public documents. I hope you can join us.

Open Access Week 2013!

Michelle Brook - October 24, 2013 in Open Access

Happy Open Access Week!

Open Access Week is a global event celebrating open access. It takes place in the last full week of October every year, with many events online and offline that bring together people who care about Open Access and provide an opportunity to spread the good word.

There’s a lot going on this year!

  • There are a huge number of events taking place – so look out to see if there is one near you!
  • If there aren’t any events nearby, or you can’t get out, many of these events will be streamed. (A list of these can be found here.)
  • Follow the Twitter conversation on the hashtags #oaweek and #openaccess
  • The Guardian are hosting a live chat about the future of Open Access research and publishing on Friday
  • The ASAP (Accelerating Science Award Programme) winners have been announced (Big congratulations to the winners!)

There are some great blog posts and articles emerging on Open Access. Let us know in the comments if I’ve missed any articles or news that you think we should be sharing!

As many people probably know, the Open Knowledge Foundation cares deeply about open access to research outputs – as defined by the Budapest Open Access Initiative and in alignment with the Open Knowledge Definition. From the Panton Principles and Panton Fellowships through to the Open Science and Open Access working groups, the community around the Open Knowledge Foundation is recognised as being highly involved in the push for greater openness around scholarship and research!

We’re going to be doing much more to support our community in the advocacy of open access over the coming months. Please sign up to let us know how we can best support you, and what type of tools and resources will help you!

Photo credit flickr user slubdresden
