You are browsing the archive for Open Science.

Why are patents and locked-up science seen as the way forward for growth and innovation?

Christian Villum - June 18, 2014 in OKF Denmark, Open Science

This is a translated and edited version of a blog post originally appearing on the Danish denfri.dk blog. See the original post here.

These past few weeks have highlighted the crossroads we as a society are facing: are non-open data, siloed knowledge and patented ideas the best route to growth and innovation? Or is the logic of the Internet – with its open data, open knowledge sharing, open sourcing and remix culture – the right path for a modern society?

A couple of weeks ago we saw the premiere of “The Internet’s Own Boy”, the documentary about Aaron Swartz, the now world-famous young Internet prodigy who, despite enormous success with his start-up business, chose to throw himself into the battle to secure everyone free access to the large academic database JSTOR: access that is normally granted only to the few who can afford a university degree. Appalled by the disproportionately high fees charged by JSTOR for access to (often tax-funded) scientific journals, Swartz set up a computer in a basement at MIT and started to download all the journals in the database. His scheme unravelled when he was caught by a security camera, and, contrary to all legal precedent, the federal government chose to make an example of Swartz and charge him with a felony: something that would likely have put him behind bars for up to 35 years. The pressure from the government and the FBI hurled Swartz into a depression, and he ended up taking his own life. A horrible and tragic story.

A society built on intellectual ownership

As an entrepreneur and the Danish translator of some of Swartz’s texts, I had the pleasure of being invited to participate in a panel at the Danish premiere event for the film in Copenhagen earlier this month. Also on the panel was the Director of the Danish National Library, Pernille Drost, who explained to the audience that even Denmark, with its free educational system, has similarly draconian rules for access to our scientific journals (and our cultural heritage as well). Science and culture are systematically locked behind technical and legal iron gates and thereby closed off from the general public – both domestically and abroad – unless you are wealthy enough to pay huge sums of money. This goes even for journals that are 100 years old!

Additionally, we recently had a referendum in Denmark to determine whether to join the Unified Patent Court. The result was a resounding ‘yes’ (roughly 70% of the votes in favour of joining), which sent a clear signal – encouraged by the majority of the Danish political parties, both left and right – that we as a society believe ideas are developed best by being locked up and protected. Apparently, the idea is that only those companies rich enough to afford expensive patent lawyers and navigate the complicated universe of intellectual property rights should be the ones to ensure our continued growth, improve our welfare and create a balanced and fair global society. In other words, we choose to keep our knowledge hermetically locked away and accessible to only a very few hands, both in the United States and in Denmark. The same applies to the rest of the world, especially the developed world. But is this the best way forward? For ourselves and for the world?

What “open” enables…

In “The Internet’s Own Boy”, for instance, we saw examples of how free access to knowledge can produce defining breakthroughs in science. Among other things, the film tells the story of a 14-year-old boy in the US who, independently of the school system, uses his wit and access to a series of otherwise unavailable science journals to make a groundbreaking advance in cancer diagnostics – to the great surprise of all the doctors and scientists in that field.

This example made me think of the story of the economics graduate student Thomas Herndon, who, through access to a 2010 financial study by Carmen Reinhart and Kenneth Rogoff, discovered a critical error in one of the most influential analyses therein: an analysis whose calculations were used by governments around the world to guide economic austerity policy during the global financial crisis. The critical error had created an erroneous foundation for political work at the highest level – and it wasn’t discovered until Herndon obtained access and started looking into the numbers.

It also reminded me of the story of “The Boy Who Harnessed the Wind”, a tale which in book form became an international bestseller. Co-author Bryan Mealer describes how another 14-year-old, William Kamkwamba in Malawi, gets access to a handful of old science books, builds a fully functional wind turbine from scrap metal and all of a sudden produces power for his village in the chronically poor African country, where 98% of people have no electricity.

Let’s zoom back to Denmark. Two weeks ago an interesting story appeared in the news: a group of amateur data-analysis enthusiasts (most of them actually from the Open Knowledge Denmark group) had looked into the data from the Unified Patent Court referendum, which had been released in a somewhat open format. The group discovered an anomaly in the reporting of votes from the small town of Taarbæk north of Copenhagen: something didn’t look right. The group contacted the authorities, and further investigation showed that the officials in Taarbæk had accidentally switched the ‘yes’ and ‘no’ votes! A human error, yes, and not one that had any influence on the final result. But at the same time an error that would never have been discovered if the data hadn’t been released for the public to scrutinise.

What do these examples say about the gigantic potential of free access to knowledge?

Unleashing the gigantic supercomputer

Imagine if citizens in general had the opportunity to access scientific journals, build on existing technology and look into all our non-sensitive public data. How many errors could be corrected? How many breakthroughs would we see in medical research? How many inventions and infrastructural improvements would proliferate in the developing world? Sadly, it remains a utopia, because we keep our knowledge locked up and away from the public eye. Isn’t it time we got rid of that kind of old-fashioned, pre-Internet thinking?

All the people around the world make up a huge brain trust, a giant computer, which holds a potential for growth and welfare that we can hardly imagine. The Internet is the nervous system of this latent supercomputer. But to activate it we need to stop seeing science and knowledge merely as a tradable product that only the richest 5% of the world’s population can access. By opening up knowledge to everyone, we create a foundation for citizen science that will enrich us all, with a growth potential that exceeds our current capabilities by unimaginable lengths.

Learning from the Internet

“Sure,” you think, “but what about the cost? It’s not cheap to research and develop ideas!” Of course not, but look at the economies already developing on the Internet: for example open source software, where developers around the world share computer code and stand on each other’s shoulders, even if they are competitors. A billion-dollar economy, in which all parties become richer, companies flourish and millions of jobs are created, while the core of it all – the code, that is, the product – remains freely available for anyone to build on.

In the software world many of the major actors have already abandoned patents and tossed away the iron gates in favour of a much more powerful growth paradigm: knowledge sharing, crowdsourcing and open data. This open source mindset can readily be transferred to the rest of society, including the science world and business in general. The economic potential is gigantic – for all of the world’s population. Let’s lead the way for this kind of thinking. That is real innovation.

Newsflash! OKFestival Programme Launches

Beatrice Martini - June 4, 2014 in Events, Free Culture, Join us, Network, News, OKFest, OKFestival, Open Access, Open Data, Open Development, Open Economics, Open Education, Open GLAM, Open Government Data, Open Humanities, Open Knowledge Foundation, Open Knowledge Foundation Local Groups, Open Research, Open Science, Open Spending, Open Standards, Panton Fellows, Privacy, Public Domain, Training, Transparency, Working Groups

At last, it’s here!

Check out the details of the OKFestival 2014 programme – including session descriptions, times and facilitator bios here!


We’re using a tool called Sched to display the programme this year and it has several great features. Firstly, it gives individual session organisers the ability to update the details on the session they’re organising; this includes the option to add slides or other useful material. If you’re one of the facilitators we’ll be emailing you to give you access this week.

Sched also enables every user to create their own personalised programme to include the sessions they’re planning to attend. We’ve also colour-coded the programme to help you when choosing which conversations you want to follow: the Knowledge stream is blue, the Tools stream is red and the Society stream is green. You’ll also notice that there are a bunch of sessions in purple which correspond to the opening evening of the festival when we’re hosting an Open Knowledge Fair. We’ll be providing more details on what to expect from that shortly!

Another way to search the programme is by the subject of the session – find these listed on the right hand side of the main schedule – just click on any of them to see a list of sessions relevant to that subject.

As you check out the individual session pages, you’ll see that we’ve created etherpads for each session where notes can be taken and shared, so don’t forget to keep an eye on those too. And finally; to make the conversations even easier to follow from afar using social media, we’re encouraging session organisers to create individual hashtags for their sessions. You’ll find these listed on each session page.

We received over 300 session suggestions this year – the most yet for any event we’ve organised – and we’ve done our best to fit in as many as we can. There are 66 sessions packed into 2.5 days, plus 4 keynotes and 2 fireside chats. We’ve also made space for an unconference over the 2 core days of the festival, so if you missed out on submitting a proposal, there’s still a chance to present your ideas at the event: come ready to pitch! Finally, the Open Knowledge Fair has added a further 20 demos – and counting – to the lineup and is a great opportunity to hear about more projects. The Programme is full to bursting, and while some time slots may still change a little, we hope you’ll dive right in and start getting excited about July!

We think you’ll agree that Open Knowledge Festival 2014 is shaping up to be an action-packed few days – so if you’ve not bought your ticket yet, do so now! Come join us for what will be a memorable 2014 Festival!

See you in Berlin! Your OKFestival 2014 Team

Building an archaeological project repository II: Where are the research data repositories?

Guest - April 17, 2014 in CKAN, Open Science, WG Archaeology

This is a guest post by Anthony Beck, Honorary fellow, and Dave Harrison, Research fellow, at the University of Leeds School of Computing


Data repository as research tool

In a previous post, we examined why Open Science is necessary to take advantage of the huge corpus of data generated by modern science. In our project Detection of Archaeological residues using Remote sensing Techniques, or DART, we adopted Open Science principles and made all the project’s extensive data available through a purpose-built data repository built on the open-source CKAN platform. But with so many academic repositories, why did we need to roll our own? A final post will look at how the portal was implemented.

DART: data-driven archaeology

DART’s overall aim is to develop analytical methods to differentiate archaeological sediments from non-archaeological strata, on the basis of remotely detected phenomena (e.g. resistivity, apparent dielectric permittivity, crop growth, thermal properties etc). DART is a data rich project: over a 14 month period, in-situ soil moisture, soil temperature and weather data were collected at least once an hour; ground based geophysical surveys and spectro-radiometry transects were conducted at least monthly; aerial surveys collecting hyperspectral, LiDAR and traditional oblique and vertical photographs were taken throughout the year, and laboratory analyses and tests were conducted on both soil and plant samples. The data archive itself is in the order of terabytes.

Analysis of this archive is ongoing; meanwhile, this data and other resources are made available through open access mechanisms under liberal licences and are thus accessible to a wide audience. To achieve this we used the open-source CKAN platform to build a data repository, DARTPortal, which includes a publicly queryable spatio-temporal database (on the same host), and can support access to individual data as well as mining or analysis of integrated data.

This means we can share the data analysis and transformation processes and demonstrate how we transform data into information and synthesise this information into knowledge (see, for example, this IPython notebook which dynamically exploits the database connection). This is the essence of Open Science: exposing the data and processes that allow others to replicate and more effectively build on our science.
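
To give a flavour of what this looks like in practice, here is a minimal sketch of pulling dataset metadata out of a CKAN portal through CKAN’s standard action API. The portal address below is an assumed placeholder for illustration; the /api/3/action endpoints are stock CKAN, not anything DART-specific.

```python
import requests

# Minimal sketch: querying a CKAN portal's action API.
# The portal URL below is an assumed placeholder.
PORTAL = "http://dartportal.leeds.ac.uk"

def ckan_action(action, **params):
    """Call a CKAN Action API endpoint and return its 'result' payload."""
    resp = requests.get(f"{PORTAL}/api/3/action/{action}", params=params)
    resp.raise_for_status()
    body = resp.json()
    assert body["success"], body.get("error")
    return body["result"]

# List every dataset hosted on the portal...
datasets = ckan_action("package_list")

# ...then fetch the metadata and resource download URLs for one of them.
pkg = ckan_action("package_show", id=datasets[0])
for res in pkg["resources"]:
    print(res["format"], res["url"])
```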

Lack of existing infrastructure

Pleased though we are with our data repository, it would have been nice not to have to build it! Individual research projects should not bear the burden of implementing their own data repository framework. This is much better suited to local or national institutions where the economies of scale come into their own. Yet in 2010 the provision of research data infrastructure that supported what DART did was either non-existent or poorly advertised. Where individual universities provided institutional repositories, these were focused on publications (the currency of prestige and career advancement) and not on data. Irrespective of other environments, none of the DART collaborating partners provided such a data infrastructure.

Data-sharing sites like Figshare did not exist when DART began – and once they did, the size of our hyperspectral data in particular was, quite rightly, a worry. This situation is slowly changing, but it is still far from ideal. The positions taken by Research Councils UK and the Engineering and Physical Sciences Research Council (EPSRC) on improving access to data are key catalysts for change. The EPSRC statement is particularly succinct:

Two of the principles are of particular importance: firstly, that publicly funded research data should generally be made as widely and freely available as possible in a timely and responsible manner; and, secondly, that the research process should not be damaged by the inappropriate release of such data.

This has produced a simple economic issue: if research institutions cannot demonstrate that they can manage research data in the manner required by the funding councils, then they will become ineligible to receive grant funding from those councils. The impact is that the majority of universities are now developing their own data repositories, or collaborating on communal ones.

But what about formal data deposition environments?

DART was generously funded through the Science and Heritage Programme supported by the UK Arts and Humanities Research Council (AHRC) and the EPSRC. This means that these research councils will pay for data archiving in the appropriate domain repository, in this case the Archaeology Data Service (ADS). So why produce our own repository?

Deposition to the ADS would only have occurred after the project had finished. With DART, the emphasis has been on re-use and collaboration rather than primarily on archiving. These goals are not mutually exclusive: the methods adopted by DART mean that we produced data that is directly suitable for archiving (well documented ASCII formats, rich supporting description and discovery metadata, etc) whilst also allowing more rapid exposure and access to the ‘full’ archive. This resulted in DART generating much richer resource discovery and description metadata than would have been the case if the data was simply deposited into the ADS.

The point of the DART repository was to produce an environment which would facilitate good data management practice and collaboration during the lifetime of the project. This is representative of a crucial shift in thinking, where projects and data collectors consider re-use, discovery, licences and metadata at a much earlier stage in the project life cycle: in effect, to create dynamic and accessible repositories that have impact across the broad stakeholder community rather than focusing solely on the academic community. The same underpinning philosophy of encouraging re-use is seen at both Figshare and the DataHub. Whilst formal archiving of data is to be encouraged, if it is not re-usable – or, more importantly, easily re-usable – within orchestrated scientific workflow frameworks, then what is the point?

In addition, it is unlikely that the ADS will take the full DART archive. It has been said that archaeological archives can produce lots of extraneous or redundant ‘stuff’. This can be exacerbated by the unfettered use of digital technologies – how many digital images are really required for the same trench? Whilst we have sympathy with this argument, there is a difference between ‘data’ and ‘pretty pictures’: as data analysts, we consider that a digital photograph is normally a data resource and rarely a pretty picture. Hence, every image has value.

This is compounded when advances in technology mean that new data can be extracted from ‘redundant’ resources. For example, Structure from Motion (SfM) is a computer vision technique that extracts 3D information from overlapping 2D images. From a series of overlapping photographs, SfM techniques can be used to extract 3D point clouds and generate orthophotographs from which accurate measurements can be taken. In the case of SfM there is no such thing as redundancy, as each image becomes part of a ‘bundle’ and the statistical characteristics of the bundle determine the accuracy of the resultant model. However, one does need to be pragmatic, and it is currently impractical for organisations like the ADS to accept unconstrained archives. That said, it is an area that needs review: if a research object is important enough to have detailed metadata created about it, then it should be important enough to be archived.
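
To make the SfM idea concrete, here is a minimal two-view sketch using OpenCV – a generic textbook pipeline, not DART’s code. The camera intrinsics matrix K is a placeholder you would obtain by calibration; real SfM tools bundle-adjust many more views than the two shown here.

```python
import cv2
import numpy as np

def two_view_reconstruction(img1, img2, K):
    """Sparse two-view Structure from Motion with OpenCV.

    img1, img2: greyscale overlapping photographs (numpy arrays).
    K: 3x3 camera intrinsics matrix (placeholder; obtain by calibration).
    Returns an (N, 3) array of triangulated 3D points.
    """
    # Detect and describe features in both images.
    orb = cv2.ORB_create(5000)
    kp1, des1 = orb.detectAndCompute(img1, None)
    kp2, des2 = orb.detectAndCompute(img2, None)

    # Match descriptors; cross-checking removes many false matches.
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = matcher.match(des1, des2)
    pts1 = np.float32([kp1[m.queryIdx].pt for m in matches])
    pts2 = np.float32([kp2[m.trainIdx].pt for m in matches])

    # Estimate relative camera motion from the essential matrix.
    E, mask = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC)
    _, R, t, mask = cv2.recoverPose(E, pts1, pts2, K, mask=mask)

    # Triangulate matched points into homogeneous 3D coordinates.
    P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
    P2 = K @ np.hstack([R, t])
    pts4d = cv2.triangulatePoints(P1, P2, pts1.T, pts2.T)
    return (pts4d[:3] / pts4d[3]).T
```

The ‘bundle’ point in the text is visible here: every extra overlapping photograph adds matches and constraints, tightening the statistics of the reconstruction rather than being redundant.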

For DART, this means that the ADS is hosting a subset of the archive in long-term re-use formats, which will be available in perpetuity (which formally equates to a maximum of 25 years), while the DART repository will hold the full archive in long-term re-use formats until we run out of server money. We are in discussion with Leeds University about migrating all the data objects over to the new institutional repository with sparkling new DOIs, and we can transfer the metadata held in CKAN over to Open Knowledge’s public repository, the DataHub. In theory nothing should be lost.

How long is forever?

The point on perpetuity is interesting. Collins Dictionary defines perpetuity as ‘eternity’. However, the ADS defines ‘digital’ perpetuity as 25 years. This raises the question: is it more effective in the long term to deposit in ‘formal’ environments (with an intrinsic focus on preservation format over re-use), or in ‘informal’ environments with a focus on re-use and engagement over preservation (Flickr, Wikimedia Commons, the CKAN-based DART repository, etc.)? Both Flickr and Wikimedia Commons have been around for over a decade. Distributed peer-to-peer sharing, as used in Git, produces more robust and resilient environments which are equally suited to longer-term preservation. Whilst the authors appreciate that the situation is much more nuanced, particularly with the introduction of platforms that facilitate collaborative workflow development, this does have an impact on long-term deployment.

Choosing our licences

Licences are fundamental to the successful re-use of content. Licences describe who can use a resource, what they can do with this resource and how they should reference any resource (if at all).

Two leading organisations have developed legal frameworks for content licensing: Creative Commons (CC) and Open Data Commons (ODC). Until the release of CC version 4.0, published in November 2013, the CC licences did not cover data. Between them, CC and ODC licences can cover all forms of digital work.

At the top level the licences are permissive public domain licences (CC0 and PDDL respectively) that impose no restrictions on the licensee’s use of the resource. ‘Anything goes’ in a public domain licence: the licensee can take the resource and adapt it, translate it, transform it, improve upon it (or not!), package it, market it, sell it, etc. Constraints can be added to the top-level licence by employing the following clauses:

  • BY – By attribution: the licensee must attribute the source.
  • SA – Share-alike: if the licensee adapts the resource, they must release the adapted resource under the same licence.
  • NC – Non-commercial: the licensee must not use the work within a commercial activity without prior approval. Interestingly, in many areas of the world, the use of material in university lectures may be considered a commercial activity. The non-commercial restriction concerns the nature of the activity, not the legal status of the institution doing the work.
  • ND – No derivatives: the licensee cannot derive new content from the resource.

Each of these clauses decreases the ‘openness’ of the resource. In fact, the NC and ND clauses are not intrinsically open (they restrict who can use the resource and what can be done with it). These restrictive clauses have the potential to produce licence incompatibilities which may introduce profound problems in the medium to long term. This is particularly relevant to the SA clause. Share-alike means that any derived output must be licensed under the same conditions as the source content. If content is combined (or mashed up) – which is essential when one is building up a corpus of heritage resources – then content created under an SA clause cannot be combined with content that includes a restrictive clause (BY, NC or ND) that is not in the source licence, as the toy sketch below illustrates. This licence incompatibility has a significant impact on the nature of the data commons. It has the potential to fragment the data landscape, creating pockets of knowledge which are rarely used in mainstream analysis, research or policy making. This will be further exacerbated when automated data aggregation and analysis systems become the norm. A permissive licence without clauses like Non-commercial, Share-alike or No-derivatives removes such licence and downstream re-user fragmentation issues.
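
The mechanics of this incompatibility are simple enough to model. The following is our own toy illustration (not a legal tool): a licence is treated as a set of restriction clauses, with the empty set standing for a public domain licence such as CC0 or PDDL.

```python
# Toy model (illustration only, not legal advice): a licence is a set
# of restriction clauses; the empty set is a public domain licence.
def can_combine(a: set, b: set) -> bool:
    """Can content under licences a and b be mashed up into one work?"""
    # ND content cannot be adapted at all, so no mash-up is possible.
    if "ND" in a or "ND" in b:
        return False
    # A share-alike licence forces the derived work under its own terms,
    # so it cannot absorb content carrying clauses it does not itself have.
    if "SA" in a and not b <= a:
        return False
    if "SA" in b and not a <= b:
        return False
    return True

print(can_combine(set(), {"BY"}))               # True:  CC0 + CC-BY
print(can_combine({"BY", "SA"}, {"BY"}))        # True:  BY-SA absorbs BY
print(can_combine({"BY", "SA"}, {"BY", "NC"}))  # False: SA meets NC
```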

For completeness, specific licences have been created for Open Government Data. The UK Government Data Licence for public sector information is essentially an open licence with a BY attribution clause.

At DART we have followed the guidelines of the Open Data Institute and separated creative content (illustrations, text, etc.) from data content. Hence, DART content is either CC-BY or ODC-BY respectively. In the future we believe it would be useful to drop the BY (attribution) clause. This would stop attribution stacking (if the resource you are using is a derivative of a derivative of a derivative of a… you get the picture – at what stage do you stop attributing?), and anything which requires bureaucracy, such as attributing an image in a PowerPoint presentation, inhibits re-use (one should always assume that people are intrinsically lazy). There is a post advocating ccZero+ by Dan Cohen. However, impact tracking may mean that the BY clause becomes a default for academic deposition.

The ADS uses a more restrictive, bespoke default licence which does not map to national or international licence schemes (they also don’t recognise non-CC licences). Resources under this licence can only be used for teaching, learning, and research purposes. Of particular concern is their use of the NC clause and possible use of the ND clause (depending on how you interpret the licence). Interestingly, policy changes mean that the use of data under the bespoke ADS licence becomes problematic if university teaching activities are determined to be commercial. It is arguable that the payment of tuition fees makes teaching a commercial activity; if so, resources released under the ADS licence cannot be used within university teaching. Hence, the policy change in student tuition and university funding affects the commercial nature of university teaching, which in turn affects what data or resources universities are licensed to use. Whilst it may never have been the intention of the ADS to produce a licence with this potential paradox, it is a problem when bespoke licences are developed, even if they were originally perceived to be relatively permissive. To remove this ambiguity it is recommended that submissions to the ADS are provided under a CC licence, which renders the bespoke ADS licence void.

In the case of DART, these licence variations with the ADS should not be a problem. Our licences are permissive (by attribution is the only clause we have included). This means the ADS can do anything they want with our resources as long as they cite the source. In our case this would be the individual resource objects or collections on the DART portal. This is a good thing, as the metadata on the DART portal is much richer than the metadata held by the ADS.

Concerns about opening up data, and responses which have proved effective

Christopher Gutteridge (University of Southampton) and Alexander Dutton (University of Oxford) have collated a Google doc entitled ‘Concerns about opening up data, and responses which have proved effective‘. This document describes a number of concerns commonly raised by academic colleagues about increasing access to data. For DART, two issues arose that were not covered by this document:

  • The relationship between open data and research novelty and the impact this may have on a PhD submission.
  • Journal publication – specifically that a journal won’t publish a research paper if the underlying data is open.

The former point is interesting – does the process of undertaking open science, or at least providing open data, undermine the novelty of the resultant scientific process? With open science it could be difficult to directly attribute the contribution, or novelty, of a single PhD student to an openly collaborative research process. That said, if online versioning tools like Git are used, then it is clear who has contributed what to a piece of code or a workflow (one of the benefits of the BY clause). This argument is less solid when we are talking solely about open data. Whilst it is true that other researchers (or anybody else for that matter) have access to the data, it is highly unlikely that multiple researchers will use the same data to answer exactly the same question. If they do ask the same question (and, making the optimistic assumption, reach the same conclusion), it is still highly unlikely that they will have done so by the same methods; and even if they do, their implementations will be different. If multiple methods using the same source data reach the same conclusion then there is an increased likelihood that the conclusion is correct and that the science is even more certain. The underlying point here is that 21st-century scientific practice will substantially benefit from people showing their working. Exposure of the actual process of scientific enquiry (the algorithms, code, etc.) will make the steps between data collection and publication more transparent, reproducible and peer-reviewable – or, quite simply, more scientific. Hence, we would argue that open data and research novelty is only a problem if plagiarism is a problem.

The journal publication point is equally interesting. Publications are the primary metric for academic career progression and kudos. In this instance it was the policy of the ‘leading journal in this field’ not to publish a paper based on a dataset that was already published. No credible reasons were provided for this clause – which seems draconian in the extreme. It does indicate that no one-size-fits-all approach will work in the academic landscape. It will also be interesting to see how this journal, which publishes work mainly funded by the EPSRC, responds to the EPSRC guidelines on open data.

This is also a clear demonstration that the academic community needs to develop new metrics, more suited to 21st-century research and scholarship, that directly link academic career progression to sources of impact beyond publications. Furthermore, academia needs some high-profile exemplars that demonstrate clearly how to deal with such change. The policy shift and ongoing debate concerning ‘Open Access’ publications in the UK is changing the relationship between funders, universities, researchers, journals and the public – a similar debate needs to occur about open data and open science.

The altmetrics community is developing new metrics for “analyzing, and informing scholarship” and has described its ethos in its manifesto. The Research Councils and governments have taken a much greater interest in the impact of publicly funded research. Importantly, public, social and industry impact are as important as academic impact. It is incumbent on universities to respond by directly linking academic career progression to impact, and by encouraging improved access to the underlying data and processing outputs of the research process through data repositories and workflow environments.

Knowledge Creation to Diffusion: The Conflict in India

Guest - February 28, 2014 in Open Access, Open Research, Open Science


This is a guest post by Ranjit Goswami, Dean (Academics) and (Officiating) Director of Institute of Management Technology (IMT), Nagpur, India. Ranjit also volunteers as one of the Indian Country Editors for the Open Data Census.

Developing nations, India more so, increasingly face a challenge in prioritizing their goals. One thing that becomes increasingly relevant in this context, in the present age of open knowledge, is the role of subscription journals in the dissemination and diffusion of knowledge in a developing society. Young Aaron Swartz from Harvard made an effort to change this, and it cost him his life; most developed nations have realized that research funded by taxpayers’ money should be made freely available to taxpayers, but awareness of these issues is at a quite pathetic level in India – both at the policy level and among members of the academic community.

Before one looks at the problem, some context is needed. Today, a lot of research is done globally, including some in India, and its importance in transforming nations and society is increasingly getting its due recognition. The quantum of original, application-oriented research applicable specifically to the developing world is a small part of overall global research. Some of it is done locally in India too, in spite of two obvious constraints developing nations face: (1) lack of funds, and (2) lack of capability and/or capacity.

Tax-funded research should be freely available

This article argues that the outcomes of research done in India with Indian taxpayers’ money should be freely available to all Indians, for better diffusion. Unfortunately, the present practice is quite the opposite.

The lack of diffusion of knowledge is evident in the absence of any planned effort to make research done in the local context available on open platforms. Within the academic community in India, owing to an older mindset in which research score and importance are awarded only for publishing papers in journals – often even journals of questionable quality – faculty members are encouraged to publish in subscription journals. Open access journals are considered untouchable. Faculty members mostly do not keep a version of the publication freely accessible – be it on their own institute’s website or in other formats online. More than 99% of Indian higher educational institutes have no open-access research content on their websites.

At the same time, many academic scams are reported, more so from India, as measuring research contribution is a difficult task. Faculty members often fall prey to the short-cuts of their institute’s research policy in this age of mushrooming journals.

Facing academic challenges

India, in its journey to be an open knowledge society, faces diverse academic challenges. Experienced faculty members feel that making their course outlines available in the public domain would lead to others copying them, whereas younger faculty members see subscription-journal publishing as the only way to build a CV. The common, ill-founded perception is that top journals will not accept your paper if you make a version of it freely available. All of the above are counter-productive to knowledge diffusion in a poor country like India. The Government of India has often talked about open course materials, but in most government-funded higher educational institutes one seldom sees even a course outline in the public domain, let alone research output. The question therefore is: for publicly funded universities and institutes, why should any Indian user have to cough up large sums of money again to access their research output? And it is an open truth that – barring a very few universities and institutes – most Indian colleges, universities and research organizations, or even practitioners, cannot afford the money required to subscribe to most well-known journal databases, or to buy individual articles therein.


It would not be wrong to say that of the thirty-thousand-plus higher educational institutes, not even one per cent have library access comparable to institutes in developed nations. And academic research output, especially in the social sciences, need not be used only for academic purposes. Practitioners – farmers, practicing doctors, would-be entrepreneurs, professional managers and many others – may benefit from access to this research, but unfortunately almost none of them would be ready or able to shell out $20+ for a few pages on the strength of the abstract alone, in a country where around 70% of people live on less than $2 a day.

Ranking is given higher priority than societal benefit

Academic contribution to the public domain through open and useful knowledge is therefore a neglected area in India. Over the last few years we have seen OECD nations, as well as China, increasingly encourage open-access publishing by the academic community; India, in its obsession with university rankings in which most institutes fare poorly, is in reverse gear. The director of one of India’s best institutes has suggested why such obsessions are ill-founded, but perceptions and practices remain quite the opposite.

It is, therefore, not rare to see a researcher receive additional monetary rewards for publishing in top-category subscription journals, with no attempt whatsoever – be it by the researcher, the institute or policy-makers – to make a copy of that research available online, free of cost. The irony is that the additional reward money again comes from taxpayers.

Unfortunately, these age-old policies and practices are appreciated by media and policy-makers alike, as the nation desperately wants to show the world that it publishes in subscription journals. The point here is: there is nothing wrong with publishing in journals – encourage it even more for top journals – but also make a copy freely available online to any of the billion-plus Indians who may need that paper.

Incentives to produce usable research

In India, especially in its publicly funded academic and research institutes, we have neither been able to produce many top-category subscription-journal papers, nor have we been able to make whatever research output we generate freely available online. On the quality of management research, The Economist recently stated that faculty members worldwide ‘have too little incentive to produce usable research. Oceans of papers with little genuine insight are published in obscure periodicals that no manager would ever dream of reading.’ This fits India perfectly too. It is high time we looked at the real impact of management and social science research, rather than at journal impact factors. Real impact is bigger when papers are openly accessible.

Developing and resource-deficit nations like India, which need open access the most, thereby lose out further in the present knowledge economy. It is time that government and the academic community recognize the problem, and ensure locally done research is not merely published for academic referencing, but made available for use by any other researcher or practitioner in India, free of cost.

Knowledge creation is important. Equally important is the diffusion of that knowledge. In India, efforts and resources have been deployed on knowledge creation, without integrative thinking about its diffusion. In the age of the Internet and open access, this needs to change.


Prof. Ranjit Goswami is Dean (Academics) and (Officiating) Director of Institute of Management Technology (IMT), Nagpur – a leading private B-School in India. IMT also has campuses in Ghaziabad, Dubai and Hyderabad. He is on Twitter @RanjiGoswami

Building an archaeological project repository I: Open Science means Open Data

Guest - February 24, 2014 in CKAN, Open Science, WG Archaeology

This is a guest post by Anthony Beck, Honorary fellow, and Dave Harrison, Research fellow, at the University of Leeds School of Computing.

In 2010 we authored a series of blog posts for the Open Knowledge Foundation subtitled ‘How open approaches can empower archaeologists’. These discussed the DART project, which is on the cusp of concluding.

The DART project collected large amounts of data, and as part of the project, we created a purpose-built data repository to catalogue this and make it available, using CKAN, the Open Knowledge Foundation’s open-source data catalogue and repository. Here we revisit the need for Open Science in the light of the DART project. In a subsequent post we’ll look at why, with so many repositories of different kinds, we felt that to do Open Science successfully we needed to roll our own.

Open data can change science

Open inquiry is at the heart of the scientific enterprise. Publication of scientific theories – and of the experimental and observational data on which they are based – permits others to identify errors, to support, reject or refine theories and to reuse data for further understanding and knowledge. Science’s powerful capacity for self-correction comes from this openness to scrutiny and challenge. (The Royal Society, Science as an open enterprise, 2012)

The Royal Society’s report Science as an open enterprise identifies how 21st century communication technologies are changing the ways in which scientists conduct, and society engages with, science. The report recognises that ‘open’ enquiry is pivotal for the success of science, both in research and in society. This goes beyond open access to publications (Open Access), to include access to data and other research outputs (Open Data), and the process by which data is turned into knowledge (Open Science).

The underlying rationale of Open Data is this: unfettered access to large amounts of ‘raw’ data enables patterns of re-use and knowledge creation that were previously impossible. The creation of a rich, openly accessible corpus of data introduces a range of data-mining and visualisation challenges, which require multi-disciplinary collaboration across domains (within and outside academia) if their potential is to be realised. An important step towards this is creating frameworks which allow data to be effectively accessed and re-used. The prize for succeeding is improved knowledge-led policy and practice that transforms communities, practitioners, science and society.

The need for such frameworks will be most acute in disciplines with large amounts of data, a range of approaches to analysing the data, and broad cross-disciplinary links – so it was inevitable that they would prove important for our project, Detection of Archaeological residues using Remote sensing Techniques (DART).

DART: data-driven archaeology

DART aimed to develop analytical methods to differentiate archaeological sediments from non-archaeological strata, on the basis of remotely detected phenomena (e.g. resistivity, apparent dielectric permittivity, crop growth, thermal properties etc). The data collected by DART is of relevance to a broad range of different communities. Open Science was adopted with two aims:

  • to maximise the research impact by placing the project data and the processing algorithms into the public sphere;
  • to build a community of researchers and other end-users around the data so that collaboration, and by extension research value, can be enhanced.

‘Contrast dynamics’, the type of data provided by DART, is critical for policy makers and curatorial managers to assess both the state and the rate of change in heritage landscapes – a need wrapped up in national commitments to the European Landscape Convention (ELC). Making the best use of the data, however, depends on openly accessible dynamic monitoring, along similar lines to that proposed by the European Space Agency for the Global Monitoring for Environment and Security (GMES) satellite constellations. What is required is an accessible framework which allows all this data to be integrated, processed and modelled in a timely manner. The approaches developed in DART to improve the understanding and modelling of heritage contrast detection dynamics feed directly into this long-term agenda.

Cross-disciplinary research and Open Science

Such approaches cannot be undertaken within a single domain of expertise. This vision can only be built by openly collaborating with other scientists and building on shared data, tools and techniques. Important developments will come from the GMES community, particularly from precision agriculture, soil science, and well documented data processing frameworks and services. At the same time, the information collected by projects like DART can be re-used easily by others. For example, DART data has been exploited by the Royal Agricultural University (RAU) for use in such applications as carbon sequestration in hedges, soil management, soil compaction and community mapping. Such openness also promotes collaboration: DART partners have been involved in a number of international grant proposals and have developed a longer term partnership with the RAU.

Open Science advocates opening access to data, and other scientific objects, at a much earlier stage in the research life-cycle than traditional approaches. Open Scientists argue that research synergy and serendipity occur through openly collaborating with other researchers (more eyes/minds looking at the problem). Of great importance is the fact that the scientific process itself is transparent and can be peer reviewed: as a result of exposing data and the processes by which these data are transformed into information, other researchers can replicate and validate the techniques. As a consequence, we believe that collaboration is enhanced and the boundaries between public, professional and amateur are blurred.

Challenges ahead for Open Science

Whilst DART has not achieved all its aims, it has made significant progress and has identified some barriers to achieving such open approaches. Key to this is the articulation of issues surrounding data access (accreditation), licensing and ethics. Who gets access to data, when, and under what conditions, is a serious ethical issue for the heritage sector. These are issues that need co-ordination through organisations like Research Councils UK, with cross-cutting input from domain groups. The Arts and Humanities community produces data and outputs with pervasive social and ethical impact, and it is clearly important that it has a voice in these debates.

Open Knowledge Foundation at Mozilla Festival – meet us!

Beatrice Martini - October 24, 2013 in Events, Join us, Meetups, OKFestival, Open Science, School of Data, Workshop


At the Open Knowledge Foundation we love festivals – and attending is just half of the fun, we really like making things happen. So as soon as our friends over at Mozilla started building up their fabulous Mozilla Festival we decided to roll up our sleeves and join the party!

Mozilla Festival will take place in London (UK) on October 25th-27th. A big group from our team (who? Read on to find out) will head over and spread all around town for the duration.

Which Open Knowledge Foundation staff members will be at Mozilla Festival and can’t wait to meet you? (Ping them on Twitter to find them – links below.)

  • Beatrice Martini (Events Coordinator) joining the Mozilla team as an enthusiastic friend and volunteer, supporting the work of Mozilla Festival’s Events Coordinator (the wonderful Michelle Thorne) and warming up for OKFestival 2014 next July (do join us – sign up on the website for news!)
  • Zara Rahman, Christian Villum, Katelyn Rogers (Community Managers for Local Groups, Working Groups and Open Government Data – not in that order) running the Building collaboration across the open space workshop
  • Michelle Brook (Open Education Community Coordinator) coordinating the Open Science on the Web workshop
  • Michael Bauer, Milena Marin (School of Data) and Heather Leson (Community Engagement Director) rocking the Data Expedition Bootcamp
  • Sander van der Waal (Head of Long Term Projects Unit), James Hamilton (Development Director), Marieke Guy (LinkedUp Project Community Coordinator) meeting, supporting, linking up

Dear festival-goers, see you there – and at our very own upcoming festival, OKFestival 2014!

Crowdcrafting: Putting Citizens in Control of Citizen Science

Open Knowledge - September 17, 2013 in Open Science, PyBossa

Press Release: Geneva, 17 September 2013

Speaking at the Open Knowledge Conference, the world’s leading event on open data, Co-director of the Open Knowledge Foundation, Rufus Pollock, announced today that the open-source platform Crowdcrafting has grown to accommodate over 120 projects, making it the world’s most diverse open-source platform for online citizen science and crowdsourced data analysis.

Crowdcrafting is a collaboration between the Citizen Cyberscience Centre and the Open Knowledge Foundation, launched six months ago. Since its launch, a number of important projects have been built and developed using the tool.

The project ForestWatchers, for example, enables citizen-based monitoring of deforestation in developing regions. Built on Crowdcrafting’s open source technology, it has received the support of the Open Society Foundations for a second phase, in which local knowledge from citizens in the field can be integrated with the maps produced by online participants.


Another project, Rural GeoLocator, comes from the Public Health Computing group at the Swiss Tropical and Public Health Institute in Basel. The goal of this application is to help the SolarMal project, which studies the potential of innovative mosquito-trapping technologies for malaria control. The geo-locations of houses will be used to inform the logistics and analysis of the SolarMal project.

Other projects that run on Crowdcrafting include “Does Antimatter fall up or down?”, an application exploring the effect of gravity on antimatter; Air Quality with Lichens, which analyses and classifies lichens as indicators of air pollution levels; and the Shell JIV transcription project, which aims to transcribe the locations of oil spills in the Niger Delta from documents provided by Shell. Recognizing the broad power and potential of this platform, the Shuttleworth Foundation this month awarded one of its prestigious fellowships to the lead developer of Crowdcrafting, Daniel Lombraña González of the Citizen Cyberscience Centre.

Recent new developments have extended the scope of Crowdcrafting to include the collection of sensory information via mobile phones. Seamless integration with the Open Knowledge Foundation’s flagship CKAN database for open data means the tool will form an important part of the future of open science.
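
For readers curious about the mechanics: Crowdcrafting runs on the open-source PyBossa engine, whose REST API lets anyone create a project and load it with tasks for volunteers. The sketch below is a minimal, hypothetical illustration; the endpoint paths and field names reflect recent PyBossa releases and may differ between versions, and the server URL and API key are placeholders.

```python
import requests

# Hypothetical sketch of the PyBossa REST API used by Crowdcrafting.
# Endpoint paths and field names are assumptions based on recent
# PyBossa releases; the server URL and API key are placeholders.
SERVER = "https://crowdcrafting.org"
API_KEY = "your-api-key"  # found on your PyBossa account page

def create_project(name, short_name, description):
    """POST a new project; PyBossa authenticates via the api_key param."""
    resp = requests.post(
        f"{SERVER}/api/project",
        params={"api_key": API_KEY},
        json={"name": name, "short_name": short_name,
              "description": description},
    )
    resp.raise_for_status()
    return resp.json()["id"]

def add_task(project_id, info):
    """Each task carries an arbitrary JSON 'info' payload for volunteers."""
    resp = requests.post(
        f"{SERVER}/api/task",
        params={"api_key": API_KEY},
        json={"project_id": project_id, "info": info},
    )
    resp.raise_for_status()

if __name__ == "__main__":
    pid = create_project("Demo", "demo", "Classify example images")
    for url in ["http://example.org/img1.jpg", "http://example.org/img2.jpg"]:
        add_task(pid, {"image_url": url})
```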

John Ellis, keynote speaker at the Open Knowledge Conference and world-renowned theoretical physicist at CERN and King’s College London, commented:

“I was amazed how students at the CERN Webfest in August could turn CERN data on antimatter into a new citizen science project within just a weekend. This shows the power of the Crowdcrafting platform.”

Also speaking at Open Knowledge Conference, Francesco Pisano, director of research for the UN Institute for Training and Research, one of the founding partners of the Citizen Cyberscience Centre, remarked:

“Crowdcrafting is more than just a tool for basic science. Our UNOSAT programme is adapting the technology to efficiently combine the strength of volunteer computing with the work the UN and many NGOs have to do in generating information and assessments after natural disasters and other humanitarian crises.”

Denis Hochstrasser, vice-rector for research at the University of Geneva, which is hosting the Open Knowledge Conference satellite event on Open and Citizen Science, added:

“I’m proud that the Crowdcrafting platform is based here at University of Geneva. And I’m personally convinced that this grass-roots approach to citizen science will have a large impact on biomedical research, a core competence of our University. This is an area where increasingly, communities of patients are pro-actively collecting and analyzing their own medical data.”

ENDS


Crowdcrafting will feature in a special satellite event on Open and Citizen Science at this week’s Open Knowledge Conference in Geneva, where Daniel Lombraña González will be helping prospective new users set up their projects.

New Panton Fellows Announced!

Michelle Brook - September 16, 2013 in Featured, Open Science, Panton Fellows, WG Open Data in Science

We’ve just finished the second round of appointments for the Panton Fellowships, and this year we have three Fellows joining us: Rosie Graves (UK), Peter Kraker (Austria), and Sam Moore (UK). Peter will be joining us at OKCon this year, so please come and find him and introduce yourself!

Panton Fellows 2013. Left to right: Sam Moore, Peter Kraker and Rosie Graves

We had some really excellent applications this year, so we’d like to say a massive thank you to everyone who helped us spread the word.

The Panton Fellowships are year-long awards with a small stipend and additional travel budget. They are typically awarded to early career researchers (although those with other backgrounds are very welcome to apply), who submit proposals that we think will be really beneficial to promote open science, open data in science, and the Panton Principles.

We’ve expanded from last year, when both Fellows were based in the UK, so we are really excited to see what this year’s Fellows manage to do! We’re aware that they have a tough job living up to the amazing work carried out by our inaugural Fellows, Ross Mounce and Sophie Kershaw, but this lot certainly have the potential to do so!

Each of the Fellows will be writing a short introductory post about themselves and their thoughts and plans for the year. We also hope they will post semi-regular updates about their activities, so sign up to the open science working group mailing list to be kept up to date!

Publishing research without data is simply advertising, not science

Guest - September 3, 2013 in Open Science

The following post is by Graham Steel. It is an adaptation of a five minute lightning talk given at Glasgow’s 1st Open Knowledge Foundation meet-up.

In 2001, I became involved in the charitable sector as Vice-Chair of a support group for families affected by a rare and invariably fatal neurodegenerative disease. This led to my reading scientific papers for the first time. We were sent paper copies of manuscripts published by the two main research camps in the UK involved in this field.

Within the following year or so, I was alerted to PubMed, which opened my eyes to a world of relevant literature, albeit mainly only at the abstract level. I drafted a template email to send to corresponding authors begging for PDFs of full articles, as it is pretty much impossible to fully digest a scientific paper from the abstract alone. Over the years, 90% of such requests were successful.


My library of research papers is fairly extensive (this is just the OA subset) and on not one occasion have I paid to obtain literature of interest. An interesting quote from Jack Andraka in this regard:-

“This was the [paywall to the] article I smuggled into class the day my teacher was explaining antibodies and how they worked. I was not able to access very many more articles directly. I was 14 and didn’t drive and it seemed impossible to go to a University and request access to journals.

“I soon learned that many of the papers I was interested in reading were hidden behind expensive paywalls. I convinced my mom to use her credit card for a few but was discouraged when some of them turned out to be expensive but not useful to me. She became much less willing to pay when she found some in the recycle bin!” – SOURCE

In mid-2006, I stumbled upon something truly refreshing: Open Access (OA), a dream come true for a patient advocate. As I continued to delve into the world of ‘open’, in 2007 I learned about Open Notebook Science (ONS) and made contact (and subsequently met in person) with the leaders in this field.

Having become involved in the Open Knowledge Foundation (OKFN) that year, I later found out about the Panton Principles in 2010. It was great to see this development:-

“Science is based on building on, reusing and openly criticising the published body of scientific knowledge.

For science to effectively function, and for society to reap the full benefits from scientific endeavours, it is crucial that science data be made open”.

In terms of Open Data Repositories, there is an extensive list which can be accessed here on the Open Access Directory (OAD). Go check it out.

The one that I am most familiar with is figshare where I blog about Open Science. So my journey into science looks like this:-

  • started with paper copies of some papers
  • limited access to Toll Access papers
  • full access to required Toll Access papers
  • discovering OA
  • making contact with the OKFN
  • finding out about ONS
  • Open Data
  • Open Science

The title of this post went a stage further in May this year in the form of a post by Claire Bower, Digital Comms. Manager, BMJ Group:-

“Publishing articles without making the data available is scientific malpractice”

To quote from the final paragraph:-

“As more funders and learned societies call for new ways to make research data more available, reusable and reproducible, it will be interesting to see how established and emerging platforms will work with researchers and publishers to make access to data as pain-free as possible”.

Conclusion

“Give a scientist data/tools, and you feed the science world for a day. Teach them openness, and you feed the science world for a lifetime” – Jonathan Eisen, 2011



Graham Steel has been involved in Patient Advocacy since 2001 and is a strong advocate and vocal supporter of Open Access, Open Data and Free Culture (interview).

Cover image: Tonsil biopsy in variant CJD, by Sbrandner, CC-BY-SA

OKCon 2013 Guest Post: Is Open Source Drug Discovery Practical?

Guest - August 30, 2013 in Events, Join us, OKCon, Open Knowledge Foundation, Open Science

The following guest post is by Matthew Todd, Senior Lecturer at the School of Chemistry, The University of Sydney and Sydney Ambassador of the Open Knowledge Foundation. As part of OKCon 2013 Matthew will host a satellite event entitled ‘Is Open Source Drug Discovery Practical?’, taking place on Thursday 19 September from 09:00 – 12:00 at the World Health Organization (WHO) – UNAIDS HQ. (Find instructions about how to get there below, and register to attend the event here.)


If we value collaboration as a way of speeding scientific progress, we should all embrace open science, since it promises to supercharge the collaboration process, both by making data available to anyone and by allowing anyone to work on a problem. Open science can promise this because of its essential and defining condition: openness. We, as humans, default to this way of interacting with each other, but such norms can be overridden where there is some advantage in keeping secrets. A possible advantage might be financial, meaning there may be an incentive to work in a closed way if something one has done can be capitalized on for financial reward, leading to the idea of “intellectual property” and its protection through patents.

So we appear to have two opposing forms of enquiry. One that is open (without patents) and one that is closed. Clearly there are examples of great things arising from both.

One of the areas of science that has of late been dominated by the private sector is the pharmaceutical industry. Many effective medicines have been developed using the current model, but is it the only way? Might drug discovery that aligns with open source principles be possible?

My lab has been involved in trying to answer this question, both in developing ways to improve how we make medicines and how we discover new ones. The latter project, Open Source Malaria, directly challenges the idea that something new and of potential value to health should be sequestered away from public involvement. The OSM project abandons the protection of intellectual property so we may take advantage of the greatest number of people working on the problem in a barrierless, meritocratic collaboration.

There are historical arguments that patents are not necessary for drug discovery. Therapeutics of great value, such as penicillin and the polio vaccine, have been developed without patents. The ability to patent molecular structures (rather than the methods used to make them) is a relatively recent invention. Patents have also been accused of allowing companies to innovate less frequently.

But is an open approach really possible for the development of a new drug? Who would pay for the clinical trials? Who would invest money in the medicine if there is no monopoly on selling it downstream? Is there a realistic economic model that can take a promising new therapeutic and turn it into a medicine for treating millions of people? If open drug discovery is possible for diseases such as malaria, where there is little prospect of a profit, can the same model be applied to a disease like cancer, or Alzheimer’s, where the predicted profit would be very high under the current model?

These questions will all be addressed at a session I am hosting at the Open Knowledge Conference. This satellite event, taking place on the Thursday, is entitled “Is Open Source Drug Discovery Practical?”. I am very excited to have assembled a highly knowledgeable panel to discuss these issues, and in some ways it is lucky that OKCon is taking place in Geneva, where so many of the people most relevant to the current method of finding new medicines are located. The speakers are from the World Health Organisation, the Medicines for Malaria Venture, the Drugs for Neglected Diseases Initiative, GlaxoSmithKline, the World Intellectual Property Organisation and the Structural Genomics Consortium. If anyone is able to answer the session’s main question, these speakers can.

The panel members will each have 10 minutes to speak about their organization’s efforts towards a more open approach to drug discovery. After a coffee break, we will turn to the key questions above. There will be ample chance for members of the audience to take an active role in the discussion. If you are interested in the quandary of how we are going to find the drugs we most need for the coming generations, and how we might use open data and open research to do so, then this session is for you. The subject is so interesting because the discovery of effective new medicines is very hard: one would assume, then, that the best way to do the research is a massively distributed collaboration with lots of open data, yet that model is a real challenge today because of the structures we have put in place to support the industry.

Please join us! The session will take place at WHO’s main headquarters from 09:00 till 12:00. So we ensure we don’t overflow the room, please register to attend here, where you will also find more detail of the specific items for discussion and the panel members.

Location: Initially sign in at the WHO main building then go across to the WHO-UNAIDS building, meeting room D46031 (take lift 33/34 to go to the 4th floor).

Instructions on getting to WHO by public transport

