
Announcing the Open Definition Licenses Service

February 16, 2012 in Open Content, Open Data, Open Definition, Open Knowledge, Open Knowledge Definition, Open Standards, Our Work, WG Open Licensing

We’re pleased to announce a simple new service from the Open Knowledge Foundation as part of the Open Definition Project: the (Open) Licenses Service.


The service is ultra simple in purpose and function. It provides:

  • Information on licenses for open data, open content, and open-source software in machine readable form (JSON)
  • A simple web API that allows you to retrieve this information over the web, including using JavaScript in a browser via JSONP
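
As a rough illustration of consuming such an API from a script, the sketch below decodes a response body that is either plain JSON or JSONP (JSON wrapped in a callback call). The sample payload, the callback name and the field names (`id`, `title`, `url`) are assumptions made up for this example, not the service's documented format.

```python
import json
import re

# Hypothetical sample body; the field names are assumptions for illustration only.
SAMPLE_JSON = (
    '{"example-by": {"id": "example-by", '
    '"title": "Example Attribution License", '
    '"url": "http://example.org/by"}}'
)

# Matches a JSONP wrapper such as: callbackName({...});
_JSONP_RE = re.compile(r"^\s*[\w.$]+\s*\((.*)\)\s*;?\s*$", re.DOTALL)


def parse_license_response(body: str) -> dict:
    """Decode a plain JSON body, or a JSONP body of the form callback({...})."""
    match = _JSONP_RE.match(body)
    if match:
        # Keep only the JSON between the outermost parentheses.
        body = match.group(1)
    return json.loads(body)


licenses = parse_license_response(SAMPLE_JSON)
wrapped = parse_license_response("handleLicenses(" + SAMPLE_JSON + ");")
print(licenses["example-by"]["title"])
```

In a browser the JSONP wrapper is executed as a script rather than parsed like this; the unwrapping step is only needed when a non-browser client receives a JSONP response.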

In addition to the service there’s also:

What’s Included

There’s data on more than 100 licenses, including all OSI-approved open source licenses and all Open Definition conformant open data and content licenses. Also included are a few closed licenses as well as ‘generics’: licenses representing a category (useful where a user does not know the exact license but knows, for example, that the material only requires attribution).

View all the licenses available »

In addition various generic groups are provided that are useful when constructing license choice lists, including non-commercial options, generic Public Domain and more. Pre-packaged groups include:

The source for all this material is a git licenses repo on GitHub. Not only does this provide another way to get the data, it also means that if you spot an error, or have a suggestion for an improvement, you can file an issue on the GitHub repo or fork, patch and submit a pull request.

Why this Service?

The first reason is the most obvious: having a place to record license data in a machine readable way, especially for open licenses (i.e. for content and data, those conforming to the Open Definition, and for software, those conforming to the Open Source Definition).

The second reason is to make it easier for other people to include license info into their own apps and services. Literally daily, new sites and services are being created that allow users to share or create content and data. But when they do that, if there’s any intention for that data to get used and reused by others it’s essential that the material get licensed — and preferably, openly licensed.

By providing license data in a simple machine-usable, web friendly format we hope to make it easier for people to integrate license choosers, and good license defaults, into their sites. This will provide not only greater clarity but also more open content and data. Remember, no license usually means defaulting to the most restrictive, all rights reserved, condition.
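
A license chooser with an open default could be built from such data along the following lines. This is a minimal sketch: the records are invented, and the field names (in particular `od_conformance` with the value `approved`) are assumptions for illustration rather than a confirmed schema.

```python
from typing import Dict, List, Optional, Tuple

# Invented records; "od_conformance" and its values are assumed field names.
LICENSES: Dict[str, dict] = {
    "example-by": {"id": "example-by", "title": "Example Attribution",
                   "od_conformance": "approved"},
    "example-pd": {"id": "example-pd", "title": "Example Public Domain Dedication",
                   "od_conformance": "approved"},
    "example-nc": {"id": "example-nc", "title": "Example Non-Commercial",
                   "od_conformance": "rejected"},
}


def license_choices(licenses: Dict[str, dict],
                    open_only: bool = True) -> List[Tuple[str, str]]:
    """Return (id, title) pairs sorted by title, ready for a drop-down list."""
    rows = [
        (rec["id"], rec["title"])
        for rec in licenses.values()
        if not open_only or rec.get("od_conformance") == "approved"
    ]
    return sorted(rows, key=lambda row: row[1].lower())


def default_license(choices: List[Tuple[str, str]],
                    preferred: str) -> Optional[str]:
    """Pick the preferred id if it is offered, else the first choice, else None."""
    ids = [license_id for license_id, _ in choices]
    if preferred in ids:
        return preferred
    return ids[0] if ids else None


choices = license_choices(LICENSES)
print(choices, default_license(choices, "example-pd"))
```

Pre-selecting an open license means a user who never touches the chooser still ends up with openly licensed material instead of an unlicensed upload.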

Translators needed!

February 10, 2012 in CKAN, OKF Projects, Open Data, Our Work, Releases, Volunteer Opportunities

Do you speak another language apart from English? Have you got a little bit of spare time over the next week?

CKAN 1.6 is set to release in one week’s time and all the new features need translating. Can you help us complete it in time? If you can spend 15 minutes filling in the gaps using the Transifex website, then not only will community CKANs in your country benefit (e.g. Czech, Swedish, French etc), but so will the international CKANs run in your language! (e.g. thedatahub.org, datacatalogs.org, publicdata.eu)

These are the languages and how complete the translations are:

https://www.transifex.net/projects/p/ckan/resource/1-6/

Serbian 83%

Finnish 83%

Norwegian 83%

Portuguese 83%

Italian 83%

Catalan 83%

French 83%

Polish 82%

Czech 82%

German 80%

Spanish 76%

Swedish 74%

Hungarian 58%

Albanian 43%

Dutch 37%

Bulgarian 37%

Greek 27%

Slovenian 23%

It’s easy to do some translating!

First timers will need to set up their account first:

  1. Log in with a Transifex/Facebook/Twitter/Google account here.

  2. Choose a CKAN language team: https://www.transifex.net/projects/p/ckan/teams/

  3. Click “Join this team”

  4. Wait for me or another admin to approve you

Now to translate:

  1. https://www.transifex.net/projects/p/ckan/resource/1-6/

  2. Click on your language

  3. Press “Translate”.

Every day this week I’ll put the translations up on thedatahub.org for you to see the results. Please help make this open data catalogue readable by as many people as possible!

Open Knowledge Foundation’s CKAN Software to Power new European Commission Data Portal

January 31, 2012 in CKAN, News, OKF Projects, Open Data


CKAN will power the new European Commission (EC) Data Portal

The European Commission is to make its data publicly and openly available through a new data portal, along the lines of those already used by national governments such as http://data.gov.uk/. Like http://data.gov.uk/ the new site will be based on the open-source CKAN Data Portal Software developed by the Open Knowledge Foundation.

The Foundation will also be one of the partners in the project to build the site; the project’s official press release is below. See also the announcement on the CKAN blog.


PRESS RELEASE – FOR IMMEDIATE RELEASE

TenForce, the Open Knowledge Foundation and InfAI to develop European Commission open data portal

Open data will encourage re-use, improving transparency, policy-making and growth

The European Commission (EC) has awarded a contract to create an open data portal website, where data produced by European Commission services will be freely available. Belgian company TenForce will lead the project to deliver the portal, supported by Leipzig University’s Institute for Applied Computer Science (InfAI), and UK-based non-profit the Open Knowledge Foundation.

Users will be able to search for information in a flexible range of ways, for example by subject area, country, and region, and to visualise the data or download it for re-use in research, campaigns or commercial applications. The EC and the contracted partners will run workshops and other outreach activities, to raise awareness of and interest in the data among companies, researchers, journalists and policy groups.

The site will be based on open source software components including Drupal and CKAN. CKAN is a powerful data portal software package written by the Open Knowledge Foundation; it is already used to catalogue freely-available data from a number of governments, both within and beyond the EU. As well as viewing or downloading the raw data, users will be able to view it by way of sophisticated graphic visualisations developed by InfAI. TenForce will be responsible for the overall management, the architecture of the portal, store deployment and taxonomy management and some of the integration work.

ENDS

NOTES FOR EDITORS

1. Background
On 12 December 2011 the European Commission presented an Open Data Strategy for Europe setting out clearer rules on making the best use of government-held information. The proposed Open Data Strategy will make it easier for business and citizens to find and re-use information held by public sector bodies in the Member States and by the Commission itself. Primarily, the Commission plans to update the 2003 Directive on the re-use of public sector information. The Commission has also updated its own re-use rules so as to make its data available in machine-readable format and to include data from research by the Joint Research Centre. In 2012 the Commission will launch a web portal making it easy for industry and citizens to search for Commission data. More information here and here.
2. TenForce
TenForce BVBA is a Belgian software company specialized in the design, development and delivery of practical solutions to complex problems. TenForce has years of international experience in knowledge management, and an in depth expertise in emerging technologies. Besides designing, marketing and supporting its flagship product – a web-based management environment for project and operational activities – it conducts several projects on a European scale focusing on modelling complex systems for publishing solutions. Contact: Bastiaan Deblieck, bastiaan.deblieck@tenforce.com , +32 16 31 48 60.
3. InfAI
InfAI is an institute of the University of Leipzig, one of the oldest (founded 1409) and largest (30.000 students) universities in Germany. InfAI hosts the world class Knowledge Engineering Research Group (http://aksw.org), which is establishing theoretical results and scalable implementations for the field of knowledge engineering. The group’s tools and services enjoy considerable popularity: the open-source Semantic Web framework OntoWiki, for example, is downloaded more than 500 times a month, and applied in cases ranging from creating biomedical ontologies to knowledge management for business. Contact: Sören Auer, auer@informatik.uni-leipzig.de
4. Open Knowledge Foundation
The Open Knowledge Foundation (OKF) is a not-for-profit organization founded in 2004, dedicated to promoting open knowledge in all its forms. It builds tools and communities with a network of international leaders in this field. Projects include CKAN, a data portal that powers the UK government’s http://data.gov.uk/, the pan-European http://publicdata.eu/ and several dozen other government and community data sites around the world; and OpenSpending, which maps government and corporate spending around the world. The Foundation runs forums, workshops and an annual conference drawing together representatives from across the knowledge society – from academics and public servants to entrepreneurs and web developers. Contact: Laura James, laura.james@okfn.org.

Linked Open Data and Low Carbon Development

January 27, 2012 in External, Guest post, Open Data

The following guest post is by Denise Recheis from reegle, the clean energy info portal.

Offering multiple explanations for a concept increases understanding and using LOD allows both humans and machines to semantically connect related content. This is a huge advantage in our increasingly complex world!

Especially in the field of clean energy, the increasing availability of LOD is really beneficial. To make sense of the often complex factors contributing to climate change and the highly technical solutions thereof, as well as rapid development in national and international policy regarding these factors, access to high quality and timely information is crucial.

The clean energy info portal www.reegle.info and the energy info wiki www.openEI.org see themselves as gateways to a wealth of information regarding renewable energy, energy efficiency and climate change issues. They are hosted by REEEP (Renewable Energy and Energy Efficiency Partnership – where I work) and NREL (National Renewable Energy Laboratory) respectively. Both organizations have a strong commitment to the idea of Linked Open Data (LOD) and have been integrating the core principles of LOD into their online portals.

In an effort to increase awareness about the possibilities associated with publishing and consuming LOD, we organized a well-attended workshop in Abu Dhabi in January 2012. Alongside the event, we brought out a publication explaining the basics of LOD, as well as the first steps for any organization considering joining the LOD cloud. “Linked Open Data: The Essentials” (published by Semantic Web Company and REEEP) is available as a downloadable PDF, as well as a booklet which can be ordered.

“Linked Open Data: The Essentials” also highlights some best practice examples, two of them being reegle and OpenEI. Reegle’s country energy profiles are a prime example of mashed up open data. These dossiers present the reader with statistics, maps, general facts and policy and regulatory details in a pleasant design. The information is provided by LOD providers such as DBpedia (Wikipedia), the UN and the World Bank, OpenEI and other highly trusted sources. Reegle has also developed an extensive thesaurus covering clean energy and climate compatible development with full linked data capabilities, which is available for free re-use as a widget or WordPress plugin, and which is currently used as the basis for a brand-new API. Of course reegle provides all its datasets as Linked Open Data free for re-use, in RDF (Resource Description Framework) format and via a SPARQL endpoint on our data portal.

OpenEI (Open Energy Information) has always seen sharing as one of its key missions. The data is available via a RESTful API, as RDF and through SPARQL, for integration into external websites. But even when browsing the site, users benefit from a variety of LOD sources which enhance and increase the information presented. For example, several definitions offered in the glossary are collected from different LOD sources and OpenEI’s country pages feature information from a variety of sources, including reegle’s country energy profiles. This is easily possible when organizations rely on LOD, because when several websites describe the same things they can all be connected and give users a more rounded picture of sometimes difficult subjects.
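
For readers who have not queried a SPARQL endpoint before, the sketch below shows roughly what a request looks like under the standard SPARQL Protocol (a GET request carrying a `query` parameter). The endpoint address is a placeholder, not the real reegle or OpenEI URL, and the query assumes the data uses SKOS preferred labels, which is an assumption made for illustration.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Placeholder endpoint: NOT the real reegle or OpenEI address.
ENDPOINT = "http://example.org/sparql"

# Assumes the dataset labels its concepts with skos:prefLabel.
QUERY = """
SELECT ?concept ?label WHERE {
  ?concept <http://www.w3.org/2004/02/skos/core#prefLabel> ?label .
} LIMIT 10
"""


def build_sparql_request(endpoint: str, query: str) -> Request:
    """Build (but do not send) a SPARQL Protocol GET request asking for JSON results."""
    url = endpoint + "?" + urlencode({"query": query.strip()})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


request = build_sparql_request(ENDPOINT, QUERY)
print(request.full_url)
```

Passing the request to `urllib.request.urlopen` would send it; that step is left out here because the endpoint above is only a placeholder.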

Our expected end-users include the educational sector, helping students across the world study laws and regulation, efficient engineering, and the latest ideas in clean energy from many different authoritative sources in a single gateway. Specialists and project developers can quickly gather valuable information about specific regions and areas focusing on energy-relevant issues.

Integrating the principles of LOD has had a pleasant side-effect which has been highlighted in the recent workshop in Abu Dhabi: sharing data is often a starting point for fruitful collaborations between organizations with a similar agenda. Sharing data very often also means sharing the work burden. Each organization can then focus on their specific areas of expertise, while freeing up resources from areas that can be taken over by other organizations. Sharing the results of such targeted efforts generates high-quality content, and makes it available to all stakeholders in renewable energy, energy efficiency and climate adaptation/mitigation.

We are committed to increasing the share of information available as LOD, and will continue to actively support other organizations thinking of joining the LOD cloud.

Open Economics Hack Day Saturday January 28th 2012

January 18, 2012 in Events, Hackday / Code Sprint, Open Data, WG Economics, Working Groups

This post is by Velichka Dimitrova, Coordinator for the Economics Working Group at the Open Knowledge Foundation.

On Saturday 28th January we’re getting together for an Open Economics Hackday where we’ll be wrangling data and building apps related to economics. All are welcome!

As with all hackdays, exactly what gets worked on gets decided on the day (you can add suggestions to the etherpad). However, one particular idea, which could become a submission to Apps4Italy, is set out below.

One Idea for What We’ll Work On: ProgressVote

One of the most fundamental questions in economic research is: how do we measure social progress? Policy makers have come up with alternative measures accounting for environmental impacts, inequality, happiness and other indicators of human development.

However, the multiplicity of factors has caused another problem – how do we decide on the importance of each individual factor in a composite index? They could be either equally important (such as in the HDI) or they could be given different weights.
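
To make the weighting question concrete, here is a minimal sketch of a composite index computed as a weighted arithmetic mean of indicators that have already been normalised to the range 0 to 1. The indicator names and values are made up, and real indices may aggregate their components differently.

```python
from typing import Dict, Optional


def composite_index(indicators: Dict[str, float],
                    weights: Optional[Dict[str, float]] = None) -> float:
    """Weighted average of indicators already normalised to the 0..1 range."""
    if weights is None:
        # Equal weighting: every indicator counts the same.
        weights = {name: 1.0 for name in indicators}
    total = sum(weights[name] for name in indicators)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(indicators[name] * weights[name] for name in indicators) / total


# Made-up values for one hypothetical country.
country = {"health": 0.8, "education": 0.6, "income": 0.7}

equal = composite_index(country)
weighted = composite_index(country, {"health": 2.0, "education": 1.0, "income": 1.0})
print(round(equal, 3), round(weighted, 3))
```

Doubling the weight on health moves the score from 0.7 to 0.725 here, which shows how the choice of weights alone can change a country's result.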

In our last project YourTopia – which was one of the winners of last year’s World Bank Apps4Development Prize – we offered one possible solution by letting you decide on which dimensions and aspects of economic development to prioritize.

However there are limitations to such an approach: faced with a myriad of technical indicators people are often overwhelmed by the complexity: Does life expectancy at birth matter more than the inflation rate or the M2 money supply? And what does M2 money supply even mean?

In ProgressVote, we’d like to improve on YourTopia in a variety of ways:

First, by combining proxy voting with the crowd-based YourTopia approach: instead of voting for indicators, people vote for expert statements that interpret the dashboard of variables. By doing so, we hope to strike a balance between expert judgements and the interpretation of the general public: experts may be more able to interpret technical data, but in the end it is the citizens who decide which expert statement to endorse.

Second, we’d like to add support for time series, so you can see how progress (or lack of it) has evolved over time, as well as better geo support, so that it is possible to look at how regions as well as countries have performed (consider Italy for instance).
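
One way the proxy-voting idea from the first point could feed back into an index is sketched below. The post does not say how votes would be turned into weights, so this is purely an assumption: each expert statement is taken to imply a set of indicator weights, and the statements are averaged in proportion to the votes they receive. All names and numbers are invented.

```python
from collections import Counter
from typing import Dict, List


def weights_from_votes(expert_weights: Dict[str, Dict[str, float]],
                       votes: List[str]) -> Dict[str, float]:
    """Average the experts' weightings, each counted once per vote it received."""
    tally = Counter(votes)
    # Ignore votes for statements we have no weights for.
    counted = {name: n for name, n in tally.items() if name in expert_weights}
    total = sum(counted.values())
    if total == 0:
        raise ValueError("no valid votes")
    combined: Dict[str, float] = {}
    for name, n in counted.items():
        for indicator, weight in expert_weights[name].items():
            combined[indicator] = combined.get(indicator, 0.0) + weight * n / total
    return combined


# Invented statements and the indicator weights each is assumed to imply.
experts = {
    "statement_a": {"health": 0.5, "education": 0.3, "income": 0.2},
    "statement_b": {"health": 0.2, "education": 0.2, "income": 0.6},
}
votes = ["statement_a", "statement_a", "statement_a", "statement_b"]

result = weights_from_votes(experts, votes)
print({name: round(value, 3) for name, value in result.items()})
```

With three of four votes going to the first statement, the combined weights sit three quarters of the way towards that expert's view.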

Interested? Then come join us on Saturday 28th January!

Ideas for OpenPhilosophy.org

December 20, 2011 in Bibliographic, Free Culture, Ideas, Open Content, Open Data, Public Domain, WG Cultural Heritage, WG Humanities, WG Public Domain, Working Groups

The following post is from Jonathan Gray, Community Coordinator at the Open Knowledge Foundation. It is cross-posted from jonathangray.org.

For several years I’ve been meaning to start OpenPhilosophy.org, which would be a collection of open resources related to philosophy for use in teaching and research. There would be a focus on the history of philosophy, particularly on primary texts that have entered the public domain, and on structured data about philosophical texts.

The project could include:

  • A collection of public domain philosophical texts, in their original languages. This would include so called ‘minor’ figures as well as well known thinkers. The project would bring together texts from multiple online sources – from projects like Europeana, the Internet Archive, Project Gutenberg or Wikimedia Commons, to smaller online collections from libraries, archives, academic departments or individual scholars. Every edition would be rights cleared to check that it could be freely redistributed, and would be made available either under an open license, with a rights waiver or a public domain dedication.
  • Translations of public domain philosophical texts, including historical translations which have entered the public domain, and more recent translations which have been released under an open license.
  • Ability to lay out original texts and translations side by side – including the ability to create new translations, and to line up corresponding sections of the text.
  • Ability to annotate texts, including private annotations, annotations shared with specific users or groups of users, and public annotations. This could be done using the Annotator tool.
  • Ability to add and edit texts, e.g. by uploading or by importing via a URL for a text file (such as a URL from Project Gutenberg). Also ability to edit texts and track changes.
  • Ability to be notified of new texts that might be of interest to you – e.g. by subscribing to certain philosophers.
  • Stable URLs to cite texts and/or sections of texts – including guidance on how to do this (e.g. automatically generating citation text to copy and paste in a variety of common formats).

The project could also include a basic interface for exploring and editing structured data on philosophers and philosophical works:

  • Structured bibliographic data on public domain philosophical works – including title, year, publisher, publisher location, and so on. Ability to make lists of different works for different purposes, and to export bibliographic data in a variety of formats (building on existing work in this area – such as Bibliographica and related projects).
  • Structured data on secondary texts, such as articles, monographs, etc. This would enable users to browse secondary works about a given text. One could conceivably show which works discuss or allude to a given section of a primary text.
  • Structured data on the biographies of philosophers – including birth and death dates and other notable biographical and historical events. This could be combined with bibliographic data to give a basic sense of historical context to the texts.

Other things might include:

  • User profiles – to enable people to display their affiliation and interests, and to be able to get in touch with other users who are interested in similar topics.
  • Audio version of philosophical texts – such as from Librivox.
  • Links to open access journal articles.
  • Images and other media related to philosophy.
  • Links to Wikipedia articles and other introductory material.
  • Educational resources and other material that could be useful in a teaching/learning context – e.g. lecture notes, slide decks or recordings of lectures.

While there are lots of (more or less ambitious!) ideas above, the key thing would be to develop the project in conjunction with end users in philosophy departments, including undergraduate students and researchers. Having something simple that could be easily used and adopted by people who are teaching, studying or researching philosophy or other humanities disciplines would be more important than something cutting edge and experimental but less usable. Hence it would be really important to have a good, intuitive user interface and lots of ongoing feedback from users.

What do you think? Interested in helping out? Know of existing work that we could build on (e.g. bits of code or collections of texts)? Please do leave a comment below, join discussion on the open-humanities mailing list or send me an email!

SNCF launches a debate on open transport data in France

December 15, 2011 in Guest post, Open Data, WG Open Transport

The following guest post is by Pieter Colpaert from iRail npo and Pierre Chrzanowski, and was reviewed by Regards Citoyens. Pieter and Pierre are both members of our brand new Working Group on Open Transport – watch this space for a full announcement of the working group’s activities and details on how to get involved!

At first sight, you may think that data.sncf.com is the new open data website of the SNCF, the National Corporation of French Railways. Not yet. The company preferred to launch a consultation website before opening up its data. Anyone can add their thoughts on open transport data on data.sncf.com.

In a country struggling to involve the transport industry in the open data movement, this initiative is most welcome. After the release of data.gouv.fr, we hope transport data will soon be part of the available datasets. Until now, the lack of open transport data in France has led independent initiatives to extract the data without authorisation, leaving them in a legally insecure position. A change by SNCF is therefore really welcome.

Although SNCF seems to be ready for open data, other public transport operators in France are still reluctant. RATP, the state-owned subway operator for Paris area, recently refused to let other app developers use its map for free. This inspired CheckMyMetro, a startup which was forced to remove the RATP map from its smartphone application, to organize a subway map design contest.

As a lot of organizations are launching similar debates on open data, it is important that they apply the word “open” correctly and that, while doing so, they know how to gain added value for themselves and their customers. Data.sncf.com is a great opportunity to remind the SNCF and other transport actors in Europe of the actual meaning of the word “open” and to help introduce a productive open data policy.

Open data for multimodal transport

Today, commuters use different types of transport to go to work or to travel across Europe. For them, access to timetables, network maps and real-time transport data is the key to organizing their journey or being informed of disruptions. Multimodal transport is part of the latest European Commission transport policy, which has announced the launch of a contest for the best European multimodal journey planner. The software behind these intermodal journey planners can be as intelligent as can be, but when there is no data, the software is useless.

Some countries are already doing their part. The UK Government recently committed itself to the release of high-value transport data, which also provides useful input for answering the data.sncf.com consultation. Here is the list of transport data soon to be released:

  • Rail timetable information on a weekly basis
  • Real-time running data from Network Rail
  • Location data for the Great Britain rail network and GB rail network stations
  • Traveline National Dataset on a weekly basis (Great Britain buses)
  • Next Buses API of planned and real-time information at 350 000 GB bus stops

There are already many journey planner apps offered either by transport companies or developed by independent developer teams, but only a few can help you to organize your journey across the whole EU – Deutsche Bahn comes closest. Furthermore, with open data, there are new services to come that transport companies did not think about.

Transport innovation through real open data

By starting a debate on open data, data.sncf.com wants to take the first steps towards clearing the path for innovative services. The definition of open data is clear and not debatable. As defined by the Open Knowledge Foundation: “A piece of content or data is open if anyone is free to use, reuse, and redistribute it – subject only, at most, to the requirement to attribute and share-alike”. This means data needs to be released for free under an open license and made available in open formats. The French statements on open data also give a clear definition of what “open” means. SNCF could then choose to open its datasets either under the new French Open License or under one of the other open licenses available, like the ODbL, already in use in different French cities. On open formats, the 5-star ranking of the W3C is a good reference. But open transport data is part of an industry and a new market. If we want to help developers to build multimodal apps, respect for standards is required.

Let’s hope this initiative from the SNCF is the beginning of a real shift towards open transport data in France and beyond.

You can participate in the SNCF debate here.

The ePSIplatform is also working on a report on the re-use of transport data in Europe. You can reply to their questionnaire here.

Open Data – Destination Hackney

December 14, 2011 in Guest post, Open Data

The following guest post is by Duncan Ray, from Destination Hackney.

In Summer 2012, the borough of Hackney in London will be opening its doors to millions of visitors flocking to the Olympic games. It’s an exciting time for this part of London, and through the Race for Apps competition it’s a fantastic opportunity for open data too!

Race for Apps is a competition to crowdsource mobile apps from the digital community to showcase Hackney’s area and talent to journalists and businesspeople coming into the local Hackney area next year during the Olympics. It’s a collaboration between Hackney Council, the Technology Strategy Board and Digital Shoreditch.


From the beginning the organisers were very conscious of how releasing data openly could lead to the creation of innovative apps, and that data could provide a big differentiator for the competition.

We carried out research, talking to various developers, and got the strong impression of ‘find my nearest’ and ‘what’s on’ guides as being the key data sets. We have maintained an open dialogue with developers as the competition has gone live, and are still picking up with our internal IT team where data is being requested as the foundation for a new app entrant.

The website Destination Hackney (www.destinationhackney.co.uk), launching in January, showcases Hackney for visitors to the area next year.

Fortunately they had a ready-made data set covering local businesses and events.

Unfortunately the data sets weren’t open: they weren’t licensed for third party commercial use, which made app development a bit of a non-starter.

To resolve this we are now establishing a new dataset, hosted on Destination Hackney’s site, that can be freely used by app developers, and we will be updating this via RSS. The dataset is licensed under the attribution-only Open Government License for Public Sector Information. This will go live from January 2012 for Race for Apps entrants and the data set will build as more businesses come on board. Benefits for businesses and events are substantial, with those apps using Destination Hackney data providing a plethora of new marketing channels for businesses to use to get to visitors next year.

The competition is now on, and you can enter your ideas, finished apps, or apps reworked for the local context in the categories “Finding Your Way”, “Making Connections”, “Citizen Journalists”, “Fun and Games”, and “Wild Card” on the Race for Apps site. Or you can hold out for the release of the Destination Hackney dataset in the New Year!