You are browsing the archive for CKAN.

Open Data Training at the Open Knowledge Foundation

Laura James - September 26, 2013 in Business, CKAN, Featured, Open Data, Open Government Data, Open Knowledge Foundation, Our Work, School of Data, Technical, Training

We’re delighted to announce today the launch of a new portfolio of open data training programs.

For many years the Open Knowledge Foundation has been working — both formally and informally — with governments, civil society organisations and others to provide this kind of advice and training. Today marks the first time we’ve brought it all together in one place with a clear structure.

These training programs are designed for two main groups of people interested in open data:

  1. Those within government and other organisations seeking a short introduction to open data – what it is, why to “do” open data, what the challenges are, and how to get started with an open data project or policy.

  2. The growing group of those specialising in open data, perhaps as policy experts, open data program managers, technology specialists, and so on, generally within government or other organisations. Here we offer more in-depth training including detailed material on how to run an open data program or project, and also a technical course for those deploying or maintaining open data portals.

Our training programs are designed and delivered by our team of open data experts with many years of experience creating, maintaining and supporting open data projects around the world.

Please contact us for details on any of these courses, or if you’d be interested in discussing a custom program tailored to your needs.

Our Open Data Training Programs

Open Data Introduction

Who is this for?

This course is a short introduction to open data for anyone and is perfectly suited to teams from diverse functions across organisations who are thinking about or adopting open data for the first time.

Topics covered

Everything you need to understand and start working in this exciting new area: what is open data, why should institutions open data, what are the benefits and opportunities of doing so, and of course how you can get started with an open data policy or project.

This is a one day course to help you and your team get started with open data.

Photo by Victor1558

Administrative Open Data Management

Who is this for?

Those specialising in open data, whether as policy experts, open data program managers, or in similar roles in government, civil service, and other organisations. This course is specifically for non-technical staff who are responsible for managing Open Data programs in their organisation. Such activities typically include implementing an Open Data strategy, designing/launching an Open Data portal, coordinating publication processes, preparing data for publication, and fostering data re-use.

Topics covered

Basics of Open Data (legal, managerial, technical); Success factors for the design and execution of an Open Data program; Overview of the technology landscape; Success factors for community re-use.

Open Data Portal Technology

Who is this for?

Those specialising in open data, whether as software or data experts, open data delivery managers, or in similar roles in government, civil service, and other organisations. This course is for technical staff who are responsible for maintaining or running an enterprise Open Data portal. Such activities typically include deployment, system administration and hosting, site theming, development of custom extensions and applications, ETL procedures, data conversions, and data life-cycle management.

Topics covered

Basics of Open Data, publication process, and technology landscape; architecture and core functionality of a modern Open Data Management System (CKAN used as example). Deployment, administration and customisation; deploying extensions; integration; geospatial and other special capabilities; engaging with the CKAN community.

Custom training

We can offer training programs tailored to your specific needs, for your organisation, data domain, or locale. Get in touch today to discuss your requirements!

Working with data

We also run the School of Data, which helps civil society organisations, journalists and citizens learn the skills they need to use data effectively, through both online and in-person “learning through doing” workshops. The School of Data runs data-driven investigations and explorations, and data clinics and workshops from “What is Data” up to advanced visualisation and data handling. As well as general training and materials, we offer topic-specific and custom courses and workshops. Please contact schoolofdata@okfn.org to find out more.

As with all of our work, all relevant materials will be openly licensed, and we encourage others (in the global Open Knowledge Foundation network and beyond) to use and build on them.

Publish from ScraperWiki to CKAN

Guest - July 5, 2013 in CKAN, Featured Project

The following post is by Aidan McGuire, co-founder of ScraperWiki. It is cross-posted on the ScraperWiki blog.

ScraperWiki are looking for open data activists to try out our new “Open your data” tool.

Since its first launch ScraperWiki has worked closely with the Open Data community. Today we’re building on this commitment by pre-announcing the release of the first in a series of tools that will enable open data activists to publish data directly to open data catalogues.

To make this even easier, ScraperWiki will also be providing free datahub accounts for open data projects.

This first tool will allow users of CKAN catalogues (there are 50, from Africa to Washington) to publish a dataset that has been ingested and cleaned on the new ScraperWiki platform. It’ll be released on the 11th July.

screenshot showing new tool (alpha)

If you run an open data project which scrapes, curates and republishes open data, we’d love your help testing it. To register, please email hello@scraperwiki.com with “open data” in the subject, telling us about your project.

Why are we doing this? Since its launch ScraperWiki has provided a place where an open data activist could get, clean, analyse and publish data. With the retirement of “ScraperWiki Classic” we decided to focus on the getting, cleaning and analysing, and leave the publishing to the specialists – places like CKAN.

This new “Open your data” tool is just the start. Over the next few months we also hope that open data activists will help us work on the release of tools that:

  • Generate RDF (linked data)
  • Update data in real time
  • Publish to other data catalogues

Here’s to liberating the world’s messy open data!


Aidan McGuire is the co-founder of ScraperWiki, the site which enables you to “Get, clean, analyse, visualise and manage your data, with simple tools or custom-written code.” Among other things, they write and catalogue screen-scrapers to extract and analyse public data from websites.

U.S. government’s data portal relaunched on CKAN

Irina Bolychevsky - May 23, 2013 in CKAN, Featured, News, Releases

Today, we are excited to announce that our work with the US Federal Government (data.gov) has gone live at catalog.data.gov! You can also read the announcement from the data.gov blog with their description of the new catalog.

Catalog.Data.gov

The Open Knowledge Foundation’s Services team, which deploys CKAN, have been working hard on a new unified catalog to replace the numerous previously existing catalogs of data.gov. All geospatial and raw data is federated into a single portal where data from different portals, sources and catalogs is displayed in a beautiful standardized user interface allowing users to search, filter and facet through thousands of datasets.

This is a key part of the U.S. meeting their newly announced Open Data Policy and marks data.gov’s first major step into open source. All the code is available on Github and data.gov plan to make their CKAN / Drupal set-up reusable for others as part of OGPL.

As one of the first major production sites to launch with the shiny new CKAN 2.0, data.gov takes advantage of the much improved information architecture, templating and distributed scalable authorization model. CKAN provides data.gov with a web interface for over 200 publishing organizations to manage their members, harvest sources and datasets – supporting requirements being outlined in Project Open Data. This means that agencies can maintain their data sources individually, schedule regular refreshes of the metadata into the central repository and manage an approval workflow.

There have been many additions to CKAN’s geospatial functionality, most notably a fast and elegant geospatial search:

Geospatial search filter
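
As a rough illustration of what a geospatial search can look like from a script, the sketch below builds a search URL restricted to a bounding box. It assumes the spatial extension is installed on the target site and that it accepts an `ext_bbox` parameter in `min_lon,min_lat,max_lon,max_lat` order on the `package_search` action; the function name and the example.org site are hypothetical, so check the extension's documentation before relying on it.

```python
from urllib.parse import urlencode


def build_bbox_search_url(site_url, bbox, query=""):
    """Build a package_search URL limited to a bounding box.

    bbox is (min_lon, min_lat, max_lon, max_lat). The ext_bbox parameter
    name is an assumption about the spatial extension, not core CKAN.
    """
    min_lon, min_lat, max_lon, max_lat = bbox
    # Simplification: boxes that cross the antimeridian are rejected.
    if min_lon > max_lon or min_lat > max_lat:
        raise ValueError("bbox must be (min_lon, min_lat, max_lon, max_lat)")
    params = {"ext_bbox": ",".join(str(value) for value in bbox)}
    if query:
        params["q"] = query
    return f"{site_url.rstrip('/')}/api/3/action/package_search?{urlencode(params)}"


# Hypothetical site and an illustrative box roughly covering the British Isles.
print(build_bbox_search_url("https://example.org", (-7.5, 49.2, 3.9, 57.4), "pollution"))
```

Keeping the URL construction separate from the HTTP call makes the query easy to inspect before it is sent anywhere.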

We have added robust support for harvesting FGDC and ISO 19139 documents from WAFs, single spatial documents, CSW endpoints, ArcGIS portals, Z39.50 sources, ESRI Geoportal Servers as well as other CKAN catalogs. This is available for re-use as part of our harvesting and spatial extensions.

Most importantly, this is a big move towards greater accessibility and engagement with re-users. Not only is metadata displayed through a browsable web interface (instead of XML files), there is now a comprehensive CKAN API with access to all web functionality including search queries and downloads which respects user and publisher permission settings. Users can preview the data in graphic previews as well as exploring Web Map Services, whilst the dataset page provides context, browsable tags, dataset extent, and maintainers.
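
To give a flavour of the search queries mentioned above, here is a minimal sketch in Python. It builds a URL for CKAN's `package_search` action and reads dataset titles out of a response. The helper names are hypothetical and the response shown is an invented sample, not real catalog.data.gov output.

```python
import json
from urllib.parse import urlencode


def build_search_url(site_url, query, rows=10, start=0):
    """Return the URL for a package_search call on a CKAN site."""
    params = urlencode({"q": query, "rows": rows, "start": start})
    return f"{site_url.rstrip('/')}/api/3/action/package_search?{params}"


def dataset_titles(response_body):
    """Extract dataset titles from a package_search JSON response."""
    payload = json.loads(response_body)
    if not payload.get("success"):
        raise RuntimeError(f"CKAN reported an error: {payload.get('error')}")
    return [dataset["title"] for dataset in payload["result"]["results"]]


# Invented sample in the shape CKAN returns: a success flag plus a result object.
SAMPLE_RESPONSE = json.dumps({
    "success": True,
    "result": {
        "count": 2,
        "results": [
            {"name": "example-air-quality", "title": "Example air quality readings"},
            {"name": "example-stations", "title": "Example monitoring stations"},
        ],
    },
})

print(build_search_url("https://catalog.data.gov", "air quality", rows=5))
print(dataset_titles(SAMPLE_RESPONSE))
```

In a real script the URL would be fetched with an HTTP client and the response body passed to `dataset_titles`; the canned sample keeps the sketch runnable offline.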

Web Map Service

As data.gov invites users to get involved and provide feedback, we would also like to say that we are really excited about CKAN’s future. We have a very active mailing list, new documentation for installing CKAN and ways to contribute to the code for anyone wanting to join the CKAN community.

If you’re launching a CKAN portal soon or have one we don’t know about, let us know and we’ll make sure to add you to our wall of awesome!

Announcing CKAN 2.0

Mark Wainwright - May 10, 2013 in CKAN, Featured, Featured Project, News, OKF Projects, Open Data, Open Government Data, Releases, Technical

CKAN is a powerful, open source, open data management platform, used by governments and organizations around the world to make large collections of data accessible, including the UK and US government open data portals.

Today we are very happy and excited to announce the final release of CKAN 2.0. This is the most significant piece of CKAN news since the project began, and represents months of hectic work by the team and other contributors since before the release of version 1.8 last October, and of the 2.0 beta in February. Thank you to the many CKAN users for your patience – we think you’ll agree it’s been worth the wait.

[Screenshot: Front page]

CKAN 2.0 is a significant improvement on 1.x versions for data users, programmers, and publishers. Enormous thanks are due to the many users, data publishers, and others in the data community, who have submitted comments, code contributions and bug reports, and helped to get CKAN to where it is. Thanks also to OKF clients who have supported bespoke work in various areas that has become part of the core code. These include data.gov, the US government open data portal, which will be re-launched using CKAN 2.0 in a few weeks. Let’s look at the main changes in version 2.0. If you are in a hurry to see it in action, head on over to demo.ckan.org, where you can try it out.

Summary

CKAN 2.0 introduces a new sleek default design, and easier theming to build custom sites. It has a completely redesigned authorisation system enabling different departments or bodies to control their own workflow. It has more built-in previews, and publishers can add custom previews for their favourite file types. News feeds and activity streams enable users to keep up with changes or new datasets in areas of interest. A new version of the API enables other applications to have full access to all the capabilities of CKAN. And there are many other smaller changes and bug fixes.

Design and theming

The first thing that previous CKAN users notice will be the greatly improved page design. For the first time, CKAN’s look and feel has been carefully designed from the ground up by experienced professionals in web and information design. This has affected not only the visual appearance but many aspects of the information architecture, from the ‘breadcrumb trail’ navigation on each page, to the appearance and position of buttons and links to make their function as transparent as possible.

[Screenshot: dataset page]

Under the surface, an even more radical change has affected how pages are themed in CKAN. Themes are implemented using templates, and the old templating system has been replaced with the newer and more flexible Jinja2. This makes it much easier for developers to theme their CKAN instance to fit in with the overall theme or branding of their web presence.

Authorisation and workflow: introducing CKAN ‘Organizations’

Another major change affects how users are authorised to create, publish and update datasets. In CKAN 1.x, authorisation was granted to individual users for each dataset. This could be augmented with a ‘publisher mode’ to provide group-level access to datasets. A greatly expanded version of this mode, called ‘Organizations’, is now the default system of authorisation in CKAN. This is much more in line with how most CKAN sites are actually used.

[Screenshot: Organizations page]

Organizations make it possible for individual departments, bodies, groups, etc, to publish their own data in CKAN, and to have control over their own publishing workflow. Different users can have different roles within an Organization, with different authorisations. Linked to this is the possibility for each dataset to have different statuses, reflecting their progress through the workflow, and to be public or private. In the default set-up, Organization user roles include Members (who can read the Organization’s private datasets), Editors (who can add, edit and publish datasets) and Admins (who can add and change roles for users).
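
The default roles described above can be pictured as a small permission table. The sketch below is purely illustrative and is not CKAN's own code: the action names are made up, and it assumes the roles are cumulative (Editors can do everything Members can, Admins everything Editors can), which the description suggests but does not state outright.

```python
from enum import Enum


class Role(Enum):
    MEMBER = "member"
    EDITOR = "editor"
    ADMIN = "admin"


# Hypothetical action names; cumulative permissions are an assumption.
MEMBER_ACTIONS = {"read_private"}
EDITOR_ACTIONS = MEMBER_ACTIONS | {"create_dataset", "edit_dataset", "publish_dataset"}
ADMIN_ACTIONS = EDITOR_ACTIONS | {"manage_roles"}

PERMISSIONS = {
    Role.MEMBER: MEMBER_ACTIONS,
    Role.EDITOR: EDITOR_ACTIONS,
    Role.ADMIN: ADMIN_ACTIONS,
}


def can(role, action):
    """Return True if a user with this role may perform the action."""
    return action in PERMISSIONS[role]


for role in Role:
    print(role.value, sorted(PERMISSIONS[role]))
```

A site with different needs would change the table rather than the check, which is roughly the flexibility the Organizations model is meant to provide.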

More previews

In addition to the existing image previews and table, graph and map previews for spreadsheet data, CKAN 2.0 includes previews for PDF files (shown below), HTML (in an iframe), and JSON. Additionally there is a new plugin extension point that makes it possible to add custom previews for different data types, as described in this recent blog post.

[Screenshot: PDF preview]

News feeds and activity streams

CKAN 2.0 provides users with ways to see when new data or changes are made in areas that they are interested in. Users can ‘follow’ datasets, Organizations, or groups (curated collections of datasets). A user’s personalised dashboard includes a news feed showing activity from the followed items – new datasets, revised metadata and changes or additions to dataset resources. If there are entries in your news feed since you last read it, a small flag shows the number of new items, and you can opt to receive notifications of them via e-mail.

Each dataset, Organization etc also has an ‘activity stream’, enabling users to see a summary of its recent history.

[Screenshot: News feed]

Programming with CKAN: meet version 3 of the API

CKAN’s powerful application programming interface (API) makes it possible for other machines and programs to automatically read, search and update datasets. CKAN’s API was previously designed according to REST principles. RESTful APIs are deservedly popular as a way to expose a clean interface to certain views on a collection of data. However, for CKAN we felt it would be better to give applications full access to CKAN’s own internal machinery.

A new version of the API – version 3 – trialled in beta in CKAN 1.8, replaced the REST design with remote procedure calls, enabling applications or programmers to call the same procedures as CKAN’s own code uses to implement its user interface. Anything that is possible via the user interface, and a good deal more, is therefore possible through the API. This proved popular and stable, and so, with minor tweaks, it is now the recommended API. Old versions of the API will continue to be provided for backward compatibility.
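
To make the remote-procedure-call style concrete, here is a minimal sketch of how a client might prepare a call to a named action. Each action lives at `/api/3/action/<action_name>`, takes its parameters as a JSON body, and returns a JSON object with a `success` flag. The helper names, the dataset id and the API key below are hypothetical, and the request is only built, not sent.

```python
import json
from urllib.request import Request


def build_action_request(site_url, action, params, api_key=None):
    """Prepare a POST request that calls one CKAN action by name."""
    headers = {"Content-Type": "application/json"}
    if api_key:
        # Write actions need the user's API key; reads of public data do not.
        headers["Authorization"] = api_key
    return Request(
        f"{site_url.rstrip('/')}/api/3/action/{action}",
        data=json.dumps(params).encode("utf-8"),
        headers=headers,
        method="POST",
    )


def parse_action_response(body):
    """Return the 'result' of an action response, or raise on failure."""
    payload = json.loads(body)
    if not payload.get("success"):
        raise RuntimeError(f"Action failed: {payload.get('error')}")
    return payload["result"]


request = build_action_request(
    "http://demo.ckan.org", "package_show", {"id": "example-dataset"}
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Passing the prepared request to `urllib.request.urlopen` would perform the call; the same two helpers work for any action, which is the practical appeal of exposing the internal procedures directly.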

Documentation, documentation, documentation

CKAN comes with installation and administration documentation which we try to keep complete and up-to-date. The major changes in the rest of CKAN have thus required a similarly concerted effort on the documentation. It’s great when we hear that others have implemented their own installation of CKAN, something that’s been increasing lately, and we hope to see even more of this. The docs have therefore been overhauled for 2.0. CKAN is a large and complex system to deploy, and work on improving the docs continues: version 2.1 will be another step forward. Where people do run into problems, help remains available as usual on the community mailing lists.

… And more

There are many other minor changes and bug fixes in CKAN 2.0. For a full list, see the CKAN changelog.

Installing

To install your own CKAN, or to upgrade an existing installation, you can install it as a package on Ubuntu 12.04 or do a source installation. Full installation and configuration instructions are at docs.ckan.org.

Try it out

You can try out the main features at demo.ckan.org. Please let us know what you think!

European Union launches CKAN data portal

Mark Wainwright - February 25, 2013 in CKAN, Open Data, WG EU Open Data, WG Open Government Data

On Friday, to coincide with Saturday’s International Open Data Day, the European Commission (EC) unveiled a new data portal, which will be used to publish data from the EC and other bodies of the European Union.

This major project was announced last year, and it went live in December for testing before today’s announcement. The portal includes extensive CKAN customisation and development work by the Open Knowledge Foundation, including a multilingual extension enabling data descriptions (metadata) to be made available in different languages: at present the metadata is offered in English, French, German, Italian and Polish. The portal was originally planned for EC data, but it will now also hold data from the European Environment Agency, and hopefully in time a number of other EU bodies as well.

The EU has been a key mover in driving the Open Data agenda in member states, so it is fitting that it is now promoting transparency and re-use of its own data holdings by making them available in one place. It has for some years been encouraging member states to publish data via dedicated portals, and it also supports the OKF’s work on publicdata.eu, a prototype of a pan-European data portal harvesting data from catalogues across the Union, via the LOD2 research project.

The portal currently makes 5,885 datasets available, most of which come from Eurostat. In their blog post announcing the launch the European Commission say they are “confident that it will be a catalyst for change in the way data is handled inside the Commission as well as beyond”, and promise more to come:

More data will become available as the Commission’s services adapt their data management and licensing policies and make machine-readable formats the rule. Our ambition is to make an open licence applicable across the board for all datasets in the portal.

Furthermore, in 2013, an overarching pan-European aggregator for open data should federate the content of more than 70 existing open data portal initiatives in the Member States at national, regional or local level.

We’re looking forward to helping make it happen.

New Open data hub from OKFN Greece

Charalampos Bratsas - February 14, 2013 in CKAN, OKF Greece, Open Data, Open Government Data

Opening up public sector data is becoming a top priority for governments throughout Europe and North America. We are pleased to announce the launch of the new Greek open data hub, developed and hosted by OKFN Greece. The data hub integrates the Open Knowledge Foundation’s open source data cataloging software CKAN, which is also the basis of the UK, the European and the US portals.

Open data can be used in smart city services, financial monitoring, decision support systems and numerous other applications. The problem is finding it. Suppose you wanted to make a shiny new smartphone app, requiring a combination of geospatial data, some cultural facts and a photo collection. You know this data exists, but you are also aware that you are going to have a hard time finding its providers, discovering its outgoing links and its license. All of this involves a significant investment of time.

Ordinary citizens, too, are made to invest precious time hunting down and combining data, such as the location of the nearest Job Centre, plus information on how to get there by public transport.

This is why we need data hubs where publishers can use, promote, and advertise all their datasets together. Citizens will also catalog a dataset if it is useful to them and maybe to others. Once the datasets reach a critical level, links between them are discovered and developed, multiplying the value of the datasets and dynamically increasing their significance. Combine this with live data previews, a smart search system and a powerful API and you have taken open data to the next level.

The Greek open data hub includes:

  1. The Open Data repository (http://ckan.okfn.gr). This section of the site is built using the CKAN platform (like the EU & UK sites).
  2. Examples of applications using Greek linked open data, like Greek DBpedia (DayLikeToday, DBpedia game) and visualizations with data from the Clarity Program, the municipalities etc.
  3. A live demo where anybody will be able to submit a SPARQL query and chart its results with Google Chart Editor.
  4. Information about the Greek Linked Open Data cloud – a visual network representation of the Greek Linked Open Data Cloud. OKFN Greece is constantly working on making this one huge!
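
For readers curious what submitting a SPARQL query and charting its results involves, here is a minimal sketch. It follows the standard SPARQL protocol (the query is sent as a `query` parameter) and the standard SPARQL JSON results format, and flattens the results into rows a charting tool could consume. The endpoint URL, the query and the sample results are hypothetical placeholders, not the Greek hub's actual endpoint or data.

```python
import json
from urllib.parse import urlencode


def build_sparql_url(endpoint, query):
    """Return a GET URL that submits the query to a SPARQL endpoint."""
    return f"{endpoint}?{urlencode({'query': query})}"


def rows_from_results(body):
    """Flatten SPARQL JSON results into tuples, one per result row.

    Unbound variables become None. Values arrive as strings and need
    converting before they are plotted as numbers.
    """
    payload = json.loads(body)
    variables = payload["head"]["vars"]
    return [
        tuple(binding[var]["value"] if var in binding else None for var in variables)
        for binding in payload["results"]["bindings"]
    ]


QUERY = "SELECT ?label ?count WHERE { ?s ?p ?o } LIMIT 2"

# Invented sample in the SPARQL 1.1 JSON results format.
SAMPLE_RESULTS = json.dumps({
    "head": {"vars": ["label", "count"]},
    "results": {"bindings": [
        {"label": {"type": "literal", "value": "A"},
         "count": {"type": "literal", "value": "10"}},
        {"label": {"type": "literal", "value": "B"}},
    ]},
})

print(build_sparql_url("http://example.org/sparql", QUERY))
print(rows_from_results(SAMPLE_RESULTS))
```

A real client would also send an `Accept: application/sparql-results+json` header so the endpoint returns JSON rather than XML.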

Find out how you can use the hub, contribute to it, and get involved on our blog!

US government to release open data using OKF’s CKAN platform

Mark Wainwright - February 1, 2013 in CKAN, News, Open Geodata, Open Government Data

You may have seen hints of it before, but the US government data portal, data.gov, has just announced officially that its next iteration – “data.gov 2.0” – will incorporate CKAN, the open-source data management system whose development is led and co-ordinated by the Open Knowledge Foundation. The OKF itself is one of the organisations helping to implement the upgrade.

Like all governments, the US collects vast amounts of data in the course of its work. Because of its commitment to Open Data, tens of thousands of datasets are openly published through data.gov. The new-look data.gov will be a major enhancement, and will for the first time bring together geospatial data with other kinds of data in one place.

CKAN is fast becoming an industry standard, and the US will become the latest to benefit from its powerful user interface for searching and browsing, rich metadata support, harvesting systems to help ingest data from existing government IT systems, and machine interface, helping developers to find and re-use the data. The partnership is also excellent news for CKAN, which is being improved with enhancements to its features for ingesting and handling geodata.

As it happens, CKAN itself is also moving towards a version 2.0. In fact, after months of hard work, the beta-version of CKAN 2.0 will hopefully be released in a couple of weeks. To keep up to date with developments, follow the CKAN blog or follow @CKANproject on Twitter.

Building a data portal with CKAN

Mark Wainwright - September 5, 2012 in CKAN, Open Government Data

A while ago, Augusto Hermann wrote on this blog about a unique civic engagement project: the participatory process of building a government data portal in Brazil. The site, dados.gov.br, is still going strong, and Augusto has now written over on the CKAN blog about the process of building and deploying it using CKAN, the Open Knowledge Foundation’s free, open-source Data Management System for publishing data.

Augusto reports:

Our experience using CKAN has been positive throughout, from deploying an instance to work on extensions and translations. The beta version was created in a one-day open sprint of development on the portal (pictured below). The excellent installation documentation then meant that a person without much experience of Python or systems administration was able to put it online in a few hours.

Read the whole post here.

UK Government Releases Open Data White Paper and new Data.Gov.UK

Rufus Pollock - June 28, 2012 in CKAN, Featured, News, Open Government Data

Today, the UK government made a major announcement regarding Open Data and released a revamped Data.Gov.UK, its flagship open data site.

Open Data White Paper

The Cabinet Office is ushering in a new wave of open data releases with the publication of a new Open Data White Paper.

The White Paper gestures at a world in which there is “presumption to publish” within government, and in which common standards and formats for publishing data online are adhered to. It also includes a commitment on the part of government to provide public sector data for free “wherever appropriate and possible”.

The document, written by Cabinet Office minister Francis Maude, highlights the benefits of open data for society at large such as greater transparency and improved public services. The White Paper also explores the way in which open data can unlock economic potential by stimulating the creation of new tools and services.

The data to be released under the new plans will add to the 9,000 datasets already available via data.gov.uk, a data portal powered by the Open Knowledge Foundation’s open-source software CKAN.

The plans set out in the White Paper also include details of the way in which government will safeguard private information and data. Privacy experts will be consulted during every planned open data release to make sure that the value of open data is realised without compromising on individuals’ rights to privacy.

The Open Data White Paper can be downloaded here

New Data.Gov.UK – new CKAN

In tandem with the release of the White Paper the UK Government’s flagship site Data.Gov.UK has seen a major overhaul. As part of this its data section is now being powered directly by the latest version of the Open Knowledge Foundation’s open-source CKAN data portal software.

This brings new features and a UI upgrade to the site. Here’s an overview.

Data Home Page

Data.Gov.UK data home page

Data Search

View Dataset Information – Contract Spending

Data.Gov.UK dataset information page

View Dataset Data

Data.Gov.UK contract spending data page

Find out more

Read more on the CKAN blog here or on Data.Gov.UK here and here.

The Code

All the code for the CKAN part of Data.Gov.UK is open-source and available on github:

Some of the other extensions used:

Opening up scientific data with CKAN and the DataHub

Mark Wainwright - June 19, 2012 in CKAN, Open Science

The argument for open-access science has been won. The old model of scientific publishing was laid down when the costs of publishing were so great that charging for access was the sensible way to meet them. As scientists’ work moves online, it is the old model we can no longer afford: the cost to humanity of restricting access is too high. A few scientists may have been saying this for years, but now, not only does open-access have the backing of such respected bodies as the Wellcome Trust, but the fact gets lead front-page coverage in the national press. A government-commissioned report published yesterday adds weight to the case. The Open Access tide, we may hope, is unstoppable.

However, it has not yet breached all the defences and overrun the plains. Until then, if you are a researcher, how can you get your research results out where people can read your conclusions – and even work with your data? At the Open Knowledge Foundation, we believe we have one answer.

CKAN: open source data management

CKAN is a free, open-source data management system. It is used to get data out in the open by local and national governments as well as international bodies, but it was originally designed for the more community-oriented use of which the DataHub is an excellent example. On the DataHub, anyone can create a dataset in a couple of minutes. Data can be uploaded or linked to elsewhere on the web. Different data ‘resources’ (such as files of any kind) can be collected together in a dataset, and annotated with information about their author(s), provenance, availability for re-use, etc.

Publishing research on the DataHub

CKAN is agnostic about what kind of data can be published. A scientific paper might be catalogued as one dataset. The resources could be, for example: different versions of the printed paper (say, the author’s TeX file, and a PDF); a link to the paper’s page on a journal website; spreadsheets of experimental results; the source code you wrote to process the results; and others, such as separate image files of your graphs and diagrams. Of course, how much is included will depend, among other things, on which rights you haven’t signed away to the publisher.
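
As a sketch of how a paper and its files might be described to CKAN, the snippet below assembles a dataset dictionary with one resource per file, in the shape the API's `package_create` action accepts (`name`, `title`, and a `resources` list with `url`, `format` and `description`). The helper name, the paper and all URLs are hypothetical.

```python
import json


def paper_dataset(name, title, resources):
    """Build a dataset dict for one paper.

    resources is a list of (url, format, description) tuples, one per file
    or link that belongs to the paper.
    """
    return {
        "name": name,
        "title": title,
        "resources": [
            {"url": url, "format": fmt, "description": description}
            for url, fmt, description in resources
        ],
    }


dataset = paper_dataset(
    "example-paper-2012",
    "An Example Paper",
    [
        ("http://example.org/paper.tex", "TeX", "Author's source file"),
        ("http://example.org/paper.pdf", "PDF", "Typeset paper"),
        ("http://example.org/results.csv", "CSV", "Experimental results"),
        ("http://example.org/process.py", "Python", "Code used to process the results"),
    ],
)
print(json.dumps(dataset, indent=2))
```

Because each resource carries its own format and description, readers can tell the typeset paper from the raw results at a glance.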

The screenshot below shows an example of a paper represented in just this kind of way (the original dataset is here):

[IMG: Dataset screenshot]

Visualising, checking and re-using data

If you publish data it is probably in the hope that other people will use it – whether to check your results or as a starting point for new research of their own. CKAN provides interactive visualisations of your data, as well as an API for querying the data directly across the web – allowing other scientists (or your future self!) to search and process your results without downloading large data files or writing their own interface. Visualisations can also be embedded in blog posts or other web pages. For example, here, live from the DataHub, is a graph of average annual global temperature anomaly, showing the effect of global warming since 1880 in hundredths of a degree:

Metadata

CKAN stores a rich set of metadata, with versioned history. By default it has standard fields such as title, author, and a free-form description, but as a scientist you want others, for example for your paper’s journal, volume number, Digital Object Identifier. No problem – you can add as many fields as you like, as the screenshot below shows. A CKAN site specialised for research could include such fields by default.

[IMG: Metadata screenshot]
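
Through the API, extra fields like these are carried in a dataset's `extras` list, where each entry is a key and a value. The sketch below shows one way to attach journal details to a dataset dictionary; the helper name, the journal and the DOI are made-up placeholders.

```python
def with_extras(dataset, **fields):
    """Return a copy of the dataset dict with custom fields added as extras.

    Values are stored as strings and keys are sorted so the output is stable.
    """
    extras = [{"key": key, "value": str(value)} for key, value in sorted(fields.items())]
    return {**dataset, "extras": extras}


paper = {"name": "example-paper-2012", "title": "An Example Paper"}
annotated = with_extras(
    paper,
    journal="Example Journal of Examples",
    volume=12,
    doi="10.0000/example.2012",
)
for extra in annotated["extras"]:
    print(extra["key"], "=", extra["value"])
```

The original dictionary is left untouched, so the same base record can be annotated differently for different sites.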

Benefits to the researcher

You’ve already put your research in all kinds of places. Perhaps there’s a preprint in arXiv.org, a copy on an institutional repository or on your departmental website, and if you’re lucky enough to publish in an open-access journal, it’s on their website too. Are there any benefits of putting it in CKAN as well? Here are some.

Collect all your output together: You can create a group that collects all your output together. You may have moved institutions, published in different journals (and even different fields), leaving a trail of out-of-date home pages behind you with incomplete lists of your publications. But you can always keep a complete record of your output on your favourite CKAN datahub.

Collect publications from other hubs: Conversely, perhaps you are an institution, looking to build a repository, but your departments want to retain their own ‘look and feel’ or even their own sites. They can achieve the former with customisable theming on group pages. Alternatively, CKAN’s advanced harvesting system means you can import and synchronise metadata from other hubs, or even different systems, providing they make their metadata available in a standard format.

Access control: You can control who can see and edit your datasets, so for example joint papers can be edited by any of the authors.

Alt metrics: Get a record of how many people have accessed or downloaded your data. If the appropriate CKAN extension is installed, your dataset can have share buttons (for Twitter, Facebook, etc) and you can also get figures for how often it has been shared.

Try it out

You can try out CKAN right now, by taking your favourite piece of research and heading over to thedatahub.org. Alternatively, if you happen to be a department / university / funding council / research group / etc and fancy your own CKAN site, have a look at ckan.org, or feel free to get in touch.
