You are browsing the archive for Exemplars.

Prescribing Analytics: how drug data can save the NHS millions

December 17, 2012 in Exemplars, External, Open Data, Open Science

Last week saw the launch of prescribinganalytics.com (covered in the Economist and elsewhere). At present it’s “just” a nice data visualisation of some interesting open data that show the NHS could potentially save millions from its drug budget. I say “just” because we’re in discussions with several NHS organizations about providing a richer, tailored, prescribing analytics service to support the best use of NHS drug budgets.

Working on the project was a lot of fun, and to my mind the work nicely shows the spectacular value of open data when combined with people and the internet. The data was, and is, out there. All 11 million or so rows of it per month, detailing every GP prescription in England. Privately, some people expressed concern that failure to do anything with the data so far was undermining efforts to make it public at all.

Once data is open it takes time for people to discover a reason for doing something interesting with it, and to get organized to do it. There’s no saying what people will use the data for, but provided the data isn’t junk there’s a good bet that sooner or later something will happen.

The story of how prescribinganalytics.com came to be is illustrative, so I’ll briefly tell my version of it here… Fran (CEO of https://www.mastodonc.com/) emailed me a few months ago with the news that she was carrying out some testing using the GP prescribing data.

I replied and suggested looking at prescriptions of proprietary vs generic ACE-inhibitors (a drug that lowers blood pressure) and a few other things. I also cc’d Ben Goldacre and my good friend Tom Yates. Ben shared an excellent idea he’d had a while ago for a website with a naughty name that showed how much money was wasted on expensive drugs where there was an identically effective cheaper option and suggested looking at statins (a class of drug that reduces the risk of stroke, heart attack, and death) first.
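For readers who want a feel for the arithmetic behind a proprietary-versus-generic comparison, here is a minimal Python sketch. It assumes the saving is estimated by repricing every proprietary item at the average generic price per item found in the same data. The field names, drug labels and figures are invented for illustration; they are not taken from the NHS prescribing data, and this is not necessarily the method used on the site.

```python
from dataclasses import dataclass


@dataclass
class PrescriptionRow:
    """One aggregated row; the field names here are hypothetical."""
    practice: str
    drug: str          # label for the product prescribed
    is_generic: bool   # True if this is the cheaper generic option
    items: int         # number of items prescribed
    cost_pence: int    # total cost of those items, in pence


def potential_saving_pence(rows):
    """Estimate the saving if every proprietary item had instead cost
    the average generic price per item seen in the same rows."""
    generic_items = sum(r.items for r in rows if r.is_generic)
    generic_cost = sum(r.cost_pence for r in rows if r.is_generic)
    if generic_items == 0:
        raise ValueError("no generic prescriptions to take a price from")
    generic_price = generic_cost / generic_items

    saving = 0.0
    for r in rows:
        if not r.is_generic:
            # Actual spend minus what the same items would cost as generics.
            saving += r.cost_pence - r.items * generic_price
    return saving


# Invented example values, for illustration only.
rows = [
    PrescriptionRow("Practice A", "generic statin", True, 100, 15000),
    PrescriptionRow("Practice A", "branded statin", False, 50, 100000),
    PrescriptionRow("Practice B", "generic statin", True, 100, 15000),
]
print(potential_saving_pence(rows) / 100)  # estimated saving in pounds
```

A real analysis would also need the clinical judgement mentioned above: the repricing only makes sense where the cheaper drug really is an equally effective substitute.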

Fran did the data analysis and made beautiful graphics. Ben, Tom, and I, with help from a handful of other friends, academics, and statisticians provided the necessary domain expertise to come up with an early version of the site which had a naughty name. We took counsel and decided it’d be more constructive, and more conducive to our goals, not to launch the site with a naughty name. A while later the http://www.theodi.org/ offered to support us in delivering prescribinganalytics.com. In no particular order, Bruce (CTO of https://www.mastodonc.com/), Ayesha (http://londonlime.net/), Sym Roe, Ross Jones, and David Miller collaborated with the original group to make the final version.

I’d call the way we worked peer production: a diverse group of people with very different skill sets and motivations formed a small self-organizing community to achieve the task of delivering the site. I think the results speak for themselves. It’s exciting, and this is just the beginning :-)

Notes

  1. Mastodon C is a start-up company currently based at The Open Data Institute. The Open Data Institute’s mission is to catalyse the evolution of an open data culture to create economic, environmental, and social value.

  2. Open Health Care UK is a health technology start-up.

  3. About Ben Goldacre

  4. Full research findings and details on methodology can be found at: http://www.prescribinganalytics.com/

Hurricane Sandy and open data

November 1, 2012 in Exemplars, News, Open Data, Open Government Data

It is not an immediately obvious partnership, and yet open data and crisis response go together incredibly well. As storms have lashed the East coast of the US in recent days, causing tragic loss of life and enormous financial damage, many of the tools which have helped citizens to track its path and stay safe have been built on the back of open government data. Just as with the Open Street Map community’s response in the Haiti disaster, we find that with open data at their fingertips, civic hackers and developers are able to build useful tools in an emergency with a speed that far outstrips what centralised government agencies are able to produce.

Check out the Google Crisis Map of Hurricane Sandy, which predicts the future of the storm in real time, including power outages; or the New York Times’s evacuation map. Or if you’re a coder wanting to work with others in the tech community, check out HurricaneHackers who are working on projects and resources for Sandy. Alex Howard is tracking the datastorm here.

He writes:

When natural disasters loom, public open government data feeds become critical infrastructure … it’s key to understand that it’s government weather data, gathered and shared from satellites high above the Earth, that’s being used by a huge number of infomediaries to forecast, predict and instruct people about what to expect and what to do.

And New York City’s Chief Digital Officer, Rachel Haot, wrote to TechCrunch:

Open data is critical in crisis situations because it allows government to inform and serve more people than it ever could on its own through conventional channels. By making data freely available in a usable format for civic-minded developers and technology platforms, government can exponentially scale its communications and service delivery.

We’ve set up a CKAN group for data related to Sandy here: http://thedatahub.org/group/sandy-response-data

If you’re interested in contributing, there are some useful links to get started with here.

Open Street Map has officially switched to ODbL – and celebrates with a picnic

September 12, 2012 in Exemplars, External, Featured, Open Data, Open Data Commons, WG Open Licensing

Open Street Map is probably the best example of a successful, community driven open data project.

The project was started by Steve Coast in 2004 in response to his frustration with the Ordnance Survey’s restrictive licensing conditions.

Steve presented on some of his early ‘mapping parties’ – where a small handful of friends would walk or cycle around with GPS devices and then rendezvous in the pub for a drink – at some of the Open Knowledge Foundation’s first events in London.

In the past 8 years it has grown from a project run by a handful of hobbyists on a shoestring to one of the world’s biggest open data projects, with hundreds of thousands of registered users and increasingly comprehensive coverage all over the world.

In short, Open Street Map is the Wikipedia of the open data world – and countless projects strive to replicate its success.

Hence we are delighted that – after a lengthy consultation process – today Open Street Map has officially switched to using the OKFN’s Open Data Commons ODbL license.

Michael Collinson, who is on the License Working Group at the OpenStreetMap Foundation, reports:

It is my great pleasure to pass on to you that as of 07:00 UTC this morning, 12th September 2012, OpenStreetMap began publishing its geodata under Open Data Common’s ODbL 1.0. That is several terabytes of data created by a contributor community of over a three-quarters of a million and growing every day.

The Open Street Map blog reports that OSM community members will be celebrating with a picnic:

At long last we are at the end of the license change process. After four years of consultation, debate, revision, improvement, revision, debate, improvement, implementation, coding and mapping, mapping, mapping, it comes down to this final step. And this final step is an easy one, because we have all pitched in to do the hard work in advance. The last step is so easy, it will be a picnic.

If you use data from Open Street Map, you can read about how the switch will affect you here.

A big well done to all involved for coming to the end of such a lengthy process – and we hope you enjoyed the sandwiches!

Montevideo: proud of our data

August 9, 2011 in Exemplars

The following post is by Guillermo Moncecchi of Intendencia de Montevideo in Uruguay.

Here, in Montevideo, we are proud of our data. The Intendencia de Montevideo drives the economic, social and cultural life of the city, producing data. Lots of data. The government has spent years developing its information services, and almost all government processes produce digital data. High quality data: we need it to accomplish our government tasks. As we said, we are proud of them: we have high precision cartography, including every street and every address; we have birth, death and marriage data in the city; we have digitized the placement of libraries, polyclinics, city landmarks, light points… we need them for our work. And, as we do good work, all these data are accurate and are continuously updated.

As we are proud of our data, a day came when we asked ourselves: why not let others use them? We discussed the idea and decided to embrace the open data principles, removing barriers to information access: we decided that our data should be in the public domain. The city Mayor approved the idea and wrote a resolution stating the open data approach: if it is public, it is open. We then started an open data portal and published the first data sets. Since then, we have been continuously working on updating the portal. We listen to people asking for data. We try to satisfy them. Moreover, we are trying to include an open data version of our information as a mandatory product of every piece of software we develop, including the open data idea in the software development cycle.

Yes, we lost some money: before open data, we charged individuals and institutions for the access to our cartography base. Today, an application using OpenStreetMap uses the same cartography we use for our daily work. That is: the best cartography available for Montevideo. For free. That means better services for people in Montevideo. We have eased data exchange with other public institutions: want some data? Just go to the site and get it. Not available? Ok, wait a couple of days and look again… you’ll get the data, and everybody will. It’s public, it’s open. We are about to publish our accounting data: where does my money come from, where does my money go. Digitally, in open formats. For everybody. That is how we think about transparency.

We want to build community. We want our data to be used, because we are responsible for them. People have started using our data: in our portal, we have linked applications built using our data. People have found mistakes within our data: we corrected them. We are not afraid of errors: we want to solve them.

Looking towards http://datos.gub.uy, we are working with Agesic (the Electronic Government and Information Society Agency of the Uruguayan government) to aid in the development of the Uruguayan open data portal. The Uruguayan state has information access laws, but wants more: if it is public, it is open. We want to help with our data and our experience.

OpenCorporates: the Open Database of the Corporate World

December 20, 2010 in Exemplars, External, Open Data, Open Government Data

This is a guest post by Chris Taggart, a member of OKFN’s open government working group and creator of OpenlyLocal, who today launched a new website OpenCorporates in collaboration with Rob McKinnon (a project they first demoed at the Open Government Data Camp in November).

Why OpenCorporates? Like most open data/open source projects, it was started (just a couple of months ago), because the founders, Chris Taggart & Rob McKinnon, needed such a resource to exist. Specifically we needed:

  1. an open database of companies not just in the UK, or in another individual country, but in any country
  2. a way of matching lists of company names to real-world companies (with their company numbers)
  3. a place where the increasingly large amount of open government data relating to companies could be brought together, with all the power that would bring to the community

So, OpenCorporates was created, and while it’s very, very early days, we think we’ve got something that is massively more usable than anything else out there (and did we mention it’s open data too?).

So, without any more delay, let’s take a quick run through the main features. The first place to start is, reasonably enough, the home page, where you can search for a company name from the over 3,800,000 companies in the OpenCorporates database.

You can also start browsing the database by filtering by jurisdiction (this is similar to, but not the same as, country – more on this in a later post), and from there go on to filter by company type or status.

The next bit is where it starts to get really interesting, and that’s where we can start to filter based on public data we’ve imported. Let’s say we want to see all the companies with Financial Transactions – there’s possibly a better way of expressing this, but these are all the UK central government spending items recently released as part of its drive to open up government. Click on the Financial Transactions filter and you get:

There are 4,955 companies that received a payment from central government. Let’s now see those that received notices from the UK Health & Safety Executive by clicking on the filter to the right:

Then let’s choose an industry classification, say, Fishing, Fish Farming etc. OK, that’s just one company, DUCHY OF CORNWALL OYSTER FARM LIMITED, and clicking on that gives us the following screen:

OK. Interesting, but click through onto the transaction, and you get this:

I’ll leave it to the reader to dig out more about that transaction (clue: http://www.google.co.uk/search?q=NOMS), but I think you’ll agree it’s a pretty useful starting point.

The second core feature is the ability to match company names to real-world companies, complete with company numbers. To do this, we’ve implemented the back end stuff that the awesome Google Refine needs, and here a short screencast will do the job of a thousand words: screencast on vimeo.
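To give a flavour of the matching problem, here is a minimal Python sketch that normalises company names and picks the closest entry from a small register using the standard library’s `difflib`. This is only an illustration of the general idea; it is not how OpenCorporates or Google Refine perform reconciliation, and the register entries, company numbers and suffix list below are all hypothetical.

```python
import difflib
import re

# Hypothetical register: company number -> registered name.
REGISTER = {
    "00000001": "EXAMPLE OYSTER FARM LIMITED",
    "00000002": "EXAMPLE FISHERIES PLC",
    "00000003": "SAMPLE DATA SERVICES LTD",
}

# Legal suffix variants folded to one spelling (an assumed, incomplete list).
SUFFIXES = {"LTD": "LIMITED"}


def normalise(name):
    """Upper-case, replace punctuation with spaces, fold suffix variants."""
    words = re.sub(r"[^\w\s]", " ", name.upper()).split()
    return " ".join(SUFFIXES.get(w, w) for w in words)


def match(name, cutoff=0.8):
    """Return (company_number, registered_name) for the closest register
    entry, or None if no entry is similar enough."""
    by_norm = {normalise(n): (num, n) for num, n in REGISTER.items()}
    hits = difflib.get_close_matches(
        normalise(name), list(by_norm), n=1, cutoff=cutoff
    )
    return by_norm[hits[0]] if hits else None


print(match("Example Oyster Farm Ltd."))
print(match("Unrelated Name"))
```

The `cutoff` value is a judgement call: set it too low and unrelated companies are matched, too high and minor spelling variants are missed, which is why an interactive tool that lets a person review the candidates is so useful here.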

It’s worth mentioning one last feature, which in some ways is the most powerful but not at all sexy, and that’s the ability to have a URL for every company in the world (we’ll be adding the ability for the community to add companies soon). Why is this important? Because when we’re talking about companies, it’s difficult to be sure which company we’re talking about. We need universal identifiers for them, and the best are URLs. This means that different people can refer to the same OpenCorporates URL (here’s the one for Google Bermuda Limited) and be sure that they’re talking about the same company.

Finally, we’ve got lots of features we’re working on, including a full-blown API, so it’s easy to get the data out and reuse it elsewhere. Watch this space, follow @OpenCorporates on twitter and start exploring.

New mapping tool from European Fish Subsidy project

November 8, 2010 in Exemplars, External, Open Data, Open Government Data, Open/Closed, Policy, WG Open Government Data, Working Groups

The folks over at Fish Subsidy (who are also behind the amazing Farm Subsidy project) have just released a new mapping tool to help people find out how €3.4 billion of European fisheries subsidies is spent.

This is a great example of reusing European public data to make it easier to understand for citizens, journalists and others. The project highlights how important it is for public bodies to make data legally and technically reusable, not just available online via an official ‘shiny front end’, so that others can do interesting things with it.

The project also releases all of their cleaned up data under a fully open license, which means that others can continue to do interesting things with it (like connect it with other data sources, produce new visualisations, and so on).

From the press release:

> LONDON, Monday, 8 November 2010—fishsubsidy.org today launched an interactive map that allows European citizens to track €3.4 billion in EU fisheries subsidies. The map shows 39,174 payments to vessels from 1994 to 2006 under the Financial Instrument for Fisheries Guidance (FIFG).

> Users can select categories of payment (vessel construction, modernisation, scrapping, etc.) and see clearly the geographical distribution of funds, both across the continent and in member states, including the outermost regions. The map also provides summaries of all payments to individual ports with links to the vessel pages at fishsubsidy.org.

> “This new resource reveals where the money went, identifying which regions and ports benefited most from fisheries subsidies. The largest subsidy recipients were in Spain, where public money has fuelled greater and greater fishing capacity, exerting ever more pressure on already depleted fish stocks,” said Markus Knigge of the Pew Environment Group. “Rather than propping up a subsidies-addicted industry, the EU should invest in conserving valuable fish stocks and securing the future viability of vulnerable fisheries-dependent communities.”

> The maps cover payments only up to 2006. “Unfortunately, the new system of transparency that applies to the European Fisheries Fund (EFF) is deficient in a number of respects, the most important of which being that data disclosed no longer identifies the vessels for which subsidies were paid,” said Jack Thurston of EU Transparency.

If you’re interested in talking to the project about the release, you can call or email Jack!

Interview with Hugh McGuire, Founder of Librivox.org

October 7, 2010 in Exemplars, External, Featured Project, Interviews, Public Domain, WG Public Domain, Working Groups

Following is an interview with Hugh McGuire, Founder of the Librivox project and member of the Open Knowledge Foundation’s Working Group on the Public Domain.



Could you tell us a bit about the project and its background? Why did you start it? When? What was the need at the time?

There were some philosophical reasons, and some practical reasons for the creation of LibriVox, which “launched” in August 2005. On the philosophical side, I was fascinated by Richard Stallman and the free software movement, both in methodology and in ethic. I was equally excited by Lessig’s work with the Creative Commons movement and the idea of protecting public domain, including projects such as Michael Hart’s Project Gutenberg. Brewster Kahle’s vision at the Internet Archive of Universal Access to All Human Knowledge was another piece of the puzzle, as was Wikipedia, the most visible non-software open source project around at the time. Finally blogging and podcasting revealed the possibility that anyone could make media and deliver it to the world. It was a potent cocktail.

On the practical side, I was going on a long drive, and wanted to download some free audiobooks – there weren’t very many to be found – and it seemed to me an open source project to make some would be an exciting application of all that stuff I’d been thinking of above.

How is the project doing now? Any numbers on contributors, files, etc? Wider coverage and exposure?

It’s clicking along. We put out about 100 books a month now. Here are our latest stats:

  • Total number of projects 4342
  • Number of completed projects 3768
  • Number of completed non-English projects 551
  • Total number of languages 32
  • Number of languages with a completed work 29
  • Number of completed solo projects 1716
  • Number of readers 3975…who have completed something 3772
  • Total recorded time: 78850563 seconds, or 2 years, 182 days, 3 hours, 18 minutes, and 31 seconds. Total of 78438 sections.

What are the synergies with other projects/initiatives like Project Gutenberg, Wikimedia Foundation projects, Internet Archive and suchlike?

Project Gutenberg provides the bulk of the texts we work from, and they do all the legal work to make sure the texts are in the public domain. They’ve given us some financial support over the years to pay some server costs. And they also have started hosting some of our audiobooks.

Internet Archive hosts all our audio, and when we need a legal entity to represent us – for instance when we launched our first, brief funding drive this spring – IA helps out.

We’ve never had much connection with the Wikimedia Foundation, though we’ve talked with them over the years of course.

Can users request audio versions of particular texts?

Yes, but that doesn’t guarantee that anyone will want to record them.

What are your current plans for languages other than English?

To record all public domain books in all languages in the universe.

Any interesting stories about Librivox content? Coincidences, anecdotes or interesting reuses of the material?

Eegs. Well, some LibriVox cover art was used in a Blackberry commercial. The explosion & popularity of mobile apps – iPhone/Android – built on the LibriVox catalog has been the most gratifying. And we’re starting to see new websites built on our catalog too … it’s exciting, and demonstrates the value of open APIs.

How can people help out? Are there any particular types of assistance or expertise you are currently seeking?

Mostly: reading and prooflistening.

I understand you are personally interested in open content, open data and the public domain. Do you currently have any plans for other projects in this area?

Hrm. I’m mostly focused on book publishing these days, and I’m trying to do things in the publishing industry that push towards a more open approach to content.

Can you give a sense of what you hope this area will look like in the future? E.g. in ten or twenty years time? Any thoughts about the future of delivering and reusing public domain content? New opportunities?

Well, one thing I would like to see is the public domain expanding again in the USA. The current approach to copyright – essentially extension after extension so that nothing new ever goes into the public domain – is very depressing. But I think the tension between this desire to keep things locked up, and the unprecedented ability to do things with books, media, data is a great debate. I have to think that in the end the value of using data & media in new ways will outweigh the desire to create false scarcity, but there’s lots of struggle yet to make this happen, and to figure out what businesses look like in such an environment.

In short – we live in interesting times.

by lisa

How to open up local data: notes from Warwickshire council

May 25, 2010 in Exemplars, External, OKF, Open Data, Open Government Data, WG Open Government Data

The following guest post is from Kate Sahota, one of the people involved in Warwickshire County Council’s Open Data project (which we blogged about last month).

How it all began

It seems the key to triggering a successful open data project is to show the people that matter something shiny, like an iPhone, with a real example of what open data can achieve.

Jim Morton began a small covert operation to start opening up Warwickshire’s data under the guise of developing an iPhone application with news, events, jobs and location information from Warwickshire County Council (WCC). This was launched in January 2010 and by the middle of May 2010 had already been downloaded nearly 2,000 times.

Using the success of the iPhone project and the increasing number of good open data examples, we were able to kick off our own open data project to create opendata.warwickshire.gov.uk. The business case and main benefits driving the project are:

  • Transparency for the public
  • Enhancing public communications
  • Improving service delivery and enabling citizens to self-serve
  • Contributing towards new ways of running public services
  • Improving external contribution to WCC
  • Enabling mash-ups of disparate sources of information to create new ways of looking at information
  • Enabling 3rd sector organisations or individuals to develop applications aggregating data across organisational boundaries
  • Reducing workload in areas like Freedom of Information (FOI), the Observatory and Public Relations
  • Reinforcing our efforts to resolve data and information issues

The Project

Using the Identify, Represent and Expose principles outlined in Jeni Tennison’s blog, a small team of 4 (Jim Morton, Steve Woodward, Terry Rich-Whitehead and myself) began working on getting information out of the organisation and building a technical solution and set of standards that fitted with our ongoing work to introduce open, non-proprietary standards across our ICT architecture. We were keen to ensure the project made use of cloud technologies to deal with any scaling/demand issues.

The open data site has been written using Ruby on Rails and is hosted on external platform-as-a-service provider Heroku. The database managing the sets of data includes a standard XML schema for the metadata associated with each dataset. It was very important to ensure this schema was aligned to that used by data.gov.uk to enable us to easily extract our metadata for inclusion in their data catalogue. The application will soon be open-sourced to enable other authorities to easily build their own open data site.
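As an illustration of the kind of per-dataset metadata record described above, here is a minimal Python sketch that serialises one record to XML. The element names and values are hypothetical placeholders; they are not the actual Warwickshire or data.gov.uk schema, and the site itself is written in Ruby on Rails rather than Python.

```python
import xml.etree.ElementTree as ET


def dataset_metadata_xml(title, description, licence, formats):
    """Build a small XML metadata record for one dataset.

    The element names used here are assumptions made for this sketch.
    """
    root = ET.Element("dataset")
    ET.SubElement(root, "title").text = title
    ET.SubElement(root, "description").text = description
    ET.SubElement(root, "licence").text = licence
    formats_el = ET.SubElement(root, "formats")
    for fmt in formats:
        ET.SubElement(formats_el, "format").text = fmt
    return ET.tostring(root, encoding="unicode")


# Placeholder values, for illustration only.
xml_text = dataset_metadata_xml(
    title="Example dataset",
    description="Placeholder description for illustration.",
    licence="Example open licence",
    formats=["CSV", "XML"],
)
print(xml_text)
```

Keeping the metadata in one agreed structure is what makes it straightforward to export records for another catalogue: the receiving side only needs to know the field names once.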

We began at the end of January 2010 working 2 days a week, and by mid-April we had unofficially launched the site with over a dozen datasets. By the time we officially launched a couple of weeks later there were nearly 30, and this number is growing by the week.

Tips and Tricks

  • Start with quick and easy datasets – the sooner you can get datasets up and build sample applications to demonstrate the purpose and benefits of open data, the more likely you are to encourage other people to give you their data
  • For those reluctant to open up their data, there is one key question you need to ask: “If someone requested this information under FOI, would we have to give it to them?” If their answer is “Yes”, then you have a very strong starting point for persuading them to give you their data
  • Ensure you have a good process for feeding back any issues with the data
  • Use a standard [preferably open! -- ed.] Creative Commons license to cover usage of the data rather than trying to write your own

The Competition

As part of our work to publicise the initiative, we are running a “Hack Warwickshire” competition between Monday 17th May 2010 and Friday 25th June 2010. This competition is challenging everyone to come up with new and innovative uses for our data and web services. The winner of the competition will be the proud owner of a brand new Apple iPad.

by lisa

Warwickshire County Council launch new open data site!

April 30, 2010 in Exemplars, External, News, OKF, OKF Projects, Open Data, Open Government Data

Warwickshire County Council pinged us earlier this week to let us know about the launch of their new open data site!

The site hosts a range of data sets – available in CSV or XML. For example, there are details about education in the region.

There is also a selection of data on Warwickshire councillors such as the council election results for 4th June 2009, 5th May 2005 and 7th June 2001. There is a blog and a strategy blog associated with the main website giving the latest news on the latest datasets as they are added.

The most recent blog post explains that data will soon be available on areas such as school exclusions, traffic, car parking, council buildings and Warwickshire County Council finance. There are also plans to allow site visitors to post notes about the data and make requests for new data or changes, plus a showcase for web sites and applications that make use of the data.

So congratulations to Warwickshire County Council for the new release! We hope other local authorities are encouraged to follow suit.

Libraries in Cologne open up bibliographic data!

March 15, 2010 in CKAN, Exemplars, External, OKF, OKF Projects, Open Data, WG Open Bibliographic Data

The following press release is reproduced with permission from Adrian Pohl and Felix Ostrowski, who are both at the North Rhine-Westphalian Library Service Center and who are both members of the Open Knowledge Foundation’s Working Group on Open Bibliographic Data – launched earlier this month. We’ve added a koeln-library-data package to the bibliographic data group on CKAN.

Cologne-based libraries and the Library Centre of Rhineland-Palatinate (LBZ) in cooperation with the North Rhine-Westphalian Library Service Center (hbz) are the first German libraries to adopt the idea of Open Access for bibliographic data by publishing their catalog data for free public use. The University and Public Library of Cologne (USB), the Library of the Academy of Media Arts Cologne, the University Library of the University of Applied Science of Cologne and the LBZ are taking the lead by releasing their data. The Public Library of Cologne has announced that it will follow shortly. The release of bibliographic data forms a basis for linking that data with data from other domains in the Semantic Web.

Libraries have been involved with the Open Access movement for a long time. The objective of this movement is to provide free access to knowledge to everybody via the internet. Until now, only a few libraries have done so with their own data. Rolf Thiele, deputy director of the USB Cologne, states:

> Libraries appreciate the Open Access movement because they themselves feel obliged to provide access to knowledge without barriers. Providing this kind of access for bibliographic data, thus applying the idea of Open Access to their own products, has been disregarded until now. Up to this point, it was not possible to download library catalogues as a whole. This will now be possible. We are taking a first step towards a worldwide visibility of library holdings on the internet.

The library of the European Organization for Nuclear Research (CERN) has already published its data under a public domain license in January.

Public data is placed in the public domain

The publication of the data enables anybody to download, modify and use it for any purpose. “In times in which publishers and some library organisations see data primarily as a source of capital, it is important to stick up for the traditional duty of libraries and librarians. Libraries have always strived to make large amounts of knowledge accessible to as many people as possible, with the lowest restrictions possible,” said Silke Schomburg, deputy director of the hbz. “Furthermore libraries are funded by the public. And what is publicly financed should be made available to the public without restrictions,” she continued.

Cooperation and data exchange between libraries have been firmly established in the library world for more than 100 years. Freely supplying bibliographic data should not only further enhance cooperation among libraries but enable subsequent use by non-library institutions. “In the course of the internet’s development it became clear that many services can be greatly enhanced by catalog data. The German Wikipedia for example has been enriched with German National Library data for a long time. Such enrichment is often hindered and constricted by the data’s half open character,” Schomburg notes.

Data for the Semantic Web

The North Rhine-Westphalian Library Service Center has recently begun evaluating how data from library catalogs can be transformed so that it can become a part of the emerging Semantic Web. The liberalization of bibliographic data provides the legal basis for performing this transformation in a cooperative, open, and transparent way. Currently there are discussions with other member libraries of the hbz library network about publishing their data. Moreover, “Open Data” and “Semantic Web” are topics that are gaining attention in the international library world.

Further information and links to the published datasets are available at:

