The following post is from David Read, a developer working on the Open Knowledge Foundation’s CKAN project. David attended the Greater London Authority’s Possibilities of Real Time Data conference earlier this week.
London’s authorities have opened up lots of their data this year, kicking off in January with the launch of the London Data Store (which we blogged about here). You can download, mash up and analyse stats about crime, housing, education, planning and so on, but the biggie that many developers are desperate to get their hands on is transport information. The GLA and 4ip held a half-day conference on this public data on Monday at City Hall, and transport data got top billing.
Dr David Mountain at Placr described how he’s been screen scraping the London Underground departure boards. Although his results were somewhat predictable – that waiting times increased during peak hours, or when the line had a problem – there was obvious interest from the audience in knowing what to expect on unfamiliar lines, and comparison between particular stations.
It is incredible to think that bus and train companies don’t make their timetables open. Chris Osbourne from ITO World has an interesting way to automatically produce extremely helpful bus maps, yet only two small British areas allow it. He made the point that councils subsidise a massive number of bus services, yet the ‘bus barons’ hug their data and stifle the innovation that might actually increase their custom whilst improving their usefulness to society.
Titled “Possibilities of Real Time Data”, the conference was positioned to highlight the value of open data, but I was surprised that there was no discussion to work through the problems faced. The room seemed to be a healthy mix of technologists, representatives from London boroughs and other public data holders, including Transport for London. Only briefly did we glean an idea of the road blocks that TfL and other providers see preventing them from releasing bus, tube and rail information. Data protection was mentioned and clearly there are many more areas to be covered, but surely this needs to be discussed in public to move it forward?
The success of using more general data to closely monitor and improve performance was the theme of a talk by Bryan Sivak, a CTO from Washington DC. He described how they have been publishing lots of public data on their Data Catalog for about 5 years now.
He had stories about improved snow clearing, with residents reacting to the map of snow-ploughed areas. They’ve had a couple of years’ worth of application competitions, the widely copied ‘Apps For Democracy’, but they also felt a clear need to develop some applications themselves. He mentioned a couple of measurable successes from using data: increased rehousing of homeless people and spotting trends in overtime use.
What was most interesting, given my involvement as a CKAN developer, were Sivak’s future plans. Firstly he mooted “Data Catalog in a Box”, where they open-source all their code so other cities and countries can reap similar benefits. It is great to see another example of publicly funded software being given back to the public, rather than locked away for private use. There is sure to be plenty of overlap with CKAN, but competition is healthy: no doubt we can both learn from each other, and it creates more options for organisations wanting to catalog their open data.
The other future plan for Washington DC that piqued my interest was their “OpenCity API”. They have linked up with 7 of the top cities in the US to agree on the same web API for ‘government service requests’, i.e. for filling potholes, requesting snow clearing etc. All this openness of data, source and sharing of APIs is very promising – other governments take note!
Great review Sara
A quick clarifier about the schedule data: at the moment it isn’t confirmed who holds the IP over schedules. I think most bus companies would actually be willing to see their timetable information in many more places.
Bus operators send timetables to Local Authorities, often on paper.
Local Authorities enter this information into a variety of IT systems.
Schedules are collated at a regional level by Traveline, who operate regional call centres and online journey planners.
Regional data is sent on to the DfT, where it is collated and used by Transport Direct, the national journey planner.
We need the issue of who has IP ownership clarified, and then a diktat that schedule data must be published.