Freeing Train Data
The following guest post is by Peter Hicks, IP Network Engineer and Open Transport Data advocate.
In the late 1990s, I decided to learn more about why my commute to and from London wasn’t always a smooth process. Having an inquisitive nature, I set about casually talking to people ‘in the know’ – friends inside and outside the industry – as well as reading Railtrack’s and ATOC’s websites and asking questions on mailing lists. I learned a huge amount about how everything worked – from signalling to timetabling, from fares setting to seat reservations. It helped me avoid becoming one of those angry commuters that station staff have to deal with when things go wrong, largely because I could understand and sympathise.
I fondly remember speaking to the Station Supervisor at one station after a late-running train caused me to miss my connection on to a branch line. Whilst he was booking me a taxi home (which was standard practice when this sort of thing happened to the last trains of the day), I asked what had happened. He told me I’d been delayed by a broken-down train, and without thinking, I said, “The up slow line’s reversibly signalled; why couldn’t the signaller have run my train in the wrong direction through there and around the failed train on the down slow?”
His eyes lit up, and he asked me which company I worked for. When I told him I didn’t work on the railway, he told me I should apply for a job there, handed me the taxi form, and said “I wish more people were like you.”
Up until a year or two ago, I didn’t have much productive use for my knowledge – it was just a personal quest to understand and exercise my brain whilst commuting. By chance, I heard of a Saturday morning event hosted at City Hall by Emer Coleman, Director of Digital Projects at the GLA, called “Free London’s Data”. There, I met a number of passionate and motivated developers who wanted to work on transport data, and I realised I could put my knowledge to good use.
My current quest is to help open up access to rail data. My manifesto is bold and simple: Everyone should have free and open access to raw fares, timetable and real-time running data.
Release the data, free it up, and let a large number of keen, hungry and motivated developers analyse it for everyone’s good. It’s not a new concept to many – but it’s a culture shift in the rail industry.
The data is already out there and in use, and I don’t think it’s too big a job to make it available to everyone. There are barriers, but I don’t see them as insurmountable. The biggest problem at the moment is dealing with the culture shift.
Network Rail can make the timetable and real-time data available, but they’re not geared up to interface directly with large numbers of developers. They operate the rail infrastructure, and many of their customers are wholly within the rail industry.
I currently take both timetable and real-time data from Network Rail on a non-commercial basis. Getting access to the information hasn’t been that difficult, even though my initial request broke some new ground within the company.
It hasn’t been easy to analyse and work with the data, but it’s not something I’m going to give up on. It’s now too exciting to let rest. Nearly everyone I’ve spoken to has been very enthusiastic about the opportunities, although I don’t think anyone’s more excited than I am.
To showcase what can be done with the raw data I have, I am writing a proof-of-concept website which will make timetable and real-time data visible. I want it to act as a vehicle to inspire other people and help them come up with ideas; to lower the technical barriers and help them get on with the important task of innovating.
It would be unfair to ignore the fact that ATOC have already invested millions into a system called Darwin, which takes the same data from Network Rail, conditions it, takes further information directly from train operators, and presents it in the form of the Live Departure Boards service. This is an excellent system, which I use every day to check whether my train home is delayed – but, unfortunately, it’s not open. ATOC don’t seem to want to make it open in the Open Data sense either. Superficially, their Code of Practice seems to promote innovation, but my interpretation leads me to believe it presents more barriers than opportunities.
Finally, what about fares information? That data set, possibly the most useful to the greatest number of people, is wholly owned by ATOC. It’s available, but at a price in excess of £25,000, which puts it firmly out of the reach of the vast majority of developers and those who want to analyse and work with the data.
ATOC produce the Avantix Traveller Fares Information CD-ROM and sell it for a mere £10.81; this provides a simple Windows application which queries a local off-line fares database. It’s not updated daily, but the low cost suggests there’s an unfair premium being imposed for access to the raw data.
I will continue to work toward getting everyone free and open access to raw fares, timetable and real-time running data. I am up for the challenge. I want to help the industry overcome technical and political hurdles in providing open access to this data, so society can begin to benefit from free access to rail data.
The codebase is on GitHub at http://github.com/poggs/tsdbexplorer for anyone interested enough to take a look. I’m happy to talk further about the work I’m doing – please email me at peter.hicks[at]poggs[dot]co[dot]uk.