US Congress data opened

Exciting news on open legislative data from the US. Eric Mills (from the [Sunlight Foundation]()), Josh Tauberer (of []()) and Derek Willis have been beavering away on a public domain scraper and dataset from [THOMAS](), the official source for legislative information for the US Congress. They’ve just hit a key milestone – the incorporation of everything that THOMAS has on Bills going back to 1973 when its records began!

Eric says:

We’ve [published and documented]() all of this data in bulk, and I’ve worked it into Sunlight’s pipeline, so that [searches for bills in Scout]() use data collected directly from this effort.

The data and code are all hosted on Github on a “[unitedstates]()” organization, which is right now co-owned by me, Josh, and Derek – the intent is to have this all exist in a common space. To the extent that the code needs a license at all, I’m using a public domain “[unlicense]()” that should at least be sufficient for the US (other suggestions welcome).

There’s other great stuff in this organization, too – Josh made an amazing donation of his [legislator dataset](), and converted it to YAML for easy reuse. I’ve worked that dataset into Sunlight’s products already as well. I’ve also moved my [legal citation extractor]() into this organization – and my colleague Thom Neale has an in-progress [parser for the US Code](), to convert it from binary typesetting codes into JSON.

Github’s organization structure actually makes possible a very neat commons. I’m hoping this model proves useful, both for us and for the public.