Exciting news on open legislative data from the US. Eric Mills (from the Sunlight Foundation), Josh Tauberer (of [GovTrack.us](http://govtrack.us/)) and Derek Willis have been beavering away on a public domain scraper and dataset from [THOMAS.gov](http://thomas.gov/), the official source for legislative information for the US Congress. They’ve just hit a key milestone – the incorporation of everything that THOMAS has on bills going back to 1973, when its records began!
In Eric’s words:
We’ve [published and documented](https://github.com/unitedstates/congress/wiki) all of this data in bulk, and I’ve worked it into Sunlight’s pipeline, so that [searches for bills in Scout](https://scout.sunlightfoundation.com/search/federal_bills/freedom%20of%20information) use data collected directly from this effort.
The data and code are all hosted on GitHub under a “[unitedstates](https://github.com/unitedstates/)” organization, which is currently co-owned by me, Josh, and Derek – the intent is for all of this to live in a common space. To the extent that the code needs a license at all, I’m using a public domain “[unlicense](https://github.com/unitedstates/congress/blob/master/LICENSE)” that should at least be sufficient for the US (other suggestions welcome).
There’s other great stuff in this organization, too – Josh made an amazing donation of his [legislator dataset](https://github.com/unitedstates/congress-legislators), and converted it to YAML for easy reuse. I’ve already worked that dataset into Sunlight’s products as well. I’ve also moved my [legal citation extractor](https://github.com/unitedstates/citation) into this organization, and my colleague Thom Neale has an in-progress [parser for the US Code](https://github.com/unitedstates/uscode), to convert it from binary typesetting codes into JSON.
GitHub’s organization structure actually makes a very neat commons possible. I’m hoping this model proves useful, both for us and for the public.