Introducing ContentMine

Marieke Guy - July 21, 2015 in Featured Project, Open Access, Open Data

If you are interested in Open Access and Open Data and haven’t heard about ContentMine yet, you are missing out! Graham Steel, ContentMine Community Manager, has written a post for us introducing this exciting new tool.

ContentMine aims to liberate 100,000,000 facts from the scientific literature.

We believe that “The Right to Read is the Right to Mine”: anyone who has lawful access to read the literature with their eyes should be able to do so with a machine.

We want to make this right a reality and enable everyone to perform research using humanity’s accumulated scientific knowledge. The extracted facts are CC0.

The ContentMine Team & Helen Turvey, Executive Director, Shuttleworth Foundation at the Panton Arms in Cambridge

Research which relies on aggregating large amounts of dynamic information to benefit society is particularly key to our work – we want to see the right information getting to the right people at the right time and work with professionals such as clinical trials specialists and conservationists. ContentMine tools, resources, services and content are fully Open and can be re-used by anybody for any legal purpose.

ContentMine is inspired by the community successes of Wikimedia, Open StreetMap, Open Knowledge, and others and encourages the growth of subcommunities which design, implement and pursue their particular aims. We are funded by the Shuttleworth Foundation, a philanthropic organisation who are unafraid to re-imagine the world and fund people who’ll change it.

ContentMine Wellcome Trust Workshop

There are several ways to get involved with ContentMine. You can find us on GitHub, Google Groups, email, Twitter and most recently, we have a variety of open communities set up here on Discourse.

This post has been reposted from the Open Access Working Group blog.

Become a Friend of The Public Domain Review

Adam Green - June 25, 2015 in Featured, Featured Project, Free Culture, Open GLAM, open knowledge, Public Domain, Public Domain Review

Open Knowledge project The Public Domain Review launches a major new fundraising drive, encouraging people to become Friends of the site by giving an annual donation.

For those not yet in the know, The Public Domain Review is a project dedicated to protecting and celebrating, in all its richness and variety, the cultural public domain. In particular, our focus is on the digital copies of public domain works, the mission being to facilitate the appreciation, use and growth of a digital cultural commons which is open for everyone.

We create collections of openly licensed works comprised of highlights from a variety of galleries, libraries, archives, and museums, many of whom also contribute to our popular Curator’s Choice series (including The British Library, Rijksmuseum, and The Getty). We also host a fortnightly essay series in which top academics and authors write about interesting and unusual public domain works which are available online.

Founded in 2011, the site has gone from strength to strength. In its four-plus years it has seen contributions from the likes of Jack Zipes, Frank Delaney, and Julian Barnes – and garnered praise from such media luminaries as The Paris Review, which called us “one of their favourite journals”, and The Guardian, which hailed us as a “model of digital curation”.

This is all very exciting but we need your help to continue the project into the future.

We are currently only bringing in around half of the base minimum required – the amount we need in order to tick along in a healthy manner. (And around a third of our ideal goal, which would allow us to pay contributors). So it is of urgent importance that we increase our donations if we want the project to continue.

Hence the launch of a brand new fundraising model through which we hope to make The Public Domain Review sustainable and able to continue into the future. Introducing “Friends of The Public Domain Review”: https://publicdomainreview.org/support/

Image 1: one of the eight postcards included in the inaugural postcard set. The theme is “Flight” and the set will be sent out to all Friends donating $30/£20/€27.50 or more before 8th July. Source: http://www.loc.gov/pictures/item/00650258.

What is it?

This new model revolves around building a group of loyal PDR (Public Domain Review) supporters – the “Friends” – each of whom makes an annual donation to the project. This club of patrons will form the beating heart of the site, creating a bedrock of support vital to the project’s survival.

How can one become a Friend?

There is no fixed yearly cost to become a Friend – any annual donation will qualify you – but there is a guide price of $60 a year (£40/€55).

Are there any perks of being a Friend?

Yes! Any donation above $30 will make you eligible to receive our exclusive twice-a-year “postcard set” – 8 beautiful postcards curated around a theme, with a textual insert. Friends will also be honoured in a special section of the site and on a dedicated page in all PDR Press publications. They will also get first refusal on all future limited edition PDR Press creations, and receive a special end-of-year letter from the Editor.

How do I make my donation?

We’ve worked hard to make it as easy as possible to donate. You no longer have to use PayPal; you can now donate with your credit or debit card directly on the PDR site.

For more info, and to make your donation, visit: https://publicdomainreview.org/support/

Become a Friend before 8th July to receive the inaugural postcard set on the theme of “Flight”.

Image 2: one of the eight postcards included in the inaugural postcard set. The theme is “Flight” and the set will be sent out to all Friends donating $30/£20/€27.50 or more before 8th July. Source: http://www.loc.gov/pictures/item/2002722387/.

Walkthrough: My experience building Australia’s Regional Open Data Census

Stephen Gates - March 6, 2015 in Featured Project, OKF Australia, Open Data Census

On International Open Data Day (21 Feb 2015) Australia’s Regional Open Data Census launched. This is the story of the trials and tribulations in launching the census.

Getting Started

As many open data initiatives come to realise after filling up a portal with lots of open data, there is a need for quality as well as quantity. I decided to tackle improving the quality of Australia’s open data as part of my Christmas holiday project.

I decided to request a local open data census on 23 Dec (I’d finished my Christmas shopping a day early). While I was waiting for a reply, I read the documentation – it was well written and configuring a web site using Google Sheets seemed easy enough.

The Open Knowledge Local Groups team contacted me early in the new year and introduced me to Pia Waugh and the team at Open Knowledge Australia. Pia helped propose the idea of the census to the leaders of Australia’s state and territory government open data initiatives. I was invited to pitch the census to them at a meeting on 19 Feb – two days before International Open Data Day.

A plan was hatched

On 29 Jan I was informed by Open Knowledge that the census was ready to be configured. Could I be ready to launch in 25 days’ time?

Configuring the census was easy: fill in the blanks, list the places, write some words for the homepage, look at other censuses and re-use some FAQs, add a logo and some custom CSS. However, deciding on what data to assess brought me to a screeching halt.

Deciding on data

The Global census uses data based on the G8 key datasets definition. The Local census template datasets are focused on local government responsibilities. There was no guidance for countries with three levels of government. How could I get agreement on the datasets and launch in time for Open Data Day?

I decided to make a Google Sheet with tabs for datasets required by the G8, Global Census, Local Census, Open Data Barometer, and Australia’s Foundation Spatial Data Framework. Based on these references I proposed 10 datasets to assess. An email was sent to the open data leaders asking them to collaborate on selecting the datasets.

GitHub is full of friends

When I encountered issues configuring the census, I turned to GitHub. Paul Walsh, one of the team on the OpenDataCensus repository on GitHub, was my guardian there – steering my issues to the right place, fixing Google Sheet security bugs, deleting a place called “Try it out” that I had created for testing, and encouraging me to post user stories for new features. If you’re thinking about building your own census, get on GitHub and read what the team has planned and is busy fixing.

The meeting

I presented to Australia’s state and territory open data leaders on 19 Feb and they requested more time to add extra datasets to the census. We agreed to put a Beta label on the census and launch on Open Data Day.

Ready for lift off

The following day CIO Magazine emailed asking for “a quick comment on International Open Data Day, how you see the open data movement in Australia, and the importance of open data in helping the community”. I told them, and they wrote about it.

The Open Data Institute Queensland and Open Knowledge blogged and tweeted encouraging volunteers to add to the census on Open Data Day.

I set up Gmail and Twitter accounts for the census and requested that the census be added to the big list of censuses.

Open Data Day

No support requests were received from volunteers submitting entries to the census (it is pretty easy). The Open Data Day projects included:

  • drafting a Contributor Guide.
  • creating a Google Sheet to allow people to collect census entries prior to entering them online.
  • adding Google Analytics to the site.

What next?

We are looking forward to a few improvements including adding the map visualisation from the Global Open Data Index to our regional census. That’s why our Twitter account is @AuOpenDataIndex.

If you’re thinking about creating your own Open Data Census then I can highly recommend the experience, and there is a great team ready to support you.

Get in touch if you’d like to help with Australia’s Open Data Census.

Stephen Gates lives in Brisbane, Queensland, Australia. He has written Open Data strategies and driven their implementation. He is actively involved with the Open Data Institute Queensland contributing to their response to Queensland’s proposed open data law and helping coordinate the localisation of ODI Open Data Certificates. Stephen is also helping organise GovHack 2015 in Brisbane. Australia’s Regional Open Data Census is his first project working with Open Knowledge.

Building a Free & Open World-wide Address Dataset

tomlee - February 23, 2015 in Featured Project, Open Data

Finding your way through the world is a basic need, so it makes sense that satellite navigation systems like GPS and Galileo are among open data’s most-cited success stories. But as wonderful as those systems are, they’re often more useful to robots than people. Humans usually navigate by addresses, not coordinates. That means that address data is an essential part of any complete mapping system.

Unfortunately, address data has historically been difficult to obtain. At best, it was sold for large amounts of money by a small set of ever-more consolidated vendors. These were often the product of public-private partnerships set up decades ago, under which governments granted exclusive franchises before the digital era unveiled the data’s full importance. In some cases, data exclusivity means that the data simply isn’t available at any price.

Fortunately, the situation is improving. Scores of governments are beginning to recognize that address data is an important part of their open data policy. This is thanks in no small part to the community of advocates working on the issue. Open Knowledge has done important work surveying the availability of parcel and postcode data, both of which are essential parts of address data. OpenAddresses UK has recently launched an ambitious plan to collect and release the country’s address data. And in France, the national OpenStreetMap community’s BANO project has been embraced by the government’s own open data portal.

This is why we’re building OpenAddresses.io, a global community collecting openly available address data. My fellow OpenAddresses.io contributors and I were recently pleased to celebrate our 100 millionth address point.

Getting involved in OpenAddresses is easy and can quickly pay dividends. Adding a new dataset is as easy as submitting a form, and you’ll benefit by improving a global open address dataset in one consistent format that anyone can use. Naturally, we also welcome developers: there are interesting puzzles and mountains of data that still need work.

Our most important tools to gather more data are email and search engines. Addresses are frequently buried in aging cadastral databases and GIS portals. Time spent hunting for them often reveals undiscovered resources. A friendly note to a person in government can unlock new data with surprising success. Many governments simply don’t know that citizens need this data or how to release it as an open resource.

If you work in government and care about open data, we’d like to hear from you. Around the world, countries are acknowledging that basic geographic data belongs in the commons. We need your help to get it there.

The Role of Open Data in Choosing a Neighborhood

Lorenzo Leva - November 14, 2014 in Featured Project

How important is it to be familiar with our environment?

If we think about how the world around us has changed over the years, it is not unreasonable that, while walking to work, we might come across new little shops, restaurants, or gas stations we had never noticed before. Likewise, how many times have we wandered about for hours just looking for a green space for a run, only to find that the one we spotted was even more polluted than other parts of the city?

Citizens are not always properly informed about the evolution of the places they live in. That is why it is crucial for people to have constantly up-to-date, accurate information about the neighborhood they have chosen or are going to choose.

(Image source: London Evening Standard)

London is clear evidence of how fundamental data transparency is to succeeding as a Smart City. The GLA’s London Datastore, for instance, is a public platform of datasets with up-to-date figures on the city’s main services, as well as on residents’ lifestyles and environmental risks. These data are then made more easily accessible to the community through the London Dashboard.

The value of freely available information is also demonstrated by the integration of maps, which are an efficient means of geolocation. A map on which it is easy to find all the services you need close by can make a real difference when searching for a location.

(Image source: Smart London Plan)

The Global Open Data Index, published by Open Knowledge in 2013, is another useful tool for finding data: it ranks countries by the openness and availability of datasets such as transport timetables and national statistics.

It is also possible to check the UK Open Data Census and the US City Open Data Census.

As has been noted, making open data available and easily findable online has not only been a success for US cities but has also benefited app makers and civic hackers. As Government Technology reports, Lauren Reid, a spokesperson at Code for America, put it: “The more data we have, the better picture we have of the open data landscape.”

That, on the whole, is what Place I Live puts the greatest effort into: fostering a new awareness of the environment by providing free information, in order to support citizens in choosing the best place to live.

The result is easily explained: the website’s homepage lets visitors type in an address of interest and displays an overview of neighborhood indicators and a Life Quality Index calculated for every point on the map.

Searching for the nearest medical institutions, schools or ATMs thus becomes quick and clear, as does looking up general information about the community. Moreover, the data’s reliability and accessibility are constantly reviewed by a strong team of professionals with expertise in data analysis, mapping, IT architecture and global markets.

For the moment the company’s work is focused on London, Berlin, Chicago, San Francisco and New York, with the longer-term goal of covering more than 200 cities.

The US City Open Data Census saw San Francisco achieve the highest score, proof of the city’s work in putting technological expertise at everyone’s disposal and in meeting users’ needs through meticulous selection of datasets. Building on this, San Francisco is partnering with the University of Chicago on a data analytics dashboard for sustainability performance statistics, the Sustainable Systems Framework, expected to be released in beta by the end of the first quarter of 2015.

(Image source: Code for America)

Another remarkable contribution to the spread of Open Data comes from the Bartlett Centre for Advanced Spatial Analysis (CASA) at University College London (UCL). Oliver O’Brien, researcher at the UCL Department of Geography and software developer at CASA, is one of the contributors to this cause. Among his projects, an interesting accomplishment is London’s CityDashboard, a control panel of real-time spatial data reports. The web page also allows the data to be viewed on a simplified map, and links to dashboards for other UK cities.

His Bike Share Map, meanwhile, is a live global view of bicycle-sharing systems in over a hundred cities around the world; bike sharing has recently drawn greater public attention as a novel form of transport, above all in Europe and China.

O’Brien’s collaboration with James Cheshire, Lecturer at UCL CASA, has furthermore given life to a groundbreaking project called DataShine, which aims to advance the use of large, open datasets within the social science community through new means of data visualisation, starting with a mapping platform for 2011 Census data, followed by maps of individual census tables and the new Travel to Work Flows table.

(Image source: Suprageography)

Call for action: Help improve the open knowledge directory

Guest - November 10, 2014 in Featured Project

This is a guest blog post from Open Steps, an independent blog aggregating worldwide information about Open Cultures in the form of articles, videos and other resources. Its aim is to document open knowledge (OK) related projects and keep track of the status of such initiatives worldwide: from organisations using Open Data, promoting Open Source technologies, launching Open Government initiatives, following the principles behind Open Science, or supporting the release of information, to newsrooms practicing Data Journalism.

In this way, their site seeks to continue, this time virtually, the globetrotter project realised between July 2013 and July 2014, and to discover further OK projects all around the world.

If you followed the journey across Europe, India, Asia and South America that Margo and Alex from Open Steps undertook last year, you probably already know their open knowledge directory. During those 12 months, in each of the 24 countries they visited, they had the chance to meet numerous enthusiastic activists sharing the same ideas and approaches. To keep a record of all those amazing projects they created what began as a simple contact list but soon evolved into a web application that has been growing ever since.

After some iterations, a new version has recently been released which not only features a new user interface with better usability but also lays the groundwork for continued development, aiming to encourage collaboration among people across borders while monitoring the status of open knowledge initiatives worldwide and raising awareness of relevant projects worth discovering. If you haven’t done so yet, head to http://directory.open-steps.org and join it!

New version implementing PLP Profiles

One of the main features of this new version is the implementation of Portable Linked Profiles, or PLP for short. In a nutshell, PLP allows you to create a profile with your basic contact information that you can use, re-use and share. Basic contact information is the kind of information you are used to typing into dozens of online forms, whether registering on social networks, accessing web services or leaving feedback in forums; it is always the same: name, email, address, website, Facebook, Twitter, and so on. PLP addresses this, but also, and most importantly, allows you to decide where you want your data to be stored.
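To make this concrete, below is a minimal sketch of the pattern just described: a small, self-describing profile document hosted at a URI you control, which any PLP-aware directory can fetch. The JSON serialization, field names and example URI here are illustrative assumptions – the actual vocabularies are defined in the PLP documentation.

```python
# Sketch only: a PLP-style profile is a small document you host at a
# URI of your choosing; directories store the URI, not the data.
# Field names and the URI below are hypothetical examples.
import json
import urllib.request

profile = {
    "name": "Ada Example",
    "email": "ada@example.org",
    "website": "https://example.org",
    "twitter": "@ada_example",
}
# The document you would publish at your chosen URI (your own server,
# a static host, ...):
document = json.dumps(profile, indent=2)
profile_uri = "https://example.org/ada/profile.json"  # hypothetical

def fetch_profile(uri: str) -> dict:
    """Fetch and parse a profile document from its URI."""
    with urllib.request.urlopen(uri) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

The design point is the separation of concerns: a directory only keeps a pointer, while you decide where the data lives and maintain a single up-to-date copy.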

By implementing PLP, the directory no longer relies on the old Google Form, and users can now edit their data and keep it up to date easily. For the sake of re-usability and interoperability, listing your profile in another directory becomes as easy as pasting the URI of your profile into it. If you want to know more about PLP, head to its current home page, read the more extensive article about it on Open Steps, or check the GitHub repository with the documentation. PLP is Open Source software based on Open Web Standards and Common Vocabularies, so collaboration is more than welcome.

Participate in defining the next steps for the open knowledge directory

Speaking of collaboration, on Wednesday 12th November a discussion will take place on how the worldwide open knowledge community can benefit from such a directory, how the current Open Steps implementation can be improved, and what the next steps should be. Whatever your background, if you are a member of the worldwide open knowledge community and want to help improve the open knowledge directory, please join us.
When? Wednesday, 12th November 2014, 3pm GMT

Event on Google+: https://plus.google.com/events/c46ni4h7mc9ao6b48d9sflnetvo

References

This blog post is also available on the Open Education Working Group blog.

New Open Access Button launches as part of Open Access Week

David Carroll - October 22, 2014 in Featured Project, Open Access

This post is part of our Open Access Week blog series to highlight great work in Open Access communities around the world.

Push Button. Get Research. Make Progress.

If you are reading this, I’m guessing that you too are a student, researcher, innovator, an everyday citizen with questions to answer, or just a friend to Open Knowledge. You may be doing incredible work and are writing a manuscript or presentation, or just have a burning desire to know everything about anything. In this case I know that you are also denied access to the research you need, not least because of paywalls blocking access to the knowledge you seek. This happens to me too, all the time, but we can do better. This is why we started the Open Access Button, for all the people around the world who deserve to see and use more research results than they can today.

Yesterday we released the new Open Access Button at a launch event in London; you can download it from openaccessbutton.org. The next time you’re asked to pay to access academic research, push the Open Access Button on your phone or on the web. The Open Access Button will search the web for a version of the paper that you can access.

If you get your research, you can make progress with your work. If you don’t, your story will be used to help change the publishing system so it doesn’t happen again. The tool seeks to help users get the research they need immediately, or adds papers that remain unavailable to a wish-list we can get started on. The apps work by harnessing the power of search engines, research repositories, automatic contact with authors, and other strategies to track down papers that are available and present them to the user – even on a mobile device.

The London launch kicked off a week of events showcasing the Open Access Button in Europe, Asia and the Middle East. Notably, the new Open Access Button was previewed at the World Bank headquarters in Washington, D.C. as part of the International Open Access Week kickoff event. During yesterday’s launch we reached at least 1.3 million people on social media alone. The new apps build upon a successful beta released last November that attracted thousands of users from across the world and drew plenty of media attention. They could not have been built without a dedicated volunteer team of students and young researchers, and the invaluable help of a borderless community responsible for designing, building and funding the development.

Alongside supporting users, we will start using the data and the stories collected by the Button to help make the changes required to really solve this issue. We’ll be running campaigns and supporting grassroots advocates at openaccessbutton.org/action, as well as building a dedicated data platform for advocates to use our data. If you go there now you can see the map, ready to be filled in, and take your first action: sign our first petition, in support of Diego Gomez, a student who faces 8 years in prison and a huge monetary fine for doing something citizens do every day – sharing research online with those who cannot access it.

If you too want to contribute to these goals and advance your research, these are exciting opportunities to make a difference. So install the Open Access Button (it’s quick and easy!), give it a push, click or tap when you’re denied access to research, and let’s work together to fix this problem. The Open Access Button is available now at openaccessbutton.org.

The state of Swedish digital policy: Open Knowledge Sweden at the annual Almedalen Political Summit

Guest - August 1, 2014 in Featured Project, OKF Sweden

This is a guest blog post by Kristina Olausson, blog writer and editor for Open Knowledge Sweden. You can see the Swedish version it is based on here.

Almedalen 2014

Photo by Socialdemokrater, CC-BY-ND

Part of the Open Knowledge Sweden team, Kristina Olausson and Mattias Axell, visited the annual politicians’ week – the Almedalen week on Gotland, Sweden. It is an event in which political parties, interest groups and the public sector participate. The Almedalen week was initiated by Swedish Prime Minister Olof Palme in 1968 and has evolved to become the main political gathering of the year. Even though the outline has changed over time, it now follows a rather fixed pattern: each party has one day of the week dedicated to its events, and the party leader gives a speech in the evening. In parallel to what the parties arrange, a huge number of seminars are organized by different interest groups, companies and public sector bodies; this year more than 3,500 seminars could be found in the program. By participating, Open Knowledge Sweden aimed to follow the current debates on Swedish digital policy and their importance for the upcoming Swedish national elections this autumn. During the week we took part in seminars on digitalization, integrity and open data.

Almedalen 2014

Photo by djurensratt, CC-BY-NC

Since last year, a change can be noticed in the attitude towards open data among Swedish public sector bodies and municipalities. It is now more open and positive, less skeptical. The question is no longer if, but how, the public sector can make its information easier to use. More public sector bodies (PSBs) than before have started working with open data. However, with regard to the OKFN definition of open data, it should be noted that in these cases it is re-use of public sector information, rather than open data, that is being discussed. The municipality of Skellefteå and the region of Västerbotten arranged a seminar on open data and how its possibilities for innovation can be used. They also raised the question of how responsibility for this process should be divided between the public and private sectors as well as other interested parties. Henrik Ishihara, an expert working for Anna-Karin Hatt, the Minister for Information Technology and Energy, said that about 40 percent of all PSBs now work with re-use of public information. Janne Elvelid, a former employee of the Committee of Digitization, was more sceptical about the current development and showed that Sweden has actually lost its place among the leading countries on IT.

Almedalen 2014

Photo by Kristina Olausson, CC-BY-SA

Another seminar, organized by Lantmäteriet, which provides map data, discussed whether charges should be applied to data and, if so, how much. The public sector body itself has now started working more actively to make its data open. Why, then, are Swedish PSBs and municipalities lagging behind their European colleagues in this development? According to many actors, the main obstacle to making more data open is the requirement that PSBs charge for re-use of data. The principle of publicity is an old tradition in Sweden which implies that all public information is available to the public. However, this does not mean that it is free of charge. What separates Sweden from many other European countries is the fact that many public sector bodies are obliged to charge for re-use of data. Some actors we met argued that it will be impossible to create more re-use without removing the rules on charging. In the case of Lantmäteriet, they estimate that removing charges on their map data would cost about 100 million Swedish kronor (about 12 million euros).

The possibilities of digitalization were another theme of many seminars. Dagens Industri and SAS Institute organized one to discuss how the public sector can use big data (as the private sector already does) to predict certain patterns in society – for example, spotting the next flu crisis by analysing Facebook status updates. One challenge put forward in this discussion is the fact that many public services are offered by the 290 Swedish municipalities (kommuner). As there is a strong self-governing principle in Sweden, the municipalities do not collaborate on many of these services, which makes it hard for small municipalities to invest in digitalization. Thus, more collaboration is needed, not only among municipalities but also among public sector bodies.

Cloud services offer a promising way of developing public services, as the goal is to have more services online and thus more information stored in this format. In the meantime, privacy issues need to be taken into account during this development. Microsoft arranged a number of seminars on this theme during the week. One that we attended concerned privacy in schools in combination with cloud services. In Sweden the Salem case is especially well known: the municipality of Salem was criticized by the Data Inspection Authority because it let its students use Google’s cloud services, which were regarded as not providing sufficient protection for pupils’ privacy. How this should be handled in practice is still under political discussion, if a very limited one. At a seminar by Ernst & Young, representatives of some of the big telephone and network operators said this has led to them having to set their own priorities on privacy. This may not be positive, as it could lead to companies starting to censor their net services according to their own liking, which might make the handling of these issues less transparent. Additionally, not all companies are happy to take on this responsibility themselves. The debated judgement of the European Court of Justice in the case Google Spain vs. Mario Costeja González was used as an example by David Mothander, Nordic Policy Advisor at Google.

Almedalen 2014

Photo by FORES, CC-BY

He was critical of the judgement, also called the right to be forgotten, which states that internet search engine operators are responsible for “the processing that it carries out of personal data which appear on web pages published by third parties”. Naturally, it is not surprising that a company like Google does not want to be responsible for such procedures. However, it also raises interesting questions about who should be responsible for protecting the privacy and personal data of individuals. The opportunities of digitization were also discussed at a seminar with representatives of the parties’ youth organisations. While the left (and the youth organisation of the Sweden Democrats) were most concerned about the surveillance society, the right-wing parties wanted better conditions for companies: they want the state to take care of the infrastructure (broadband etc.) while companies drive the development. The most interesting aspect of this seminar was its high density of politicians. Generally, the events on the themes we covered did not have many political representatives on the panels, so it has been hard to evaluate the parties’ digital politics with regard to the upcoming elections this autumn.

Almedalen 2014

Photo by Lärarnas Nyheter, CC-BY-NC-ND

Digital policy has not been a central theme of this year’s election campaigns. However, even though Swedish politicians were not discussing these issues intensively, many interesting ideas were put forward by interest groups and companies. Open data is still not common among Swedish public sector bodies; even though some mix up the terms, it is rather re-use of public sector information that is being discussed. The positive change is that the public sector representatives who participated in this year’s Almedalen week had a more open attitude towards the possibility of re-using their data. Open Knowledge Sweden advocates more re-use of information from the public sector, and we are positive about the ongoing shift in Sweden regarding these issues. We believe that more re-use will create huge value for society, in both the public and private sectors. The main obstacle is not the technological shift that some want to point at, but rather the rules on charging that apply to many of the public sector bodies that collect and offer public information. Unfortunately, it seems that politicians are not prioritizing changes to the current system; the more probable next step is that public sector bodies themselves try to find ways of limiting the charges. However, the decision to charge remains with the government.

Besides following the current debates on Swedish digital policy, the Almedalen week was an opportunity to make contact with other actors and advocates of digitalization. There seems to be general support for, and interest in, making data open for re-use. However, we will probably have to wait until after our national elections this autumn to see real change on these issues in Sweden.

OpenCorporates invites you to join the launch of #FlashHacks

Guest - July 10, 2014 in Featured Project

This is a guest blog post by OpenCorporates.

OpenCorporates is now 3 years old. Looking back at our first post on the Open Knowledge (Foundation) blog, about reaching 20 million companies, it is heartening to see how far we have come. We now have over 70 million companies in 80 jurisdictions worldwide, making us the world’s largest open database of companies. The success story of OpenCorporates is not that of a tiny team but that of the whole open data community: it has always been a community effort, thanks to Open Knowledge and others. From writing scrapers to alerting us when new data is available, deciphering language issues or helping us grow our reach – the open data community has been the driver behind OpenCorporates.

Yet, while our core target of a URL for every single company in the world is making great progress, there’s a bigger goal here: de-siloing all the government data that relates to companies and connecting it to those companies. In fact, one of the most frequent questions has been “How can I help get data into OpenCorporates?” Now we have an answer – not just an answer, but a brand new platform that makes it possible for the community to help us get company-related data into OpenCorporates.

To start this new era of crowdscraping, we have launched a #FlashHacks campaign which aims to get 10 million data points in 10 days. With your help, we are confident we can smash the target.

Why is this important?

Information about the public and private sectors is of monumental importance to understanding and changing the world we live in. Transnational corporations can wield unprecedented influence on politics and the economy, and we have a limited capacity to understand this when we don’t know what these legal entities look like. The influence of these companies can be good or bad, and we don’t have a clear picture of it either way.

Company information is often not available, and when it is, it is buried in hard-to-use websites and PDFs. Fortunately, the work of the open data and transparency community has brought a tide of change. With the introduction of the Open Government Partnership and the G8 Open Data Charter, governments are committing to make this information easily and publicly available. Yet action on this front remains slow – and that’s why scraping is at the heart of the open data movement! Where would the open data community be if bot-writers had not spent time deciphering formats and writing code to release data?

We want to use #FlashHacks as a celebration of the commitment of bot-writers and invite others to join us in changing the world through open data.

#FlashHacks at OKFestival

The last day of the campaign coincides with the last day of OKFestival, probably the biggest gathering of the open data community. So we will be putting on three #FlashHacks in partnership with Open Knowledge Germany, Code for Africa and the Sunlight Foundation.

The OKF Germany #FlashHack will be releasing German data. Sign up here.

The Sunlight Foundation #FlashHack will be releasing political lobbying data. Sign up here.

The Code for Africa #FlashHack will be releasing African data. Sign up here.

How can you join the crowdscraping movement if you can’t make it to OKFest?

  • If you can code in Ruby and/or Python, join http://missions.opencorporates.com and sign up!
  • Have a look at the datasets we have listed on the Campaign page! If there is a dataset you think we should include in this, please put that down here.
  • Sign up to a mission! Send a tweet pledge to say you have taken on a mission.
  • Write the bot and submit it on the platform (see the sketch after this list for the general shape of a bot).
  • Tweet your success with the #FlashHacks tag! Don’t forget to upload the FlashHacks design as your Twitter cover photo and Facebook cover photo to get more people involved.
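To give a feel for what a mission involves, here is a minimal sketch of the general shape of a scraper bot: fetch pages of a public register, extract records, and emit one JSON object per line. The register URL, the field names and the line-per-record output convention are illustrative assumptions, not the missions platform’s actual requirements – check the platform’s documentation for those.

```python
# Sketch of a generic crowdscraping bot. The source URL and the
# extraction pattern are hypothetical; a real register needs a real
# parser and polite crawling (rate limits, error handling).
import json
import re
import sys
import urllib.request

REGISTER_URL = "https://example.gov/companies?page={page}"  # hypothetical

def scrape_page(page: int):
    """Fetch one page of the register and yield company records."""
    html = urllib.request.urlopen(REGISTER_URL.format(page=page)).read().decode("utf-8")
    # Naive extraction for the sketch; a real bot would use a proper
    # HTML parser instead of a regular expression.
    for name, number in re.findall(r"<td>(.*?)</td>\s*<td>(\d+)</td>", html):
        yield {"name": name.strip(), "company_number": number}

def main() -> None:
    for page in range(1, 3):  # a real bot would detect the last page
        for record in scrape_page(page):
            sys.stdout.write(json.dumps(record) + "\n")  # one record per line

if __name__ == "__main__":
    main()
```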

Any problems – you can post on our Google Group.

GitLaw: How The Law Factory turns the French parliamentary process into 300 version-controlled Open Data visualizations

Guest - June 25, 2014 in Featured Project

This is a guest blog post by the French NGO Regards Citoyens, which has actively promoted public Open Data principles in France since 2009 and lobbying transparency since 2010. They create web projects using public data to provide tools for better dialogue between citizens and representatives. Their best-known initiative is the parliamentary monitoring website NosDeputes.fr.

TheLawFactory.fr

Law is Code!

Over the last few years, a number of people have explored the idea of inverting Lawrence Lessig’s metaphor “code is law”, looking at the evolution of laws through the lens of coding tools. The parliamentary process is indeed so similar to a collaborative software development workflow that it is only natural to try and use a version control tool such as git to track individual legislative changes.

The analogy between the two processes is deep: in each case, a group of people collaborates on a textual artifact (a bill or a program’s source code), proposing changes (amendments or patches), adopting or rejecting them (through votes or pull requests), and iterating until a stable, public version is made available (by promulgation or release). This new paradigm for thinking about legislation paves the way for new, innovative approaches to law-tracking. Some exciting work has already been done, most notably in Germany: the BundesGit project invites citizens to propose their own legal modifications as “pull requests”, and Gregor Aisch produced an unprecedented visualization of modifications to one law over 40 years of amendments.

Initiated in 2011, the Law Factory project worked on the French legislative process to answer a simple question: does the Parliament actually write the law, or are MPs only validating the executive’s drafts, as most people commonly assume? A collaboration between Regards Citoyens, an NGO that has monitored the French Parliament’s work through its project NosDéputés.fr since 2009, and two research laboratories at Sciences Po Paris, the médialab and the Centre d’études européennes, the project also sought support from all over the world.

Two international conferences, in June 2012 and May 2014, gathered activists, NGOs, researchers, public servants and journalists in Paris to share projects and ideas from a wide range of expertise. In June 2013, a two-day DesignCamp with the Italian info-designers of Density Design and the Portuguese hacktivists of Manufactura Independente led to a collaboration with Density Design to forge innovative ways to represent and explore bills throughout the legislative process.

The Law Factory: browsing through 290 adopted bills

After three years, TheLawFactory.fr was finally released on May 28, 2014 as a free-software web application combining all available information on 290 bills promulgated since 2010. The full text of these bills and their amendments, as well as contextual documents such as debate transcripts, are redistributed as open data, published as version-controlled text in git repositories, and made accessible through four interactive tools that enable users – researchers, journalists, lobbyists, citizens, legislators and legislative staff – to browse the legislative process at various levels of zoom.

Similar to a Gantt chart, the first visualization lets users navigate through time and discover, within the legislative agenda, which bills were discussed, when, in which chamber and for how long. The display can be switched from the top menu to other views: a comparative one showing the overall time taken to study each bill, and a quantitative one considering only the periods when the text was actually being discussed and not just sitting between the two chambers. The menu also contains filters to display only the most amended bills or those that took the most time to consider. Another feature allows the user to select a theme or legislative year.

Clicking on a bill displays (on the right) a small set of metrics offering a first estimate of a bill’s controversiality, including measures of how much the actual text grew and changed during the whole process, how many amendments were proposed and adopted, and how many words were spoken during the debates. Contrary to popular belief, first analyses reveal that the French Parliament shapes the law-writing process significantly: 74% of the amended texts studied were modified by at least 50%, and 61% increased in length by at least 50%. Only a handful of texts – highly controversial ones – decreased in volume by the end of the parliamentary process. Clicking on the “Explorer les articles” button allows the user to study the legislative process of each bill individually.
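As an aside, headline metrics like “grew by 50%” and “modified by 50%” can be approximated with a few lines of code. The sketch below shows one plausible way to compute them; the Law Factory’s exact definitions are not spelled out in this post, so treat these as illustrations rather than the project’s formulas.

```python
# Two rough per-bill metrics: how much a text grew between its first
# and final versions, and how much of it was rewritten.
import difflib

def growth_ratio(first: str, last: str) -> float:
    """Relative change in length: 0.5 means the text grew by 50%."""
    return (len(last) - len(first)) / len(first)

def change_ratio(first: str, last: str) -> float:
    """Rough share of the text rewritten: 1 minus the similarity ratio."""
    return 1.0 - difflib.SequenceMatcher(None, first, last).ratio()

draft = "Article 1. The rate is set at 10%."
adopted = "Article 1. The rate is set at 12%, subject to annual review."
print(f"grew by {growth_ratio(draft, adopted):.0%}, "
      f"rewritten by {change_ratio(draft, adopted):.0%}")
```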

Studying each bill’s changes individually

All of the changes measured can be further explored in a second module for each bill. Each step of a bill’s legislative process is displayed as a column: the first draft from the government or an MP on the left, followed by each version successively adopted in committee and plenary (by the Senate and the National Assembly) during a first reading, sometimes a conciliation committee, and quite often a few more readings. In each column, the text is split into articles or sections grouping multiple articles. Each article at each step is represented as a box whose height is proportional to the length of its text in characters.

Switching the display into a “compact mode” reveals how much the whole text grew at each step. New articles are marked in green, while removed ones are marked in red. All other articles are shaded in grey depending on how much the text of their alineas (paragraphs) was modified during that step, so that any heavily rewritten article can be quickly identified. Clicking on an article gives access, on the right, to the full text of the article at that step, and optionally displays the differences from the previous version, just like developers’ usual code-diff tools.
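That per-article diff is exactly what developers get from version-control tools. A short sketch with Python’s standard difflib, using invented article text, shows the kind of output meant here:

```python
# Unified diff between two versions of the same (invented) article.
import difflib

before = [
    "The tax applies to incomes above 10 000 euros.",
    "It is collected annually.",
]
after = [
    "The tax applies to incomes above 12 000 euros.",
    "It is collected annually.",
    "Exemptions are defined by decree.",
]

for line in difflib.unified_diff(before, after,
                                 fromfile="article-3@committee",
                                 tofile="article-3@plenary",
                                 lineterm=""):
    print(line)
```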

Exploring the debates and amendments of the parliamentary process

When the text of a bill was modified at a step, a third tool is made accessible to explore all the related amendments via a folder icon in the column header. Each amendment is represented as a small square with the color of its originating political party and an ideogram revealing its resulting status: adopted, rejected or left out. Here again, clicking on an amendment reveals (on the right) the complete amendment text, its author(s) and an explanation of the amendment.

Switching the view into a “grouped mode” and ordering the amendments per political party gives a visual estimate of the origin of all adopted modifications. This can also help visualize parliamentary obstruction, for example, when a political group purposely floods the debate to slow it down with hundreds of amendments destined to fail.

A last visualization, accessible via a discussion icon in the column header, gets to the root of the bill’s changes: the actual discussions between members of parliament at a selected step. Presidents, rapporteurs, government members and the different parliamentary party groups share the speaking time differently during the successive parts of the debate, beginning with a general discussion, followed by a focus on each article and its related amendments individually. Throughout these steps, each group of speakers is represented as a stream graph, each step being a box with a size proportional to the number of words spoken per subject.

This visualization helps the user identify highly debated articles and evaluate the evolving position of each political party on a text. Once again, clicking on a box displays on the right the detailed list of the speakers, with links to the actual minutes of the debate as republished at NosDéputés.fr and NosSénateurs.fr, offering an unprecedented way to trace the discussions related to specific modifications of the law in just a few clicks.

How does it work?

As with most projects handling parliamentary data, at least two thirds of the development time had to be spent not on building the visual exploration tools, but on collecting, assembling, cleaning and processing the data. The French Senate has made great efforts towards open data over the past couple of years and recently started distributing complete daily dumps of some of its databases. The National Assembly, on the other hand, remains hermetic to any sign of openness, most probably due to its members’ irrational fears of the activity rankings the press already publishes…

Ultimately, the Senate’s efforts were only marginally useful for this initiative. Data related to amendments and debates was already preprocessed and waiting to be reused within NosDéputés.fr and NosSénateurs.fr’s databases, requiring only some adjustments and enrichments to their APIs in order to provide direct access to a specific bill’s debates, which can now also benefit other interested users.

But the missing data required here was the version-controlled legislative record: the actual text of the bills at each step of the parliamentary process. Both chambers only publish these as HTML and PDF documents, requiring the extraction of the desired information with automated scrapers. Transforming this into data requires parsing the various layouts and scraping the legislative structure and text out of each document. At this stage only half of the work is done; the biggest challenge is to autocomplete the many missing pieces within the texts. For what are likely practical syntactic reasons, the texts published by the two chambers often do not include unmodified articles or pieces of articles, letting the reader – hence our robots – crawl through a maze of previous versions to look for the actual missing pieces of text.

Once the data was finally assembled and the code published as free software, it could be redistributed for anyone to reuse as an open data API tree, giving access to each data file used to display the visualizations, as well as all the source data assembled to generate them.

Following the GitLaw ideas, it seemed only natural to also host version-controlled git repositories for each of the bills: as in a software project, a bill’s articles become text files, and at the date of each step of the parliamentary process, an institution commits a new version of each article it modified. Using the free software GitLab, anyone can browse the repositories on a web platform and, as on GitHub, propose their own amendments by “forking” a project and submitting a “pull request”.
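In code, the “one commit per institutional step” idea could look like the following sketch, in which each article is a text file and each step of the procedure becomes a commit. The file layout, commit messages and helper function are assumptions for illustration, not the project’s actual pipeline.

```python
# Sketch: version-controlling a bill as one git repository, one text
# file per article, one commit per step of the parliamentary process.
# Requires git on PATH; names and texts are invented.
import pathlib
import subprocess

def commit_step(repo: pathlib.Path, articles: dict, step: str) -> None:
    """Write the current text of every modified article, then record the step."""
    for name, text in articles.items():
        (repo / f"{name}.txt").write_text(text, encoding="utf-8")
    subprocess.run(["git", "-C", str(repo), "add", "-A"], check=True)
    subprocess.run(["git", "-C", str(repo),
                    "-c", "user.name=bot", "-c", "user.email=bot@example.org",
                    "commit", "-m", step], check=True)

repo = pathlib.Path("bill-1234")  # hypothetical bill identifier
repo.mkdir(exist_ok=True)
subprocess.run(["git", "-C", str(repo), "init"], check=True)
commit_step(repo, {"article-1": "Initial draft text.\n"}, "Government draft")
commit_step(repo, {"article-1": "Text as amended in committee.\n"},
            "National Assembly committee, first reading")
```

With history recorded this way, `git log` and `git diff` recover exactly the per-step changes that the visualizations display.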

What’s next?

Still, the automated handling of the sources’ discrepancies remains imperfect. Our robots currently fail on 125 of the texts promulgated since 2010, which means our corpus covers only 70% of the texts considered. We remain confident that we will soon be able to process the majority of the missing bills, and, further on, to integrate texts during their ongoing adoption, allowing anyone to access the detailed version of a text, with the proposed amendments, during the debates. While some of the parsing errors are clearly identified as procedural issues (for financial laws, for instance), some exceptional cases will certainly reach the limits of automation, such as one erratum we encountered which further “amends” the adopted text after its publication.

This work will always face a variety of complex challenges unless the institutions step forward. This is only one example of the many reasons why parliaments all over the world should progressively migrate their legislative processes towards routines fully integrated into their information systems. Just imagine how both the institutions themselves and the societies they serve would benefit from the positive externalities that can only emerge from parliamentary openness and transparency.

Anyone curious to see more on the subject should feel free to browse through the hours of video recordings from the latest Open Legislative Data Conference :)
