You are browsing the archive for OKF Projects.

Newsflash! OKFestival Programme Launches

Beatrice Martini - June 4, 2014 in Events, Free Culture, Join us, Network, News, OKFest, OKFestival, Open Access, Open Data, Open Development, Open Economics, Open Education, Open GLAM, Open Government Data, Open Humanities, Open Knowledge Foundation, Open Knowledge Foundation Local Groups, Open Research, Open Science, Open Spending, Open Standards, Panton Fellows, Privacy, Public Domain, Training, Transparency, Working Groups

At last, it’s here!

Check out the details of the OKFestival 2014 programme – including session descriptions, times and facilitator bios here!

We’re using a tool called Sched to display the programme this year and it has several great features. Firstly, it gives individual session organisers the ability to update the details on the session they’re organising; this includes the option to add slides or other useful material. If you’re one of the facilitators we’ll be emailing you to give you access this week.

Sched also enables every user to create their own personalised programme to include the sessions they’re planning to attend. We’ve also colour-coded the programme to help you when choosing which conversations you want to follow: the Knowledge stream is blue, the Tools stream is red and the Society stream is green. You’ll also notice that there are a bunch of sessions in purple which correspond to the opening evening of the festival when we’re hosting an Open Knowledge Fair. We’ll be providing more details on what to expect from that shortly!

Another way to search the programme is by the subject of the session – find these listed on the right hand side of the main schedule – just click on any of them to see a list of sessions relevant to that subject.

As you check out the individual session pages, you’ll see that we’ve created etherpads for each session where notes can be taken and shared, so don’t forget to keep an eye on those too. And finally, to make the conversations even easier to follow from afar using social media, we’re encouraging session organisers to create individual hashtags for their sessions. You’ll find these listed on each session page.

We received over 300 session suggestions this year – the most yet for any event we’ve organised – and we’ve done our best to fit in as many as we can. There are 66 sessions packed into 2.5 days, plus 4 keynotes and 2 fireside chats. We’ve also made space for an unconference over the 2 core days of the festival, so if you missed out on submitting a proposal, there’s still a chance to present your ideas at the event: come ready to pitch! Finally, the Open Knowledge Fair has added a further 20 demos – and counting – to the lineup and is a great opportunity to hear about more projects. The Programme is full to bursting, and while some time slots may still change a little, we hope you’ll dive right in and start getting excited about July!

We think you’ll agree that Open Knowledge Festival 2014 is shaping up to be an action-packed few days – so if you’ve not bought your ticket yet, do so now! Come join us for what will be a memorable 2014 Festival!

See you in Berlin! Your OKFestival 2014 Team

Introducing the new Open Development Toolkit site!

Zara Rahman - June 3, 2014 in OKF Projects, Open Development

We’re very happy to launch today a new website for the Open Development Toolkit, which includes a number of new features to help people make use of, and contribute to, the project.

When the project began in early 2014, the project brief was fairly open; since then, after speaking to various members of the Open Development community, attending events such as the IATI TAG meeting, and doing a thorough assessment of what is already going on in the community, we’ve narrowed down the project aims and target audience considerably. With regard to the target audience, we’re now considering two main, broad demographics: data users, and development agencies/donors.

By ‘data users’, we’re considering primarily infomediaries in aid recipient countries: civil society and journalists, who could be using development data in their work. They’re in a position to be able to understand the data with local context, and convey their findings to their communities in an effective way. We want to make it as easy as possible for them to find and use aid data portals that already exist, as well as develop their technical skills in accessing, and using, raw aid data to facilitate their work.

With regard to development agencies and donors, we’re looking specifically at those who are thinking of making their data available online; rather than building new portals from scratch and creating proprietary tools, we’d like to encourage them to build upon what has already been created, share and take into account lessons learned, and contribute to the community with their tool/portal creation. Especially where tools have been built with public funds (e.g. development arms of governments) we see no reason for these tools to remain closed source and proprietary.

Tools

The new site includes a curated list of Tools, which allow the user to understand, visualise or access aid data in various ways. Each ‘Tool’ is presented on the site with a short description of what it does, along with its main strengths and weaknesses. Each one is classified with a number of tags stating the perceived skill level required (beginner, intermediate, or advanced), the data source used by the tool, and its ‘theme’ (e.g. global overview, donor specific, recipient country, donor government). The tagging system allows users to search for tools by what they want to focus on – for example, looking into the activities of a certain donor agency, or taking a closer look at projects taking place in a particular aid recipient country.
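
The tagging scheme described above can be pictured as a simple filter over tool records. The following Python sketch is illustrative only: the `Tool` record, its field names, the `find_tools` helper and the sample entries are all hypothetical and are not taken from the Toolkit site's actual implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    """Hypothetical record for one entry in the curated Tools list."""
    name: str
    skill_level: str    # "beginner", "intermediate" or "advanced"
    data_source: str    # e.g. "IATI"
    themes: frozenset   # e.g. {"donor specific", "recipient country"}


def find_tools(tools, skill_level=None, data_source=None, theme=None):
    """Return the tools that match every tag that was supplied."""
    matches = []
    for tool in tools:
        if skill_level is not None and tool.skill_level != skill_level:
            continue
        if data_source is not None and tool.data_source != data_source:
            continue
        if theme is not None and theme not in tool.themes:
            continue
        matches.append(tool)
    return matches


# Made-up sample entries, for illustration only.
SAMPLE_TOOLS = [
    Tool("Example Aid Map", "beginner", "IATI",
         frozenset({"recipient country"})),
    Tool("Example Donor Explorer", "intermediate", "IATI",
         frozenset({"donor specific"})),
    Tool("Example Raw Data Browser", "advanced", "Other",
         frozenset({"global overview"})),
]

beginner_country_tools = find_tools(
    SAMPLE_TOOLS, skill_level="beginner", theme="recipient country"
)
print([tool.name for tool in beginner_country_tools])
```

Leaving a tag unset means "any value", so the same helper covers both a broad search (by data source only) and a narrow one (skill level plus theme).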

Each tool also has a second tab, explaining how the tool was made. We’re putting special focus on the tools which are already open source, and by putting the name of the developer(s) who have worked on these tools along with their contact details, we hope to make it as easy as possible for more work to be commissioned which will build upon their expertise.

Community

Another focus of the site is to bring together people who have worked on building the tools from a technical perspective, along with people working in development agencies, and the potential users of the data; the whole ‘development data’ ecosystem, in a way.

On the Community page, anyone active in the Open Development space is encouraged to create a profile (for now, by filling in this Google form) with their contact details and a short biography, either as an individual or as an organisation. Activities of organisations and individuals can be seen on their profile pages: for example, tools that they have built or contributed to, blog posts that they have written, and people/organisations with whom they have collaborated.

We hope that highlighting the work that people have done within the Open Development community, along with their collaborations, will facilitate further collaboration, and encourage organisations to call upon community expertise when developing new tools.

Training

As well as displaying the tools and work that have already been created within the community and encouraging collaboration, we also want to support civil society and journalists to get the skills they need to use development data in their work, as mentioned above. We’ll be doing this by working with School of Data to create an Aid Curriculum, made up of various modules on technical skills required to work with aid data.

Ideally, we’d like to build upon training materials that have already been created in the sector, and make them available for remixing and reuse by others in the future; we’ll be encouraging people to try them out in workshops and training sessions, and we’d love to get feedback on how they have best been used, so we can iterate and improve upon them in the future. The curriculum will also be available online for people to work through at their own pace.

Blog

Last, but not least – the site includes a blog, where we’ll be posting on topics such as uses of development data by civil society or journalists, lessons learned during the software development of data portals, and other issues surrounding data use within the global development sector. We welcome submissions to the blog – take a look here to see other topics, and how to contribute.

Feedback on the site is most welcome – either open an issue on GitHub or drop an email to zara@opendevtoolkit.net.

Bonding with Hong Kong and upcoming Open Spending

Heather Leson - May 16, 2014 in Events, Featured, OKF Hong Kong, Open Spending

Learning and sharing across the global Open Knowledge community are the two core purposes of our regular Community Sessions.

This week Mart van de Ven and Bastien Douglas joined us to share all about the Open Data Hong Kong community.

Some of the key lessons they shared: ask your community for help more often, hold regular events, make translation a priority and be ready for longer-term engagement. Mart, Bastien and the ODHK folks: Have a great Longitudinal Hack!

See more about Open Data Hong Kong.

Next Community Session: All about OpenSpending

Around the world, citizens are getting involved in OpenSpending. So far, there are OpenSpending activities in 66 countries, resulting in 735 datasets and 25,207,863 entries.

Join Anders Pedersen, Community Manager for OpenSpending to learn more about this project and how you can get involved.

  • Date: Wednesday, May 28, 2014
  • Time: 10:00 – 11:00 EDT/14:00-15:00 UTC (See worldtimebuddy.com for your timezone)
  • How to Register (G+)
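
For readers who prefer to work out their local time themselves, the conversion behind the listed times can be sketched in Python. This is a minimal sketch using fixed UTC offsets rather than a timezone database; the offsets (EDT as UTC-4, CEST as UTC+2) are assumptions that hold for the session date, and CEST is chosen only as an example target zone.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets, assumed valid on the session date (daylight saving time).
EDT = timezone(timedelta(hours=-4), "EDT")
CEST = timezone(timedelta(hours=2), "CEST")

# Session start as announced: Wednesday, May 28, 2014, 10:00 EDT.
session_start = datetime(2014, 5, 28, 10, 0, tzinfo=EDT)

# Convert the same instant into other zones.
start_utc = session_start.astimezone(timezone.utc)
start_cest = session_start.astimezone(CEST)

print(start_utc.strftime("%H:%M %Z"))
print(start_cest.strftime("%H:%M %Z"))
```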

Join the OpenSpending community. See some Spending Stories.

We will record this.

NOTE: We are booking June 2014 Community Sessions. Contact heatherDOTleson AT OKFN DOT org if you have an idea, discussion or skillshare.

Talk soon!

We need you! Become a School of Data Fellow

Milena Marin - May 9, 2014 in Featured, School of Data

Got data skills to share? Member of a community that wants to turn data into information? Know about a data journalism or civic activism project or organisation which needs a push to use data more effectively? The School of Data needs you! We are currently broadening our efforts to spread data skills around the world, and people like you are crucial in this effort: new learners need guidance and people to help them along the way. Stand out and become a **School of Data Fellow**.

We are looking for people fitting the following profile:

  • Data savvy: has experience working with data and a passion for teaching data skills.

  • Understands the role of Non-Governmental Organizations (NGOs) and media in bringing positive change through advocacy, campaigns, and storytelling. Fellows are passionate about enabling partners to use data effectively through training and ongoing support.

  • Interested or experienced in working with journalism and/or civil society.

  • Has some facilitation skills and enjoys community-building (both online and offline).

  • Eager to learn from and be connected with an international community of data enthusiasts.

As a School of Data fellow, you will receive data and leadership training, as well as coaching to organise events and build your community. You will also be part of a growing global network of School of Data practitioners, benefiting from the network effects of sharing resources and knowledge and contributing to our understanding about how best to localise our training efforts.

You will be part of a six-month training programme where we expect you to work with us for an average of five days a month, including attending online and offline trainings, organising events, and being an active member of the School of Data community.

There are up to 10 fellowship positions open for the July to December 2014 School of Data training programme.

We have current collaborations and resourcing confirmed to support fellows from the following countries: Romania, Hungary, South Africa, Indonesia, and Tanzania. We are also able to consider applicants for the remaining 5 places in this round from countries meeting these criteria:

  • The country falls under lower income, lower-middle income or upper-middle income categories as classified here.

  • There is demand from civil society organisations and/or journalists who wish to benefit from such a scheme.

  • There are some interesting datasets available in the country which would be worth exploring further. These could either be data published by a government or organisation or data collected by an organisation for their own internal use. Digitised or non-digitised—anything goes! We’re keen for a variety of challenges and want the fellows’ help to adapt teaching techniques to a variety of situations.

Our goal is to have global fellows from a wide mix of these countries. Don’t see your country listed? Keep reading to learn how you can get involved!

Got questions? See more about the Fellowship Programme here and have a look at this Frequently Asked Questions (FAQ) page. If this doesn’t answer your question, email us at schoolofdata@okfn.org.

Not sure if you fit the profile? Have a look at who is a fellow now!

Convinced? Apply now to become a School of Data Fellow. The application will be open until the 1st of June 2014 and the programme will start in July 2014.

Take a CKAN Tour

Heather Leson - May 1, 2014 in CKAN, Events, OKF Projects

From baby name datasets and apps via the South Australian government to the new City of Surrey, B.C. (Canada) site, there are many instances of CKAN around the world. CKAN is the data management system that makes data accessible – by providing tools to streamline publishing, sharing, finding and using data. It is used by various levels of government, civil society groups and organizations to make their data transparent and available.

In this 1-hour video hangout, Irina Bolychevsky, Services Director, gives us an overview of CKAN with live demos of several CKAN sites including data.gov.uk, publicdata.eu and data.glasgow.gov.uk. She also answers community questions.
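
Besides the web interface shown in the demo, CKAN sites expose a JSON Action API for finding datasets programmatically. The Python sketch below builds a `package_search` request and reads dataset titles out of a response. The site URL is a placeholder, the sample response is made up to show the general shape of a reply, and the helper names are hypothetical; check the API documentation for the CKAN version a given site runs before relying on it.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def package_search_url(site_url, query, rows=5):
    """Build a URL for the Action API's package_search call."""
    params = urlencode({"q": query, "rows": rows})
    return f"{site_url.rstrip('/')}/api/3/action/package_search?{params}"


def dataset_titles(response_text):
    """Extract dataset titles (or names, if untitled) from a JSON reply."""
    payload = json.loads(response_text)
    if not payload.get("success"):
        raise RuntimeError("CKAN reported an unsuccessful request")
    return [
        dataset.get("title") or dataset["name"]
        for dataset in payload["result"]["results"]
    ]


def search(site_url, query, rows=5):
    """Query a live CKAN site (needs network access; not called below)."""
    with urlopen(package_search_url(site_url, query, rows)) as response:
        return dataset_titles(response.read().decode("utf-8"))


# Offline demonstration with a made-up response.
SAMPLE_RESPONSE = json.dumps({
    "success": True,
    "result": {
        "count": 2,
        "results": [
            {"name": "example-baby-names", "title": "Example baby names"},
            {"name": "example-untitled"},
        ],
    },
})

print(package_search_url("https://data.example.org/", "baby names"))
print(dataset_titles(SAMPLE_RESPONSE))
```

Keeping the URL builder and the response parser separate from the network call makes the sketch easy to try without touching a live portal.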

Get Involved

CKAN has a wide community of contributors working to remix and extend the software. Two examples of code that folks have contributed are ckanext-spatial and ckanext-realtime (GitHub links).

The CKAN core committers host regular online developer meetings. These take place every Tuesday and Thursday, 13:00 – 14:00 EDT, to review pull requests and discuss architecture. We meet up on the ckan developer mailing list, on the #ckan IRC channel on freenode (to get the Google Hangout link for meetings!) and in comments on GitHub tickets. All welcome.

Community questions tend to be asked on Stack Overflow using the CKAN tag. You can also file issues/contribute code on GitHub.

Contact us

If you want to talk about CKAN development, please come and say hi on the ckan-dev mailing list or the #ckan IRC channel on irc.freenode.org. If you have service inquiries, you can reach out to the team: services at ckan dot org

Upcoming Community Sessions: CKAN, Community Feedback

Heather Leson - April 28, 2014 in CKAN, Events, Network, Open Knowledge Foundation Local Groups, Our Work, Working Groups

Happy week! We are hosting two Community Sessions this week. You have expressed an interest in learning more about CKAN. As well, we are continuing our regular Community Feedback sessions.

Take a CKAN Tour:

This week we will give an overview and tour of CKAN – the leading open source open data platform used by the national governments of the US, UK, Brazil, Canada, Australia, France, Germany, Austria and many more. This session will cover why data portals are useful, what they provide and showcase examples and best practices from CKAN’s varied user base! Bring your questions on how to get started and best practices.

Guest: Irina Bolychevsky, Services Director (Open Knowledge) Questions are welcome via G+ or Twitter.

  • Date: Wednesday, April 30, 2014
  • Time: 7:30 PT /10:30 ET /14:30 UTC /15:30 BST/16:30 CEST
  • Duration: 1 hour
  • Register and Join via G+ (The Hangout will be recorded.)

Community Feedback Session

We promised to schedule another Community Feedback Session. It is hard to find a common time for folks. We will work on timeshifting these for next sessions. This is a chance to ask questions, give input and help shape Open Knowledge.

Please join Laura, Naomi and me for the next Community Feedback Session. Bring your ideas and questions.

  • Date: Wednesday, April 30, 2014
  • Time: 9:00 PT / 12:00 EDT / 16:00 UTC / 17:00 BST / 18:00 CEST
  • Duration: 1 hour
  • Join via Meeting Burner

We will use Meeting Burner and IRC. (Note: We will record both of these.)

How to join Meeting Burner (audio instructions):

  • Option 1: Dial in to the following conference line: 1-(949) 229-4400 #
  • Option 2: You may join the conference bridge with your computer’s microphone/speakers or headset

How to join IRC: http://wiki.okfn.org/How_to_use_IRC/_Clients_and_Tips

More about the new Open Knowledge Brand

Host a Community Session in May

We are booking Community Sessions for May. These Open Knowledge online events can be in a number of forms: a scheduled IRC chat, a community google hangout, a technical sprint or an editathon. The goal is to connect the community to learn and share their stories and skills. If you would like to suggest a session or host one, please contact heather dot leson at okfn dot org.

More details about Community Sessions

(Photo: Heather Leson (San Francisco))

Building an archaeological project repository II: Where are the research data repositories?

Guest - April 17, 2014 in CKAN, Open Science, WG Archaeology

This is a guest post by Anthony Beck, Honorary fellow, and Dave Harrison, Research fellow, at the University of Leeds School of Computing

Data repository as research tool

In a previous post, we examined why Open Science is necessary to take advantage of the huge corpus of data generated by modern science. In our project Detection of Archaeological residues using Remote sensing Techniques, or DART, we adopted Open Science principles and made all the project’s extensive data available through a purpose-built data repository built on the open-source CKAN platform. But with so many academic repositories, why did we need to roll our own? A final post will look at how the portal was implemented.

DART: data-driven archaeology

DART’s overall aim is to develop analytical methods to differentiate archaeological sediments from non-archaeological strata, on the basis of remotely detected phenomena (e.g. resistivity, apparent dielectric permittivity, crop growth, thermal properties etc). DART is a data rich project: over a 14 month period, in-situ soil moisture, soil temperature and weather data were collected at least once an hour; ground based geophysical surveys and spectro-radiometry transects were conducted at least monthly; aerial surveys collecting hyperspectral, LiDAR and traditional oblique and vertical photographs were taken throughout the year, and laboratory analyses and tests were conducted on both soil and plant samples. The data archive itself is in the order of terabytes.

Analysis of this archive is ongoing; meanwhile, this data and other resources are made available through open access mechanisms under liberal licences and are thus accessible to a wide audience. To achieve this we used the open-source CKAN platform to build a data repository, DARTPortal, which includes a publicly queryable spatio-temporal database (on the same host), and can support access to individual data as well as mining or analysis of integrated data.
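
To give a flavour of what a spatio-temporal query over such data might look like, here is a small self-contained Python sketch. It is not the DART database: SQLite stands in for whatever engine the portal uses, and the table, column names, coordinates and readings are all hypothetical.

```python
import sqlite3

# In-memory stand-in for a spatio-temporal store (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE soil_readings (
        probe_id    TEXT,
        easting     REAL,
        northing    REAL,
        recorded_at TEXT,   -- ISO 8601 timestamp, sorts chronologically
        moisture    REAL
    )
    """
)
conn.executemany(
    "INSERT INTO soil_readings VALUES (?, ?, ?, ?, ?)",
    [
        ("P1", 100.0, 200.0, "2011-06-01T10:00:00", 0.21),
        ("P1", 100.0, 200.0, "2011-06-01T11:00:00", 0.23),
        ("P2", 500.0, 900.0, "2011-06-01T10:00:00", 0.35),
        ("P1", 100.0, 200.0, "2011-07-01T10:00:00", 0.18),
    ],
)


def mean_moisture(connection, bbox, start, end):
    """Average moisture inside a bounding box and a half-open time window."""
    min_e, min_n, max_e, max_n = bbox
    return connection.execute(
        """
        SELECT AVG(moisture), COUNT(*)
        FROM soil_readings
        WHERE easting BETWEEN ? AND ?
          AND northing BETWEEN ? AND ?
          AND recorded_at >= ? AND recorded_at < ?
        """,
        (min_e, max_e, min_n, max_n, start, end),
    ).fetchone()


mean, count = mean_moisture(
    conn, (0, 0, 250, 250), "2011-06-01T00:00:00", "2011-06-02T00:00:00"
)
print(count, round(mean, 2))
```

The point of the sketch is only that space (a bounding box) and time (a window) become ordinary filter conditions once the readings sit in a queryable database.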

This means we can share the data analysis and transformation processes and demonstrate how we transform data into information and synthesise this information into knowledge (see, for example, this IPython notebook which dynamically exploits the database connection). This is the essence of Open Science: exposing the data and processes that allow others to replicate and more effectively build on our science.

Lack of existing infrastructure

Pleased though we are with our data repository, it would have been nice not to have to build it! Individual research projects should not bear the burden of implementing their own data repository framework. This is much better suited to local or national institutions where the economies of scale come into their own. Yet in 2010 the provision of research data infrastructure that supported what DART did was either non-existent or poorly advertised. Where individual universities provided institutional repositories, these were focused on publications (the currency of prestige and career advancement) and not on data. Irrespective of other environments, none of the DART collaborating partners provided such a data infrastructure.

Data sharing sites like Figshare did not exist, and once they did, the size of our hyperspectral data in particular was quite rightly a worry. This situation is slowly changing, but it is still far from ideal. The positions taken by Research Councils UK and the Engineering and Physical Sciences Research Council (EPSRC) on improving access to data are key catalysts for change. The EPSRC statement is particularly succinct:

Two of the principles are of particular importance: firstly, that publicly funded research data should generally be made as widely and freely available as possible in a timely and responsible manner; and, secondly, that the research process should not be damaged by the inappropriate release of such data.

This has produced a simple economic issue – if research institutions can not demonstrate that they can manage research data in the manner required by the funding councils then they will become ineligible to receive grant funding from that council. The impact is that the majority of universities are now developing their own, or collaborating on communal, data repositories.

But what about formal data deposition environments?

DART was generously funded through the Science and Heritage Programme supported by the UK Arts and Humanities Research Council (AHRC) and the EPSRC. This means that these research councils will pay for data archiving in the appropriate domain repository, in this case the Archaeology Data Service (ADS). So why produce our own repository?

Deposition to the ADS would only have occurred after the project had finished. With DART, the emphasis has been on re-use and collaboration rather than primarily on archiving. These goals are not mutually exclusive: the methods adopted by DART mean that we produced data that is directly suitable for archiving (well documented ASCII formats, rich supporting description and discovery metadata, etc) whilst also allowing more rapid exposure and access to the ‘full’ archive. This resulted in DART generating much richer resource discovery and description metadata than would have been the case if the data was simply deposited into the ADS.

The point of the DART repository was to produce an environment which would facilitate good data management practice and collaboration during the lifetime of the project. This is representative of a crucial shift in thinking, where projects and data collectors consider re-use, discovery, licences and metadata at a much earlier stage in the project life cycle: in effect, to create dynamic and accessible repositories that have impact across the broad stakeholder community rather than focussing solely on the academic community. The same underpinning philosophy of encouraging re-use is seen at both Figshare and DataHub. Whilst formal archiving of data is to be encouraged, if it is not re-useable, or more importantly easily re-useable, within orchestrated scientific workflow frameworks then what is the point?

In addition, it is unlikely that the ADS will take the full DART archive. It has been said that archaeological archives can produce lots of extraneous or redundant ‘stuff’. This can be exacerbated by the unfettered use of digital technologies – how many digital images are really required for the same trench? Whilst we have sympathy with this argument, there is a difference between ‘data’ and ‘pretty pictures’: as data analysts, we consider that a digital photograph is normally a data resource and rarely a pretty picture. Hence, every image has value.

This is compounded when advances in technology mean that new data can be extracted from ‘redundant’ resources. For example, Structure from Motion (SfM) is a Computer Vision technique that extracts 3D information from 2D objects. From a series of overlapping photographs, SfM techniques can be used to extract 3D point clouds and generate orthophotographs from which accurate measurements can be taken. In the case of SfM there is no such thing as redundancy, as each image becomes part of a ‘bundle’ and the statistical characteristics of the bundle determine the accuracy of the resultant model. However, one does need to be pragmatic, and it is currently impractical for organisations like the ADS to accept unconstrained archives. That said, it is an area that needs review: if a research object is important enough to have detailed metadata created about it, then it should be important enough to be archived.

For DART, this means that the ADS is hosting a subset of the archive in long-term re-use formats, which will be available in perpetuity (which formally equates to a maximum of 25 years), while the DART repository will hold the full archive in long-term re-use formats until we run out of server money. We are in discussion with Leeds University to migrate all the data objects over to the new institutional repository with sparkling new DOIs, and we can transfer the metadata held in CKAN over to Open Knowledge’s public repository, the DataHub. In theory nothing should be lost.

How long is forever?

The point on perpetuity is interesting. Collins Dictionary defines perpetuity as ‘eternity’. However, the ADS defines ‘digital’ perpetuity as 25 years. This raises the question: is it more effective in the long term to deposit in ‘formal’ environments (with an intrinsic focus on preservation format over re-use), or in ‘informal’ environments (with a focus on re-use and engagement over preservation: Flickr, Wikimedia Commons, the DART repository based on CKAN, etc.)? Both Flickr and Wikimedia Commons have been around for over a decade. Distributed peer to peer sharing, as used in Git, produces more robust and resilient environments which are equally suited to longer term preservation. Whilst the authors appreciate that the situation is much more nuanced, particularly with the introduction of platforms that facilitate collaborative workflow development, this does have an impact on long-term deployment.

Choosing our licences

Licences are fundamental to the successful re-use of content. Licences describe who can use a resource, what they can do with this resource and how they should reference any resource (if at all).

Two lead organisations have developed legal frameworks for content licensing, Creative Commons (CC) and Open Data Commons (ODC). Until the release of CC version 4, published in November 2013, the CC licence did not cover data. Between them, CC and ODC licences can cover all forms of digital work.

At the top level the licences are permissive public domain licences (CC0 and PDDL respectively) that impose no restrictions on the licensee’s use of the resource. ‘Anything goes’ in a public domain licence: the licensee can take the resource and adapt it, translate it, transform it, improve upon it (or not!), package it, market it, sell it, etc. Constraints can be added to the top level licence by employing the following clauses:

  • BY – By attribution: the licensee must attribute the source.
  • SA – Share-alike: if the licensee adapts the resource, they must release the adapted resource under the same licence.
  • NC – Non commercial: the licensee must not use the work within a commercial activity without prior approval. Interestingly, in many areas of the world, the use of material in university lectures may be considered a commercial activity. The non-commercial restriction is about the nature of the activity, not the legal status of the institution doing the work.
  • ND – No derivatives: the licensee can not derive new content from the resource.

Each of these clauses decreases the ‘open-ness’ of the resource. In fact, the NC and ND clauses are not intrinsically open (they restrict both who can use and what you can do with the resource). These restrictive clauses have the potential to produce licence incompatibilities which may introduce profound problems in the medium to long term. This is particularly relevant to the SA clause. Share-alike means that any derived output must be licensed under the same conditions as the source content. If content is combined (or mashed up) – which is essential when one is building up a corpus of heritage resources – then content created under a SA clause can not be combined with content that includes a restrictive clause (BY, NC or ND) that is not in the source licence. This licence incompatibility has a significant impact on the nature of the data commons. It has the potential to fragment the data landscape, creating pockets of knowledge which are rarely used in mainstream analysis, research or policy making. This will be further exacerbated when automated data aggregation and analysis systems become the norm. A permissive licence without clauses like Non-commercial, Share-alike or No-derivatives removes such licence and downstream re-user fragmentation issues.
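
The combination rule described above can be made concrete with a small Python sketch. It treats a licence as a set of clauses and encodes only the reasoning given in this post; the notation (labels such as "BY-SA", with "0" for a public domain style licence) and the function names are assumptions made for illustration, and real-world compatibility depends on the exact licence texts and versions.

```python
KNOWN_CLAUSES = frozenset({"BY", "SA", "NC", "ND"})


def clauses(licence):
    """Split a label such as 'BY-SA' into its set of clauses.

    A public domain style licence is written '0' (assumed notation).
    """
    parts = {p for p in licence.upper().split("-") if p and p != "0"}
    unknown = parts - KNOWN_CLAUSES
    if unknown:
        raise ValueError(f"unknown clause(s): {sorted(unknown)}")
    return frozenset(parts)


def can_combine(licence_a, licence_b):
    """Apply the simplified mash-up rule sketched in the text."""
    a, b = clauses(licence_a), clauses(licence_b)
    # No-derivatives content can not be adapted, so it can not be mashed up.
    if "ND" in a or "ND" in b:
        return False
    # Share-alike output must keep the SA licence's own conditions, so the
    # other input may not carry a clause that the SA licence lacks.
    if "SA" in a and not b <= a:
        return False
    if "SA" in b and not a <= b:
        return False
    return True


for pair in [("BY-SA", "BY"), ("BY-SA", "BY-NC"), ("0", "BY-SA"), ("BY", "BY-NC")]:
    print(pair, can_combine(*pair))
```

Note how a permissive licence with no clauses never blocks a combination in this model, which is the fragmentation argument in miniature.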

For completeness, specific licences have been created for Open Government Data. The UK Government Data Licence for public sector information is essentially an open licence with a BY attribution clause.

At DART we have followed the guidelines of The Open Data Institute and separated out creative content (illustrations, text, etc.) from data content. Hence, the DART content is either CC-BY or ODC-BY respectively. In the future we believe it would be useful to drop the BY (attribution) clause. This would stop attribution stacking: if the resource you are using is a derivative of a derivative of a derivative (you get the picture), at what stage do you stop attributing? Moreover, anything which requires bureaucracy, such as attributing an image in a PowerPoint presentation, inhibits re-use (one should always assume that people are intrinsically lazy). There is a post advocating ccZero+ by Dan Cohen. However, impact tracking may mean that the BY clause becomes a default for academic deposition.

The ADS uses a more restrictive bespoke default licence which does not map to national or international licence schemes (they also don’t recognise non-CC licences). Resources under this licence can only be used for teaching, learning, and research purposes. Of particular concern is their use of the NC clause and possible use of the ND clause (depending on how you interpret the licence). Interestingly, policy changes mean that the use of data under the bespoke ADS licence becomes problematic if university teaching activities are determined to be commercial. It is arguable that the payment of tuition fees represents a commercial activity. If this is true then resources released under the ADS licence can not be used within university teaching which is part of a commercial activity. Hence, the policy change in student tuition and university funding has an impact on the commercial nature of university teaching, which has a subsequent impact on what data or resources universities are licensed to use. Whilst it may never have been the intention of the ADS to produce a licence with this potential paradox, it is a problem when bespoke licences are developed, even if they were originally perceived to be relatively permissive licences. To remove this ambiguity it is recommended that submissions to the ADS are provided under a CC licence which renders the bespoke ADS licence void.

In the case of DART, these licence variations with the ADS should not be a problem. Our licences are permissive (by attribution is the only clause we have included). This means the ADS can do anything they want with our resources as long as they cite the source. In our case this would be the individual resource objects or collections on the DART portal. This is a good thing, as the metadata on the DART portal is much richer than the metadata held by the ADS.

Concerns about opening up data, and responses which have proved effective

Christopher Gutteridge (University of Southampton) and Alexander Dutton (University of Oxford) have collated a Google doc entitled ‘Concerns about opening up data, and responses which have proved effective’. This document describes a number of concerns commonly raised by academic colleagues about increasing access to data. For DART two issues became problematic that were not covered by this document:

  • The relationship between open data and research novelty and the impact this may have on a PhD submission.
  • Journal publication – specifically that a journal won’t publish a research paper if the underlying data is open.

The former point is interesting – does the process of undertaking open science, or at least providing open data, undermine the novelty of the resultant scientific process? With open science it could be difficult to directly attribute the contribution, or novelty, of a single PhD student to an openly collaborative research process. However, that said, if online versioning tools like Git are used, then it is clear who has contributed what to a piece of code or a workflow (the benefits of the BY clause). This argument is less solid when we are talking solely about open data. Whilst it is true that other researchers (or anybody else for that matter) have access to the data, it is highly unlikely that multiple researchers will use the same data to answer exactly the same question. If they do ask the same question (and making the optimistic assumption that they reach the same conclusion), it is still highly unlikely that they will have done so by the same methods; and even if they do, their implementations will be different. If multiple methods using the same source data reach the same conclusion then there is an increased likelihood that the conclusion is correct and that the science is even more certain. The underlying point here is that 21st-century scientific practice will substantially benefit from people showing their working. Exposure of the actual process of scientific enquiry (the algorithms, code, etc.) will make the steps between data collection and publication more transparent, reproducible and peer-reviewable – or, quite simply, more scientific. Hence, we would argue that open data and research novelty is only a problem if plagiarism is a problem.

The journal publication point is equally interesting. Publications are the primary metric for academic career progression and kudos. In this instance it was the policy of the ‘leading journal in this field’ that they would not publish a paper from a dataset that was already published. No credible reasons were provided for this clause – which seems draconian in the extreme. It does indicate that no one-size-fits-all approach will work in the academic landscape. It will also be interesting to see how this journal, which publishes work which is mainly funded by EPSRC, responds to the EPSRC guidelines on open data.

This is also a clear demonstration that the academic community needs to develop new metrics that are more suited to 21st-century research and scholarship by directly linking academic career progression to other sources of impact that go beyond publications. Furthermore, academia needs some high-profile exemplars that demonstrate clearly how to deal with such change. The policy shift and ongoing debate concerning ‘Open access’ publications in the UK is changing the relationship between funders, universities, researchers, journals and the public – a similar debate needs to occur about open data and open science.

The altmetrics community is developing new metrics for “analyzing, and informing scholarship” and has described its ethos in its manifesto. The Research Councils and Governments have taken a much greater interest in the impact of publicly funded research. Importantly, public, social and industry impact are as important as academic impact. It is incumbent on universities to respond to this by directly linking academic career progression through to impact and by encouraging improved access to the underlying data and processing outputs of the research process through data repositories and workflow environments.

Skillshares and Stories: Upcoming Community Sessions

Heather Leson - April 3, 2014 in CKAN, Events, Network, OKF Brazil, OKF Projects, Open Access, Open Knowledge Foundation Local Groups, School of Data

We’re excited to share with you a few upcoming Community Sessions from the School of Data, CKAN, Open Knowledge Brazil, and Open Access. As we mentioned earlier this week, we aim to connect you to each other. Join us for the following events!

What is a Community Session? These online events can take a number of forms: a scheduled IRC chat, a community Google hangout, a technical sprint or hackpad editathon. The goal is to connect the community to learn and share their stories and skills.

We held our first Community Session yesterday (see our Wiki Community Session notes). The remaining April events will be online via G+. These sessions will be public Hangouts on Air. The video will be available on the Open Knowledge YouTube Channel after the event. Questions are welcome via Twitter and G+.

All these sessions are Wednesdays at 10:30 – 11:30 am ET/ 14:30 – 15:30 UTC.

Mapping with Ketty and Ali: a School of Data Skillshare (April 9, 2014)

Making a basic map from spreadsheet data: We’ll explore tools like QGIS (a free and open-source Geographic Information System) and TileMill (a tool to design beautiful interactive web maps). Our guest trainers are Ketty Adoch and Ali Rebaie.

To join the Mapping with Ketty and Ali Session on April 9, 2014

Q & A with Open Knowledge Brazil Chapter featuring Everton (Tom) Zanella Alvarenga (April 16, 2014)

Around the world, local groups, Chapters, projects, working groups and individuals connect to Open Knowledge. We want to share your stories.

In this Community Session, we will feature Everton (Tom) Zanella Alvarenga, Executive Director.

Open Knowledge Foundation Brazil is a newish Chapter. Tom will share his experiences growing a chapter and community in Brazil. We aim to connect you to community members around the world. We will also open up the conversation to all things Community. Share your best practices.

Join us on April 16, 2014 via G+

Take a CKAN Tour (April 23, 2014)

This week we will give an overview and tour of CKAN – the leading open source open data platform used by the national governments of the US, UK, Brazil, Canada, Australia, France, Germany, Austria and many more. This session will cover why data portals are useful, what they provide and showcase examples and best practices from CKAN’s varied user base! Our special guest is Irina Bolychevsky, Services Director (Open Knowledge Foundation).

Learn and share your CKAN stories on April 23, 2014

(Note: We will share more details about the April 30th Open Access session soon!)

Resources

The School of Data Journalism 2014!

Milena Marin - April 3, 2014 in Data Journalism, Events, Featured, School of Data


We’re really excited to announce this year’s edition of the School of Data Journalism, at the International Journalism Festival in Perugia, 30th April – 4th May.

It’s the third time we’ve run it (how time flies!), together with the European Journalism Centre, and it’s amazing seeing the progress that has been made since we started out. Data has become an increasingly crucial part of any journalist’s toolbox, and its rise is only set to continue. The Data Journalism Handbook, which was born at the first School of Data Journalism in Perugia, has become a go-to reference for all those looking to work with data in the news, a fantastic testament to the strength of the data journalism community.

As Antoine Laurent, Innovation Senior Project Manager at the EJC, said:

“This is really a must-attend event for anyone with an interest in data journalism. The previous years’ events have each proven to be watershed moments in the development of data journalism. The data revolution is making itself felt across the profession, offering new ways to tell stories and speak truth to power. Be part of the change.”

Here’s the press release about this year’s event – share it with anyone you think might be interested – and book your place now!


PRESS RELEASE FOR IMMEDIATE RELEASE

April 3rd, 2014

Europe’s Biggest Data Journalism Event Announced: the School of Data Journalism

The European Journalism Centre, Open Knowledge and the International Journalism Festival are pleased to announce the 3rd edition of Europe’s biggest data journalism event, the School of Data Journalism. The 2014 edition takes place in Perugia, Italy between 30th of April – 4th of May as part of the International Journalism Festival.

#ddjschool #ijf13

A team of about 25 expert panelists and instructors from New York Times, The Daily Mirror, Twitter, Ask Media, Knight-Mozilla and others will lead participants in a mix of discussions and hands-on sessions focusing on everything from cross-border data-driven investigative journalism, to emergency reporting and using spreadsheets, social media data, data visualisation and mapping techniques for journalism.

Entry to the School of Data Journalism panels and workshops is free. Last year’s edition featured a stellar team of panelists and instructors, attracted hundreds of journalists and was fully booked within a few days. The year before saw the launch of the seminal Data Journalism Handbook, which remains the go-to reference for practitioners in the field.

Antoine Laurent, Innovation Senior Project Manager at the EJC said:

“This is really a must-attend event for anyone with an interest in data journalism. The previous years’ events have each proven to be watershed moments in the development of data journalism. The data revolution is making itself felt across the profession, offering new ways to tell stories and speak truth to power. Be part of the change.”

Guido Romeo, Data and Business Editor at Wired Italy, said:

“I teach in several journalism schools in Italy. You won’t get this sort of exposure to such teachers and tools in any journalism school in Italy. They bring in the most avant garde people, and have a keen eye on what’s innovative and new. It has definitely helped me understand what others around the world in big newsrooms are doing, and, more importantly, how they are doing it.”

The full description and the (free) registration for the sessions can be found at http://datajournalismschool.net. You can also find all the details on the International Journalism Festival website: http://www.journalismfestival.com/programme/2014

ENDS

Contacts: Antoine Laurent, Innovation Senior Project Manager, European Journalism Centre: laurent@ejc.net Milena Marin, School of Data Programme Manager, Open Knowledge Foundation, milena.marin@okfn.org

Notes for editors

Website: http://datajournalismschool.net Hashtag: #DDJSCHOOL

The School of Data Journalism is part of the European Journalism Centre’s Data Driven Journalism initiative, which aims to enable more journalists, editors, news developers and designers to make better use of data and incorporate it further into their work. Started in 2010, the initiative also runs the website DataDrivenJournalism.net as well as the Doing Journalism with Data MOOC, and produced the acclaimed Data Journalism Handbook.

About the International Journalism Festival (www.journalismfestival.com) The International Journalism Festival is the largest media event in Europe. It is held every April in Perugia, Italy. The festival is free entry for all attendees for all sessions. It is an open invitation to listen to and network with the best of world journalism. The leitmotiv is one of informality and accessibility, designed to appeal to journalists, aspiring journalists and those interested in the role of the media in society. Simultaneous translation into English and Italian is provided.

About Open Knowledge (www.okfn.org) Open Knowledge, founded in 2004, is a worldwide network of people who are passionate about openness, using advocacy, technology and training to unlock information and turn it into insight and change. Our aim is to give everyone the power to use information and insight for good. Visit okfn.org to learn more about the Foundation and its major projects including SchoolOfData.org and OpenSpending.org.

About the European Journalism Centre (www.ejc.net) The European Journalism Centre is an independent, international, non-profit foundation dedicated to maintaining the highest standards in journalism in particular and the media in general. Founded in 1992 in Maastricht, the Netherlands, the EJC closely follows emerging trends in journalism and watchdogs the interplay between media economy and media culture. Each year it also hosts more than 1,000 journalists in seminars and briefings on European and international affairs.

Happy Spring Cleaning, Community Style

Heather Leson - April 1, 2014 in Community Stories, Events, Featured, Network, OKF Projects, OKFestival, Open Knowledge Foundation, Open Knowledge Foundation Local Groups, Our Work, Working Groups


Crazy about happy? Call it spring fever, but I am slightly addicted to the beautiful creativity of people around the world and their Happy videos (map). We are just one small corner of the Internet and want to connect you to Open Knowledge. To do this, we, your community managers, need to bring in the Happy. How can we connect you, meet your feedback, continue the spirit of global Open Data Day, and celebrate our upcoming 10 year anniversary as Open Knowledge? Tall order, but consider this.

Open Knowledge is a thriving network. We exist because of all of you and the incremental efforts each of you makes on a wide range of issues around the world. The way forward is to flip the community around. We will focus on connecting you to each other. Call it inspired by Happy or the Zooniverse mission, but we heard your input into the community survey and want to meet it.


So, here are 4 key ways we aim to connect you:

1. Community Tumblr

Greece, MENA, and Tanzania – these are just some of the locations of Open Knowledge Stories on the Community Tumblr. We know that many of you have stories to tell. Have something to say or share? Submit a story. Just one look at the recent WordPress post about 10 moments around the world gives me inspiration that the stories and impact exist; we just need to share more.

The Open Knowledge Community Tumblr

2. Wiki Reboot

As with every spring cleaning, you start by dusting a corner and end up at the store buying bookshelves and buckets of paint. The Open Knowledge wiki has long been ridden with spam and dust bunnies. We’ve given it a firm content kick to make it your space. We are inspired by the OpenStreetMap community wiki.

What next? Hop on over and create your Wiki User account – Tell us about yourself, See ways to Get Involved and Start Editing. We think that the wiki is the best way to get a global view of all things Open Knowledge and meet each other. Let’s make this our community hub.

3. Community Sessions

We have a core goal to connect you to each other. This April we are hosting a number of online community events to bring you together. Previously, we had great success with a number of online sessions around Open Data Day and OKFestival.

The Community Sessions can be in a number of forms: a scheduled IRC chat, a community Google hangout, a technical sprint or hackpad editathon. We are using the wiki to plan. All events will be announced on the blog and be listed in the main Open Knowledge events calendar.

Wiki planning for the Community Sessions:

The first session is Wednesday, April 2, 2014 at 14:30 UTC/10:30 ET. We will host an IRC chat all about the wiki. To join, hop onto irc.freenode.net #okfn. IRC is a free text-based chat service.

4. OKFestival

OKFestival is coming soon. You told us that events are one of the biggest ways that you feel connected to Open Knowledge. As you may know, there are regular online meetups for School of Data, CKAN and OpenSpending Communities. Events connect and converge all of us with location and ideas.

Are you planning your own events where you live or on a particular open topic? We can help in a few ways:

  • Let us know about the events you’re running! Let’s discover together how many people are joining Open Knowledge events all around the world!
  • Never organized an event before or curious to try a new type of gathering? Check out our Events Handbook for tips and tricks and contact our Events Team if you have questions or feedback about it.
  • Want to connect with other community members to talk about your events, share skills, create international series of events together? Ping our global mailing list!

Have some ideas on how we can bring on the happy more? Drop us a line on the okfn-discuss mailing list or reach out directly – heather DOT leson AT okfn DOT org.

(Photo by SpaceAgeBoy)
