This post has been reposted from the Open Knowledge Foundation Germany blog.
In early 2018 the Open Knowledge Foundation Germany (OKFDE) and AlgorithmWatch launched the project OpenSCHUFA, which works on reverse-engineering the algorithms of the Schufa, Germany’s credit rating system. This week the first analyses of the OpenSchufa dataset are being published. The data teams and editorial offices of Bayerischer Rundfunk and SpiegelOnline have evaluated the anonymous data that has been collected since this spring with the help of our “mydata” project OpenSchufa.
In the last 10 months this project generated 100,000 individual data requests in Germany. Out of these, more than 30,000 were directed at Schufa and resulted in more than 3,000 data donations to us. Many thanks to all the people who donated money, time and especially their data and supported this project with other resources!
These are the most important findings:
- Bad scores even without negative characteristics
Many people have bad Schufa scores although they have no negative characteristics. Our data indicates that the Schufa lists some people as “higher risk” even when it holds no negative data on them. This means that the Schufa algorithm is apparently error-prone: if people who have no debts or defaults get bad scores, the scoring procedure is broken.
- Allegedly accurate scores despite inaccurate data
The Schufa scores give the public the impression that they are the product of a rigorous scientific method. Part of this image comes from the apparent precision of scores such as 85.04% or 97.41%. However, this precision is misleading. The Schufa often lacks the data to make reliable statements about the creditworthiness of individuals. For almost a quarter of the people in our dataset, the Schufa holds at most three data points. In these cases, the score is not particularly trustworthy.
- Factors: Age, sex and moves
The OpenSchufa dataset suggests that factors such as age, gender and frequent changes of address affect the Schufa score. For example, young men often receive worse scores. This means that even aspects that people cannot influence could cause negative Schufa scores. At the moment, the data does not allow us to say exactly how these factors affect a given score, or whether the Schufa includes them individually in the calculation or combines them. It is very possible that the scores discriminate.
- Some scores are outdated
In many areas, the Schufa holds several versions of a score for the same scoring area on an individual. As a result, people may, for example, have a worse score under version 1 of the Schufa Bank Score than under version 2 or version 3. In such a case, those unlucky enough to have a bank request an older score version from the Schufa are at a disadvantage. The fact that the older score versions are still being issued apparently leads to distortions.
These findings lead to the following demands:
Thanks to OpenSchufa, the German Advisory Council for Consumer Affairs (SVRV) at the Federal Ministry of Justice and Consumer Protection has already written a paper on scoring transparency.
The SVRV advocated that the Schufa and other scoring providers disclose their algorithms. The characteristics and weighting behind the scores must become understandable for consumers. As the results of OpenSchufa also show, possible discrimination must be examined and disclosed. A central problem at Schufa is obviously the quality of the score and of the data on which it is based.
Further reporting by Bayerischer Rundfunk has already shown that the supervision of Schufa and other scoring providers is inadequate. The Schufa itself pays for the reports that are supposed to review it independently.
The Federal Ministry has already announced that it will examine the Council’s recommendations. In addition to providing transparency, Schufa should also accept its responsibility in society. This includes cooperating constructively with researchers, journalists and civil society. So far, the Schufa press office has attracted attention primarily because it intimidates journalists.
And what about the Schufa algorithm?
We are currently working on reliably deciphering various aspects of the Schufa formula. The challenge: of around 30,000 data access requests that users have sent to Schufa via selbstauskunft.net, only around 3,000 data records have been forwarded to us. Nevertheless, we are trying to make further reliable statements about the Schufa algorithm and continue to work with the dataset.
Originally, we had planned to issue targeted calls to specific population groups in order to obtain data from them if the dataset turned out to be skewed. However, this is not possible at present. Since the General Data Protection Regulation (GDPR) became applicable in May, Schufa has given individuals significantly less data than before. Data donations based on Schufa disclosures issued since May are therefore not usable for us.
Together with our partner AlgorithmWatch we continue to work on the evaluation of the data and hope to be able to derive further insights from it soon. Afterwards we want to make further recommendations for legal regulation.
Also important: Schufa currently still refuses to provide free information by e-mail, although the GDPR obliges them to. We will work to ensure that Schufa complies with this obligation.
The Schufa is the beginning, but not the end. We need more transparency for all scoring providers in Germany and Europe.
Schufa reporting on the 28th of November (in German)
- SpiegelOnline: Blackbox Schufa
- SpiegelOnline: 2800 Datensätze – so haben wir sie ausgewertet
- SpiegelOnline: So bestellen Sie Ihre kostenlose Schufa-Auskunft
- SpiegelOnline: Video – Wie die Schufa zu ihrem Urteil kommt
- Bayerischer Rundfunk: Die ganze Recherche – Erhöhtes Risiko
- Bayerischer Rundfunk: Schufa-Score – Wie Menschen unverschuldet zum Risikofall werden
- ARD PlusMinus: Blackbox Schufa: Eine exklusive Datenauswertung gibt Einblicke
- ZDF Zoom: Unheimliche Macht – Wie Algorithmen unser Leben bestimmen
- tagesschau.de: Unverschuldet als Risikofall eingestuft?
- Deutschlandfunk: Durch die Schufa unverschuldet zum Risikofall
For further inquiries
Walter Palmetshofer, Open Knowledge Foundation Deutschland, firstname.lastname@example.org, +49 30 57703666 0
Walter is part of the team of Open Knowledge Foundation Germany (www.okfn.de). He is an economist by training and has been involved in the domain of Netzpolitik for many years. He is project lead of the EU research project Open Data Incubator (ODINE), the Digitaler Offenheitsindex [do:index], and supervises the Open Data Census. He worked as a system administrator in NYC, before moving to Berlin, where he co-founded a start-up in 2012.