By Timo Wilken, Telefónica NEXT
The General Data Protection Regulation (GDPR), which has applied since 25 May 2018, has led to a high level of privacy protection for citizens throughout Europe, which is a welcome change. At the same time, some companies are still unsure about how to apply the new legal situation correctly in their daily business with personal data, especially since the possible fines for data protection violations have risen significantly. Big data anonymization is paving the way for a solution.
Anonymization can be a new way to comply with the GDPR where personally identifiable information is not essential.
In addition to the GDPR already in force, a European ePrivacy Regulation is also imminent. It will replace the previous ePrivacy Directive and its national transposition laws and regulate the area of electronic communications anew, again with significantly higher fines.
Is personally identifiable information really needed?
For companies, however, this change offers a great opportunity to re-evaluate their processing of personal data. Are there use cases where personal information can be disregarded? Can data also be useful in anonymous form? Telefónica NEXT, for example, already uses anonymous data from the mobile network of its parent company Telefónica Germany to support transport planning and brick-and-mortar retail in city centres.
This is because information can still represent significant value without knowing to whom it refers. Often the behaviour of individuals is not relevant; what matters is statistically significant group behaviour. Innovative technical solutions like big data anonymization can help to master this challenge.
Anonymous data can be analyzed without prior opt-in
The advantage of processing anonymized data is obvious: according to recital 26 of the GDPR, data protection legislation does not apply to anonymous information, including personal data "rendered anonymous in such a manner that the data subject is not or no longer identifiable".
In determining whether a natural person is identifiable, recital 26 states that "account should be taken of all the means reasonably likely to be used, such as singling out, either by the controller or by another person to identify the natural person directly or indirectly". All objective factors, such as the cost and time required for identification, must be considered, taking into account the technology and means available at the time of processing.
What is big data anonymization in the sense of GDPR?
The GDPR therefore does not require absolute anonymization, but de facto anonymization in accordance with the provisions of recital 26. What exactly does this mean? In the case of lawfully processed data "generated" in the course of regular business activities, so many characteristics must be removed, modified or aggregated that identification according to these requirements is no longer possible.
At the same time, the person responsible should ensure that as little information as possible is lost, so that the data remain as valuable as possible. Finding the right balance here is a complex task. For this reason, a data processor should also check to what extent the chosen procedures really ensure that individual persons can no longer be identified. In many cases, it is not sufficient to simply remove identification features such as names or addresses. Successful big data anonymization must take context into account.
Combine data from different sources and stay anonymous
Combining different data snippets, none of which contains identification features on its own, can still lead to a personal reference. De-anonymization using other data sources is called a linkage attack - a fate that a well-known video streaming provider suffered after the release of seemingly anonymous movie ratings based on 500,000 user reviews. By linking them with public online ratings, researchers were able to re-introduce personally identifiable information.
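The mechanics of such a linkage attack can be sketched in a few lines of Python. This is a toy illustration only: the records, the field names and the `link` function are hypothetical and are not taken from the incident described above. It assumes that the released data and a public source share a few attributes (movie, rating, date) that act as a fingerprint.

```python
from collections import Counter

# Hypothetical "anonymous" release: pseudonymous user IDs, no names.
released = [
    {"user": "u1", "movie": "Film A", "rating": 5, "date": "2006-03-01"},
    {"user": "u1", "movie": "Film B", "rating": 2, "date": "2006-03-04"},
    {"user": "u2", "movie": "Film A", "rating": 3, "date": "2006-03-02"},
    {"user": "u2", "movie": "Film C", "rating": 4, "date": "2006-03-09"},
]

# Hypothetical public ratings posted under a real name.
public = [
    {"name": "Alice", "movie": "Film A", "rating": 5, "date": "2006-03-01"},
    {"name": "Alice", "movie": "Film B", "rating": 2, "date": "2006-03-04"},
]


def link(released, public, min_matches=2):
    """Map pseudonymous IDs to names when enough ratings coincide."""
    def fingerprint(row):
        return (row["movie"], row["rating"], row["date"])

    # Index the released rows by their shared attributes.
    users_by_fingerprint = {}
    for row in released:
        users_by_fingerprint.setdefault(fingerprint(row), set()).add(row["user"])

    # Count how often each (pseudonym, name) pair coincides.
    matches = Counter()
    for row in public:
        for user in users_by_fingerprint.get(fingerprint(row), ()):
            matches[(user, row["name"])] += 1

    return {user: name for (user, name), n in matches.items() if n >= min_matches}


print(link(released, public))  # {'u1': 'Alice'}
```

Neither dataset contains a name next to a pseudonym, yet two coinciding ratings are enough to connect them in this toy example.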
When combining anonymous data from different sources, specialized anonymization solutions like the Data Anonymization Platform can ensure that personal references are not re-created accidentally or by linkage attacks through context or third-party data. Picture: ivanovgood via Pixabay
For example, the anonymous job title "veterinarian" in combination with an equally anonymous city name could lead to the identification of a person: if only one veterinarian lives in this place, that person can easily be looked up in the phone directory. In this case, the GDPR would apply again, with all its rights and obligations.
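A simple way to spot such cases is to count how many records share each combination of attributes. The following minimal sketch uses made-up records and a hypothetical helper, `groups_below_k`; it flags every combination that occurs fewer than k times, which is the basic idea behind a k-anonymity check.

```python
from collections import Counter


def groups_below_k(records, quasi_identifiers, k):
    """Return attribute combinations shared by fewer than k records."""
    counts = Counter(
        tuple(record[q] for q in quasi_identifiers) for record in records
    )
    return {combo: n for combo, n in counts.items() if n < k}


# Made-up example data: no names, only job title and city.
records = [
    {"job": "veterinarian", "city": "Smallville"},
    {"job": "teacher", "city": "Smallville"},
    {"job": "teacher", "city": "Smallville"},
    {"job": "veterinarian", "city": "Metropolis"},
    {"job": "veterinarian", "city": "Metropolis"},
]

risky = groups_below_k(records, ["job", "city"], k=2)
print(risky)  # {('veterinarian', 'Smallville'): 1}
```

A flagged combination would then need to be generalized (for example, city replaced by region) or suppressed before release.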
Complex measures are needed to maintain anonymization
Therefore, sufficient technical and organisational measures such as separation and role concepts, encryption and key management, application of k-anonymity and/or differential privacy and much more are needed to ensure end-to-end anonymization in the sense of recital 26 of the GDPR.
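To give a feel for one of these measures, the sketch below shows the textbook Laplace mechanism from differential privacy applied to a simple count. It is an illustration under stated assumptions, not a description of how any particular platform works: the function name `noisy_count` and the parameter values are hypothetical, and a counting query is assumed to change by at most 1 when one person is added or removed.

```python
import random


def noisy_count(true_count, epsilon, rng=random):
    """Add Laplace noise with scale 1/epsilon to a count (sensitivity 1)."""
    scale = 1.0 / epsilon
    # The difference of two independent exponential samples with mean
    # `scale` follows a Laplace distribution centred on 0 with that scale.
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return true_count + noise


# Example: publish how many devices were seen in an area, with noise.
rng = random.Random(42)  # seeded only to make the example repeatable
print(noisy_count(1280, epsilon=0.5, rng=rng))
```

A smaller epsilon means more noise and stronger protection, at the cost of less accurate results, which mirrors the balance between privacy and data value discussed above.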
Telefónica NEXT has solved this challenge by developing the Data Anonymization Platform (DAP) for big data anonymization of mobile network data in close cooperation with the Federal Commissioner for Data Protection and Freedom of Information (BfDI), the highest German authority for data protection. The result is a complex anonymization procedure that has been successfully certified by TÜV Saarland with the "Geprüfter Datenschutz" seal of approval.
Data Anonymization Platform takes context into account
The patented platform also makes it possible to analyze data flexibly according to the issue at hand and to incorporate other internal and external data sources into the respective analysis while maintaining anonymity.
Once factual anonymity has been successfully established, the data can provide added value. The anonymized transaction data from Telefónica Deutschland's mobile communications are of great value to cities and transport companies, for example. Based on the data, they can understand where and how people move and stay in order to better adapt future infrastructure measures to demand. Anonymous data from the Connected Car could come into play here, too.
Anonymous data create value for the public transport sector
The brick-and-mortar retail sector can use anonymized data based on Wi-Fi signals or mobile phones to better understand how groups of consumers move in and around shops. These insights help retailers to better tailor their offerings to the needs of their customers - and thus compete more effectively with online retailers.
This is just a small sample of the ways in which big data anonymization can support new data-driven business models. Therefore, Telefónica NEXT has decided to make its patented Data Anonymization Platform available to other companies and industries. After all, beneficial data are not limited to the telecommunications sector, but can also support the automotive industry, banks and the healthcare system.
Big data anonymization will be key to solving numerous challenges of the data economy.
Timo Wilken is Data Protection Officer of Telefónica NEXT and advises on data protection in the development of new data-based products and business models. After completing his law studies at the University of Hamburg and a legal traineeship in Berlin and Tel Aviv, he was appointed data protection officer for the market research institutes of the German KANTAR Group and held the position for several years.