What is personal data?
To understand and apply the European regulation, it is important to identify the data concerned. According to the definition given by the Commission Nationale de l’Informatique et des Libertés (CNIL), personal data is information that directly or indirectly identifies a natural person.
A natural person can be identified:
- Directly, for example by their first and last names
- Indirectly, for example by their telephone number, the license plate of their car, but also by their voice or their image
This direct or indirect identification of a natural person can be carried out:
- From a single piece of data, such as a social security number
- By cross-referencing a set of data (sex, date of birth, address, products owned, etc.)
The GDPR applies to the personal data of natural persons:
- In their entirety, thus including professional data
- But only to natural persons: the data of legal entities is not covered
For all collectors and users of personal data, the implementation of the GDPR has two main consequences. The first is a strengthening of the protection of the personal data of the persons concerned. The second, which follows from the first, is the need for all data actors to adapt their internal procedures to comply with the regulation.
Any organization established on the territory of the European Union or directly targeting European residents must comply.
Evolution of the collection of this data
Now that the notion of personal data has been defined, what are the impacts when collecting it?
The GDPR allows the collection and processing of personal data when it is based on one of the following legal bases:
- consent: the person has consented to the processing of their data
- a contract: the processing is necessary for the performance or preparation of a contract with the data subject
- a legal obligation: the processing is imposed by legal texts
- a mission of public interest: the processing is necessary for the performance of a task carried out in the public interest
- a legitimate interest: the processing is necessary for the pursuit of the legitimate interests of the body processing the data or of a third party, in strict compliance with the rights and interests of the persons whose data are processed
- the safeguarding of vital interests: the processing is necessary to protect the vital interests of the data subject or of a third party
In addition, the acquisition, storage and use of this data must serve a precise and legitimate purpose, determined at the time of collection, and be strictly limited to what is necessary to achieve it. The duration of the processing and retention of such data must also be clearly defined at the time of collection.
Organizations collecting personal data are therefore adapting their strategies to comply with these legal requirements. For example, websites are obliged to inform users of the purpose of navigation cookies and to collect them only after obtaining free, specific, informed and unambiguous consent, while allowing users to configure these cookies.
All actors using personal data must therefore comply with the European regulation and often change their privacy policy.
However, practices for collecting and using personal data remain relatively opaque. Today, it is still difficult for the CNIL to assess the concrete effects of the GDPR on all the personal data collected.
In this context, data brokers, companies whose business is based on reselling data to advertisers or marketing service providers, struggle to demonstrate that they have obtained the free, specific, informed and unambiguous consent of Internet users before collecting and reselling their personal data. Moreover, their very activity of selling personal data may conflict with several articles of the European regulation, above all the principle of purpose limitation.
All personal data contributes to a data broker's business model since, by its very nature, it resells this type of information and profits from it. Demonstrating whether data coming from a data broker complies with the European regulation can therefore be complicated. Several cases illustrate these difficulties, with national supervisory authorities investigating some of these actors: in France an investigation has been opened into Criteo, in Ireland into Quantcast, and in the United Kingdom into Acxiom, Experian and Equifax.
Impact on the use of personal data
Once this data has been collected in compliance with the GDPR, how can it be used?
The European regulation imposes a new framework, resulting in more security for personal data and therefore more constraints for companies using this data.
It is important to understand that the sole purpose of the GDPR is to protect the people whose data is exploited, not to prevent the use of that data, let alone to stigmatize data analysis and its exploitation.
To comply with the law, organizations using personal data must first conduct a risk assessment of the data they hold (leakage or breach of protection). When this risk is high for the rights and freedoms of the people concerned, a Data Protection Impact Assessment (DPIA) must also be conducted. Organizations must then establish internal GDPR compliance files, containing the data processing register, the DPIA, the data breach register, the consent register, processor contracts and legal notices.
In addition to the rules at the time of collection, many other rules also apply when personal data is used. One way to enable the use of personal data in data science projects is to anonymize the data. Indeed, anonymized data falls outside the scope of the GDPR.
Today, several anonymization techniques exist (a minimal code sketch of each follows the list):
- Hashing, a function that calculates a unique fingerprint (or signature) from the data provided
- Aggregation, the grouping of data from several individuals (thus losing the personal character)
- Differential privacy, derived from cryptography, a technique based on the introduction of noise (random information within the data)
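As a purely illustrative sketch, the Python snippet below shows the intuition behind these three techniques; the toy dataset, the salt and the epsilon value are invented for the example, and a real project would rely on audited privacy libraries and a documented risk analysis.

```python
import hashlib
import random
from statistics import mean

# Hypothetical toy dataset used only for illustration.
records = [
    {"name": "Alice", "city": "Lyon", "age": 34},
    {"name": "Bob", "city": "Lyon", "age": 41},
    {"name": "Chloe", "city": "Paris", "age": 29},
]

# 1. Hashing: replace a direct identifier with a fixed-length fingerprint.
#    An unsalted or weakly salted hash can be reversed by brute force on
#    small domains, so hashing alone is often closer to pseudonymization
#    than to true anonymization.
def hash_identifier(value: str, salt: str = "example-salt") -> str:
    return hashlib.sha256((salt + value).encode("utf-8")).hexdigest()

hashed_records = [{**r, "name": hash_identifier(r["name"])} for r in records]

# 2. Aggregation: keep only group-level statistics and drop individual rows.
ages_by_city = {}
for r in records:
    ages_by_city.setdefault(r["city"], []).append(r["age"])
average_age_by_city = {city: mean(ages) for city, ages in ages_by_city.items()}

# 3. Differential privacy: add calibrated Laplace noise to an aggregate so
#    that the presence or absence of any single individual is masked.
#    The difference of two exponential draws with rate epsilon follows a
#    Laplace distribution with scale 1/epsilon (the sensitivity of a count is 1).
def noisy_count(true_count: int, epsilon: float = 1.0) -> float:
    noise = random.expovariate(epsilon) - random.expovariate(epsilon)
    return true_count + noise

print(hashed_records)
print(average_age_by_city)
print(noisy_count(len(records)))
```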
But these techniques lead to an inevitable loss of information, and anonymization is not 100% secure.
Netflix provided an illustration of this with its 2006 contest, in which the American firm challenged participants to predict the ratings users would give to certain films. Even though the company had published 100 million anonymized ratings, researchers at the University of Texas were able to re-identify users simply by cross-referencing the information with public data.
The EU regulation also addresses other techniques, such as pseudonymization.
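For illustration, a minimal pseudonymization sketch might look like the following, where direct identifiers are replaced by keyed tokens and the secret key is stored separately (the key name and customer record are invented). Unlike anonymized data, pseudonymized data remains personal data under the GDPR, since re-identification is possible for whoever holds the key.

```python
import hashlib
import hmac

# Assumption: the key is managed outside the dataset (e.g. in a key vault).
SECRET_KEY = b"stored-separately-from-the-data"

def pseudonymize(identifier: str) -> str:
    """Derive a stable token from an identifier using HMAC-SHA256."""
    return hmac.new(SECRET_KEY, identifier.encode("utf-8"), hashlib.sha256).hexdigest()

customer = {"email": "jane.doe@example.com", "purchases": 12}
pseudonymized_customer = {**customer, "email": pseudonymize(customer["email"])}
print(pseudonymized_customer)
```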
To enable teams of data scientists and data analysts to work serenely and efficiently with personal data, organizations often have to rethink their organization and implement ad hoc solutions. One solution is to centralize personal data in a tool that supports data governance and clearly flags which data is personal. Organizations must also develop work processes and train their employees in the use and documentation of personal data. Finally, they must ensure the traceability of the transformations and algorithms applied to personal data, to facilitate audits.
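As a rough sketch of these governance ideas (the field names, purposes and operations below are all invented), one can imagine a small catalog that flags personal-data fields, together with an audit trail recording every transformation applied to them:

```python
import datetime
import json

# Hypothetical catalog flagging which fields contain personal data.
CATALOG = {
    "customers.email":      {"personal_data": True,  "purpose": "order confirmation"},
    "customers.birth_date": {"personal_data": True,  "purpose": "age verification"},
    "orders.amount":        {"personal_data": False, "purpose": "sales reporting"},
}

AUDIT_TRAIL = []

def record_transformation(field: str, operation: str) -> None:
    """Log operations applied to personal-data fields to ease later audits."""
    if CATALOG.get(field, {}).get("personal_data"):
        AUDIT_TRAIL.append({
            "field": field,
            "operation": operation,
            "timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        })

record_transformation("customers.email", "pseudonymization (HMAC-SHA256)")
record_transformation("customers.birth_date", "aggregation into age brackets")
print(json.dumps(AUDIT_TRAIL, indent=2))
```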
The implementation of and compliance with the GDPR is often perceived as a constraint with no upside. Yet failure to comply can lead to significant penalties.
Sanctions in case of non-compliance
Depending on the seriousness of the facts, the penalties applied in case of non-compliance with the GDPR vary. When the supervisory authority opts for an administrative fine, the amount depends on the severity of the violation of the privacy of the persons concerned: up to 20 million euros or 4% of the group's annual worldwide turnover, whichever is higher. Criminal sanctions can be added to these administrative sanctions, with fines of up to 300,000 euros and five years' imprisonment.
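For illustration only, the ceiling on the administrative fine can be expressed as a one-line computation (the turnover figure below is invented):

```python
# Higher of 20 million euros or 4% of annual worldwide turnover.
def max_administrative_fine(annual_worldwide_turnover_eur: float) -> float:
    return max(20_000_000.0, 0.04 * annual_worldwide_turnover_eur)

# A hypothetical group with 2 billion euros of turnover risks up to 80 million euros.
print(max_administrative_fine(2_000_000_000))  # 80000000.0
```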
Among the companies already sanctioned, we can mention: Google, Facebook, Uber, Carrefour France and Carrefour Banque, Bouygues Telecom, Darty, Optical Center or the Office HLM of Rennes. Sanctions have been applied in all sectors of activity, the objective of the GDPR being to protect all personal data.
Any organization whose activity is closely or remotely related to this data must therefore comply with the regulation.
France holds the record penalty: 50 million euros, imposed by the CNIL on Google in early 2019. The sanction is not really a surprise, given that personal data protection measures were aimed from the outset at curbing the monetization of users' personal data, which underpinned the economic rise of the GAFAM (Google, Apple, Facebook, Amazon and Microsoft) and now the NATU (Netflix, Airbnb, Tesla and Uber).
Transformation of the Data professions
Since the implementation of the GDPR, ethics has become an even more essential value for data scientists and data analysts than before. It requires them to question the origin, purpose and usefulness of each piece of data. But this is often complicated by existing ways of working, which create little connection between data professionals and the teams who want to use this data.
The position of Chief Data Officer aims to fill this gap, but in a survey published earlier this year, only 28% of Chief Data Officers considered their role to be "beneficial and well-defined".
Today, teams of data scientists and data analysts are therefore struggling to navigate between the regulations and the need to process personal data. This ethical approach has become a central issue for companies and organizations.
Moving from constraint to opportunity
The implementation of the GDPR should above all be seen by organizations as an opportunity to initiate change. It should make it possible to review and centralize personal data processing workflows and to introduce new ways of working. The Data professions and their environment will continue to evolve significantly over the next few years, so it may be wise to capitalize on this momentum.
The public's increased awareness of personal data protection encourages organizations to communicate about their compliance and their commitment to respecting the privacy of their employees and customers. This can only reinforce trust within their ecosystem and benefit the recruitment and retention of both customers and employees.
References:
CNIL: cnil.fr
CNIL’s 2019 annual report
https://www.cnil.fr/sites/default/files/atoms/files/cnil-40e_rapport_annuel_2019.pdf
Survey on Chief Data Officers in France, published in 2020 by Data Galaxy