Machine learning, risks and benefits of automated decisions

It is increasingly common for companies, including insurers, to rely on “smart intermediaries” to support their decisions. How far can algorithms be pushed before they infringe the data and rights of individuals? The GDPR comes to the rescue.

Published on 13 Apr 2020

Will the online bank grant the loan? Will the insurance premium vary in line with driving behaviour? Who will be selected for the vacant position for which many people have applied online?

These three questions have one thing in common: the uncertainty of the answer. The digital processes of the organisations we deal with every day are now governed by “smart intermediaries”, which help to enhance human decision-making in many areas of individual and community life, reducing inefficiencies and errors. But what is the process that leads to these decisions, in which human intervention is ever less frequent?


The background

The first case involves no intelligence at all: it simply reproduces in digital form rules that originated in the analogue world. Let us assume that every person over the age of eighty is automatically refused life insurance. The process is elementary: a fixed (deterministic) rule is established to drive the decision automatically.
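The fixed rule in this example can be sketched in a few lines of Python. This is only an illustration: the `Applicant` type, the function name and the outcome labels are hypothetical, and only the age threshold comes from the example above.

```python
from dataclasses import dataclass

# Threshold taken from the example in the text: over eighty means refusal.
MAX_INSURABLE_AGE = 80


@dataclass
class Applicant:
    """Hypothetical applicant record, used only for illustration."""
    name: str
    age: int


def life_insurance_decision(applicant: Applicant) -> str:
    """Apply the fixed (deterministic) rule: no data analysis, no learning."""
    if applicant.age > MAX_INSURABLE_AGE:
        return "refused"
    return "eligible"


if __name__ == "__main__":
    print(life_insurance_decision(Applicant("Anna", 81)))  # refused
    print(life_insurance_decision(Applicant("Marco", 80)))  # eligible
```

The same input always produces the same outcome, which is what distinguishes this case from the profiling scenario.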

A second, increasingly common scenario involves a preliminary phase of data analysis, called profiling, which then leads to an automated decision.

Profiling is an automated activity that classifies individuals according to certain personal characteristics or behaviours. This operation, which means little on its own, makes it possible to derive (including from aggregated data) personal information that is useful for decisions the organisation will have to make at a later time.

An example to understand what profiling is 

An example will help us understand its scope. 

An insurance company decides to analyse the electronic payment data of all its customers to identify behaviour related to the accident rate. The analysis reveals that people who consume alcohol in bars at night are more likely to cause road accidents: the model has identified a relevant pattern and is able to recognise it even when completely new data is processed. Therefore, when the profile is applied to a single new customer who buys drinks in bars at night, the model will rate that person as more likely to cause a road accident.

Thus, by virtue of the information extracted, the insurance company will be able to customise the premiums of individual customers or offer them alternative products. Based on the result obtained from the model, a decision is then made, which in most cases takes place in a fully automated form. 
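The two steps described here, learning a profile from past data and then applying it to a new customer, can be sketched as follows. Everything in this sketch is an assumption made for illustration: the records in `HISTORY` are invented, the base premium is arbitrary, and a real insurer would use a proper statistical or ML model rather than two group averages.

```python
from collections import defaultdict

# Invented historical records: (buys_drinks_in_bars_at_night, had_accident).
HISTORY = [
    (True, True), (True, True), (True, False), (True, False),
    (False, True), (False, False), (False, False), (False, False),
]


def learn_profile(history):
    """Profiling step: estimate the accident rate of each behaviour group."""
    counts = defaultdict(lambda: [0, 0])  # group -> [accidents, customers]
    for night_bar, accident in history:
        counts[night_bar][0] += int(accident)
        counts[night_bar][1] += 1
    return {group: acc / total for group, (acc, total) in counts.items()}


def premium_for(night_bar: bool, profile, base_premium: float = 500.0) -> float:
    """Decision step: scale a hypothetical base premium by relative risk.

    Assumes the baseline group (no night-time bar purchases) has a
    non-zero accident rate, which holds for the invented data above.
    """
    baseline = profile[False]
    return round(base_premium * profile[night_bar] / baseline, 2)


if __name__ == "__main__":
    profile = learn_profile(HISTORY)
    print(profile[True], profile[False])  # 0.5 0.25
    print(premium_for(True, profile))     # 1000.0
    print(premium_for(False, profile))    # 500.0
```

Note that no human takes part between the payment data and the premium: this is the fully automated form the article refers to.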

Machine learning, profiling and risk of discrimination 

Among the many technological solutions available on the market today, Machine Learning (ML) seems to be the most effective tool to achieve what has been described so far. In short, since profiles are nothing but patterns resulting from probabilistic data processing, ML algorithms are particularly well suited to profiling activities.

But the undeniable benefits of this approach to data analysis must be carefully balanced against the need to protect the individuals to whom the data refer. Such practices can not only give rise to new forms of pervasive and constant control and monitoring of individuals, but can even aggravate prejudice and increase discrimination, leading in the worst cases to social exclusion or marginalisation. All this presents particular risks for individuals, especially considering the complexity and the often inevitable “opacity” of how these tools work.

The anti-discrimination protections of the GDPR

In the face of the considerable increase in risks outlined above, the European legal system has equipped itself with tools to contain the negative effects that these technologies could have on individuals. 

The new European Data Protection Regulation (Regulation (EU) 2016/679, the GDPR, directly applicable from 25 May 2018) continues to protect sensitive information and information systems from hacker attacks, but adds much more. One of the main aims of the GDPR is to counter the potentially discriminatory effects that algorithms and automated decisions may have on individuals.

In particular, Article 22 states that natural persons have the right not to be subject to fully automated decisions that produce legal effects concerning them or similarly significantly affect them.

The exceptions 

This general rule – derived with considerable improvements from the previous Directive 95/46/EC – is then mitigated by three exceptions (legal bases legitimising the processing) where automated decisions are instead allowed. 

The first exception relates to cases where “the decision is necessary for the performance of a contract”. The typical example is a company that receives a large number of applications for a vacant position and therefore uses automated decision-making to draw up a shortlist of candidates, with the intention of concluding a contract with one (or some) of them. To meet the more general requirement of necessity, the decision-making process should be the least privacy-invasive way of concluding that contract. In most cases, however, the decision facilitates or simply enables the conclusion of the contract without being strictly necessary. Since the line between necessity and facilitation is very thin, a restrictive interpretation of the rule would leave organisations little room to rely on this exception, and therefore on fully automated decision-making.

Yet this approach to data analysis may even reduce the privacy-intrusive nature of the underlying process. Thanks to feature selection (the phase in which a subset of variables that are relevant and significant for building the model is chosen), the machine can reach a decision using less data than a human being would have to process to make the same decision. This is precisely why automated decisions can, in some cases, be less invasive than those made by humans.
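To make the idea of feature selection concrete, here is a minimal sketch of one common family of techniques, a correlation-based filter: each candidate variable is scored by the absolute value of its Pearson correlation with the outcome, and only the top `k` are kept. The candidate data, the variable names and the choice of this particular filter are assumptions for illustration; the article does not say which selection method is used in practice.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        return 0.0  # a constant column carries no signal
    return cov / (spread_x * spread_y)


def select_features(columns, target, k):
    """Keep the k variables most strongly correlated with the target."""
    scores = {name: abs(pearson(values, target)) for name, values in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Invented data about six past candidates (hypothetical variable names).
CANDIDATES = {
    "years_experience": [1, 3, 5, 7, 9, 11],
    "test_score": [50, 55, 70, 65, 85, 90],
    "postcode_digit": [3, 9, 1, 7, 1, 3],
}
SHORTLISTED = [0, 0, 1, 1, 1, 1]  # invented outcome: 1 = shortlisted

if __name__ == "__main__":
    print(select_features(CANDIDATES, SHORTLISTED, k=2))
```

With this invented data the postcode digit is discarded, so the model never needs to process it: fewer personal data feed the decision, which is the privacy argument made above.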

The second case where automated decision-making processes can be used concerns situations where these techniques are ‘authorised under EU or domestic laws’. For example, the Federal Data Protection Act, which adapts German national legislation to the GDPR, has established that the right not to be subject to fully automated decisions does not apply in the context of insurance relationships. Without addressing the substance of this provision, an option still little used by European legislators, it is important to note that a legislative measure providing for automated decision-making processes could encourage significant technical innovation in the sector concerned.

The third and final exception concerns cases where ‘the decision is based on the explicit consent of the person concerned’. For consent to be valid under the GDPR, it must be “freely given, specific, informed and unambiguous”. In the context of this analysis, the information requirement is particularly complex: the person concerned (the data subject) must be able to genuinely understand the processing that will be carried out automatically. This means that the party processing the data (the data controller) must give the data subject sufficient relevant information about the logic used to reach the decision and about the effects of the processing. This requirement is not difficult to meet if the ML algorithms used are ‘white-box’. In that case the data processing can be explained to and understood by the data subject, and the decision-making process can be even more clearly traceable than one carried out by a human being. “Black-box” techniques, by contrast, are particularly problematic. These high-performing tools typically transform the variables in ways that even the best analyst, or the developer themselves, can neither understand nor explain.
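As an illustration of why a ‘white-box’ model is easier to explain, consider a simple linear scoring model, in which each variable contributes a visible, additive amount to the result. The weights, variable names and baseline below are invented for the sketch and do not come from any real insurer.

```python
# Invented weights of a hypothetical linear ("white-box") risk model.
WEIGHTS = {
    "years_licensed": -2.0,
    "claims_last_5y": 15.0,
    "annual_km_thousands": 1.5,
}
INTERCEPT = 40.0  # hypothetical baseline score


def score_with_explanation(features):
    """Return the risk score and the contribution of each variable to it."""
    contributions = {name: WEIGHTS[name] * features[name] for name in WEIGHTS}
    score = INTERCEPT + sum(contributions.values())
    return score, contributions


def explain(features) -> str:
    """Build a plain-language summary that could be shown to the data subject."""
    score, contributions = score_with_explanation(features)
    lines = [f"Risk score: {score:.1f} (baseline {INTERCEPT:.1f})"]
    # List the variables with the largest impact first.
    for name, value in sorted(contributions.items(), key=lambda kv: -abs(kv[1])):
        lines.append(f"  {name} = {features[name]}: {value:+.1f}")
    return "\n".join(lines)


if __name__ == "__main__":
    customer = {"years_licensed": 10, "claims_last_5y": 1, "annual_km_thousands": 12}
    print(explain(customer))
```

Every point of the score can be traced back to a specific variable. A black-box model offers no such direct decomposition, which is why informing the data subject takes more effort there.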

In these cases, relying on consent for automated decisions will require an additional effort from the data controller, in line with the principle of accountability, to make all the necessary information available to the data subject. This will serve to prove that, despite the technological complexity, the data subject made an informed choice when giving consent. The importance of this requirement is borne out by Eurobarometer survey results, which show a general increase in user awareness and a growing focus on understanding how data are processed automatically, so that truly informed decisions can be made.

The disruptive potential of machine learning techniques is undeniable. By drastically reducing costs and inefficiencies, better results can be achieved and innovative products and services can be designed. Using these technologies in a “GDPR-compliant” way is certainly complex, but not impossible.

The three grounds for lawfulness discussed above must be carefully assessed by any organisation that decides to introduce automated decision-making processes. In any case, data controllers must pay attention to their technological choices and to the correct implementation of the principles of lawfulness, fairness and transparency and of data quality set out in Art. 5 of the Regulation, and must carefully analyse the risks that could arise. There are no prohibitions as such, only rules to protect individuals and manage the risks that could arise from the indiscriminate use of these technologies.
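The assessment logic described in this article (a general rule plus three exceptions) can be summarised in a short sketch. It is a deliberate simplification of Article 22 as presented here: the field names and result strings are hypothetical, and a real assessment involves legal judgement that a few boolean flags cannot capture.

```python
from dataclasses import dataclass


@dataclass
class DecisionProcess:
    """Hypothetical description of a planned decision-making process."""
    fully_automated: bool
    legal_or_similar_effect: bool
    necessary_for_contract: bool = False
    authorised_by_law: bool = False
    explicit_consent: bool = False


def article_22_assessment(process: DecisionProcess) -> str:
    """Simplified check: general rule first, then the three exceptions."""
    if not (process.fully_automated and process.legal_or_similar_effect):
        return "outside the general rule"
    exceptions = [
        ("contract", process.necessary_for_contract),
        ("law", process.authorised_by_law),
        ("consent", process.explicit_consent),
    ]
    applicable = [name for name, applies in exceptions if applies]
    if applicable:
        return "allowed under: " + ", ".join(applicable)
    return "not allowed: no exception applies"


if __name__ == "__main__":
    pricing = DecisionProcess(fully_automated=True, legal_or_similar_effect=True)
    print(article_22_assessment(pricing))  # not allowed: no exception applies
    pricing.explicit_consent = True
    print(article_22_assessment(pricing))  # allowed under: consent
```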

Preventing the development of these techniques would simply be counterproductive in the wider context of the European Digital Single Market, whose main challenges will be played out precisely in the field of automation.

In conclusion, organisations that can master these rules and respond to these risks in an “agile” way will gain considerable competitive advantages in an increasingly digital, interconnected, autonomous and robotic environment. The ultimate goal must not be simply compliance in a narrow sense, but better product design, a better user experience and, ultimately, greater customer confidence as a key driver of digital markets.

(Giulia Del Gamba – Stefano Leucci, originally published in Agenda Digitale) 

All rights reserved
