Computer says what? Why algos need a human touch

24 August 2017

The robots are coming. But, for now, there’s still a place for people in the brave new world of machine learning. 

Ask any decent Head of Innovation what they think is the next big thing and chances are they’ll single out decision-making by computers through algorithms and machine learning. You will be told that excluding humans will drive efficiency, speed and consistency. What’s not to like? Well, consider the much-reported ejection of a passenger from a United Airlines flight.

Needing to find room for four flight crew, the airline ran an algorithm that instructed staff to offer $800 to any passenger prepared to leave the aircraft. None chose to do so. The next step for the algorithm was to select for ejection the four least valuable passengers, based on parameters such as cabin class, loyalty points and ticket price. Whilst rational on the face of it, enforcing the algorithm’s calculation led to a catastrophically bad episode for the airline and one of its passengers.
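To make the shape of that decision concrete, here is a minimal, purely hypothetical sketch in Python of the kind of ranking the reports describe. The field names, weights and the very idea of a single ‘value score’ are assumptions for illustration only; the airline’s actual system has not been published.

```python
# Hypothetical sketch only -- the airline's real system is not public.
# It illustrates the kind of ranking described above: score each passenger
# on cabin class, loyalty points and ticket price, then pick the
# lowest-scoring four for involuntary removal.
from dataclasses import dataclass

CABIN_WEIGHT = {"first": 3.0, "business": 2.0, "economy": 1.0}  # assumed weights

@dataclass
class Passenger:
    name: str
    cabin: str            # "first", "business" or "economy"
    loyalty_points: int
    ticket_price: float

def value_score(p: Passenger) -> float:
    # Assumed weighting; the real parameters are unknown.
    return CABIN_WEIGHT[p.cabin] * 1000 + p.loyalty_points + p.ticket_price

def select_for_removal(passengers, seats_needed=4):
    # Rank everyone by "value" and take the lowest-scoring few: rational on
    # paper, blind to everything the score does not capture.
    return sorted(passengers, key=value_score)[:seats_needed]
```

The sketch is deliberately crude, because that is the point: the logic is internally consistent yet says nothing about what should happen when the people it selects refuse to comply.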

An algorithm is just a financial model by another name.

Decision-making by machines is not new because, let’s be clear, an algorithm is just a financial model by another name. And once you grasp that algorithm risk is the same as model risk, the dangers of over-reliance on machines are thrown into sharp relief. Let us not forget that the models on which Long-Term Capital Management built its reputation were, in the end, the root of its failure.

As well as making the connection between algorithms and models, we should also remember that machine learning is not the same as human learning. Machines can only learn from what they are given and, like children, don’t yet do nuance or context.

This is aptly displayed by Fei-Fei Li, Associate Professor at Stanford University, who has made huge strides in AI research by teaching computers to recognise the contents of pictures. She points out that there is more to a picture than a description of what it physically contains:

"But there's so much more to this picture than just a person and a cake. What the computer doesn't see is that this is a special Italian cake that's only served during Easter time. The boy is wearing his favourite t-shirt given to him as a gift by his father after a trip to Sydney, and you and I can all tell how happy he is and what's exactly on his mind at that moment."

But while computers fall short of humans in their ability to comprehend nuance, we humans have our own limitations. The truth is that in order to understand the United Airlines debacle you must recognise that it had two principal causes, neither of which was the fault of the algorithm.

Firstly, it was humans who told the algorithm to calculate a financial offer to tempt people to leave the aircraft. It was also humans who decided that the algorithm should not raise the offer in increments until enough people volunteered. Instead, if the financial incentive was unsuccessful, the humans told the algorithm to work out the least valuable passengers on the flight and direct their ejection.

This led to the second cause: there was no concept amongst the staff involved that if the algorithm’s outcome was bad they could override it. The algorithm said do this and they did; the matter escalated from cabin crew to security and finally to airport police.
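For contrast, a second hypothetical sketch shows the two ingredients the episode suggests were missing: an offer that escalates in increments, and an explicit human override once the algorithm runs out of road. The $800 starting offer comes from the reports above; the step size and cap are invented for illustration.

```python
# Hypothetical alternative design, for illustration only.

def seek_volunteers(ask_for_volunteers, offer=800, step=200, cap=2000,
                    seats_needed=4):
    """ask_for_volunteers(offer) -> number of passengers accepting at that offer."""
    volunteers = 0
    while volunteers < seats_needed and offer <= cap:
        volunteers += ask_for_volunteers(offer)
        offer += step  # raise the incentive in increments instead of jumping to ejection
    return volunteers

def resolve_shortfall(volunteers, seats_needed, human_decides):
    # The algorithm only recommends; staff keep the authority to override it.
    if volunteers >= seats_needed:
        return "resolved by volunteers"
    return human_decides("offer cap reached with seats still needed")
```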

Ultimately, the magic of machine learning and algorithms boils down to the assumptions humans make in the design phase.

Ultimately, the magic of machine learning and algorithms boils down to the assumptions humans make in the design phase. This takes us straight to the age-old model problem for boards and regulators: oversight of the design, development, validation and management of models.

The problem the board faces is not ‘do we know how the model works?’ but rather ‘what is the business outcome?’ Had the design been described to the airline’s board, and had one of the directors asked ‘what happens if passengers still refuse to leave the aircraft?’, the likely answer would have been ‘we don’t know’.

It seems inconceivable that there was a deliberate design to physically eject a passenger. It’s much more likely that no-one involved in the design of the algorithm had thought a passenger might still refuse to leave. It may prove to be the case that had flight crew been involved in the design they would have spotted this flaw, dealing as they do with the vagaries of human nature on a daily basis.

As well as being blind to the unintended consequences of their output, algorithms also present the risk of inadvertent discrimination. As algorithms drive an increasing number of business decisions, this risk increases. For example, an algorithm that does not schedule parcel delivery in certain suburbs because of safety concerns for delivery crew is rational, but becomes contentious if the suburbs concerned are predominantly non-white.

This risk is also evidenced in the discovery that women were less likely to be shown high-paying jobs on search engines. No-one could work out why, because of the many ways ads can be driven to users. The fault could have lain with the hirer or been buried somewhere in the selection of algorithms that drove the display. Whatever the cause, the outcome was discriminatory.

Adding to the discrimination risk is that algorithm-driven solutions for consumers are more likely, for reasons of cost-effectiveness, to be aimed at lower-net-worth individuals. The so-called ‘investment advice gap’ only exists for those with less to invest. The answer is seen to be robo-advice: the provision of algorithm-driven advice delivered at low cost. It is too easy to say that low net worth equates to vulnerability, but there is a definite link, and thus a disproportionate number of vulnerable consumers may well have their long-term savings decided by algorithms. Meanwhile, higher-net-worth individuals will continue to benefit from the contextual, nuanced, personal advice and support of a human.

The one risk shared by human and robo-advisers is that of providing poor advice. There is ample history of human sales forces mis-selling because of wrongly set incentives. Is it inconceivable that robo-advisers will fall, or be pushed, into the same trap? After all, the algorithms driving the robo-advice are created, at least at first, by human authors. The first question the authors of the code have to answer is: what are we aiming to achieve here? The danger is that a misstep by the designers leads to quick and comprehensive mis-selling and/or discrimination.

Driving business decisions on automatic pilot is not reserved to big business. Big data, available for purchase, can now be manipulated economically by massive computing power, which itself can be purchased on demand. All of this means small groups can develop powerful solutions that can be made available to consumers on a widespread basis.

To FinTech challengers, the thought of an algorithm-led business based on machine learning, with none of the legacy costs of incumbents, is very seductive. The algorithm becomes the thing on which any self-respecting challenger bases its business model.

However, the attendant model risk adds to the risk profile of new businesses, which by their very nature are more vulnerable than incumbents. Business models may prove unviable, management may be less experienced and risk and control functions less developed. A further issue for regulators, especially in financial services, is that these new firms are likely to view themselves as technology companies. Thus the usual control frameworks of risk appetites and lines of defence may well be absent or underdeveloped.

None of this is to argue against the need for innovation, but it is to argue for a risk-based approach. Firms and regulators need to be alive to the fact that algorithms are models, and that means they are subject to model risk. In my view, the use of algorithms means Model Risk Management (MRM) should be implemented to ensure proper documentation, good oversight and challenge of the basic assumptions, with clear processes that deliver expert and independent validation.

Furthermore, the nature of financial services is that products sold today may not crystallise risks or become issues for several years. This passage of time makes it difficult to show why a decision was made, as mis-sold PPI and endowment mortgages have revealed. Evidencing why an algorithm makes a recommendation is not natural territory for model makers, but the accuracy of the record will nonetheless be a determining factor in any subsequent dispute.

As the use of algorithms spreads, and machine learning becomes more widely implemented, let’s remember three things:

  1. an algorithm is a model by another name, treat it as such
  2. the real world outcome is the only true measure of an algorithm or model
  3. never forget the human override

Innovation is the lifeblood of our economy in whatever sector, but so is effective risk management. Unless we want to replicate United Airlines’ travails, firms and regulators need to acknowledge that the two must go hand in hand.

 
