STEPHEN Hawking warned a few years back that Artificial Intelligence (AI) could spell the end of the human race, a threat echoed by Elon Musk, the founder of Tesla, Bill Gates, and many others. This is the apocalyptic vision in which the robots we create may decide we are an obsolete model deserving “retirement”. The truth, a more banal one, is that we are letting machine algorithms take over what were earlier human decisions: the decisions of governments, of businesses and even our own. Today, algorithms decide who should get a job, which part of a city needs to be developed, who should get into a college, and in the case of a crime, what the sentence should be. It is not the super intelligence of robots that is the threat to life as we know it, but machines taking over thousands of decisions that are critical to our lives and deciding social outcomes.
What are these decisions, and how are machines making them?
Suppose you apply for a loan. The huge amount of financial data you create – credit card transactions, banking transactions, ATM withdrawals – is accessed and processed by computer algorithms. With Aadhaar linked to your bank accounts and PAN card, almost every transaction you make over a certain amount in India is available for such processing. This past data is stored forever – it is cheaper to store all the data than to selectively keep some and delete the rest. All of it is processed by the algorithms to produce a single score of your creditworthiness. Based on this final score, you are ranked and the decision to give you a loan is taken.
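To make the idea of a single score concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the applicant fields, the weights and the base score are hypothetical, and no real lender's formula is implied. It only shows how a pile of transactions can be collapsed into one number and a ranking.

```python
from dataclasses import dataclass


@dataclass
class Applicant:
    name: str
    monthly_income: float   # average amount credited per month (hypothetical field)
    monthly_spend: float    # average card and ATM outflow per month (hypothetical field)
    missed_payments: int    # late or missed repayments on record (hypothetical field)


# Hypothetical weights, chosen only for illustration.
BASE_SCORE = 500.0
WEIGHT_SAVINGS_RATE = 400.0
WEIGHT_MISSED_PAYMENT = -60.0


def credit_score(a: Applicant) -> float:
    """Reduce an applicant's financial history to a single number."""
    savings_rate = (a.monthly_income - a.monthly_spend) / a.monthly_income
    return (BASE_SCORE
            + WEIGHT_SAVINGS_RATE * savings_rate
            + WEIGHT_MISSED_PAYMENT * a.missed_payments)


def rank(applicants):
    """Order applicants from highest to lowest score."""
    return sorted(applicants, key=credit_score, reverse=True)


applicants = [
    Applicant("A", 50000, 40000, 0),
    Applicant("B", 20000, 19000, 2),
    Applicant("C", 80000, 40000, 1),
]
ranked = rank(applicants)
for a in ranked:
    print(a.name, round(credit_score(a)))
```

Nothing about who the applicant is, or what the loan is for, enters the calculation; only the fields the designer chose to count.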
The issue here is a simple one. What decides whether you get a loan is finally a machine score – not who you are, what you have achieved, or how important your work is for the country (or society); for the machine, you are just the sum of all your transactions, to be processed and reduced to a simple number.
So how does the machine do it? It is the algorithms that we write for the machines that take our data and provide a score for the decision. Those who are scored have no control over the decisions. There is no appeal against such decisions. The algorithms are supposedly intellectual property and therefore jealously guarded. The worst part is that some of the algorithms we write are not even understandable to those who have written them. Even the creators of such algorithms do not know how a particular algorithm came out with a specific score!
Cathy O'Neil, a mathematician and data scientist, tells us in a recent book, Weapons of Math Destruction, that the apparent objectivity of algorithms processing huge amounts of data is false. The algorithms themselves are nothing but our biases and subjectivity, coded – “They are just opinions coded into maths”.
What happens when we transform the huge data that we create through our everyday digital footprints into machine “opinions” or “decisions”? Google served ads for high-paying jobs disproportionately to men; African Americans got longer sentences as they were flagged as high risk for repeat offences by a judicial risk assessment algorithm. It did not explicitly use the race of the offender, but used where they lived, information about other family members, education and income to work out the risk, all of which, put together, acted as a proxy for race.
One of the worst examples of such scoring in the US was in calculating insurance premiums for drivers. A researcher found that low-income drivers with no drunken driving cases were asked to pay a higher premium than high-income drivers with drunken driving cases. This was an obvious anomaly – drinking and driving is likely to be repeated and is an obvious risk for accidents. The algorithm had “correctly” deduced that low-income persons had less capacity to shop around for insurance than high-income ones, and could therefore be rooked for a higher premium. The algorithm was not looking for safe drivers, but maximising the profit of the insurance company.
The problem is not just the subjective biases of the people who code the algorithms, or the goal of the algorithm; it goes much deeper. It lies in the data and in the so-called predictive models we build using this data. Such data and models simply reflect the objective reality of the high degree of inequality that exists within society, and replicate it in the future through their predictions.
What are predictive models? Simply put, we use the past to predict the future. We use the vast amount of data that is available to create models that correlate the “desired” output with a series of input data. The output could be a credit score, the chance of doing well in a university, a job and so on. Some specific output variables are selected as indicators of success, and the past data of people who have been “successful” by these measures is correlated with their social and economic data. This correlation is then used to rank any new candidate in terms of chances of success based on her or his profile. To use an analogy, predictive models are like driving a car while looking only at the rear-view mirror.
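The rear-view mirror can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how any real admissions or hiring system works: the profiles and the past records are invented, and the “model” is nothing more than the historical success rate of each profile, which is about the simplest predictive model one can build.

```python
from collections import defaultdict

# Hypothetical past records: (profile, succeeded). Invented for illustration.
past_records = [
    (("urban", "graduate_parents"), True),
    (("urban", "graduate_parents"), True),
    (("urban", "graduate_parents"), False),
    (("rural", "first_generation"), False),
    (("rural", "first_generation"), False),
    (("rural", "first_generation"), True),
    (("rural", "first_generation"), False),
]


def fit(records):
    """Learn the past success rate of each profile."""
    counts = defaultdict(lambda: [0, 0])  # profile -> [successes, total]
    for profile, succeeded in records:
        counts[profile][0] += int(succeeded)
        counts[profile][1] += 1
    return {profile: s / n for profile, (s, n) in counts.items()}


def predict(model, profile):
    """Score a new candidate purely by how people like them fared before.

    Returns None for a profile that never appeared in the past data.
    """
    return model.get(profile)


model = fit(past_records)
print(predict(model, ("urban", "graduate_parents")))   # past rate for this profile
print(predict(model, ("rural", "first_generation")))   # past rate for this profile
print(predict(model, ("urban", "first_generation")))   # never seen before
```

A new candidate is ranked on what others with the same profile did, not on anything she or he has done.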
A score for success, be it a job, admission to a university, or a prison sentence, reflects the existing inequality of society in some form. An African American in the US, or a dalit or a Muslim in India, does not have to be identified by race, caste or religion. The data of her or his social transactions are already prejudiced and biased. Any scoring algorithm will end up with a score that will predict their future success based on which groups are successful today. The danger of these models is that race or caste or creed may not exist explicitly as data, but a whole host of other data exist that act as proxies for these “variables”.
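How a proxy works can be shown with a toy example in Python. The records below are invented, and the group and neighbourhood labels are hypothetical; the point is only that a score which never sees the group column can still split people along group lines when where they live is correlated with the group.

```python
from collections import defaultdict

# Hypothetical records: (group, neighbourhood, repaid_loan). Invented for illustration.
people = [
    ("G1", "north", True), ("G1", "north", True),
    ("G1", "north", True), ("G1", "south", False),
    ("G2", "south", False), ("G2", "south", True),
    ("G2", "south", False), ("G2", "north", True),
]


def fit_group_blind(rows):
    """Learn repayment rates per neighbourhood; the group column is never read."""
    counts = defaultdict(lambda: [0, 0])  # neighbourhood -> [repaid, total]
    for _group, neighbourhood, repaid in rows:
        counts[neighbourhood][0] += int(repaid)
        counts[neighbourhood][1] += 1
    return {n: r / t for n, (r, t) in counts.items()}


def average_score_by_group(rows, model):
    """Audit step: compare the group-blind scores across the hidden groups."""
    totals = defaultdict(lambda: [0.0, 0])  # group -> [sum of scores, count]
    for group, neighbourhood, _repaid in rows:
        totals[group][0] += model[neighbourhood]
        totals[group][1] += 1
    return {g: s / c for g, (s, c) in totals.items()}


model = fit_group_blind(people)
averages = average_score_by_group(people, model)
print(model)      # score attached to each neighbourhood
print(averages)   # average score each group ends up with
```

The group label is used only in the audit at the end, never in the scoring, yet the two groups come out with clearly different average scores.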
Such predictive models are biased not only by the opinions of those who create them, but also by the inherent nature of all predictive models: they cannot predict what they do not see. They end up trying to replicate what they see as having succeeded in the past. They are inherently a conservative force, trying to replicate the existing inequalities of society. And like all technology, they amplify what exists today over what is new in the world.
After the meltdown of the financial markets in 2008, two modellers, Emanuel Derman and Paul Wilmott, wrote The Financial Modelers' Manifesto. Modelled on the lines of the Communist Manifesto, it starts with “A spectre is haunting Markets – the spectre of illiquidity, frozen credit, and the failure of financial models.” It ends with the Modellers' Hippocratic Oath:
~ I will remember that I didn't make the world, and it doesn't satisfy my equations.
~ Though I will use models boldly to estimate value, I will not be overly impressed by mathematics.
~ I will never sacrifice reality for elegance without explaining why I have done so.
~ Nor will I give the people who use my model false comfort about its accuracy. Instead, I will make explicit its assumptions and oversights.
~ I understand that my work may have enormous effects on society and the economy, many of them beyond my comprehension.
The AI community is waking up to the dangers of such models taking over the world. Some of these are even violations of constitutional guarantees against discrimination. There are now discussions of creating an Algorithm Safety Board in the US, so that algorithms can be made transparent and accountable. We should know what is being coded, and if required, find out why the algorithm came out with a certain decision: the algorithms should be auditable. It is no longer enough to say, “the computer did it”.
Similarly, in 2017, a group of AI researchers met in Asilomar and drew up a set of principles for what the goals of AI should be – common good, no arms race using AI, shared prosperity. While not as strong as the Asilomar principles drawn up in 1975, which barred certain kinds of recombinant DNA technology at that time, they still articulate clearly that AI needs to be developed for public good and be socially regulated. It is our humanity that is at stake, not the end of the human race as Hawking or Musk fears.