Israel, Gaza and AI machines

For the past decade, side rooms at international legal conferences have hosted panel discussions on the introduction of AI software into military tools. The use of AI-powered drones in Afghanistan, Pakistan and elsewhere has spurred campaigns to ban “killer robots”. All of it rests on the idea that human decision-making must be kept in the loop, so that – even as technology makes warfare easier – a soldier with a moral conscience can ensure that human ethics and international law are still respected.

The explosive investigation published on Wednesday by +972 Magazine, an Israeli publication, could upend those debates for years to come. The report, based on interviews with six anonymous Israeli soldiers and intelligence officials, alleges that the Israeli military used artificial intelligence software to carry out assassinations not only of suspected militants but also of civilians in Gaza, on a scale so large, and so purposeful, that it could throw any claim by the Israeli military of adherence to international law out the window.

Among the most shocking elements of the allegations is that the killing was not fully delegated to AI. Instead, a great deal of human decision-making was involved. But the human decisions were geared towards maximizing killing and minimizing the “bottleneck” of ethics and law.

To summarize the allegations: the Israeli military allegedly used an internal AI-based program called Lavender to identify possible Hamas and Palestinian Islamic Jihad (PIJ) militants among the population of Gaza and mark them as targets for Israeli air strikes. In the first weeks of the war, when Palestinian casualties were greatest, the military “relied almost entirely on Lavender,” giving “broad approval to officers to adopt Lavender’s kill lists, without requiring thorough scrutiny of why the machine made those choices or examining the raw intelligence on which they were based”.

The raw intelligence consisted of a number of parameters drawn from Israel’s vast surveillance system in Gaza – including a person’s age, gender, mobile phone usage patterns, movement patterns, which WhatsApp groups they are in, known contacts, addresses and more – compiled into a rating from 1 to 100 representing the probability that the person is a militant. Characteristics of known Hamas and PIJ militants were fed into Lavender to train the software, which would then search for the same characteristics within the general Gaza population to help build the score. A high rating would make someone a target for assassination – with the threshold set by senior officers.

Four charges stand out in particular because of their dire implications under international law.

First, Lavender was allegedly used primarily to target suspected “junior” (ie, low-ranking) militants.

Second, human checks were minimal, with one officer estimating that they took about 20 seconds per target, and mostly just to confirm that the target was male (Hamas and PIJ have no women in their ranks).

Third, there was apparently a policy of trying to bomb junior targets in their family homes, even if their civilian family members were present, using a system called “Where’s Daddy?”, which would alert the military when the target reached the house. The software’s name is particularly sinister because it casts the target’s children as implied collateral damage. The +972 report states that so-called dumb bombs – unguided munitions – were used in these attacks rather than precision weapons, despite the greater collateral damage they cause, because precision weapons were considered too expensive to expend on such low-ranking targets.

And finally, the threshold for what the software considers a militant was allegedly changed to meet the “constant pressure to create more targets for assassination”. In other words, if Lavender did not generate enough targets, the rating threshold was allegedly lowered to draw more Gazans – perhaps people who met only a few of the criteria – into the kill net.

Every time a military seeks to kill someone, the customary international law of armed conflict (that is, the established, legally binding practice of what is and is not acceptable in war) applies two tests. The first is distinction – that is, you have to distinguish between what is civilian and what is a military objective. The second is precaution – you must take all feasible measures to avoid causing civilian deaths.

This does not mean that armies are prohibited from ever killing civilians. They are allowed to do so where it is necessary and unavoidable, in accordance with the principle known as “proportionality”.

The exact number of civilians who might be killed in a given military action has never been defined (and any military lawyer would tell you it would be naive to attempt to do so). But the guiding principle has always been, understandably, to minimize casualties. The greatest number of justifiable civilian deaths has been reserved for efforts to kill the highest-value targets, with the number decreasing as the target becomes less important. It is generally understood – including, previously, within the Israeli military’s own operations – that killing a mere foot soldier is not worth a single civilian life.

But the Israeli military’s reported use of Lavender turned this on its head in many ways. In the first weeks of the war, the military’s international law department pre-authorized the deaths of up to 15 civilians, even children, to eliminate any target tagged by the AI software – a number that would be unprecedented in Israeli operational procedure. One officer says the number went up and down over time – up when commanders felt not enough targets were being hit, and down when there was pressure (probably from the US) to keep civilian casualties to a minimum.

Again, the guiding principle of proportionality is to move towards zero civilian deaths based on the value of the target – not to modulate the number of acceptable civilian deaths in order to hit a certain number of targets.

The idea that junior militants were targeted in their own homes with unguided, high-casualty munitions (supposedly because this was the method most compatible with the way Israel’s surveillance system in Gaza operates) is particularly outrageous. If true, it would be proof that the Israeli military has not merely ignored the possibility of civilian casualties, but has actually institutionalized the killing of civilians alongside junior militants in its standard operating procedures.

The manner in which Lavender was allegedly used also fails the test of distinction and international law’s prohibition against “indiscriminate attacks” on multiple fronts. An indiscriminate attack, as defined in customary law, includes any attack that is “not directed at a specific military objective” or uses a method or means of warfare “such as to attack military objectives and civilians … without distinction.”

The +972 report paints a vivid picture of a program that flouts those rules. This doesn’t just include the use of “Where’s Daddy?” – a system that deliberately turned civilian family homes into kill zones, with dumb bombs then dropped on them – but also the occasional lowering of the rating threshold to make the killing even less discriminate. Two sources in the report said Lavender was trained in part on data gathered from public-sector employees in Gaza – such as civil protection workers, including police, firefighters and rescue workers – increasing the likelihood that a civilian would receive a higher rating.

On top of that, sources state that before Lavender was deployed, its accuracy in identifying whether someone actually fit the parameters given to it was only 90 percent; one out of every 10 people it marked did not meet the criteria at all. This was considered an acceptable margin of error.

The normal way to mitigate that kind of margin falls back on human decision-making; you would expect people to double-check the target list and ensure that the 10 percent becomes 0 percent, or at least as close to it as possible. But a claim that soldiers routinely conducted only brief checks – mostly to determine whether the target was male – would show that this was not the case.

If human soldiers can kill civilians, either on purpose or by mistake, and machines can kill civilians within the margin of error, then does the difference matter?

In theory, the use of artificial intelligence software in targeting should be a valuable tool for reducing the loss of civilian life. One of the soldiers interviewed by +972 neatly summarizes the reasoning: “I have much more faith in the statistical mechanism than a soldier who lost a friend two days ago.” Human beings can kill for emotional reasons, potentially with a much greater margin of error as a result. The idea of a drone or radio operator directing the attack from an operations room after verifying the data should provide some comfort.

But one of the most alarming aspects of delegating so much of the target incrimination and selection process to machines, many would say, is not the number of civilians who might be killed. Rather, it is the questions of responsibility and incentive that arise from it. A soldier who shoots indiscriminately can be investigated and tried, the motivation for his or her actions determined, and lessons learned from those actions. Indiscriminate killing by humans is seen as a flaw in the system that needs to be eradicated – even if the mission to do so in a time of war seems like a Sisyphean task.

A machine’s margin of error, on the other hand, is not ideal – but when operators perceive it as better than human error, it is not treated as an error at all. It becomes a feature. And that can create an incentive to trust the machine and to abdicate the human responsibility to minimize error – exactly the opposite of what the laws of war intend. The testimonies of Israeli officers to +972 provide a perfect illustration of an operating culture built on these perverse incentives.

That would be the charitable interpretation. The less charitable one is of an operational culture in which the decision-makers’ goal was killing on a large scale, with parameters designed only superficially to accommodate ethics and law, and tailored to serve that goal.

The question of which of these cultures is scarier is subjective. Less subjective is the criminality that colors them both.
