Beware the dangers of bias in AI

“You don’t even like potatoes,” my dad said to me last Thanksgiving. His offhand comment was as annoying as it was inaccurate.

“Of course I like potatoes. Who the hell doesn’t like potatoes?” I shot back. Of course, at some point in my history, I might have indeed disliked potatoes. And he might have just gotten this perception of me stuck in his database. If that was the case, his database was way out-of-date and causing me a headache well into my adulthood.

Tech companies, including those in computer and network security, face a similar problem when they rely on machine learning to automate tasks. Every tech company worth knowing about has at least considered the benefits of machine learning, often referred to as artificial intelligence, if it isn’t already investing money in it directly.

It’s not hard to see why: The prize is a chance to get in on the ground floor of a technology that promises an edge in the business of machine prediction. The power of machine learning comes from the concept of a “neural network,” designed to mimic the function of the human brain. For some tasks, feeding data through a well-trained algorithm can yield more consistent and accurate results than even biological intelligence.

How machine learning works

When building artificial neural networks, training a testable “model” is the first and most crucial step in putting machine learning to use. A neural-network model is a digested map, a kind of “brain,” of interconnected artificial neurons. It is the product of a mathematical formula, or algorithm, that encodes the way a system turns the information it has seen into conclusions (i.e., my father’s perception that I don’t like potatoes).
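
To make the idea of a “model” concrete, here is a minimal sketch in Python: a single artificial neuron fitted, by repeated small nudges, to a handful of invented observations about whether someone liked potatoes at a given age. The data, numbers, and scenario are all made up for illustration; real networks chain thousands of these neurons together, but the principle is the same.

```python
import math

def sigmoid(z):
    # Squash any number into a value between 0 and 1 (a "confidence").
    return 1.0 / (1.0 + math.exp(-z))

# Invented observations: (age when observed, 1 = liked potatoes, 0 = did not).
data = [(4.0, 0), (6.0, 0), (18.0, 1), (24.0, 1), (30.0, 1)]

w, b, lr = 0.0, 0.0, 0.05            # weight, bias, learning rate
for _ in range(2000):                # repeated exposure to the same examples
    for age, liked in data:
        p = sigmoid(w * age + b)     # the model's current belief
        w -= lr * (p - liked) * age  # nudge the parameters toward the label
        b -= lr * (p - liked)

# The trained "model" is nothing more than the two fitted numbers w and b.
print(round(sigmoid(w * 35 + b), 2))  # near 1.0: it "believes" a 35-year-old likes potatoes
```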

A machine-learning algorithm can be trained in a similar way to the human brain: through repeated exposure to information. By painstakingly labeling thousands of examples of the pattern a developer wants to identify, we can teach a computing system to literally tell apples from oranges. And by changing the data flowing into the system, we can expect its assumptions to change in subtle or drastic ways.
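
As a toy illustration of both points, here is a hedged sketch, with invented numbers, of a “nearest centroid” classifier, one of the simplest learning algorithms there is. Trained on a varied set of apples and oranges, it calls a red apple an apple; trained on a data set that only ever saw green apples, the very same algorithm decides the same red apple must be an orange.

```python
def train(examples):
    """Average the feature vectors for each label -- that average is the model."""
    sums, counts = {}, {}
    for features, label in examples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in vec] for label, vec in sums.items()}

def predict(model, features):
    """Pick the label whose average example sits closest to the input."""
    def distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(centroid, features))
    return min(model, key=lambda label: distance(model[label]))

# Features are [redness, greenness]; all numbers are made up for illustration.
varied_fruit = [([0.9, 0.1], "apple"), ([0.8, 0.2], "apple"),
                ([0.5, 0.1], "orange"), ([0.55, 0.15], "orange")]
green_apples_only = [([0.1, 0.9], "apple"), ([0.2, 0.8], "apple"),
                     ([0.5, 0.1], "orange"), ([0.55, 0.15], "orange")]

red_apple = [0.85, 0.15]
print(predict(train(varied_fruit), red_apple))       # "apple"
print(predict(train(green_apples_only), red_apple))  # "orange" -- same fruit, new assumptions
```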

Bias finds its way into many corners of society, professional life, and scholarship. And the rise of algorithmic bias, combined with an increased reliance on neural networks, is cause for major concern. While others fear self-aware robots eradicating humanity, algorithmic bias poses a far more urgent and imminent threat to society.

Decades after Philip K. Dick warned us about the pitfalls of “precrime,” the realities of such a system are now frighteningly present. An algorithm is only as good as the data that it processes, so if a machine-learning algorithm is processing biased information, its results are unavoidably biased.

The concept of developing a neural network to help predict crime is no science-fiction hypothetical: Data analysis company Palantir, known for its antiterrorism work, did exactly this, in secret, with the New Orleans Police Department. And Palantir isn’t the only company trying to leverage this technology to earn law enforcement dollars.

PredPol, a Palantir competitor developing predictive-policing software, was found to be relying on “historical crime data” that “did not accurately predict future criminal activity,” according to a 2016 study. By processing biased data, it was “replicating systemic biases against overpoliced communities of color” that lead to crime and poverty in the first place.
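
You don’t need Palantir’s software to see how that happens. The toy simulation below is a deliberately crude sketch, not PredPol’s actual model: two districts experience an identical amount of crime, but district A starts with slightly more recorded incidents because it was historically patrolled more heavily. If patrols go wherever the records point, and incidents only enter the records where patrols are present, the initial skew snowballs.

```python
# Both districts experience the same number of incidents per day.
TRUE_DAILY_INCIDENTS = {"A": 10, "B": 10}

# But the historical record is already slightly skewed toward district A.
recorded = {"A": 12, "B": 10}

for day in range(30):
    # "Predictive" step: patrol wherever the data says crime happens.
    hot_spot = max(recorded, key=recorded.get)
    # Only incidents that a patrol is present to witness get recorded at all.
    recorded[hot_spot] += TRUE_DAILY_INCIDENTS[hot_spot]

print(recorded)  # {'A': 312, 'B': 10} -- a small skew in the history, amplified daily
```

Thirty iterations in, the data “proves” that district A is where the crime is, while district B’s incidents have all but vanished from the record.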

Detecting algorithmic bias before it has had a chance to become systemic is the best-case scenario, but we’re running a very serious risk of amplifying systemic biases that propagate poverty cycles. And yet the most dangerous biases are the ones we can’t account for, because not even we, as developers, fully understand them or how they shape our projects.
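
What does detecting algorithmic bias look like in practice? At its simplest, something like the audit sketched below: compare how often a model flags members of groups it has no business treating differently. The predictions, groups, and tolerance here are all hypothetical, and real audits use far richer fairness metrics, but even a check this crude can catch a skewed model before it ships.

```python
def positive_rate(predictions):
    """Fraction of cases the model flagged (1 = flagged, 0 = not flagged)."""
    return sum(predictions) / len(predictions)

# Hypothetical model outputs for two groups drawn from similar populations.
group_a = [1, 0, 1, 1, 0, 1, 1, 0]
group_b = [0, 0, 1, 0, 0, 0, 1, 0]

rate_a, rate_b = positive_rate(group_a), positive_rate(group_b)
print(rate_a, rate_b)  # 0.625 vs. 0.25

# A crude demographic-parity check with an arbitrary tolerance.
if abs(rate_a - rate_b) > 0.2:
    print("Warning: one group is flagged far more often -- investigate before deploying.")
```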

Machine-learning systems are quickly becoming a mainstay of our everyday lives, but like any system that users become invested in, they are all doomed to become legacy someday.

AI poses a unique challenge to the developers of the future because it can be difficult to unravel exactly how trained models make decisions. In simple cases, we can trace how the input is interpreted to give us the results we get, but when complex data sets are being combed through, evaluated, and judged, why a machine arrives at a particular result, or assumption, can be a bit of a mystery.
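
Part of the mystery is structural: even a tiny trained network is just a pile of fitted numbers. The sketch below, using random invented data and scikit-learn purely for convenience, trains a small network on a rule we already know and then counts its parameters; nothing in those hundreds of weights reads back as “feature 0 plus feature 3.”

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))             # 200 made-up examples, 10 features each
y = (X[:, 0] + X[:, 3] > 0).astype(int)    # a rule we happen to know in advance

clf = MLPClassifier(hidden_layer_sizes=(16, 16), max_iter=2000, random_state=0)
clf.fit(X, y)

# The "explanation" for any single prediction is smeared across all of these.
print(sum(w.size for w in clf.coefs_))     # several hundred learned weights
```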

It’s for this reason that even nuanced things like social interactions may end up influencing the most basic data sets and algorithms that go into our systems. What happens when the systems we use today to process complex data such as verbal communication, built with today’s biases, become legacy systems in 30 to 40 years? What will they look like?

These are complex problems. As we move into a more interconnected and machine-dependent world, it’s important that developers work hard to eliminate common prejudices, misconceptions, and biases from their algorithms and data, so that our young electronic intelligences don’t inherit our inhumane mistakes.