Three notable examples of AI bias

by Michael McKenna, Toptal 14 October 2019

In 2016, the World Economic Forum claimed we are experiencing the fourth wave of the Industrial Revolution: automation using cyber-physical systems. Key elements of this wave include machine intelligence, blockchain-based decentralized governance, and genome editing.

As has been the case with previous waves, these technologies reduce the need for human labor but pose new ethical challenges, especially for artificial intelligence developers and their clients.

Humans: the ultimate source of bias in machine learning

All models are made by humans and reflect human biases. Machine learning models can reflect the biases of organizational teams, of the designers in those teams, the data scientists who implement the models, and the data engineers that gather data. Naturally, they also reflect the bias inherent in the data itself. Just as we expect a level of trustworthiness from human decision-makers, we should expect and deliver a level of trustworthiness from our models.

A trustworthy model will still contain many biases because bias (in its broadest sense) is the backbone of machine learning. A breast cancer prediction model will correctly predict that patients with a history of breast cancer are biased towards a positive result. Depending on the design, it may learn that women are biased towards a positive result. The final model may have different levels of accuracy for women and men and be biased in that way. The key question to ask is not “Is my model biased?”, because the answer will always be yes.

Searching for better questions, the European Union High Level Expert Group on Artificial Intelligence has produced guidelines applicable to model building. In general, machine learning models should be:

  1. Lawful—respecting all applicable laws and regulations
  2. Ethical—respecting ethical principles and values
  3. Robust—both from a technical perspective while taking into account its social environment

Historical cases of AI bias

Below are three historical models with dubious trustworthiness, owing to AI bias that is unlawful, unethical, or un-robust. The first and most famous case, the COMPAS model, shows how even the simplest models can discriminate unethically according to race. The second case illustrates a flaw in most natural language processing (NLP) models: They are not robust to racial, sexual and other prejudices. The final case, the Allegheny Family Screening Tool, shows an example of a model fundamentally flawed by biased data, and some best practices in mitigating those flaws.

COMPAS

The canonical example of biased, untrustworthy AI is the COMPAS system, used in Florida and other states in the US. The COMPAS system used a regression model to predict whether or not a perpetrator was likely to recidivate. Though optimized for overall accuracy, the model predicted double the number of false positives for recidivism for African American ethnicities than for Caucasian ethnicities.

The COMPAS example shows how unwanted bias can creep into our models no matter how comfortable our methodology. From a technical perspective, the approach taken to COMPAS data was extremely ordinary, though the underlying survey data contained questions with questionable relevance. A small supervised model was trained on a dataset with a small number of features. (In my practice, I have followed a similar technical procedure dozens of times, as is likely the case for any data scientist or ML engineer.) Yet, ordinary design choices produced a model that contained unwanted, racially discriminatory bias.

The biggest issue in the COMPAS case was not with the simple model choice, or even that the data was flawed. Rather, the COMPAS team failed to consider that the domain (sentencing), the question (detecting recidivism), and the answers (recidivism scores) are known to involve disparities on racial, sexual, and other axes even when algorithms are not involved. Had the team looked for bias, they would have found it. With that awareness, the COMPAS team might have been able to test different approaches and recreate the model while adjusting for bias. This would have then worked to reduce unfair incarceration of African Americans, rather than exacerbating it.

Any NLP model pre-trained naïvely on Common Crawl, Google News, or any other corpus, since Word2Vec

Large, pre-trained models form the base for most NLP tasks. Unless these base models are specially designed to avoid bias along a particular axis, they are certain to be imbued with the inherent prejudices of the corpora they are trained with—for the same reason that these models work at all. The results of this bias, along racial and gendered lines, have been shown on Word2Vec and GloVe models trained on Common Crawl and Google News respectively. While contextual models such as BERT are the current state-of-the-art (rather than Word2Vec and GloVe), there is no evidence the corpora these models are trained on are any less discriminatory.

Although the best model architectures for any NLP problem are imbued with discriminatory sentiment, the solution is not to abandon pretrained models but rather to consider the particular domain in question, the problem statement, and the data in totality with the team. If an application is one where discriminatory prejudice by humans is known to play a significant part, developers should be aware that models are likely to perpetuate that discrimination.

Allegheny Family Screening Tool: unfairly biased, but well-designed and mitigated

In this final example, we discuss a model built from unfairly discriminatory data, but the unwanted bias is mitigated in several ways. The Allegheny Family Screening Tool is a model designed to assist humans in deciding whether a child should be removed from their family because of abusive circumstances. The tool was designed openly and transparently with public forums and opportunities to find flaws and inequities in the software.

The unwanted bias in the model stems from a public dataset that reflects broader societal prejudices. Middle- and upper-class families have a higher ability to “hide” abuse by using private health providers. Referrals to Allegheny County occur over three times as often for African-American and biracial families than white families. Commentators like Virginia Eubanks and Ellen Broad have claimed that data issues like these can only be fixed if society is fixed, a task beyond any single engineer.

In production, the county combats inequities in its model by using it only as an advisory tool for frontline workers, and designs training programs so that frontline workers are aware of the failings of the advisory model when they make their decisions. With new developments in debiasing algorithms, Allegheny County has new opportunities to mitigate latent bias in the model.

The development of the Allegheny tool has much to teach engineers about the limits of algorithms to overcome latent discrimination in data and the societal discrimination that underlies that data. It provides engineers and designers with an example of consultative model building which can mitigate the real-world impact of potential discriminatory bias in a model.

Avoiding and mitigating AI bias: key business awareness

Fortunately, there are some debiasing approaches and methods—many of which use the COMPAS dataset as a benchmark.

Improve diversity, mitigate diversity deficits

Maintaining diverse teams, both in terms of demographics and in terms of skillsets, is important for avoiding and mitigating unwanted AI bias. Despite continuous lip service paid to diversity by tech executives, women and people of color remain under-represented.

Various ML models perform poorer on statistical minorities within the AI industry itself, and the people to first notice these issues are users who are female and/or people of color. With more diversity in AI teams, issues around unwanted bias can be noticed and mitigated before release into production.

Be aware of proxies: removing protected class labels from a model may not work!

A common, naïve approach to removing bias related to protected classes (such as sex or race) from data is to delete the labels marking race or sex from the models. In many cases, this will not work, because the model can build up understandings of these protected classes from other labels, such as postal codes. The usual practice involves removing these labels as well, both to improve the results of the models in production but also due to legal requirements. The recent development of debiasing algorithms, which we will discuss below, represents a way to mitigate AI bias without removing labels.

Be aware of technical limitations

Even best practices in product design and model building will not be enough to remove the risks of unwanted bias, particularly in cases of biased data. It is important to recognize the limitations of our data, models, and technical solutions to bias, both for awareness’ sake, and so that human methods of limiting bias in machine learning such as human-in-the-loop can be considered.


This is an excerpt of a longer, in-depth study into AI bias published by Toptal, the recruitment website for the world’s most talented developers, data scientists and engineers.

Michael McKenna is a data scientist specializing in health and retail. He strives for ethical AI practice and is published in medical ethics journals.