If AI is biased, how should we use it?
Even in the unprecedented time of the COVID-19 pandemic, the headlines are inundated with ways that artificial intelligence is being used to assist coronavirus patients and perhaps even prevent future pandemics.
Not far from these auspicious articles are claims that reveal a darker side to artificial intelligence. Reports about algorithms in some airport body scanners that may discriminate against different hair types, or a former Amazon recruiting tool that treated men and women unequally, highlight the ramifications of biased computational decisions in our daily lives.
In public dialogue, the algorithms are often the scapegoat. Surely, if an algorithm acts in a way that society has deemed unacceptable, it must be the result of machine error.
In reality, machine error is only part of the equation. Human bias also plays a key role. At the root of artificial intelligence and data science is raw data. Data, containing variables that can be used to predict another target variable, is fed into these tools to train them. The data that goes into machines can be tainted with human bias. For example, with the Amazon recruiting tool, the ten years’ worth of information that was used to train the AI reflected the company’s bias toward hiring males. As a result, the recruiting tool penalized women and selected men more frequently. Consequently, algorithms can reinforce human biases, contributing to a feedback loop that intensifies inequalities.
Not all cases of machine bias are as simple as reflecting human bias. Data can also underrepresent groups, portray complex information in a binary manner, and miss critical confounding variables, among many more issues that can contribute to bias. In the book Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy, Cathy O’Neil points to evidence that shows how machines can exacerbate biases across class, gender, race, location, and many other factors.
Computer scientists and statisticians are currently working to better understand bias. This involves modeling how bias can arise from data and identifying seemingly paradoxical types of bias. Scientists have also found that removing the features most directly tied to bias, such as race and gender, does not guarantee that bias will not arise later in the process, because other variables in the data can act as proxies for the ones removed. In the case of the TSA airport scanners, race may not have been a variable considered by the algorithm, yet women of color end up being searched more frequently.
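The point about removed features can be made concrete with a toy example. The sketch below is not drawn from any real system: the groups, the `neighborhood` feature, and every count are invented for illustration. A simple rule is "trained" on past decisions using only the neighborhood, never the group, and still ends up selecting the two groups at very different rates because neighborhood and group are correlated.

```python
from collections import defaultdict

def build_records():
    """Hypothetical past decisions as (group, neighborhood, hired) tuples."""
    # (group, neighborhood, number hired, number rejected) -- invented counts
    spec = [
        ("A", "north", 6, 2),
        ("A", "south", 1, 1),
        ("B", "north", 1, 1),
        ("B", "south", 2, 6),
    ]
    records = []
    for group, hood, hired, rejected in spec:
        records += [(group, hood, True)] * hired
        records += [(group, hood, False)] * rejected
    return records

def fit_rate_by_neighborhood(records):
    """'Train' on neighborhood alone; the group column is never read."""
    counts = defaultdict(lambda: [0, 0])  # neighborhood -> [hired, total]
    for _group, hood, hired in records:
        counts[hood][0] += int(hired)
        counts[hood][1] += 1
    return {hood: h / t for hood, (h, t) in counts.items()}

def predict(rates, hood, threshold=0.5):
    """Select a candidate if their neighborhood's past hire rate is high."""
    return rates[hood] >= threshold

def selection_rate_by_group(records, rates):
    """Audit step: group is used only here, to measure the outcome."""
    counts = defaultdict(lambda: [0, 0])  # group -> [selected, total]
    for group, hood, _hired in records:
        counts[group][0] += int(predict(rates, hood))
        counts[group][1] += 1
    return {group: s / t for group, (s, t) in counts.items()}

records = build_records()
rates = fit_rate_by_neighborhood(records)
group_rates = selection_rate_by_group(records, rates)
print(group_rates)
```

With these made-up numbers, the rule selects 80 percent of group A and 20 percent of group B, even though group membership was never an input. Real models are far more complex, but the mechanism is the same: a correlated variable carries the information that was supposedly removed.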
However, technical issues cannot be fixed overnight, and new issues regarding bias in computer science will continue to arise. Emphasis is often placed, for example by the U.S. federal government, on researchers designing better systems to solve the problem of bias. Few consider the important role of the non-technical institutions that use these systems. Accordingly, there must be heightened accountability for those who use these tools. As O’Neil points out in her book, it is not only the researchers who must learn.
As banks, governments, schools, healthcare providers, and other organizations become more eager to use artificial intelligence, they must consider how to account for the potential biases of the data science and AI tools they adopt. These institutions must also be responsible for using these tools equitably. Ultimately, this will also improve their use of data science, help avoid high-profile scandals, and identify technical oversights that could affect profits.
Responsibility requires looking at what happens before and after the algorithms are created and tested. This centers on two key ideas. The first is considering the structure of the problem that the science is applied to. The second is reexamining how results are reported to the user and the public once the algorithm has been created and applied to that problem.
Structuring the Problem
Even before raw data is input into an algorithm, an algorithm can be doomed to be biased if it is solving a poorly structured problem. To properly structure a problem, one must consider the social complexities of the available data, as well as how the final product will be applied to the problem.
The book Data Feminism, by Catherine D’Ignazio and Lauren F. Klein, explores this concept rigorously by looking beyond the numbers and emphasizing the context in which data is collected.
D’Ignazio, an assistant professor at MIT, expanded on this topic by explaining some of her concerns about bias in artificial intelligence. One major problem is the lack of involvement of the communities that the AI aims to impact.
“Whose problems [are being addressed]? And, who is doing the work of the data science? There’s often a gap between how one community would solve a problem as opposed to a group that is very far away,” says D’Ignazio.
This gap can lead institutions to overlook the needs and priorities of the group the AI is intended to serve. To alleviate this issue, D’Ignazio suggests designing models and algorithms in a participatory manner, where there are many opportunities for the involvement of the groups of interest. This will help structure problems in a way that will minimize the potential biases of the AI. For example, if a sample of air travelers that included racial minorities had been consulted before the algorithm for airport body scanners was created, concerns about the algorithm’s response to different hair types might have been raised and then addressed during its development.
A prime example of improving the success of AI through participation can be found in the SAFElab at Columbia University, directed by Desmond Patton. His team investigated the relationship between social media and gang violence in Chicago to identify tweets that may indicate future gun violence. To effectively understand the terminology in the tweets, SAFElab worked with specialists who were previously involved with gangs to label tweets, and the resulting data were used to train models.
Racial literacy is another concept that emphasizes the necessity of context in properly framing problems. Racial literacy in technology is built on improving one’s “intellectual understanding” of structural racism, “emotional intelligence,” and “commitment to take action to reduce harms to communities of color,” according to Data & Society’s Fellowship Program. Considering these concepts can help institutions better structure problems when utilizing AI and data science in situations that will impact individuals of color. Structuring problems with these concepts in mind will help identify cyclical patterns of oppression and provide ways to actively avoid them. In this manner, Data & Society’s Fellowship Program argues that “racial literacy offers a way to break free from old patterns.”
A deep understanding of context will help institutions better address the ramifications of using AI to solve problems. For example, before deciding to select jurors with AI, it is critical to consider the history of jury discrimination in the United States and how it contributes to incarceration.
Logistically, encouraging participation and developing context to better structure how AI is applied to problems may seem like a challenge at a large scale.
D’Ignazio acknowledges this. She says “all participation of everyone is not equal. Prioritize the people who are most marginalized and disproportionately having these issues.”
The practice of involving those most affected when structuring problems aligns with the motto of the disability rights movement, D’Ignazio points out: “nothing about us, without us.” As more organizations move to utilize artificial intelligence, they need to embrace this motto to ensure the equitable and successful use of these powerful tools. Sometimes this may mean temporarily not using AI, until a better solution can be devised.
To understand the context of problems and incorporate the perspectives needed to use AI properly, significant human infrastructure and ethical navigation are required going forward.
Reporting the Findings
Once results come out of an algorithm, the information will be used. Many conclusions from data analysis will be incorporated into legislation, company agendas, or academic research. In these different scenarios, numbers are not self-explanatory and must be understood in context.
Here, the issue of sensationalized reporting becomes part of the problem of bias in artificial intelligence.
It is no secret that we live in an era of disinformation, and technology headlines can be particularly sensational. Articles with titles like “AI Is Inventing Languages Humans Can’t Understand. Should We Stop It?” rip readers out of their seats, but ultimately mislead.
When this same sensationalism is applied to articles dealing with race, gender, ethnicity, and other societal categorizations, bias can be reinforced with misconstrued data.
For example, Cornell University researchers investigated racial bias in several Twitter datasets annotated for abusive language and hate speech. They found that all the datasets had racial bias because algorithms classified tweets by Black individuals as hate speech at a higher frequency than tweets by white individuals. Frequent use of the n-word and Black dialect differences, not necessarily used in hate speech, contributed to tweets by Black individuals being flagged as abusive at higher rates. The group further concluded that if these datasets are used to take action against hate speech on Twitter, they may result in more penalties against African-Americans.
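One simple way to see the kind of disparity described here is to compare, for each group, how often harmless tweets get flagged. The sketch below is not the researchers' method and uses no real data: the group names, the labels, and all counts are hypothetical, chosen only to show how such a comparison could be computed.

```python
from collections import defaultdict

# Hypothetical labeled tweets: (group, truly_abusive, flagged_by_classifier).
# Every count below is invented for illustration.
EXAMPLES = (
    [("group_a", False, True)] * 4
    + [("group_a", False, False)] * 6
    + [("group_a", True, True)] * 2
    + [("group_b", False, True)] * 1
    + [("group_b", False, False)] * 9
    + [("group_b", True, True)] * 2
)

def false_positive_rates(examples):
    """Share of non-abusive tweets that were flagged anyway, per group."""
    counts = defaultdict(lambda: [0, 0])  # group -> [flagged, non-abusive total]
    for group, truly_abusive, flagged in examples:
        if truly_abusive:
            continue  # only harmless tweets count toward false positives
        counts[group][0] += int(flagged)
        counts[group][1] += 1
    return {group: f / n for group, (f, n) in counts.items()}

fpr = false_positive_rates(EXAMPLES)
for group, rate in sorted(fpr.items()):
    print(f"{group}: {rate:.0%} of non-abusive tweets flagged")
```

In this invented sample, 40 percent of group_a's harmless tweets are flagged compared with 10 percent of group_b's. A gap like that says something about the classifier and its training data, not about how abusive either group actually is.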
The conclusions of this research were completely disregarded by the Pluralist’s headline: “Colleges Create AI to Identify ‘Hate Speech’ – Turns Out Minorities Are the Worst Offenders.” Sensational headlines about AI, like this, offer erroneous conclusions and perpetuate bias because readers have a greater tendency to believe the results from a machine—a phenomenon known as automation bias.
It is critical to clearly state the limitations of research when reporting the results of machine learning and data analytics. Just as a product may include a disclaimer for appropriate product use, it is an important part of reporting to state the limitations of data findings from AI.
In mainstream media, the allure of shock must be replaced with a more authentic description of the research. Journalists must become more cognizant of AI research practices and hold themselves accountable for reporting the nuanced conclusions described in research publications. Of course, it is also the responsibility of the reader to approach such reports with appropriate levels of skepticism.
As we move forward in our digital age, the role of humans in technology becomes even more important. Institutions, from banking to media, must properly structure their problems and responsibly report their findings if we are to use AI in an equitable manner.