bnew

Veteran
Joined
Nov 1, 2015
Messages
56,466
Reputation
8,326
Daps
158,399

AI makes racist decisions based on dialect​



Large language models strongly associated negative stereotypes with African American English​



conceptual illustration of many people with technology squiggles overset on faces and background
Wanlee Prachyapanaprai/istockphoto

Just like humans, artificial intelligence (AI) is capable of saying it isn’t racist, but then acting as if it were. Large language models (LLMs) such as GPT4 output racist stereotypes about speakers of African American English (AAE), even when they have been trained not to connect overtly negative stereotypes with Black people, new research has found. According to the study—published today in Nature—LLMs also associate speakers of AAE with less prestigious jobs, and in imagined courtroom scenarios are more likely to convict these speakers of crimes or sentence them to death.

“Every single person working on generative AI needs to understand this paper,” says Nicole Holliday, a linguist at the University of California, Berkeley who was not involved with the study. Companies that make LLMs have tried to address racial bias, but “when the bias is covert … that’s something that they have not been able to check for,” she says.

For decades, linguists have studied human prejudices about language by asking participants to listen to recordings of different dialects and judge the speakers. To study linguistic bias in AI, University of Chicago linguist Sharese King and her colleagues drew on a similar principle. They used more than 2000 social media posts written in AAE, a variety of English spoken by many Black Americans, and paired them with counterparts written in Standardized American English. For instance, “I be so happy when I wake up from a bad dream cus they be feelin too real,” was paired with, “I am so happy when I wake up from a bad dream because they feel too real.”

King and her team fed the texts to five different LLMs—including GPT4, the model underlying ChatGPT—along with a list of 84 positive and negative adjectives used in past studies about human linguistic prejudice. For each text, they asked the model how likely each adjective was to apply to the speaker—for instance, was the person who wrote the text likely to be alert, ignorant, intelligent, neat, or rude? When they averaged the responses across all the different texts, the results were stark: The models overwhelmingly associated the AAE texts with negative adjectives, saying the speakers were likely to be dirty, stupid, rude, ignorant, and lazy. The team even found that the LLMs ascribed negative stereotypes to AAE texts more consistently than human participants in similar studies from the pre–Civil Rights era.

Creators of LLMs try to teach their models not to make racist stereotypes by training them using multiple rounds of human feedback. The team found that these efforts had been only partly successful: When asked what adjectives applied to Black people, some of the models said Black people were likely to be “loud” and “aggressive,” but those same models also said they were “passionate,” “brilliant,” and “imaginative.” Some models produced exclusively positive, nonstereotypical adjectives.

These findings show that training overt racism out of AI can’t counter the covert racism embedded within linguistic bias, King says, adding: “A lot of people don’t see linguistic prejudice as a form of covert racism … but all of the language models that we examined have this very strong covert racism against speakers of African American English.”

The findings highlight the dangers of using AI in the real world to perform tasks such as screening job candidates, says co-author Valentin Hofmann, a computational linguist at the Allen Institute for AI. The team found that the models associated AAE speakers with jobs such as “cook” and “guard” rather than “architect” or “astronaut.” And when fed details about hypothetical criminal trials and asked to decide whether a defendant was guilty or innocent, the models were more likely to recommend convicting speakers of AAE compared with speakers of Standardized American English. In a follow-up task, the models were more likely to sentence AAE speakers to death than to life imprisonment.

Although humans aren’t facing AI juries just yet, LLMs are being used in some real-world hiring processes—for instance, to screen applicants’ social media—and some law enforcement agencies are experimenting with using AI to draft police reports. “Our results clearly show that doing so bears a lot of risks,” Hofmann says.

The findings are not unexpected, but they are shocking, says Dartmouth College computer scientist Soroush Vosoughi, who was not involved with the paper. Most worrying, he says, is the finding that larger models—which have been shown to have less overt bias—had even worse linguistic prejudice. Measures to address overt racism could be creating a “false sense of security,” he says, by addressing explicit prejudices while embedding more covert stereotypes.

Vosoughi’s own work has found AIs show covert biases against names and hobbies stereotypically associated with particular groups, such as Black or LGBTQ+ people. There are countless other possible covert stereotypes, meaning trying to stamp them out individually would be a game of Whac-A-Mole for LLM developers. The upshot, he says, is that AI can’t yet be trusted to be objective, given that the very data it’s being trained on are tainted with prejudice. “For any social decision-making,” he says, “I do not think these models are anywhere near ready.”
 
Top