Article · Proceedings of the National Academy of Sciences · Feb 20, 2025 · Hybrid OA

Explicitly unbiased large language models still form biased associations

University of Chicago · Stanford University · +2 more institutions

Indexed in: Crossref, PubMed

Abstract

Large language models (LLMs) can pass explicit social bias tests but still harbor implicit biases, similar to humans who endorse egalitarian beliefs yet exhibit subtle biases. Measuring such implicit biases can be a challenge: As LLMs become increasingly proprietary, it may not be possible to access their embeddings and apply existing bias measures; furthermore, implicit biases are primarily a concern if they affect the actual decisions that these systems make. We address both challenges by introducing two measures: LLM Word Association Test, a prompt-based method for revealing implicit bias; and LLM Relative Decision Test, a strategy to detect subtle discrimination in contextual decisions. Both measures are…

Citation impact

Total citations: 66
FWCI: 123.87
Percentile: 100%
References: 77

Authors

4 authors

Topics & keywords

Keywords
  • Computer science
  • Econometrics
  • Mathematics
  • Linguistics
  • Statistical physics
  • Philosophy
  • Physics