January 8, 2026 | Press Release

New Report: How Fair Are AI Decisions About People? 

Demographic information can influence LLM choices in high-stakes settings

NEW YORK, NY – As large language models (LLMs) are increasingly used to support decisions about promotions, layoffs, loans, and other high-stakes selections, a central question is whether these systems make those decisions fairly.

In a new Manhattan Institute report, David Rozado of Otago Polytechnic in New Zealand tested how LLM-based AI systems decide between two people in critical situations, from positive/favorable outcomes such as promotions and college admissions to negative/unfavorable outcomes such as layoffs, evictions, and deportations. The study systematically swapped gender and ethnicity labels across matched candidate pairs so that Rozado could examine whether demographic signals or prompt structure independently altered model decisions.
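The press release does not reproduce the report's evaluation code; the following is a minimal Python sketch of how such a counterfactual design could be structured, assuming a hypothetical query_llm() helper that returns the model's chosen candidate ("A" or "B"). None of these names come from the report itself.

```python
def query_llm(prompt: str) -> str:
    """Hypothetical helper: send the prompt to a model and return 'A' or 'B'."""
    raise NotImplementedError("plug in a specific model API call here")

def build_prompt(scenario: str, candidates: list[dict]) -> str:
    """Render a two-candidate decision prompt for a given scenario."""
    lines = [f"Scenario: {scenario}", "Choose exactly one candidate: A or B."]
    for label, cand in zip("AB", candidates):
        lines.append(
            f"Candidate {label}: gender={cand['gender']}, "
            f"ethnicity={cand['ethnicity']}, record={cand['record']}"
        )
    return "\n".join(lines)

def counterfactual_trial(scenario: str, pair: list[dict]) -> dict:
    """Run one matched pair as written and with demographic labels swapped,
    in both presentation orders, so that demographic effects and prompt-order
    effects can be separated from the candidates' substantive records."""
    swapped = [
        dict(pair[0], gender=pair[1]["gender"], ethnicity=pair[1]["ethnicity"]),
        dict(pair[1], gender=pair[0]["gender"], ethnicity=pair[0]["ethnicity"]),
    ]
    results = {}
    for variant, candidates in [("original", pair), ("swapped", swapped)]:
        for order, ordered in [("forward", candidates), ("reversed", candidates[::-1])]:
            results[(variant, order)] = query_llm(build_prompt(scenario, ordered))
    return results
```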

The results point to measurable, albeit modest, effects:  

  • In favorable outcome scenarios, LLMs selected female candidates slightly more often than male candidates, even though all other relevant factors were held constant by swapping gender assignments within matched candidate pairs; 
  • In favorable outcome scenarios, most models did not exhibit statistically significant ethnic biases; 
  • In unfavorable outcome scenarios, LLMs were generally balanced across gender and ethnicity; 
  • Candidates listed first in prompts were considerably more likely to be chosen for favorable outcomes by most models, pointing to an arbitrary, though significant, prompt-order bias in LLM selections. 

Rozado also finds that when demographic cues, such as explicit gender or ethnicity fields, are removed, disparities shrink substantially or disappear entirely (though he acknowledges that names can still act as implicit signals).  

To ensure that AI decision-making is as fair as possible, Rozado recommends the following: 

  • Concealing demographic cues, including proxy signals such as names, whenever feasible in model inputs; 
  • Tracking and auditing outcomes across groups to verify that no disparate treatment or impact remains; 
  • Randomizing candidate ordering and averaging across repeated evaluations to mitigate prompt-order effects (a simple sketch of this mitigation follows this list); 
  • Ensuring ongoing oversight, including independent review and clear avenues for appeal, in high-stakes contexts where LLMs are used. 
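As a rough illustration of the order-randomization recommendation, the sketch below shuffles candidate order on every run and averages the model's choices. It reuses the hypothetical query_llm() and build_prompt() helpers from the earlier sketch; neither comes from the report.

```python
import random
from collections import Counter

def averaged_decision(scenario: str, candidates: list[dict],
                      n_runs: int = 10, seed: int = 0) -> dict:
    """Shuffle candidate order on each run and average the model's choices,
    so that position in the prompt does not drive the final outcome."""
    rng = random.Random(seed)
    tallies = Counter()
    for _ in range(n_runs):
        ordered = candidates[:]
        rng.shuffle(ordered)                                 # randomize prompt order
        choice = query_llm(build_prompt(scenario, ordered))  # returns "A" or "B"
        chosen = ordered["AB".index(choice)]                 # map label back to a candidate
        tallies[chosen["record"]] += 1                       # tally by candidate, not position
    return {record: count / n_runs for record, count in tallies.items()}
```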

Click here to review the full report. 
