On-Device Foundational Biases: How Summarization Can Perpetuate Biases
Revealing bias in Apple Intelligence’s on-device foundation model summaries.
Project overview

We find systemic bias in Apple Intelligence's on-device foundation model.
Among our findings:
Race: Across 200 generated news stories, ethnicity was mentioned in 53% of summaries when the protagonist was White, 64% when the protagonist was Black, and over 80% when the protagonist was Hispanic or Asian.
Gender: Across 70,000 ambiguous professional scenarios, 77% of summaries resolved an ambiguous pronoun to a specific profession, and 67% of those resolutions followed gender stereotypes.
Other social bias: In 900 ambiguous social scenarios, the system invented an association 15% of the time, and 72% of those invented associations followed stereotypical lines (see the sketch after this list).
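
The percentages above are simple rates over labeled model outputs. As a rough illustration of how such tallies can be computed, the sketch below groups summaries by protagonist group and counts ethnicity mentions and stereotype-aligned resolutions; the field names and record layout are our own assumptions, not the project's actual pipeline.

```python
from collections import defaultdict

def mention_rate_by_group(records):
    """Share of summaries that mention ethnicity, per protagonist group.

    `records` is assumed to be an iterable of dicts such as
    {"group": "White", "mentions_ethnicity": False, ...} (hypothetical schema).
    """
    totals, hits = defaultdict(int), defaultdict(int)
    for r in records:
        totals[r["group"]] += 1
        hits[r["group"]] += int(r["mentions_ethnicity"])
    return {g: hits[g] / totals[g] for g in totals}

def stereotype_alignment_rate(records):
    """Among summaries that resolved an ambiguous reference, the share whose
    resolution matches the stereotypical reading."""
    resolved = [r for r in records if r["resolved"]]
    if not resolved:
        return 0.0
    return sum(int(r["stereotypical"]) for r in resolved) / len(resolved)

# Tiny usage example with made-up rows:
rows = [
    {"group": "White", "mentions_ethnicity": False, "resolved": True, "stereotypical": True},
    {"group": "Asian", "mentions_ethnicity": True, "resolved": False, "stereotypical": False},
]
print(mention_rate_by_group(rows))       # {'White': 0.0, 'Asian': 1.0}
print(stereotype_alignment_rate(rows))   # 1.0
```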
To provide perspective, we repeated our analysis with Google's Gemma3-1B, an open-weight model roughly three times smaller than Apple Intelligence's ~3-billion-parameter model. Despite its smaller size, Gemma3-1B hallucinated associations in “only” 6% of cases, compared with 15% for Apple Intelligence's on-device foundation model.
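
For readers who want to reproduce the cross-model comparison in spirit, a minimal summarization call against Gemma3-1B could look like the sketch below. It assumes the Hugging Face transformers library and the google/gemma-3-1b-it checkpoint, and the prompt wording is illustrative rather than the study's actual template.

```python
from transformers import pipeline

# Load Gemma3-1B as a text-generation pipeline (assumed checkpoint name).
summarizer = pipeline("text-generation", model="google/gemma-3-1b-it")

# Placeholder scenario; the study's scenarios are not reproduced here.
scenario = "An ambiguous social scenario goes here."
prompt = f"Summarize the following text in one sentence:\n\n{scenario}"

# Greedy decoding keeps the comparison deterministic for a given prompt.
out = summarizer(prompt, max_new_tokens=60, do_sample=False)
print(out[0]["generated_text"])
```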
Our findings underscore the risks posed by AI systems embedded into everyday consumer devices.