Differential Attention: A New Mechanism That Reduces Hallucinations by 60%
Friday, March 27, 2026
Researchers at UC Berkeley and Google DeepMind have published a paper introducing Differential Attention, a novel attention mechanism that computes the difference between two parallel softmax attention maps. Because spurious attention weight assigned to irrelevant context tends to appear in both maps, the subtraction cancels much of this noise, reducing hallucinations by 60% on TruthfulQA benchmarks while keeping computational cost comparable to standard multi-head attention.
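To make the mechanism concrete, here is a minimal single-head PyTorch sketch of the idea as described: two independently projected softmax attention maps whose difference weights a shared value projection. The module name, the shared value projection, and the learnable subtraction weight `lam` are illustrative assumptions, not details taken from the paper.

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class DifferentialAttention(nn.Module):
    """Single-head sketch: attn = softmax(Q1 K1^T / sqrt(d)) - lam * softmax(Q2 K2^T / sqrt(d))."""

    def __init__(self, d_model: int, lam_init: float = 0.8):
        super().__init__()
        # Two independent query/key projections produce the two attention maps;
        # a single value projection is shared between them (an assumption here).
        self.q1 = nn.Linear(d_model, d_model, bias=False)
        self.k1 = nn.Linear(d_model, d_model, bias=False)
        self.q2 = nn.Linear(d_model, d_model, bias=False)
        self.k2 = nn.Linear(d_model, d_model, bias=False)
        self.v = nn.Linear(d_model, d_model, bias=False)
        self.out = nn.Linear(d_model, d_model, bias=False)
        # Learnable subtraction weight; the article does not specify the exact
        # parameterization, and the raw difference corresponds to lam = 1.
        self.lam = nn.Parameter(torch.tensor(lam_init))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model)
        scale = 1.0 / math.sqrt(x.size(-1))
        a1 = F.softmax(self.q1(x) @ self.k1(x).transpose(-2, -1) * scale, dim=-1)
        a2 = F.softmax(self.q2(x) @ self.k2(x).transpose(-2, -1) * scale, dim=-1)
        # Attention mass that both maps assign to irrelevant positions
        # cancels in the difference.
        attn = a1 - self.lam * a2
        return self.out(attn @ self.v(x))

x = torch.randn(2, 16, 64)
y = DifferentialAttention(64)(x)  # -> shape (2, 16, 64)
```

Note that after subtraction each attention row sums to 1 − λ rather than 1, which is one reason a learnable weight initialized below 1 is a natural choice in this sketch; the raw difference (λ = 1) would leave rows summing to zero.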
Key Takeaways
- Computes difference between two parallel attention maps
- 60% reduction in hallucinations on TruthfulQA
- Comparable FLOPs to standard multi-head attention
- Drop-in replacement for existing transformer architectures