MHLA: Restoring Expressivity of Linear Attention via Token-Level Multi-Head Paper • 2601.07832 • Published 4 days ago • 42