PanGu-Σ: Towards Trillion Parameter Language Model with Sparse Heterogeneous Computing • Paper 2303.10845 • Published Mar 20, 2023
Gamayun's Path to Multilingual Mastery: Cost-Efficient Training of a 1.5B-Parameter LLM • Paper 2512.21580 • Published 7 days ago