MoE-CAP: Benchmarking Cost, Accuracy and Performance of Sparse Mixture-of-Experts Systems Paper • 2412.07067 • Published Dec 10, 2024
Towards Stable and Effective Reinforcement Learning for Mixture-of-Experts Paper • 2510.23027 • Published Oct 27, 2025 • 1
Multiplex Thinking: Reasoning via Token-wise Branch-and-Merge Paper • 2601.08808 • Published 8 days ago • 32
mT6: Multilingual Pretrained Text-to-Text Transformer with Translation Pairs Paper • 2104.08692 • Published Apr 18, 2021