Running on CPU Upgrade Featured 2.54k The Smol Training Playbook 📚 2.54k The secrets to building world-class LLMs
FlowRL: Matching Reward Distributions for LLM Reasoning Paper • 2509.15207 • Published Sep 18 • 114
How to Train Your LLM Web Agent: A Statistical Diagnosis Paper • 2507.04103 • Published Jul 5 • 50
AlignVLM: Bridging Vision and Language Latent Spaces for Multimodal Understanding Paper • 2502.01341 • Published Feb 3 • 39
BigDocs: An Open and Permissively-Licensed Dataset for Training Multimodal Models on Document and Code Tasks Paper • 2412.04626 • Published Dec 5, 2024 • 14
Physics of Language Models: Part 2.2, How to Learn From Mistakes on Grade-School Math Problems Paper • 2408.16293 • Published Aug 29, 2024 • 27
RepLiQA: A Question-Answering Dataset for Benchmarking LLMs on Unseen Reference Content Paper • 2406.11811 • Published Jun 17, 2024 • 16
RepLiQA: A Question-Answering Dataset for Benchmarking LLMs on Unseen Reference Content Paper • 2406.11811 • Published Jun 17, 2024 • 16
XC-Cache: Cross-Attending to Cached Context for Efficient LLM Inference Paper • 2404.15420 • Published Apr 23, 2024 • 11
LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders Paper • 2404.05961 • Published Apr 9, 2024 • 66
StarCoder 2 and The Stack v2: The Next Generation Paper • 2402.19173 • Published Feb 29, 2024 • 151 • 5
StarCoder 2 and The Stack v2: The Next Generation Paper • 2402.19173 • Published Feb 29, 2024 • 151
StarCoder 2 and The Stack v2: The Next Generation Paper • 2402.19173 • Published Feb 29, 2024 • 151