Porting nanochat to Transformers: an AI modeling history lesson • Featured • 39 • Learn about ML and Transformers through nanochat
The Smol Training Playbook • Featured • 2.54k • The secrets to building world-class LLMs
huggingface_hub v1.0: Five Years of Building the Foundation of Open Machine Learning Article • Oct 27 • 70
QeRL: Beyond Efficiency -- Quantization-enhanced Reinforcement Learning for LLMs Paper • 2510.11696 • Published Oct 13 • 176
Language Models Can Learn from Verbal Feedback Without Scalar Rewards Paper • 2509.22638 • Published Sep 26 • 70
A Survey of Reinforcement Learning for Large Reasoning Models Paper • 2509.08827 • Published Sep 10 • 189
LLaVA-Critic-R1: Your Critic Model is Secretly a Strong Policy Model Paper • 2509.00676 • Published Aug 31 • 84