Update app.py

app.py CHANGED
@@ -35,19 +35,32 @@ resource constraints. To train models that are robust to truncated thinking, we
 introduce a lightweight `budget-constrained rollout` strategy, integrated into GRPO,
 which teaches the model to reason adaptively when the thinking process is cut
 short and generalizes effectively to unseen budget constraints without additional
-training.
+training.
+""")
+gr.HTML("""
 <p align="center">
 <img src="figs/framework.png" width="80%" />
 </p>
-
-
+""")
+gr.Markdown(
+"""
 **Main Takeaways**
 1. ✂️ Thinking + Solution are explicitly separated with independent budgets, boosting reliability under tight compute constraints.
 2. 🧠 Budget-Constrained Rollout: We train models to handle truncated reasoning using GRPO.
 3. 📈 Flexible scalability: Robust performance across diverse inference budgets on reasoning benchmarks like AIME and LiveCodeBench.
 4. ⚙️ Better performance with fewer tokens: Our trained model generates outputs that are 30% shorter while maintaining (or even improving) accuracy.

+<p align="center">
+<img src="figs/aime.png" width="46%" />
+<img src="figs/livecode.png" width="48%" />
+</p>

+<p align="center">
+<img src="figs/codetable.png" width="90%" />
+</p>
+""")
+gr.Markdown(
+"""
 ## Citation


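For context on the edit itself: the added `gr.HTML("""...""")` and `gr.Markdown("""...""")` calls presumably sit inside a `gr.Blocks` layout defined elsewhere in app.py; the surrounding lines fall outside this hunk. A minimal sketch of that pattern, with placeholder text, is below.

```python
# Minimal sketch of the gr.Blocks pattern this hunk edits. The real layout,
# indentation, and page text in app.py are outside the hunk and assumed here.
import gradio as gr

with gr.Blocks() as demo:
    # Markdown-rendered prose (abstract, takeaways, citation, ...).
    gr.Markdown(
        """
        **Main Takeaways**
        1. Thinking + Solution are explicitly separated with independent budgets.
        """
    )
    # gr.HTML renders raw HTML, so the centered <p>/<img> figures can be kept
    # out of the Markdown strings, as the diff above does.
    gr.HTML('<p align="center"><img src="figs/framework.png" width="80%" /></p>')

if __name__ == "__main__":
    demo.launch()
```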
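The abstract and takeaways describe the budget-constrained rollout idea: during GRPO training, the thinking phase runs under a hard token budget and may be cut off before the solution is generated under its own budget. The snippet below is a minimal sketch of that idea only, not the authors' implementation; the `generate` callable, the `</think>` delimiter, the keyword names, and the candidate budgets are all assumptions made for illustration.

```python
# Minimal sketch of a budget-constrained rollout (not the authors' code).
# `generate` is a hypothetical stand-in for the policy's sampling call; the
# </think> delimiter, keyword names, and budget values are assumptions.
import random
from typing import Callable

THINK_END = "</think>"  # assumed delimiter between thinking and solution

def sample_think_budget(budgets=(512, 1024, 2048, 4096)) -> int:
    # Varying the budget per rollout exposes the policy to many truncation
    # points, which is what would let it generalize to unseen budgets.
    return random.choice(budgets)

def budget_constrained_rollout(
    generate: Callable[..., str],
    prompt: str,
    think_budget: int,
    solution_budget: int,
) -> str:
    """Roll out thinking under a hard token budget, then force a solution."""
    # 1) Sample the thinking segment; it stops early if the budget runs out.
    thinking = generate(prompt, max_new_tokens=think_budget, stop=THINK_END)
    if THINK_END not in thinking:
        # Budget hit mid-thought: close the thinking block explicitly so the
        # policy must answer from truncated reasoning.
        thinking += "\n" + THINK_END
    # 2) Generate the solution under its own, independent budget.
    solution = generate(prompt + thinking, max_new_tokens=solution_budget)
    return thinking + solution
```

Rollouts produced this way would then be scored and advantaged as usual in GRPO; the exact reward and budget schedule are not part of this diff.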