jadohu committed
Commit d30ba4f · verified · 1 Parent(s): 589ef88

Update README.md

Files changed (1)
  1. README.md +6 -3
README.md CHANGED
@@ -10,14 +10,17 @@ pinned: false
  <!-- Banner -------------------------------------------------------------- -->
  <p align="center">
  <b>Fine-grain evaluation &amp; Large Reasoning Models that <i>fails in reasoning</i> due to <i>reasoning rigidity</i>.</b><br/>
- ConditionedMath (AIME &amp; MATH500) · PuzzleTrivial · Training scripts · Zero-shot pipelines
+ ConditionedMath (AIME &amp; MATH500) · PuzzleTrivial · Zero-shot pipelines
  </p>
 
  ---
 
  ## 📜 Why ReasoningTrap?
 
- > Current RL-tuned Reasoning LLMs excel at *producing* answers, but often ignore explicit user constraints.
+ > Current RL-tuned Reasoning LLMs excel at *producing* answers but often ignore explicit user constraints.
  > **ReasoningTrap** surfaces these failure modes with carefully crafted, *conditioned* problems.
- * **Modified from Famous MATH Reasoning Benchmark** – AIME & MATH500 long-form proofs.
+ * **Modified from Famous MATH Reasoning Benchmark** – AIME & MATH500 problems altered with minimal constraints to divert reasoning paths.
+ * **Puzzles Trivialized by Subtle Modifications** - Well-known puzzles where a small change transforms a challenging problem into a trivial one.
+ * **Plug-and-play** – evaluate any 🤗 Transformers model with vLLM in simple instructions.
+
 