Spaces: Salesforce/Efficient-Reasoning

yuhuixu committed on May 22, 2025

Commit 2a0a8b1 (verified)

1 Parent(s): 131248e

Update app.py

Files changed (1)

app.py +0 -69

@@ -47,75 +47,6 @@ training.
 3. 📈 Flexible scalability: Robust performance across diverse inference budgets on reasoning benchmarks like AIME and LiveCodeBench.
 4. ⚙️ Better performance with fewer tokens: Our trained model generates outputs that are 30% shorter while maintaining (or even improving) accuracy.
-<p align="center">
-  <img src="figs/aime.png" width="46%" />
-    <img src="figs/livecode.png" width="48%" />
-</p>
-<p align="center">
-  <img src="figs/codetable.png" width="90%" />
-</p>
-## Environment Setup
-### Installation
-```bash
-# Installing Python 3.10 Environment.
-conda create -n e1 python=3.10 -y
-conda activate e1
-# Installing dependencies.
-cd Elastic-Reasoning
-pip install -e ./verl
-pip install -e .
-```
-### Data
-Our raw training data is in `rllm/data/[train|test]/[code|math]/`, along with preprocessing scripts in `rllm/data/preprocess`. To convert the raw data into Parquet files for training, run:
-```bash
-# Download datasets from GDrive, populates rllm/data/[train|test]/[math|code]/*.json
-python scripts/data/download_datasets.py
-# Generate parquet files for Deepcoder/DeepscaleR in data/*.parquet
-python scripts/data/[deepcoder|deepscaler]_dataset.py
-```
-## Training
-```bash
-export MODEL_PATH="agentica-org/DeepScaleR-1.5B-Preview"
-./scripts/e1-math/e1_math_1.5b_1k_1k.sh --model $MODEL_PATH
-```
-## Evaluation
-To run our evaluation scripts, run:
-```bash
-./scripts/eval/eval_model.sh --model [CHECKPOINT_PATH] --datasets [DATASET1] [DATASET2] --output-dir [OUTPUT_DIR] --n [N_PASSES] --tp [TENSOR_PARALLEL_SIZE] --e1-mode [SEPARATE_BUDGETING] --e1-thinking-length [THINKING_LENGTH] --e1-solution-length [SOLUTION_LENGTH]
-```
-### Example on MATH
-```bash
-./scripts/eval/eval_model.sh --model Salesforce/E1-Math-1.5B --datasets aime math amc minerva olympiad_bench --output-dir $HOME/E1-Math-1.5B --tp 1 --n 16 --e1-mode True --e1-thinking-length 1024 --e1-solution-length 1024
-```
-### Example on LiveCodeBench
-```bash
-./scripts/eval/eval_model.sh --model Salesforce/E1-Code-14B --datasets test_livecodebench --output-dir $HOME/E1-Code-14B --tp 4 --e1-mode True --e1-thinking-length 1024 --e1-solution-length 1024
-```
-### Example on Codeforces
-```bash
-./scripts/eval/eval_model.sh --model Salesforce/E1-Code-14B --datasets test_codeforces --output-dir $HOME/DeepCoder-14B-Preview --tp 4 --n 8 --e1-mode True --e1-thinking-length 1024 --e1-solution-length 1024
-```
-```bash
-python scripts/deepcoder/benchmark/cf_elo_calc.py --results_path [RESULTS_JSON_PATH] --pass_n 8
-```
-### Unconstrained evaluation
-set `--e1-mode False` and `--max-length [Maxmum token length, e.g. 32768]`
-## Acknowledgement
-We greatly thanks [rllm](https://github.com/agentica-project/rllm) and [verl](https://github.com/volcengine/verl) for providing the awesome codebase!
 ## Citation
