Commit a612345 (verified) by WinstonDeng · Parent: c7f7136

Update README.md

Files changed (1): README.md (+4, -0)
README.md CHANGED
@@ -82,6 +82,10 @@ Performance of Step 3.5 Flash measured across **Reasoning**, **Coding**, and **A
 3. **BrowseComp (with Context Manager)**: When the effective context length exceeds a predefined threshold, the agent resets the context and restarts the agent loop. By contrast, Kimi K2.5 and DeepSeek-V3.2 used a "discard-all" strategy.
 4. **Decoding Cost**: Estimates are based on a methodology similar to, but more accurate than, the approach described in arxiv.org/abs/2507.19427.
 
+### Recommended Inference Parameters
+1. For the general chat domain, we suggest `temperature=0.6, top_p=0.95`.
+2. For reasoning / agent scenarios, we recommend `temperature=1.0, top_p=0.95`.
+
 ## 4. Architecture Details
 
 Step 3.5 Flash is built on a **Sparse Mixture-of-Experts (MoE)** transformer architecture, optimized for high throughput and low VRAM usage during inference.
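The recommended sampling parameters added in this commit can be sketched as a small request builder for an OpenAI-compatible chat endpoint. This is a minimal sketch: the model name, the payload shape, and the `mode` parameter are illustrative assumptions, not taken from the README.

```python
def build_request(messages, mode="chat"):
    """Build a chat-completion payload using the README's recommended
    sampling parameters for each scenario.

    mode="chat"      -> general chat:        temperature=0.6, top_p=0.95
    mode="reasoning" -> reasoning / agents:  temperature=1.0, top_p=0.95
    """
    params = {
        "chat": {"temperature": 0.6, "top_p": 0.95},
        "reasoning": {"temperature": 1.0, "top_p": 0.95},
    }[mode]
    # "step-3.5-flash" is an assumed model identifier for illustration only.
    return {"model": "step-3.5-flash", "messages": messages, **params}
```

The same two parameter sets would apply wherever the decoding loop is configured, e.g. a `GenerationConfig` when serving the model locally.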
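The context-manager behavior described for BrowseComp (reset the context and restart the agent loop once a length threshold is exceeded) can be sketched as follows. All names here are illustrative assumptions, and character counts stand in for token counts; the actual harness is not described in the README.

```python
def run_agent_loop(task, step_fn, max_context_tokens=100_000, max_resets=5):
    """Run an agent loop that resets its context when it grows past a threshold.

    step_fn(context) -> (action_text, done) is an assumed interface for one
    agent step. Context length is approximated by total character count.
    """
    context = [task]
    for _ in range(max_resets + 1):
        while sum(len(m) for m in context) <= max_context_tokens:
            action, done = step_fn(context)
            context.append(action)
            if done:
                return context
        # Threshold exceeded: reset the context and restart the loop from
        # the original task, rather than discarding everything and stopping.
        context = [task]
    return context
```

A "discard-all" strategy, by contrast, would drop the accumulated context without restarting the loop from the task.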