Minor fixes in example code snippets and chat template description
#2
by
ameyasunilm
- opened
README.md
CHANGED
|
@@ -385,7 +385,7 @@ git clone https://huggingface.co/nvidia/NVIDIA-Nemotron-Nano-9B-v2
|
|
| 385 |
|
| 386 |
vllm serve nvidia/NVIDIA-Nemotron-Nano-9B-v2 \
|
| 387 |
--trust-remote-code \
|
| 388 |
-
--mamba_ssm_cache_dtype float32
|
| 389 |
--enable-auto-tool-choice \
|
| 390 |
--tool-parser-plugin "NVIDIA-Nemotron-Nano-9B-v2/nemotron_toolcall_parser_no_streaming.py" \
|
| 391 |
--tool-call-parser "nemotron_json"
|
|
@@ -479,7 +479,7 @@ Okay, let's see. The user has a bill of $100 and wants to know the amount for an
|
|
| 479 |
|
| 480 |
## Prompt Format
|
| 481 |
|
| 482 |
-
We follow the jinja chat template provided below. This template conditionally adds `<think>\n` to the start of the Assistant response if `/think` is found in the system prompt or
|
| 483 |
|
| 484 |
```
|
| 485 |
{%- set ns = namespace(enable_thinking = true) %}
|
|
|
|
| 385 |
|
| 386 |
vllm serve nvidia/NVIDIA-Nemotron-Nano-9B-v2 \
|
| 387 |
--trust-remote-code \
|
| 388 |
+
--mamba_ssm_cache_dtype float32 \
|
| 389 |
--enable-auto-tool-choice \
|
| 390 |
--tool-parser-plugin "NVIDIA-Nemotron-Nano-9B-v2/nemotron_toolcall_parser_no_streaming.py" \
|
| 391 |
--tool-call-parser "nemotron_json"
|
|
|
|
| 479 |
|
| 480 |
## Prompt Format
|
| 481 |
|
| 482 |
+
We follow the jinja chat template provided below. This template conditionally adds `<think>\n` to the start of the Assistant response if `/think` is found in either the system prompt or any user message. If no reasoning signal is added, the model defaults to reasoning "on" mode. The chat template adds `<think></think>` to the start of the Assistant response if `/no_think` is found in the system prompt. Thus enforcing reasoning on/off behavior.
|
| 483 |
|
| 484 |
```
|
| 485 |
{%- set ns = namespace(enable_thinking = true) %}
|