Adarsh-Iyer commited on
Commit
c9fe452
·
verified ·
1 Parent(s): 63c7746

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +1 -1
README.md CHANGED
@@ -13,7 +13,7 @@ This repo contains the model weights for **Instinct**, [Continue](https://contin
13
  There are many ways to plug a local model into Continue; we internally used an endpoint served by [SGLang](https://github.com/sgl-project/sglang), which is one of the options below. We observed no significant performance changes with fp8 quantization, so this may be used if desired.
14
 
15
  * SGLang: `python3 -m sglang.launch_server --model-path continuedev/instinct --load-format safetensors`
16
- * vLLM : `vllm serve continuedev/instinct --served-model-name instinct --load-format safetensors --enable-prefix-caching --enable-chunked-prefill`
17
 
18
  ## Learn more
19
 
 
13
  There are many ways to plug a local model into Continue; we internally used an endpoint served by [SGLang](https://github.com/sgl-project/sglang), which is one of the options below. We observed no significant performance changes with fp8 quantization, so this may be used if desired.
14
 
15
  * SGLang: `python3 -m sglang.launch_server --model-path continuedev/instinct --load-format safetensors`
16
+ * vLLM : `vllm serve continuedev/instinct --served-model-name instinct --load-format safetensors`
17
 
18
  ## Learn more
19