There are many ways to plug a local model into Continue; we internally used an endpoint served by [SGLang](https://github.com/sgl-project/sglang), which is one of the options below. We observed no significant performance changes with fp8 quantization, so this may be used if desired.
* SGLang: `python3 -m sglang.launch_server --model-path continuedev/instinct --load-format safetensors`
* vLLM: `vllm serve continuedev/instinct --served-model-name instinct --load-format safetensors`
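Both SGLang and vLLM expose an OpenAI-compatible HTTP API, so once either server is running you can query the model with a plain JSON request. A minimal sketch, assuming vLLM's default port 8000 (SGLang defaults to 30000) and the `instinct` name set by `--served-model-name` above; adjust the host and port for your setup:

```python
import json
import urllib.request

# Assumed endpoint: vLLM's default port. For SGLang use http://localhost:30000.
BASE_URL = "http://localhost:8000"


def build_completion_request(prompt: str, max_tokens: int = 128) -> dict:
    # Payload for the OpenAI-compatible /v1/completions endpoint.
    # "instinct" matches the --served-model-name flag in the vLLM command above.
    return {
        "model": "instinct",
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": 0.2,
    }


def complete(prompt: str) -> str:
    # Send the request and return the first completion's text.
    payload = json.dumps(build_completion_request(prompt)).encode()
    req = urllib.request.Request(
        f"{BASE_URL}/v1/completions",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["text"]


if __name__ == "__main__":
    print(complete("def fibonacci(n):"))
```

The same request shape works against either server, since both implement the OpenAI completions API; only the base URL changes.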
## Learn more