No default RoPE scaling for long context?
#4
by
TomLucidor
- opened
Sorry for asking, but the config seems to only support a 32K context: https://huggingface.co/inclusionAI/Ring-mini-sparse-2.0-exp/blob/main/config.json
When exceeding a sequence length of 32K, the model needs to enable YaRN. You can refer to SGLang's YaRN configuration and add the following to config.json:
"rope_scaling": {
"factor": 4.0,
"original_max_position_embeddings": 32768,
"rope_type": "yarn"
}
Any setup recommendations for vLLM instead of SGLang?
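For vLLM, a sketch of an equivalent setup: instead of editing config.json, vLLM lets you pass the same YaRN parameters at launch via the `--rope-scaling` engine argument (a JSON string), together with `--max-model-len` extended to factor × original length (4.0 × 32768 = 131072). The exact flags below are assumptions based on vLLM's serve CLI; check your vLLM version, and the model may additionally need `--trust-remote-code`:

```shell
# Hypothetical vLLM launch mirroring the SGLang YaRN config above;
# verify flag names against your installed vLLM version.
vllm serve inclusionAI/Ring-mini-sparse-2.0-exp \
  --trust-remote-code \
  --max-model-len 131072 \
  --rope-scaling '{"rope_type": "yarn", "factor": 4.0, "original_max_position_embeddings": 32768}'
```

Alternatively, editing config.json as shown above works for both frameworks, since both read `rope_scaling` from the Hugging Face config.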