vLLM's log reports an 'incorrect tokenization' warning.
(APIServer pid=1) The tokenizer you are loading from '/models/Minimax-M2.5-BF16-INT4-AWQ' with an incorrect regex pattern: https://huggingface.co/mistralai/Mistral-Small-3.1-24B-Instruct-2503/discussions/84#69121093e8b480e709447d5e. This will lead to incorrect tokenization. You should set the `fix_mistral_regex=True` flag when loading this tokenizer to fix this issue.
I saw this message when using vLLM. Does it matter, or can I just ignore it?
I didn't see this message when I used the 'Minimax-M2.1-FP8-INT4-AWQ' version before.
I tried this new model as soon as it was uploaded, but my account was new and I couldn't make a post.
And thanks for your effort again!
I think it doesn't appear if you use mratsim/MiniMax-M2.5-BF16-INT4-AWQ directly.
Duplicate: https://huggingface.co/cyankiwi/MiniMax-M2.1-AWQ-4bit/discussions/1
and upstream: https://github.com/vllm-project/vllm/issues/30828
Seems like a vLLM bug.
I stored the model weights at a custom path rather than the default one (e.g. ~/.cache/huggingface/hub), so I need to pass a local model path to vLLM.
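For reference, this is roughly how a local path gets passed: `vllm serve` takes the model directory as its positional argument. The path below is just an illustrative placeholder, and `--served-model-name` is optional (it only controls the model name exposed to API clients):

```shell
# Serve weights from a custom directory instead of ~/.cache/huggingface/hub.
# The path below is a placeholder for wherever the weights are stored.
vllm serve /models/MiniMax-M2.5-BF16-INT4-AWQ \
    --served-model-name MiniMax-M2.5-BF16-INT4-AWQ
```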
Thank you! It doesn't seem to affect inference.