Best quant available, but maybe needs a new Jinja template?

#1
by Fernanda24 - opened

Perplexity is really good, maybe the lowest I've seen! But I've been testing around, and I think this model performs massively better with PR 16932 in llama.cpp and this updated Jinja chat template: https://github.com/ggml-org/llama.cpp/blob/master/models/templates/MiniMax-M2.jinja
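For anyone who wants to try it, a rough sketch of how you can point llama.cpp at a custom chat template (the model path and quant filename here are placeholders, not the exact files from this repo):

```shell
# Grab the updated template from the llama.cpp repo
curl -L -o MiniMax-M2.jinja \
  https://raw.githubusercontent.com/ggml-org/llama.cpp/master/models/templates/MiniMax-M2.jinja

# Run the server with Jinja templating enabled and the downloaded template.
# --jinja enables Jinja chat-template rendering;
# --chat-template-file overrides the template baked into the GGUF.
./llama-server \
  -m ./path/to/model.gguf \
  --jinja \
  --chat-template-file ./MiniMax-M2.jinja
```

You'd need a llama.cpp build recent enough to include the PR mentioned above for the full improvement.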
