llama.cpp
#8
by PotatoSniffer - opened
Please open a pull request in the official llama.cpp repo.
Note that the current llama.cpp fork doesn't seem to work, at least not on Apple Silicon: https://huggingface.co/stepfun-ai/Step-3.5-Flash-Int4/discussions/2 (never mind, my download was corrupted)
Mine's corrupted too.
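For anyone else hitting this: a quick way to rule out a corrupted download is to hash the GGUF locally and compare it against the SHA-256 that Hugging Face shows for LFS files on the repo's file page. A minimal Python sketch (the file name below is a placeholder, use whatever you actually downloaded):

```python
import hashlib

def sha256sum(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so multi-GB GGUF weights don't need to fit in RAM."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

# Compare the output against the SHA-256 listed on the model's
# "Files and versions" page. Placeholder file name:
print(sha256sum("step-3.5-flash-int4.gguf"))
```

If the hashes don't match, re-download the file before debugging llama.cpp itself.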
For anyone who finds this in the future: llama.cpp has been updated to support Step 3.5 Flash, and the implementation works on the latest version as of writing. Tested on Apple Silicon with q8 quantization, and it seems to be working well so far!
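In case it helps anyone, here's roughly how I'd smoke-test it from Python via the llama-cpp-python bindings. This is just a sketch, and it assumes your llama-cpp-python build bundles a llama.cpp recent enough to include the Step 3.5 Flash support; the model path is a placeholder:

```python
from llama_cpp import Llama

# Placeholder path; point this at your q8 GGUF.
llm = Llama(model_path="step-3.5-flash-q8_0.gguf", n_ctx=4096)

# A short completion is enough to confirm the model loads and generates coherently.
out = llm("Q: What is the capital of France? A:", max_tokens=32, stop=["\n"])
print(out["choices"][0]["text"])
```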