DeepSeek architecture?

#1
by spanspek - opened

I don't think this model uses the DeepSeek architecture. As far as I can tell, it's the same as GLM 4.7, which is still based on the architecture used since GLM 4.5.

https://huggingface.co/zai-org/GLM-4.7-Flash?inference_provider=zai-org

The link to the technical document points to GLM 4.5.
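For reference, the declared architecture can be checked straight from the repo's config.json. A minimal sketch, assuming the `huggingface_hub` package is installed (the repo id is taken from the link above):

```python
# Sketch: check which architecture a model's config.json actually declares.
import json

from huggingface_hub import hf_hub_download

config_path = hf_hub_download("zai-org/GLM-4.7-Flash", "config.json")
with open(config_path) as f:
    config = json.load(f)

# "model_type" / "architectures" show which transformers class the model maps
# to; a genuine DeepSeek model would report a deepseek_* type here.
print(config.get("model_type"))
print(config.get("architectures"))
```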

Does the model work after quantizing it as DeepSeek? Is it producing coherent output?

I've had to do the same thing before quants came out. It produces coherent output, but I had one single output that was incoherent (it kept rambling without stopping).
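If anyone wants to verify what a converted file ended up as, the GGUF metadata can be read back. A rough sketch, assuming the `gguf` Python package that ships with llama.cpp; "model.gguf" is a placeholder filename:

```python
# Sketch: inspect which architecture a converted GGUF file declares.
from gguf import GGUFReader

reader = GGUFReader("model.gguf")
field = reader.fields["general.architecture"]
# A GGUF string field stores its bytes in parts[]; data[] indexes the payload.
arch = field.parts[field.data[0]].tobytes().decode("utf-8")
print(arch)  # e.g. "deepseek2" for these quants
```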

Someone else posted and is having issues; this might need native llama.cpp support.

Agreed, this conversion is broken (non-stop rambling).

Interestingly, the model itself notices that something is wrong, producing output like:

Correction: The input is extremely messy.

It's not DeepSeek; the deepseek2 architecture is used in order to borrow DeepSeek's MLA attention. It needs more support on the llama.cpp side.
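For anyone wondering what "borrowing MLA" means in practice, here is a rough PyTorch sketch of the core idea behind MLA (multi-head latent attention). All dimensions are illustrative, not GLM-4.7-Flash's real shapes:

```python
# Sketch of the MLA idea the deepseek2 architecture implements: K/V are not
# cached per head; they are reconstructed from a small shared latent vector.
import torch
import torch.nn as nn

d_model, d_latent, n_heads, d_head = 1024, 128, 8, 64

down_kv = nn.Linear(d_model, d_latent, bias=False)        # compress to latent
up_k = nn.Linear(d_latent, n_heads * d_head, bias=False)  # expand to keys
up_v = nn.Linear(d_latent, n_heads * d_head, bias=False)  # expand to values

h = torch.randn(2, 16, d_model)  # (batch, seq, d_model)
c_kv = down_kv(h)                # this small latent is what gets cached
k = up_k(c_kv).view(2, 16, n_heads, d_head)
v = up_v(c_kv).view(2, 16, n_heads, d_head)
# The KV cache holds c_kv (d_latent floats per token) instead of full K/V
# (2 * n_heads * d_head per token), which is the whole point of MLA.
```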

Unsloth are also using the deepseek2 architecture...

https://ibb.co/YFgQjDg1

https://huggingface.co/unsloth/GLM-4.7-Flash-GGUF

It's not DeepSeek; the deepseek2 architecture is used in order to borrow DeepSeek's MLA attention. It needs more support on the llama.cpp side.

In that case, shouldn't we wait for llama.cpp to provide support for it and then make the GGUFs correctly? I mean, using the DeepSeek architecture might get them working, but will it get them working at full functionality/quality?

From all we know, FA (flash attention) is broken at this time.
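If FA is the culprit, one thing worth trying is running with flash attention off. A minimal sketch via the llama-cpp-python bindings; the filename is a placeholder:

```python
# Sketch of a possible workaround: load the quant with flash attention
# disabled until the implementation is fixed.
from llama_cpp import Llama

llm = Llama(
    model_path="GLM-4.7-Flash-Q4_K_M.gguf",  # placeholder path
    n_ctx=4096,
    flash_attn=False,  # keep FA off while it is broken for this arch
)
print(llm("Hello,", max_tokens=32)["choices"][0]["text"])
```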

spanspek changed discussion status to closed
