New uploads Feb 12th?
#27, opened by jmagder
While visiting your page as part of my daily OCD, I noticed that GLM-4.7-Flash was updated. What changed? :)
I'll be trying out the mxfp4 version. I did notice that in the main Unsloth GLM-4.7-Flash model (https://huggingface.co/unsloth/GLM-4.7-Flash) the config.json file had been updated. Not sure what the impact is.
Whoops, didn't mean to create a duplicate issue: https://huggingface.co/unsloth/GLM-4.7-Flash-GGUF/discussions/28
I had Codex investigate the differences between the old and new UD-Q4_K_XL files:
1. File/container metadata
- New file is 32 bytes larger.
- GGUF.kv_count changed from 59 to 60.
- One new GGUF key was added: general.sampling.top_p = 0.95.
- Data block starts 32 bytes later in the new file (consistent with one added metadata entry).
2. Model tensor content
- Not just metadata: 4 tensors differ in quantization type and bytes:
  - blk.9.ffn_down_exps.weight: Q4_K -> Q5_K
  - blk.9.ffn_down_shexp.weight: Q6_K -> Q8_0
  - blk.13.ffn_down_exps.weight: Q5_K -> Q4_K
  - blk.13.ffn_down_shexp.weight: Q8_0 -> Q6_K
- The payloads of these 4 tensors also have different SHA-256 hashes when compared by name, so this is a real re-quantization change, not just relabeling.
- Other checked tensors (example: output.weight) matched exactly.
3. Net effect
- Looks like precision was shifted from block 13 to block 9 (block 9's ffn_down tensors went up one step, block 13's went down one step) while keeping overall size essentially the same, plus the added top_p metadata.
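For anyone who wants to reproduce the two checks in the report above, here is a minimal sketch using only the Python standard library. It assumes the GGUF v2/v3 header layout (4-byte magic, uint32 version, uint64 tensor count, uint64 KV count, little-endian). The helper names `read_gguf_header` and `diff_tensors` are hypothetical, and the header bytes and tensor payloads below are synthetic stand-ins, not data from the real files. Loading the actual tensor entries from a GGUF file is left out.

```python
import hashlib
import struct


def read_gguf_header(data: bytes) -> dict:
    """Parse the fixed 24-byte GGUF header (assumes v2/v3, little-endian)."""
    if len(data) < 24 or data[:4] != b"GGUF":
        raise ValueError("not a GGUF v2/v3 header")
    version, tensor_count, kv_count = struct.unpack_from("<IQQ", data, 4)
    return {"version": version, "tensor_count": tensor_count, "kv_count": kv_count}


def diff_tensors(old: dict, new: dict) -> list:
    """Compare tensors present in both files by name.

    Each dict maps tensor name -> (quant type label, raw payload bytes).
    A tensor is reported if its type label or its SHA-256 hash differs.
    """
    changes = []
    for name in sorted(old.keys() & new.keys()):
        old_type, old_bytes = old[name]
        new_type, new_bytes = new[name]
        payload_changed = (
            hashlib.sha256(old_bytes).digest() != hashlib.sha256(new_bytes).digest()
        )
        if old_type != new_type or payload_changed:
            changes.append(
                {
                    "name": name,
                    "old_type": old_type,
                    "new_type": new_type,
                    "payload_changed": payload_changed,
                }
            )
    return changes


# Synthetic headers: version 3 and tensor count 0 are placeholders;
# only the KV counts (59 and 60) come from the report.
old_header = struct.pack("<4sIQQ", b"GGUF", 3, 0, 59)
new_header = struct.pack("<4sIQQ", b"GGUF", 3, 0, 60)
kv_delta = read_gguf_header(new_header)["kv_count"] - read_gguf_header(old_header)["kv_count"]

# Synthetic tensors: type labels come from the report, payload bytes are dummies.
old_tensors = {
    "blk.9.ffn_down_exps.weight": ("Q4_K", b"old-9-exps"),
    "blk.9.ffn_down_shexp.weight": ("Q6_K", b"old-9-shexp"),
    "blk.13.ffn_down_exps.weight": ("Q5_K", b"old-13-exps"),
    "blk.13.ffn_down_shexp.weight": ("Q8_0", b"old-13-shexp"),
    "output.weight": ("Q6_K", b"same-bytes"),  # type label here is a placeholder
}
new_tensors = {
    "blk.9.ffn_down_exps.weight": ("Q5_K", b"new-9-exps"),
    "blk.9.ffn_down_shexp.weight": ("Q8_0", b"new-9-shexp"),
    "blk.13.ffn_down_exps.weight": ("Q4_K", b"new-13-exps"),
    "blk.13.ffn_down_shexp.weight": ("Q6_K", b"new-13-shexp"),
    "output.weight": ("Q6_K", b"same-bytes"),
}
changes = diff_tensors(old_tensors, new_tensors)

print("kv_count delta:", kv_delta)
for change in changes:
    print(f'{change["name"]}: {change["old_type"]} -> {change["new_type"]}')
```

With real files you would read the first 24 bytes of each file for the header check and fill the two dicts from whatever GGUF reader you use. Hashing the raw payload is what separates a real re-quantization from a changed type label.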
Curious what prompted this change?