New uploads Feb 12th?

#27

by jmagder - opened 9 days ago

Discussion

jmagder

9 days ago

While visiting your page as part of my daily OCD, I noticed GLM-4.7-Flash was updated. What changed? :)

dugrema

9 days ago

I'll be trying out the mxfp4 version. I did notice that in the main Unsloth GLM-4.7-Flash model (https://huggingface.co/unsloth/GLM-4.7-Flash) the config.json file had been updated. Not sure what the impact is.

coder543

9 days ago

Whoops, didn't mean to create a duplicate issue: https://huggingface.co/unsloth/GLM-4.7-Flash-GGUF/discussions/28

I had Codex investigate the difference between the old and new UD-Q4_K_XL files:

  1. File/container metadata

  - New file is 32 bytes larger.
  - GGUF.kv_count changed from 59 to 60.
  - One new GGUF key was added: general.sampling.top_p = 0.95.
  - Data block starts 32 bytes later in the new file (consistent with one added metadata entry).
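
For anyone who wants to check the metadata numbers themselves, here is a minimal sketch. It assumes a little-endian GGUF v2/v3 file (magic, uint32 version, uint64 tensor count, uint64 KV count) and assumes the new top_p value is stored as a float32; neither was confirmed in this thread. The helper names are made up for illustration.

```python
import struct

GGUF_MAGIC = b"GGUF"

def read_gguf_header(path):
    """Return (version, tensor_count, kv_count) from a GGUF v2/v3 header."""
    with open(path, "rb") as f:
        raw = f.read(24)
    if len(raw) < 24 or raw[:4] != GGUF_MAGIC:
        raise ValueError("not a GGUF file")
    # uint32 version, uint64 tensor_count, uint64 kv_count (little-endian)
    return struct.unpack("<IQQ", raw[4:24])

def kv_entry_size_f32(key):
    """Bytes used by one float32 KV entry (assumed layout):
    uint64 key length + key bytes + uint32 value type + float32 value."""
    return 8 + len(key.encode("utf-8")) + 4 + 4

def align_up(offset, alignment=32):
    """Round offset up to the next multiple of alignment (GGUF default is 32)."""
    return (offset + alignment - 1) // alignment * alignment

if __name__ == "__main__":
    print(kv_entry_size_f32("general.sampling.top_p"))
```

Under those assumptions the new entry itself takes 38 bytes, not 32. The tensor data start is padded to the alignment, so the data block can only move in multiples of 32, which would be consistent with the observed 32-byte shift.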

  2. Model tensor content

  - Not just metadata: 4 tensors differ in quantization type and bytes:
      - blk.9.ffn_down_exps.weight: Q4_K -> Q5_K
      - blk.9.ffn_down_shexp.weight: Q6_K -> Q8_0
      - blk.13.ffn_down_exps.weight: Q5_K -> Q4_K
      - blk.13.ffn_down_shexp.weight: Q8_0 -> Q6_K
  - The payloads of these 4 tensors also have different SHA-256 hashes when matched by name, so this is a real
    re-quantization, not just relabeling.
  - Other checked tensors (example: output.weight) matched exactly.
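
The by-name comparison above can be sketched roughly like this. It is not the script Codex actually ran; it assumes each file has already been parsed into a hypothetical `{name: (quant_type, payload_bytes)}` map, and how that map is built is left out.

```python
import hashlib

def diff_tensors(old, new):
    """Compare two {name: (quant_type, payload_bytes)} maps by tensor name.

    Returns (name, old_type, new_type, payload_changed) for every tensor
    present in both maps whose type or SHA-256 payload hash differs.
    """
    changes = []
    for name in sorted(old.keys() & new.keys()):
        old_type, old_data = old[name]
        new_type, new_data = new[name]
        old_hash = hashlib.sha256(old_data).hexdigest()
        new_hash = hashlib.sha256(new_data).hexdigest()
        if old_type != new_type or old_hash != new_hash:
            changes.append((name, old_type, new_type, old_hash != new_hash))
    return changes
```

A tensor whose type label changed but whose hash stayed the same would show up with `payload_changed` set to `False`, which is how relabeling could be told apart from re-quantization.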

  3. Net effect

  - Looks like precision was shifted from block 13 to block 9 while keeping overall size essentially the same, plus
    adding top_p metadata.
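
A quick way to see why the size stays about the same is to compare bits per weight for the four quant types involved. The block layouts below are the llama.cpp ones as I understand them (bytes per block, weights per block). The sketch assumes the block 9 and block 13 tensors have the same shape, which the report did not check.

```python
# (bytes per block, weights per block) for each quantization type
BLOCK_LAYOUT = {
    "Q4_K": (144, 256),
    "Q5_K": (176, 256),
    "Q6_K": (210, 256),
    "Q8_0": (34, 32),
}

def bits_per_weight(qtype):
    """Average storage cost of one weight, in bits."""
    block_bytes, block_weights = BLOCK_LAYOUT[qtype]
    return block_bytes * 8 / block_weights

def net_bpw_change(changes):
    """Sum of per-tensor bits-per-weight deltas for (old_type, new_type) pairs.
    Only meaningful as a size estimate if all tensors have the same shape."""
    return sum(bits_per_weight(new) - bits_per_weight(old) for old, new in changes)

# The four changes reported above, as (old_type, new_type)
REPORTED_CHANGES = [
    ("Q4_K", "Q5_K"),  # blk.9.ffn_down_exps.weight
    ("Q6_K", "Q8_0"),  # blk.9.ffn_down_shexp.weight
    ("Q5_K", "Q4_K"),  # blk.13.ffn_down_exps.weight
    ("Q8_0", "Q6_K"),  # blk.13.ffn_down_shexp.weight
]

if __name__ == "__main__":
    print(net_bpw_change(REPORTED_CHANGES))
```

With equal shapes the gains in block 9 and the losses in block 13 cancel, which would leave the 32-byte metadata shift as the only size difference.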

Curious what prompted this change?
