GLM-OCR converted for use with llama.cpp. As of b8094, llama.cpp may crash with this model when flash attention is enabled or when certain backends are used (e.g. CPU).
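
A minimal invocation sketch, assuming llama.cpp's `llama-mtmd-cli` tool; the GGUF/mmproj file names and the prompt below are placeholders, and the `-fa on|off|auto` syntax applies to recent builds (older builds toggle flash attention with a bare `-fa`):

```sh
# Placeholder file names; substitute the actual GGUF and mmproj files from this repo.
llama-mtmd-cli \
  -m glm-ocr-Q8_0.gguf \
  --mmproj mmproj-glm-ocr-f16.gguf \
  --image line.png \
  -p "Recognize the text in the image." \
  -fa off \
  -ngl 99   # keep all layers on the GPU so the CPU backend is not exercised
```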

Regrettably, it does not seem to perform well on single-line CJK text out of the box. Or perhaps, due to its dependence on PaddleLayout, it expects images at a specific resolution.
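
If the resolution guess is right, pre-scaling the input might help. This is an untested sketch using ImageMagick, and the 1024-pixel target width is an arbitrary assumption:

```sh
# Hypothetical pre-processing step: scale the line image to a fixed width
# (1024 px here, chosen arbitrarily) while preserving the aspect ratio.
magick line.png -resize 1024x line-1024.png
```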

For some reason, GLM-OCR in llama.cpp behaves differently depending on the maximum context size. Setting -c 2000 typically gives wrong outputs, for example, whereas -c 9000 has produced much better results.
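
For example (same placeholder file names as above), the only change between a bad and a good run has been the context size:

```sh
# Same hypothetical invocation; only the context size differs.
llama-mtmd-cli -m glm-ocr-Q8_0.gguf --mmproj mmproj-glm-ocr-f16.gguf \
  --image line.png -p "Recognize the text in the image." \
  -c 9000   # -c 2000 typically gave wrong outputs; -c 9000 much better
```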

So far in my tests:

  • performs poorly on Korean horizontal text lines.
  • performs decently on Japanese horizontal text lines.
  • performs poorly on Japanese vertical text lines.
  • performs excellently on Chinese horizontal text lines.
  • is relatively robust to quantization.

Base model: zai-org/GLM-OCR