Small Language Models for Kazakh: models, tokenizers, and datasets for Kazakh language modeling.
Saken Tukenov PRO
stukenov
Recent Activity
updated a model 1 day ago: stukenov/sozkz-llama-150m-200k-kk-base-v1
liked a dataset 3 days ago: HuggingFaceTB/smollm-corpus
published a model 3 days ago: stukenov/sozkz-llama-150m-200k-kk-base-v1
Organizations
spaces 4
Kaz Offline LLM Arena 🥇 (pinned, Runtime error): Kaz Offline Arena
SozKZ Kazakh LLM 💬 (Running on Zero): Generate Kazakh responses to your questions
Kaz LLM Leaderboard 🏆 (Runtime error): Evaluate LLMs using Kazakh MC tasks
Sozkz Paper Small Language Models Kazakh 📈 (Running): Generate Kazakh text with small open-source language models
models 26
stukenov/sozkz-llama-150m-200k-kk-base-v1 • Text Generation • 0.2B • Updated • 16
stukenov/kzcalm-baseline-v1 • Updated • 3
stukenov/kzcalm-sp-tokenizer-4k-kk-v1 • Updated
stukenov/sozkz-core-gpt2-200k-kk-base-v1 • Updated
stukenov/sozkz-core-llama-150m-kk-instruct-v2 • Text Generation • 0.2B • Updated • 87
stukenov/sozkz-core-llama-150m-kk-instruct-v1 • Text Generation • 0.2B • Updated • 30
stukenov/sozkz-core-llama-150m-kk-base-v1 • Text Generation • 0.2B • Updated • 94
stukenov/sozkz-core-llama-50m-kk-base-v4 • 60.8M • Updated • 11
stukenov/sozkz-fix-mt5-50m-kk-morph-v1 • Text Generation • 50.6M • Updated • 4
stukenov/sozkz-core-gpt2-60m-kk-base-v1 • Updated
datasets 31
stukenov/kzcalm-mimi-codes-kk-v1 • Viewer • Updated • 232k • 46
stukenov/sozkz-corpus-tokenized-enkk-200k-v1 • Viewer • Updated • 10.4M • 35
stukenov/kzcalm-mimi-codes-kk-v1-test • Viewer • Updated • 2 • 6
stukenov/kzcalm-tts-kk-v1 • Viewer • Updated • 232k • 242
stukenov/sozkz-corpus-tokenized-kk-multidomain-200k-v1 • Viewer • Updated • 1.18M • 35
stukenov/sozkz-corpus-tokenized-kk-200k-v1 • Viewer • Updated • 422k • 32
stukenov/sozkz-corpus-tokenized-enkk-fineweb-edu-v1 • Viewer • Updated • 9.02M • 23
stukenov/kaz-llm-lb-metainfo • Viewer • Updated • 13 • 23
stukenov/s-openbench-eval • Viewer • Updated • 5 • 201
stukenov/offline-data-results • Updated • 43