TurkWeb-Edu Student (Reasoning) 🇹🇷

A Turkish educational content scorer that generates reasoning before scoring. It is the Turkish counterpart of the FineWeb-Edu classifier, trained via Generative Reasoning Distillation.

How It Works

  1. You send Turkish text
  2. The model thinks (generates reasoning in Turkish)
  3. Then outputs an educational quality score (0-5)

Usage

from transformers import AutoTokenizer, AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("YsK-dev/TurkWeb-Edu-Student-Qwen1.5B-SOTA", torch_dtype="auto", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("YsK-dev/TurkWeb-Edu-Student-Qwen1.5B-SOTA")

messages = [
    {"role": "system", "content": "You are an educational quality classifier."},
    {"role": "user", "content": "Analyze the following Turkish text for educational value (0-5):\n\n<your text>\n\nProvide your reasoning and final score."}
]

input_ids = tokenizer.apply_chat_template(messages, return_tensors="pt", add_generation_prompt=True).to(model.device)
output = model.generate(input_ids, max_new_tokens=300, temperature=0.1, do_sample=True)
print(tokenizer.decode(output[0][input_ids.shape[1]:], skip_special_tokens=True))
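Since the model emits its reasoning followed by a final score, you typically need to parse the score out of the free-form text. A minimal sketch is below; the exact output format is an assumption (we simply take the last standalone digit in the 0-5 range), so adjust the pattern to whatever your generations actually look like.

```python
import re
from typing import Optional

def extract_score(generated_text: str) -> Optional[int]:
    """Pull the final 0-5 educational-quality score from the model output.

    Assumption: the score appears as a standalone digit 0-5 somewhere in
    the reasoning (e.g. after "Final score:"); we take the last match.
    """
    matches = re.findall(r"\b([0-5])\b", generated_text)
    return int(matches[-1]) if matches else None

# Hypothetical model response for illustration:
sample = "Metin temel bilgiler iceriyor ancak derinlik eksik. Final score: 2"
print(extract_score(sample))  # -> 2
```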

Training Details

| Component | Value |
|-----------|-------|
| Teacher   | Qwen3-30B-A3B-Instruct-2507 |
| Student   | Qwen/Qwen2.5-1.5B-Instruct |
| Method    | SFT with reasoning distillation (LoRA r=64) |
| Data      | 660K Turkish web samples from FineWeb-2 |
| Hardware  | 1x NVIDIA H100 80GB |
| Steps     | 20,000 |
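The LoRA setup in the table could be reproduced with a `peft` config along these lines. Only r=64 comes from the card; the alpha, dropout, and target modules shown here are assumptions, not documented values.

```python
from peft import LoraConfig

# Sketch of the fine-tuning adapter config. r=64 is stated in the card;
# every other hyperparameter below is an assumed, commonly used default.
lora_config = LoraConfig(
    r=64,
    lora_alpha=128,      # assumed: often set to 2*r
    lora_dropout=0.05,   # assumed
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed
    task_type="CAUSAL_LM",
)
```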