---
license: apache-2.0
language:
  - en
base_model:
  - LLM360/K2-V2
---

# K2-V2-Instruct

📚 Tech Report - 📝 Code - 🏠 Project Page

*(image: k2-banner-placeholder)*

K2-V2 is our best fully open-source model to date and ranks among the best open-weight models in its class. As the latest base model in LLM360's strongest project family, K2-V2 features a dense architecture with 70 billion parameters.

*(image: k2-sft-aime)*

Beyond standard competencies such as knowledge and conversation, K2 provides advanced capabilities, including long-context consistency, deep mathematical knowledge, and reasoning behaviors. These serve as foundational building blocks for sophisticated downstream use cases, such as solving complex math problems and executing agentic workflows.

*(image: k2-base-gpqa)*

During our light SFT phase, our goal was to capitalize on the reasoning capabilities obtained during mid-training while letting users experience the model without having to wait for lengthy reasoning to complete.


## Quick Start

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model and tokenizer from the Hugging Face Hub.
model = AutoModelForCausalLM.from_pretrained("llm360/k2-v2", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("llm360/k2-v2")

prompt = "Explain why the derivative of sin(x) is cos(x)."
messages = [
    {"role": "system", "content": "You are K2, a helpful assistant created by Mohamed bin Zayed University of Artificial Intelligence (MBZUAI) Institute of Foundation Models (IFM)."},
    {"role": "user", "content": prompt}
]

# Render the conversation with the model's chat template, then generate a reply.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
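
For higher-throughput inference, the same checkpoint can also be run with an engine such as vLLM. The snippet below is a minimal sketch, not an official recipe: the repo id is taken from the Quick Start above, and the sampling settings and `tensor_parallel_size` are illustrative assumptions.

```python
# Offline-inference sketch with vLLM (assumes `pip install vllm` and a recent
# release that provides LLM.chat). Repo id, sampling settings, and GPU count
# are illustrative, not an official recipe.
from vllm import LLM, SamplingParams

llm = LLM(model="llm360/k2-v2", tensor_parallel_size=8)  # a dense 70B model typically needs several GPUs
sampling = SamplingParams(temperature=0.7, top_p=0.95, max_tokens=512)

messages = [
    {"role": "system", "content": "You are K2, a helpful assistant."},
    {"role": "user", "content": "Explain why the derivative of sin(x) is cos(x)."},
]

# LLM.chat applies the model's chat template before generation.
outputs = llm.chat(messages, sampling)
print(outputs[0].outputs[0].text)
```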

## Evaluation Summary

| Model | Specifications | LongBench V2 | AIME25 | HMMT25 | GSM8K | Minerva | GPQA-D | MBPP | HumanEval | LCBv6 |
|---|---|---|---|---|---|---|---|---|---|---|
| K2 Low | Dense · 70B | 40.7 | 27.3 | 19.0 | 92.4 | 85.0 | 48.5 | 71.0 | 82.3 | 39.9 |
| K2 Medium | Dense · 70B | 41.3 | 62.0 | 45.6 | 92.0 | 90.6 | 60.6 | 75.8 | 84.2 | 51.3 |
| K2 High | Dense · 70B | 42.6 | 80.2 | 71.4 | 94.8 | 94.5 | 69.3 | 84.8 | 91.5 | 67.0 |

Please refer to our Tech Report for detailed evaluation results.
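
If you want to spot-check one of the benchmarks above locally, a harness such as EleutherAI's lm-evaluation-harness can run it against the Hugging Face checkpoint. The sketch below is only illustrative; it is not the exact evaluation setup behind the table (prompting, few-shot counts, and reasoning-effort settings in the Tech Report may differ).

```python
# Illustrative sketch using EleutherAI's lm-evaluation-harness (`pip install lm_eval`).
# NOT the exact configuration behind the table above; few-shot counts, prompting,
# and reasoning-effort settings may differ from the Tech Report.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=llm360/k2-v2,dtype=bfloat16",
    tasks=["gsm8k"],
    num_fewshot=5,
    batch_size=8,
)
print(results["results"]["gsm8k"])
```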


## Datasets & Mixtures

### SFT Mix

- TxT360-3efforts: curated instruction data plus mixed-difficulty reasoning traces
- Tool-calling demonstrations (see the sketch below)
- A small but high-value corpus to showcase the model's potential

All mixtures, filtering rules, and data sources are fully released for reproducibility.
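
Because tool-calling demonstrations are part of the SFT mix, tool-use prompts can in principle be built with the `tools` argument of `apply_chat_template` in transformers. The sketch below assumes the released chat template supports that standard format, which is not confirmed here; `get_weather` is a hypothetical example tool.

```python
# Hedged sketch of tool-use prompting via transformers' `tools` argument.
# Assumes the released chat template supports the standard transformers
# tool-calling format; `get_weather` is a hypothetical example tool.
from transformers import AutoTokenizer

def get_weather(city: str):
    """Get the current weather for a city.

    Args:
        city: Name of the city to look up.
    """
    return "sunny, 24°C"  # placeholder implementation

tokenizer = AutoTokenizer.from_pretrained("llm360/k2-v2")

messages = [
    {"role": "user", "content": "What's the weather in Abu Dhabi right now?"}
]

# transformers converts the function signature and docstring into a JSON schema
# and injects it into the prompt according to the model's chat template.
prompt = tokenizer.apply_chat_template(
    messages,
    tools=[get_weather],
    add_generation_prompt=True,
    tokenize=False,
)
print(prompt)
```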


## Model Description

- Model type: Language model with transformer architecture
- Training stage: Pretraining & post-training
- Language(s) (NLP): English
- License: Apache 2.0
| Model Hyperparameter | Value |
|---|---|
| Total Parameters | 70B |
| Hidden Size | 8,192 |
| Intermediate Size (MLPs) | 28,672 |
| Number of Attention Heads | 64 |
| Number of Hidden Layers | 80 |
| RMSNorm ε | 1e-5 |
| Pre-training Seq Length | 8,192 |
| Post-training Seq Length | 524,288 |
| Vocab Size | 250,000 |
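
The values above can be cross-checked against the released config. The snippet below is a small sanity-check sketch; the attribute names assume a Llama-style config class in transformers, which is an assumption rather than a statement about this model's exact config.

```python
# Sanity-check sketch: read the hyperparameters above from the released config.
# Attribute names assume a Llama-style config in transformers (an assumption,
# not a guarantee of this model's exact config class).
from transformers import AutoConfig

config = AutoConfig.from_pretrained("llm360/k2-v2")
print("hidden_size:        ", config.hidden_size)          # expected 8,192
print("intermediate_size:  ", config.intermediate_size)    # expected 28,672
print("num_attention_heads:", config.num_attention_heads)  # expected 64
print("num_hidden_layers:  ", config.num_hidden_layers)    # expected 80
print("rms_norm_eps:       ", config.rms_norm_eps)         # expected 1e-5
print("vocab_size:         ", config.vocab_size)           # expected 250,000
```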

## Citation

```bibtex
@misc{llm360_k2v2,
  title         = {K2-V2: A 360-Open, Reasoning-Enhanced Open Foundation Model},
  author        = {K2 Team},
  year          = {2025},
  archivePrefix = {arXiv},
  eprint        = {XXXX.XXXXX},
  primaryClass  = {cs.CL}
}
```