LLM Reasoning: RL, Self-Play

university

https://github.com/xashru

Recent Activity

xashru

authored a paper about 1 month ago

SPHINX: A Synthetic Environment for Visual Perception and Reasoning

Paper • 2511.20814 • Published Nov 25, 2025

xashru

authored 2 papers about 2 months ago

Limits of Generalization in RLVR: Two Case Studies in Mathematical Reasoning

Paper • 2510.27044 • Published Oct 30, 2025

AthenaBench: A Dynamic Benchmark for Evaluating LLMs in Cyber Threat Intelligence

Paper • 2511.01144 • Published Nov 3, 2025

xashru

authored a paper over 1 year ago

CTIBench: A Benchmark for Evaluating LLMs in Cyber Threat Intelligence

Paper • 2406.07599 • Published Jun 11, 2024