ypwang61/One-Shot-RLVR-Qwen2.5-Math-1.5B-1.2k-dsr-sub

Text Generation

text-generation-inference


ypwang61 committed on Aug 27

Commit 46abcb1 · verified · 1 Parent(s): 80c902d

Create README.md

Files changed (1)

README.md +13 -0


	@@ -0,0 +1,13 @@

+---
+license: apache-2.0
+library_name: transformers
+pipeline_tag: text-generation
+base_model:
+- Qwen/Qwen2.5-Math-1.5B
+datasets:
+- ypwang61/One-Shot-RLVR-Datasets
+---
+This repository contains the model presented in [Reinforcement Learning for Reasoning in Large Language Models with One Training Example](https://huggingface.co/papers/2504.20571).
+Code: https://github.com/ypwang61/One-Shot-RLVR
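
The README added in this commit declares `library_name: transformers` and `pipeline_tag: text-generation` but gives no usage snippet. The following is a minimal sketch of how the checkpoint could be loaded, assuming it works with the standard `AutoTokenizer` and `AutoModelForCausalLM` classes. The repository ID is taken from the page header. The helper names `build_prompt` and `generate` are hypothetical, and the step-by-step prompt suffix is an assumption borrowed from common Qwen2.5-Math usage, not something this commit documents.

```python
# Sketch only: loads the checkpoint named on this page with Hugging Face transformers.
# Repository ID copied from the page header.
MODEL_ID = "ypwang61/One-Shot-RLVR-Qwen2.5-Math-1.5B-1.2k-dsr-sub"


def build_prompt(question: str) -> str:
    """Hypothetical helper: append a step-by-step instruction to the question.

    The suffix is an assumption, not a format confirmed by this repository.
    """
    return (
        f"{question}\n"
        "Please reason step by step, and put your final answer within \\boxed{}."
    )


def generate(question: str, max_new_tokens: int = 512) -> str:
    """Hypothetical helper: greedy-decode an answer for one question."""
    # Imported lazily so build_prompt can be used without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    inputs = tokenizer(build_prompt(question), return_tensors="pt")
    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=False,  # greedy decoding for repeatable output
    )
    # Drop the prompt tokens and decode only the newly generated continuation.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `generate("What is 12 * 13?")` would download the weights on first use and return the decoded continuation. Greedy decoding is used here only to keep the sketch deterministic. The linked code repository is the place to check for the prompt format and sampling settings the authors actually use.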