Add library_name, arXiv metadata and project page link
Hi! I'm Niels, part of the community team at Hugging Face.
This PR improves the model card by:
- Adding `library_name: transformers` to the metadata, which enables the "Use in Transformers" button and automated code snippets (illustrated by the sketch right after this description).
- Adding the `arxiv` ID to the metadata to link the model with the paper [Surprisal-Guided Selection: Compute-Optimal Test-Time Strategies for Execution-Grounded Code Generation](https://huggingface.co/papers/2602.07670).
- Adding a link to the research project page in the header.
Everything else looks great!
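
For context on the first bullet: once `library_name: transformers` is set, the Hub can surface an auto-generated loading snippet along the lines of the sketch below. The repo id and prompt here are placeholders, and the card's own Quick Start remains the authoritative usage example.

```python
# Illustrative sketch of the kind of snippet the "Use in Transformers" widget can
# generate once library_name is set. The repo id and prompt are placeholders;
# a 120B model needs multi-GPU or offloaded loading, which device_map="auto"
# delegates to accelerate.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "<org>/KernelBench-RLVR-120b"  # placeholder, replace with the actual repo

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id, torch_dtype="auto", device_map="auto")

prompt = "Write a CUDA kernel that applies ReLU elementwise to a float32 tensor."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```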
README.md (changed):

```diff
@@ -1,44 +1,46 @@
 ---
-license: apache-2.0
 base_model: openai/gpt-oss-120b
-tags:
-- gpu-kernel
-- cuda
-- code-generation
-- reinforcement-learning
-- grpo
-- kernelbench
 datasets:
-…
+- ScalingIntelligence/KernelBench
 language:
-…
+- en
+license: apache-2.0
 pipeline_tag: text-generation
+library_name: transformers
+arxiv: 2602.07670
+tags:
+- gpu-kernel
+- cuda
+- code-generation
+- reinforcement-learning
+- grpo
+- kernelbench
 model-index:
-…
+- name: KernelBench-RLVR-120b
+  results:
+  - task:
+      type: text-generation
+      name: GPU Kernel Generation
+    dataset:
+      name: KernelBench L1
+      type: ScalingIntelligence/KernelBench
+    metrics:
+    - type: custom
+      value: 90.0
+      name: task_success_rate (K=64, 20 tasks)
+    - type: custom
+      value: 53.3
+      name: fast_1 (K=1, per-sample)
+    - type: accuracy
+      value: 98.4
+      name: correctness (training dist.)
 ---
 
 # KernelBench-RLVR-120b
 
 A 120B-parameter model fine-tuned with GRPO (Group Relative Policy Optimization) for GPU kernel generation. This model was used to study compute-optimal test-time strategies in [Surprisal-Guided Selection](http://arxiv.org/abs/2602.07670), where we find that Best-of-N search with surprisal-guided selection recovers oracle performance at zero additional cost.
 
-**Paper**: [arXiv:2602.07670](http://arxiv.org/abs/2602.07670) | **Code**: [GitHub](https://github.com/jbarnes850/test-time-training)
+**Paper**: [arXiv:2602.07670](http://arxiv.org/abs/2602.07670) | **Project Page**: [Blog](https://jbarnes850.github.io/2026/02/02/surprisal-guided-selection/) | **Code**: [GitHub](https://github.com/jbarnes850/test-time-training)
 
 ## Quick Start
 
@@ -175,4 +177,4 @@ If you use this model, please cite [our paper](http://arxiv.org/abs/2602.07670):
 - [KernelBench](https://github.com/ScalingIntelligence/KernelBench) - Ouyang et al., 2025
 - [TTT-Discover](https://arxiv.org/abs/2601.16175) - Yuksekgonul et al., 2026
 - [SDPO](https://arxiv.org/abs/2601.20802) - Zeng et al., 2026
-- [Scalable Power Sampling](https://arxiv.org/abs/2601.21590) - Ji et al., 2026
+- [Scalable Power Sampling](https://arxiv.org/abs/2601.21590) - Ji et al., 2026
```
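
Once the PR is merged, one quick way to confirm the Hub picks up the new fields is to read the card back with `huggingface_hub`. This is only a sanity-check sketch, not part of the diff; the repo id is again a placeholder.

```python
# Sanity check (not part of this PR): load the merged card and confirm the new
# metadata fields are present. The repo id below is a placeholder.
from huggingface_hub import ModelCard

card = ModelCard.load("<org>/KernelBench-RLVR-120b")  # placeholder repo id
meta = card.data.to_dict()

print(meta.get("library_name"))  # expect: "transformers"
print(meta.get("arxiv"))         # expect: 2602.07670
print(meta.get("pipeline_tag"))  # expect: "text-generation"
print(meta.get("base_model"))    # expect: "openai/gpt-oss-120b"
```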
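
The card's description also mentions Best-of-N search with surprisal-guided selection. The paper's exact selection rule isn't reproduced here, but a common instantiation is to score each sampled candidate by its mean token-level surprisal under the generating model and keep the lowest-scoring one, reusing logprobs already produced during sampling. The sketch below shows that generic pattern with hypothetical names; it is an illustration, not the authors' implementation.

```python
# Generic Best-of-N selection by mean token surprisal (illustrative only).
# Per-token log-probabilities are assumed to come from the sampling step itself
# (e.g. transformers' model.compute_transition_scores or an inference server
# that returns logprobs), so selection adds no extra forward passes.
from typing import List, Sequence


def mean_surprisal(token_logprobs: Sequence[float]) -> float:
    """Mean negative log-probability (in nats) of one sampled candidate."""
    return -sum(token_logprobs) / max(len(token_logprobs), 1)


def select_by_surprisal(candidates: List[str],
                        logprobs_per_candidate: List[Sequence[float]]) -> str:
    """Keep the candidate whose tokens the model found least surprising."""
    scores = [mean_surprisal(lp) for lp in logprobs_per_candidate]
    best = min(range(len(candidates)), key=scores.__getitem__)
    return candidates[best]


# Toy usage with made-up numbers: candidate B has lower mean surprisal and wins.
kernels = ["// candidate A ...", "// candidate B ..."]
logprobs = [[-2.1, -1.7, -3.0], [-0.9, -1.1, -0.8]]
print(select_by_surprisal(kernels, logprobs))
```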