Add evaluation results for HLE, MMLU-Pro
#9, opened by SaylorTwift (HF Staff)
Evaluation Results
This PR adds evaluation results extracted from the Model Card.
**Benchmarks:**
- MMLU-Pro: 87.8
- HLE: 28.7
- HLE: 48.3
**Files created:**
- .eval_results/mmlu_pro.yaml
- .eval_results/hle.yaml
- .eval_results/hle_with_tools.yaml

---

Extracted automatically using the [LLM-powered evaluation extractor](https://github.com/huggingface/community-evals).