Add evaluation results for HLE, MMLU-Pro

#9
by SaylorTwift HF Staff - opened

Evaluation Results

    This PR adds evaluation results extracted from the Model Card.

    **Benchmarks:**
    - MMLU-Pro: 87.8
    - HLE: 28.7
    - HLE: 48.3

    **Files created:**
    - .eval_results/mmlu_pro.yaml
    - .eval_results/hle.yaml
    - .eval_results/hle_with_tools.yaml

    ---

    Extracted automatically using the [LLM-powered evaluation extractor](https://github.com/huggingface/community-evals).
    
Ready to merge
This branch is ready to get merged automatically.
