Adding the Open Portuguese LLM Leaderboard Evaluation Results
This is an automated PR created with https://huggingface.co/spaces/eduagarcia-temp/portuguese-leaderboard-results-to-modelcard
The purpose of this PR is to add evaluation results from the [🚀 Open Portuguese LLM Leaderboard](https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard) to your model card.
If you encounter any issues, please report them to https://huggingface.co/spaces/eduagarcia-temp/portuguese-leaderboard-results-to-modelcard/discussions
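For reference, below is a minimal sketch of how a card update like this one can be opened as a pull request with the `huggingface_hub` library. This is an illustration under assumed tooling, not the linked Space's actual implementation; the `results_section` string is a shortened placeholder for the full section shown in the diff below.

```python
# Hedged sketch (assumed workflow, not the Space's actual code):
# append a results section to an existing model card and open it as a PR.
from huggingface_hub import ModelCard

REPO_ID = "cognitivecomputations/dolphin-2.9.1-llama-3-70b"

# Placeholder; the complete section and table appear in the diff below.
results_section = """
# Open Portuguese LLM Leaderboard Evaluation Results

| Metric  | Value |
|---------|-------|
| Average | 72.43 |
"""

# Load the current README, append the new section, and push it as a pull request.
# Requires a write token (e.g. via `huggingface-cli login` or the HF_TOKEN env var).
card = ModelCard.load(REPO_ID)
card.text = card.text.rstrip() + "\n\n" + results_section.strip() + "\n"
card.push_to_hub(
    REPO_ID,
    create_pr=True,
    commit_message="Adding the Open Portuguese LLM Leaderboard Evaluation Results",
)
```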
README.md CHANGED
@@ -1,12 +1,9 @@
 ---
 license: llama3
-base_model: meta-llama/Meta-Llama-3-70B
 tags:
 - generated_from_trainer
 - axolotl
-model-index:
-- name: out
-  results: []
+base_model: meta-llama/Meta-Llama-3-70B
 datasets:
 - cognitivecomputations/Dolphin-2.9
 - teknium/OpenHermes-2.5
@@ -16,6 +13,9 @@ datasets:
 - microsoft/orca-math-word-problems-200k
 - Locutusque/function-calling-chatml
 - internlm/Agent-FLAN
+model-index:
+- name: out
+  results: []
 ---
 
 # Dolphin 2.9.1 Llama 3 70b 🐬
@@ -510,3 +510,22 @@ The following hyperparameters were used during training:
 - Pytorch 2.2.2+cu121
 - Datasets 2.19.1
 - Tokenizers 0.19.1
+
+
+# Open Portuguese LLM Leaderboard Evaluation Results
+
+Detailed results can be found [here](https://huggingface.co/datasets/eduagarcia-temp/llm_pt_leaderboard_raw_results/tree/main/cognitivecomputations/dolphin-2.9.1-llama-3-70b) and on the [🚀 Open Portuguese LLM Leaderboard](https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard)
+
+|          Metric          |  Value  |
+|--------------------------|---------|
+|Average                   |**72.43**|
+|ENEM Challenge (No Images)|    76.56|
+|BLUEX (No Images)         |    67.87|
+|OAB Exams                 |    61.37|
+|Assin2 RTE                |    92.11|
+|Assin2 STS                |    78.26|
+|FaQuAD NLI                |    52.75|
+|HateBR Binary             |    81.01|
+|PT Hate Speech Binary     |    71.78|
+|tweetSentBR               |    70.14|
+
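The "Detailed results" link in the added section points at a folder inside the leaderboard's raw-results dataset. A hedged sketch of pulling just that folder locally with `snapshot_download` follows; the repository ID and folder path are taken from the link above, and the file names inside the folder are not assumed.

```python
# Hedged sketch: download the detailed result files referenced by the
# "Detailed results" link in the section added by this PR.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="eduagarcia-temp/llm_pt_leaderboard_raw_results",
    repo_type="dataset",
    allow_patterns=["cognitivecomputations/dolphin-2.9.1-llama-3-70b/*"],
)
print(local_dir)  # local path now containing this model's raw evaluation outputs
```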