Update README.md
README.md
CHANGED
@@ -10,4 +10,11 @@ WIP
 |leaderboard_gpqa | N/A| | | | | | | |
 | - leaderboard_gpqa_diamond | 1|none | 0|acc_norm|↑ |0.2071|± |0.0289|
 | - leaderboard_gpqa_extended| 1|none | 0|acc_norm|↑ |0.2308|± |0.0180|
-| - leaderboard_gpqa_main | 1|none | 0|acc_norm|↑ |0.2679|± |0.0209|
+| - leaderboard_gpqa_main | 1|none | 0|acc_norm|↑ |0.2679|± |0.0209|
+
+| Tasks |Version|Filter|n-shot| Metric | |Value | |Stderr|
+|-------------------------------------|-------|------|-----:|--------|---|-----:|---|-----:|
+|leaderboard_musr | N/A| | | | | | | |
+| - leaderboard_musr_murder_mysteries | 1|none | 0|acc_norm|↑ |0.5160|± |0.0317|
+| - leaderboard_musr_object_placements| 1|none | 0|acc_norm|↑ |0.2383|± |0.0267|
+| - leaderboard_musr_team_allocation | 1|none | 0|acc_norm|↑ |0.4400|± |0.0315|