Update README.md
README.md
CHANGED
@@ -18,6 +18,35 @@ Full model card coming soon. Long story short, the curation method for the datas

The full model card will be written and the dataset will be published once other models are finished training and evals are complete.

## Currently finished evals

| Tasks                            | Version | Filter | n-shot | Metric   |   | Value  |   | Stderr |
|----------------------------------|--------:|--------|-------:|----------|---|-------:|---|-------:|
| agieval_nous                     |       0 | none   |        | acc_norm | ↑ | 0.4266 | ± | 0.0095 |
| - agieval_aqua_rat               |       1 | none   |      0 | acc      | ↑ | 0.3268 | ± | 0.0295 |
|                                  |         | none   |      0 | acc_norm | ↑ | 0.3150 | ± | 0.0292 |
| - agieval_logiqa_en              |       1 | none   |      0 | acc      | ↑ | 0.3825 | ± | 0.0191 |
|                                  |         | none   |      0 | acc_norm | ↑ | 0.3856 | ± | 0.0191 |
| - agieval_lsat_ar                |       1 | none   |      0 | acc      | ↑ | 0.2652 | ± | 0.0292 |
|                                  |         | none   |      0 | acc_norm | ↑ | 0.2348 | ± | 0.0280 |
| - agieval_lsat_lr                |       1 | none   |      0 | acc      | ↑ | 0.4667 | ± | 0.0221 |
|                                  |         | none   |      0 | acc_norm | ↑ | 0.4294 | ± | 0.0219 |
| - agieval_lsat_rc                |       1 | none   |      0 | acc      | ↑ | 0.5911 | ± | 0.0300 |
|                                  |         | none   |      0 | acc_norm | ↑ | 0.5465 | ± | 0.0304 |
| - agieval_sat_en                 |       1 | none   |      0 | acc      | ↑ | 0.7670 | ± | 0.0295 |
|                                  |         | none   |      0 | acc_norm | ↑ | 0.7282 | ± | 0.0311 |
| - agieval_sat_en_without_passage |       1 | none   |      0 | acc      | ↑ | 0.4806 | ± | 0.0349 |
|                                  |         | none   |      0 | acc_norm | ↑ | 0.4320 | ± | 0.0346 |
| - agieval_sat_math               |       1 | none   |      0 | acc      | ↑ | 0.5091 | ± | 0.0338 |
|                                  |         | none   |      0 | acc_norm | ↑ | 0.4364 | ± | 0.0335 |

**average acc:** 0.4736
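
The average acc figure is consistent with an unweighted mean of the eight subtask `acc` values listed in the AGIEval table. The source does not say how it was computed, so the following is only a minimal sketch under that assumption; the names `subtask_acc` and `unweighted_mean` are illustrative and not part of any evaluation tool.

```python
# Subtask `acc` values copied from the AGIEval table in this README.
# Assumption: "average acc" is a plain (unweighted) mean over subtasks.
subtask_acc = {
    "agieval_aqua_rat": 0.3268,
    "agieval_logiqa_en": 0.3825,
    "agieval_lsat_ar": 0.2652,
    "agieval_lsat_lr": 0.4667,
    "agieval_lsat_rc": 0.5911,
    "agieval_sat_en": 0.7670,
    "agieval_sat_en_without_passage": 0.4806,
    "agieval_sat_math": 0.5091,
}


def unweighted_mean(values):
    """Return the arithmetic mean, giving every subtask equal weight."""
    values = list(values)
    return sum(values) / len(values)


average_acc = round(unweighted_mean(subtask_acc.values()), 4)
print(average_acc)  # 0.4736
```

This sketch only covers the `acc` rows. The `acc_norm` value of 0.4266 on the `agieval_nous` group row is a separate aggregate taken from the table as reported and is not reproduced here.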

| Tasks           | Version | Filter           | n-shot | Metric      |   | Value  |   | Stderr |
|-----------------|--------:|------------------|-------:|-------------|---|-------:|---|-------:|
| gsm8k_cot_llama |       3 | flexible-extract |      8 | exact_match | ↑ | 0.8249 | ± | 0.0105 |
|                 |         | strict-match     |      8 | exact_match | ↑ | 0.8241 | ± | 0.0105 |

## Model Details

### Model Description