Update README.md
README.md
@@ -21,15 +21,14 @@ Average of 2 Test Runs with 1 point for correct answer, 0.5 point for partial co
 --**Accuracy Score**: **82.25** correct out of 100
 --Not Found Classification: 40.0%
 --Boolean: 61.25%
---Math/Logic: 8.75%
+--Math/Logic: 8.75%
 --Complex Questions (1-5): 1 (Low)
 --Summarization Quality (1-5): 2 (Coherent, extractive)
 --Hallucinations: No hallucinations observed in test runs.
 
 For test run results (and good indicator of target use cases), please see the files ("core_rag_test" and "answer_sheet" in this repo).
 
---As a reference point, this model shows substantial improvements in results, compared with the BLING 1.0B Pythia, with fine-tuning and the base training substantially the same.
---The model's ability to follow instructions and answer detailed questions improves dramatically from 1.0B -> 1.4B parameters.
+--As a reference point, this model shows substantial improvements in results, compared with the BLING 1.0B Pythia, with fine-tuning and the base training substantially the same. The model's ability to follow instructions and answer detailed questions improves dramatically from 1.0B -> 1.4B parameters.
 
 
 ### Model Description