Update README.md

add t5-11b comparison
README.md CHANGED

@@ -36,7 +36,8 @@ SST accuracy going down (0.54 for 10k samples and 1 epoch, -20% compared to the
 This was unexpected, and weird, and probably would bear further investigation.
 
 The model was much worse at correctly identifying positive sentiment (57% accuracy) than it was at
-identifying negative sentiment (93% accuracy) - see Confusion Matrix, below.
+identifying negative sentiment (93% accuracy) - see Confusion Matrix, below. This performance on
+negative sentiment is good - State of the Art for SST2 overall is 97% (achieved by [T5-11B](https://huggingface.co/google-t5/t5-11b))
 
 Since the training dataset was balanced across positive and negative examples, this mismatch seems likely
 to have been present in the base model, although this was not confirmed. Next steps for improvement
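For context on the per-class numbers discussed above (57% on positive vs. 93% on negative), here is a minimal sketch of how such figures can be read off a confusion matrix. It assumes scikit-learn and 0/1 SST-2 labels; the variable names and toy data are illustrative and not taken from the repository.

```python
# Minimal sketch: per-class accuracy (recall) from a confusion matrix.
# Assumes scikit-learn; labels are 0 = negative, 1 = positive (toy data).
from sklearn.metrics import confusion_matrix

y_true = [0, 0, 1, 1, 1, 0, 1, 0]
y_pred = [0, 0, 1, 0, 0, 0, 1, 1]

# Rows are true labels, columns are predicted labels.
cm = confusion_matrix(y_true, y_pred, labels=[0, 1])

neg_acc = cm[0, 0] / cm[0].sum()   # fraction of true negatives predicted negative
pos_acc = cm[1, 1] / cm[1].sum()   # fraction of true positives predicted positive
print(f"negative accuracy: {neg_acc:.2f}, positive accuracy: {pos_acc:.2f}")
```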