End of training

Files changed (2)

README.md +30 -28
runs/Jan23_15-47-00_ultramarine/events.out.tfevents.1737636421.ultramarine.3980114.0 +2 -2

README.md CHANGED

@@ -19,9 +19,9 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [answerdotai/ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.6006
-- F1: 0.9060
-- Accuracy: 0.9061
 ## Model description
@@ -41,37 +41,39 @@ More information needed
 The following hyperparameters were used during training:
 - learning_rate: 8e-05
-- train_batch_size: 64
-- eval_batch_size: 64
 - seed: 42
 - optimizer: Use adamw_torch with betas=(0.9,0.98) and epsilon=1e-06 and optimizer_args=No additional optimizer arguments
-- lr_scheduler_type: linear
 - num_epochs: 20
 ### Training results
-| Training Loss | Epoch | Step | Validation Loss | F1     | Accuracy |
-|:-------------:|:-----:|:----:|:---------------:|:------:|:--------:|
-| 1.7403        | 1.0   | 73   | 1.2040          | 0.5117 | 0.5429   |
-| 1.036         | 2.0   | 146  | 0.7293          | 0.7292 | 0.7347   |
-| 0.6043        | 3.0   | 219  | 0.5232          | 0.8202 | 0.8163   |
-| 0.3395        | 4.0   | 292  | 0.3972          | 0.8602 | 0.8571   |
-| 0.173         | 5.0   | 365  | 0.4365          | 0.8548 | 0.8571   |
-| 0.0784        | 6.0   | 438  | 0.4332          | 0.8667 | 0.8653   |
-| 0.0445        | 7.0   | 511  | 0.5221          | 0.8856 | 0.8857   |
-| 0.0354        | 8.0   | 584  | 0.5177          | 0.9023 | 0.9020   |
-| 0.0239        | 9.0   | 657  | 0.8380          | 0.8676 | 0.8653   |
-| 0.0199        | 10.0  | 730  | 0.6887          | 0.8755 | 0.8735   |
-| 0.017         | 11.0  | 803  | 0.6245          | 0.9013 | 0.9020   |
-| 0.0058        | 12.0  | 876  | 0.5052          | 0.9147 | 0.9143   |
-| 0.0051        | 13.0  | 949  | 0.6583          | 0.8834 | 0.8816   |
-| 0.0049        | 14.0  | 1022 | 0.6444          | 0.8889 | 0.8898   |
-| 0.0021        | 15.0  | 1095 | 0.7612          | 0.8761 | 0.8776   |
-| 0.001         | 16.0  | 1168 | 0.6017          | 0.9143 | 0.9143   |
-| 0.0005        | 17.0  | 1241 | 0.6069          | 0.9016 | 0.9020   |
-| 0.0005        | 18.0  | 1314 | 0.6110          | 0.9019 | 0.9020   |
-| 0.0004        | 19.0  | 1387 | 0.6050          | 0.9019 | 0.9020   |
-| 0.0004        | 20.0  | 1460 | 0.6006          | 0.9060 | 0.9061   |
 ### Framework versions

 This model is a fine-tuned version of [answerdotai/ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base) on an unknown dataset.
 It achieves the following results on the evaluation set:
+- Loss: 0.8071
+- F1: 0.7980
+- Accuracy: 0.7959
 ## Model description
 The following hyperparameters were used during training:
 - learning_rate: 8e-05
+- train_batch_size: 128
+- eval_batch_size: 128
 - seed: 42
+- gradient_accumulation_steps: 2
+- total_train_batch_size: 256
 - optimizer: Use adamw_torch with betas=(0.9,0.98) and epsilon=1e-06 and optimizer_args=No additional optimizer arguments
+- lr_scheduler_type: cosine
+- lr_scheduler_warmup_steps: 500
 - num_epochs: 20
 ### Training results
+| Training Loss | Epoch   | Step | Validation Loss | F1     | Accuracy |
+|:-------------:|:-------:|:----:|:---------------:|:------:|:--------:|
+| 3.981         | 1.0     | 19   | 1.9665          | 0.0904 | 0.1633   |
+| 3.6781        | 2.0     | 38   | 1.8383          | 0.1882 | 0.2449   |
+| 3.1283        | 3.0     | 57   | 1.4232          | 0.4896 | 0.5061   |
+| 2.4044        | 4.0     | 76   | 1.2398          | 0.5488 | 0.5673   |
+| 1.9166        | 5.0     | 95   | 1.1468          | 0.5726 | 0.6041   |
+| 1.7122        | 6.0     | 114  | 1.0013          | 0.6649 | 0.6653   |
+| 1.4626        | 7.0     | 133  | 0.8954          | 0.7198 | 0.7224   |
+| 1.2173        | 8.0     | 152  | 0.7306          | 0.7611 | 0.7592   |
+| 1.0648        | 9.0     | 171  | 0.7449          | 0.7412 | 0.7388   |
+| 0.9008        | 10.0    | 190  | 0.6874          | 0.7754 | 0.7714   |
+| 0.856         | 11.0    | 209  | 0.6584          | 0.8071 | 0.8082   |
+| 0.7557        | 12.0    | 228  | 0.6046          | 0.7854 | 0.7837   |
+| 0.472         | 13.0    | 247  | 0.8246          | 0.7428 | 0.7429   |
+| 0.4386        | 14.0    | 266  | 0.7892          | 0.8042 | 0.8082   |
+| 0.3418        | 15.0    | 285  | 0.6727          | 0.8248 | 0.8286   |
+| 0.2662        | 16.0    | 304  | 0.8244          | 0.8144 | 0.8163   |
+| 0.1774        | 17.0    | 323  | 0.7832          | 0.8083 | 0.8041   |
+| 0.1246        | 18.0    | 342  | 0.5501          | 0.8703 | 0.8694   |
+| 0.121         | 18.9730 | 360  | 0.8071          | 0.7980 | 0.7959   |
 ### Framework versions
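
Reviewer note on the new hyperparameters: the sketch below checks two pieces of arithmetic implied by the updated README. It assumes a single training device (so `total_train_batch_size` is just `train_batch_size` times `gradient_accumulation_steps`) and that the `cosine` scheduler has the usual shape of linear warmup followed by half-cosine decay. The function `lr_at_step` is an illustrative re-implementation written for this note, not code from the training run, and using the last logged step (360) as the total step count is also an assumption. Under those assumptions, the 500 warmup steps exceed the 360 logged steps, so the learning rate would still have been ramping up when training ended.

```python
import math


def lr_at_step(step, base_lr, warmup_steps, total_steps):
    """Illustrative schedule: linear warmup, then half-cosine decay to zero.

    Hypothetical helper written for this note; it mirrors the common
    warmup-plus-cosine shape but is not taken from the training code.
    """
    if step < warmup_steps:
        # Warmup phase: scale the base rate linearly from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay phase: fraction of the post-warmup budget already consumed.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


# Values copied from the new README.
base_lr = 8e-05
train_batch_size = 128
gradient_accumulation_steps = 2
warmup_steps = 500
last_logged_step = 360  # final row of the training results table

# Assumption: one device, so no extra multiplier for data parallelism.
effective_batch = train_batch_size * gradient_accumulation_steps

# Assumption: the run stopped at the last logged step.
final_lr = lr_at_step(last_logged_step, base_lr, warmup_steps, last_logged_step)

print(f"effective batch size: {effective_batch}")
print(f"warmup finished: {last_logged_step >= warmup_steps}")
print(f"learning rate at step {last_logged_step}: {final_lr:.3e}")
```

If this reading is right, the cosine decay phase was never reached in this run, which may be worth keeping in mind when comparing these results with the earlier linear-schedule run.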

runs/Jan23_15-47-00_ultramarine/events.out.tfevents.1737636421.ultramarine.3980114.0 CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:20f4e748ecf2ac6ed63341ebe55229c9ce6fa0507be5b51386f1b8125e9d966b
-size 16305

 version https://git-lfs.github.com/spec/v1
+oid sha256:f99b2200119d579778cd9b124d300698feef1c95674f293ecb7f1a3bc1030eb6
+size 17239
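
Reviewer note on the event file change: the diff above only shows Git LFS pointer files, not the TensorBoard log itself. As a minimal sketch, the two pointers can be parsed and compared like this. The helper `parse_lfs_pointer` is a hypothetical name introduced for illustration, and it assumes each pointer line has the form `key value` as shown in the diff.

```python
def parse_lfs_pointer(text):
    """Parse the key/value lines of a Git LFS pointer file.

    Hypothetical helper for illustration; assumes every line is
    "<key> <value>" and that the oid has the form "<algo>:<hex digest>".
    """
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),  # size of the real file in bytes
    }


# Pointer contents copied from the old and new sides of the diff.
old_pointer = parse_lfs_pointer("""
version https://git-lfs.github.com/spec/v1
oid sha256:20f4e748ecf2ac6ed63341ebe55229c9ce6fa0507be5b51386f1b8125e9d966b
size 16305
""")
new_pointer = parse_lfs_pointer("""
version https://git-lfs.github.com/spec/v1
oid sha256:f99b2200119d579778cd9b124d300698feef1c95674f293ecb7f1a3bc1030eb6
size 17239
""")

size_delta = new_pointer["size"] - old_pointer["size"]
content_changed = new_pointer["digest"] != old_pointer["digest"]
print(f"content changed: {content_changed}, size delta: {size_delta} bytes")
```

The differing digests confirm that the stored event file was replaced, and the size fields show how much it grew.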