metadata
base_model: gokuls/model_v1_complete_training_wt_init_48_tiny_freeze_new_ffn_1
tags:
  - generated_from_trainer
datasets:
  - massive
metrics:
  - accuracy
model-index:
  - name: hbertv1-massive-logit_KD-tiny_ffn_1
    results:
      - task:
          name: Text Classification
          type: text-classification
        dataset:
          name: massive
          type: massive
          config: en-US
          split: validation
          args: en-US
        metrics:
          - name: Accuracy
            type: accuracy
            value: 0.8342351205115592

hbertv1-massive-logit_KD-tiny_ffn_1

This model is a fine-tuned version of gokuls/model_v1_complete_training_wt_init_48_tiny_freeze_new_ffn_1 on the massive dataset (en-US configuration). It achieves the following results on the validation split:

  • Loss: 0.6331
  • Accuracy: 0.8342

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 64
  • eval_batch_size: 64
  • seed: 33
  • distributed_type: multi-GPU
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 50
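
As a rough illustration of what the linear scheduler does with these settings, the sketch below computes the learning rate at a given optimizer step. It assumes zero warmup steps, which the card does not state, and takes the 180 steps per epoch from the training results table. The function name `linear_lr` is hypothetical and is not part of any library.

```python
# Linear learning-rate decay to zero.
# ASSUMPTION: no warmup steps (the card lists no warmup setting).
BASE_LR = 5e-05          # learning_rate from the card
STEPS_PER_EPOCH = 180    # from the training results table (step 180 at epoch 1.0)
NUM_EPOCHS = 50          # num_epochs from the card
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS  # 9000, the step count of the final table row


def linear_lr(step: int, base_lr: float = BASE_LR, total_steps: int = TOTAL_STEPS) -> float:
    """Learning rate after `step` optimizer updates under linear decay."""
    remaining = max(0, total_steps - step)
    return base_lr * remaining / total_steps


if __name__ == "__main__":
    # Print the scheduled learning rate at a few epoch boundaries.
    for epoch in (0, 10, 25, 50):
        step = epoch * STEPS_PER_EPOCH
        print(f"epoch {epoch:>2} (step {step:>4}): lr = {linear_lr(step):.2e}")
```

Under this assumption the rate falls to half its initial value at step 4500 (epoch 25) and reaches zero at step 9000.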

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|---------------|-------|------|-----------------|----------|
| 4.296 | 1.0 | 180 | 3.7345 | 0.2081 |
| 3.5386 | 2.0 | 360 | 3.0526 | 0.2730 |
| 2.9946 | 3.0 | 540 | 2.6051 | 0.3360 |
| 2.6126 | 4.0 | 720 | 2.2810 | 0.4215 |
| 2.3148 | 5.0 | 900 | 2.0377 | 0.4683 |
| 2.0838 | 6.0 | 1080 | 1.8401 | 0.5371 |
| 1.9016 | 7.0 | 1260 | 1.6686 | 0.6080 |
| 1.7431 | 8.0 | 1440 | 1.5358 | 0.6439 |
| 1.613 | 9.0 | 1620 | 1.4238 | 0.6886 |
| 1.4952 | 10.0 | 1800 | 1.3339 | 0.7127 |
| 1.4 | 11.0 | 1980 | 1.2511 | 0.7162 |
| 1.3069 | 12.0 | 2160 | 1.1877 | 0.7285 |
| 1.2288 | 13.0 | 2340 | 1.1277 | 0.7329 |
| 1.1684 | 14.0 | 2520 | 1.0877 | 0.7418 |
| 1.0971 | 15.0 | 2700 | 1.0285 | 0.7570 |
| 1.0424 | 16.0 | 2880 | 0.9811 | 0.7619 |
| 0.9865 | 17.0 | 3060 | 0.9552 | 0.7629 |
| 0.943 | 18.0 | 3240 | 0.9216 | 0.7742 |
| 0.9047 | 19.0 | 3420 | 0.8812 | 0.7762 |
| 0.857 | 20.0 | 3600 | 0.8619 | 0.7821 |
| 0.8274 | 21.0 | 3780 | 0.8326 | 0.7914 |
| 0.7955 | 22.0 | 3960 | 0.8086 | 0.7919 |
| 0.7618 | 23.0 | 4140 | 0.7861 | 0.7973 |
| 0.7356 | 24.0 | 4320 | 0.7750 | 0.7993 |
| 0.7109 | 25.0 | 4500 | 0.7580 | 0.8028 |
| 0.6872 | 26.0 | 4680 | 0.7430 | 0.8077 |
| 0.6683 | 27.0 | 4860 | 0.7417 | 0.8101 |
| 0.6503 | 28.0 | 5040 | 0.7132 | 0.8155 |
| 0.6279 | 29.0 | 5220 | 0.7100 | 0.8106 |
| 0.6168 | 30.0 | 5400 | 0.6991 | 0.8165 |
| 0.5981 | 31.0 | 5580 | 0.6935 | 0.8185 |
| 0.5816 | 32.0 | 5760 | 0.6843 | 0.8200 |
| 0.5746 | 33.0 | 5940 | 0.6795 | 0.8155 |
| 0.5602 | 34.0 | 6120 | 0.6775 | 0.8210 |
| 0.5525 | 35.0 | 6300 | 0.6683 | 0.8244 |
| 0.5403 | 36.0 | 6480 | 0.6641 | 0.8219 |
| 0.5289 | 37.0 | 6660 | 0.6598 | 0.8278 |
| 0.5245 | 38.0 | 6840 | 0.6546 | 0.8278 |
| 0.518 | 39.0 | 7020 | 0.6523 | 0.8259 |
| 0.5105 | 40.0 | 7200 | 0.6488 | 0.8283 |
| 0.4988 | 41.0 | 7380 | 0.6463 | 0.8278 |
| 0.4971 | 42.0 | 7560 | 0.6414 | 0.8308 |
| 0.491 | 43.0 | 7740 | 0.6376 | 0.8318 |
| 0.4901 | 44.0 | 7920 | 0.6395 | 0.8298 |
| 0.4846 | 45.0 | 8100 | 0.6348 | 0.8298 |
| 0.4805 | 46.0 | 8280 | 0.6357 | 0.8313 |
| 0.481 | 47.0 | 8460 | 0.6320 | 0.8313 |
| 0.4767 | 48.0 | 8640 | 0.6331 | 0.8342 |
| 0.474 | 49.0 | 8820 | 0.6319 | 0.8328 |
| 0.4765 | 50.0 | 9000 | 0.6318 | 0.8308 |
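
The headline metrics reported for this model (loss 0.6331, accuracy 0.8342) match the epoch-48 row, not the final epoch. The sketch below shows one way a checkpoint could be picked from a log like this, using only the last five rows of the table. The helper names are hypothetical, and the card does not say which selection rule was actually used during training.

```python
# (epoch, validation_loss, accuracy) for the last five epochs, copied from the table above.
ROWS = [
    (46, 0.6357, 0.8313),
    (47, 0.6320, 0.8313),
    (48, 0.6331, 0.8342),
    (49, 0.6319, 0.8328),
    (50, 0.6318, 0.8308),
]


def best_by_accuracy(rows):
    """Return the row with the highest validation accuracy."""
    return max(rows, key=lambda row: row[2])


def best_by_loss(rows):
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda row: row[1])


if __name__ == "__main__":
    print("best by accuracy:", best_by_accuracy(ROWS))
    print("best by loss:    ", best_by_loss(ROWS))
```

The two criteria disagree here: accuracy peaks at epoch 48, while validation loss is lowest at epoch 50. The choice of metric therefore determines which checkpoint gets reported.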

Framework versions

  • Transformers 4.35.2
  • PyTorch 1.14.0a0+410ce96
  • Datasets 2.15.0
  • Tokenizers 0.15.0