# ratish/DBERT_CleanDesc_MAKE_v11
This model is a fine-tuned version of [distilbert-base-uncased](https://huggingface.co/distilbert-base-uncased) on an unknown dataset.
It achieves the following results at the end of training (epoch 14):
- Train Loss: 0.1889
- Validation Loss: 1.0498
- Train Accuracy: 0.8
## Model description
More information needed
## Intended uses & limitations
More information needed
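As a starting point, here is a minimal inference sketch. It assumes the checkpoint carries a TensorFlow sequence-classification head on top of distilbert-base-uncased and that the label names stored in `id2label` are meaningful; neither is documented on this card, and the input string is a placeholder.

```python
# Minimal inference sketch; the model id is taken from this card,
# the input text is a placeholder.
import tensorflow as tf
from transformers import AutoTokenizer, TFAutoModelForSequenceClassification

model_id = "ratish/DBERT_CleanDesc_MAKE_v11"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = TFAutoModelForSequenceClassification.from_pretrained(model_id)

inputs = tokenizer("example cleaned description text", return_tensors="tf")
logits = model(**inputs).logits                     # shape: (1, num_labels)
pred_id = int(tf.argmax(logits, axis=-1)[0])
print(model.config.id2label.get(pred_id, pred_id))  # label name, or raw id if unset
```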
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- optimizer: Adam (beta_1=0.9, beta_2=0.999, epsilon=1e-08, amsgrad=False, jit_compile=True; weight decay, gradient clipping, and EMA all disabled); see the sketch after this list
- learning_rate: PolynomialDecay (initial_learning_rate=2e-05, decay_steps=4620, end_learning_rate=0.0, power=1.0, cycle=False)
- training_precision: float32
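The reported configuration maps onto the following Keras construction (a sketch for reproducibility; note that `decay_steps=4620` spans the whole 15-epoch run, i.e. roughly 308 optimizer steps per epoch, which depends on the undocumented batch size and dataset size):

```python
# Sketch of the reported optimizer: Adam with a linear (power=1.0) PolynomialDecay
# schedule from 2e-05 down to 0 over 4620 steps. TensorFlow 2.12 Keras API.
import tensorflow as tf

lr_schedule = tf.keras.optimizers.schedules.PolynomialDecay(
    initial_learning_rate=2e-05,
    decay_steps=4620,
    end_learning_rate=0.0,
    power=1.0,        # 1.0 makes the decay linear
    cycle=False,
)
optimizer = tf.keras.optimizers.Adam(
    learning_rate=lr_schedule,
    beta_1=0.9,
    beta_2=0.999,
    epsilon=1e-08,
    amsgrad=False,
    jit_compile=True,  # matches 'jit_compile': True in the reported config
)
```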
### Training results
| Train Loss | Validation Loss | Train Accuracy | Epoch |
|:----------:|:---------------:|:--------------:|:-----:|
| 2.1852     | 2.0907          | 0.375          | 0     |
| 1.7165     | 1.7453          | 0.525          | 1     |
| 1.2878     | 1.4632          | 0.55           | 2     |
| 0.9851     | 1.2769          | 0.575          | 3     |
| 0.7653     | 1.1689          | 0.675          | 4     |
| 0.6014     | 1.1163          | 0.65           | 5     |
| 0.4997     | 1.0490          | 0.7            | 6     |
| 0.4344     | 0.9967          | 0.7            | 7     |
| 0.3263     | 0.9887          | 0.75           | 8     |
| 0.2837     | 1.0332          | 0.775          | 9     |
| 0.2291     | 1.0496          | 0.775          | 10    |
| 0.1994     | 1.0560          | 0.775          | 11    |
| 0.1736     | 1.1081          | 0.775          | 12    |
| 0.1589     | 1.0679          | 0.8            | 13    |
| 0.1889     | 1.0498          | 0.8            | 14    |
### Framework versions
- Transformers 4.28.1
- TensorFlow 2.12.0
- Datasets 2.12.0
- Tokenizers 0.13.3