Commit 71a0d82 by malakhovks · 1 parent: 0141be8

v2: new checkpoint + updated metrics + changelog

| Date (UTC) | Version | Notes |
| ---------- | ------- | ------------------------------------------------- |
| 2025-07-18 | **v2** | new checkpoint ➜ test acc 0.9986, macro F1 0.9987 |

CHANGELOG.md ADDED
@@ -0,0 +1,9 @@
+ # Changelog
+
+ ## v2 – 2025-07-18
+ * Added 20-epoch fine-tuned checkpoint
+ * Test accuracy ↑ 0.9986, macro F1 ↑ 0.9987
+ * Refreshed tokenizer, model card, metrics
+
+ ## v1 – 2024-11-10
+ * Initial public release
README.md CHANGED
@@ -1,10 +1,73 @@
  ---
- license: mit
- datasets:
- - malakhovks/MeDeBERTa
- language:
- - en
- base_model:
- - microsoft/deberta-v3-small
- - microsoft/deberta-v3-xsmall
- ---
+ license: apache-2.0
+ language: en
+ library_name: transformers
+ tags:
+ - deberta
+ - sequence-classification
+ - medicine
+ - telerehabilitation
+ metrics:
+ - name: accuracy
+   type: accuracy
+   value: 0.9986 # test_accuracy
+ - name: macro_f1
+   type: f1
+   value: 0.9987
+ - name: balanced_accuracy
+   type: balanced_accuracy
+   value: 0.9987
+ - name: auc_micro
+   type: auc
+   value: 0.999997
+ - name: ap_micro
+   type: average_precision
+   value: 0.99993
+ ---
+
+ # **MeDeBERTa** – v2 (July 2025)
+
+ Fine-tuned **microsoft/deberta-v3-xsmall** on 269 874 Q-A pairs (31 intent labels) for the *MeDeBERTa* telerehabilitation question-classification task.
+
+ | Setting / metric | Value |
+ |------------------------------------|-------|
+ | **Epochs** | 20 (best @ epoch 17) |
+ | **Batch / Grad. Accum.** | 16 / 4 (eff. 64) |
+ | **Learning rate** | 5 × 10⁻⁵ |
+ | **Best val. accuracy** | **0.99855** |
+ | **Test accuracy** | **0.99859** |
+ | **Macro F1 (test)** | **0.99867** |
+ | **Balanced accuracy (test)** | 0.99868 |
+ | **Micro AUC** | 0.999997 |
+ | **Micro average precision** | 0.99993 |
+ | **Loss (val \| test)** | 0.01371 \| 0.01305 |
+ | **Hardware** | RTX 2080 Ti (11 GB) |
+
+ <details>
+ <summary>Per-class metrics (excerpt)</summary>
+
+ | Label | Precision | Recall | F1 | Support |
+ |-------|-----------|--------|----|---------|
+ | any_code | 1.000 | 1.000 | 1.000 | 980 |
+ | contexts | 0.988 | 0.987 | 0.988 | 923 |
+ | treatment summary | 1.000 | 0.998 | 0.999 | 927 |
+ | … | … | … | … | … |
+
+ Full table: see `classification_report.json` / `classification_report.csv`.
+ </details>
+
+ ## Usage
+
+ ```python
+ from transformers import AutoTokenizer, AutoModelForSequenceClassification
+
+ tok = AutoTokenizer.from_pretrained("malakhovks/MeDeBERTa")  # tokenizer shipped with the checkpoint
+ model = AutoModelForSequenceClassification.from_pretrained("malakhovks/MeDeBERTa")  # classification head included
+
+ inputs = tok("what are contraindications for TENS?", return_tensors="pt")  # single question as PyTorch tensors
+ pred = model(**inputs).logits.argmax(-1).item()  # index of the highest-scoring class
+ print(model.config.id2label[pred])  # map the index back to its intent label
+ ```
+
+ ## Changelog
+ See [CHANGELOG.md](./CHANGELOG.md) for full version history.
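The usage snippet in the README returns only the single best label. As a minimal sketch of how the same logits could be turned into a ranked list with probabilities, the helper below applies a softmax and keeps the top `k` entries. The function name `top_k_labels` and the toy logits are hypothetical and not part of the repository; in practice the logits would come from `model(**inputs).logits[0].tolist()` and the mapping from `model.config.id2label`.

```python
import math


def top_k_labels(logits, id2label, k=3):
    """Return the k most probable (label, probability) pairs for one example.

    Hypothetical helper for illustration; not part of the MeDeBERTa repository.
    """
    # Numerically stable softmax: subtract the max logit before exponentiating.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Rank class indices by probability, highest first.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(id2label[i], probs[i]) for i in ranked[:k]]


# Toy values only: three of the label names from config.json and made-up logits.
toy_id2label = {0: "any_code", 1: "any_icd", 2: "articles"}
toy_logits = [0.2, 3.1, -1.0]
for label, prob in top_k_labels(toy_logits, toy_id2label, k=2):
    print(f"{label}: {prob:.3f}")
```

A ranked list like this can be useful when a downstream dialogue system wants to fall back to the second-best intent if the first one has low probability.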
classification_report.csv ADDED
@@ -0,0 +1,35 @@
+ ,precision,recall,f1-score,support
+ any_code,1.0000,1.0000,1.0000,980.0000
+ any_icd,1.0000,1.0000,1.0000,950.0000
+ articles,1.0000,1.0000,1.0000,1022.0000
+ causes,1.0000,0.9990,0.9995,961.0000
+ certain cpt,1.0000,1.0000,1.0000,239.0000
+ certain g-code,1.0000,1.0000,1.0000,231.0000
+ certain hcpcs,1.0000,1.0000,1.0000,56.0000
+ certain icd 10,1.0000,1.0000,1.0000,918.0000
+ certain icd 9,1.0000,1.0000,1.0000,884.0000
+ contexts,0.9881,0.9870,0.9875,923.0000
+ contraindications precautions,1.0000,0.9990,0.9995,965.0000
+ cpt,1.0000,0.9990,0.9995,977.0000
+ description,0.9864,0.9931,0.9897,875.0000
+ diagnosis need for treatment,1.0000,0.9989,0.9995,940.0000
+ g-code,0.9968,1.0000,0.9984,939.0000
+ hcpcs,1.0000,0.9989,0.9995,947.0000
+ icd 10,1.0000,1.0000,1.0000,907.0000
+ icd 9,1.0000,0.9990,0.9995,954.0000
+ indicatons,0.9978,0.9946,0.9962,921.0000
+ pathogenesis,0.9979,0.9990,0.9985,970.0000
+ patient education,1.0000,1.0000,1.0000,961.0000
+ prognosis,0.9979,1.0000,0.9990,954.0000
+ references,1.0000,1.0000,1.0000,967.0000
+ reimbursement,0.9990,0.9990,0.9990,968.0000
+ relations,0.9968,0.9968,0.9968,952.0000
+ risk factors,1.0000,1.0000,1.0000,974.0000
+ rule out,1.0000,1.0000,1.0000,867.0000
+ symptoms,1.0000,0.9979,0.9990,975.0000
+ synonyms,0.9990,1.0000,0.9995,954.0000
+ test,0.9989,1.0000,0.9995,930.0000
+ treatment summary,1.0000,0.9978,0.9989,927.0000
+ accuracy,0.9986,0.9986,0.9986,0.9986
+ macro avg,0.9987,0.9987,0.9987,26988.0000
+ weighted avg,0.9986,0.9986,0.9986,26988.0000
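The report ends with both a `macro avg` and a `weighted avg` row. As a rough illustration of how the two differ, the sketch below recomputes them for F1 on a three-row excerpt copied from the CSV above: the macro average is the plain mean over classes, while the weighted average scales each class by its support. The function name `average_f1` is hypothetical, and the numbers it yields apply to the excerpt only, not to the full 31-class report.

```python
import csv
import io

# Three rows copied from classification_report.csv; the full file has 31 classes.
EXCERPT = """,precision,recall,f1-score,support
certain hcpcs,1.0000,1.0000,1.0000,56.0000
contexts,0.9881,0.9870,0.9875,923.0000
description,0.9864,0.9931,0.9897,875.0000
"""


def average_f1(csv_text):
    """Return (macro_f1, weighted_f1) over the per-class rows in csv_text."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    f1 = [float(r["f1-score"]) for r in rows]
    support = [float(r["support"]) for r in rows]
    # Macro: every class counts equally, regardless of how many samples it has.
    macro = sum(f1) / len(f1)
    # Weighted: each class contributes in proportion to its support.
    weighted = sum(f * s for f, s in zip(f1, support)) / sum(support)
    return macro, weighted


macro_f1, weighted_f1 = average_f1(EXCERPT)
print(f"macro F1 = {macro_f1:.4f}, weighted F1 = {weighted_f1:.4f}")
```

In this excerpt the small `certain hcpcs` class (support 56) pulls the macro average up more than the weighted one, which is why the two rows in a report can differ when class sizes are uneven.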
classification_report.json ADDED
@@ -0,0 +1,201 @@
+ {
+ "any_code": {
+ "precision": 1.0,
+ "recall": 1.0,
+ "f1-score": 1.0,
+ "support": 980.0
+ },
+ "any_icd": {
+ "precision": 1.0,
+ "recall": 1.0,
+ "f1-score": 1.0,
+ "support": 950.0
+ },
+ "articles": {
+ "precision": 1.0,
+ "recall": 1.0,
+ "f1-score": 1.0,
+ "support": 1022.0
+ },
+ "causes": {
+ "precision": 1.0,
+ "recall": 0.9989594172736732,
+ "f1-score": 0.9994794377928162,
+ "support": 961.0
+ },
+ "certain cpt": {
+ "precision": 1.0,
+ "recall": 1.0,
+ "f1-score": 1.0,
+ "support": 239.0
+ },
+ "certain g-code": {
+ "precision": 1.0,
+ "recall": 1.0,
+ "f1-score": 1.0,
+ "support": 231.0
+ },
+ "certain hcpcs": {
+ "precision": 1.0,
+ "recall": 1.0,
+ "f1-score": 1.0,
+ "support": 56.0
+ },
+ "certain icd 10": {
+ "precision": 1.0,
+ "recall": 1.0,
+ "f1-score": 1.0,
+ "support": 918.0
+ },
+ "certain icd 9": {
+ "precision": 1.0,
+ "recall": 1.0,
+ "f1-score": 1.0,
+ "support": 884.0
+ },
+ "contexts": {
+ "precision": 0.9880694143167028,
+ "recall": 0.9869989165763814,
+ "f1-score": 0.9875338753387534,
+ "support": 923.0
+ },
+ "contraindications precautions": {
+ "precision": 1.0,
+ "recall": 0.9989637305699481,
+ "f1-score": 0.9994815966822188,
+ "support": 965.0
+ },
+ "cpt": {
+ "precision": 1.0,
+ "recall": 0.9989764585465711,
+ "f1-score": 0.9994879672299027,
+ "support": 977.0
+ },
+ "description": {
+ "precision": 0.9863791146424518,
+ "recall": 0.9931428571428571,
+ "f1-score": 0.989749430523918,
+ "support": 875.0
+ },
+ "diagnosis need for treatment": {
+ "precision": 1.0,
+ "recall": 0.9989361702127659,
+ "f1-score": 0.9994678020223523,
+ "support": 940.0
+ },
+ "g-code": {
+ "precision": 0.9968152866242038,
+ "recall": 1.0,
+ "f1-score": 0.9984051036682615,
+ "support": 939.0
+ },
+ "hcpcs": {
+ "precision": 1.0,
+ "recall": 0.9989440337909187,
+ "f1-score": 0.9994717379820391,
+ "support": 947.0
+ },
+ "icd 10": {
+ "precision": 1.0,
+ "recall": 1.0,
+ "f1-score": 1.0,
+ "support": 907.0
+ },
+ "icd 9": {
+ "precision": 1.0,
+ "recall": 0.9989517819706499,
+ "f1-score": 0.9994756161510225,
+ "support": 954.0
+ },
+ "indicatons": {
+ "precision": 0.9978213507625272,
+ "recall": 0.99457111834962,
+ "f1-score": 0.9961935834692768,
+ "support": 921.0
+ },
+ "pathogenesis": {
+ "precision": 0.9979402677651905,
+ "recall": 0.9989690721649485,
+ "f1-score": 0.9984544049459042,
+ "support": 970.0
+ },
+ "patient education": {
+ "precision": 1.0,
+ "recall": 1.0,
+ "f1-score": 1.0,
+ "support": 961.0
+ },
+ "prognosis": {
+ "precision": 0.997907949790795,
+ "recall": 1.0,
+ "f1-score": 0.9989528795811519,
+ "support": 954.0
+ },
+ "references": {
+ "precision": 1.0,
+ "recall": 1.0,
+ "f1-score": 1.0,
+ "support": 967.0
+ },
+ "reimbursement": {
+ "precision": 0.9989669421487604,
+ "recall": 0.9989669421487604,
+ "f1-score": 0.9989669421487604,
+ "support": 968.0
+ },
+ "relations": {
+ "precision": 0.9968487394957983,
+ "recall": 0.9968487394957983,
+ "f1-score": 0.9968487394957983,
+ "support": 952.0
+ },
+ "risk factors": {
+ "precision": 1.0,
+ "recall": 1.0,
+ "f1-score": 1.0,
+ "support": 974.0
+ },
+ "rule out": {
+ "precision": 1.0,
+ "recall": 1.0,
+ "f1-score": 1.0,
+ "support": 867.0
+ },
+ "symptoms": {
+ "precision": 1.0,
+ "recall": 0.997948717948718,
+ "f1-score": 0.9989733059548255,
+ "support": 975.0
+ },
+ "synonyms": {
+ "precision": 0.9989528795811519,
+ "recall": 1.0,
+ "f1-score": 0.9994761655316919,
+ "support": 954.0
+ },
+ "test": {
+ "precision": 0.9989258861439313,
+ "recall": 1.0,
+ "f1-score": 0.999462654486835,
+ "support": 930.0
+ },
+ "treatment summary": {
+ "precision": 1.0,
+ "recall": 0.9978425026968716,
+ "f1-score": 0.9989200863930886,
+ "support": 927.0
+ },
+ "accuracy": 0.9985919668000592,
+ "macro avg": {
+ "precision": 0.9986654139119842,
+ "recall": 0.9986780793189832,
+ "f1-score": 0.998671010625762,
+ "support": 26988.0
+ },
+ "weighted avg": {
+ "precision": 0.9985949747289834,
+ "recall": 0.9985919668000592,
+ "f1-score": 0.9985927033253673,
+ "support": 26988.0
+ }
+ }
config.json CHANGED
@@ -5,7 +5,7 @@
  "attention_probs_dropout_prob": 0.1,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
- "hidden_size": 768,
+ "hidden_size": 384,
  "id2label": {
  "0": "any_code",
  "1": "any_icd",
@@ -40,7 +40,7 @@
  "30": "treatment summary"
  },
  "initializer_range": 0.02,
- "intermediate_size": 3072,
+ "intermediate_size": 1536,
  "label2id": {
  "any_code": 0,
  "any_icd": 1,
@@ -80,12 +80,12 @@
  "max_relative_positions": -1,
  "model_type": "deberta-v2",
  "norm_rel_ebd": "layer_norm",
- "num_attention_heads": 12,
- "num_hidden_layers": 6,
+ "num_attention_heads": 6,
+ "num_hidden_layers": 12,
  "pad_token_id": 0,
  "pooler_dropout": 0,
  "pooler_hidden_act": "gelu",
- "pooler_hidden_size": 768,
+ "pooler_hidden_size": 384,
  "pos_att_type": [
  "p2c",
  "c2p"
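The config.json change swaps a wider, shallower encoder (hidden size 768, 12 heads, 6 layers) for a narrower, deeper one (384, 6 heads, 12 layers). The sketch below derives a few quantities from the old and new values in the diff: per-head dimension, feed-forward expansion ratio, and a rough encoder weight count. The estimate uses the common approximation of four `hidden × hidden` attention projections plus two `hidden × intermediate` feed-forward matrices per layer; it deliberately ignores embeddings, biases, layer norms and the classifier head, so it is an order-of-magnitude guide rather than the checkpoint's true size. The names `OLD`, `NEW`, `summarize` and `approx_encoder_params` are illustrative.

```python
# Values taken from the removed (-) and added (+) lines of the config.json diff.
OLD = {"hidden_size": 768, "num_attention_heads": 12,
       "num_hidden_layers": 6, "intermediate_size": 3072}
NEW = {"hidden_size": 384, "num_attention_heads": 6,
       "num_hidden_layers": 12, "intermediate_size": 1536}


def summarize(cfg):
    """Derive per-head width and feed-forward expansion ratio from a config dict."""
    return {
        "head_dim": cfg["hidden_size"] // cfg["num_attention_heads"],
        "ffn_ratio": cfg["intermediate_size"] // cfg["hidden_size"],
        "layers": cfg["num_hidden_layers"],
    }


def approx_encoder_params(cfg):
    """Rough weight count for the transformer layers only (assumption: see text)."""
    h, i = cfg["hidden_size"], cfg["intermediate_size"]
    per_layer = 4 * h * h + 2 * h * i  # attention projections + feed-forward matrices
    return cfg["num_hidden_layers"] * per_layer


for name, cfg in (("old", OLD), ("new", NEW)):
    print(name, summarize(cfg), f"~{approx_encoder_params(cfg) / 1e6:.1f}M encoder weights")
```

Under this approximation both configurations keep the same head width and feed-forward ratio, and the new encoder has about half the layer weights of the old one despite having twice as many layers.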
metrics.json ADDED
@@ -0,0 +1,17 @@
+ {
+ "best_val_epoch": 17.0,
+ "best_val_accuracy": 0.9985548597472858,
+ "best_val_loss": 0.01370711624622345,
+ "final_train_loss": 0.0005,
+ "test_accuracy": 0.9985919668000592,
+ "test_loss": 0.013053071685135365,
+ "accuracy": 0.9985919668000592,
+ "balanced_accuracy": 0.9986780793189832,
+ "macro_precision": 0.9986654139119842,
+ "macro_recall": 0.9986780793189832,
+ "macro_f1": 0.998671010625762,
+ "weighted_f1": 0.9985927033253673,
+ "micro_f1": 0.9985919668000592,
+ "auc_micro": 0.9999974629488196,
+ "ap_micro": 0.9999324746733634
+ }
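Several entries in metrics.json repeat the same number under different names. For single-label multiclass classification, micro-averaged F1 equals accuracy and balanced accuracy equals macro-averaged recall, so those pairs should match exactly. The sketch below checks this on values copied from the file and converts the test accuracy into an absolute error count, using the test support of 26988 from the classification report. Variable names such as `METRICS` and `TEST_SUPPORT` are illustrative.

```python
# Subset of metrics.json, copied verbatim from the diff above.
METRICS = {
    "test_accuracy": 0.9985919668000592,
    "accuracy": 0.9985919668000592,
    "balanced_accuracy": 0.9986780793189832,
    "macro_recall": 0.9986780793189832,
    "micro_f1": 0.9985919668000592,
}

# Total test samples, taken from the "support" of the macro/weighted avg rows.
TEST_SUPPORT = 26988

# Identities that hold for single-label multiclass evaluation.
micro_f1_matches_accuracy = METRICS["micro_f1"] == METRICS["accuracy"]
balanced_matches_macro_recall = METRICS["balanced_accuracy"] == METRICS["macro_recall"]

# Accuracy is correct / total, so the count of correct predictions is recoverable.
correct = round(METRICS["test_accuracy"] * TEST_SUPPORT)
errors = TEST_SUPPORT - correct

print(f"micro F1 == accuracy: {micro_f1_matches_accuracy}")
print(f"balanced accuracy == macro recall: {balanced_matches_macro_recall}")
print(f"misclassified test samples: {errors} of {TEST_SUPPORT}")
```

Expressing the accuracy as a count of misclassified samples makes small differences between checkpoints easier to interpret than a fourth-decimal change.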
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:c5fcf2526dbf5f8326da79da28b3f5c3e698c359f908439338c2e1a1933ad3b5
- size 567687764
+ oid sha256:7a4c2aa3a73027a5d45bb686cdf20d077b50c1fa3a6e73c7faedba2a30445ec4
+ size 283392108
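The Git LFS pointer shows the checkpoint shrinking from 567687764 to 283392108 bytes. As a back-of-the-envelope sketch, the file sizes can be converted into approximate parameter counts. This assumes the weights are stored as 32-bit floats (4 bytes each) and ignores the small safetensors header; neither assumption is confirmed by the diff, so the results are estimates only. The names `estimate_params` and `BYTES_PER_FLOAT32` are illustrative.

```python
OLD_SIZE_BYTES = 567687764  # from the removed LFS pointer
NEW_SIZE_BYTES = 283392108  # from the added LFS pointer

BYTES_PER_FLOAT32 = 4  # assumption: weights stored as float32


def estimate_params(size_bytes, bytes_per_param=BYTES_PER_FLOAT32):
    """Approximate parameter count from file size, ignoring the file header."""
    return size_bytes // bytes_per_param


old_params = estimate_params(OLD_SIZE_BYTES)
new_params = estimate_params(NEW_SIZE_BYTES)
size_ratio = NEW_SIZE_BYTES / OLD_SIZE_BYTES

print(f"old checkpoint: ~{old_params / 1e6:.1f}M parameters")
print(f"new checkpoint: ~{new_params / 1e6:.1f}M parameters")
print(f"new / old size ratio: {size_ratio:.3f}")
```

Under these assumptions the new checkpoint is roughly half the size of the previous one, which is consistent with the smaller hidden size in the config.json change.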
training_args.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:39720f29c338c54d2269d57b08eaff7d4403c932a6e4ae333eacd6e392357241
+ oid sha256:cf3f6bdff1f26ed479f70a4c50c0079d364eefb37e4e306f42ca7a97ac497df0
  size 5713