Update README.md
README.md
CHANGED
@@ -16,15 +16,17 @@ language:
 - en
 base_model:
 - HuggingFaceTB/SmolLM2-135M-Instruct
 ---

 # Orga Dynamic (1) — Bilingual End-of-Utterance Classifier

-**Orga Dynamic (1)** is a LoRA (Low-Rank Adaptation) adapter trained to automatically detect the **end of turn** (End of Utterance, EOU) in conversations

-- **Base model:** `HuggingFaceTB/SmolLM2-135M-Instruct`
 - **Method:** LoRA-r16 / α32 on `q_proj`, `k_proj`, `v_proj`, `o_proj`
-- **Training data:** 4 000 utterances
 - **Metrics (test 20 %)**

 | Metric | EN + ES |
@@ -32,7 +34,6 @@ base_model:
 | Accuracy | **0.951** |
 | F1 | **0.948** |

-> **Use-case:** give bots, ASR, or UX logging a reliable signal for knowing when the user has finished speaking or typing.

 ---

@@ -40,12 +41,11 @@ base_model:

 | | |
 |---|---|
-| **Architecture** | Llama-based sequence classifier (135 M params) + LoRA-r16 |
 | **Languages** | English (en), Spanish (es) |
 | **Labels** | `0 = NO_EOU`, `1 = EOU` |
 | **Precision** | fp16 (LoRA weights ≈ 5 MB) |
 | **License** | Apache 2.0 |
-| **Author** | @


 ---
@@ -58,7 +58,7 @@ from peft import PeftModel

 base = AutoModelForSequenceClassification.from_pretrained(
     "HuggingFaceTB/SmolLM2-135M-Instruct", num_labels=2)
-model = PeftModel.from_pretrained(base, "
 tok = AutoTokenizer.from_pretrained("latishab/turnsense")

 def is_end(text):
@@ -16,15 +16,17 @@ language:
 - en
 base_model:
 - HuggingFaceTB/SmolLM2-135M-Instruct
+metrics:
+- accuracy
 ---

 # Orga Dynamic (1) — Bilingual End-of-Utterance Classifier

+**Orga Dynamic (1)** is a LoRA (Low-Rank Adaptation) adapter trained to automatically detect the **end of turn** (End of Utterance, EOU) in conversations.

+- **Base model:** `HuggingFaceTB/SmolLM2-135M-Instruct`
 - **Method:** LoRA-r16 / α32 on `q_proj`, `k_proj`, `v_proj`, `o_proj`
+- **Training data:** 4 000 utterances
 - **Metrics (test 20 %)**

 | Metric | EN + ES |
@@ -32,7 +34,6 @@ base_model:
 | Accuracy | **0.951** |
 | F1 | **0.948** |


 ---

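The card reports accuracy and F1 on a 20 % test split but does not say how F1 was averaged. As a rough sketch only, assuming binary F1 with `EOU` (label `1`) as the positive class, the two figures could be computed as follows. The helper name `binary_metrics` is hypothetical and does not come from the repository.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy and binary F1 for two equally long label lists.

    Assumption: F1 is computed for the `positive` class only
    (1 = EOU in the card's label table); the card does not confirm this.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("need two non-empty label lists of equal length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": correct / len(pairs), "f1": f1}
```

Calling `binary_metrics(gold_labels, predicted_labels)` on the held-out split would give numbers comparable to the table only if the assumption about the positive class matches what the author used.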
@@ -40,12 +41,11 @@ base_model:

 | | |
 |---|---|
 | **Languages** | English (en), Spanish (es) |
 | **Labels** | `0 = NO_EOU`, `1 = EOU` |
 | **Precision** | fp16 (LoRA weights ≈ 5 MB) |
 | **License** | Apache 2.0 |
+| **Author** | @marc-es |


 ---
@@ -58,7 +58,7 @@ from peft import PeftModel

 base = AutoModelForSequenceClassification.from_pretrained(
     "HuggingFaceTB/SmolLM2-135M-Instruct", num_labels=2)
+model = PeftModel.from_pretrained(base, "marc-es/orga-dynamic-1")
 tok = AutoTokenizer.from_pretrained("latishab/turnsense")

 def is_end(text):
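The hunk stops at the `is_end` signature, so the body is not visible in this diff. The sketch below is not the author's code. It only illustrates how the two class logits of a `num_labels=2` classifier could be mapped to the card's labels (`0 = NO_EOU`, `1 = EOU`). The names `softmax` and `decide_eou` and the `threshold` parameter are hypothetical.

```python
import math

# Label ids as listed in the model card table: 0 = NO_EOU, 1 = EOU.
LABELS = {0: "NO_EOU", 1: "EOU"}


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decide_eou(logits, threshold=0.5):
    """Map two class logits to (label, P(EOU)).

    `threshold` is a hypothetical tuning knob, not a value from the card.
    """
    if len(logits) != 2:
        raise ValueError("expected exactly two logits (NO_EOU, EOU)")
    p_eou = softmax(logits)[1]
    label = LABELS[1] if p_eou >= threshold else LABELS[0]
    return label, p_eou
```

With the objects from the snippet above, the logits would presumably come from something like `model(**tok(text, return_tensors="pt")).logits[0].tolist()`. Raising `threshold` trades missed end-of-turn detections for fewer premature ones.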