Update README.md
README.md
CHANGED
@@ -21,7 +21,7 @@ metrics:
 # 🧠 Mistral LoRA Transcript Chunking Model
 
 ## Model Overview
-This LoRA adapter was trained on a custom dataset of **1,000 English transcript examples** to teach a **Mistral-7B-v0.2** model how to segment long transcripts into topic-based chunks using
+This LoRA adapter was trained on a custom dataset of **1,000 English transcript examples** to teach a **Mistral-7B-v0.2** model how to segment long transcripts into topic-based chunks using 'section #:' as delimiters.
 It enables automated **topic boundary detection** in conversation, meeting, and podcast transcripts — ideal for preprocessing before summarization, classification, or retrieval.
 
 ---
@@ -78,7 +78,7 @@ model = AutoModelForCausalLM.from_pretrained(base)
 model = PeftModel.from_pretrained(model, adapter)
 
 text = (
-"Break this transcript wherever a new topic begins. Use
+"Break this transcript wherever a new topic begins. Use 'section #:' as a delimiter.\n"
 "Transcript: Let's start with last week's performance metrics. "
 "Next, we’ll review upcoming campaign deadlines."
 )
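
Since this change makes 'section #:' the delimiter in both the overview and the prompt, a downstream consumer would probably need to split the generated text on those markers. The following is a minimal sketch, not part of the README: it assumes the `#` stands for a number (for example `section 1:`), and `split_sections`, `SECTION_MARKER`, and the sample string are hypothetical names and data used only for illustration.

```python
import re
from typing import List

# Assumption: the adapter emits markers such as "section 1:", "section 2:", ...
# The README only states the delimiter as 'section #:'.
SECTION_MARKER = re.compile(r"section\s*\d+\s*:", flags=re.IGNORECASE)


def split_sections(generated: str) -> List[str]:
    """Split generated text on 'section N:' markers and drop empty pieces."""
    pieces = SECTION_MARKER.split(generated)
    return [piece.strip() for piece in pieces if piece.strip()]


# Hypothetical model output, written by hand for this example.
sample = (
    "section 1: Let's start with last week's performance metrics. "
    "section 2: Next, we'll review upcoming campaign deadlines."
)

for index, chunk in enumerate(split_sections(sample), start=1):
    print(index, chunk)
```

If the adapter's real output uses a different marker shape, only the regular expression should need to change.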