Update README.md
README.md
CHANGED
@@ -67,7 +67,7 @@ Listen to the difference between the generic multilingual baseline and our high-
 
 The model was trained on a diverse corpus of **16,604 samples** to capture the nuances of Finnish phonetics, including vowel length and gemination.
 
-* **Sources**: Mozilla Common Voice (cv-15), Filmot, YouTube, and Parliament data.
+* **Sources**: Mozilla Common Voice (cv-15, license CC0-1.0), Filmot (CC BY), YouTube (CC BY), and Parliament data (CLARIN PUB +BY +PRIV).
 * **Zero-Shot Integrity**: Specific speakers (`cv-15_11`, `cv-15_16`, `cv-15_2`) were strictly excluded from training to ensure valid OOD testing.
 * **Traceability**: Full attribution and filtering lineage are provided in `attribution.csv`.
 