---
license: apache-2.0
base_model: HKUSTAudio/Llasa-1B-Multilingual
datasets:
- amu-cai/pl-asr-bigos-v2
language:
- pl
tags:
- speech
- audio
- polish
- llama
- tts
- fine-tuned
- text-to-speech
model-index:
- name: From Llasa to Łazanki
  results: []
---

# From Llasa to Łazanki: Fine-tuned Llasa-1B on Polish Speech

This is a fine-tuned version of [`HKUSTAudio/Llasa-1B-Multilingual`](https://huggingface.co/HKUSTAudio/Llasa-1B-Multilingual), adapted for **Polish Text-to-Speech (TTS)**. It was fine-tuned on the `mozilla-common_voice_15-23` subset of the [`pl-asr-bigos-v2`](https://huggingface.co/datasets/amu-cai/pl-asr-bigos-v2) dataset, which contains high-quality Polish speech recordings suitable for training TTS models.

---

## 🧠 Base Model

The base model is [Llasa-1B-Multilingual](https://huggingface.co/HKUSTAudio/Llasa-1B-Multilingual), developed by HKUST. It uses a LLaMA-initialized text BPE tokenizer, which handles multilingual text without the need to design language-specific G2P (grapheme-to-phoneme) systems.

---

## 🗣 Fine-tuning Details

- **Dataset**: [PL-ASR-BIGOS-v2](https://huggingface.co/datasets/amu-cai/pl-asr-bigos-v2), `mozilla-common_voice_15-23` subset
- **Language**: 🇵🇱 Polish
- **Task**: Text-to-speech
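
The dataset and subset listed above can be loaded with the Hugging Face `datasets` library. The snippet below is a minimal sketch, not the training pipeline used for this model: the helper names `build_load_kwargs` and `load_polish_subset` are hypothetical, the default split name `train` is an assumption, and the dataset may require accepting its terms on the Hub and authenticating before download.

```python
# Minimal sketch: load the Polish Common Voice subset named in this card.
# Helper names and the default split are illustrative assumptions.

DATASET_ID = "amu-cai/pl-asr-bigos-v2"   # dataset named in this card
SUBSET = "mozilla-common_voice_15-23"    # subset named in this card


def build_load_kwargs(split="train", streaming=True):
    """Return keyword arguments for `datasets.load_dataset`.

    Streaming avoids downloading the whole subset up front.
    """
    return {
        "path": DATASET_ID,
        "name": SUBSET,
        "split": split,
        "streaming": streaming,
    }


def load_polish_subset(split="train", streaming=True):
    """Load the subset; needs `pip install datasets` and network access."""
    # Imported lazily so the module can be imported without `datasets`.
    from datasets import load_dataset

    return load_dataset(**build_load_kwargs(split, streaming))
```

Calling `load_polish_subset()` returns an iterable dataset when `streaming=True`; inspecting the first record with `next(iter(...))` is a quick way to check which columns hold the audio and the transcript before building a fine-tuning pipeline.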