Added NVIDIA developer links
README.md
CHANGED
@@ -209,6 +209,16 @@ The model is intended for users requiring speech-to-text transcription capabilit
Huggingface 07/17/2025 via https://huggingface.co/nvidia/canary-qwen-2.5b

## Discover more from NVIDIA:

For documentation, deployment guides, enterprise-ready APIs, and the latest open models, including Nemotron and other cutting-edge speech, translation, and generative AI, visit the NVIDIA Developer Portal at [developer.nvidia.com](https://developer.nvidia.com/).

Join the community to access tools, support, and resources to accelerate your development with NVIDIA's NeMo, Riva, NIM, and foundation models.

### Explore more from NVIDIA:

What is [Nemotron](https://www.nvidia.com/en-us/ai-data-science/foundation-models/nemotron/)?<br>
NVIDIA Developer [Nemotron](https://developer.nvidia.com/nemotron)<br>
[NVIDIA Riva Speech](https://developer.nvidia.com/riva?sortBy=developer_learning_library%2Fsort%2Ffeatured_in.riva%3Adesc%2Ctitle%3Aasc#demos)<br>
[NeMo Documentation](https://docs.nvidia.com/nemo-framework/user-guide/latest/nemotoolkit/asr/models.html)<br>

## Model Architecture:

Canary-Qwen is a Speech-Augmented Language Model (SALM) [9] with a FastConformer [2] encoder and a Transformer decoder [3]. It is built from two base models, `nvidia/canary-1b-flash` [1,5] and `Qwen/Qwen3-1.7B` [4], joined by a linear projection, with low-rank adaptation (LoRA) applied to the LLM. The audio encoder computes an audio representation that is mapped into the LLM embedding space via the linear projection and concatenated with the embeddings of the text tokens. The model is prompted with "Transcribe the following: <audio>", using Qwen's chat template.
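The projection-and-concatenation bridge described above can be sketched in a few lines. This is an illustrative NumPy sketch only, not NeMo or model code: the dimensions, frame counts, and random weights are made-up placeholders standing in for the real encoder outputs, projection matrix, and prompt-token embeddings.

```python
import numpy as np

# Hypothetical dimensions for illustration (not the model's actual sizes).
AUDIO_DIM, LLM_DIM = 512, 2048

rng = np.random.default_rng(0)
W = rng.standard_normal((AUDIO_DIM, LLM_DIM)) * 0.02  # linear projection (placeholder weights)

audio_feats = rng.standard_normal((100, AUDIO_DIM))   # 100 audio-encoder frames
text_embeds = rng.standard_normal((8, LLM_DIM))       # 8 prompt-token embeddings from the LLM

# Map audio frames into the LLM embedding space, then concatenate with the
# text-token embeddings along the sequence axis, as the SALM design describes.
projected = audio_feats @ W                           # (100, LLM_DIM)
fused = np.concatenate([text_embeds, projected], axis=0)

print(fused.shape)  # (108, 2048)
```

The fused sequence is what the LLM decoder would consume; in the real model the concatenation point is determined by where the `<audio>` placeholder sits in the chat-templated prompt.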