Pyannote

Run Pyannote, optimized for the NPU on Qualcomm Snapdragon devices, with NexaSDK.

Quickstart

Install NexaSDK and create a free account at sdk.nexa.ai

Activate your device with your access token:

```
nexa config set license '<access_token>'
```

Run the model on Qualcomm NPU in one line:
```
nexa infer NexaAI/Pyannote-NPU
```

Input: The path to an input audio file.
Output: Speaker diarization results, or an error if a required input cannot be found.

Model Description

pyannote-audio (Community Version) is an open-source speaker diarization model designed for accurate speaker segmentation and labeling in audio streams.
Developed by the Pyannote community, it combines audio processing, speaker embedding, and clustering into a unified framework, enabling robust speaker segmentation on local machines without cloud dependency.

Features

🔊 End-to-End Diarization Pipeline — Automatically detects and labels who spoke when in an audio file.
⚡ Lightweight & Efficient — Optimized for real-time or batch processing on consumer hardware and GPUs.
🧠 Speaker Embedding & Clustering — Extracts rich speaker representations and groups them for identity separation.
🔧 Customizable & Modular — Easily integrates with PyTorch pipelines or modified components for research and prototyping.
🌍 Community-Driven & Transparent — Fully open and maintained by an active community of speech researchers and developers.

Use Cases

Meeting Transcription: Segment conversations by speaker for clearer transcripts.
Broadcast and Podcast Analysis: Attribute voices and structure long-form audio content.
Call Center Analytics: Separate agent and customer segments for interaction insights.
Research: Test diarization algorithms or contribute new speaker models.
Voice Dataset Preparation: Preprocess large audio datasets for training ASR or emotion recognition systems.
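
As an illustration of the meeting transcription use case, here is a minimal Python sketch that attaches a speaker label to each transcribed word by picking the diarization turn with the largest time overlap. The `Turn` and `Word` types, the `label_words` helper, and the `SPEAKER_00`-style labels are hypothetical and chosen for this example; they are not the output format of NexaSDK or pyannote-audio, and the word timestamps are assumed to come from a separate ASR system.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Turn:
    """One diarization segment (hypothetical structure, times in seconds)."""
    start: float
    end: float
    speaker: str


@dataclass
class Word:
    """One ASR word with timestamps (hypothetical structure, times in seconds)."""
    start: float
    end: float
    text: str


def overlap(a_start: float, a_end: float, b_start: float, b_end: float) -> float:
    """Length of the intersection of two time intervals, or 0.0 if disjoint."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def label_words(words: List[Word], turns: List[Turn]) -> List[Tuple[str, str]]:
    """Pair each word with the speaker whose turn overlaps it the most."""
    labeled = []
    for w in words:
        best_speaker = "UNKNOWN"  # used when no turn overlaps the word
        best_overlap = 0.0
        for t in turns:
            o = overlap(w.start, w.end, t.start, t.end)
            if o > best_overlap:
                best_speaker, best_overlap = t.speaker, o
        labeled.append((best_speaker, w.text))
    return labeled


# Made-up example data, for illustration only.
turns = [Turn(0.0, 2.0, "SPEAKER_00"), Turn(2.0, 4.5, "SPEAKER_01")]
words = [Word(0.2, 0.6, "hello"), Word(1.8, 2.6, "there"), Word(5.0, 5.3, "bye")]
result = label_words(words, turns)
for speaker, text in result:
    print(f"{speaker}: {text}")
```

A word that straddles a speaker change goes to the turn covering more of it, and a word outside every turn is marked `UNKNOWN` rather than being assigned to a neighbor.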

Inputs and Outputs

Input

Audio file or stream

Output

Speaker-labeled time segments
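
This card does not specify the exact output format. As a rough sketch, assuming each result can be parsed into a (start, end, speaker) triple in seconds, the Python snippet below shows one way to hold such segments and total the speaking time per speaker. The `Segment` type, the `speaking_time` helper, and the label strings are assumptions made for this example.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple


class Segment(NamedTuple):
    """A speaker-labeled time segment (assumed structure, times in seconds)."""
    start: float
    end: float
    speaker: str


def speaking_time(segments: List[Segment]) -> Dict[str, float]:
    """Sum the duration of all segments for each speaker label."""
    totals: Dict[str, float] = defaultdict(float)
    for seg in segments:
        totals[seg.speaker] += seg.end - seg.start
    return dict(totals)


# Made-up example data, for illustration only.
segments = [
    Segment(0.0, 3.5, "SPEAKER_00"),
    Segment(3.5, 5.0, "SPEAKER_01"),
    Segment(5.0, 7.0, "SPEAKER_00"),
]
totals = speaking_time(segments)
for speaker, seconds in sorted(totals.items()):
    print(f"{speaker}: {seconds:.1f} s")
```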

License

This repo is licensed under the Creative Commons Attribution-NonCommercial 4.0 (CC BY-NC 4.0) license, which allows use, sharing, and modification only for non-commercial purposes with proper attribution.
All NPU-related models, runtimes, and code in this project are protected under this non-commercial license and cannot be used in any commercial or revenue-generating applications.
Commercial licensing or enterprise usage requires a separate agreement.
For inquiries, please contact dev@nexa.ai.
