---
title: Speaker Diarization
emoji: 🔥
colorFrom: blue
colorTo: blue
sdk: docker
pinned: false
license: mit
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
# Real-Time Speaker Diarization

This project implements real-time speaker diarization using WebRTC, FastAPI, and Gradio. It transcribes speech and identifies different speakers as they talk.

## Architecture

The system is split into two components:

1. **Model Server (Hugging Face Space)**: Runs the speech recognition and speaker diarization models
2. **Signaling Server (Render)**: Handles WebRTC signaling for direct audio streaming from the browser
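Conceptually, the signaling server forwards fixed-size PCM frames from the browser's audio stream to the model server's WebSocket. A minimal chunking helper illustrates the idea (hypothetical; the actual framing logic lives in `backend.py`, and the frame size used there may differ):

```python
def frame_audio(pcm: bytes, frame_bytes: int = 3200) -> list[bytes]:
    """Split raw PCM audio into fixed-size frames for streaming.

    3200 bytes = 100 ms of 16 kHz mono 16-bit audio, a common chunk
    size for streaming ASR (an assumption, not the project's exact value).
    """
    return [pcm[i:i + frame_bytes] for i in range(0, len(pcm), frame_bytes)]

# One second of 16 kHz 16-bit mono audio (32,000 bytes) → ten 100 ms frames.
frames = frame_audio(b"\x00" * 32000)
print(len(frames))  # → 10
```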
## Deployment Instructions

### Deploy the Model Server on Hugging Face Spaces

1. Create a new Space on Hugging Face (Docker SDK)
2. Upload all files from the `Speaker-Diarization` directory
3. In the Space settings:
   - Set Hardware to CPU (or GPU if available)
   - Set visibility to Public
   - Make sure the Docker SDK is selected

### Deploy the Signaling Server on Render

1. Create a new Render Web Service
2. Connect it to the GitHub repo containing the `render-signal` directory
3. Configure the service:
   - Build Command: `cd render-signal && pip install -r requirements.txt`
   - Start Command: `cd render-signal && python backend.py`
   - Environment: Python 3
   - Environment Variables:
     - `HF_SPACE_URL`: your Hugging Face Space URL (e.g., `your-username-speaker-diarization.hf.space`)
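As a sketch of how the signaling server might pick up this variable (assuming `backend.py` reads it via `os.environ`; the fallback value and exact derivation are placeholders, not the repo's code):

```python
import os

# Read the model-server host set in the Render dashboard. The fallback
# here is a hypothetical placeholder, not a live deployment.
HF_SPACE_URL = os.environ.get(
    "HF_SPACE_URL", "your-username-speaker-diarization.hf.space"
)

# Derive the WebSocket endpoint the relay connects to (the /ws_inference
# path is the one referenced later in this README).
API_WS = f"wss://{HF_SPACE_URL}/ws_inference"
```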
### Update the Configuration

After both services are deployed:

1. Update `ui.py` on your Hugging Face Space:
   - Change `RENDER_SIGNALING_URL` to your Render app URL (`wss://your-app.onrender.com/stream`)
   - Make sure `HF_SPACE_URL` matches your actual Hugging Face Space URL
2. Update `backend.py` on your Render service:
   - Set `API_WS` to your Hugging Face Space WebSocket URL (`wss://your-username-speaker-diarization.hf.space/ws_inference`)
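A quick sanity check for these two settings can catch typos before redeploying (a hypothetical helper, not part of the repo; the URLs below are the placeholder values from the steps above):

```python
from urllib.parse import urlparse

def check_ws_url(url: str, expected_path: str) -> bool:
    """Return True if url is a secure WebSocket URL with the expected path."""
    parts = urlparse(url)
    return parts.scheme == "wss" and parts.path == expected_path

RENDER_SIGNALING_URL = "wss://your-app.onrender.com/stream"
API_WS = "wss://your-username-speaker-diarization.hf.space/ws_inference"

assert check_ws_url(RENDER_SIGNALING_URL, "/stream")
assert check_ws_url(API_WS, "/ws_inference")
```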
## Usage

1. Open your Hugging Face Space URL in a web browser
2. Click "Start Listening" to begin
3. Speak into your microphone
4. The system transcribes your speech and labels each speaker in real time
## Technology Stack

- **Frontend**: Gradio UI with WebRTC for audio streaming
- **Signaling**: FastRTC on Render for WebRTC signaling
- **Backend**: FastAPI + WebSockets
- **Models**:
  - SpeechBrain ECAPA-TDNN for speaker embeddings
  - An automatic speech recognition (ASR) model for transcription
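The diarization step boils down to comparing each audio segment's speaker embedding against running speaker centroids: a segment close to a known centroid keeps that speaker label, and one below a similarity threshold starts a new speaker. A simplified sketch of that assignment logic (pure Python, with toy 2-D vectors standing in for real ECAPA-TDNN embeddings; the threshold value is illustrative):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def assign_speaker(embedding: list[float],
                   centroids: list[list[float]],
                   threshold: float = 0.7) -> int:
    """Return the index of the closest known speaker, or register a new one."""
    best, best_sim = None, threshold
    for i, c in enumerate(centroids):
        sim = cosine(embedding, c)
        if sim >= best_sim:
            best, best_sim = i, sim
    if best is None:                      # no centroid was similar enough
        centroids.append(list(embedding)) # start a new speaker
        return len(centroids) - 1
    return best

centroids: list[list[float]] = []
print(assign_speaker([1.0, 0.0], centroids))  # → 0 (first speaker seen)
print(assign_speaker([0.9, 0.1], centroids))  # → 0 (similar voice)
print(assign_speaker([0.0, 1.0], centroids))  # → 1 (new speaker)
```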
## License

MIT