Whisper Small Turbo

This is a "turbo" variant of openai/whisper-small created by reducing the decoder layers from 12 to 4 (following the same approach used for whisper-large-v3-turbo).

Model Description

  • Base model: openai/whisper-small (244M parameters)
  • Turbo variant: 168M parameters (31% reduction)
  • Decoder layers: 4 (reduced from 12)
  • Encoder layers: 12 (unchanged)

Architecture Changes

Parameter         Original   Turbo
encoder_layers    12         12
decoder_layers    12         4
d_model           768        768
Total Parameters  244M       ~168M
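
These values can be checked directly against the published checkpoint; the expected outputs in the comments below follow from the table above:

from transformers import WhisperConfig, WhisperForConditionalGeneration

cfg = WhisperConfig.from_pretrained("mekpro/whisper-small-turbo")
print(cfg.encoder_layers, cfg.decoder_layers)  # expected: 12 4

model = WhisperForConditionalGeneration.from_pretrained("mekpro/whisper-small-turbo")
print(f"{sum(p.numel() for p in model.parameters()) / 1e6:.0f}M")  # roughly 168M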

Usage

from transformers import WhisperProcessor, WhisperForConditionalGeneration

processor = WhisperProcessor.from_pretrained("mekpro/whisper-small-turbo")
model = WhisperForConditionalGeneration.from_pretrained("mekpro/whisper-small-turbo")

# Load your audio as a mono float array sampled at 16 kHz, e.g.:
# import librosa
# audio_input, _ = librosa.load("speech.wav", sr=16000)  # "speech.wav" is a placeholder

# Convert the waveform to log-mel input features, generate token ids, and decode
input_features = processor(audio_input, sampling_rate=16000, return_tensors="pt").input_features
predicted_ids = model.generate(input_features)
transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)

Creation Method

This model was created by:

  1. Loading the original whisper-small model
  2. Creating a new model with decoder_layers=4
  3. Copying encoder weights (unchanged)
  4. Copying first 4 decoder layers (indices 0-3)
  5. Copying all embeddings and layer norms

No additional fine-tuning was performed on this model.
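
The conversion script itself is not included here, but a minimal sketch of the procedure above, assuming the standard Whisper module layout in transformers, looks like this:

from transformers import WhisperConfig, WhisperForConditionalGeneration

# 1. Load the original whisper-small model
src = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")

# 2. Create a new model with decoder_layers=4
cfg = WhisperConfig.from_pretrained("openai/whisper-small", decoder_layers=4)
dst = WhisperForConditionalGeneration(cfg)

# 3. Copy encoder weights (unchanged)
dst.model.encoder.load_state_dict(src.model.encoder.state_dict())

# 4. Copy the first 4 decoder layers (indices 0-3)
for i in range(4):
    dst.model.decoder.layers[i].load_state_dict(src.model.decoder.layers[i].state_dict())

# 5. Copy embeddings and layer norms (the output projection is tied to embed_tokens)
dst.model.decoder.embed_tokens.load_state_dict(src.model.decoder.embed_tokens.state_dict())
dst.model.decoder.embed_positions.load_state_dict(src.model.decoder.embed_positions.state_dict())
dst.model.decoder.layer_norm.load_state_dict(src.model.decoder.layer_norm.state_dict())

dst.save_pretrained("whisper-small-turbo")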

Limitations

As this model has not been fine-tuned after decoder reduction, it may show degraded performance compared to the original whisper-small. For best results, consider fine-tuning on your target domain.
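
As a rough starting point, a fine-tune can use the standard Seq2SeqTrainer from transformers. This is only a sketch: train_ds and the hyperparameters below are placeholders, and it assumes your dataset already yields fixed-size "input_features" (log-mel spectrograms from the processor) and "labels" padded to a common length, so no custom data collator is shown.

from transformers import Seq2SeqTrainer, Seq2SeqTrainingArguments

# Placeholder hyperparameters; tune for your domain and hardware
args = Seq2SeqTrainingArguments(
    output_dir="whisper-small-turbo-ft",
    per_device_train_batch_size=16,
    learning_rate=1e-5,
    warmup_steps=500,
    max_steps=2000,
)

# `model` is the checkpoint loaded in the Usage section; `train_ds` is your dataset
trainer = Seq2SeqTrainer(model=model, args=args, train_dataset=train_ds)
trainer.train()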
