PawanEmbd-68M

A 68M-parameter sentence embedding model distilled from IBM Granite-278M

Model Details

  • Model Type: Sentence Embedding Model
  • Architecture: Transformer-based encoder with projection layer
  • Parameters: ~68 million
  • Teacher Model: IBM Granite-278M Multilingual Embedding
  • Training Method: Knowledge Distillation
  • Output Dimensions: 768
  • Max Sequence Length: 512 tokens
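
The dimensions above can be double-checked against the released checkpoint. A minimal sketch, assuming the published config exposes the standard hidden_size and max_position_embeddings fields (field names are an assumption here, not taken from the card):

from transformers import AutoConfig

# Inspect the published config; these field names are the usual Transformers ones
# and are assumed here (the projection layer may report its own output size).
config = AutoConfig.from_pretrained("dmedhi/PawanEmbd-68M")
print(config.hidden_size)              # expected: 768
print(config.max_position_embeddings)  # expected: 512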

Training Details

This model was trained using knowledge distillation from the IBM Granite-278M teacher model on the All-NLI dataset (SNLI + MultiNLI).
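
As a rough illustration of that setup, the snippet below loads a teacher model and the NLI pairs and produces the teacher embeddings that serve as regression targets for the student. The teacher repo id, the "pair" subset, and the column name are illustrative assumptions, not details taken from the actual training run:

from datasets import load_dataset
from sentence_transformers import SentenceTransformer

# Teacher and data loading sketch; repo id and subset name are assumptions.
teacher = SentenceTransformer("ibm-granite/granite-embedding-278m-multilingual")
nli = load_dataset("sentence-transformers/all-nli", "pair", split="train[:100000]")

# Teacher embeddings act as the regression targets for the 68M student.
targets = teacher.encode(nli["anchor"][:8])  # small batch for illustration
print(targets.shape)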

Training Hyperparameters

  • Dataset: sentence-transformers/all-nli (100K samples)
  • Epochs: 20
  • Batch Size: 32
  • Learning Rate: 5e-4 with OneCycleLR scheduler
  • Loss Function: Combined MSE + cosine similarity (α=0.5, β=0.5); see the sketch after this list
  • Mixed Precision: FP16 (AMP)
  • Hardware: NVIDIA T4 GPU
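
The loss combines an MSE term and a cosine term with equal weights. Below is a minimal sketch of that objective and the OneCycleLR setup; the exact cosine formulation (1 - cosine similarity), the AdamW optimizer, and the stand-in Linear student are assumptions made so the snippet runs, not the actual training code:

import torch
import torch.nn.functional as F

def distillation_loss(student_emb, teacher_emb, alpha=0.5, beta=0.5):
    # MSE pulls the student toward the teacher's coordinates;
    # (1 - cosine) focuses on matching the embedding direction.
    mse = F.mse_loss(student_emb, teacher_emb)
    cos = 1.0 - F.cosine_similarity(student_emb, teacher_emb, dim=-1).mean()
    return alpha * mse + beta * cos

# Stand-in student so the scheduler setup is runnable; AdamW is an assumption.
student = torch.nn.Linear(768, 768)
optimizer = torch.optim.AdamW(student.parameters(), lr=5e-4)
scheduler = torch.optim.lr_scheduler.OneCycleLR(
    optimizer, max_lr=5e-4, epochs=20, steps_per_epoch=100_000 // 32
)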

Usage

Using Transformers

from transformers import AutoModel, AutoTokenizer
import torch
import torch.nn.functional as F

# Load model and tokenizer
model = AutoModel.from_pretrained("dmedhi/PawanEmbd-68M")
tokenizer = AutoTokenizer.from_pretrained("dmedhi/PawanEmbd-68M")

# Encode sentences
sentences = ["This is an example sentence", "Each sentence is converted to a vector"]
encoded = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')

# Get embeddings
with torch.no_grad():
    outputs = model(**encoded)
    embeddings = outputs.pooler_output # Already normalized

# Compute similarity
similarity = F.cosine_similarity(embeddings[0:1], embeddings[1:2])
print(f"Similarity: {similarity.item():.4f}")

Using Sentence-Transformers

from sentence_transformers import SentenceTransformer
from sentence_transformers.util import cos_sim

# Load the model
model = SentenceTransformer("dmedhi/PawanEmbd-68M")

# Encode sentences
sentences = ["This is an example sentence", "Each sentence is converted to a vector"]
embeddings = model.encode(sentences)

print(f"βœ… Embeddings shape: {embeddings.shape}")

# Compute similarity
similarity = cos_sim(embeddings[0], embeddings[1])
print(f"βœ… Similarity: {similarity.item():.4f}")

Performance

Comparison with Teacher Model

| Metric                         | Teacher (Granite-278M) | Student (PawanEmbd-68M)  |
|--------------------------------|------------------------|--------------------------|
| Parameters                     | 278M                   | 68M (4.1x smaller)       |
| Model Size                     | ~1.1 GB                | ~258.7 MB                |
| Inference Speed (CPU)          | 269.57 ms              | 11.57 ms (23.3x faster)  |
| Inference Speed (GPU)          | 17.94 ms               | 2.75 ms (6.5x faster)    |
| Cosine Similarity (to teacher) | 1.000                  | 0.943                    |
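
Latency figures like the ones above can be reproduced with a simple timing loop. A minimal CPU sketch (the sentence, batch size of 1, and 100 iterations are illustrative choices, not the benchmark protocol used for the table):

import time
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("dmedhi/PawanEmbd-68M", device="cpu")
sentence = ["A short example sentence for latency measurement."]

model.encode(sentence)  # warm-up pass so first-call overhead is excluded
start = time.perf_counter()
for _ in range(100):
    model.encode(sentence)
elapsed_ms = (time.perf_counter() - start) / 100 * 1000
print(f"Average CPU latency: {elapsed_ms:.2f} ms")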

Intended Uses

This model is suitable for:

  • Semantic Search: Find similar documents or passages (see the example after this list)
  • Clustering: Group similar texts together
  • Duplicate Detection: Identify near-duplicate content
  • Recommendation Systems: Find similar items
  • Question Answering: Retrieve relevant passages
  • Sentence Similarity: Measure semantic similarity between texts
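
For example, a minimal semantic search loop ranks a small corpus against a query by cosine similarity (the corpus and query below are made up for illustration):

from sentence_transformers import SentenceTransformer
from sentence_transformers.util import cos_sim

model = SentenceTransformer("dmedhi/PawanEmbd-68M")

corpus = [
    "The cat sits on the mat.",
    "A quick brown fox jumps over the lazy dog.",
    "Stock markets rallied after the announcement.",
]
query = "An animal is resting on a rug."

# Embed the corpus once, then rank passages by similarity to the query.
corpus_emb = model.encode(corpus)
query_emb = model.encode(query)
scores = cos_sim(query_emb, corpus_emb)[0]

best = int(scores.argmax())
print(corpus[best], float(scores[best]))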

Training Code

The model was trained using PyTorch with knowledge distillation. Training code is available at: TODO

Citation

@misc{pawanembdmodel2025,
  author = {Dipankar Medhi},
  title = {PawanEmbd: A Lightweight Embedding Model via Knowledge Distillation},
  year = {2025},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/dmedhi/PawanEmbd-68M}}
}

Acknowledgments

License

Apache 2.0

Contact

For questions or feedback, please open an issue on GitHub.
