Add exported onnx model 'model.onnx'
Hello!
This pull request has been automatically generated from the Sentence Transformers backend-export Space.
Pull Request overview
- Add exported ONNX model `model.onnx`.
Tip:
Consider testing this pull request before merging by loading the model from this PR with the revision argument:
```python
from sentence_transformers import SentenceTransformer

# TODO: Fill in the PR number
pr_number = 2
model = SentenceTransformer(
    "TechWolf/JobBERT-v2",
    revision=f"refs/pr/{pr_number}",
    backend="onnx",
)

# Verify that everything works as expected
embeddings = model.encode(["The weather is lovely today.", "It's so sunny outside!", "He drove to the stadium."])
print(embeddings.shape)

similarities = model.similarity(embeddings, embeddings)
print(similarities)
```
Issue: ONNX Export Missing Asym Layer (768-dim instead of 1024-dim)
Hi,
I'm trying to deploy JobBERT-v2 using Text Embeddings Inference (TEI) for production use.
The Issue:
The exported ONNX model produces 768-dimensional embeddings instead of the expected 1024-dimensional output. This appears to be because the Asym (asymmetric projection) layer that transforms 768→1024 dimensions is not included in the ONNX export.
TEI's logs show (when not using ONNX):

```
WARN: modules.json could be downloaded but parsing the modules failed:
unknown variant sentence_transformers.models.Asym
```
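For reference, here is a minimal sketch of how the mismatch shows up when running the exported file directly with onnxruntime. The input names (and whether `token_type_ids` is also required) are assumptions based on typical sentence-transformers ONNX exports; they should be checked against `session.get_inputs()`:

```python
import onnxruntime as ort
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("TechWolf/JobBERT-v2")
session = ort.InferenceSession("model.onnx")

encoded = tokenizer("The weather is lovely today.", return_tensors="np")
outputs = session.run(
    None,
    {
        "input_ids": encoded["input_ids"],
        "attention_mask": encoded["attention_mask"],
    },
)
# With the Asym projection missing from the graph, the last dimension
# here is 768 rather than the expected 1024.
print(outputs[0].shape)
```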
Why This Matters:
I have a large production dataset already embedded with the 1024-dimensional vectors from sentence-transformers. Re-embedding would require significant time and compute resources.
Request:
Would it be possible to provide an ONNX export that includes the Asym projection layer baked in, so the model outputs 1024-dimensional embeddings? Alternatively, any guidance on how to properly export the full model graph to ONNX would be greatly appreciated.
Thank you for your help!
I've updated the ONNX export to include the Asym projection layer.
Changes made:
- Re-exported `model.onnx` using a custom script that wraps the full SentenceTransformer forward pass
- The model now outputs 1024-dimensional embeddings (matching the original sentence-transformers output)
Export script used:
```python
import torch

class FullModel(torch.nn.Module):
    # Wraps the SentenceTransformer so every module, including the Asym
    # projection, is traced into the ONNX graph.
    def __init__(self, st_model):
        super().__init__()
        self.model = st_model

    def forward(self, input_ids, attention_mask):
        features = {"input_ids": input_ids, "attention_mask": attention_mask}
        output = self.model(features)
        # "sentence_embedding" is the pooled, projected 1024-dim output
        return output["sentence_embedding"]
```
This ensures the asymmetric projection layer (768→1024) is included in the ONNX graph.
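For completeness, the export call that goes with this wrapper might look like the sketch below, continuing from the `FullModel` class above. The dummy input, dynamic axes, and opset version here are illustrative assumptions, not necessarily the exact values used for this PR:

```python
import torch
from sentence_transformers import SentenceTransformer

st_model = SentenceTransformer("TechWolf/JobBERT-v2")
wrapper = FullModel(st_model).eval()

# A dummy batch is only needed to trace the graph; the dynamic_axes below
# make batch size and sequence length variable in the exported model.
dummy = st_model.tokenizer(["example input"], return_tensors="pt")

torch.onnx.export(
    wrapper,
    (dummy["input_ids"], dummy["attention_mask"]),
    "model.onnx",
    input_names=["input_ids", "attention_mask"],
    output_names=["sentence_embedding"],
    dynamic_axes={
        "input_ids": {0: "batch", 1: "sequence"},
        "attention_mask": {0: "batch", 1: "sequence"},
        "sentence_embedding": {0: "batch"},
    },
    opset_version=14,
)
```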