Add exported ONNX model 'model.onnx'

#3
by Ahmad09 - opened

Hello!

This pull request has been automatically generated from the Sentence Transformers backend-export Space.

Pull Request overview

  • Add exported ONNX model model.onnx.

Tip:

Consider testing this pull request before merging by loading the model from this PR with the revision argument:

from sentence_transformers import SentenceTransformer

# PR number of this pull request (#3 above)
pr_number = 3
model = SentenceTransformer(
    "TechWolf/JobBERT-v2",
    revision=f"refs/pr/{pr_number}",
    backend="onnx",
)

# Verify that everything works as expected
embeddings = model.encode(["The weather is lovely today.", "It's so sunny outside!", "He drove to the stadium."])
print(embeddings.shape)

similarities = model.similarity(embeddings, embeddings)
print(similarities)

Issue: ONNX Export Missing Asym Layer (768-dim instead of 1024-dim)

Hi,
I'm trying to deploy JobBERT-v2 using Text Embeddings Inference (TEI) for production use.
The Issue:
The exported ONNX model produces 768-dimensional embeddings instead of the expected 1024-dimensional output. This appears to be because the Asym (asymmetric projection) layer that transforms 768→1024 dimensions is not included in the ONNX export.
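
To make the mismatch concrete, here is a minimal sketch that queries the exported file directly, the way TEI consumes it (assuming model.onnx has been downloaded locally and that onnxruntime and transformers are installed; the input names are read from the session rather than hard-coded):

import onnxruntime as ort
from transformers import AutoTokenizer

# Load the exported graph directly, bypassing the sentence-transformers wrapper
tokenizer = AutoTokenizer.from_pretrained("TechWolf/JobBERT-v2")
session = ort.InferenceSession("model.onnx")

# Feed only the inputs the graph actually declares
encoded = tokenizer("software engineer", return_tensors="np")
input_names = {i.name for i in session.get_inputs()}
feed = {k: v for k, v in encoded.items() if k in input_names}

outputs = session.run(None, feed)
print(outputs[0].shape)  # last dimension is 768 here, not the expected 1024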

TEI's logs show (when not using the ONNX backend):

WARN: modules.json could be downloaded but parsing the modules failed:
unknown variant sentence_transformers.models.Asym

Why This Matters:

I have a large production dataset already embedded with the 1024-dimensional vectors from sentence-transformers. Re-embedding would require significant time and compute resources.

Request:

Would it be possible to provide an ONNX export that includes the Asym projection layer baked in, so the model outputs 1024-dimensional embeddings? Alternatively, any guidance on how to properly export the full model graph to ONNX would be greatly appreciated.

Thank you for your help!

I've updated the ONNX export to include the Asym projection layer.

Changes made:

  • Re-exported model.onnx using a custom script that wraps the full SentenceTransformer forward pass.
  • The model now outputs 1024-dimensional embeddings, matching the original sentence-transformers output.

Export script used:

import torch

class FullModel(torch.nn.Module):
    """Wrap the full SentenceTransformer so every module in its pipeline,
    including the 768→1024 Asym projection, ends up in the traced graph."""

    def __init__(self, st_model):
        super().__init__()
        self.model = st_model

    def forward(self, input_ids, attention_mask):
        # SentenceTransformer.forward expects a feature dict and returns one;
        # returning "sentence_embedding" captures the output of the last module
        features = {"input_ids": input_ids, "attention_mask": attention_mask}
        output = self.model(features)
        return output["sentence_embedding"]

This ensures the asymmetric projection layer (768→1024) is included in the ONNX graph.
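
The export invocation itself is not shown above; a hedged sketch of how the wrapper could be traced and sanity-checked follows (it reuses the FullModel class from the script; the dummy input text, opset version, and output path are illustrative assumptions, not the exact values used):

import torch
import onnxruntime as ort
from sentence_transformers import SentenceTransformer

st_model = SentenceTransformer("TechWolf/JobBERT-v2")
wrapper = FullModel(st_model).eval()

# Dummy batch for tracing; dynamic axes keep batch size and sequence length flexible
tokens = st_model.tokenizer(["software engineer"], return_tensors="pt")

torch.onnx.export(
    wrapper,
    (tokens["input_ids"], tokens["attention_mask"]),
    "model.onnx",
    input_names=["input_ids", "attention_mask"],
    output_names=["sentence_embedding"],
    dynamic_axes={
        "input_ids": {0: "batch", 1: "sequence"},
        "attention_mask": {0: "batch", 1: "sequence"},
        "sentence_embedding": {0: "batch"},
    },
    opset_version=14,
)

# Quick shape check on the re-exported file
session = ort.InferenceSession("model.onnx")
out = session.run(None, {
    "input_ids": tokens["input_ids"].numpy(),
    "attention_mask": tokens["attention_mask"].numpy(),
})
print(out[0].shape)  # expect (1, 1024) with the Asym projection included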

