Commit a24e3f0 · committed by Tom Aarsen · 1 Parent(s): f3cb564

Add Sentence Transformers compatibility

1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "word_embedding_dimension": 2048,
+   "pooling_mode_cls_token": false,
+   "pooling_mode_mean_tokens": true,
+   "pooling_mode_max_tokens": false,
+   "pooling_mode_mean_sqrt_len_tokens": false,
+   "pooling_mode_weightedmean_tokens": false,
+   "pooling_mode_lasttoken": false,
+   "include_prompt": true
+ }
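
Reviewer note: the pooling config above enables only `pooling_mode_mean_tokens`, so each text is reduced to a single 2048-dimensional vector by averaging its token vectors, and `include_prompt: true` means the prompt tokens (for example `query: `) take part in that average. The following is a minimal pure-Python sketch of masked mean pooling, not the library's implementation; the function name `mean_pool` and the toy two-dimensional vectors are assumptions for illustration only.

```python
from typing import List


def mean_pool(token_embeddings: List[List[float]], attention_mask: List[int]) -> List[float]:
    """Average the token vectors whose attention-mask entry is 1 (hypothetical helper)."""
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                summed[i] += value
    # Guard against an all-padding sequence to avoid dividing by zero.
    count = max(count, 1)
    return [s / count for s in summed]


# Toy sequence: three real tokens followed by one padding token (dimension 2
# instead of the model's 2048, purely to keep the example readable).
tokens = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [100.0, 100.0]]
mask = [1, 1, 1, 0]
pooled = mean_pool(tokens, mask)
print(pooled)  # [3.0, 4.0]
```

The padding vector is ignored because its mask entry is 0, which is why the large values in the last row do not affect the result.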
README.md CHANGED
@@ -8,9 +8,10 @@ tags:
  - text-embeddings
  - retrieval
  - semantic-search
+ - transformers
  language:
  - multilingual
- library_name: transformers
+ library_name: sentence-transformers
  ---

  ## **Model Overview**
@@ -61,13 +62,46 @@ This NeMo embedding model is a transformer encoder - a fine-tuned version of Lla

  ### **Installation**

+ ### **Sentence Transformers Usage**
+
  The model requires transformers version 4.47.1.

+ ```bash
+ pip install transformers==4.47.1 sentence-transformers
+ ```
+
+ ```python
+ from sentence_transformers import SentenceTransformer
+
+ model = SentenceTransformer("nvidia/llama-nemotron-embed-1b-v2", trust_remote_code=True)
+
+ queries = [
+     "how much protein should a female eat",
+     "summit define",
+ ]
+ documents = [
+     "As a general guideline, the CDC's average requirement of protein for women ages 19 to 70 is 46 grams per day. But, as you can see from this chart, you'll need to increase that if you're expecting or training for a marathon. Check out the chart below to see how much protein you should be eating each day.",
+     "Definition of summit for English Language Learners. : 1 the highest point of a mountain : the top of a mountain. : 2 the highest level. : 3 a meeting or series of meetings between the leaders of two or more governments."
+ ]
+
+ query_embeddings = model.encode_query(queries, convert_to_tensor=True)
+ document_embeddings = model.encode_document(documents, convert_to_tensor=True)
+
+ # Compute similarity scores
+ scores = model.similarity(query_embeddings, document_embeddings)
+ """
+ tensor([[ 0.5968, -0.0454],
+         [-0.0336,  0.4613]], device='cuda:0')
+ """
+ ```
+
+ ### **Transformers Usage**
+ You can also use transformers directly to run the model. The model requires transformers version 4.47.1.
+
  ```bash
  pip install transformers==4.47.1
  ```

- ### **Usage**
  ```python
  import torch
  import torch.nn.functional as F
@@ -133,8 +167,6 @@ print(scores.tolist())

  #Similarity scores:
  #[[0.5968121290206909, -0.04534469544887543], [-0.03361201286315918, 0.46140915155410767]]
-
-
  ```

  ### **Software Integration**
config_sentence_transformers.json ADDED
@@ -0,0 +1,14 @@
+ {
+   "model_type": "SentenceTransformer",
+   "__version__": {
+     "sentence_transformers": "5.0.1",
+     "transformers": "4.47.1",
+     "pytorch": "2.9.1+cu126"
+   },
+   "prompts": {
+     "query": "query: ",
+     "document": "passage: "
+   },
+   "default_prompt_name": null,
+   "similarity_fn_name": "cosine"
+ }
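
Reviewer note: the `prompts` block above is what lets `encode_query` and `encode_document` in the README example add the `query: ` and `passage: ` prefixes automatically, while `default_prompt_name: null` suggests that a plain `encode` call adds no prefix. The sketch below only illustrates that lookup in pure Python; `apply_prompt` is a hypothetical helper and not part of the Sentence Transformers API.

```python
import json

# The relevant part of config_sentence_transformers.json from this commit.
config = json.loads("""
{
  "prompts": {"query": "query: ", "document": "passage: "},
  "default_prompt_name": null
}
""")


def apply_prompt(texts, prompt_name=None):
    """Hypothetical helper: prefix each text with the named prompt, if any."""
    name = prompt_name or config["default_prompt_name"]
    prefix = config["prompts"][name] if name else ""
    return [prefix + text for text in texts]


print(apply_prompt(["summit define"], prompt_name="query"))
# ['query: summit define']
print(apply_prompt(["the top of a mountain"], prompt_name="document"))
# ['passage: the top of a mountain']
print(apply_prompt(["summit define"]))
# ['summit define']  (no default prompt, so the text is unchanged)
```

Because queries and passages get different prefixes, encoding a query with the document prompt (or with no prompt) would presumably produce embeddings that differ from the ones the README scores were computed with.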
modules.json ADDED
@@ -0,0 +1,20 @@
+ [
+   {
+     "idx": 0,
+     "name": "0",
+     "path": "",
+     "type": "sentence_transformers.models.Transformer"
+   },
+   {
+     "idx": 1,
+     "name": "1",
+     "path": "1_Pooling",
+     "type": "sentence_transformers.models.Pooling"
+   },
+   {
+     "idx": 2,
+     "name": "2",
+     "path": "2_Normalize",
+     "type": "sentence_transformers.models.Normalize"
+   }
+ ]
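
Reviewer note: `modules.json` chains three stages in order: the Transformer backbone, the Pooling step, and a final Normalize step. Since the last stage scales every embedding to unit length, the `cosine` similarity configured in `config_sentence_transformers.json` should coincide with a plain dot product on the outputs. The pure-Python sketch below checks that identity on toy vectors; the helper names are assumptions for illustration and are not library functions.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit Euclidean length (what a Normalize stage does)."""
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / norm for v in vector] if norm > 0 else list(vector)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


# Toy "pooled" vectors standing in for 2048-dimensional embeddings.
query = [3.0, 4.0]
document = [4.0, 3.0]

q_unit = l2_normalize(query)
d_unit = l2_normalize(document)

# After normalization the dot product and the cosine similarity agree
# (up to floating-point rounding).
print(dot(q_unit, d_unit))
print(cosine(query, document))
```

This is also why the Transformers example in the README can compute scores with a matrix product after `F.normalize`.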
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
+ {
+   "max_seq_length": 131072,
+   "do_lower_case": false
+ }
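
Reviewer note: `max_seq_length` caps the number of tokens per input at 131072; longer inputs are expected to be truncated at tokenization time rather than rejected. The sketch below shows the effect on a plain list of token ids; `truncate` is a hypothetical helper, and the integer ids are stand-ins for real tokenizer output.

```python
import json

# sentence_bert_config.json from this commit.
config = json.loads('{"max_seq_length": 131072, "do_lower_case": false}')


def truncate(token_ids, max_seq_length):
    """Hypothetical helper: keep at most max_seq_length tokens, dropping the tail."""
    return token_ids[:max_seq_length]


long_input = list(range(200_000))  # stand-in for an over-long tokenized document
kept = truncate(long_input, config["max_seq_length"])
print(len(kept))  # 131072
```

In practice, very long sequences are memory-intensive, so users may want to lower the limit on the loaded model (Sentence Transformers exposes a settable `max_seq_length` attribute) instead of relying on the configured maximum.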