This is a sentence-transformers model finetuned from sentence-transformers/all-MiniLM-L6-v2. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
SentenceTransformer(
  (0): Transformer({'max_seq_length': 256, 'do_lower_case': False}) with Transformer model: BertModel
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
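The Pooling and Normalize stages listed above can be illustrated with a minimal pure-Python sketch. It assumes mean pooling over non-padding tokens followed by L2 normalization, as the configuration suggests; the function names `mean_pool` and `l2_normalize` are hypothetical and not part of the library, and the toy vectors have dimension 2 rather than 384.

```python
import math

def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose mask entry is 1 (padding is skipped)."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                total[i] += value
    count = max(count, 1)  # guard against an all-padding input
    return [value / count for value in total]

def l2_normalize(vector, eps=1e-12):
    """Scale a vector to unit length, as a normalization stage would."""
    norm = math.sqrt(sum(value * value for value in vector))
    return [value / max(norm, eps) for value in vector]

# Toy input: three token vectors of dimension 2; the last token is padding.
tokens = [[1.0, 0.0], [3.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 0]
pooled = mean_pool(tokens, mask)
sentence_embedding = l2_normalize(pooled)
print(pooled)
print(sentence_embedding)
```

Because the output has unit length, the dot product of two such embeddings equals their cosine similarity.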
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("ayushexel/embed-all-MiniLM-L6-v2-squad-10-epochs")
# Run inference
sentences = [
'To whom did Herschel present his work on infrared radiation?',
'The discovery of infrared radiation is ascribed to William Herschel, the astronomer, in the early 19th century. Herschel published his results in 1800 before the Royal Society of London. Herschel used a prism to refract light from the sun and detected the infrared, beyond the red part of the spectrum, through an increase in the temperature recorded on a thermometer. He was surprised at the result and called them "Calorific Rays". The term \'Infrared\' did not appear until late in the 19th century.',
'The discovery of infrared radiation is ascribed to William Herschel, the astronomer, in the early 19th century. Herschel published his results in 1800 before the Royal Society of London. Herschel used a prism to refract light from the sun and detected the infrared, beyond the red part of the spectrum, through an increase in the temperature recorded on a thermometer. He was surprised at the result and called them "Calorific Rays". The term \'Infrared\' did not appear until late in the 19th century.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 384)
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
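As a rough illustration of what the similarity call computes, the following sketch builds a pairwise cosine-similarity matrix in plain Python. It assumes the model's similarity function is cosine; `cosine_similarity` and `similarity_matrix` are hypothetical helper names, and the toy vectors stand in for real 384-dimensional embeddings.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def similarity_matrix(embeddings):
    """Pairwise cosine similarities; entry [i][j] compares item i with item j."""
    return [[cosine_similarity(a, b) for b in embeddings] for a in embeddings]

# Toy embeddings: the first and third are identical, the second is orthogonal.
toy_embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
matrix = similarity_matrix(toy_embeddings)
for row in matrix:
    print(row)
```

Identical inputs, such as the second and third sentences in the example above, produce identical embeddings and therefore a similarity of 1.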
Dataset gooqa-dev, evaluated with TripletEvaluator:

| Metric | Value |
|---|---|
| cosine_accuracy | 0.4098 |
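The cosine_accuracy metric can be read as the fraction of triplets in which the question is closer to its context than to its negative. The sketch below is a simplified illustration under that assumption, not the evaluator's actual code; `triplet_cosine_accuracy` is a hypothetical name and the vectors are toy data.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def triplet_cosine_accuracy(anchors, positives, negatives):
    """Share of triplets where sim(anchor, positive) > sim(anchor, negative)."""
    correct = 0
    for anchor, positive, negative in zip(anchors, positives, negatives):
        if cosine_similarity(anchor, positive) > cosine_similarity(anchor, negative):
            correct += 1
    return correct / len(anchors)

# Two toy triplets: the first is ranked correctly, the second is not.
anchors = [[1.0, 0.0], [1.0, 0.0]]
positives = [[1.0, 0.0], [0.0, 1.0]]
negatives = [[0.0, 1.0], [1.0, 0.0]]
accuracy = triplet_cosine_accuracy(anchors, positives, negatives)
print(accuracy)
```

With a strict comparison, a triplet whose positive and negative are the same text can never count as correct.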
Columns: question, context, and negative

| | question | context | negative |
|---|---|---|---|
| type | string | string | string |

Samples:

| question | context | negative |
|---|---|---|
| As cartridges couldn't be returned to Nintendo, the developers took on the entirety of what? | Nintendo was not as restrictive as Sega, which did not permit third-party publishing until Mediagenic in late summer 1988. Nintendo's intention, however, was to reserve a large part of NES game revenue for itself. Nintendo required that they be the sole manufacturer of all cartridges, and that the publisher had to pay in full before the cartridges for that game be produced. Cartridges could not be returned to Nintendo, so publishers assumed all the risk. As a result, some publishers lost more money due to distress sales of remaining inventory at the end of the NES era than they ever earned in profits from sales of the games. Because Nintendo controlled the production of all cartridges, it was able to enforce strict rules on its third-party developers, which were required to sign a contract by Nintendo that would obligate these parties to develop exclusively for the system, order at least 10,000 cartridges, and only make five games per year. A 1988 shortage of DRAM and ROM chips also re... | The back of the cartridge bears a label with instructions on handling. Production and software revision codes were imprinted as stamps on the back label to correspond with the software version and producer. With the exception of The Legend of Zelda and Zelda II: The Adventure of Link, manufactured in gold-plastic carts, all licensed NTSC and PAL cartridges are a standard shade of gray plastic. Unlicensed carts were produced in black, robin egg blue, and gold, and are all slightly different shapes than standard NES cartridges. Nintendo also produced yellow-plastic carts for internal use at Nintendo Service Centers, although these "test carts" were never made available for purchase. All licensed US cartridges were made by Nintendo, Konami and Acclaim. For promotion of DuckTales: Remastered, Capcom sent 150 limited-edition gold NES cartridges with the original game, featuring the Remastered art as the sticker, to different gaming news agencies. The instruction label on the back included t... |
| During what time did Arsenal play at Wembley? | Highbury could hold more than 60,000 spectators at its peak, and had a capacity of 57,000 until the early 1990s. The Taylor Report and Premier League regulations obliged Arsenal to convert Highbury to an all-seater stadium in time for the 1993–94 season, thus reducing the capacity to 38,419 seated spectators. This capacity had to be reduced further during Champions League matches to accommodate additional advertising boards, so much so that for two seasons, from 1998 to 2000, Arsenal played Champions League home matches at Wembley, which could house more than 70,000 spectators. | Arsenal's longest-running and deepest rivalry is with their nearest major neighbours, Tottenham Hotspur; matches between the two are referred to as North London derbies. Other rivalries within London include those with Chelsea, Fulham and West Ham United. In addition, Arsenal and Manchester United developed a strong on-pitch rivalry in the late 1980s, which intensified in recent years when both clubs were competing for the Premier League title – so much so that a 2003 online poll by the Football Fans Census listed Manchester United as Arsenal's biggest rivals, followed by Tottenham and Chelsea. A 2008 poll listed the Tottenham rivalry as more important. |
| What causes fermentation during the brewing process when making a beer? | The basic ingredients of beer are water; a starch source, such as malted barley, able to be saccharified (converted to sugars) then fermented (converted into ethanol and carbon dioxide); a brewer's yeast to produce the fermentation; and a flavouring such as hops. A mixture of starch sources may be used, with a secondary starch source, such as maize (corn), rice or sugar, often being termed an adjunct, especially when used as a lower-cost substitute for malted barley. Less widely used starch sources include millet, sorghum and cassava root in Africa, and potato in Brazil, and agave in Mexico, among others. The amount of each starch source in a beer recipe is collectively called the grain bill. | The strength of beers has climbed during the later years of the 20th century. Vetter 33, a 10.5% abv (33 degrees Plato, hence Vetter "33") doppelbock, was listed in the 1994 Guinness Book of World Records as the strongest beer at that time, though Samichlaus, by the Swiss brewer Hürlimann, had also been listed by the Guinness Book of World Records as the strongest at 14% abv. Since then, some brewers have used champagne yeasts to increase the alcohol content of their beers. Samuel Adams reached 20% abv with Millennium, and then surpassed that amount to 25.6% abv with Utopias. The strongest beer brewed in Britain was Baz's Super Brew by Parish Brewery, a 23% abv beer. In September 2011, the Scottish brewery BrewDog produced Ghost Deer, which, at 28%, they claim to be the world's strongest beer produced by fermentation alone. |

MultipleNegativesRankingLoss with these parameters:
{
  "scale": 20.0,
  "similarity_fct": "cos_sim"
}
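To make the two loss parameters concrete, here is a minimal pure-Python sketch of a multiple-negatives ranking objective: each question is scored against every context in the batch with cosine similarity, the scores are multiplied by the scale (20.0), and cross-entropy is taken with the matching context as the target. This is an illustrative simplification that uses only in-batch negatives; entries from an explicit negative column would act as additional candidates. The function names are hypothetical.

```python
import math

def cos_sim(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def multiple_negatives_ranking_loss(questions, contexts, scale=20.0):
    """Mean cross-entropy where contexts[i] is the correct match for questions[i]."""
    losses = []
    for i, question in enumerate(questions):
        logits = [scale * cos_sim(question, context) for context in contexts]
        # Numerically stable log-sum-exp over all candidates in the batch.
        top = max(logits)
        log_sum_exp = top + math.log(sum(math.exp(v - top) for v in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)

# Toy batch of two pairs with orthogonal embeddings.
questions = [[1.0, 0.0], [0.0, 1.0]]
aligned_contexts = [[1.0, 0.0], [0.0, 1.0]]
swapped_contexts = [[0.0, 1.0], [1.0, 0.0]]
loss_aligned = multiple_negatives_ranking_loss(questions, aligned_contexts)
loss_swapped = multiple_negatives_ranking_loss(questions, swapped_contexts)
print(loss_aligned, loss_swapped)
```

The scale sharpens the softmax: with cosine values bounded in [-1, 1], a factor of 20 lets a well-separated pair drive the loss close to zero.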
Columns: question, context, and negative_1

| | question | context | negative_1 |
|---|---|---|---|
| type | string | string | string |

Samples:

| question | context | negative_1 |
|---|---|---|
| Who was the first contestant to benefit from the Judges Save? | There were 13 finalists this season, but two were eliminated in the first result show of the finals. A new feature introduced was the "Judges' Save", and Matt Giraud was saved from elimination at the top seven by the judges when he received the fewest votes. The next week, Lil Rounds and Anoop Desai were eliminated. | There were 13 finalists this season, but two were eliminated in the first result show of the finals. A new feature introduced was the "Judges' Save", and Matt Giraud was saved from elimination at the top seven by the judges when he received the fewest votes. The next week, Lil Rounds and Anoop Desai were eliminated. |
| What anti-communist archbisoph died in 1960? | In 1966 an agreement with the Vatican, fostered in part by the death in 1960 of anti-communist archbishop of Zagreb Aloysius Stepinac and shifts in the church's approach to resisting communism originating in the Second Vatican Council, accorded new freedom to the Yugoslav Roman Catholic Church, particularly to catechize and open seminaries. The agreement also eased tensions, which had prevented the naming of new bishops in Yugoslavia since 1945. Tito's new socialism met opposition from traditional communists culminating in conspiracy headed by Aleksandar Ranković. In the same year Tito declared that Communists must henceforth chart Yugoslavia's course by the force of their arguments (implying an abandonment of Leninist orthodoxy and development of liberal Communism). The State Security Administration (UDBA) saw its power scaled back and its staff reduced to 5000. | In 1966 an agreement with the Vatican, fostered in part by the death in 1960 of anti-communist archbishop of Zagreb Aloysius Stepinac and shifts in the church's approach to resisting communism originating in the Second Vatican Council, accorded new freedom to the Yugoslav Roman Catholic Church, particularly to catechize and open seminaries. The agreement also eased tensions, which had prevented the naming of new bishops in Yugoslavia since 1945. Tito's new socialism met opposition from traditional communists culminating in conspiracy headed by Aleksandar Ranković. In the same year Tito declared that Communists must henceforth chart Yugoslavia's course by the force of their arguments (implying an abandonment of Leninist orthodoxy and development of liberal Communism). The State Security Administration (UDBA) saw its power scaled back and its staff reduced to 5000. |
| Where was an attempt made to take the torch? | Of the 80 torch-bearers in London, Sir Steve Redgrave, who started the relay, mentioned to the media that he had received e-mailed pleas to boycott the event and could "see why they would like to make an issue" of it. Francesca Martinez and Richard Vaughan refused to carry the torch, while Konnie Huq decided to carry it and also speak out against China. The pro-Tibetan Member of Parliament Norman Baker asked all bearers to reconsider. Amid pressure from both directions, Prime Minister Gordon Brown welcomed the torch outside 10 Downing Street without holding or touching it. The London relay saw the torch surrounded by what the BBC described as "a mobile protective ring." Protests began as soon as Redgrave started the event, leading to at least thirty-five arrests. In Ladbroke Grove a demonstrator attempted to snatch the torch from Konnie Huq in a momentary struggle, and in a separate incident, a fire extinguisher was set off near the torch. The Chinese ambassador carried the torch throu... | Of the 80 torch-bearers in London, Sir Steve Redgrave, who started the relay, mentioned to the media that he had received e-mailed pleas to boycott the event and could "see why they would like to make an issue" of it. Francesca Martinez and Richard Vaughan refused to carry the torch, while Konnie Huq decided to carry it and also speak out against China. The pro-Tibetan Member of Parliament Norman Baker asked all bearers to reconsider. Amid pressure from both directions, Prime Minister Gordon Brown welcomed the torch outside 10 Downing Street without holding or touching it. The London relay saw the torch surrounded by what the BBC described as "a mobile protective ring." Protests began as soon as Redgrave started the event, leading to at least thirty-five arrests. In Ladbroke Grove a demonstrator attempted to snatch the torch from Konnie Huq in a momentary struggle, and in a separate incident, a fire extinguisher was set off near the torch. The Chinese ambassador carried the torch throu... |

MultipleNegativesRankingLoss with these parameters:
{
  "scale": 20.0,
  "similarity_fct": "cos_sim"
}
Non-default hyperparameters:

- eval_strategy: steps
- per_device_train_batch_size: 128
- per_device_eval_batch_size: 128
- num_train_epochs: 10
- warmup_ratio: 0.1
- fp16: True
- batch_sampler: no_duplicates

All hyperparameters:

- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 128
- per_device_eval_batch_size: 128
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 10
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.1
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: True
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- tp_size: 0
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: None
- batch_sampler: no_duplicates
- multi_dataset_batch_sampler: proportional

Training logs:

| Epoch | Step | Training Loss | Validation Loss | gooqa-dev_cosine_accuracy |
|---|---|---|---|---|
| -1 | -1 | - | - | 0.3262 |
| 0.2882 | 100 | 0.4488 | 0.8070 | 0.3810 |
| 0.5764 | 200 | 0.3927 | 0.7747 | 0.3930 |
| 0.8646 | 300 | 0.3949 | 0.7766 | 0.3952 |
| 1.1527 | 400 | 0.3429 | 0.7624 | 0.3982 |
| 1.4409 | 500 | 0.3066 | 0.7514 | 0.4070 |
| 1.7291 | 600 | 0.3102 | 0.7418 | 0.4110 |
| 2.0173 | 700 | 0.3128 | 0.7473 | 0.4112 |
| 2.3055 | 800 | 0.2137 | 0.7451 | 0.4082 |
| 2.5937 | 900 | 0.221 | 0.7476 | 0.4110 |
| 2.8818 | 1000 | 0.2207 | 0.7496 | 0.4078 |
| 3.1700 | 1100 | 0.1822 | 0.7525 | 0.4090 |
| 3.4582 | 1200 | 0.1668 | 0.7506 | 0.4062 |
| 3.7464 | 1300 | 0.1699 | 0.7561 | 0.4052 |
| 4.0346 | 1400 | 0.1656 | 0.7509 | 0.4088 |
| 4.3228 | 1500 | 0.1305 | 0.7563 | 0.4108 |
| 4.6110 | 1600 | 0.1369 | 0.7600 | 0.4036 |
| 4.8991 | 1700 | 0.1381 | 0.7621 | 0.4056 |
| 5.1873 | 1800 | 0.1149 | 0.7703 | 0.4086 |
| 5.4755 | 1900 | 0.1159 | 0.7639 | 0.4094 |
| 5.7637 | 2000 | 0.1132 | 0.7652 | 0.4088 |
| 6.0519 | 2100 | 0.109 | 0.7649 | 0.4022 |
| 6.3401 | 2200 | 0.0947 | 0.7699 | 0.4160 |
| 6.6282 | 2300 | 0.0978 | 0.7759 | 0.4094 |
| 6.9164 | 2400 | 0.0994 | 0.7694 | 0.4064 |
| 7.2046 | 2500 | 0.0903 | 0.7718 | 0.4120 |
| 7.4928 | 2600 | 0.0897 | 0.7721 | 0.4096 |
| 7.7810 | 2700 | 0.085 | 0.7767 | 0.4128 |
| 8.0692 | 2800 | 0.0859 | 0.7709 | 0.4132 |
| 8.3573 | 2900 | 0.0807 | 0.7740 | 0.4102 |
| 8.6455 | 3000 | 0.0798 | 0.7744 | 0.4100 |
| 8.9337 | 3100 | 0.078 | 0.7757 | 0.4072 |
| 9.2219 | 3200 | 0.0749 | 0.7767 | 0.4098 |
| 9.5101 | 3300 | 0.0751 | 0.7761 | 0.4084 |
| 9.7983 | 3400 | 0.076 | 0.7763 | 0.4106 |
| -1 | -1 | - | - | 0.4098 |
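The learning-rate settings listed with the hyperparameters (learning_rate 5e-05, lr_scheduler_type linear, warmup_ratio 0.1) imply a ramp up from zero followed by a linear decay back to zero. The sketch below illustrates that shape. It is not the trainer's code; `linear_schedule_lr` is a hypothetical name, and the total of 3470 steps is an assumption inferred from the log (step 100 at epoch 0.2882 gives roughly 347 steps per epoch over 10 epochs).

```python
import math

def linear_schedule_lr(step, total_steps, base_lr=5e-05, warmup_ratio=0.1):
    """Learning rate at a given step for linear warmup then linear decay."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Ramp up linearly from 0 to base_lr over the warmup phase.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr to 0 over the remaining steps.
    remaining = total_steps - step
    return base_lr * max(0.0, remaining / max(1, total_steps - warmup_steps))

TOTAL_STEPS = 3470  # assumption: inferred from the training log, not stated in the card
for step in (0, 100, 1000, 2000, 3400, TOTAL_STEPS):
    print(step, linear_schedule_lr(step, TOTAL_STEPS))
```

Under this assumption the warmup ends shortly after the first epoch, so most of the logged steps fall in the decay phase.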
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}