
metadata

language:
  - en
license: apache-2.0
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - dense
  - generated_from_trainer
  - dataset_size:209415
  - loss:CachedMultipleNegativesRankingLoss
base_model: Qwen/Qwen3-Embedding-0.6B
widget:
  - source_sentence: Liver cancer treatments options
    sentences:
      - >-
        The patient underwent mesohepatectomy with total caudate lobectomy of
        the liver.
      - >-
        The patient was treated empirically with amikacin and ceftazidime, which
        was later replaced with meropenem and piperacillin-tazobactam. From
        August 3 to August 11, the patient was treated with colistin.
      - >-
        A porto-systemic shunt was created to increase blood flow. Transarterial
        chemoembolization (TACE) was performed using 5-fluorouracil (5-FU),
        lobaplatin, and pirarubicin to treat tumor thrombi in the Right
        IntraHepatic Vessels (RIPV).
  - source_sentence: What was the outcome of the right adrenalectomy?
    sentences:
      - >-
        Overall, the patient was observed for an extended period to monitor the
        progress of the condition, until the time of discharge. The patient was
        advised to continue follow-up with his physician at regular intervals,
        and it was recommended that the patient follow the prescribed plan of
        care to maintain the best possible clinical outcome.
      - >-
        The right adrenalectomy was performed. Histology was consistent with the
        material difficult to typify. Immunohistochemistry was positive for CK20
        and cytokeratin AE1/AE3, but negative for CK7, which is related to
        colorectal metastasis.
      - >-
        The combination of the thyroid nodule and adrenal mass was surgically
        removed, and the diagnosis of VIP-secreting pheochromocytoma was made.
  - source_sentence: giant omphalocele symptoms
    sentences:
      - >-
        Although the treatment initially showed signs of efficacy, the tumor
        progressed rapidly, and the patient died three months after the second
        surgical debulking procedure.
      - >-
        The patient, a 9-year-old female, presented to the hospital with a large
        lump in the anterior abdominal wall extending from the xiphisternum to
        the level of iliac crest.
      - >-
        The patient presented with bilateral nasovestibular lumps which grew in
        size over several months, occluding nasal entrance and protruding
        outside the nose.
datasets:
  - abhinand/MedEmbed-training-triplets-v1
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
  - cosine_accuracy
  - cosine_accuracy@1
  - cosine_accuracy@3
  - cosine_accuracy@5
  - cosine_accuracy@10
  - cosine_precision@1
  - cosine_precision@3
  - cosine_precision@5
  - cosine_precision@10
  - cosine_recall@1
  - cosine_recall@3
  - cosine_recall@5
  - cosine_recall@10
  - cosine_ndcg@10
  - cosine_mrr@10
  - cosine_map@100
model-index:
  - name: Qwen3-MedEmbed-0.6B
    results:
      - task:
          type: triplet
          name: Triplet
        dataset:
          name: triplet eval
          type: triplet-eval
        metrics:
          - type: cosine_accuracy
            value: 0.9743435382843018
            name: Cosine Accuracy
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: eval set 23kq 46kd
          type: eval-set-23kq-46kd
        metrics:
          - type: cosine_accuracy@1
            value: 0.4359018436546478
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.8112510206712794
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 0.9102668786797885
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 0.9586144655980059
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.4359018436546478
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.27041700689042647
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.18205337573595776
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.09586144655980061
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.4359018436546478
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.8112510206712794
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 0.9102668786797885
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 0.9586144655980059
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.7130583629700963
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.6320255950590608
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.6330765121051298
            name: Cosine Map@100
      - task:
          type: information-retrieval
          name: Information Retrieval
        dataset:
          name: eval set 23kq 23kd
          type: eval-set-23kq-23kd
        metrics:
          - type: cosine_accuracy@1
            value: 0.5437706820232928
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.899694873007005
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 0.9529846576990846
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 0.9703038377240105
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.5437706820232928
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.29989829100233495
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.19059693153981697
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.09703038377240106
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.5437706820232928
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.899694873007005
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 0.9529846576990846
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 0.9703038377240105
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.7836589422172208
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.7205542048928318
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.7209152572399424
            name: Cosine Map@100

Qwen3-MedEmbed-0.6B

This is a sentence-transformers model fine-tuned from Qwen/Qwen3-Embedding-0.6B on the med_embed-training-triplets-v1 dataset. It maps sentences and paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

Model Type: Sentence Transformer
Base model: Qwen/Qwen3-Embedding-0.6B
Maximum Sequence Length: 512 tokens
Output Dimensionality: 1024 dimensions
Similarity Function: Cosine Similarity
Training Dataset:
- med_embed-training-triplets-v1
Language: en
License: apache-2.0

Model Sources

Documentation: Sentence Transformers Documentation
Repository: Sentence Transformers on GitHub
Hugging Face: Sentence Transformers on Hugging Face

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False, 'architecture': 'Qwen3Model'})
  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': True, 'include_prompt': True})
  (2): Normalize()
)
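
The architecture above ends in last-token pooling followed by L2 normalization. As a rough illustration of what those two modules compute, here is a NumPy sketch on toy data. The function names `last_token_pool` and `l2_normalize` are made up for this example and are not the library's internals; the toy tensors use 3 dimensions instead of 1024.

```python
import numpy as np

def last_token_pool(token_embeddings: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    """Pick the embedding of the last non-padding token in each sequence."""
    seq_len = attention_mask.shape[1]
    # Position of the last 1 in each mask row (works for left or right padding).
    last_idx = seq_len - 1 - np.argmax(attention_mask[:, ::-1], axis=1)
    return token_embeddings[np.arange(token_embeddings.shape[0]), last_idx]

def l2_normalize(x: np.ndarray, eps: float = 1e-12) -> np.ndarray:
    """Scale each row to unit length."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.maximum(norms, eps)

# Toy batch: 2 sequences, 4 token positions, 3 dimensions.
rng = np.random.default_rng(0)
token_embeddings = rng.normal(size=(2, 4, 3))
attention_mask = np.array([
    [0, 1, 1, 1],  # left-padded sequence
    [1, 1, 1, 1],  # full-length sequence
])

sentence_embeddings = l2_normalize(last_token_pool(token_embeddings, attention_mask))
print(sentence_embeddings.shape)
```

Because the output rows have unit length, the cosine similarity between two embeddings reduces to their dot product. With left padding (see `padding_side` in the Usage section), the last position of every row is a real token.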

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

import torch
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer(
    "luluw/Qwen3-MedEmbed-0.6B",
    model_kwargs={
        # Optional: needs the flash-attn package and a supported GPU; remove this line otherwise
        "attn_implementation": "flash_attention_2",
        "dtype": torch.bfloat16,
        "device_map": "auto",
    },
    tokenizer_kwargs={"padding_side": "left"},
)

# Run inference
queries = [
    "diagnostic criteria for neurofibromatosis",
]
documents = [
    'The patient had a history of type I Neurofibromatosis diagnosed 20 years previously. On examination, the patient exhibited cutaneous nodules/café-au-lait spots scoliosis.',
    'The diagnosis includes: 1. Developmental delays. 2. Microdeletion of 1q21.1-1q21.2. 3. AUTS2 gene deletion. 4. Xq28 duplication syndrome.',
    'The patient was re-presented to the Heart Transplant Selection Committee, and was listed for heart transplant given excellent estimated 5-year survival rate for her breast cancer.',
]
query_embeddings = model.encode_query(queries)
document_embeddings = model.encode_document(documents)
print(query_embeddings.shape, document_embeddings.shape)
# (1, 1024) (3, 1024)

# Get the similarity scores for the embeddings
similarities = model.similarity(query_embeddings, document_embeddings)
print(similarities)
# tensor([[ 0.7985,  0.2617, -0.1669]])
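
For retrieval, the similarity scores are typically turned into a ranking. The snippet below is a small sketch of that step using the scores printed in the example above; `top_k` is a hypothetical helper written for this illustration, not part of Sentence Transformers.

```python
import numpy as np

# Scores copied from the example output above (one query, three documents).
similarities = np.array([[0.7985, 0.2617, -0.1669]])

def top_k(scores: np.ndarray, k: int) -> list[list[tuple[int, float]]]:
    """For each query row, return the k best (document index, score) pairs."""
    order = np.argsort(-scores, axis=1)[:, :k]
    return [
        [(int(j), float(row[j])) for j in idx]
        for row, idx in zip(scores, order)
    ]

ranked = top_k(similarities, k=2)
print(ranked)  # [[(0, 0.7985), (1, 0.2617)]]
```

Document 0, the neurofibromatosis case, ranks first for the query.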

Evaluation

Metrics

Triplet

Dataset: triplet-eval
Evaluated with TripletEvaluator

Metric	Value
cosine_accuracy	0.9743
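
As a rough guide to what this number means: cosine accuracy on triplets is the fraction of (query, pos, neg) triplets where the query is closer to the positive than to the negative. The sketch below computes it with NumPy on toy vectors; the function names are invented for this example, and the strict `>` comparison is an assumption about how ties are handled.

```python
import numpy as np

def cosine_sim_rows(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Row-wise cosine similarity between two equally shaped matrices."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return np.sum(a * b, axis=1)

def triplet_cosine_accuracy(q: np.ndarray, pos: np.ndarray, neg: np.ndarray) -> float:
    """Share of triplets where sim(query, pos) > sim(query, neg)."""
    return float(np.mean(cosine_sim_rows(q, pos) > cosine_sim_rows(q, neg)))

# Two toy triplets: the first is ranked correctly, the second is not.
q = np.array([[1.0, 0.0], [0.0, 1.0]])
pos = np.array([[0.9, 0.1], [1.0, 0.2]])
neg = np.array([[0.0, 1.0], [0.1, 1.0]])

accuracy = triplet_cosine_accuracy(q, pos, neg)
print(accuracy)  # 0.5
```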

Information Retrieval

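The dataset was split 90:10 into training and evaluation sets. The evaluation set has 23,269 samples. For information retrieval, an extra 23,269 negative samples were added as additional documents, giving the larger 46,538-document corpus.

Datasets: eval-set-23kq-46kd (23,269 queries, 46,538 documents) and eval-set-23kq-23kd (23,269 queries, 23,269 documents)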
Evaluated with InformationRetrievalEvaluator

Metric	eval-set-23kq-46kd	eval-set-23kq-23kd
cosine_accuracy@1	0.4359	0.5438
cosine_accuracy@3	0.8113	0.8997
cosine_accuracy@5	0.9103	0.9530
cosine_accuracy@10	0.9586	0.9703
cosine_precision@1	0.4359	0.5438
cosine_precision@3	0.2704	0.2999
cosine_precision@5	0.1821	0.1906
cosine_precision@10	0.0959	0.0970
cosine_recall@1	0.4359	0.5438
cosine_recall@3	0.8113	0.8997
cosine_recall@5	0.9103	0.9530
cosine_recall@10	0.9586	0.9703
cosine_ndcg@10	0.7131	0.7837
cosine_mrr@10	0.6320	0.7206
cosine_map@100	0.6331	0.7209
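
In the table, accuracy@k equals recall@k and precision@k equals accuracy@k divided by k (for example 0.8113 / 3 ≈ 0.2704). That pattern is consistent with each query having exactly one relevant document. Under that assumption, the metrics can be sketched from the rank of the relevant document alone. `ir_metrics` and the toy ranks below are illustrative only and do not reproduce the evaluator's code.

```python
import math

def ir_metrics(ranks: list[int], k: int) -> dict[str, float]:
    """Metrics at cutoff k, given the 1-based rank of the single
    relevant document for each query (assumption: one per query)."""
    n = len(ranks)
    accuracy = sum(r <= k for r in ranks) / n
    return {
        "accuracy": accuracy,
        "precision": accuracy / k,   # at most one hit among k results
        "recall": accuracy,          # the one relevant doc is found or not
        "mrr": sum(1.0 / r for r in ranks if r <= k) / n,
        # Ideal DCG is 1 with a single relevant document.
        "ndcg": sum(1.0 / math.log2(r + 1) for r in ranks if r <= k) / n,
    }

# Toy example: five queries, relevant document found at these ranks.
ranks = [1, 2, 1, 4, 3]
metrics = ir_metrics(ranks, k=3)
for name, value in metrics.items():
    print(f"{name}@3: {value:.4f}")
```

The query whose relevant document sits at rank 4 counts as a miss at k=3 for every metric.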

Training Details

Training Dataset

med_embed-training-triplets-v1

Dataset: med_embed-training-triplets-v1 at 0b344f0
Size: 209,415 training samples
Columns: query, pos, and neg

Approximate statistics based on the first 1000 samples:

	query	pos	neg
type	string	string	string
details	min: 4 tokens mean: 10.1 tokens max: 26 tokens	min: 4 tokens mean: 38.98 tokens max: 138 tokens	min: 6 tokens mean: 37.1 tokens max: 155 tokens

Samples:

query	pos	neg
`play therapy for trichotillomania`	`The patient was subjected to play therapy and behavioural counselling that involved her parents as co-therapists. In play therapy, she was encouraged to gain self-confidence and overcome her anxiety.`	`The patient, a 7-year-old male, was admitted to the pediatric outpatient department with aggressive and hyperactive behavior, frequent falling from sitting and standing posture, and loss of speech. These symptoms had progressed rapidly during the first month but had been static for four months.`
`Post-operative care plan for submandibular gland surgery`	`The patient had an unremarkable postoperative recovery. The patient is recommended to follow up with her primary care provider and otorhinolaryngologist for further care and management.`	`The patient was discharged with recommendations to attend follow-up appointments with the ophthalmologist specialist for ongoing monitoring and maintenance of the current treatment plan. The patient was advised to follow standard postoperative care practices and to report any symptoms or concerns to the medical team immediately.`
`Complications of esophageal perforation`	`A diagnosis of esophageal perforation was established, and the patient was immediately prepared for an urgent thoracotomy. Primary repair was then implemented, and the repair site was buttressed using a TachoSil patch measuring 9.5- × 4.8-cm.`	`Day 92: EGD showed shrinking of the ulcers. However, Day 102: The patient complained of left-sided chest pain, and chest CT identified a pneumothorax that was relieved by inserting a drainage tube into the chest cavity. Additionally, chest CT after the administration of diluted amidotrizoate showed it to be leaking from the stomach into the thoracic cavity, suggesting the presence of a gastropleural fistula.`

Loss: CachedMultipleNegativesRankingLoss with these parameters:

{
    "scale": 20.0,
    "similarity_fct": "cos_sim",
    "mini_batch_size": 32,
    "gather_across_devices": false
}
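
To make the parameters concrete, here is a simplified NumPy sketch of the ranking objective with `scale` 20.0 and cosine similarity: each query is scored against every positive and negative in the batch, and cross-entropy pushes its own positive to the top. This is an illustration under stated assumptions, not the library's implementation. It leaves out the gradient caching of the cached variant (see the citation at the end), which processes the batch in chunks of `mini_batch_size` to save memory. `mnr_loss` is a name chosen for this example.

```python
import numpy as np

def mnr_loss(q: np.ndarray, pos: np.ndarray, neg: np.ndarray, scale: float = 20.0) -> float:
    """Cross-entropy over scaled cosine similarities with in-batch negatives."""
    def normalize(x: np.ndarray) -> np.ndarray:
        return x / np.linalg.norm(x, axis=1, keepdims=True)

    # Candidates: all positives first, then all negatives.
    candidates = normalize(np.concatenate([pos, neg], axis=0))
    logits = scale * normalize(q) @ candidates.T      # shape (batch, 2 * batch)
    logits -= logits.max(axis=1, keepdims=True)       # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    idx = np.arange(q.shape[0])                       # positive of query i is column i
    return float(-log_probs[idx, idx].mean())

eye = np.eye(3)
aligned = mnr_loss(eye, eye, -eye)                        # positives match their queries
shuffled = mnr_loss(eye, np.roll(eye, 1, axis=0), -eye)   # positives assigned to wrong queries
print(aligned, shuffled)
```

The aligned batch gives a loss near zero, while the shuffled batch gives a loss close to the scale value, since each query then prefers another query's positive.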

Evaluation Dataset

med_embed-training-triplets-v1

Dataset: med_embed-training-triplets-v1 at 0b344f0
Size: 23,269 evaluation samples
Columns: query, pos, and neg

Approximate statistics based on the first 1000 samples:

	query	pos	neg
type	string	string	string
details	min: 4 tokens mean: 10.16 tokens max: 32 tokens	min: 5 tokens mean: 37.21 tokens max: 213 tokens	min: 3 tokens mean: 37.27 tokens max: 124 tokens

Samples:

query	pos	neg
`What was the initial presentation of the patient?`	`The 45-year-old female patient presented to the department with an enlarging lesion in her upper abdomen.`	`The patient was transferred to this hospital for further evaluation.`
`giant omphalocele symptoms`	`The patient, a 9-year-old female, presented to the hospital with a large lump in the anterior abdominal wall extending from the xiphisternum to the level of iliac crest.`	`The patient presented with bilateral nasovestibular lumps which grew in size over several months, occluding nasal entrance and protruding outside the nose.`
`granulomatous lymphocytic interstitial lung disease treatment`	`The patient had clubbing and chronic lung findings, and thorax CT revealed extended and severe bronchiectasis with thickened bronchial walls, some granulomatous nodules and mosaic appearance, compatible with granulomatous lymphocytic interstitial lung disease (GLILD). Regular intravenous immunoglobulin (IVIG) replacement was started.`	`The patient was treated with methylprednisolone pulse therapy followed by oral prednisolone (PSL) and cyclophosphamide intravenously. After treatment, arthralgia, renal function, proteinuria, and skin manifestations improved.`

Loss: CachedMultipleNegativesRankingLoss with these parameters:

{
    "scale": 20.0,
    "similarity_fct": "cos_sim",
    "mini_batch_size": 32,
    "gather_across_devices": false
}

Training Hyperparameters

Non-Default Hyperparameters

eval_strategy: steps
per_device_train_batch_size: 96
per_device_eval_batch_size: 96
learning_rate: 0.0001
weight_decay: 0.01
warmup_ratio: 0.1
bf16: True
dataloader_num_workers: 8
dataloader_prefetch_factor: 16
load_best_model_at_end: True
push_to_hub: True
prompts: {'query': 'Instruct: Given a web search query, retrieve relevant passages that answer the query\nQuery:', 'pos': ''}
batch_sampler: no_duplicates
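
The `prompts` entry maps dataset columns to a text prefix: queries get the instruction prefix and positives get an empty one. The sketch below shows the resulting input strings, assuming the prefix is simply concatenated in front of the text. `apply_prompt` is a hypothetical helper for illustration.

```python
# Prompt strings copied from the hyperparameters above.
PROMPTS = {
    "query": "Instruct: Given a web search query, retrieve relevant passages that answer the query\nQuery:",
    "pos": "",
}

def apply_prompt(text: str, column: str) -> str:
    """Prefix the text with the prompt for its column (assumed plain concatenation)."""
    return PROMPTS.get(column, "") + text

query_input = apply_prompt("giant omphalocele symptoms", "query")
doc_input = apply_prompt("The patient presented with a large lump.", "pos")
print(query_input)
print(doc_input)
```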

Training Logs


Epoch	Step	Training Loss	Validation Loss	triplet-eval_cosine_accuracy	eval-set-23269kq-46538kd_cosine_ndcg@10	eval-set-23269kq-23269kd_cosine_ndcg@10
0.2291	500	1.3091	1.2201	0.7698	-	-
0.4583	1000	0.6312	0.6813	0.8759	-	-
0.6874	1500	0.3722	0.3620	0.9213	-	-
0.9166	2000	0.2085	0.2422	0.9469	-	-
1.1457	2500	0.1684	0.1901	0.9533	-	-
1.6040	3500	0.1227	0.1412	0.9698	-	-
1.8332	4000	0.0927	0.1293	0.9713	-	-
2.0623	4500	0.0873	0.1246	0.9718	-	-
2.2915	5000	0.0705	0.1218	0.9752	-	-
2.5206	5500	0.06	0.1198	0.9748	-	-
2.7498	6000	0.0682	0.1193	0.9743	-	-
2.9789	6500	0.0536	0.1191	0.9743	-	-
-1	-1	-	-	-	0.7131	0.7837

The bold row denotes the saved checkpoint.
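
The Epoch column can be cross-checked against the dataset size and batch size. The arithmetic below assumes a single device and no gradient accumulation (neither is stated in this card) and reproduces the logged epoch values. The warmup estimate additionally assumes 3 training epochs, which is not listed above but matches the log ending just short of epoch 3.

```python
import math

train_samples = 209_415   # from "Size: 209,415 training samples"
batch_size = 96           # per_device_train_batch_size (assumed: one device)

steps_per_epoch = math.ceil(train_samples / batch_size)

def epoch_at(step: int) -> float:
    """Fractional epoch reached after a given optimizer step."""
    return round(step / steps_per_epoch, 4)

print(steps_per_epoch)                 # 2182
print(epoch_at(500), epoch_at(6500))   # 0.2291 2.9789

# Assumption: 3 epochs in total, with warmup_ratio 0.1 from the list above.
total_steps = 3 * steps_per_epoch
warmup_steps = math.ceil(0.1 * total_steps)
print(total_steps, warmup_steps)       # 6546 655
```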

Framework Versions

Python: 3.12.11
Sentence Transformers: 5.1.2
Transformers: 4.57.1
PyTorch: 2.8.0+cu128
Accelerate: 1.11.0
Datasets: 4.3.0
Tokenizers: 0.22.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

CachedMultipleNegativesRankingLoss

@misc{gao2021scaling,
    title={Scaling Deep Contrastive Learning Batch Size under Memory Limited Setup},
    author={Luyu Gao and Yunyi Zhang and Jiawei Han and Jamie Callan},
    year={2021},
    eprint={2101.06983},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}