Qwen3-MedEmbed-0.6B
This is a sentence-transformers model finetuned from Qwen/Qwen3-Embedding-0.6B on the med_embed-training-triplets-v1 dataset. It maps sentences & paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: Qwen/Qwen3-Embedding-0.6B
- Maximum Sequence Length: 512 tokens
- Output Dimensionality: 1024 dimensions
- Similarity Function: Cosine Similarity
- Training Dataset: med_embed-training-triplets-v1
- Language: en
- License: apache-2.0
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False, 'architecture': 'Qwen3Model'})
  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': True, 'include_prompt': True})
  (2): Normalize()
)
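The Pooling module above uses last-token pooling (`pooling_mode_lasttoken: True`), followed by L2 normalization. The following is a minimal pure-Python sketch of those two steps on toy data, not the library's implementation; the helper names `last_token_pool` and `l2_normalize` are hypothetical.

```python
import math

def last_token_pool(token_embeddings, attention_mask):
    """Return the embedding of the last non-padding token of each sequence.

    token_embeddings: list of sequences, each a list of per-token vectors.
    attention_mask:   list of sequences of 0/1 flags (1 = real token).
    """
    pooled = []
    for vectors, mask in zip(token_embeddings, attention_mask):
        # Last position whose mask is 1; works for left or right padding.
        last = max(i for i, m in enumerate(mask) if m == 1)
        pooled.append(vectors[last])
    return pooled

def l2_normalize(vector):
    """Scale a vector to unit length, as a Normalize() step would."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]

# Toy batch: 2 sequences, 3 token positions, 2-dimensional hidden states.
# The first sequence is left-padded by one position.
token_embeddings = [
    [[0.0, 0.0], [1.0, 2.0], [3.0, 4.0]],
    [[5.0, 0.0], [0.0, 1.0], [6.0, 8.0]],
]
attention_mask = [[0, 1, 1], [1, 1, 1]]

sentence_embeddings = [
    l2_normalize(v) for v in last_token_pool(token_embeddings, attention_mask)
]
print(sentence_embeddings)
```

Because the outputs are unit-length, cosine similarity between two embeddings reduces to their dot product.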
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
import torch
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer(
    "luluw/Qwen3-MedEmbed-0.6B",
    model_kwargs={
        # Optional: requires the flash-attn package and a GPU that supports it
        "attn_implementation": "flash_attention_2",
        "dtype": torch.bfloat16,
        "device_map": "auto",
    },
    tokenizer_kwargs={"padding_side": "left"},
)
# Run inference
queries = [
    "diagnostic criteria for neurofibromatosis",
]
documents = [
    'The patient had a history of type I Neurofibromatosis diagnosed 20 years previously. On examination, the patient exhibited cutaneous nodules/café-au-lait spots scoliosis.',
    'The diagnosis includes: 1. Developmental delays. 2. Microdeletion of 1q21.1-1q21.2. 3. AUTS2 gene deletion. 4. Xq28 duplication syndrome.',
    'The patient was re-presented to the Heart Transplant Selection Committee, and was listed for heart transplant given excellent estimated 5-year survival rate for her breast cancer.',
]
query_embeddings = model.encode_query(queries)
document_embeddings = model.encode_document(documents)
print(query_embeddings.shape, document_embeddings.shape)
# (1, 1024) (3, 1024)
# Get the similarity scores for the embeddings
similarities = model.similarity(query_embeddings, document_embeddings)
print(similarities)
# tensor([[ 0.7985, 0.2617, -0.1669]])
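To turn a similarity row into a ranked result list, one option is to sort the document indices by score. This is a small sketch using the scores printed above as plain Python floats; `top_k` is a hypothetical helper, not part of the library.

```python
# Similarity row printed above: one query against three documents.
scores = [0.7985, 0.2617, -0.1669]

def top_k(scores, k):
    """Return (document_index, score) pairs for the k highest scores."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [(i, scores[i]) for i in order[:k]]

best = top_k(scores, k=2)
print(best)  # [(0, 0.7985), (1, 0.2617)]
```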
Evaluation
Metrics
Triplet
- Dataset: triplet-eval
- Evaluated with TripletEvaluator
| Metric | Value |
|---|---|
| cosine_accuracy | 0.9743 |
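As a rough illustration of what this metric measures: cosine accuracy is the fraction of (query, pos, neg) triplets in which the query is more similar to the positive than to the negative. The sketch below assumes a margin of zero and uses made-up 2-dimensional embeddings; the function names are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def triplet_cosine_accuracy(triplets):
    """Fraction of (query, pos, neg) triplets with sim(q, pos) > sim(q, neg)."""
    correct = sum(1 for q, p, n in triplets if cosine(q, p) > cosine(q, n))
    return correct / len(triplets)

# Toy embeddings (illustrative values, not model outputs).
toy_triplets = [
    ([1.0, 0.0], [0.9, 0.1], [0.0, 1.0]),   # positive is closer
    ([0.0, 1.0], [1.0, 0.0], [0.1, 0.9]),   # negative is closer
    ([1.0, 1.0], [1.0, 0.9], [-1.0, 0.5]),  # positive is closer
    ([1.0, 0.0], [0.5, 0.5], [0.0, -1.0]),  # positive is closer
]

accuracy = triplet_cosine_accuracy(toy_triplets)
print(accuracy)  # 0.75
```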
Information Retrieval
The dataset was split in a 90:10 ratio; the evaluation set has 23,269 samples. An extra 23,269 negative samples were added for information retrieval.
- Datasets: eval-set-23269kq-46538kd and eval-set-23269kq-23269kd
- Evaluated with InformationRetrievalEvaluator
| Metric | eval-set-23kq-46kd | eval-set-23kq-23kd |
|---|---|---|
| cosine_accuracy@1 | 0.4359 | 0.5438 |
| cosine_accuracy@3 | 0.8113 | 0.8997 |
| cosine_accuracy@5 | 0.9103 | 0.953 |
| cosine_accuracy@10 | 0.9586 | 0.9703 |
| cosine_precision@1 | 0.4359 | 0.5438 |
| cosine_precision@3 | 0.2704 | 0.2999 |
| cosine_precision@5 | 0.1821 | 0.1906 |
| cosine_precision@10 | 0.0959 | 0.097 |
| cosine_recall@1 | 0.4359 | 0.5438 |
| cosine_recall@3 | 0.8113 | 0.8997 |
| cosine_recall@5 | 0.9103 | 0.953 |
| cosine_recall@10 | 0.9586 | 0.9703 |
| cosine_ndcg@10 | 0.7131 | 0.7837 |
| cosine_mrr@10 | 0.632 | 0.7206 |
| cosine_map@100 | 0.6331 | 0.7209 |
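In the table, recall@k equals accuracy@k and precision@k equals accuracy@k divided by k (for example, 0.8113 / 3 ≈ 0.2704). That pattern is what you would expect if each query has exactly one relevant document, which is an inference from the numbers rather than something the card states. A minimal sketch of the metrics under that assumption, with a hypothetical `metrics_at_k` helper and made-up ranks:

```python
def metrics_at_k(ranks, k):
    """Compute accuracy, recall, precision, and MRR at cutoff k.

    ranks: 1-based rank of the single relevant document for each query,
           or None if it was not retrieved at all.
    """
    n = len(ranks)
    hits = sum(1 for r in ranks if r is not None and r <= k)
    reciprocal = sum(1.0 / r for r in ranks if r is not None and r <= k)
    return {
        "accuracy": hits / n,          # relevant doc appears in the top k
        "recall": hits / n,            # same value with one relevant doc
        "precision": hits / (n * k),   # one hit among k retrieved slots
        "mrr": reciprocal / n,         # mean reciprocal rank within top k
    }

# Toy ranks for 8 queries (illustrative, not from the evaluation set).
toy_ranks = [1, 2, 1, 7, 3, None, 1, 4]
m3 = metrics_at_k(toy_ranks, k=3)
print(m3)
```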
Training Details
Training Dataset
med_embed-training-triplets-v1
- Dataset: med_embed-training-triplets-v1 at 0b344f0
- Size: 209,415 training samples
- Columns: query, pos, and neg
- Approximate statistics based on the first 1000 samples:

| | query | pos | neg |
|---|---|---|---|
| type | string | string | string |
| details | min: 4 tokens; mean: 10.1 tokens; max: 26 tokens | min: 4 tokens; mean: 38.98 tokens; max: 138 tokens | min: 6 tokens; mean: 37.1 tokens; max: 155 tokens |
- Samples:

| query | pos | neg |
|---|---|---|
| play therapy for trichotillomania | The patient was subjected to play therapy and behavioural counselling that involved her parents as co-therapists. In play therapy, she was encouraged to gain self-confidence and overcome her anxiety. | The patient, a 7-year-old male, was admitted to the pediatric outpatient department with aggressive and hyperactive behavior, frequent falling from sitting and standing posture, and loss of speech. These symptoms had progressed rapidly during the first month but had been static for four months. |
| Post-operative care plan for submandibular gland surgery | The patient had an unremarkable postoperative recovery. The patient is recommended to follow up with her primary care provider and otorhinolaryngologist for further care and management. | The patient was discharged with recommendations to attend follow-up appointments with the ophthalmologist specialist for ongoing monitoring and maintenance of the current treatment plan. The patient was advised to follow standard postoperative care practices and to report any symptoms or concerns to the medical team immediately. |
| Complications of esophageal perforation | A diagnosis of esophageal perforation was established, and the patient was immediately prepared for an urgent thoracotomy. Primary repair was then implemented, and the repair site was buttressed using a TachoSil patch measuring 9.5- × 4.8-cm. | Day 92: EGD showed shrinking of the ulcers. However, Day 102: The patient complained of left-sided chest pain, and chest CT identified a pneumothorax that was relieved by inserting a drainage tube into the chest cavity. Additionally, chest CT after the administration of diluted amidotrizoate showed it to be leaking from the stomach into the thoracic cavity, suggesting the presence of a gastropleural fistula. |

- Loss: CachedMultipleNegativesRankingLoss with these parameters:
  { "scale": 20.0, "similarity_fct": "cos_sim", "mini_batch_size": 32, "gather_across_devices": false }
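For intuition about the loss parameters, here is a simplified pure-Python sketch of a multiple-negatives ranking objective with `scale = 20.0` and cosine similarity: each query is scored against every positive and negative in the batch, and cross-entropy pushes its own positive to the top. It omits the gradient caching that the `mini_batch_size` setting controls, and the function names and toy embeddings are hypothetical.

```python
import math

def cos_sim(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def mnr_loss(query_embs, pos_embs, neg_embs, scale=20.0):
    """Mean cross-entropy where, for query i, candidate i is the correct class
    among all in-batch positives and negatives."""
    candidates = pos_embs + neg_embs
    total = 0.0
    for i, q in enumerate(query_embs):
        logits = [scale * cos_sim(q, c) for c in candidates]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_sum_exp = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_sum_exp - logits[i]
    return total / len(query_embs)

# Toy 2-dimensional embeddings (illustrative only).
queries = [[1.0, 0.0], [0.0, 1.0]]
positives = [[1.0, 0.0], [0.0, 1.0]]
negatives = [[-1.0, 0.0], [0.0, -1.0]]

loss_aligned = mnr_loss(queries, positives, negatives)
loss_swapped = mnr_loss(queries, positives[::-1], negatives)
print(loss_aligned, loss_swapped)
```

With well-aligned pairs the loss is close to zero; swapping the positives so that each query matches the wrong passage drives it to roughly the scale value.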
Evaluation Dataset
med_embed-training-triplets-v1
- Dataset: med_embed-training-triplets-v1 at 0b344f0
- Size: 23,269 evaluation samples
- Columns: query, pos, and neg
- Approximate statistics based on the first 1000 samples:

| | query | pos | neg |
|---|---|---|---|
| type | string | string | string |
| details | min: 4 tokens; mean: 10.16 tokens; max: 32 tokens | min: 5 tokens; mean: 37.21 tokens; max: 213 tokens | min: 3 tokens; mean: 37.27 tokens; max: 124 tokens |
- Samples:

| query | pos | neg |
|---|---|---|
| What was the initial presentation of the patient? | The 45-year-old female patient presented to the department with an enlarging lesion in her upper abdomen. | The patient was transferred to this hospital for further evaluation. |
| giant omphalocele symptoms | The patient, a 9-year-old female, presented to the hospital with a large lump in the anterior abdominal wall extending from the xiphisternum to the level of iliac crest. | The patient presented with bilateral nasovestibular lumps which grew in size over several months, occluding nasal entrance and protruding outside the nose. |
| granulomatous lymphocytic interstitial lung disease treatment | The patient had clubbing and chronic lung findings, and thorax CT revealed extended and severe bronchiectasis with thickened bronchial walls, some granulomatous nodules and mosaic appearance, compatible with granulomatous lymphocytic interstitial lung disease (GLILD). Regular intravenous immunoglobulin (IVIG) replacement was started. | The patient was treated with methylprednisolone pulse therapy followed by oral prednisolone (PSL) and cyclophosphamide intravenously. After treatment, arthralgia, renal function, proteinuria, and skin manifestations improved. |

- Loss: CachedMultipleNegativesRankingLoss with these parameters:
  { "scale": 20.0, "similarity_fct": "cos_sim", "mini_batch_size": 32, "gather_across_devices": false }
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: steps
- per_device_train_batch_size: 96
- per_device_eval_batch_size: 96
- learning_rate: 0.0001
- weight_decay: 0.01
- warmup_ratio: 0.1
- bf16: True
- dataloader_num_workers: 8
- dataloader_prefetch_factor: 16
- load_best_model_at_end: True
- push_to_hub: True
- prompts: {'query': 'Instruct: Given a web search query, retrieve relevant passages that answer the query\nQuery:', 'pos': ''}
- batch_sampler: no_duplicates
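The `prompts` setting maps dataset columns to a text prefix. The sketch below illustrates the idea under the assumption that the prompt is simply prepended to the text by string concatenation and that columns without an entry (such as neg) get no prefix; `apply_prompt` is a hypothetical helper, not a library function.

```python
# Prompts copied from the hyperparameters above.
TRAINING_PROMPTS = {
    "query": "Instruct: Given a web search query, retrieve relevant passages that answer the query\nQuery:",
    "pos": "",
}

def apply_prompt(column, text, prompts=TRAINING_PROMPTS):
    """Prepend the prompt registered for a dataset column (assumed behavior).

    Columns without an entry are returned unchanged.
    """
    return prompts.get(column, "") + text

prompted_query = apply_prompt("query", "play therapy for trichotillomania")
plain_passage = apply_prompt("neg", "The patient was transferred to this hospital.")
print(prompted_query)
```

At inference time, using the same query prefix as in training generally keeps query embeddings in the distribution the model was tuned on.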
Training Logs
| Epoch | Step | Training Loss | Validation Loss | triplet-eval_cosine_accuracy | eval-set-23269kq-46538kd_cosine_ndcg@10 | eval-set-23269kq-23269kd_cosine_ndcg@10 |
|---|---|---|---|---|---|---|
| 0.2291 | 500 | 1.3091 | 1.2201 | 0.7698 | - | - |
| 0.4583 | 1000 | 0.6312 | 0.6813 | 0.8759 | - | - |
| 0.6874 | 1500 | 0.3722 | 0.3620 | 0.9213 | - | - |
| 0.9166 | 2000 | 0.2085 | 0.2422 | 0.9469 | - | - |
| 1.1457 | 2500 | 0.1684 | 0.1901 | 0.9533 | - | - |
| 1.6040 | 3500 | 0.1227 | 0.1412 | 0.9698 | - | - |
| 1.8332 | 4000 | 0.0927 | 0.1293 | 0.9713 | - | - |
| 2.0623 | 4500 | 0.0873 | 0.1246 | 0.9718 | - | - |
| 2.2915 | 5000 | 0.0705 | 0.1218 | 0.9752 | - | - |
| 2.5206 | 5500 | 0.06 | 0.1198 | 0.9748 | - | - |
| 2.7498 | 6000 | 0.0682 | 0.1193 | 0.9743 | - | - |
| 2.9789 | 6500 | 0.0536 | 0.1191 | 0.9743 | - | - |
| -1 | -1 | - | - | - | 0.7131 | 0.7837 |
- The bold row denotes the saved checkpoint.
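The Epoch column in the log can be cross-checked against the dataset size and batch size. This is a back-of-the-envelope sketch that assumes a single device, no gradient accumulation, and the default of 3 epochs (none of which the card states explicitly); the no_duplicates batch sampler could also shift the exact step count slightly.

```python
import math

train_samples = 209_415   # training set size reported above
batch_size = 96           # per_device_train_batch_size
warmup_ratio = 0.1
num_epochs = 3            # assumption: not listed, so presumably the default

steps_per_epoch = math.ceil(train_samples / batch_size)
total_steps = steps_per_epoch * num_epochs
warmup_steps = math.ceil(total_steps * warmup_ratio)

# Fractional epoch reached at selected logged steps.
epoch_at_500 = round(500 / steps_per_epoch, 4)
epoch_at_6500 = round(6500 / steps_per_epoch, 4)

print(steps_per_epoch, total_steps, warmup_steps)  # 2182 6546 655
print(epoch_at_500, epoch_at_6500)                 # 0.2291 2.9789
```

The computed epoch fractions match the first and last logged rows, which is consistent with the assumptions above.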
Framework Versions
- Python: 3.12.11
- Sentence Transformers: 5.1.2
- Transformers: 4.57.1
- PyTorch: 2.8.0+cu128
- Accelerate: 1.11.0
- Datasets: 4.3.0
- Tokenizers: 0.22.1
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
CachedMultipleNegativesRankingLoss
@misc{gao2021scaling,
title={Scaling Deep Contrastive Learning Batch Size under Memory Limited Setup},
author={Luyu Gao and Yunyi Zhang and Jiawei Han and Jamie Callan},
year={2021},
eprint={2101.06983},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
