---
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - dense
  - generated_from_trainer
  - dataset_size:149460
  - loss:ContrastiveLoss
base_model: sentence-transformers/clip-ViT-B-32
widget:
  - source_sentence: Meltdown
    sentences:
      - Ancient Imperiosaur
      - >-
        https://cards.scryfall.io/normal/front/1/9/192ccc7f-ffb1-4f78-8cf0-a220df612be7.jpg?1682536817
      - >-
        https://cards.scryfall.io/normal/front/5/6/56301392-3496-48d0-8d91-6b82e1164c98.jpg?1721427942
  - source_sentence: Etali, Primal Storm
    sentences:
      - >-
        https://cards.scryfall.io/normal/front/4/8/4874388e-0227-4b89-a986-d86c14482c81.jpg?1594065427
      - Battle of Wits
      - >-
        https://cards.scryfall.io/normal/front/1/d/1d3d8bb4-0430-45bb-930d-5d6db6521945.jpg?1587309687
  - source_sentence: Chrome Prowler
    sentences:
      - >-
        https://cards.scryfall.io/normal/front/a/2/a263f594-621e-46af-8561-f7eee565a19a.jpg?1562643297
      - >-
        https://cards.scryfall.io/normal/front/3/d/3dff363d-7e9f-4764-a9ee-ec2f23239df6.jpg?1562907900
      - >-
        https://cards.scryfall.io/normal/front/2/1/21121857-85b8-4ba8-9363-beafdb1005c2.jpg?1730486782
  - source_sentence: Beastbreaker of Bala Ged
    sentences:
      - >-
        https://cards.scryfall.io/normal/front/2/8/287ca034-9cea-4b84-98ba-76c24f038edb.jpg?1599709496
      - >-
        https://cards.scryfall.io/normal/front/5/4/547f2641-bcd6-4536-ba5a-f46170dd2803.jpg?1573513110
      - >-
        https://cards.scryfall.io/normal/front/4/c/4c29f6a1-42a5-433f-9c09-c160b096f8e1.jpg?1562542378
  - source_sentence: Against All Odds
    sentences:
      - >-
        https://cards.scryfall.io/normal/front/4/a/4ab2f81a-fcbe-44d1-8281-04dd78bb9ea3.jpg?1593274931
      - >-
        https://cards.scryfall.io/normal/front/3/c/3cd8dd4e-6892-49d7-8fae-97d04f9f6c84.jpg?1675956885
      - Sheltering Prayers
pipeline_tag: sentence-similarity
library_name: sentence-transformers
---

SentenceTransformer based on sentence-transformers/clip-ViT-B-32

This is a sentence-transformers model finetuned from sentence-transformers/clip-ViT-B-32. It maps sentences & paragraphs to a dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: sentence-transformers/clip-ViT-B-32
  • Maximum Sequence Length: 77 tokens
  • Output Dimensionality: not reported (the CLIP module does not expose a fixed sentence-embedding dimension; see the sketch after this list)
  • Similarity Function: Cosine Similarity
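
These properties can also be checked on the loaded model. A minimal sketch; the expected outputs in the comments are assumptions, not logged results:

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("philipp-zettl/MTGEmb-small")

# Similarity function used by model.similarity()
print(model.similarity_fn_name)  # expected: "cosine"

# The CLIP wrapper does not report a fixed sentence-embedding dimension,
# so inspect the shape of an actual embedding instead
print(model.get_sentence_embedding_dimension())
print(model.encode(["Against All Odds"]).shape)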

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): CLIPModel()
)

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("philipp-zettl/MTGEmb-small")
# Run inference
sentences = [
    'Against All Odds',
    'https://cards.scryfall.io/normal/front/3/c/3cd8dd4e-6892-49d7-8fae-97d04f9f6c84.jpg?1675956885',
    'https://cards.scryfall.io/normal/front/4/a/4ab2f81a-fcbe-44d1-8281-04dd78bb9ea3.jpg?1593274931',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 1024]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.9248, 0.6695],
#         [0.9248, 1.0000, 0.6947],
#         [0.6695, 0.6947, 1.0000]])
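
In the example above, the Scryfall URLs are embedded as plain text strings. Because the backbone is CLIP, the model can also embed the card images themselves and compare them with card names in the same vector space. A hedged sketch, assuming pillow and requests are installed; the URL and card names are taken from the widget examples in the metadata above:

import requests
from PIL import Image
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("philipp-zettl/MTGEmb-small")

# Load one card image (URL from the widget examples above)
url = "https://cards.scryfall.io/normal/front/3/c/3cd8dd4e-6892-49d7-8fae-97d04f9f6c84.jpg?1675956885"
image = Image.open(requests.get(url, stream=True).raw)

# Candidate card names to rank against the image
names = ["Against All Odds", "Sheltering Prayers", "Battle of Wits"]

# PIL images are routed through the CLIP vision tower, strings through the text tower
image_emb = model.encode([image])
name_embs = model.encode(names)

# Cosine similarity between the image and each candidate name
scores = model.similarity(image_emb, name_embs)
for name, score in zip(names, scores[0].tolist()):
    print(f"{name}: {score:.4f}")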

Training Details

Training Dataset

Unnamed Dataset

  • Size: 149,460 training samples (the dataset_size tag in the metadata above)
  • Contents: Magic: The Gathering card names paired with Scryfall card image URLs, as in the widget examples; the exact columns and pair construction are not documented here (a format sketch follows below)
  • Loss: ContrastiveLoss
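
The actual training data is not published with this card. The sketch below only illustrates the pair-plus-label format that ContrastiveLoss consumes; the rows are placeholders built from the widget examples, and the labels are assumptions:

from datasets import Dataset

# Hypothetical pairs in the shape expected by ContrastiveLoss:
# two input columns plus a binary "label" (1 = matching pair, 0 = non-matching pair)
train_dataset = Dataset.from_dict({
    "sentence1": ["Against All Odds", "Against All Odds"],
    "sentence2": [
        "https://cards.scryfall.io/normal/front/3/c/3cd8dd4e-6892-49d7-8fae-97d04f9f6c84.jpg?1675956885",
        "Sheltering Prayers",
    ],
    "label": [1, 0],
})
print(train_dataset)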

Training Hyperparameters

Non-Default Hyperparameters

  • per_device_train_batch_size: 64
  • per_device_eval_batch_size: 64
  • multi_dataset_batch_sampler: round_robin
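
A minimal sketch of how these non-default values map onto SentenceTransformerTrainingArguments; the output directory is a placeholder, and the actual training script is not part of this card:

from sentence_transformers import SentenceTransformerTrainingArguments

args = SentenceTransformerTrainingArguments(
    output_dir="mtgemb-small",  # placeholder path, not the one actually used
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    multi_dataset_batch_sampler="round_robin",
)

These arguments would then be passed to a SentenceTransformerTrainer together with the model, a pair dataset like the one sketched above, and a ContrastiveLoss instance.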

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: no
  • prediction_loss_only: True
  • per_device_train_batch_size: 64
  • per_device_eval_batch_size: 64
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 3
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: round_robin
  • router_mapping: {}
  • learning_rate_mapping: {}

Training Logs

Epoch Step Training Loss
0.2140 500 0.0342
0.4281 1000 0.0311
0.6421 1500 0.0306
0.8562 2000 0.0302
1.0702 2500 0.0287
1.2842 3000 0.0262
1.4983 3500 0.025
1.7123 4000 0.0236
1.9264 4500 0.022
2.1404 5000 0.016
2.3545 5500 0.0128
2.5685 6000 0.0119
2.7825 6500 0.0108
2.9966 7000 0.0103

Framework Versions

  • Python: 3.13.7
  • Sentence Transformers: 5.1.2
  • Transformers: 4.49.0
  • PyTorch: 2.8.0+cu128
  • Accelerate: 1.10.1
  • Datasets: 4.1.1
  • Tokenizers: 0.21.4

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

ContrastiveLoss

@inproceedings{hadsell2006dimensionality,
    author={Hadsell, R. and Chopra, S. and LeCun, Y.},
    booktitle={2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06)},
    title={Dimensionality Reduction by Learning an Invariant Mapping},
    year={2006},
    volume={2},
    number={},
    pages={1735-1742},
    doi={10.1109/CVPR.2006.100}
}