sam3 / README.md

1038lab

metadata

license: other
extra_gated_fields:
  First Name: text
  Last Name: text
  Date of birth: date_picker
  Country: country
  Affiliation: text
  Job title:
    type: select
    options:
      - Student
      - Research Graduate
      - AI researcher
      - AI developer/engineer
      - Reporter
      - Other
  geo: ip_location
  By clicking Submit below I accept the terms of the license and acknowledge that the information I provide will be collected stored processed and shared in accordance with the Meta Privacy Policy: checkbox
extra_gated_description: >-
  The information you provide will be collected, stored, processed and shared in
  accordance with the [Meta Privacy
  Policy](https://www.facebook.com/privacy/policy/).
extra_gated_button_content: Submit
language:
  - en
pipeline_tag: mask-generation
library_name: transformers
tags:
  - sam3

SAM 3

This repository mirrors the official Segment Anything Model 3 (SAM 3) weights released by Meta Superintelligence Labs. SAM 3 is a unified foundation model for promptable segmentation in images and videos. It accepts open-vocabulary text prompts and visual prompts (points, boxes, masks). Unlike SAM 2, SAM 3 can exhaustively segment every instance of a requested concept, and it reaches roughly 75–80% of human performance on the SA-CO benchmark, which covers 270K unique concepts.

Highlights

A presence token improves discrimination between closely related prompts.
A decoupled detector and tracker scale better on long video sequences.
More than 4M automatically annotated concepts give broad coverage of open-world categories.

Original paper: SAM 3: Segment Anything with Concepts (Meta AI, 2025).
Resources: Project Page · Demo

Files Included

sam3.safetensors — detector and tracker weights for image + video segmentation.
Tokenizer/config assets should be copied from the official facebookresearch/sam3 repository; this mirror only repackages the safetensors weights for self-hosting.
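
Before wiring the checkpoint into a loader, it can help to confirm that the download is intact and to see which tensors it holds. The sketch below reads only the JSON header of a `.safetensors` file with the standard library; the format begins with an 8-byte little-endian header length followed by a JSON header. The names `read_safetensors_header` and `summarize` are hypothetical helpers for illustration and are not part of any SAM 3 package.

```python
import json
import struct
from pathlib import Path


def read_safetensors_header(path):
    """Return the JSON header of a .safetensors file as a dict.

    Hypothetical helper: only the header is read, so no tensor data
    is loaded into memory.
    """
    with Path(path).open("rb") as f:
        # First 8 bytes: unsigned little-endian 64-bit header length.
        raw_len = f.read(8)
        if len(raw_len) != 8:
            raise ValueError("file too short to be a safetensors checkpoint")
        (header_len,) = struct.unpack("<Q", raw_len)
        header_bytes = f.read(header_len)
        if len(header_bytes) != header_len:
            raise ValueError("truncated safetensors header")
    return json.loads(header_bytes.decode("utf-8"))


def summarize(header):
    """Map tensor name -> (dtype, shape), skipping the optional metadata entry."""
    return {
        name: (info["dtype"], tuple(info["shape"]))
        for name, info in header.items()
        if name != "__metadata__"
    }
```

Calling `summarize(read_safetensors_header("sam3.safetensors"))` should return one entry per stored tensor; a `ValueError` here usually points to an incomplete download.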

Quickstart

pip install torch==2.7.0 torchvision torchaudio --index-url https://download.pytorch.org/whl/cu126
pip install git+https://github.com/facebookresearch/sam3.git

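python - <<'PY'
from PIL import Image

from sam3 import build_sam3_image_model
from sam3.model.sam3_image_processor import Sam3Processor

# Build the image model from the local mirror instead of downloading from the Hub.
model = build_sam3_image_model(
    bpe_path="sam3/assets/bpe_simple_vocab_16e6.txt.gz",  # tokenizer vocab copied from the official repo
    device="cuda",
    eval_mode=True,
    checkpoint_path="sam3.safetensors",
    load_from_HF=False,
)
processor = Sam3Processor(model, device="cuda")

# The upstream example passes a PIL image rather than a file path.
image = Image.open("your_image.jpg")
state = processor.set_image(image)

# Prompt the model with a short noun phrase.
state = processor.set_text_prompt(prompt="white bicycle", state=state)
print(state["masks"].shape)
PY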
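
The upstream example reads `masks`, `boxes`, and `scores` from the returned state. As a minimal sketch, assuming those three entries are parallel sequences with one item per detected instance, the hypothetical helper below keeps only confident detections and orders them best-first. The name `top_detections` and the 0.5 default threshold are illustrative choices, not part of the SAM 3 API.

```python
def top_detections(scores, masks, boxes, min_score=0.5):
    """Return (score, mask, box) triples with score >= min_score, best first.

    Hypothetical helper: assumes the three inputs are parallel sequences
    (lists, arrays, or tensors indexed along the first dimension).
    """
    triples = [
        (float(s), m, b)
        for s, m, b in zip(scores, masks, boxes)
        if float(s) >= min_score
    ]
    # Sort on the score only, so masks and boxes are never compared.
    triples.sort(key=lambda t: t[0], reverse=True)
    return triples
```

With the Quickstart state this might be called as `top_detections(state["scores"], state["masks"], state["boxes"])`, again assuming those keys are present in your installed version.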

Integration Notes

These mirrored weights are used in the AILab SAM3 ComfyUI node (RMBG edition) to enable promptable segmentation workflows directly inside ComfyUI. The node loads sam3.safetensors, the tokenizer assets, and the SAM3 processors from local files, so the entire pipeline keeps working offline.
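
For an offline setup like the one described above, a quick pre-flight check can report missing local files before model construction fails deep inside a loader. This is a sketch under stated assumptions: the helper `missing_assets` is hypothetical, and the two file names are simply the ones used in the Quickstart, not a layout required by the ComfyUI node.

```python
from pathlib import Path


def missing_assets(root, required):
    """Return the required relative paths that are not regular files under root."""
    root = Path(root)
    return [rel for rel in required if not (root / rel).is_file()]


# File names taken from the Quickstart; adjust to your own layout.
REQUIRED = [
    "sam3.safetensors",
    "sam3/assets/bpe_simple_vocab_16e6.txt.gz",
]
```

`missing_assets(".", REQUIRED)` returns an empty list when both files are present in the working directory, so a non-empty result can be surfaced to the user before any model code runs.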

License & Usage

This mirror preserves Meta's original weights and is subject to the license on facebook/sam3. You must accept Meta's terms before downloading the official release.
When hosting this file in your own Hugging Face repository, keep this notice and credit the original authors.
Cite the SAM 3 paper for any research or product that builds upon these weights.