Fix: seamless integration with πŸ€— generation pipelines

#9
  • Add `input_ids` parsing to the `generate` function, as commonly used by πŸ€— generation pipelines (see the sketch below)
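
For context, here is a minimal sketch of the kind of override this refers to. The class name and the `embed_multimodal` helper are placeholders for illustration only, not the actual FastVLM implementation:

# Hypothetical sketch only; names are placeholders, not FastVLM's real code.
import torch
from transformers import PreTrainedModel

class VLMForCausalLM(PreTrainedModel):
    @torch.no_grad()
    def generate(self, input_ids=None, images=None, **kwargs):
        # Text-only path: forward the input_ids produced by πŸ€— pipelines unchanged.
        if images is None:
            return super().generate(input_ids=input_ids, **kwargs)
        # Multimodal path: merge image features into the embedding sequence first.
        inputs_embeds = self.embed_multimodal(input_ids, images)  # hypothetical helper
        return super().generate(inputs_embeds=inputs_embeds, **kwargs)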

Example:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load model and tokenizer
tok = AutoTokenizer.from_pretrained("apple/FastVLM-7B", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    "apple/FastVLM-7B",
    torch_dtype=torch.float16 if torch.cuda.is_available() else torch.float32,
    device_map="auto",
    trust_remote_code=True,
)

# Build prompt
messages = [
    {"role": "user", "content": "Describe San Francisco"}
]

# Tokenize
inputs = tok.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

# Generate  
with torch.no_grad():
    out = model.generate(
        **inputs,
        max_new_tokens=128,
    )

print(tok.decode(out[0], skip_special_tokens=True))
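
With `input_ids` handled by `generate`, the model should also plug into the standard πŸ€— pipeline API. A quick, untested illustration reusing the objects loaded above:

from transformers import pipeline

# Wrap the already-loaded model and tokenizer in a text-generation pipeline.
pipe = pipeline("text-generation", model=model, tokenizer=tok)
print(pipe("Describe San Francisco", max_new_tokens=128)[0]["generated_text"])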
