Depth of Field

This model classifies an image's cinematic depth of field as one of two classes: deep or shallow. It uses a DINOv2-with-registers backbone, initialized from the facebook/dinov2-with-registers-large weights, and was trained on a diverse set of five thousand human-annotated images.

How to use:


import torch
from PIL import Image
from transformers import AutoImageProcessor, AutoModelForImageClassification

# Preprocessing comes from the base DINOv2 checkpoint;
# the classifier weights come from the fine-tuned model.
image_processor = AutoImageProcessor.from_pretrained("facebook/dinov2-with-registers-large")
model = AutoModelForImageClassification.from_pretrained("aslakey/depth_of_field")
model.eval()

# Model labels: [deep, shallow]
# Convert to RGB so grayscale or RGBA files are handled consistently.
image = Image.open("cinematic_shot.jpg").convert("RGB")
inputs = image_processor(image, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Take the highest-scoring class index and map it to its label name.
predicted_label = outputs.logits.argmax(-1).item()
print(model.config.id2label[predicted_label])
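
If a confidence score is wanted rather than only the top label, the logits can be passed through a softmax. The following is a minimal sketch, not part of the original card: the helper names `logits_to_probs` and `describe` are hypothetical, the example logits are made up, and the label order [deep, shallow] is taken from the comment in the snippet above. It uses plain Python so it can be tried without loading the model.

```python
import math

# Label order as stated in the model card: index 0 = deep, index 1 = shallow.
LABELS = ["deep", "shallow"]


def logits_to_probs(logits):
    """Numerically stable softmax over a flat list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def describe(logits, labels=LABELS):
    """Pair each label with its softmax probability."""
    return dict(zip(labels, logits_to_probs(logits)))


# Hypothetical logits, standing in for outputs.logits[0].tolist().
example = describe([0.3, 1.7])
print(example)
```

With the real model, `describe(outputs.logits[0].tolist())` would give the per-class probabilities for a single image.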

Performance:

Category	Precision	Recall
deep	85%	77%
shallow	75%	84%
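
The card reports only precision and recall. As a rough single-number summary, the per-class F1 score (the harmonic mean of the two) can be derived from the table. This sketch uses only the figures reported above; the F1 values are computed here and are not stated by the model author.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 if both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) per class, copied from the table above.
reported = {
    "deep": (0.85, 0.77),
    "shallow": (0.75, 0.84),
}

f1 = {label: f1_score(p, r) for label, (p, r) in reported.items()}
for label, score in f1.items():
    print(f"{label}: F1 = {score:.3f}")
```

Both classes come out at roughly 0.8, which suggests the two are handled about equally well.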

Model size: 0.3B params (Safetensors, tensor type F32)