CoVT Checkpoint (Depth Aligned)

Checkpoint for the paper at https://huggingface.co/papers/2511.19418.

Model Description

This CoVT checkpoint is based on LLaVA-v1.5-13B and is aligned with 4 depth tokens.
These task-specific tokens are integrated into the model’s embedding space to enhance 3D awareness.
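
To make "integrated into the model's embedding space" concrete, here is a minimal sketch of one common way to add extra tokens to an embedding table. It is not taken from the CoVT code. The token names (`<depth_0>` to `<depth_3>`), the toy 4-dimensional table, the helper `extend_embeddings`, and the mean-plus-noise initialisation are all assumptions made for illustration; the actual checkpoint may name and initialise its depth tokens differently.

```python
import random


def extend_embeddings(vocab, table, new_tokens, seed=0):
    """Return copies of vocab/table with one extra embedding row per new token.

    Hypothetical helper: each new row starts at the mean of the existing rows
    plus small Gaussian noise, a common heuristic (assumed here, not confirmed
    for CoVT).
    """
    rng = random.Random(seed)
    dim = len(table[0])
    # Column-wise mean of the existing embedding rows.
    mean = [sum(row[j] for row in table) / len(table) for j in range(dim)]

    new_vocab = dict(vocab)
    new_table = [list(row) for row in table]
    for tok in new_tokens:
        if tok in new_vocab:
            raise ValueError(f"token already in vocabulary: {tok}")
        new_vocab[tok] = len(new_table)  # next free row index
        new_table.append([m + rng.gauss(0.0, 0.01) for m in mean])
    return new_vocab, new_table


# Toy vocabulary and embedding table (real models use thousands of dimensions).
vocab = {"<s>": 0, "a": 1, "photo": 2}
table = [
    [0.1, 0.2, 0.3, 0.4],
    [0.0, -0.1, 0.2, 0.1],
    [0.3, 0.3, -0.2, 0.0],
]

# Four depth tokens, matching the count stated above; the names are made up.
depth_tokens = [f"<depth_{i}>" for i in range(4)]
new_vocab, new_table = extend_embeddings(vocab, table, depth_tokens)
```

After the call, the table has four extra rows whose indices are recorded in `new_vocab`. In a real model those rows would then be trained (aligned) against a depth target rather than left at their initial values.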

Safetensors
Model size: 13B params
Tensor types: F32, F16, BF16

Collection including Wakals/CoVT-LLaVA-13B-depth