---
license: apache-2.0
---

# CoVT Checkpoint (Segmentation, Depth, and DINO Aligned)

Checkpoint for the paper at https://huggingface.co/papers/2511.19418.

## Model Description

This CoVT checkpoint is based on LLaVA-v1.5-13B and aligned with **4 Depth tokens**. These task-specific tokens are integrated into the model’s embedding space to enhance 3D awareness.
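
To make "integrated into the model's embedding space" concrete, the toy sketch below shows one common way extra tokens can be added: each new token gets a new vocabulary id and a new row in the embedding table. This is only an illustration under stated assumptions, not the CoVT implementation. The token names (`<depth_0>` to `<depth_3>`), the helper `extend_embeddings`, the tiny vocabulary, and the 8-dimensional embeddings are all hypothetical.

```python
import random


def extend_embeddings(vocab, embeddings, new_tokens, seed=0):
    """Return copies of vocab and embeddings with one new row per new token.

    Hypothetical helper for illustration; not part of any CoVT or LLaVA API.
    """
    rng = random.Random(seed)
    dim = len(embeddings[0])
    vocab = dict(vocab)                          # copy so the inputs stay untouched
    embeddings = [row[:] for row in embeddings]
    for token in new_tokens:
        if token in vocab:
            continue                             # skip tokens that already have an id
        vocab[token] = len(embeddings)           # next free id
        # Small random initialization for the new embedding row.
        embeddings.append([rng.gauss(0.0, 0.02) for _ in range(dim)])
    return vocab, embeddings


# Toy base vocabulary and embedding table (placeholders, not the real model).
base_vocab = {"<s>": 0, "</s>": 1, "hello": 2}
base_embeddings = [[0.0] * 8 for _ in base_vocab]

# Assumed names for the 4 depth tokens; the real token strings may differ.
depth_tokens = [f"<depth_{i}>" for i in range(4)]

vocab, embeddings = extend_embeddings(base_vocab, base_embeddings, depth_tokens)
print(len(vocab), len(embeddings))
```

In a real model the new rows would be trained (here, aligned with depth features) rather than left at their random initialization.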