Enrich VLMs’ vision-centric reasoning capabilities via Chain-of-Visual-Thought!
YM Qin
Wakals
AI & ML interests
Computer Vision, Vision-Language Models, Generative Models
Recent Activity
liked a dataset 13 days ago
DietCoke4671/ToolVQA
upvoted a paper about 1 month ago
COOPER: A Unified Model for Cooperative Perception and Reasoning in Spatial Intelligence
Organizations
None yet