Innovator-VL
Collection
A Multimodal Large Language Model for Scientific Discovery
•
10 items
•
Updated
Innovator-VL-8B-Instruct is a multimodal instruction-following large language model designed for scientific understanding and reasoning. The model integrates strong general-purpose vision-language capabilities with enhanced scientific multimodal alignment, while maintaining a fully transparent and reproducible training pipeline.
Unlike approaches that rely on large-scale domain-specific pretraining, Innovator-VL-8B-Instruct achieves competitive scientific performance using high-quality instruction tuning, without additional scientific text continued pretraining.
The model supports native-resolution multi-image inputs and is suitable for complex scientific visual analysis.
No additional scientific text continued pretraining is applied.
@article{innovator-vl,
title={Innovator-VL: A Multimodal Large Language Model for Scientific Discovery},
year={2025}
}