## Ming-UniVision: Joint Image Understanding and Generation with a Unified Continuous Tokenizer

<p align="center">📑 <a href="https://arxiv.org/pdf/2510.06590">Technical Report</a> | 📖 <a href="https://inclusionai.github.io/blog/mingtok/">Project Page</a> | 🤗 <a href="https://huggingface.co/inclusionAI/Ming-UniVision-16B-A3B">Hugging Face</a> | 🤖 <a href="https://www.modelscope.cn/models/inclusionAI/Ming-UniVision-16B-A3B">ModelScope</a> | 💾 <a href="https://github.com/inclusionAI/Ming-UniVision">GitHub</a></p>

## Key Features

- 🌐 **First Unified Autoregressive MLLM with Continuous Vision Tokens:** Ming-UniVision is the first multimodal large language model that natively integrates continuous visual representations from MingTok into a next-token prediction (NTP) framework, unifying vision and language under a single autoregressive paradigm without discrete quantization or modality-specific heads.
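
The unification described above can be illustrated with a toy sketch. This is not the Ming-UniVision API; the embedding table, dimensions, and the linear "predictor" below are all hypothetical stand-ins, showing only how discrete text tokens and continuous vision tokens can share one autoregressive sequence:

```python
import numpy as np

rng = np.random.default_rng(0)

d = 16                     # shared hidden size for both modalities (toy value)
vocab = 100                # toy text vocabulary size
text_embed = rng.normal(size=(vocab, d))    # toy text embedding table

# Text enters the sequence via a discrete embedding lookup...
text_ids = np.array([5, 42, 7])
text_tokens = text_embed[text_ids]          # shape (3, d)

# ...while vision enters directly as continuous tokens (e.g. produced by a
# continuous tokenizer such as MingTok), with no quantization to a codebook.
vision_tokens = rng.normal(size=(4, d))     # shape (4, d)

# Both modalities form a single interleaved autoregressive stream.
sequence = np.concatenate([text_tokens, vision_tokens], axis=0)  # (7, d)

# Stand-in "model": one linear map predicting the next continuous token from
# the last position (a real NTP model attends over the whole prefix).
W = rng.normal(size=(d, d)) / np.sqrt(d)
next_token_pred = sequence[-1] @ W          # (d,) continuous prediction

print(sequence.shape, next_token_pred.shape)
```

Because the predicted token is itself a continuous vector in the same space, generation and understanding can run in one loop without switching between a language head and a separate image decoder.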
## Reference
```
@article{huang2025mingunivision,
  title={Ming-UniVision: Joint Image Understanding and Generation with a Unified Continuous Tokenizer},
  author={Huang, Ziyuan and Zheng, DanDan and Zou, Cheng and Liu, Rui and Wang, Xiaolong and Ji, Kaixiang and Chai, Weilong and Sun, Jianxin and Wang, Libin and Lv, Yongjie and Huang, Taozhi and Liu, Jiajia and Guo, Qingpei and Yang, Ming and Chen, Jingdong and Zhou, Jun},
  journal={arXiv preprint arXiv:2510.06590},
  year={2025}
}
```