Update README.md
README.md
CHANGED
@@ -21,6 +21,7 @@ tags:
 * **Objective**: To solve the key challenge of integrating the visual content of **specific objects** in an image with their **associated text** for comprehensive understanding.
 * **Method**: Trained with the **RCVIT(Region-level Context-aware Visual Instruction Tuning)** method and **RCMU** dataset, it uses **bounding boxes** to precisely link visual content with text.
 * **Performance & Applications**: It achieves outstanding performance on RCMU tasks and is successfully applied in advanced scenarios like **multimodal RAG** and **personalized conversation**.
+
 
 
 ## Refer to Qwen2-VL for the requirements: