The results should look like:
<p align="center"> <img src="imgs/example1.jpg" width="22%"> <img src="vis_output/example1_masked_img_0.jpg" width="22%"> <img src="imgs/example2.jpg" width="25%"> <img src="vis_output/example2_masked_img_0.jpg" width="25%"> </p>
## Dataset
We have collected 1218 images in total: 239 for training, 200 for validation, and 779 for testing. The training and validation sets can be downloaded from <a href="https://drive.google.com/drive/folders/125mewyg5Ao6tZ3ZdJ-1-E3n04LGVELqy?usp=sharing">**this link**</a>.

Each image is provided with an annotation JSON file:
```
image_1.jpg, image_1.json
image_2.jpg, image_2.json
...
image_n.jpg, image_n.json
```
Important keys contained in the JSON files:
```
- "text": text instructions.
- "is_sentence": whether the text instructions are long sentences.
- "shapes": target polygons.
```
Each element of "shapes" falls into one of two categories: **"target"** and **"ignore"**. The former is required for evaluation, while the latter marks ambiguous regions and is therefore disregarded during evaluation.
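As a rough sketch of how an annotation file could be consumed, the snippet below loads one JSON file and splits its polygons into evaluation targets and ignored regions. The per-shape `"label"` and `"points"` field names are assumptions (following the common labelme-style layout), not something this README specifies:

```python
import json

def load_annotation(json_path):
    """Load one annotation file and split its "shapes" into polygons
    used for evaluation ("target") and ambiguous regions to skip
    ("ignore").

    Assumes each shape dict carries a "label" field set to either
    "target" or "ignore", plus a "points" polygon -- these field names
    are an assumption, not confirmed by the README.
    """
    with open(json_path) as f:
        ann = json.load(f)

    targets = [s for s in ann["shapes"] if s.get("label") == "target"]
    ignored = [s for s in ann["shapes"] if s.get("label") == "ignore"]
    return ann["text"], ann["is_sentence"], targets, ignored
```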
Besides, we leveraged GPT-3.5 to rephrase the instructions, so images in the training set may have **multiple instructions (but fewer than six)** in the "text" field. Users can randomly sample one of them as the text query during training to obtain a better model.
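One minimal way to do that sampling is sketched below (a hypothetical helper, not part of this repo; it assumes the "text" field holds either a single string or a list of rephrasings):

```python
import random

def sample_instruction(text_field):
    """Pick one text query for a training step.

    The "text" field may hold a single string or a list of up to five
    GPT-3.5 rephrasings; sampling uniformly at random is one simple way
    to make use of all of them during training.
    """
    if isinstance(text_field, str):
        return text_field
    return random.choice(text_field)
```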
## Citation
If you find this project useful in your research, please consider citing: