update
app/src/content/article.mdx
CHANGED
|
@@ -38,7 +38,7 @@ Even though open-weights Vision-Language Models (VLMs) are becoming ever more po
|
|
| 38 |
### Data Collection
|
| 39 |
We manually collect over 180 image-text datasets from the recent literature and create new subsets in lacking domains.
|
| 40 |
|
| 41 |
-
<
|
| 42 |
<Accordion title="FineVision Subsets">
|
| 43 |
|Subset Name |Total Images|Total Samples|Total Turns|Total Question Tokens|Total Answer Tokens|Category |
|
| 44 |
|--------------------------------------|------------|-------------|-----------|---------------------|-------------------|----------------------|
|
|
@@ -228,7 +228,7 @@ We manually collect over 180 image-text datasets from the recent literature and
|
|
| 228 |
|text_wizardlm_evol |0 |69,999 |69,999 |7,753,963 |21,955,856 |Text-only |
|
| 229 |
|text_OpenMathInstruct-2 |0 |1,000,000 |1,000,000 |74,905,850 |413,132,418 |Text-only |
|
| 230 |
</Accordion>
|
| 231 |
-
</
|
| 232 |
|
| 233 |
### Cleaning
|
| 234 |
After gathering all the sub-datasets, every turn is cleaned. We remove all individual turns whose combined question and answer length exceeds 8192 tokens. We resize big images to have a longest side of 2048 pixels while keeping the aspect ratio, and discard images with corrupted metadata. This results in a clean final dataset with a maximum turn length of 8192 tokens and a maximum image dimension of 2048 pixels on the longest side.
|
|
|
|
| 38 |
### Data Collection
|
| 39 |
We manually collect over 180 image-text datasets from the recent literature and create new subsets in lacking domains.
|
| 40 |
|
| 41 |
+
<FullWidth>
|
| 42 |
<Accordion title="FineVision Subsets">
|
| 43 |
|Subset Name |Total Images|Total Samples|Total Turns|Total Question Tokens|Total Answer Tokens|Category |
|
| 44 |
|--------------------------------------|------------|-------------|-----------|---------------------|-------------------|----------------------|
|
|
|
|
| 228 |
|text_wizardlm_evol |0 |69,999 |69,999 |7,753,963 |21,955,856 |Text-only |
|
| 229 |
|text_OpenMathInstruct-2 |0 |1,000,000 |1,000,000 |74,905,850 |413,132,418 |Text-only |
|
| 230 |
</Accordion>
|
| 231 |
+
</FullWidth>
|
| 232 |
|
| 233 |
### Cleaning
|
| 234 |
After gathering all the sub-datasets, every turn is cleaned. We remove all individual turns whose combined question and answer length exceeds 8192 tokens. We resize big images to have a longest side of 2048 pixels while keeping the aspect ratio, and discard images with corrupted metadata. This results in a clean final dataset with a maximum turn length of 8192 tokens and a maximum image dimension of 2048 pixels on the longest side.
|
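The cleaning rules quoted in the Cleaning paragraph can be sketched in a few lines. This is only an illustrative sketch, not the authors' pipeline: the function names, the constants, and the `question_tokens` / `answer_tokens` dictionary keys are hypothetical, and token counts are assumed to be precomputed by whatever tokenizer the dataset uses.

```python
# Hypothetical constants mirroring the limits stated in the Cleaning paragraph.
MAX_TURN_TOKENS = 8192   # maximum combined question + answer length per turn
MAX_IMAGE_SIDE = 2048    # maximum length of the longest image side, in pixels


def keep_turn(question_tokens, answer_tokens, max_tokens=MAX_TURN_TOKENS):
    """Return True if the turn's combined token count does not exceed the limit."""
    return question_tokens + answer_tokens <= max_tokens


def clean_turns(turns, max_tokens=MAX_TURN_TOKENS):
    """Drop every turn whose combined question and answer length exceeds max_tokens.

    Each turn is assumed to be a dict with precomputed integer token counts
    under the (hypothetical) keys "question_tokens" and "answer_tokens".
    """
    return [
        t for t in turns
        if keep_turn(t["question_tokens"], t["answer_tokens"], max_tokens)
    ]


def resized_dimensions(width, height, max_side=MAX_IMAGE_SIDE):
    """Compute the target size so the longest side is at most max_side.

    Images already within the limit are left unchanged; larger images are
    scaled down uniformly, which preserves the aspect ratio.
    """
    longest = max(width, height)
    if longest <= max_side:
        return width, height
    scale = max_side / longest
    # Guard against a side rounding down to zero for extreme aspect ratios.
    return max(1, round(width * scale)), max(1, round(height * scale))
```

The size computation is kept separate from the actual pixel resampling so that it can be checked without an imaging library; the returned dimensions would then be passed to whichever resize routine the pipeline uses.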