rename (#2)
- rename (757a76e08cdf931c5536cb883c6e6cec6ecc33a4)
Co-authored-by: Bo Liu <BoLiu@users.noreply.huggingface.co>
- Demo.ipynb +2 -2
- README.md +11 -11
Demo.ipynb
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:3fbac9002e8052dd97f8de19ede6a60fa119d237a08aa13e09fc294c73708489
+size 1085041
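The `Demo.ipynb` change above touches only a Git LFS pointer, whose three fields are `version`, `oid`, and `size`. As a minimal sketch of how such a pointer can be read, the Python below parses the new pointer text into its fields. `parse_lfs_pointer` is a hypothetical helper written for illustration; it is not part of Git LFS or of any Hugging Face tooling.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse Git LFS pointer text into a dict (hypothetical helper)."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Each pointer line has the form "<key> <value>".
        key, _, value = line.partition(" ")
        fields[key] = value
    if "oid" in fields:
        # The oid value has the form "<hash-algorithm>:<hex digest>".
        algorithm, _, digest = fields["oid"].partition(":")
        fields["oid_algorithm"] = algorithm
        fields["oid_digest"] = digest
    if "size" in fields:
        # Size of the real file in bytes; assumes the value is present.
        fields["size"] = int(fields["size"])
    return fields


# Pointer text copied from the new side of the Demo.ipynb diff.
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:3fbac9002e8052dd97f8de19ede6a60fa119d237a08aa13e09fc294c73708489
size 1085041
"""

info = parse_lfs_pointer(pointer)
print(info["oid_algorithm"], info["size"])
```

Because the diff shows only the pointer, a rename that changes the notebook's contents surfaces here as a new digest and byte count rather than as a line-by-line notebook diff.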
README.md
CHANGED
@@ -14,7 +14,7 @@ tags:
 - ingestion
 - yolox
 ---
-#
+# Nemotron Table Structure v1
 
 ## **Model Overview**
 
@@ -22,11 +22,11 @@ tags:
 
 *Preview of the model output on the example image.*
 
-The input of this model is expected to be a table image. You can use the [
+The input of this model is expected to be a table image. You can use the [Nemotron Page Element v3](https://huggingface.co/nvidia/nemotron-page-elements-v3) to detect and crop such images.
 
 ### Description
 
-The **
+The **Nemotron Table Structure v1** model is a specialized object detection model designed to identify and extract the structure of tables in images. Based on YOLOX, an anchor-free version of YOLO (You Only Look Once), this model combines a simpler architecture with enhanced performance. While the underlying technology builds upon work from [Megvii Technology](https://github.com/Megvii-BaseDetection/YOLOX), we developed our own base model through complete retraining rather than using pre-trained weights.
 
 The model excels at detecting and localizing the fundamental structural elements within tables. Through careful fine-tuning, it can accurately identify and delineate three key components within tables:
 
@@ -38,7 +38,7 @@ This specialized focus on table structure enables precise decomposition of compl
 
 This model is ready for commercial/non-commercial use.
 
-We are excited to announce the open sourcing of this commercial model. For users interested in deploying this model in production environments, it is also available via the model API in NVIDIA Inference Microservices (NIM) at [
+We are excited to announce the open sourcing of this commercial model. For users interested in deploying this model in production environments, it is also available via the model API in NVIDIA Inference Microservices (NIM) at [nemotron-table-structure-v1](https://build.nvidia.com/nvidia/nemotron-table-structure-v1).
 
 ### License/Terms of use
 
@@ -59,7 +59,7 @@ Global
 
 ### Use Case
 
-The **
+The **Nemotron Table Structure v1** model specializes in analyzing images containing tables by:
 - Detecting and extracting table structure elements (rows, columns, and cells)
 - Providing precise location information for each detected element
 - Supporting downstream tasks like table analysis and data extraction
@@ -77,7 +77,7 @@ Ideal for:
 
 ### Release Date
 
-10/23/2025 via https://huggingface.co/nvidia/
+10/23/2025 via https://huggingface.co/nvidia/nemotron-table-structure-v1
 
 ### References
 
@@ -128,11 +128,11 @@ git lfs install
 ```
 - Using https
 ```
-git clone https://huggingface.co/nvidia/
+git clone https://huggingface.co/nvidia/nemotron-table-structure-v1
 ```
 - Or using ssh
 ```
-git clone git@hf.co:nvidia/
+git clone git@hf.co:nvidia/nemotron-table-structure-v1
 ```
 
 2. Run the model using the following code:
@@ -182,7 +182,7 @@ If you wish to do additional training, [refer to the original repo](https://gith
 3. Advanced post-processing
 
 Additional post-processing might be required to use the model as part of a data extraction pipeline.
-We show how to use the model as part of a table to text pipeline alongside with the [
+We show how to use the model as part of a table to text pipeline alongside with the [Nemotron OCR](https://huggingface.co/nvidia/nemotron-ocr-v1) in the notebook `Demo.ipynb`.
 
 **Disclaimer:**
 We are aware of some issues with the model, and will provide a v2 with improved performance in the future which addresses the following issues:
@@ -194,7 +194,7 @@ We are aware of some issues with the model, and will provide a v2 with improved
 ### Software Integration
 
 **Runtime Engine(s):**
-- **
+- **Nemotron Page Elements v3** NIM
 
 
 **Supported Hardware Microarchitecture Compatibility [List in Alphabetic Order]:**
@@ -211,7 +211,7 @@ This AI model can be embedded as an Application Programming Interface (API) call
 
 ## Model Version(s):
 
-* `
+* `nemotron-table-structure-v1`
 
 ## Training and Evaluation Datasets:
 