sosahi BoLiu committed on
Commit 8359ba4 · verified · 1 Parent(s): 49a14ce

- rename (757a76e08cdf931c5536cb883c6e6cec6ecc33a4)


Co-authored-by: Bo Liu <BoLiu@users.noreply.huggingface.co>

Files changed (2)
  1. Demo.ipynb +2 -2
  2. README.md +11 -11
Demo.ipynb CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:e799b29f13e7dba664c2e5e9af5866cc10c7fc847fdd32295348cc78cdf9d13f
- size 1085057
+ oid sha256:3fbac9002e8052dd97f8de19ede6a60fa119d237a08aa13e09fc294c73708489
+ size 1085041
README.md CHANGED
@@ -14,7 +14,7 @@ tags:
  - ingestion
  - yolox
  ---
- # Nemoretriever Table Structure v1
+ # Nemotron Table Structure v1
 
  ## **Model Overview**
 
@@ -22,11 +22,11 @@ tags:
 
  *Preview of the model output on the example image.*
 
- The input of this model is expected to be a table image. You can use the [Nemoretriever Page Element v3](https://huggingface.co/nvidia/nemoretriever-page-elements-v3) to detect and crop such images.
+ The input of this model is expected to be a table image. You can use the [Nemotron Page Element v3](https://huggingface.co/nvidia/nemotron-page-elements-v3) to detect and crop such images.
 
  ### Description
 
- The **NeMo Retriever Table Structure v1** model is a specialized object detection model designed to identify and extract the structure of tables in images. Based on YOLOX, an anchor-free version of YOLO (You Only Look Once), this model combines a simpler architecture with enhanced performance. While the underlying technology builds upon work from [Megvii Technology](https://github.com/Megvii-BaseDetection/YOLOX), we developed our own base model through complete retraining rather than using pre-trained weights.
+ The **Nemotron Table Structure v1** model is a specialized object detection model designed to identify and extract the structure of tables in images. Based on YOLOX, an anchor-free version of YOLO (You Only Look Once), this model combines a simpler architecture with enhanced performance. While the underlying technology builds upon work from [Megvii Technology](https://github.com/Megvii-BaseDetection/YOLOX), we developed our own base model through complete retraining rather than using pre-trained weights.
 
  The model excels at detecting and localizing the fundamental structural elements within tables. Through careful fine-tuning, it can accurately identify and delineate three key components within tables:
 
@@ -38,7 +38,7 @@ This specialized focus on table structure enables precise decomposition of compl
 
  This model is ready for commercial/non-commercial use.
 
- We are excited to announce the open sourcing of this commercial model. For users interested in deploying this model in production environments, it is also available via the model API in NVIDIA Inference Microservices (NIM) at [nemoretriever-table-structure-v1](https://build.nvidia.com/nvidia/nemoretriever-table-structure-v1).
+ We are excited to announce the open sourcing of this commercial model. For users interested in deploying this model in production environments, it is also available via the model API in NVIDIA Inference Microservices (NIM) at [nemotron-table-structure-v1](https://build.nvidia.com/nvidia/nemotron-table-structure-v1).
 
  ### License/Terms of use
 
@@ -59,7 +59,7 @@ Global
 
  ### Use Case
 
- The **NeMo Retriever Table Structure v1** model specializes in analyzing images containing tables by:
+ The **Nemotron Table Structure v1** model specializes in analyzing images containing tables by:
  - Detecting and extracting table structure elements (rows, columns, and cells)
  - Providing precise location information for each detected element
  - Supporting downstream tasks like table analysis and data extraction
@@ -77,7 +77,7 @@ Ideal for:
 
  ### Release Date
 
- 10/23/2025 via https://huggingface.co/nvidia/nemoretriever-table-structure-v1
+ 10/23/2025 via https://huggingface.co/nvidia/nemotron-table-structure-v1
 
  ### References
 
@@ -128,11 +128,11 @@ git lfs install
  ```
  - Using https
  ```
- git clone https://huggingface.co/nvidia/nemoretriever-table-structure-v1
+ git clone https://huggingface.co/nvidia/nemotron-table-structure-v1
  ```
  - Or using ssh
  ```
- git clone git@hf.co:nvidia/nemoretriever-table-structure-v1
+ git clone git@hf.co:nvidia/nemotron-table-structure-v1
  ```
 
  2. Run the model using the following code:
@@ -182,7 +182,7 @@ If you wish to do additional training, [refer to the original repo](https://gith
  3. Advanced post-processing
 
  Additional post-processing might be required to use the model as part of a data extraction pipeline.
- We show how to use the model as part of a table to text pipeline alongside with the [Nemo Retriever OCR](https://huggingface.co/nvidia/nemoretriever-ocr-v1) in the notebook `Demo.ipynb`.
+ We show how to use the model as part of a table to text pipeline alongside with the [Nemotron OCR](https://huggingface.co/nvidia/nemotron-ocr-v1) in the notebook `Demo.ipynb`.
 
  **Disclaimer:**
  We are aware of some issues with the model, and will provide a v2 with improved performance in the future which addresses the following issues:
@@ -194,7 +194,7 @@ We are aware of some issues with the model, and will provide a v2 with improved
  ### Software Integration
 
  **Runtime Engine(s):**
- - **NeMo Retriever Page Elements v3** NIM
+ - **Nemotron Page Elements v3** NIM
 
 
  **Supported Hardware Microarchitecture Compatibility [List in Alphabetic Order]:**
@@ -211,7 +211,7 @@ This AI model can be embedded as an Application Programming Interface (API) call
 
  ## Model Version(s):
 
- * `nemoretriever-table-structure-v1`
+ * `nemotron-table-structure-v1`
 
  ## Training and Evaluation Datasets:
 