KennethTM/pix2struct-base-table2html

KennethTM committed on Sep 10, 2024

Commit 13d6ca3 · verified · 1 Parent(s): 6e6f093

Update README.md

Files changed (1)

README.md +8 -11

@@ -13,9 +13,15 @@ base_model: google/pix2struct-base
 *Turn table images into HTML!*
 ## About
-This model takes an image of a table and outputs HTML - the model parses both the optical character recognition (OCR) and the structure to HTML format.
 The model expects an image containing only a table. If the table is embedded in a document, first use a table detection model to extract it.
@@ -25,7 +31,7 @@ The model has been trained using two datasets: [MMTab](https://huggingface.co/da
 ## Usage
-Below is a complete example for loading the model and performing inference on an example table image:
 ```python
 import torch
@@ -42,7 +48,6 @@ model.to(device)
 model.eval()
 # Load example image from URL
-# Example from the MMTab dataset
 url = "https://example.com/path_to_table_image.jpg"
 response = requests.get(url)
 image = Image.open(BytesIO(response.content))
@@ -59,11 +64,3 @@ predictions_decoded = processor.tokenizer.batch_decode(predictions, skip_special
 # Show predictions as text
 print(predictions_decoded[0])
 ```
-## Demo app
-Try the [demo app]() which contain both table detection and recognition!

 *Turn table images into HTML!*
+## Demo app
+Try the [demo app]() which contains both table detection and recognition!
 ## About
+This model takes an image of a table and outputs HTML - the model parses the image and performs optical character recognition (OCR) and structure recognition to HTML format.
 The model expects an image containing only a table. If the table is embedded in a document, first use a table detection model to extract it.
 ## Usage
+Below is a complete example of loading the model and performing inference on an example table image (example from the [MMTab dataset](https://huggingface.co/datasets/SpursgoZmy/MMTab)):
 ```python
 import torch
 model.eval()
 # Load example image from URL
 url = "https://example.com/path_to_table_image.jpg"
 response = requests.get(url)
 image = Image.open(BytesIO(response.content))
 # Show predictions as text
 print(predictions_decoded[0])
 ```
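
Because the README describes the model's output as an HTML string, a reader may want to turn that string into rows of cell text. The following is a minimal sketch of one way to do that with Python's standard-library `html.parser`. It is not part of this commit or of the model card: the names `TableRowParser` and `html_table_to_rows` are hypothetical, `sample_html` is an invented stand-in for `predictions_decoded[0]`, and the sketch assumes the output uses plain `<tr>`, `<th>` and `<td>` tags.

```python
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the text of each <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # completed rows, each a list of cell strings
        self._row = None   # cells of the row being read, None outside <tr>
        self._cell = None  # text fragments of the cell being read, else None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a cell
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def html_table_to_rows(html):
    """Return the table as a list of rows, each a list of cell strings."""
    parser = TableRowParser()
    parser.feed(html)
    parser.close()
    return parser.rows


# Hypothetical string standing in for predictions_decoded[0]
sample_html = (
    "<table><tr><th>Name</th><th>Score</th></tr>"
    "<tr><td>A</td><td>1</td></tr></table>"
)
rows = html_table_to_rows(sample_html)
print(rows)
```

This sketch ignores `rowspan` and `colspan` attributes and nested tables, so merged cells would need extra handling before the rows line up as a grid.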