manaestras EthannW committed on
Commit ea397eb · verified · 1 Parent(s): b28a172

Change Title link (#2)

- Change Title link (76081691fb97261dc4f579467a9b9da6be82320b)


Co-authored-by: Ethannwan <EthannW@users.noreply.huggingface.co>

Files changed (1)
  1. README.md +6 -6
README.md CHANGED
@@ -7,12 +7,6 @@ pipeline_tag: image-text-to-text
  library_name: transformers
  ---
 
- <div align="center">
-
- # HunyuanOCR
-
- </div>
-
  <p align="center">
  <img src="https://github.com/Tencent-Hunyuan/HunyuanOCR/blob/main/assets/hyocr-head-img.png?raw=true" width="80%"/> <br>
  </p>
@@ -25,6 +19,12 @@ library_name: transformers
  <a href="https://github.com/Tencent-Hunyuan/HunyuanOCR"><b>🌟 Github</b></a>
  </p>
 
+ <h2>
+ <p align="center">
+ <a href="https://github.com/Tencent-Hunyuan/HunyuanOCR/blob/main/HunyuanOCR_Technical_Report.pdf">HunyuanOCR</a>
+ </p>
+ </h2>
+
 
  ## 📖 Introduction
  **HunyuanOCR** stands as a leading end-to-end OCR expert VLM powered by Hunyuan's native multimodal architecture. With a remarkably lightweight 1B parameter design, it has achieved multiple state-of-the-art benchmarks across the industry. The model demonstrates mastery in **complex multilingual document parsing** while excelling in practical applications including **text spotting, open-field information extraction, video subtitle extraction, and photo translation**.