Add pipeline_tag and library_name to metadata
#1 opened by nielsr (HF Staff)

README.md CHANGED
@@ -1,10 +1,12 @@
 ---
-
+base_model:
+- Qwen/Qwen3-8B
 language:
 - zh
 - en
-
-
+license: apache-2.0
+library_name: transformers
+pipeline_tag: text-generation
 ---
 
 # TableGPT-R1
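With `library_name: transformers` and `pipeline_tag: text-generation` in place, the Hub can render the correct "how to use" snippet and the checkpoint becomes loadable through the generic pipeline API. Below is a minimal sketch of what the new metadata advertises, assuming the repo id is `tablegpt/TableGPT-R1` (the diff itself does not name it):

```python
# Minimal sketch, not from the PR: load the checkpoint via the library and
# task that the new metadata declares. The repo id "tablegpt/TableGPT-R1" is
# an assumption; substitute the actual model id on the Hub.
from transformers import pipeline

generator = pipeline(
    task="text-generation",        # matches pipeline_tag
    model="tablegpt/TableGPT-R1",  # assumed repo id
)

prompt = "Given the table below, which region had the highest Q3 revenue?\n..."
print(generator(prompt, max_new_tokens=128)[0]["generated_text"])
```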
@@ -180,13 +182,13 @@ Inquiries and feedback are welcome at [j.zhao@zju.edu.cn](mailto:j.zhao@zju.edu.cn).
 
 TableGPT-R1 demonstrates substantial advancements over its predecessor, TableGPT2-7B, particularly in table comprehension and reasoning capabilities. Detailed comparisons are as follows:
 
-* **TableBench Benchmark**: TableGPT-R1 demonstrates strong performance. It achieves an average gain of 6.9
+* **TableBench Benchmark**: TableGPT-R1 demonstrates strong performance. It achieves an average gain of 6.9% over Qwen3-8B across the four core sub-tasks. Compared to TableGPT2-7B, it records an average improvement of 3.12%, validating its enhanced reasoning capability despite a trade-off on the PoT task.
 
-* **Natural Language to SQL**: TableGPT-R1 exhibits superior generalization capabilities. While showing consistent improvements over Qwen3-8B on Spider 1.0 (+0.66
+* **Natural Language to SQL**: TableGPT-R1 exhibits superior generalization capabilities. While showing consistent improvements over Qwen3-8B on Spider 1.0 (+0.66%) and BIRD (+1.5%), it represents a significant leap over TableGPT2-7B, registering dramatic performance increases of 12.35% and 13.89%, respectively.
 
-* **RealHitBench Test**: In this highly challenging test, TableGPT-R1 achieved outstanding results, particularly surpassing the top closed-source baseline model GPT-4o. This highlights its powerful capabilities in hierarchical table reasoning. Quantitative analysis shows that TableGPT-R1 matches or outperforms Qwen3-8B across subtasks, achieving an average improvement of 11.81
+* **RealHitBench Test**: On this highly challenging test, TableGPT-R1 achieves outstanding results, notably surpassing the strongest closed-source baseline, GPT-4o, which highlights its strength in hierarchical table reasoning. Quantitative analysis shows that TableGPT-R1 matches or outperforms Qwen3-8B across subtasks, with an average improvement of 11.81% and a remarkable peak gain of 31.17% on the Chart Generation task. Compared to TableGPT2-7B, it registers an average improvement of 19.85% across all subtasks.
 
-* **Internal Benchmark**: Evaluation further attests to the model's robustness. TableGPT-R1 surpasses Qwen3-8B by substantial margins: 10.8
+* **Internal Benchmark**: Evaluation further attests to the model's robustness. TableGPT-R1 surpasses Qwen3-8B by substantial margins: 10.8% on the Table Info task and 8.8% on the Table Path task.
 
 
 | Benchmark | Task | Met. | Q3-8B | T-LLM | Llama | T-R1-Z | TGPT2 | **TGPT-R1** | Q3-14B | Q3-32B | Q3-30B | QwQ | GPT-4o | DS-V3 | Q-Plus | vs.Q3-8B | vs.TGPT2 |
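For completeness, the front matter this PR edits can be read back programmatically with `huggingface_hub`; a small sketch, again assuming the `tablegpt/TableGPT-R1` repo id:

```python
# Sketch: read back the YAML front matter this PR edits, via
# huggingface_hub's ModelCard API. The repo id is an assumption.
from huggingface_hub import ModelCard

card = ModelCard.load("tablegpt/TableGPT-R1")  # downloads README.md
data = card.data                               # parsed metadata block

assert data.library_name == "transformers"
assert data.pipeline_tag == "text-generation"
print(data.base_model, data.license, data.language)
```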