Add pipeline_tag and library_name to metadata
#1 opened by nielsr (HF Staff)

README.md CHANGED
@@ -1,10 +1,12 @@
 ---
-
+base_model:
+- Qwen/Qwen3-8B
 language:
 - zh
 - en
-
-
+license: apache-2.0
+library_name: transformers
+pipeline_tag: text-generation
 ---
 
 # TableGPT-R1
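With `library_name: transformers` and `pipeline_tag: text-generation` in place, the Hub can render the correct "how to use" snippet and the checkpoint becomes loadable through the generic pipeline API. Below is a minimal sketch of what the new metadata advertises, assuming the repo id is `tablegpt/TableGPT-R1` (the diff itself does not name it):

```python
# Minimal sketch, not from the PR: load the checkpoint via the library and
# task that the new metadata declares. The repo id "tablegpt/TableGPT-R1" is
# an assumption; substitute the actual model id on the Hub.
from transformers import pipeline

generator = pipeline(
    task="text-generation",        # matches pipeline_tag
    model="tablegpt/TableGPT-R1",  # assumed repo id
)

prompt = "Given the table below, which region had the highest Q3 revenue?\n..."
print(generator(prompt, max_new_tokens=128)[0]["generated_text"])
```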
@@ -180,13 +182,13 @@ Inquiries and feedback are welcome at [j.zhao@zju.edu.cn](mailto:j.zhao@zju.edu.cn).
 
 TableGPT-R1 demonstrates substantial advancements over its predecessor, TableGPT2-7B, particularly in table comprehension and reasoning capabilities. Detailed comparisons are as follows:
 
-* **TableBench Benchmark**: TableGPT-R1 demonstrates strong performance. It achieves an average gain of 6.9
+* **TableBench Benchmark**: TableGPT-R1 demonstrates strong performance. It achieves an average gain of 6.9% over Qwen3-8B across the four core sub-tasks. Compared to TableGPT2-7B, it records an average improvement of 3.12%, validating its enhanced reasoning capability despite a trade-off on the PoT task.
 
-* **Natural Language to SQL**: TableGPT-R1 exhibits superior generalization capabilities. While showing consistent improvements over Qwen3-8B on Spider 1.0 (+0.66
+* **Natural Language to SQL**: TableGPT-R1 exhibits superior generalization capabilities. While showing consistent improvements over Qwen3-8B on Spider 1.0 (+0.66%) and BIRD (+1.5%), it represents a significant leap over TableGPT2-7B, registering dramatic performance increases of 12.35% and 13.89%, respectively.
 
-* **RealHitBench Test**: In this highly challenging test, TableGPT-R1 achieved outstanding results, particularly surpassing the top closed-source baseline model GPT-4o. This highlights its powerful capabilities in hierarchical table reasoning. Quantitative analysis shows that TableGPT-R1 matches or outperforms Qwen3-8B across subtasks, achieving an average improvement of 11.81
+* **RealHitBench Test**: On this highly challenging test, TableGPT-R1 achieves outstanding results, notably surpassing the strongest closed-source baseline, GPT-4o, which highlights its strength in hierarchical table reasoning. Quantitative analysis shows that TableGPT-R1 matches or outperforms Qwen3-8B across subtasks, with an average improvement of 11.81% and a remarkable peak gain of 31.17% on the Chart Generation task. Compared to TableGPT2-7B, it registers an average improvement of 19.85% across all subtasks.
 
-* **Internal Benchmark**: Evaluation further attests to the model's robustness. TableGPT-R1 surpasses Qwen3-8B by substantial margins: 10.8
+* **Internal Benchmark**: Evaluation further attests to the model's robustness. TableGPT-R1 surpasses Qwen3-8B by substantial margins: 10.8% on the Table Info task and 8.8% on the Table Path task.
 
 
 | Benchmark | Task | Met. | Q3-8B | T-LLM | Llama | T-R1-Z | TGPT2 | **TGPT-R1** | Q3-14B | Q3-32B | Q3-30B | QwQ | GPT-4o | DS-V3 | Q-Plus | vs.Q3-8B | vs.TGPT2 |
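For completeness, the front matter this PR edits can be read back programmatically with `huggingface_hub`; a small sketch, again assuming the `tablegpt/TableGPT-R1` repo id:

```python
# Sketch: read back the YAML front matter this PR edits, via
# huggingface_hub's ModelCard API. The repo id is an assumption.
from huggingface_hub import ModelCard

card = ModelCard.load("tablegpt/TableGPT-R1")  # downloads README.md
data = card.data                               # parsed metadata block

assert data.library_name == "transformers"
assert data.pipeline_tag == "text-generation"
print(data.base_model, data.license, data.language)
```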