Enhance model card: Add metadata, paper link, GitHub, and usage example
This PR significantly improves the model card for the SLM-SQL model by:
- Adding the `pipeline_tag: text-generation` to correctly categorize the model for Text-to-SQL tasks on the Hub.
- Adding the `library_name: transformers` to ensure compatibility with the Hugging Face Transformers library and enable the "how to use" button.
- Including relevant `tags` such as `text-to-sql`, `sql`, `qwen2`, and `small-language-model` for better discoverability.
- Providing a link to the paper: [SLM-SQL: An Exploration of Small Language Models for Text-to-SQL](https://huggingface.co/papers/2507.22478).
- Adding a link to the official GitHub repository for code and datasets: [https://github.com/alibaba/SLM-SQL](https://github.com/alibaba/SLM-SQL).
- Including a practical Python code snippet demonstrating how to use the model for Text-to-SQL inference.
- Removing the extraneous "File information" section.
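
The metadata changes listed above can be sanity-checked before merging. The following is a minimal sketch, assuming the card's front matter stays as flat as in this PR (scalar keys plus one simple list). `parse_front_matter` is a hypothetical helper written for illustration, not a Hub API, and the `CARD` text is copied from the added lines of this PR; a real check would more likely use a YAML library.

```python
def parse_front_matter(card_text: str) -> dict:
    """Parse a flat YAML front matter block (scalars and simple lists only)."""
    lines = card_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}  # no opening delimiter, so no front matter
    metadata = {}
    current_key = None
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":  # closing delimiter ends the block
            return metadata
        if not stripped:
            continue
        if stripped.startswith("- ") and isinstance(metadata.get(current_key), list):
            # List item belonging to the most recent list-valued key
            metadata[current_key].append(stripped[2:].strip())
        elif ":" in stripped:
            key, _, value = stripped.partition(":")
            current_key = key.strip()
            value = value.strip()
            # In this simplified format, a key without an inline value starts a list
            metadata[current_key] = value if value else []
    return {}  # unterminated front matter is treated as missing


# Front matter as added by this PR (assumption: copied from the diff)
CARD = """---
license: apache-2.0
pipeline_tag: text-generation
library_name: transformers
tags:
- text-to-sql
- sql
- qwen2
- small-language-model
---

# SLM-SQL: An Exploration of Small Language Models for Text-to-SQL
"""

metadata = parse_front_matter(CARD)
print(metadata["pipeline_tag"], metadata["library_name"], metadata["tags"])
```

The parser deliberately rejects anything it does not understand by returning an empty dict, so a missing closing `---` shows up as missing metadata rather than a partial result.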

@@ -1,3 +1,71 @@
----
-license: apache-2.0
-
+---
+license: apache-2.0
+pipeline_tag: text-generation
+library_name: transformers
+tags:
+- text-to-sql
+- sql
+- qwen2
+- small-language-model
+---
+
+# SLM-SQL: An Exploration of Small Language Models for Text-to-SQL
+
+This repository contains **SLM-SQL**, a small language model (SLM) designed for Text-to-SQL tasks, as presented in the paper [SLM-SQL: An Exploration of Small Language Models for Text-to-SQL](https://huggingface.co/papers/2507.22478).
+
+SLM-SQL aims to bridge the performance gap between large language models and smaller models in SQL generation, offering advantages in inference speed and suitability for edge deployment. It achieves this through innovations in post-training techniques, including supervised fine-tuning and reinforcement learning, applied to a Qwen2 base model. The models were evaluated on the BIRD development set, showing significant improvements in execution accuracy, with the 0.5B model reaching 56.87% execution accuracy (EX) and the 1.5B model achieving 67.08% EX.
+
+Code and datasets are available on the official GitHub repository: [https://github.com/alibaba/SLM-SQL](https://github.com/alibaba/SLM-SQL).
+
+## Usage
+
+You can use the model with the Hugging Face `transformers` library. Below is a basic example demonstrating how to load the model and generate a SQL query from a natural language question.
+
+```python
+from transformers import AutoModelForCausalLM, AutoTokenizer
+import torch
+
+# Replace "your_model_id_here" with the specific model ID you want to use (e.g., SLM-SQL/slm-sql-1.5b)
+model_name = "your_model_id_here"
+
+tokenizer = AutoTokenizer.from_pretrained(model_name)
+model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16, device_map="auto")
+
+# Example Text-to-SQL prompt
+question = "List the names of all employees who joined after 2020."
+schema = "CREATE TABLE employees (employee_id INT, name TEXT, join_date DATE);"
+
+# Construct the prompt according to the model's expected format (this may vary)
+# A common format for Text-to-SQL is providing the database schema and the natural language question.
+# Adjacent string literals are concatenated into a single prompt;
+# explicit "\n" escapes mark the line breaks between sections.
+prompt = (
+    "Given the database schema below, "
+    "generate a SQL query for the following question:\n\n"
+    f"Schema: {schema}\n"
+    f"Question: {question}\n\n"
+    "SQL Query:"
+)
+
+inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
+
+# Generate the SQL query
+# Adjust generation parameters like max_new_tokens, temperature, top_p as needed
+outputs = model.generate(
+    **inputs,
+    max_new_tokens=256,  # Set an appropriate length for SQL queries
+    temperature=0.7,
+    top_p=0.9,
+    do_sample=True,
+    eos_token_id=tokenizer.eos_token_id,
+)
+
+# Decode the generated output, skipping the input prompt and special tokens
+generated_sql = tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True).strip()
+
+print(f"Question: {question}")
+print(f"Generated SQL: {generated_sql}")
+
+# Example output might look like:
+# SELECT name FROM employees WHERE join_date > '2020-12-31';
+```
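
Because the usage example samples with `temperature=0.7`, the model can occasionally return SQL that does not parse or that references columns missing from the schema. One way to catch this before showing a result is to compile the query against an empty in-memory SQLite database built from the same schema. The sketch below assumes a SQLite-compatible dialect; `is_valid_sql` is a hypothetical helper for illustration and is not part of the SLM-SQL repository.

```python
import sqlite3


def is_valid_sql(schema: str, query: str) -> bool:
    """Return True if `query` compiles against `schema` in SQLite."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(schema)  # create the (empty) tables
        # EXPLAIN compiles the statement without running it, so no data is needed
        conn.execute(f"EXPLAIN {query.strip()}")
        return True
    except (sqlite3.Error, sqlite3.Warning):
        # Syntax errors, unknown tables/columns, or multi-statement input
        return False
    finally:
        conn.close()


schema = "CREATE TABLE employees (employee_id INT, name TEXT, join_date DATE);"
print(is_valid_sql(schema, "SELECT name FROM employees WHERE join_date > '2020-12-31';"))  # True
print(is_valid_sql(schema, "SELECT salary FROM employees;"))  # False: no such column
```

This only checks that the query compiles against the schema; it says nothing about whether the query answers the question, which is what execution accuracy on a populated database measures.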