Enhance model card: Add metadata, paper link, GitHub, and usage example
#1 by nielsr (HF Staff) - opened
README.md CHANGED
@@ -1,3 +1,71 @@
----
-license: apache-2.0
-
+---
+license: apache-2.0
+pipeline_tag: text-generation
+library_name: transformers
+tags:
+- text-to-sql
+- sql
+- qwen2
+- small-language-model
+---
+
+# SLM-SQL: An Exploration of Small Language Models for Text-to-SQL
+
+This repository contains **SLM-SQL**, a small language model (SLM) designed for Text-to-SQL tasks, as presented in the paper [SLM-SQL: An Exploration of Small Language Models for Text-to-SQL](https://huggingface.co/papers/2507.22478).
+
+SLM-SQL aims to bridge the performance gap between large language models and smaller models in SQL generation, offering advantages in inference speed and suitability for edge deployment. It achieves this through innovations in post-training techniques, including supervised fine-tuning and reinforcement learning, applied to a Qwen2 base model. The models were evaluated on the BIRD development set, showing significant improvements in execution accuracy, with the 0.5B model reaching 56.87% execution accuracy (EX) and the 1.5B model achieving 67.08% EX.
+
+Code and datasets are available on the official GitHub repository: [https://github.com/alibaba/SLM-SQL](https://github.com/alibaba/SLM-SQL).
+
+## Usage
+
+You can use the model with the Hugging Face `transformers` library. Below is a basic example demonstrating how to load the model and generate a SQL query from a natural language question.
+
+```python
+from transformers import AutoModelForCausalLM, AutoTokenizer
+import torch
+
+# Replace "your_model_id_here" with the specific model ID you want to use (e.g., SLM-SQL/slm-sql-1.5b)
+model_name = "your_model_id_here"
+
+tokenizer = AutoTokenizer.from_pretrained(model_name)
+model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16, device_map="auto")
+
+# Example Text-to-SQL prompt
+question = "List the names of all employees who joined after 2020."
+schema = "CREATE TABLE employees (employee_id INT, name TEXT, join_date DATE);"
+
+# Construct the prompt according to the model's expected format (this may vary)
+# A common format for Text-to-SQL is providing the database schema and the natural language question.
+# Line breaks are written as "\n" escapes so that every string literal closes on its own line.
+prompt = (
+    "Given the database schema below, generate a SQL query for the following question:\n"
+    "\n"
+    f"Schema: {schema}\n"
+    f"Question: {question}\n"
+    "\n"
+    "SQL Query:"
+)
+
+inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
+
+# Generate the SQL query
+# Adjust generation parameters like max_new_tokens, temperature, top_p as needed
+outputs = model.generate(
+    **inputs,
+    max_new_tokens=256,  # Set an appropriate length for SQL queries
+    temperature=0.7,
+    top_p=0.9,
+    do_sample=True,
+    eos_token_id=tokenizer.eos_token_id
+)
+
+# Decode the generated output, skipping the input prompt and special tokens
+generated_sql = tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True).strip()
+
+print(f"Question: {question}")
+print(f"Generated SQL: {generated_sql}")
+
+# Example output might look like:
+# SELECT name FROM employees WHERE join_date > '2020-12-31';
+```
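
A possible follow-up check, not part of this diff: the SQL produced by the usage example can be executed against a throwaway database to confirm that it parses and returns the expected rows. The sketch below assumes SQLite through Python's standard `sqlite3` module and reuses the `employees` schema from the README. The sample rows, the reference query, and the helper names `run_query` and `same_result` are hypothetical and exist only for illustration. Comparing result sets this way is in the spirit of execution accuracy (EX), but it is a simplification and not the BIRD evaluation script.

```python
import sqlite3

# Schema copied from the README example; the rows below are made-up sample data.
SCHEMA = "CREATE TABLE employees (employee_id INT, name TEXT, join_date DATE);"
SAMPLE_ROWS = [
    (1, "Ada", "2019-05-01"),
    (2, "Grace", "2021-03-15"),
    (3, "Linus", "2020-12-31"),
]


def run_query(sql: str) -> list:
    """Run `sql` against a fresh in-memory SQLite database and return all rows."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(SCHEMA)
        conn.executemany("INSERT INTO employees VALUES (?, ?, ?)", SAMPLE_ROWS)
        return conn.execute(sql).fetchall()
    finally:
        conn.close()


def same_result(predicted_sql: str, reference_sql: str) -> bool:
    """Return True if both queries yield the same set of rows (order and duplicates ignored)."""
    try:
        predicted = run_query(predicted_sql)
    except sqlite3.Error:
        # SQL that fails to parse or execute counts as a mismatch.
        return False
    return set(predicted) == set(run_query(reference_sql))


# The example output from the README, used here in place of real model output.
generated_sql = "SELECT name FROM employees WHERE join_date > '2020-12-31';"
# Hypothetical reference query expressing the same intent.
reference_sql = "SELECT name FROM employees WHERE join_date >= '2021-01-01';"

print(run_query(generated_sql))
print(same_result(generated_sql, reference_sql))
```

In practice `generated_sql` would come from the decoding step of the usage example, and the database would be the one the question was asked about rather than an in-memory stand-in.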