Enhance model card: Add metadata, paper link, GitHub, and usage example

#1
by nielsr HF Staff - opened
Files changed (1)
  1. README.md +71 -3
README.md CHANGED
@@ -1,3 +1,71 @@
- ---
- license: apache-2.0
- ---
+ ---
+ license: apache-2.0
+ pipeline_tag: text-generation
+ library_name: transformers
+ tags:
+ - text-to-sql
+ - sql
+ - qwen2
+ - small-language-model
+ ---
+
+ # SLM-SQL: An Exploration of Small Language Models for Text-to-SQL
+
+ This repository contains **SLM-SQL**, a small language model (SLM) designed for Text-to-SQL tasks, as presented in the paper [SLM-SQL: An Exploration of Small Language Models for Text-to-SQL](https://huggingface.co/papers/2507.22478).
+
+ SLM-SQL aims to narrow the performance gap between large language models and smaller models in SQL generation, while keeping the faster inference and suitability for edge deployment that small models offer. It does so through post-training techniques, including supervised fine-tuning and reinforcement learning, applied to a Qwen2 base model. On the BIRD development set, the 0.5B model reaches 56.87% execution accuracy (EX) and the 1.5B model reaches 67.08% EX.
+
+ Code and datasets are available on the official GitHub repository: [https://github.com/alibaba/SLM-SQL](https://github.com/alibaba/SLM-SQL).
+
+ ## Usage
+
+ You can use the model with the Hugging Face `transformers` library. Below is a basic example demonstrating how to load the model and generate a SQL query from a natural language question.
+
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+ import torch
+
+ # Replace "your_model_id_here" with the specific model ID you want to use (e.g., SLM-SQL/slm-sql-1.5b)
+ model_name = "your_model_id_here"
+
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
+ model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16, device_map="auto")
+
+ # Example Text-to-SQL prompt
+ question = "List the names of all employees who joined after 2020."
+ schema = "CREATE TABLE employees (employee_id INT, name TEXT, join_date DATE);"
+
+ # Construct the prompt according to the model's expected format (this may vary)
+ # A common format for Text-to-SQL is providing the database schema and the natural language question.
+ prompt = (
+     "Given the database schema below, "
+     "generate a SQL query for the following question:\n"
+     "\n"
+     f"Schema: {schema}\n"
+     f"Question: {question}\n"
+     "\n"
+     "SQL Query:"
+ )
+
+ inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
+
+ # Generate the SQL query
+ # Adjust generation parameters like max_new_tokens, temperature, top_p as needed
+ outputs = model.generate(
+     **inputs,
+     max_new_tokens=256,  # Set an appropriate length for SQL queries
+     temperature=0.7,
+     top_p=0.9,
+     do_sample=True,
+     eos_token_id=tokenizer.eos_token_id,
+ )
+
+ # Decode the generated output, skipping the input prompt and special tokens
+ generated_sql = tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True).strip()
+
+ print(f"Question: {question}")
+ print(f"Generated SQL: {generated_sql}")
+
+ # Example output might look like:
+ # SELECT name FROM employees WHERE join_date > '2020-12-31';
+ ```
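
The model card reports execution accuracy (EX), which scores a generated query by the rows it returns rather than by its text. The following is a minimal sketch of that idea using Python's standard `sqlite3` module. It is not the official BIRD evaluation script. The sample rows, the `gold_sql` reference query, and the helper names `execute` and `same_result` are all hypothetical and chosen only for illustration.

```python
import sqlite3
from collections import Counter


def execute(conn, sql):
    """Run a query and return its rows, or None if SQLite rejects it."""
    try:
        return conn.execute(sql).fetchall()
    except sqlite3.Error:
        return None


def same_result(conn, predicted_sql, gold_sql):
    """Compare two queries by the multiset of rows they return (row order ignored)."""
    predicted = execute(conn, predicted_sql)
    gold = execute(conn, gold_sql)
    if predicted is None or gold is None:
        return False
    return Counter(predicted) == Counter(gold)


# Hypothetical in-memory database built from the example schema in the README.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (employee_id INT, name TEXT, join_date DATE);")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [(1, "Ada", "2019-05-01"), (2, "Grace", "2021-03-15"), (3, "Linus", "2020-12-31")],
)

# Stand-in for model output, plus a hypothetical reference query to compare against.
generated_sql = "SELECT name FROM employees WHERE join_date > '2020-12-31';"
gold_sql = "SELECT name FROM employees WHERE join_date >= '2021-01-01';"

print(execute(conn, generated_sql))
print(same_result(conn, generated_sql, gold_sql))
```

Running a generated query against a scratch database like this also catches output that is not valid SQL for the given schema, since `execute` returns `None` in that case instead of raising.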