---
license: mit
---

# 🧠 AlphaMed

This is the official model checkpoint for the paper:

**[AlphaMed: Incentivizing Medical Reasoning with minimalist Rule-Based RL](https://www.arxiv.org/abs/2505.17952)**

AlphaMed is a medical large language model trained **without supervised fine-tuning on chain-of-thought (CoT) data**, relying solely on reinforcement learning to elicit step-by-step reasoning in complex medical tasks.

## 🚀 Usage

To use the model, format your input prompt as:
> **Question:** [your medical question here]
> **Please reason step by step, and put the final answer in \boxed{}**

### 🔬 Example

```python
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline

# Load model and tokenizer
model_id = "che111/AlphaMed-3B-instruct-rl"  # Replace with actual repo path
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)

# Format the question using the expected prompt template
prompt = (
    "Question: A 45-year-old patient presents with chest pain radiating to the left arm and elevated troponin levels. "
    "What is the most likely diagnosis?\n"
    "Please reason step by step, and put the final answer in \\boxed{}"
)

# Generate output (greedy decoding, with a generous token budget for long reasoning chains)
max_new_tokens = 8196
output = pipe(prompt, max_new_tokens=max_new_tokens, do_sample=False)[0]["generated_text"]
print(output)
```