# CySent-SmolLM3-3B

CySent-SmolLM3-3B is a fine-tuned version of HuggingFaceTB/SmolLM3-3B, adapted for cybersecurity instruction-following tasks. It was trained on a 20,000-sample subset of the Trendyol/Trendyol-Cybersecurity-Instruction-Tuning-Dataset and is intended to serve as a knowledgeable assistant across a wide range of cybersecurity topics.

It achieves the following results on the evaluation set:
- Loss: 0.757
- Mean Token Accuracy: 0.796
## Intended uses
This model is designed to assist with a variety of natural language cybersecurity tasks, including:
- Answering technical questions about security concepts.
- Explaining vulnerabilities, attack vectors, and defense mechanisms.
- Generating simple security-related scripts or commands (e.g., for network analysis or pentesting).
- Summarizing security logs, reports, or articles.
- Assisting in educational settings for cybersecurity students and professionals.
It is intended as a co-pilot or assistant and not as a standalone, automated security tool.
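For illustration, here are a few example prompts matching the task types above, written in the `### Instruction:` / `### Response:` template shown in the usage section below. The instructions themselves are hypothetical examples, not dataset samples:

```python
# Hypothetical example instructions, one per task family listed above.
example_instructions = [
    "Explain the difference between XSS and CSRF.",                       # concept Q&A
    "Describe how a buffer overflow can lead to code execution.",         # vulnerabilities
    "Write an nmap command that scans the top 1000 TCP ports of a host.", # scripts/commands
    "Summarize the following authentication log excerpt: ...",            # summarization
]

# The model was fine-tuned on this instruction/response template:
prompts = [f"### Instruction:\n{q}\n\n### Response:\n" for q in example_instructions]
```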
## Limitations
- Not for Real-Time Threat Detection: This model is not designed for or capable of real-time intrusion detection or automated threat response.
- Potential for Hallucination: Like all language models, it may generate incorrect, outdated, or completely fabricated information. Always verify critical information from authoritative sources.
- Inherited Biases: The model may inherit biases and limitations from its base model (SmolLM3-3B) and the fine-tuning dataset.
- Knowledge Cutoff: The model's knowledge is limited to the data it was trained on and may not be aware of the very latest vulnerabilities or security trends.
- Misuse Potential: The model could potentially be used to generate malicious code or instructions for harmful purposes. Please use it responsibly and ethically.
## How to use
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "RamzyBakir/CySent-SmolLM3-3B"

# Load the model and tokenizer
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Build a prompt in the instruction format used during fine-tuning
prompt = (
    "### Instruction:\n"
    "Explain what a SQL injection attack is and provide a simple example "
    "of a vulnerable code snippet.\n\n"
    "### Response:\n"
)

# Generate a response (use model.device so this works with device_map="auto")
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=250, do_sample=True, temperature=0.7, top_p=0.9)

# Decode and print the result
response = tokenizer.decode(output[0], skip_special_tokens=True)
print(response)
```
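For repeated queries, it can be convenient to wrap the prompt formatting and generation in a small helper. This is a convenience sketch built on the snippet above; the `ask` function is ours, not part of the model's API:

```python
def ask(instruction: str, max_new_tokens: int = 250) -> str:
    """Format `instruction` in the fine-tuning template, generate, and return the answer text."""
    prompt = f"### Instruction:\n{instruction}\n\n### Response:\n"
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=True,
        temperature=0.7,
        top_p=0.9,
    )
    text = tokenizer.decode(output[0], skip_special_tokens=True)
    # Strip the echoed prompt so only the model's answer remains
    return text[len(prompt):] if text.startswith(prompt) else text

print(ask("What is the difference between symmetric and asymmetric encryption?"))
```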
## Training procedure
### Training hyperparameters
The model was fine-tuned using Low-Rank Adaptation (LoRA) with the following configuration:
SFTConfig:
- max_length: 2048
- per_device_train_batch_size: 8
- gradient_accumulation_steps: 2
- learning_rate: 1e-4
- num_train_epochs: 3
- warmup_ratio: 0.1
- weight_decay: 0.01
- optim: adamw_torch
- bf16: True
- eval_strategy: steps
- eval_steps: 200
- save_steps: 200
- metric_for_best_model: eval_loss

LoraConfig:
- r: 16
- lora_alpha: 32
- lora_dropout: 0.05
- task_type: CAUSAL_LM
- target_modules: ["q_proj", "k_proj", "v_proj", "o_proj"]
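For reference, here is a minimal sketch of how this configuration maps onto `trl` and `peft`. The dataset loading, shuffling seed, eval split size, and output path are assumptions (the exact subset selection is not documented here); only the hyperparameter values come from the configuration above:

```python
from datasets import load_dataset
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

# Assumed data preparation: a 20k subset of the Trendyol dataset with a held-out eval split.
ds = load_dataset("Trendyol/Trendyol-Cybersecurity-Instruction-Tuning-Dataset", split="train")
ds = ds.shuffle(seed=42).select(range(20_000)).train_test_split(test_size=0.05)

peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)

args = SFTConfig(
    output_dir="cysent-smollm3-3b",  # placeholder output path
    max_length=2048,
    per_device_train_batch_size=8,
    gradient_accumulation_steps=2,
    learning_rate=1e-4,
    num_train_epochs=3,
    warmup_ratio=0.1,
    weight_decay=0.01,
    optim="adamw_torch",
    bf16=True,
    eval_strategy="steps",
    eval_steps=200,
    save_steps=200,
    metric_for_best_model="eval_loss",
)

trainer = SFTTrainer(
    model="HuggingFaceTB/SmolLM3-3B",
    args=args,
    train_dataset=ds["train"],
    eval_dataset=ds["test"],
    peft_config=peft_config,
)
trainer.train()
```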
### Training results
The model was trained for 3200 steps on a single H200 GPU. The training and validation metrics progressed as follows:
| Step | Training Loss | Validation Loss | Entropy | Num Tokens | Mean Token Accuracy |
|---|---|---|---|---|---|
| 200 | 1.111500 | 1.045437 | 1.002200 | 2,182,437 | 0.740981 |
| 400 | 0.975900 | 0.944684 | 0.917857 | 4,368,626 | 0.759094 |
| 800 | 0.863500 | 0.860705 | 0.862549 | 8,721,104 | 0.775031 |
| 1200 | 0.834900 | 0.816342 | 0.849365 | 13,096,717 | 0.784405 |
| 1600 | 0.792200 | 0.794083 | 0.802182 | 17,452,772 | 0.788403 |
| 2000 | 0.777900 | 0.779576 | 0.790627 | 21,807,624 | 0.791107 |
| 2400 | 0.749800 | 0.771720 | 0.761689 | 26,151,814 | 0.792799 |
| 2800 | 0.747800 | 0.762957 | 0.761588 | 30,504,962 | 0.794528 |
| 3200 | 0.735800 | 0.757395 | 0.757575 | 34,860,059 | 0.795802 |
The model achieved its best performance at the final step, with a validation loss of 0.757 and a mean token accuracy of 0.796.
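To visualize the trend, the loss columns of the table can be plotted directly; the values below are copied (rounded) from the table above, and matplotlib is the only added dependency:

```python
import matplotlib.pyplot as plt

# Values copied from the training results table above
steps      = [200, 400, 800, 1200, 1600, 2000, 2400, 2800, 3200]
train_loss = [1.1115, 0.9759, 0.8635, 0.8349, 0.7922, 0.7779, 0.7498, 0.7478, 0.7358]
val_loss   = [1.0454, 0.9447, 0.8607, 0.8163, 0.7941, 0.7796, 0.7717, 0.7630, 0.7574]

plt.plot(steps, train_loss, label="training loss")
plt.plot(steps, val_loss, label="validation loss")
plt.xlabel("step")
plt.ylabel("loss")
plt.legend()
plt.savefig("cysent_loss_curve.png")
```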