---
base_model: unsloth/qwen3-4b-base-unsloth-bnb-4bit
library_name: peft
model_name: Qwen3-4B-AI-Review-Detector
tags:
- base_model:adapter:unsloth/qwen3-4b-base-unsloth-bnb-4bit
- lora
- sft
- transformers
- trl
- unsloth
- text-classification
- korean
pipeline_tag: text-generation
license: apache-2.0
language:
- ko
---

# Qwen3-4B-AI-Review-Detector

This model is a fine-tuned version of [unsloth/qwen3-4b-base-unsloth-bnb-4bit](https://huggingface.co/unsloth/qwen3-4b-base-unsloth-bnb-4bit) that detects whether a Korean **cosmetics** review is **Human-Written (HWR)** or **LLM-Generated (LGR)**. It was trained with [TRL](https://github.com/huggingface/trl) and [Unsloth](https://github.com/unslothai/unsloth) for efficient fine-tuning.

## Model Details

* **Base Model**: `unsloth/qwen3-4b-base-unsloth-bnb-4bit`
* **Task**: Binary Classification (via Text Generation)
  * **Class 0**: HWR (Human-Written Review)
  * **Class 1**: LGR (LLM-Generated Review)
* **Language**: Korean
* **Domain**: Cosmetics / Beauty
* **Training Method**: LoRA (Low-Rank Adaptation)

## Quick start

You can use the `pipeline` from the `transformers` library to run inference.

```python
from transformers import pipeline

# Load the model from the Hugging Face Hub
model_id = "jedimark/Qwen3-4B-AI-Review-Detector"
generator = pipeline("text-generation", model=model_id, device_map="auto")

# Example review
review_text = "이 제품 정말 좋아요! 배송도 빠르고 품질도 만족합니다."

# Construct the prompt
prompt = f"""다음 리뷰 텍스트가 사람이 작성한 것인지(Human Written) LLM이 생성한 것인지 판단하여 분류하세요.

{review_text}

Classify this review into one of the following:
class 0: HWR (Human Written Review)
class 1: LGR (LLM Generated Review)

SOLUTION
The correct answer is: class"""

# Run inference
output = generator(prompt, max_new_tokens=1, return_full_text=False)[0]
print(f"Predicted Class: {output['generated_text']}")
# Output: 0 (Human Written) or 1 (LLM Generated)
```

## Training procedure

This model was trained using **SFT (Supervised Fine-Tuning)** with the following configuration:

* **Dataset**: Custom dataset of Korean **cosmetics** reviews labeled as Human-Written (0) or LLM-Generated (1).
* **Quantization**: 4-bit quantization using `bitsandbytes` (BnB) for memory efficiency.
* **LoRA Configuration**:
  * Rank (r): 16
  * Alpha: 16
  * Target Modules: `q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj` (all linear layers)
* **Optimization**: Trained with Unsloth for faster training and lower memory usage.

### Framework versions

- Unsloth: 2024.x
- PEFT: 0.18.0
- TRL: 0.24.0
- Transformers: 4.57.2
- Pytorch: 2.9.0
- Datasets: 4.3.0
- Tokenizers: 0.22.1
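
### Training setup sketch

For reference, the snippet below sketches how the LoRA configuration described above maps onto an Unsloth + TRL training script. It is a minimal illustration rather than the exact training code: the dataset name (`your_username/korean-cosmetics-reviews`), sequence length, batch size, and epoch count are assumptions, while the rank, alpha, and target modules follow the values listed under "Training procedure".

```python
# Minimal sketch of the LoRA SFT setup described above.
# Values marked "assumption" are illustrative, not the original training config.
from unsloth import FastLanguageModel
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Load the 4-bit quantized base model through Unsloth
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/qwen3-4b-base-unsloth-bnb-4bit",
    max_seq_length=1024,  # assumption
    load_in_4bit=True,
)

# Attach LoRA adapters with the configuration from the model card
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

# Hypothetical dataset with a "text" column containing prompt + class label
dataset = load_dataset("your_username/korean-cosmetics-reviews", split="train")

trainer = SFTTrainer(
    model=model,
    processing_class=tokenizer,
    train_dataset=dataset,
    args=SFTConfig(
        output_dir="Qwen3-4B-AI-Review-Detector",
        per_device_train_batch_size=4,  # assumption
        num_train_epochs=1,             # assumption
    ),
)
trainer.train()
```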