Update README.md
README.md
CHANGED
---
pipeline_tag: object-detection
---

# YOLOv12‑x Object Detector

**YOLOv12‑x**, the attention‑centric, real‑time object detection model introduced by Tian et al. and supported by the Ultralytics package, is now available on Hugging Face.

## 🧠 Model Description

YOLOv12‑x builds on the YOLO12 family by combining **Area Attention** and **R‑ELAN** modules to deliver state‑of‑the‑art detection accuracy with fewer parameters and FLOPs. Optional **FlashAttention** integration further reduces memory access overhead and boosts inference speed on modern NVIDIA GPUs.

---

## ⚙️ Requirements

* **Python** ≥ 3.8
* **PyTorch** ≥ 1.10 (CUDA‑enabled)
* **CUDA** ≥ 11.2 compatible GPU
* **Optional**: FlashAttention (install via `pip install flash-attn`)
* **Recommended GPU architectures** for FlashAttention support:
  * Turing (e.g. T4, Quadro RTX)
  * Ampere (RTX 30 series, A30/A40/A100)
  * Ada Lovelace (RTX 40 series)
  * Hopper (H100/H200)
* **System specs**: ≥ 8 GB RAM, ≥ 50 GB free disk

---

## 🚀 Installation & Usage

```bash
pip install ultralytics
# (Optional for FlashAttention)
pip install flash-attn
```

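After installing, it can help to confirm that the environment meets the requirements listed above. The following is a minimal sketch, not part of the original README: the helper names (`parse_version`, `flash_attention_arch`, `check_environment`) are hypothetical, and the compute-capability table is an assumption based on NVIDIA's published architecture numbering (Turing 7.5, Ampere 8.0/8.6, Ada Lovelace 8.9, Hopper 9.0). PyTorch is imported lazily so the script still runs when it is not installed.

```python
import sys
from typing import Optional, Tuple

# Minimum versions taken from the Requirements section.
MIN_PYTHON = (3, 8)
MIN_TORCH = (1, 10)

# Assumed mapping from CUDA compute capability to the architectures
# that the Requirements section recommends for FlashAttention.
FLASH_ATTENTION_ARCHS = {
    (7, 5): "Turing",
    (8, 0): "Ampere",
    (8, 6): "Ampere",
    (8, 9): "Ada Lovelace",
    (9, 0): "Hopper",
}


def parse_version(text: str) -> Tuple[int, int]:
    """Reduce a version string such as '2.3.1+cu121' to (major, minor)."""
    major, minor = text.split("+")[0].split(".")[:2]
    return int(major), int(minor)


def flash_attention_arch(capability: Tuple[int, int]) -> Optional[str]:
    """Return the architecture name for a compute capability, or None."""
    return FLASH_ATTENTION_ARCHS.get(tuple(capability))


def check_environment() -> dict:
    """Collect a small report on Python, PyTorch, and GPU suitability."""
    report = {
        "python_ok": sys.version_info[:2] >= MIN_PYTHON,
        "torch_ok": False,
        "cuda_available": False,
        "flash_attention_arch": None,
    }
    try:
        import torch  # optional: the report stays conservative without it
    except ImportError:
        return report
    report["torch_ok"] = parse_version(torch.__version__) >= MIN_TORCH
    report["cuda_available"] = torch.cuda.is_available()
    if report["cuda_available"]:
        capability = torch.cuda.get_device_capability(0)
        report["flash_attention_arch"] = flash_attention_arch(capability)
    return report


if __name__ == "__main__":
    print(check_environment())
```

A `flash_attention_arch` value of `None` only means the first GPU is not in the recommended list; the model itself should still run without FlashAttention.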
**Python example**:

```python
from ultralytics import YOLO

# Load a COCO-pretrained YOLO12x model
model = YOLO("yolo12x.pt")

# Train the model on the COCO8 example dataset for 100 epochs
results = model.train(data="coco8.yaml", epochs=100, imgsz=640)

# Run inference with the YOLO12x model on the 'bus.jpg' image
results = model("path/to/bus.jpg")
```

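The inference call above returns a list of result objects, and a common next step is turning the raw boxes into readable records. The sketch below shows one way to do that on plain Python lists so that it runs without a GPU or model weights. `summarize_detections` and its `min_conf` parameter are hypothetical names introduced for illustration, not part of the Ultralytics API.

```python
from typing import Dict, List, Sequence


def summarize_detections(
    boxes_xyxy: Sequence[Sequence[float]],
    confidences: Sequence[float],
    class_ids: Sequence[float],
    names: Dict[int, str],
    min_conf: float = 0.25,
) -> List[dict]:
    """Convert parallel box/confidence/class lists into sorted records.

    Boxes are expected as [x1, y1, x2, y2] pixel coordinates.
    Detections below `min_conf` are dropped.
    """
    detections = []
    for box, conf, cls_id in zip(boxes_xyxy, confidences, class_ids):
        if conf < min_conf:
            continue
        detections.append(
            {
                "label": names[int(cls_id)],  # class ids may arrive as floats
                "confidence": round(float(conf), 3),
                "box": [round(float(v), 1) for v in box],
            }
        )
    # Highest-confidence detections first.
    detections.sort(key=lambda d: d["confidence"], reverse=True)
    return detections
```

With Ultralytics, the inputs can be taken from the first result: `r = results[0]`, then `summarize_detections(r.boxes.xyxy.tolist(), r.boxes.conf.tolist(), r.boxes.cls.tolist(), r.names)`.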
**CLI example**:

```bash
# Run detection with the yolov12x.pt weights file shipped in this repository
yolo detect predict model=yolov12x.pt source=test.jpg imgsz=640 conf=0.25
```

---

## 📊 Performance & Use Cases

Benchmarked on COCO val2017 at 640 × 640 resolution on an NVIDIA T4 GPU:

| Model    | mAP@0.5:0.95 (%) | Latency (ms) | Params (M) | FLOPs (B) |
| -------- | ---------------- | ------------ | ---------- | --------- |
| YOLO12‑x | 55.2             | 11.79        | 59.1       | 199.0     |

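To put the table's numbers in practical terms, the latency and parameter count can be converted into an approximate frame rate and weight size. This is a back-of-the-envelope sketch under stated assumptions: the latency is taken to be per image at batch size 1, pre- and post-processing time is ignored, and weights are assumed to be stored as 4-byte (FP32) or 2-byte (FP16) values. The function names are hypothetical.

```python
# Figures copied from the benchmark table above.
LATENCY_MS = 11.79   # per-image latency on an NVIDIA T4
PARAMS_M = 59.1      # parameters, in millions


def frames_per_second(latency_ms: float) -> float:
    """Single-stream throughput implied by a per-image latency."""
    return 1000.0 / latency_ms


def weight_size_mb(params_millions: float, bytes_per_param: int = 4) -> float:
    """Approximate weight size in MB (10^6 bytes); defaults to FP32."""
    return params_millions * bytes_per_param


if __name__ == "__main__":
    print(f"Throughput: {frames_per_second(LATENCY_MS):.1f} FPS")
    print(f"FP32 weights: {weight_size_mb(PARAMS_M):.1f} MB")
    print(f"FP16 weights: {weight_size_mb(PARAMS_M, 2):.1f} MB")
```

Under these assumptions the model runs at roughly 85 frames per second on a T4, with about 236 MB of FP32 weights. Actual end-to-end throughput will depend on the runtime and on image loading and post-processing.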
YOLOv12‑x excels in scenarios demanding both high accuracy and near‑real‑time throughput:

* Autonomous vehicles
* Industrial inspection
* Surveillance & security systems

---

## 📚 References

```bibtex
@article{tian2025yolov12,
  title={YOLOv12: Attention-Centric Real-Time Object Detectors},
  author={Tian, Yunjie and Ye, Qixiang and Doermann, David},
  journal={arXiv preprint arXiv:2502.12524},
  year={2025}
}
```

---

## 📝 Summary

| Feature            | Details                                       |
| ------------------ | --------------------------------------------- |
| **Model**          | YOLOv12‑x                                     |
| **Architecture**   | Area Attention + R‑ELAN                       |
| **FlashAttention** | Optional (GPU‑accelerated)                    |
| **Requirements**   | Python ≥ 3.8, PyTorch ≥ 1.10, CUDA ≥ 11.2     |
| **Use Cases**      | Real‑time object detection with high accuracy |

```plaintext
Files:
├── yolov12x.pt   # Trained model weights
└── README.md     # This file
```