🦊 DINOv3 Senko-san Detector (Native Resolution)

🇬🇧 English Description

This is a state-of-the-art Senko-san (Sewayaki Kitsune no Senko-san) character classifier based on the Meta DINOv3 (ViT-Small) architecture. Unlike standard classifiers that resize every image to a fixed square (e.g., 224x224), this model was trained with a True Native Resolution strategy: each image keeps its original aspect ratio and, within limits, its original pixel size.
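
To make the difference concrete, the sketch below computes the input size and patch count for a native-resolution image and compares it with a 224x224 resize. It assumes the 16 px patch size and 2500 px cap that appear in this card's usage code; the function names are illustrative and not part of the model's API.

```python
def native_input_size(width, height, patch_size=16, max_side=2500):
    """Return the (width, height) that would actually be fed to the model."""
    longest = max(width, height)
    if longest > max_side:
        # Downscale proportionally so the longest side equals max_side
        scale = max_side / longest
        width, height = int(width * scale), int(height * scale)
    # Trim each side down to the nearest multiple of the patch size
    width -= width % patch_size
    height -= height % patch_size
    return width, height


def patch_token_count(width, height, patch_size=16):
    """Number of image patches, not counting any special tokens."""
    return (width // patch_size) * (height // patch_size)


w, h = native_input_size(1920, 1080)
print(f"native: {w}x{h} -> {patch_token_count(w, h)} patches")
print(f"fixed:  224x224 -> {patch_token_count(224, 224)} patches")
```

A 1920x1080 frame keeps roughly forty times more patches than the fixed-square resize, which is where the extra detail comes from, at the cost of more compute per image.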

🌟 Key Features

No "Squishing" or Padding: The model analyzes images in their original aspect ratio and resolution, up to 2500 px on the longest side. Larger images are downscaled proportionally, and each side is cropped to a multiple of the 16 px patch size, so at most 15 px are trimmed from the right and bottom edges.
Hard Negative Mining: The model was specifically trained to distinguish Senko from other animal-eared characters (e.g., Texas from Arknights, generic kitsunemimi). It looks at facial features and accessories, not just "fox ears".
High Accuracy: Reaches ~99% confidence on clear images and handles difficult angles robustly.
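
The card does not describe how the hard negatives were collected. One common approach, sketched here purely as an assumption, is to run an earlier model over images labelled Other, keep the ones it scores as Senko with high probability, and add them to the next training round. The function `select_hard_negatives`, the sample layout, and the 0.3 threshold are all hypothetical.

```python
def select_hard_negatives(samples, target="Senko", negative="Other", threshold=0.3):
    """Return (path, score) pairs for negatives the model finds confusing.

    samples: iterable of (path, true_label, probs), where probs maps
    class name -> predicted probability (hypothetical data layout).
    """
    hard = [
        (path, probs[target])
        for path, label, probs in samples
        if label == negative and probs[target] >= threshold
    ]
    # Most confusing examples first
    return sorted(hard, key=lambda item: item[1], reverse=True)


# Made-up predictions, for illustration only
samples = [
    ("texas_01.png", "Other", {"Senko": 0.62, "Shiro": 0.03, "Sora": 0.05, "Other": 0.30}),
    ("landscape.png", "Other", {"Senko": 0.01, "Shiro": 0.01, "Sora": 0.01, "Other": 0.97}),
    ("kitsune_oc.png", "Other", {"Senko": 0.41, "Shiro": 0.20, "Sora": 0.09, "Other": 0.30}),
    ("senko_07.png", "Senko", {"Senko": 0.99, "Shiro": 0.00, "Sora": 0.00, "Other": 0.01}),
]
for path, score in select_hard_negatives(samples):
    print(f"{path}: scored as Senko with p={score:.2f}")
```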

🏷️ Classes

The model distinguishes four classes:

Senko (the title character of The Helpful Fox Senko-san)
Shiro
Sora
Other (everything else)
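
As a rough illustration of how four raw scores turn into a class name and a confidence value, here is a dependency-free sketch. The class order in `CLASSES` is an assumption; the authoritative order is whatever the checkpoint stores under `classes`. The logits are invented for the example.

```python
import math

# Assumed order; the checkpoint's own 'classes' entry is authoritative
CLASSES = ["Senko", "Shiro", "Sora", "Other"]


def softmax(logits):
    """Convert raw scores to probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_prediction(logits, classes=CLASSES):
    """Return (class_name, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return classes[best], probs[best]


label, confidence = top_prediction([4.0, 0.5, 0.2, 1.0])  # invented logits
print(f"Class: {label} | Confidence: {confidence:.4f}")
```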

🇷🇺 Russian Description

This is a state-of-the-art classifier for characters from the anime Sewayaki Kitsune no Senko-san (Russian title: "Заботливая 800-летняя жена"), based on the Meta DINOv3 (ViT-Small) architecture. Unlike typical models that squish images into a 224x224 square, this model was trained with a True Native Resolution strategy.

🌟 Key Features

No "Squishing": The model accepts images at their original resolution and aspect ratio. It sees details as clearly as a person does.
Hard Negatives: The model is trained not to confuse Senko with other fox girls (for example, Texas from Arknights, or any art with animal ears). It looks at facial features and accessories, not just the presence of ears.
High Accuracy: Model confidence reaches 99%+ on valid images.

🏷️ Classes

The model distinguishes 4 categories:

Senko (Senko-san)
Shiro
Sora
Other (everything else, not Senko)

🚀 How to Use

This model is packaged as a standalone .pth checkpoint. To load it, define the wrapper class below, which rebuilds the backbone from the stored config and attaches a linear classification head.

import torch
import torch.nn as nn
from transformers import AutoModel, AutoConfig
from PIL import Image
import numpy as np

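# 1. Define the wrapper class
class StandaloneDino(nn.Module):
    """DINOv3 backbone rebuilt from a stored config, plus a linear classification head."""

    def __init__(self, config_dict, num_classes):
        super().__init__()
        # Rebuild the backbone architecture from the config saved in the checkpoint;
        # the weights are filled in later by load_state_dict().
        config = AutoConfig.for_model(**config_dict)
        self.backbone = AutoModel.from_config(config)
        self.classifier = nn.Linear(config.hidden_size, num_classes)

    def forward(self, x):
        # x: normalized image batch of shape (B, 3, H, W)
        outputs = self.backbone(pixel_values=x)
        # Classify from the backbone's pooled output
        return self.classifier(outputs.pooler_output)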

# 2. Load the model
# Download 'Senko_Detector_DinoV3_v1.pth' from the Files tab
MODEL_FILE = "Senko_Detector_DinoV3_v1.pth"
DEVICE = "cuda" if torch.cuda.is_available() else "cpu"

# The checkpoint is a dict holding the backbone config, class names, weights, and processor settings
checkpoint = torch.load(MODEL_FILE, map_location=DEVICE)
model = StandaloneDino(checkpoint['architecture_config'], len(checkpoint['classes']))
model.load_state_dict(checkpoint['state_dict'])
model.to(DEVICE).eval()

# 3. Preprocessing & Inference
def prepare_native(image, patch_size=16):
    """Keep the native resolution; only cap very large images and trim to a patch multiple."""
    w, h = image.size
    # Downscale only if the longest side exceeds 2500 px
    if max(w, h) > 2500:
        scale = 2500 / max(w, h)
        image = image.resize((int(w * scale), int(h * scale)), Image.Resampling.BICUBIC)
        w, h = image.size

    # Crop the right and bottom edges so both sides are multiples of patch_size
    new_w = w - (w % patch_size)
    new_h = h - (h % patch_size)
    if new_w != w or new_h != h:
        image = image.crop((0, 0, new_w, new_h))
    return image

# Load Image
image_path = "test.jpg"
image = Image.open(image_path).convert("RGB")
image = prepare_native(image)

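# Normalize with the stats stored in the checkpoint (ImageNet values as fallback)
p_conf = checkpoint['processor_config']
mean = np.array(p_conf.get('image_mean', [0.485, 0.456, 0.406]), dtype=np.float32).reshape(1, 1, 3)
std = np.array(p_conf.get('image_std', [0.229, 0.224, 0.225]), dtype=np.float32).reshape(1, 1, 3)

img_arr = np.array(image).astype(np.float32) / 255.0  # HWC, values in [0, 1]
input_tensor = (img_arr - mean) / std  # stays float32 to match the model weights
# HWC -> CHW, then add a batch dimension: (1, 3, H, W)
input_tensor = torch.from_numpy(input_tensor.transpose(2, 0, 1)).unsqueeze(0).to(DEVICE)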

# Predict
with torch.no_grad():
    logits = model(input_tensor)
    probs = torch.softmax(logits, dim=1)
    conf, pred_idx = torch.max(probs, dim=1)

    # .item() converts the one-element tensors to plain Python numbers
    print(f"Class: {checkpoint['classes'][pred_idx.item()]} | Confidence: {conf.item():.4f}")
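
The loading code above reads four keys from the checkpoint: `architecture_config`, `classes`, `state_dict`, and `processor_config`. If a download is truncated or a different file is loaded by mistake, the failure shows up as a bare KeyError. A small check like the following sketch can give a clearer message first. The helper name `missing_checkpoint_keys` is hypothetical, and the key list is inferred from the usage code rather than from any published spec.

```python
# Keys that the usage code reads from the checkpoint dict
REQUIRED_KEYS = ("architecture_config", "classes", "state_dict", "processor_config")


def missing_checkpoint_keys(checkpoint):
    """Return the required keys that are absent from a loaded checkpoint dict."""
    return [key for key in REQUIRED_KEYS if key not in checkpoint]


# Stand-in dict for illustration; in practice pass the result of torch.load(...)
example = {"architecture_config": {}, "classes": ["Senko"], "state_dict": {}}
missing = missing_checkpoint_keys(example)
if missing:
    print("Checkpoint is missing:", ", ".join(missing))
```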

🦊 Want to create your own Senko?

Generate high-quality anime art using our bots and AI tools.

More AI Models & News:


Model tree for AugustLabs/dinov3-senko-detector

Base model: facebook/dinov3-vit7b16-pretrain-lvd1689m
Finetuned: facebook/dinov3-vits16-pretrain-lvd1689m
Finetuned: this model