---
license: llama3.2
base_model: meta-llama/Llama-3.2-11B-Vision-Instruct
datasets:
- QCRI/MemeXplain
language:
- en
- ar
pipeline_tag: image-text-to-text
tags:
- meme-detection
- propaganda
- hate-speech
- multimodal
- vision-language
- explainability
library_name: transformers
---

# MemeIntel: Explainable Detection of Propagandistic and Hateful Memes
MemeIntel is a vision-language model fine-tuned from [`meta-llama/Llama-3.2-11B-Vision-Instruct`](https://huggingface.co/meta-llama/Llama-3.2-11B-Vision-Instruct) to detect propaganda in Arabic memes and hateful content in English memes, producing an explanation alongside each prediction.
## Model Description

MemeIntel addresses the challenge of understanding and moderating complex, context-dependent multimodal content on social media. The model performs:

- **Label detection**: classifies memes into categories (`propaganda` / `not-propaganda` / `not-meme` / `other` for Arabic; `hateful` / `not-hateful` for English)
- **Explanation generation**: provides human-readable explanations for its predictions

The model was trained using a novel multi-stage optimization approach on the [MemeXplain](https://huggingface.co/datasets/QCRI/MemeXplain) dataset.
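
The MemeXplain data can be inspected with 🤗 Datasets. The snippet below is a minimal sketch that lists the dataset's configurations and loads the first one; it does not assume any particular configuration, split, or column names, so check the dataset card for the exact layout.

```python
from datasets import get_dataset_config_names, load_dataset

# List the available configurations of the MemeXplain dataset
configs = get_dataset_config_names("QCRI/MemeXplain")
print(configs)

# Load the first configuration and show its splits and columns
ds = load_dataset("QCRI/MemeXplain", configs[0])
print(ds)
```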
## Usage

```python
import torch
from transformers import MllamaForConditionalGeneration, AutoProcessor
from PIL import Image

# Load the model and processor
model = MllamaForConditionalGeneration.from_pretrained(
    "QCRI/MemeIntel",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
processor = AutoProcessor.from_pretrained("QCRI/MemeIntel")

# Load your meme image
image = Image.open("path/to/meme.jpg")
```
### Arabic Propaganda Meme Detection (Arabic Explanation)

```python
messages = [
    {"role": "system", "content": "You are an expert social media image analyzer specializing in identifying propaganda in Arabic contexts."},
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": "You are an expert social media image analyzer specializing in identifying propaganda in Arabic contexts. I will provide you with Arabic memes and the text extracted from these images. Your task is to classify the image as one of the following: 'propaganda', 'not-propaganda', 'not-meme', or 'other', and provide a brief explanation in Arabic. Start your response with 'Label:' followed by the classification label, then on a new line begin with 'Explanation:' and briefly state your reasoning. Text extracted: لما يقولي انتي مالكيش عزيز\nاعز ما ليا البطاطس المقلية"}
    ]}
]

input_text = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(image, input_text, add_special_tokens=False, return_tensors="pt").to(model.device)

output = model.generate(**inputs, max_new_tokens=256)
print(processor.decode(output[0], skip_special_tokens=True))
```
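
Note that `generate` returns the prompt tokens together with the newly generated answer, so the decoded string above contains the full chat transcript. To print only the model's answer, slice off the prompt first; this is a generic 🤗 Transformers pattern and applies equally to the other examples below.

```python
# Decode only the newly generated tokens, skipping the echoed prompt
generated = output[0][inputs["input_ids"].shape[-1]:]
print(processor.decode(generated, skip_special_tokens=True))
```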
### Arabic Propaganda Meme Detection (English Explanation)

```python
messages = [
    {"role": "system", "content": "You are an expert social media image analyzer specializing in identifying propaganda in Arabic contexts."},
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": "You are an expert social media image analyzer specializing in identifying propaganda in Arabic contexts. I will provide you with Arabic memes and the text extracted from these images. Your task is to classify the image as one of the following: 'propaganda', 'not-propaganda', 'not-meme', or 'other', and provide a brief explanation in English. Start your response with 'Label:' followed by the classification label, then on a new line begin with 'Explanation:' and briefly state your reasoning. Text extracted: وأنا أبكي\n٣\nانت تتمنى وانا البي\n{7"}
    ]}
]

input_text = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(image, input_text, add_special_tokens=False, return_tensors="pt").to(model.device)

output = model.generate(**inputs, max_new_tokens=256)
print(processor.decode(output[0], skip_special_tokens=True))
```
### English Hateful Meme Detection

```python
messages = [
    {"role": "system", "content": "You are an expert social media image analyzer specializing in identifying hateful content in memes"},
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": "I will provide you with memes and the text extracted from these images. Your task is to classify the image as one of the following: 'hateful' or 'not-hateful' and provide a brief explanation. Start your response with 'Label:' followed by the classification label, then on a new line begin with 'Explanation:' and briefly state your reasoning. Text extracted: bows here, bows there, bows everywhere"}
    ]}
]

input_text = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(image, input_text, add_special_tokens=False, return_tensors="pt").to(model.device)

output = model.generate(**inputs, max_new_tokens=256)
print(processor.decode(output[0], skip_special_tokens=True))
```
## Prompt Templates

### Arabic Meme (Arabic Explanation)

```
System: You are an expert social media image analyzer specializing in identifying propaganda in Arabic contexts.

User: You are an expert social media image analyzer specializing in identifying propaganda in Arabic contexts. I will provide you with Arabic memes and the text extracted from these images. Your task is to classify the image as one of the following: 'propaganda', 'not-propaganda', 'not-meme', or 'other', and provide a brief explanation in Arabic. Start your response with 'Label:' followed by the classification label, then on a new line begin with 'Explanation:' and briefly state your reasoning. Text extracted: {OCR_TEXT}
```

### Arabic Meme (English Explanation)

```
System: You are an expert social media image analyzer specializing in identifying propaganda in Arabic contexts.

User: You are an expert social media image analyzer specializing in identifying propaganda in Arabic contexts. I will provide you with Arabic memes and the text extracted from these images. Your task is to classify the image as one of the following: 'propaganda', 'not-propaganda', 'not-meme', or 'other', and provide a brief explanation in English. Start your response with 'Label:' followed by the classification label, then on a new line begin with 'Explanation:' and briefly state your reasoning. Text extracted: {OCR_TEXT}
```

### English Hateful Meme

```
System: You are an expert social media image analyzer specializing in identifying hateful content in memes

User: I will provide you with memes and the text extracted from these images. Your task is to classify the image as one of the following: 'hateful' or 'not-hateful' and provide a brief explanation. Start your response with 'Label:' followed by the classification label, then on a new line begin with 'Explanation:' and briefly state your reasoning. Text extracted: {OCR_TEXT}
```
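
The three templates differ only in the system prompt, label set, and explanation language, so they can be assembled from the OCR text with a small helper. The `build_messages` function below is a hypothetical convenience wrapper, not part of the released code.

```python
# Hypothetical helper (not part of the released code) that builds the chat
# messages for either task from OCR-extracted text.
ARABIC_SYSTEM = (
    "You are an expert social media image analyzer specializing in "
    "identifying propaganda in Arabic contexts."
)
ENGLISH_SYSTEM = (
    "You are an expert social media image analyzer specializing in "
    "identifying hateful content in memes"
)

def build_messages(ocr_text, task="arabic_propaganda", explanation_lang="Arabic"):
    if task == "arabic_propaganda":
        system = ARABIC_SYSTEM
        user = (
            f"{ARABIC_SYSTEM} I will provide you with Arabic memes and the text "
            "extracted from these images. Your task is to classify the image as one of "
            "the following: 'propaganda', 'not-propaganda', 'not-meme', or 'other', and "
            f"provide a brief explanation in {explanation_lang}. "
        )
    else:  # English hateful-meme task
        system = ENGLISH_SYSTEM
        user = (
            "I will provide you with memes and the text extracted from these images. "
            "Your task is to classify the image as one of the following: 'hateful' or "
            "'not-hateful' and provide a brief explanation. "
        )
    user += (
        "Start your response with 'Label:' followed by the classification label, "
        "then on a new line begin with 'Explanation:' and briefly state your reasoning. "
        f"Text extracted: {ocr_text}"
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": [{"type": "image"}, {"type": "text", "text": user}]},
    ]
```

For example, `build_messages("bows here, bows there, bows everywhere", task="english_hateful")` reproduces the English hateful-meme prompt shown above.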
## Expected Output Format

The model responds in the following format:

```
Label: [classification_label]
Explanation: [reasoning for the classification]
```
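
Since the response starts with a `Label:` line followed by an `Explanation:` block, it can be split with a few lines of string handling. The parser below is an illustrative sketch that assumes the model adheres to this format.

```python
def parse_response(text: str) -> tuple[str, str]:
    """Split a 'Label: ...' / 'Explanation: ...' response into its two parts."""
    label_part, _, explanation = text.partition("Explanation:")
    label = label_part.replace("Label:", "").strip()
    return label, explanation.strip()

# Example with a placeholder response
label, explanation = parse_response("Label: not-hateful\nExplanation: <model reasoning>")
print(label)        # not-hateful
print(explanation)  # <model reasoning>
```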
## Training

- **Base model**: [`meta-llama/Llama-3.2-11B-Vision-Instruct`](https://huggingface.co/meta-llama/Llama-3.2-11B-Vision-Instruct)
- **Training dataset**: [QCRI/MemeXplain](https://huggingface.co/datasets/QCRI/MemeXplain)
- **Training method**: multi-stage optimization approach
## Performance

MemeIntel achieves state-of-the-art results:

- **ArMeme (Arabic propaganda)**: ~3% absolute improvement over the previous SOTA
- **Hateful Memes (English)**: ~7% absolute improvement over the previous SOTA
## Citation

If you use this model, please cite:
```bibtex
@inproceedings{kmainasi-etal-2025-memeintel,
    title = "{M}eme{I}ntel: Explainable Detection of Propagandistic and Hateful Memes",
    author = "Kmainasi, Mohamed Bayan and
      Hasnat, Abul and
      Hasan, Md Arid and
      Shahroor, Ali Ezzat and
      Alam, Firoj",
    editor = "Christodoulopoulos, Christos and
      Chakraborty, Tanmoy and
      Rose, Carolyn and
      Peng, Violet",
    booktitle = "Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing",
    month = nov,
    year = "2025",
    address = "Suzhou, China",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2025.emnlp-main.1539/",
    doi = "10.18653/v1/2025.emnlp-main.1539",
    pages = "30263--30279",
    ISBN = "979-8-89176-332-6",
}
```
## License

This model is released under the Llama 3.2 Community License.
## Authors

- Mohamed Bayan Kmainasi
- Abul Hasnat
- Md Arid Hasan
- Ali Ezzat Shahroor
- Firoj Alam
Qatar Computing Research Institute (QCRI), Hamad Bin Khalifa University