---
license: llama3.2
base_model: meta-llama/Llama-3.2-11B-Vision-Instruct
datasets:
- QCRI/MemeXplain
language:
- en
- ar
pipeline_tag: image-text-to-text
tags:
- meme-detection
- propaganda
- hate-speech
- multimodal
- vision-language
- explainability
library_name: transformers
---

# MemeIntel: Explainable Detection of Propagandistic and Hateful Memes

MemeIntel is a Vision-Language Model fine-tuned from [meta-llama/Llama-3.2-11B-Vision-Instruct](https://huggingface.co/meta-llama/Llama-3.2-11B-Vision-Instruct) for detecting propaganda in Arabic memes and hateful content in English memes, with explainable reasoning.

## Model Description

MemeIntel addresses the challenge of understanding and moderating complex, context-dependent multimodal content on social media. The model performs two tasks:

- **Label Detection**: Classifies memes into categories (propaganda/not-propaganda/not-meme/other for Arabic; hateful/not-hateful for English)
- **Explanation Generation**: Provides human-readable explanations for its predictions

The model was trained using a novel multi-stage optimization approach on the [MemeXplain](https://huggingface.co/datasets/QCRI/MemeXplain) dataset.

## Usage

```python
import torch
from transformers import MllamaForConditionalGeneration, AutoProcessor
from PIL import Image

# Load model and processor
model = MllamaForConditionalGeneration.from_pretrained(
    "QCRI/MemeIntel",
    torch_dtype=torch.bfloat16,
    device_map="auto"
)
processor = AutoProcessor.from_pretrained("QCRI/MemeIntel")

# Load your meme image
image = Image.open("path/to/meme.jpg")
```

The examples below reuse the `model`, `processor`, and `image` objects loaded above.

### Arabic Propaganda Meme Detection (Arabic Explanation)

```python
messages = [
    {"role": "system", "content": "You are an expert social media image analyzer specializing in identifying propaganda in Arabic contexts."},
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": "You are an expert social media image analyzer specializing in identifying propaganda in Arabic contexts. I will provide you with Arabic memes and the text extracted from these images. Your task is to classify the image as one of the following: 'propaganda', 'not-propaganda', 'not-meme', or 'other', and provide a brief explanation in Arabic. Start your response with 'Label:' followed by the classification label, then on a new line begin with 'Explanation:' and briefly state your reasoning. Text extracted: لما يقولي انتي مالكيش عزيز\nاعز ما ليا البطاطس المقلية"}
    ]}
]

# Render the chat template, then tokenize the prompt together with the image
input_text = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(image, input_text, add_special_tokens=False, return_tensors="pt").to(model.device)

# Generate and decode the label and explanation
output = model.generate(**inputs, max_new_tokens=256)
print(processor.decode(output[0], skip_special_tokens=True))
```

### Arabic Propaganda Meme Detection (English Explanation)

```python
messages = [
    {"role": "system", "content": "You are an expert social media image analyzer specializing in identifying propaganda in Arabic contexts."},
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": "You are an expert social media image analyzer specializing in identifying propaganda in Arabic contexts. I will provide you with Arabic memes and the text extracted from these images. Your task is to classify the image as one of the following: 'propaganda', 'not-propaganda', 'not-meme', or 'other', and provide a brief explanation in English. Start your response with 'Label:' followed by the classification label, then on a new line begin with 'Explanation:' and briefly state your reasoning. Text extracted: وأنا أبكي\n٣\nانت تتمنى وانا البي\n{7"}
    ]}
]

# Render the chat template, then tokenize the prompt together with the image
input_text = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(image, input_text, add_special_tokens=False, return_tensors="pt").to(model.device)

# Generate and decode the label and explanation
output = model.generate(**inputs, max_new_tokens=256)
print(processor.decode(output[0], skip_special_tokens=True))
```

### English Hateful Meme Detection

```python
messages = [
    {"role": "system", "content": "You are an expert social media image analyzer specializing in identifying hateful content in memes"},
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": "I will provide you with memes and the text extracted from these images. Your task is to classify the image as one of the following: 'hateful' or 'not-hateful' and provide a brief explanation. Start your response with 'Label:' followed by the classification label, then on a new line begin with 'Explanation:' and briefly state your reasoning. Text extracted: bows here, bows there, bows everywhere"}
    ]}
]

# Render the chat template, then tokenize the prompt together with the image
input_text = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(image, input_text, add_special_tokens=False, return_tensors="pt").to(model.device)

# Generate and decode the label and explanation
output = model.generate(**inputs, max_new_tokens=256)
print(processor.decode(output[0], skip_special_tokens=True))
```

## Prompt Templates

### Arabic Meme (Arabic Explanation)

```
System: You are an expert social media image analyzer specializing in identifying propaganda in Arabic contexts.

User: You are an expert social media image analyzer specializing in identifying propaganda in Arabic contexts. I will provide you with Arabic memes and the text extracted from these images. Your task is to classify the image as one of the following: 'propaganda', 'not-propaganda', 'not-meme', or 'other', and provide a brief explanation in Arabic. Start your response with 'Label:' followed by the classification label, then on a new line begin with 'Explanation:' and briefly state your reasoning. Text extracted: {OCR_TEXT}
```

### Arabic Meme (English Explanation)

```
System: You are an expert social media image analyzer specializing in identifying propaganda in Arabic contexts.

User: You are an expert social media image analyzer specializing in identifying propaganda in Arabic contexts. I will provide you with Arabic memes and the text extracted from these images. Your task is to classify the image as one of the following: 'propaganda', 'not-propaganda', 'not-meme', or 'other', and provide a brief explanation in English. Start your response with 'Label:' followed by the classification label, then on a new line begin with 'Explanation:' and briefly state your reasoning. Text extracted: {OCR_TEXT}
```

### English Hateful Meme

```
System: You are an expert social media image analyzer specializing in identifying hateful content in memes

User: I will provide you with memes and the text extracted from these images. Your task is to classify the image as one of the following: 'hateful' or 'not-hateful' and provide a brief explanation. Start your response with 'Label:' followed by the classification label, then on a new line begin with 'Explanation:' and briefly state your reasoning. Text extracted: {OCR_TEXT}
```

## Expected Output Format

The model responds in the following format:

```
Label: [classification_label]
Explanation: [reasoning for the classification]
```

## Training

- **Base Model**: [meta-llama/Llama-3.2-11B-Vision-Instruct](https://huggingface.co/meta-llama/Llama-3.2-11B-Vision-Instruct)
- **Training Dataset**: [QCRI/MemeXplain](https://huggingface.co/datasets/QCRI/MemeXplain)
- **Training Method**: Multi-stage optimization approach

## Performance

MemeIntel achieves state-of-the-art results:

- **ArMeme (Arabic Propaganda)**: ~3% absolute improvement over previous SOTA
- **Hateful Memes (English)**: ~7% absolute improvement over previous SOTA

## Citation

If you use this model, please cite:

```bibtex
@inproceedings{kmainasi-etal-2025-memeintel,
    title = "{M}eme{I}ntel: Explainable Detection of Propagandistic and Hateful Memes",
    author = "Kmainasi, Mohamed Bayan and
      Hasnat, Abul and
      Hasan, Md Arid and
      Shahroor, Ali Ezzat and
      Alam, Firoj",
    editor = "Christodoulopoulos, Christos and
      Chakraborty, Tanmoy and
      Rose, Carolyn and
      Peng, Violet",
    booktitle = "Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing",
    month = nov,
    year = "2025",
    address = "Suzhou, China",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2025.emnlp-main.1539/",
    doi = "10.18653/v1/2025.emnlp-main.1539",
    pages = "30263--30279",
    ISBN = "979-8-89176-332-6",
}
```

## License

This model is released under the [Llama 3.2 Community License](https://www.llama.com/llama3_2/license/).

## Authors

- Mohamed Bayan Kmainasi
- Abul Hasnat
- Md Arid Hasan
- Ali Ezzat Shahroor
- Firoj Alam

Qatar Computing Research Institute (QCRI), Hamad Bin Khalifa University
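
## Parsing the Output (Sketch)

The `Label:` / `Explanation:` format described under Expected Output Format lends itself to simple post-processing. The following is a minimal sketch, not part of the released code: the names `MemePrediction`, `parse_prediction`, `ARABIC_LABELS`, and `ENGLISH_LABELS` are hypothetical helpers introduced here for illustration, and the label sets are copied from the Model Description. It assumes the text passed in is the model's response; a malformed response yields `None` fields rather than an error.

```python
import re
from typing import NamedTuple, Optional

# Label sets as listed in the Model Description (hypothetical constant names).
ARABIC_LABELS = {"propaganda", "not-propaganda", "not-meme", "other"}
ENGLISH_LABELS = {"hateful", "not-hateful"}


class MemePrediction(NamedTuple):
    label: Optional[str]        # None if no non-empty 'Label:' line was found
    explanation: Optional[str]  # None if no non-empty 'Explanation:' part was found


def parse_prediction(text: str) -> MemePrediction:
    """Split a response of the form 'Label: ...' / 'Explanation: ...' into its parts."""
    # The label is whatever follows 'Label:' on the same line.
    label_match = re.search(
        r"^[ \t]*Label:[ \t]*(.*?)[ \t]*$", text, flags=re.MULTILINE
    )
    # The explanation is everything after 'Explanation:' up to the end of the text.
    expl_match = re.search(
        r"^[ \t]*Explanation:[ \t]*(.*)", text, flags=re.MULTILINE | re.DOTALL
    )
    label = label_match.group(1).strip("'\"").lower() if label_match else ""
    explanation = expl_match.group(1).strip() if expl_match else ""
    return MemePrediction(label or None, explanation or None)


if __name__ == "__main__":
    sample = "Label: not-hateful\nExplanation: The text is a harmless pun."
    prediction = parse_prediction(sample)
    print(prediction.label, prediction.label in ENGLISH_LABELS)
    print(prediction.explanation)
```

Since `generate` returns the prompt tokens followed by the new tokens, it is probably safer to decode only the continuation before parsing, for example `processor.decode(output[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True)`. Checking the parsed label against the expected label set is a cheap way to flag responses that drift from the format.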