title: FromWordsToMedia
emoji: 🖼
colorFrom: purple
colorTo: red
sdk: gradio
sdk_version: 5.25.2
app_file: app.py
pinned: false
license: mit
short_description: Generates an image and a caption for social media posts

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

From Words to Reels

This project generates a social media post, consisting of an image and a caption, from a user-provided text prompt. It uses a text-to-image diffusion model for the image and a causal language model for the caption.

How it Works

The process is orchestrated by the main.py script and follows these steps:

  1. User Input: The script prompts the user to enter a text prompt.
  2. Image Generation: The VisualSynthesizer takes the prompt, enhances it, and uses a text-to-image diffusion model (e.g., Stable Diffusion) to generate a corresponding image.
  3. Caption Generation: The TextSynthesizer uses the original prompt to generate a suitable caption for the post using a causal language model.
  4. Output: Both the generated image (.png) and the caption (.txt) are saved to the outputs/ directory, prefixed with a timestamp.
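The four steps above can be sketched as a small orchestration function. This is a minimal sketch, not the project's actual code: the stub classes stand in for the real VisualSynthesizer and TextSynthesizer so the example runs without any model downloads. The method names (`enhance`, `generate`), the `run_pipeline` function, and the `<timestamp>_image.png` / `<timestamp>_caption.txt` file names are assumptions made for illustration.

```python
from __future__ import annotations

from datetime import datetime
from pathlib import Path


class StubVisualSynthesizer:
    """Stand-in for VisualSynthesizer; the real class would call a diffusion model."""

    def enhance(self, prompt: str) -> str:
        # Hypothetical prompt enhancement: append generic quality keywords.
        return f"{prompt}, highly detailed, vibrant colors"

    def generate(self, prompt: str) -> bytes:
        # Returns placeholder bytes instead of real PNG data.
        return self.enhance(prompt).encode("utf-8")


class StubTextSynthesizer:
    """Stand-in for TextSynthesizer; the real class would call a causal language model."""

    def generate(self, prompt: str) -> str:
        return f"Check this out: {prompt}!"


def run_pipeline(prompt: str, output_dir: Path) -> tuple[Path, Path]:
    """Generate an image and a caption, then save both under one timestamp prefix."""
    # Step 2: the image is generated from the enhanced prompt.
    image_bytes = StubVisualSynthesizer().generate(prompt)
    # Step 3: the caption is generated from the original prompt.
    caption = StubTextSynthesizer().generate(prompt)

    # Step 4: write both files with a shared timestamp prefix (assumed naming scheme).
    output_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    image_path = output_dir / f"{stamp}_image.png"
    caption_path = output_dir / f"{stamp}_caption.txt"
    image_path.write_bytes(image_bytes)
    caption_path.write_text(caption, encoding="utf-8")
    return image_path, caption_path
```

Calling `run_pipeline("a red fox at sunrise", Path("outputs"))` returns the two saved paths. Step 1 (reading the prompt from the user) is left to the caller, for example via `input()`.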

Project Structure

.
├── main.py                 # Main script to run the application
├── README.md               # This file
├── outputs/                # Directory for generated images and captions
├── src/
│   ├── visual_synthesizer.py # Handles image generation
│   └── text_synthesizer.py   # Handles text/caption generation
└── utils/
    ├── config.py             # Configuration for models and paths
    └── helpers.py            # Helper functions for saving files etc.

Setup and Installation

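  1. Create and activate a virtual environment (the activation command shown is for Windows; on macOS/Linux use source venv/bin/activate instead):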

    python -m venv venv
    venv\Scripts\activate
    
  2. Install dependencies: Create a requirements.txt file with the following content:

    torch
    diffusers
    transformers
    sentence-transformers
    Pillow
    accelerate
    

    Then run:

    pip install -r requirements.txt
    

Usage

To generate a post, run the main.py script:

python main.py

You will be prompted to enter a text prompt. After processing, the generated image and caption are saved in the outputs/ directory.
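After a run, the newest files can be picked up programmatically. The helper below is a minimal sketch and not part of the project: `latest_output` is a hypothetical name, and it selects files by modification time so that it does not depend on the exact timestamp format used in the file names.

```python
from __future__ import annotations

from pathlib import Path


def latest_output(output_dir: Path, suffix: str) -> Path | None:
    """Return the most recently modified file with the given suffix, or None."""
    if not output_dir.is_dir():
        return None
    candidates = [p for p in output_dir.glob(f"*{suffix}") if p.is_file()]
    # max() with default=None handles an empty directory.
    return max(candidates, key=lambda p: p.stat().st_mtime, default=None)
```

For example, `latest_output(Path("outputs"), ".png")` returns the newest image and `latest_output(Path("outputs"), ".txt")` the newest caption, or `None` if nothing has been generated yet.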

Configuration

You can customize the models and other parameters by editing the utils/config.py file. This allows you to easily swap out different text-to-image or language models.
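As an illustration of what such a configuration might look like, here is a minimal sketch using a frozen dataclass. The field names, the default values, and the model IDs are assumptions chosen as examples; the actual contents of utils/config.py may differ.

```python
from dataclasses import dataclass, replace
from pathlib import Path


@dataclass(frozen=True)
class Config:
    # Hypothetical field names and example model IDs; the real config.py may differ.
    image_model_id: str = "runwayml/stable-diffusion-v1-5"
    text_model_id: str = "gpt2"
    output_dir: Path = Path("outputs")
    num_inference_steps: int = 30


DEFAULT_CONFIG = Config()

# Swapping the language model only requires overriding one field.
custom_config = replace(DEFAULT_CONFIG, text_model_id="distilgpt2")
```

Keeping the settings in one immutable object means the synthesizers can receive a single `Config` argument, and a model swap is a one-line change instead of an edit in several modules.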