metadata

title: FromWordsToMedia
emoji: 🖼
colorFrom: purple
colorTo: red
sdk: gradio
sdk_version: 5.25.2
app_file: app.py
pinned: false
license: mit
short_description: Generates an image and a caption for social media posts

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

From Words to Reels

This project generates social media posts, including an image and a caption, from a user-provided text prompt. It leverages deep learning models for both text-to-image synthesis and text generation to create engaging content.

How it Works

The process is orchestrated by the main.py script and follows these steps:

1. User Input: The script prompts the user to enter a text prompt.
2. Image Generation: The VisualSynthesizer takes the prompt, enhances it, and uses a text-to-image diffusion model (e.g., Stable Diffusion) to generate a corresponding image.
3. Caption Generation: The TextSynthesizer uses the original prompt to generate a suitable caption for the post using a causal language model.
4. Output: Both the generated image (.png) and the caption (.txt) are saved to the outputs/ directory, prefixed with a timestamp.
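
The steps above can be sketched as one function. This is a minimal illustration under stated assumptions, not the project's actual main.py: the `generate_post` name, the `synthesize(prompt)` method on both synthesizers, and the `<timestamp>_image.png` / `<timestamp>_caption.txt` filename pattern are all hypothetical. The stub classes stand in for the real models so the flow can be run without downloading any weights.

```python
from datetime import datetime
from pathlib import Path
from typing import Protocol


class SavableImage(Protocol):
    """Anything with a PIL-style save(path) method."""

    def save(self, fp) -> None: ...


class ImageModel(Protocol):
    def synthesize(self, prompt: str) -> SavableImage: ...


class CaptionModel(Protocol):
    def synthesize(self, prompt: str) -> str: ...


def generate_post(prompt, visual, text, out_dir="outputs", now=None):
    """Generate an image and a caption, then save both with a shared timestamp prefix."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    # Assumed timestamp format; the real helper may use a different one.
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")

    image = visual.synthesize(prompt)    # step 2: image generation
    caption = text.synthesize(prompt)    # step 3: caption generation

    # Step 4: write both files to the output directory.
    image_path = out / f"{stamp}_image.png"
    caption_path = out / f"{stamp}_caption.txt"
    image.save(image_path)
    caption_path.write_text(caption, encoding="utf-8")
    return image_path, caption_path


class _FakeImage:
    """Stub image: writes placeholder bytes instead of real PNG data."""

    def save(self, fp):
        Path(fp).write_bytes(b"placeholder, not a real PNG")


class StubVisual:
    """Stand-in for VisualSynthesizer (hypothetical interface)."""

    def synthesize(self, prompt):
        return _FakeImage()


class StubText:
    """Stand-in for TextSynthesizer (hypothetical interface)."""

    def synthesize(self, prompt):
        return f"{prompt} #generated"
```

Calling `generate_post("a sunset over the sea", StubVisual(), StubText())` returns the two paths it wrote. Passing the synthesizers in as arguments keeps the orchestration independent of any particular model.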

Project Structure

.
├── main.py                 # Main script to run the application
├── README.md               # This file
├── outputs/                # Directory for generated images and captions
├── src/
│   ├── visual_synthesizer.py # Handles image generation
│   └── text_synthesizer.py   # Handles text/caption generation
└── utils/
    ├── config.py             # Configuration for models and paths
    └── helpers.py            # Helper functions for saving files etc.

Setup and Installation

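Create and activate a virtual environment (the activation command shown is for Windows; on macOS or Linux, run source venv/bin/activate instead):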

python -m venv venv
venv\Scripts\activate

Install dependencies: Create a requirements.txt file with the following content:

torch
diffusers
transformers
sentence-transformers
Pillow
accelerate

Then run:

pip install -r requirements.txt
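
After installing, it can help to confirm that every dependency is importable before running the application. The following is a small sketch, not part of the project: the `missing_packages` helper is hypothetical. Note that two of the packages are imported under a different name than the one used with pip (Pillow as `PIL`, sentence-transformers as `sentence_transformers`).

```python
from importlib.util import find_spec

# Maps each pip package name from requirements.txt to its import name.
REQUIRED = {
    "torch": "torch",
    "diffusers": "diffusers",
    "transformers": "transformers",
    "sentence-transformers": "sentence_transformers",
    "Pillow": "PIL",
    "accelerate": "accelerate",
}


def missing_packages(required=REQUIRED):
    """Return the pip names of packages whose module cannot be found."""
    return [
        pip_name
        for pip_name, module in required.items()
        if find_spec(module) is None
    ]
```

An empty list from `missing_packages()` means all six packages were found in the active environment.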

Usage

To generate a post, run the main.py script:

python main.py

You will be prompted to enter your text. After processing, the generated image and caption will be saved in the outputs directory.

Configuration

You can customize the models and other parameters by editing utils/config.py, which makes it straightforward to swap in a different text-to-image or language model.
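
One way such a configuration could be laid out is sketched below. The actual contents of utils/config.py are not shown in this README, so every name here (`Config`, its fields, `load_image_pipeline`, `load_text_pipeline`) is an assumption, and the model IDs are examples only. The loader functions use `AutoPipelineForText2Image` from diffusers and `pipeline("text-generation", ...)` from transformers; they import those libraries lazily so the configuration can be inspected without loading them.

```python
from dataclasses import dataclass, replace
from pathlib import Path


@dataclass(frozen=True)
class Config:
    """Hypothetical settings object; field names are illustrative."""

    image_model_id: str                 # Hub ID of a text-to-image model
    text_model_id: str                  # Hub ID of a causal language model
    output_dir: Path = Path("outputs")  # where images and captions are saved


def load_image_pipeline(cfg):
    """Load the text-to-image pipeline named in the config."""
    from diffusers import AutoPipelineForText2Image  # lazy: heavy import

    return AutoPipelineForText2Image.from_pretrained(cfg.image_model_id)


def load_text_pipeline(cfg):
    """Load a text-generation pipeline for the configured language model."""
    from transformers import pipeline  # lazy: heavy import

    return pipeline("text-generation", model=cfg.text_model_id)


# Example values only; the project may ship different defaults.
DEFAULT = Config(
    image_model_id="CompVis/stable-diffusion-v1-4",
    text_model_id="gpt2",
)

# Swapping a model means changing a single field.
SMALLER_TEXT = replace(DEFAULT, text_model_id="distilgpt2")
```

Because the rest of the code would read model names only from the config object, changing one field is enough to switch models.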