title: FromWordsToMedia
emoji: πΌ
colorFrom: purple
colorTo: red
sdk: gradio
sdk_version: 5.25.2
app_file: app.py
pinned: false
license: mit
short_description: Generates an image and a caption for social media posts
Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
From Words to Reels
This project generates social media posts, each consisting of an image and a caption, from a user-provided text prompt. It uses a text-to-image diffusion model for the image and a causal language model for the caption.
How it Works
The process is orchestrated by the main.py script and follows these steps:
- User Input: The script prompts the user to enter a text prompt.
- Image Generation: The VisualSynthesizer takes the prompt, enhances it, and uses a text-to-image diffusion model (e.g., Stable Diffusion) to generate a corresponding image.
- Caption Generation: The TextSynthesizer uses the original prompt to generate a suitable caption for the post using a causal language model.
- Output: Both the generated image (.png) and the caption (.txt) are saved to the outputs/ directory, prefixed with a timestamp.
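The steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: generate_post, enhance_prompt, the file-name pattern, and the timestamp format are all hypothetical, and the two model calls are passed in as plain callables so the flow can be read (and run) without loading any model. The image object only needs a PIL-style save(path) method.

```python
from datetime import datetime
from pathlib import Path
from typing import Callable, Protocol


class SavableImage(Protocol):
    """Anything with a PIL-style save(path) method."""

    def save(self, fp) -> None: ...


def enhance_prompt(prompt: str) -> str:
    # Hypothetical enhancement step: append generic quality hints.
    # The real VisualSynthesizer may enhance prompts differently.
    return f"{prompt.strip()}, high quality, detailed"


def generate_post(
    prompt: str,
    make_image: Callable[[str], SavableImage],
    make_caption: Callable[[str], str],
    out_dir: str = "outputs",
) -> tuple[Path, Path]:
    """Generate an image and a caption, then save both with a shared timestamp prefix."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now().strftime("%Y%m%d_%H%M%S")  # assumed timestamp format

    image = make_image(enhance_prompt(prompt))  # image uses the enhanced prompt
    caption = make_caption(prompt)              # caption uses the original prompt

    image_path = out / f"{stamp}_post.png"        # assumed file-name pattern
    caption_path = out / f"{stamp}_caption.txt"   # assumed file-name pattern
    image.save(image_path)
    caption_path.write_text(caption, encoding="utf-8")
    return image_path, caption_path
```

In a real run, make_image and make_caption would wrap the diffusion model and the language model respectively; keeping them as parameters makes the save logic easy to exercise with stand-ins.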
Project Structure
.
├── main.py                      # Main script to run the application
├── README.md                    # This file
├── outputs/                     # Directory for generated images and captions
├── src/
│   ├── visual_synthesizer.py    # Handles image generation
│   └── text_synthesizer.py      # Handles text/caption generation
└── utils/
    ├── config.py                # Configuration for models and paths
    └── helpers.py               # Helper functions for saving files etc.
Setup and Installation
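Create and activate a virtual environment (the activation command shown is for Windows; on macOS or Linux, use source venv/bin/activate instead):
python -m venv venv
venv\Scripts\activate
Install dependencies: Create a requirements.txt file with the following content:
torch
diffusers
transformers
sentence-transformers
Pillow
accelerate
Then run: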
pip install -r requirements.txt
Usage
To generate a post, run the main.py script:
python main.py
You will be prompted to enter your text. After processing, the generated image and caption will be saved in the outputs directory.
Configuration
You can customize the models and other parameters by editing the utils/config.py file. This allows you to easily swap out different text-to-image or language models.
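As a rough idea of what such a configuration module could look like, here is a minimal sketch. The Config class, its field names, the default values, and the model IDs are all assumptions made for illustration; the actual utils/config.py may be organized quite differently.

```python
from dataclasses import dataclass, replace
from pathlib import Path


@dataclass(frozen=True)
class Config:
    # Placeholder Hugging Face Hub model IDs (assumptions, not the project's real choices).
    image_model_id: str = "CompVis/stable-diffusion-v1-4"
    text_model_id: str = "gpt2"
    # Where generated images and captions are written.
    output_dir: Path = Path("outputs")
    # Hypothetical generation parameters.
    num_inference_steps: int = 30
    max_new_tokens: int = 60


CONFIG = Config()

# Swapping a model is a one-field change; replace() returns a modified copy
# and leaves the frozen default untouched.
custom = replace(CONFIG, text_model_id="distilgpt2")
```

Keeping the settings in one frozen dataclass means the synthesizer classes can read model IDs from a single place, and an alternative model can be tried without editing their code.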