Spaces:

tomvaillant
/

graphics-llm

Running

App Files Files Community

graphics-llm / README.md

Tom

Update to Jina-CLIP-v2 embeddings and rebrand to Viz LLM

721d500 about 1 month ago

preview code

raw

history blame

8.96 kB

metadata

title: Graphics Guide / Design Assistant
emoji: 📊
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 5.49.1
app_file: app.py
pinned: false
short_description: RAG-powered graphics and design assistant for data visualization
license: mit

📊 Graphics Guide / Design Assistant

A RAG-powered AI assistant that helps users select appropriate visualizations and provides technical implementation guidance for creating effective information graphics. Built with Supabase PGVector and Hugging Face Inference Providers, powered by a knowledge base of graphics research and design principles.

✨ Features

🎯 Design Recommendations: Get tailored visualization suggestions based on your intent and data characteristics
📚 Research-Backed Guidance: Access insights from academic papers and design best practices
🔍 Context-Aware Retrieval: Semantic search finds the most relevant examples and knowledge for your needs
🚀 API Access: Built-in REST API for integration with external applications
💬 Chat Interface: User-friendly conversational interface
⚡ Technical Implementation: Practical guidance on tools, techniques, and code examples

🏗️ Architecture

┌──────────────────────────────────────┐
│      Gradio UI + API Endpoints       │
└──────────────┬───────────────────────┘
               │
┌──────────────▼───────────────────────┐
│          RAG Pipeline                │
│  • Query Understanding               │
│  • Document Retrieval (PGVector)     │
│  • Response Generation (LLM)         │
└──────────────┬───────────────────────┘
               │
    ┌──────────┴──────────┐
    │                     │
┌───▼───────────┐  ┌─────▼────────────┐
│ Supabase      │  │ HF Inference     │
│ PGVector DB   │  │ Providers        │
│ (198 docs)    │  │ (Llama 3.1)      │
└───────────────┘  └──────────────────┘

🚀 Quick Start

Local Development

Clone the repository

git clone <your-repo-url>
cd graphics-llm

Install dependencies
```
pip install -r requirements.txt
```
Set up environment variables
```
cp .env.example .env
# Edit .env with your credentials
```
Required variables:
- SUPABASE_URL: Your Supabase project URL
- SUPABASE_KEY: Your Supabase anon key
- HF_TOKEN: Your Hugging Face API token (for LLM generation)
- JINA_API_KEY: Your Jina AI API token (for embeddings)
Run the application
```
python app.py
```
The app will be available at http://localhost:7860

Hugging Face Spaces Deployment

Create a new Space on Hugging Face
Push this repository to your Space
Set environment variables in Space settings:
- SUPABASE_URL
- SUPABASE_KEY
- HF_TOKEN
- JINA_API_KEY
Deploy - The Space will automatically build and launch

📚 Usage

Chat Interface

Simply ask your design questions:

"What's the best chart type for showing trends over time?"
"How do I create an effective infographic for complex data?"
"What are best practices for data visualization accessibility?"

The assistant will provide:

Design recommendations based on your intent
WHY each visualization type is suitable
HOW to implement it (tools, techniques, code)
Best practices from research and examples
Accessibility and effectiveness considerations

API Access

This app automatically exposes REST API endpoints for external integration.

Python Client:

from gradio_client import Client

client = Client("your-space-url")
result = client.predict(
    "What's the best chart for time series?",
    api_name="/recommend"
)
print(result)

JavaScript Client:

import { Client } from "@gradio/client";

const client = await Client.connect("your-space-url");
const result = await client.predict("/recommend", {
  message: "What's the best chart for time series?"
});
console.log(result.data);

cURL:

curl -X POST "https://your-space.hf.space/call/recommend" \
     -H "Content-Type: application/json" \
     -d '{"data": ["What's the best chart for time series?"]}'

Available Endpoints:

/call/recommend - Main design recommendation assistant
/gradio_api/openapi.json - OpenAPI specification

🗄️ Database

The app uses Supabase with PGVector extension to store and retrieve document chunks from graphics research and examples.

Database Schema:

CREATE TABLE document_embeddings (
  id BIGINT PRIMARY KEY,
  source_type TEXT, -- pdf, url, or image
  source_id TEXT, -- filename or URL
  title TEXT,
  content_type TEXT, -- text or image
  chunk_index INTEGER,
  chunk_text TEXT,
  page_number INTEGER,
  embedding VECTOR(1024), -- 1024-dimensional vectors
  metadata JSONB,
  word_count INTEGER,
  image_metadata JSONB,
  created_at TIMESTAMPTZ
);

Knowledge Base Content:

Research papers on data visualization
Design principles and best practices
Visual narrative techniques
Accessibility guidelines
Chart type selection guidance
Real-world examples and case studies

🛠️ Technology Stack

UI/API: Gradio - Automatic API generation
Vector Database: Supabase with PGVector extension
Embeddings: Jina-CLIP-v2 (1024-dimensional)
LLM: Hugging Face Inference Providers - Llama 3.1
Language: Python 3.9+

📁 Project Structure

graphics-llm/
├── app.py                    # Main Gradio application
├── requirements.txt          # Python dependencies
├── .env.example             # Environment variables template
├── README.md                # This file
└── src/
    ├── __init__.py
    ├── vectorstore.py       # Supabase PGVector connection
    ├── rag_pipeline.py      # RAG pipeline logic
    ├── llm_client.py        # Inference Provider client
    └── prompts.py           # Design recommendation prompt templates

⚙️ Configuration

Environment Variables

See .env.example for all available configuration options.

Required:

SUPABASE_URL - Supabase project URL
SUPABASE_KEY - Supabase anon key
HF_TOKEN - Hugging Face API token (for LLM generation)
JINA_API_KEY - Jina AI API token (for Jina-CLIP-v2 embeddings)

Optional:

LLM_MODEL - Model to use (default: meta-llama/Llama-3.1-8B-Instruct)
LLM_TEMPERATURE - Generation temperature (default: 0.2)
LLM_MAX_TOKENS - Max tokens to generate (default: 2000)
RETRIEVAL_K - Number of documents to retrieve (default: 5)
EMBEDDING_MODEL - Embedding model (default: jina-clip-v2)

Supported LLM Models

meta-llama/Llama-3.1-8B-Instruct (recommended)
meta-llama/Meta-Llama-3-8B-Instruct
Qwen/Qwen2.5-72B-Instruct
mistralai/Mistral-7B-Instruct-v0.3

💰 Cost Considerations

Hugging Face Inference Providers

Free tier: $0.10/month credits
PRO tier: $2.00/month credits + pay-as-you-go
Typical cost: ~$0.001-0.01 per query
Recommended budget: $10-50/month for moderate usage

Supabase

Free tier sufficient for most use cases
PGVector operations are standard database queries

Hugging Face Spaces

Free CPU hosting available
GPU upgrade: ~$0.60/hour (optional, not required)

🔮 Future Enhancements

Multi-turn conversation with memory
Code generation for visualization implementations
Interactive visualization previews
User-uploaded data analysis
Export recommendations as PDF/markdown
Community-contributed examples
Support for more design domains (UI/UX, print graphics)

🤝 Contributing

Contributions are welcome! Please feel free to submit issues or pull requests.

📄 License

MIT License - See LICENSE file for details

🙏 Acknowledgments

Knowledge base includes research papers on data visualization and information design
Built to support designers, journalists, and data practitioners

📞 Support

For issues or questions:

Open an issue on GitHub
Check the Hugging Face Spaces documentation
Review the Gradio documentation

Built with ❤️ for the design and visualization community