Tom Claude committed on
Commit 721d500 · 1 Parent(s): 25028b7

Update to Jina-CLIP-v2 embeddings and rebrand to Viz LLM


Major changes:
- Upgraded embeddings from sentence-transformers to Jina-CLIP-v2 (1024-dim)
- Added JINA_API_KEY support for Jina AI API integration
- Rebranded from "Graphics Guide" to "Viz LLM"
- Removed dark theme styling, restored default Gradio white theme
- Implemented rate limiting: 20 queries per day per user
- Added researcher credits and model attribution
- Updated all documentation (README, QUICKSTART, .env.example)
- Added language disclaimer (English optimized)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

Files changed (14)
  1. .env.example +61 -0
  2. .gitignore +50 -0
  3. .mcp.json +8 -0
  4. QUICKSTART.md +15 -0
  5. README.md +275 -3
  6. app.py +161 -0
  7. assets/bellingcat.svg +10 -0
  8. requirements.txt +14 -0
  9. src/__init__.py +3 -0
  10. src/llm_client.py +195 -0
  11. src/prompts.py +128 -0
  12. src/rag_pipeline.py +160 -0
  13. src/vectorstore.py +313 -0
  14. test_vectorstore.py +34 -0
.env.example ADDED
@@ -0,0 +1,61 @@
+ # Graphics Guide / Design Assistant - Environment Variables
+
+ # =============================================================================
+ # REQUIRED: Supabase Client Connection
+ # =============================================================================
+ # Get these from: Supabase Dashboard > Project Settings > API
+ SUPABASE_URL=https://[PROJECT-REF].supabase.co
+ SUPABASE_KEY=[YOUR-ANON-KEY]
+
+ # =============================================================================
+ # REQUIRED: Hugging Face API Token
+ # =============================================================================
+ # Get your token from: https://huggingface.co/settings/tokens
+ # This is used for Inference Providers API access (LLM generation)
+ HF_TOKEN=hf_your_token_here
+
+ # =============================================================================
+ # REQUIRED: Jina AI API Token
+ # =============================================================================
+ # Get your token from: https://jina.ai/
+ # This is used for Jina-CLIP-v2 embeddings
+ JINA_API_KEY=jina_your_token_here
+
+ # =============================================================================
+ # OPTIONAL: LLM Configuration
+ # =============================================================================
+ # Model to use for generation (default: meta-llama/Llama-3.1-8B-Instruct)
+ # Other options:
+ # - meta-llama/Meta-Llama-3-8B-Instruct
+ # - Qwen/Qwen2.5-72B-Instruct
+ # - mistralai/Mistral-7B-Instruct-v0.3
+ LLM_MODEL=meta-llama/Llama-3.1-8B-Instruct
+
+ # Temperature for LLM generation (0.0 to 1.0, default: 0.7)
+ # Lower = more focused/deterministic, Higher = more creative/diverse
+ LLM_TEMPERATURE=0.7
+
+ # Maximum tokens to generate (default: 2000)
+ LLM_MAX_TOKENS=2000
+
+ # =============================================================================
+ # OPTIONAL: Vector Store Configuration
+ # =============================================================================
+ # Number of document chunks to retrieve for context (default: 5)
+ RETRIEVAL_K=5
+
+ # Embedding model for vector search (default: jina-clip-v2)
+ # Note: Database has been re-embedded with Jina-CLIP-v2 (1024 dimensions)
+ EMBEDDING_MODEL=jina-clip-v2
+
+ # =============================================================================
+ # OPTIONAL: Gradio Configuration
+ # =============================================================================
+ # Port for Gradio app (default: 7860)
+ GRADIO_PORT=7860
+
+ # Server name (default: 0.0.0.0 for all interfaces)
+ GRADIO_SERVER_NAME=0.0.0.0
+
+ # Enable Gradio sharing link (default: False)
+ GRADIO_SHARE=False
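These variables are consumed at startup with python-dotenv, as `app.py` later in this commit shows. A minimal standalone sketch of that loading pattern (assuming a populated `.env` in the working directory):

```python
import os

from dotenv import load_dotenv  # python-dotenv, listed in requirements.txt

load_dotenv()  # reads .env from the current directory into the process environment

# Fail fast on the four required credentials; optional settings fall back to defaults.
for name in ("SUPABASE_URL", "SUPABASE_KEY", "HF_TOKEN", "JINA_API_KEY"):
    if not os.getenv(name):
        raise RuntimeError(f"Missing required environment variable: {name}")

llm_model = os.getenv("LLM_MODEL", "meta-llama/Llama-3.1-8B-Instruct")
temperature = float(os.getenv("LLM_TEMPERATURE", "0.7"))
retrieval_k = int(os.getenv("RETRIEVAL_K", "5"))
```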
.gitignore ADDED
@@ -0,0 +1,50 @@
+ # Environment variables (contains secrets)
+ .env
+
+ # Python
+ __pycache__/
+ *.py[cod]
+ *$py.class
+ *.so
+ .Python
+ build/
+ develop-eggs/
+ dist/
+ downloads/
+ eggs/
+ .eggs/
+ lib/
+ lib64/
+ parts/
+ sdist/
+ var/
+ wheels/
+ *.egg-info/
+ .installed.cfg
+ *.egg
+
+ # Virtual environments
+ venv/
+ env/
+ ENV/
+ .venv
+
+ # IDE
+ .vscode/
+ .idea/
+ *.swp
+ *.swo
+ *~
+
+ # Jupyter Notebook
+ .ipynb_checkpoints
+
+ # macOS
+ .DS_Store
+
+ # Gradio
+ gradio_cached_examples/
+ flagged/
+
+ # Logs
+ *.log
.mcp.json ADDED
@@ -0,0 +1,8 @@
+ {
+   "mcpServers": {
+     "supabase": {
+       "type": "http",
+       "url": "https://mcp.supabase.com/mcp?project_ref=qqdjbhrpjjediqmdzxin"
+     }
+   }
+ }
QUICKSTART.md ADDED
@@ -0,0 +1,15 @@
+ # Graphics Guide RAG App Quickstart
+
+ ## Stack
+ - **Frontend**: Gradio 4.0+ (ChatInterface with auto API endpoints)
+ - **Database**: Supabase PGVector (1024-dim embeddings, HNSW index)
+ - **LLM**: HuggingFace Inference API (Llama-3.1-8B-Instruct)
+ - **Embeddings**: Jina AI API (jina-clip-v2, 1024-dim)
+ - **Client**: Supabase Python client + InferenceClient (huggingface_hub)
+
+ ## Key Parameters
+ - **Temperature**: 0.2 (low hallucination)
+ - **Max Tokens**: 800 (moderate responses)
+ - **Retrieval K**: 5 documents
+ - **Match Threshold**: 0.5 (cosine similarity)
+ - **Connection**: Direct via Supabase client
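A minimal sketch of the retrieval parameters in code, using the factory and search signature from `src/vectorstore.py` later in this commit (temperature and max tokens are configured on the LLM client instead; requires the credentials from `.env.example`):

```python
from dotenv import load_dotenv
from src.vectorstore import create_vectorstore

load_dotenv()

# Retrieval with the quickstart parameters: 5 chunks above 0.5 cosine similarity.
vectorstore = create_vectorstore()
docs = vectorstore.similarity_search(
    "narrative visualization techniques", k=5, match_threshold=0.5
)
for doc in docs:
    print(doc.metadata.get("source_id"), doc.metadata.get("similarity"))
```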
README.md CHANGED
@@ -1,12 +1,284 @@
  ---
- title: Graphics Llm
+ title: Graphics Guide / Design Assistant
  emoji: 📊
  colorFrom: blue
- colorTo: pink
+ colorTo: purple
  sdk: gradio
  sdk_version: 5.49.1
  app_file: app.py
  pinned: false
+ short_description: RAG-powered graphics and design assistant for data visualization
+ license: mit
  ---

- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ # 📊 Graphics Guide / Design Assistant
+
+ A RAG-powered AI assistant that helps users select appropriate visualizations and provides technical implementation guidance for creating effective information graphics. Built with Supabase PGVector and Hugging Face Inference Providers, powered by a knowledge base of graphics research and design principles.
+
+ ## ✨ Features
+
+ - **🎯 Design Recommendations**: Get tailored visualization suggestions based on your intent and data characteristics
+ - **📚 Research-Backed Guidance**: Access insights from academic papers and design best practices
+ - **🔍 Context-Aware Retrieval**: Semantic search finds the most relevant examples and knowledge for your needs
+ - **🚀 API Access**: Built-in REST API for integration with external applications
+ - **💬 Chat Interface**: User-friendly conversational interface
+ - **⚡ Technical Implementation**: Practical guidance on tools, techniques, and code examples
+
+ ## 🏗️ Architecture
+
+ ```
+ ┌──────────────────────────────────────┐
+ │       Gradio UI + API Endpoints      │
+ └──────────────────┬───────────────────┘
+                    │
+ ┌──────────────────▼───────────────────┐
+ │             RAG Pipeline             │
+ │  • Query Understanding               │
+ │  • Document Retrieval (PGVector)     │
+ │  • Response Generation (LLM)         │
+ └──────────────────┬───────────────────┘
+                    │
+          ┌─────────┴─────────┐
+          │                   │
+ ┌────────▼────────┐ ┌────────▼─────────┐
+ │    Supabase     │ │   HF Inference   │
+ │   PGVector DB   │ │    Providers     │
+ │   (198 docs)    │ │   (Llama 3.1)    │
+ └─────────────────┘ └──────────────────┘
+ ```
+
+ ## 🚀 Quick Start
+
+ ### Local Development
+
+ 1. **Clone the repository**
+    ```bash
+    git clone <your-repo-url>
+    cd graphics-llm
+    ```
+
+ 2. **Install dependencies**
+    ```bash
+    pip install -r requirements.txt
+    ```
+
+ 3. **Set up environment variables**
+    ```bash
+    cp .env.example .env
+    # Edit .env with your credentials
+    ```
+
+    Required variables:
+    - `SUPABASE_URL`: Your Supabase project URL
+    - `SUPABASE_KEY`: Your Supabase anon key
+    - `HF_TOKEN`: Your Hugging Face API token (for LLM generation)
+    - `JINA_API_KEY`: Your Jina AI API token (for embeddings)
+
+ 4. **Run the application**
+    ```bash
+    python app.py
+    ```
+
+    The app will be available at `http://localhost:7860`
+
+ ### Hugging Face Spaces Deployment
+
+ 1. **Create a new Space** on Hugging Face
+ 2. **Push this repository** to your Space
+ 3. **Set environment variables** in Space settings:
+    - `SUPABASE_URL`
+    - `SUPABASE_KEY`
+    - `HF_TOKEN`
+    - `JINA_API_KEY`
+ 4. **Deploy** - The Space will automatically build and launch
+
+ ## 📚 Usage
+
+ ### Chat Interface
+
+ Simply ask your design questions:
+
+ ```
+ "What's the best chart type for showing trends over time?"
+ "How do I create an effective infographic for complex data?"
+ "What are best practices for data visualization accessibility?"
+ ```
+
+ The assistant will provide:
+ 1. Design recommendations based on your intent
+ 2. WHY each visualization type is suitable
+ 3. HOW to implement it (tools, techniques, code)
+ 4. Best practices from research and examples
+ 5. Accessibility and effectiveness considerations
+
+ ### API Access
+
+ This app automatically exposes REST API endpoints for external integration.
+
+ **Python Client:**
+
+ ```python
+ from gradio_client import Client
+
+ client = Client("your-space-url")
+ result = client.predict(
+     "What's the best chart for time series?",
+     api_name="/recommend"
+ )
+ print(result)
+ ```
+
+ **JavaScript Client:**
+
+ ```javascript
+ import { Client } from "@gradio/client";
+
+ const client = await Client.connect("your-space-url");
+ const result = await client.predict("/recommend", {
+     message: "What's the best chart for time series?"
+ });
+ console.log(result.data);
+ ```
+
+ **cURL:**
+
+ ```bash
+ curl -X POST "https://your-space.hf.space/call/recommend" \
+      -H "Content-Type: application/json" \
+      -d '{"data": ["What is the best chart for time series?"]}'
+ ```
+
+ **Available Endpoints:**
+ - `/call/recommend` - Main design recommendation assistant
+ - `/gradio_api/openapi.json` - OpenAPI specification
+
+ ## 🗄️ Database
+
+ The app uses Supabase with the PGVector extension to store and retrieve document chunks from graphics research and examples.
+
+ **Database Schema:**
+ ```sql
+ CREATE TABLE document_embeddings (
+     id BIGINT PRIMARY KEY,
+     source_type TEXT,          -- pdf, url, or image
+     source_id TEXT,            -- filename or URL
+     title TEXT,
+     content_type TEXT,         -- text or image
+     chunk_index INTEGER,
+     chunk_text TEXT,
+     page_number INTEGER,
+     embedding VECTOR(1024),    -- 1024-dimensional vectors
+     metadata JSONB,
+     word_count INTEGER,
+     image_metadata JSONB,
+     created_at TIMESTAMPTZ
+ );
+ ```
+
+ **Knowledge Base Content:**
+ - Research papers on data visualization
+ - Design principles and best practices
+ - Visual narrative techniques
+ - Accessibility guidelines
+ - Chart type selection guidance
+ - Real-world examples and case studies
+
+ ## 🛠️ Technology Stack
+
+ - **UI/API**: [Gradio](https://gradio.app/) - Automatic API generation
+ - **Vector Database**: [Supabase](https://supabase.com/) with PGVector extension
+ - **Embeddings**: Jina-CLIP-v2 (1024-dimensional)
+ - **LLM**: [Hugging Face Inference Providers](https://huggingface.co/docs/inference-providers/) - Llama 3.1
+ - **Language**: Python 3.9+
+
+ ## 📁 Project Structure
+
+ ```
+ graphics-llm/
+ ├── app.py              # Main Gradio application
+ ├── requirements.txt    # Python dependencies
+ ├── .env.example        # Environment variables template
+ ├── README.md           # This file
+ └── src/
+     ├── __init__.py
+     ├── vectorstore.py  # Supabase PGVector connection
+     ├── rag_pipeline.py # RAG pipeline logic
+     ├── llm_client.py   # Inference Provider client
+     └── prompts.py      # Design recommendation prompt templates
+ ```
+
+ ## ⚙️ Configuration
+
+ ### Environment Variables
+
+ See `.env.example` for all available configuration options.
+
+ **Required:**
+ - `SUPABASE_URL` - Supabase project URL
+ - `SUPABASE_KEY` - Supabase anon key
+ - `HF_TOKEN` - Hugging Face API token (for LLM generation)
+ - `JINA_API_KEY` - Jina AI API token (for Jina-CLIP-v2 embeddings)
+
+ **Optional:**
+ - `LLM_MODEL` - Model to use (default: meta-llama/Llama-3.1-8B-Instruct)
+ - `LLM_TEMPERATURE` - Generation temperature (default: 0.2)
+ - `LLM_MAX_TOKENS` - Max tokens to generate (default: 2000)
+ - `RETRIEVAL_K` - Number of documents to retrieve (default: 5)
+ - `EMBEDDING_MODEL` - Embedding model (default: jina-clip-v2)
+
+ ### Supported LLM Models
+
+ - `meta-llama/Llama-3.1-8B-Instruct` (recommended)
+ - `meta-llama/Meta-Llama-3-8B-Instruct`
+ - `Qwen/Qwen2.5-72B-Instruct`
+ - `mistralai/Mistral-7B-Instruct-v0.3`
+
+ ## 💰 Cost Considerations
+
+ ### Hugging Face Inference Providers
+ - Free tier: $0.10/month credits
+ - PRO tier: $2.00/month credits + pay-as-you-go
+ - Typical cost: ~$0.001-0.01 per query
+ - Recommended budget: $10-50/month for moderate usage
+
+ ### Supabase
+ - Free tier sufficient for most use cases
+ - PGVector operations are standard database queries
+
+ ### Hugging Face Spaces
+ - Free CPU hosting available
+ - GPU upgrade: ~$0.60/hour (optional, not required)
+
+ ## 🔮 Future Enhancements
+
+ - [ ] Multi-turn conversation with memory
+ - [ ] Code generation for visualization implementations
+ - [ ] Interactive visualization previews
+ - [ ] User-uploaded data analysis
+ - [ ] Export recommendations as PDF/markdown
+ - [ ] Community-contributed examples
+ - [ ] Support for more design domains (UI/UX, print graphics)
+
+ ## 🤝 Contributing
+
+ Contributions are welcome! Please feel free to submit issues or pull requests.
+
+ ## 📄 License
+
+ MIT License - See LICENSE file for details
+
+ ## 🙏 Acknowledgments
+
+ - Knowledge base includes research papers on data visualization and information design
+ - Built to support designers, journalists, and data practitioners
+
+ ## 📞 Support
+
+ For issues or questions:
+ - Open an issue on GitHub
+ - Check the [Hugging Face Spaces documentation](https://huggingface.co/docs/hub/spaces)
+ - Review the [Gradio documentation](https://gradio.app/docs/)
+
+ ---
+
+ Built with ❤️ for the design and visualization community
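A back-of-the-envelope check on the cost figures above, for one user at the 20-queries/day cap (the per-query range is the README's own estimate; actual pricing varies by provider and model):

```python
# Rough monthly cost for one fully active user under the 20 queries/day limit.
cost_per_query_usd = (0.001, 0.01)  # range quoted above
queries_per_month = 20 * 30

low, high = (c * queries_per_month for c in cost_per_query_usd)
print(f"${low:.2f} - ${high:.2f} per month")
# -> $0.60 - $6.00: beyond the $0.10 free-tier credits, but within the
#    suggested $10-50/month budget for a handful of active users.
```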
app.py ADDED
@@ -0,0 +1,161 @@
+ """
+ Viz LLM - Gradio App
+
+ A RAG-powered assistant for data visualization guidance, powered by Jina-CLIP-v2
+ embeddings and research from the field of information graphics.
+ """
+
+ import os
+ import gradio as gr
+ from dotenv import load_dotenv
+ from src.rag_pipeline import create_pipeline
+ from datetime import datetime, timedelta
+ from collections import defaultdict
+
+ # Load environment variables
+ load_dotenv()
+
+ # Rate limiting: Track requests per user (IP-based)
+ # Format: {ip: [timestamp1, timestamp2, ...]}
+ rate_limit_tracker = defaultdict(list)
+ DAILY_LIMIT = 20
+
+ # Initialize the RAG pipeline
+ print("Initializing Graphics Design Pipeline...")
+ try:
+     pipeline = create_pipeline(
+         retrieval_k=5,
+         model=os.getenv("LLM_MODEL", "meta-llama/Llama-3.1-8B-Instruct"),
+         temperature=float(os.getenv("LLM_TEMPERATURE", "0.2"))
+     )
+     print("✓ Pipeline initialized successfully")
+ except Exception as e:
+     print(f"✗ Error initializing pipeline: {e}")
+     raise
+
+
+ def check_rate_limit(request: gr.Request) -> tuple[bool, int]:
+     """Check if user has exceeded rate limit"""
+     if request is None:
+         return True, DAILY_LIMIT  # Allow if no request object
+
+     user_id = request.client.host
+     now = datetime.now()
+     cutoff = now - timedelta(days=1)
+
+     # Remove old requests (older than 24 hours)
+     rate_limit_tracker[user_id] = [
+         ts for ts in rate_limit_tracker[user_id] if ts > cutoff
+     ]
+
+     remaining = DAILY_LIMIT - len(rate_limit_tracker[user_id])
+
+     if remaining <= 0:
+         return False, 0
+
+     # Add current request
+     rate_limit_tracker[user_id].append(now)
+     return True, remaining - 1
+
+
+ def recommend_stream(message: str, history: list, request: gr.Request):
+     """
+     Streaming version of design recommendation function
+
+     Args:
+         message: User's design query
+         history: Chat history
+         request: Gradio request object for rate limiting
+
+     Yields:
+         Response chunks
+     """
+     # Check rate limit
+     allowed, remaining = check_rate_limit(request)
+     if not allowed:
+         yield "⚠️ **Rate limit exceeded.** You've reached the maximum of 20 queries per day. Please try again in 24 hours."
+         return
+
+     try:
+         response_stream = pipeline.generate_recommendations(message, stream=True)
+         full_response = ""
+         for chunk in response_stream:
+             full_response += chunk
+             yield full_response
+
+         # Add rate limit info at the end
+         if remaining <= 5:
+             yield full_response + f"\n\n---\n*You have {remaining} queries remaining today.*"
+     except Exception as e:
+         yield f"Error generating response: {str(e)}\n\nPlease check your environment variables (HF_TOKEN, JINA_API_KEY, SUPABASE_URL, SUPABASE_KEY) and try again."
+
+
+ # Minimal CSS to fix UI artifacts
+ custom_css = """
+ /* Hide retry/undo buttons that appear as artifacts */
+ .chatbot button[aria-label="Retry"],
+ .chatbot button[aria-label="Undo"] {
+     display: none !important;
+ }
+ """
+
+ # Create Gradio interface
+ with gr.Blocks(
+     title="Viz LLM",
+     css=custom_css
+ ) as demo:
+     gr.Markdown("""
+     # 📊 Viz LLM
+
+     Get design recommendations for creating effective data visualizations based on research and best practices.
+     """)
+
+     # Main chat interface
+     chatbot = gr.ChatInterface(
+         fn=recommend_stream,
+         type="messages",
+         examples=[
+             "What's the best chart type for showing trends over time?",
+             "How do I create an effective infographic for complex data?",
+             "What are best practices for data visualization accessibility?",
+             "How should I design a dashboard for storytelling?",
+             "What visualization works best for comparing categories?"
+         ],
+         cache_examples=False,
+         api_name="recommend"
+     )
+
+     # Knowledge base section (below chat interface)
+     gr.Markdown("""
+     ### Knowledge Base
+
+     This assistant draws on research papers, design principles, and examples from the field of information graphics and data visualization.
+
+     **Credits:** Special thanks to the researchers whose work informed this model: Robert Kosara, Edward Segel, Jeffrey Heer, Matthew Conlen, John Maeda, Kennedy Elliott, Scott McCloud, and many others.
+
+     ---
+
+     **Usage Limits:** This service is limited to 20 queries per day per user to manage costs. Responses are optimized for English.
+
+     <div style="text-align: center; margin-top: 20px; opacity: 0.6; font-size: 0.9em;">
+         Embeddings: Jina-CLIP-v2
+     </div>
+     """)
+
+ # Launch configuration
+ if __name__ == "__main__":
+     # Check for required environment variables
+     required_vars = ["SUPABASE_URL", "SUPABASE_KEY", "HF_TOKEN", "JINA_API_KEY"]
+     missing_vars = [var for var in required_vars if not os.getenv(var)]
+
+     if missing_vars:
+         print(f"⚠️ Warning: Missing environment variables: {', '.join(missing_vars)}")
+         print("Please set these in your .env file or as environment variables")
+
+     # Launch the app
+     demo.launch(
+         server_name="0.0.0.0",
+         server_port=7860,
+         share=False,
+         show_api=True
+     )
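The sliding-window limiter in `app.py` is easy to exercise in isolation. Importing `app` would initialize the full pipeline, so this sketch restates the same logic inline; the `FakeRequest` stub is hypothetical, standing in for `gr.Request`:

```python
from collections import defaultdict
from datetime import datetime, timedelta

DAILY_LIMIT = 20
rate_limit_tracker = defaultdict(list)


class FakeRequest:
    """Hypothetical stand-in exposing the one field the limiter reads."""
    class client:
        host = "203.0.113.7"  # documentation-range IP


def check_rate_limit(request):
    # Same logic as app.py: keep only timestamps from the last 24 hours.
    if request is None:
        return True, DAILY_LIMIT
    user_id = request.client.host
    cutoff = datetime.now() - timedelta(days=1)
    rate_limit_tracker[user_id] = [ts for ts in rate_limit_tracker[user_id] if ts > cutoff]
    remaining = DAILY_LIMIT - len(rate_limit_tracker[user_id])
    if remaining <= 0:
        return False, 0
    rate_limit_tracker[user_id].append(datetime.now())
    return True, remaining - 1


for i in range(22):
    allowed, remaining = check_rate_limit(FakeRequest())
    print(f"request {i + 1}: allowed={allowed}, remaining={remaining}")
# Requests 1-20 are allowed; 21 and 22 are rejected until entries age out.
```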
assets/bellingcat.svg ADDED
requirements.txt ADDED
@@ -0,0 +1,14 @@
+ # Gradio for UI and API
+ gradio>=4.0.0
+
+ # Supabase client for vector store
+ supabase>=2.0.0
+
+ # Hugging Face Inference (for LLM and embeddings)
+ huggingface-hub>=0.20.0
+
+ # Environment variables
+ python-dotenv>=1.0.0
+
+ # Utilities
+ pydantic>=2.0.0
src/__init__.py ADDED
@@ -0,0 +1,3 @@
+ """Graphics Guide / Design Assistant - Core modules"""
+
+ __version__ = "0.1.0"
src/llm_client.py ADDED
@@ -0,0 +1,195 @@
+ """LLM client for Hugging Face Inference API"""
+
+ import os
+ from typing import Iterator, Optional
+ from huggingface_hub import InferenceClient
+
+
+ class InferenceProviderClient:
+     """Client for Hugging Face Inference API"""
+
+     def __init__(
+         self,
+         model: str = "meta-llama/Llama-3.1-8B-Instruct",
+         api_key: Optional[str] = None,
+         temperature: float = 0.3,
+         max_tokens: int = 800
+     ):
+         """
+         Initialize the Inference client
+
+         Args:
+             model: Model identifier (default: Llama-3.1-8B-Instruct)
+             api_key: HuggingFace API token (defaults to HF_TOKEN env var)
+             temperature: Sampling temperature (0.0 to 1.0)
+             max_tokens: Maximum tokens to generate
+         """
+         self.model = model
+         self.temperature = temperature
+         self.max_tokens = max_tokens
+
+         # Get API key from parameter or environment
+         api_key = api_key or os.getenv("HF_TOKEN")
+         if not api_key:
+             raise ValueError("HF_TOKEN environment variable must be set or api_key provided")
+
+         # Initialize Hugging Face Inference Client
+         self.client = InferenceClient(token=api_key)
+
+     def generate(
+         self,
+         prompt: str,
+         system_prompt: Optional[str] = None,
+         temperature: Optional[float] = None,
+         max_tokens: Optional[int] = None
+     ) -> str:
+         """
+         Generate a response from the LLM
+
+         Args:
+             prompt: User prompt
+             system_prompt: Optional system prompt
+             temperature: Override default temperature
+             max_tokens: Override default max tokens
+
+         Returns:
+             Generated text response
+         """
+         messages = []
+
+         if system_prompt:
+             messages.append({"role": "system", "content": system_prompt})
+
+         messages.append({"role": "user", "content": prompt})
+
+         response = self.client.chat_completion(
+             model=self.model,
+             messages=messages,
+             temperature=temperature or self.temperature,
+             max_tokens=max_tokens or self.max_tokens
+         )
+
+         return response.choices[0].message.content
+
+     def generate_stream(
+         self,
+         prompt: str,
+         system_prompt: Optional[str] = None,
+         temperature: Optional[float] = None,
+         max_tokens: Optional[int] = None
+     ) -> Iterator[str]:
+         """
+         Generate a streaming response from the LLM
+
+         Args:
+             prompt: User prompt
+             system_prompt: Optional system prompt
+             temperature: Override default temperature
+             max_tokens: Override default max tokens
+
+         Yields:
+             Text chunks as they are generated
+         """
+         messages = []
+
+         if system_prompt:
+             messages.append({"role": "system", "content": system_prompt})
+
+         messages.append({"role": "user", "content": prompt})
+
+         stream = self.client.chat_completion(
+             model=self.model,
+             messages=messages,
+             temperature=temperature or self.temperature,
+             max_tokens=max_tokens or self.max_tokens,
+             stream=True
+         )
+
+         for chunk in stream:
+             try:
+                 if hasattr(chunk, 'choices') and len(chunk.choices) > 0:
+                     if hasattr(chunk.choices[0], 'delta') and hasattr(chunk.choices[0].delta, 'content'):
+                         if chunk.choices[0].delta.content is not None:
+                             yield chunk.choices[0].delta.content
+             except (IndexError, AttributeError):
+                 # Gracefully handle malformed chunks
+                 continue
+
+     def chat(
+         self,
+         messages: list[dict],
+         temperature: Optional[float] = None,
+         max_tokens: Optional[int] = None,
+         stream: bool = False
+     ):
+         """
+         Multi-turn chat completion
+
+         Args:
+             messages: List of message dicts with 'role' and 'content'
+             temperature: Override default temperature
+             max_tokens: Override default max tokens
+             stream: Whether to stream the response
+
+         Returns:
+             Response text (or iterator if stream=True)
+         """
+         response = self.client.chat_completion(
+             model=self.model,
+             messages=messages,
+             temperature=temperature or self.temperature,
+             max_tokens=max_tokens or self.max_tokens,
+             stream=stream
+         )
+
+         if stream:
+             def stream_generator():
+                 for chunk in response:
+                     try:
+                         if hasattr(chunk, 'choices') and len(chunk.choices) > 0:
+                             if hasattr(chunk.choices[0], 'delta') and hasattr(chunk.choices[0].delta, 'content'):
+                                 if chunk.choices[0].delta.content is not None:
+                                     yield chunk.choices[0].delta.content
+                     except (IndexError, AttributeError):
+                         # Gracefully handle malformed chunks
+                         continue
+             return stream_generator()
+         else:
+             return response.choices[0].message.content
+
+
+ def create_llm_client(
+     model: str = "meta-llama/Llama-3.1-8B-Instruct",
+     temperature: float = 0.7,
+     max_tokens: int = 2000
+ ) -> InferenceProviderClient:
+     """
+     Factory function to create and return a configured LLM client
+
+     Args:
+         model: Model identifier
+         temperature: Sampling temperature
+         max_tokens: Maximum tokens to generate
+
+     Returns:
+         Configured InferenceProviderClient
+     """
+     return InferenceProviderClient(
+         model=model,
+         temperature=temperature,
+         max_tokens=max_tokens
+     )
+
+
+ # Available models
+ AVAILABLE_MODELS = {
+     "llama-3.1-8b": "meta-llama/Llama-3.1-8B-Instruct",
+     "llama-3-8b": "meta-llama/Meta-Llama-3-8B-Instruct",
+     "qwen-72b": "Qwen/Qwen2.5-72B-Instruct",
+     "mistral-7b": "mistralai/Mistral-7B-Instruct-v0.3",
+ }
+
+
+ def get_model_identifier(model_name: str) -> str:
+     """Get full model identifier from short name"""
+     return AVAILABLE_MODELS.get(model_name, AVAILABLE_MODELS["llama-3.1-8b"])
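A short usage sketch for the client above (assumes `HF_TOKEN` is set in the environment; the prompts are illustrative):

```python
from dotenv import load_dotenv
from src.llm_client import create_llm_client

load_dotenv()
client = create_llm_client(temperature=0.2, max_tokens=400)

# One-shot generation
print(client.generate(
    prompt="Suggest a chart type for monthly revenue by region.",
    system_prompt="You are a data visualization advisor.",
))

# Streaming: chunks are yielded as they arrive
for chunk in client.generate_stream(prompt="When is a slope chart better than a line chart?"):
    print(chunk, end="", flush=True)
```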
src/prompts.py ADDED
@@ -0,0 +1,128 @@
+ """Prompt templates for Graphics Guide / Design Assistant"""
+
+
+ SYSTEM_PROMPT = """You are a graphics and information design advisor. Help users select appropriate visualizations and provide technical implementation guidance.
+
+ RULES:
+ 1. Recommend graphic types and approaches based on user intent and data characteristics
+ 2. Explain WHY a particular visualization is suitable and HOW to implement it
+ 3. Reference best practices and examples from the provided knowledge base
+ 4. Provide step-by-step guidance in logical order
+ 5. Keep response under 500 words
+ 6. For follow-up questions, provide additional details, examples, or technical specifics
+ 7. Consider accessibility, clarity, and effectiveness in your recommendations
+
+ Format:
+ **Design Recommendations:**
+ 1. [Visualization Type]
+    - When to use: [Context and use cases]
+    - How to implement: [Technical guidance, tools, or techniques]
+    - Best practices: [Key considerations from research/examples]
+
+ 2. [Alternative or complementary approach]
+    - When to use: [Context]
+    - How to implement: [Guidance]
+    - Best practices: [Considerations]
+
+ **Key Principles:** [Important design considerations or tips]"""
+
+
+ DESIGN_PROMPT_TEMPLATE = """USER QUESTION: {query}
+
+ RELEVANT KNOWLEDGE FROM RESEARCH & EXAMPLES:
+ {context}
+
+ INSTRUCTIONS:
+ - Recommend 2-4 appropriate visualization or design approaches
+ - Explain WHY each approach is suitable for the user's intent
+ - Provide HOW-TO guidance with specific techniques, tools, or implementation details
+ - Reference examples and best practices from the knowledge base above
+ - Keep response under 500 words total
+ - If user asks for more details, provide specific examples, code snippets, or deeper technical guidance
+
+ Respond with:
+ **Design Recommendations:**
+ 1. [Visualization/Design Approach]
+    - When to use: [Explain why this fits the user's intent and data type]
+    - How to implement: [Specific tools, techniques, or code examples]
+    - Best practices: [Key principles from research, accessibility, effectiveness]
+
+ 2. [Alternative Approach]
+    - When to use: [Context and rationale]
+    - How to implement: [Technical guidance]
+    - Best practices: [Considerations]
+
+ **Key Principles:** [Important design considerations, potential pitfalls, or expert tips]"""
+
+
+ FOLLOWUP_PROMPT_TEMPLATE = """You are an expert graphics and information design advisor continuing a conversation.
+
+ CONVERSATION HISTORY:
+ {chat_history}
+
+ USER FOLLOW-UP QUESTION:
+ {query}
+
+ RELEVANT KNOWLEDGE FROM RESEARCH & EXAMPLES:
+ {context}
+
+ Based on the conversation history and the user's follow-up question, provide a helpful response. If they're asking for clarification or more details about a specific visualization or technique, provide that information with examples. If they're asking a new question, follow the structured design recommendations format."""
+
+
+ TECHNIQUE_RECOMMENDATION_TEMPLATE = """Based on this design need: {query}
+
+ Available knowledge and examples:
+ {context}
+
+ Recommend the top 3-4 most relevant design techniques, visualization types, or approaches and explain why each is suitable. Format as:
+
+ 1. **[Technique/Approach Name]**
+    - Type: [chart type, infographic style, etc.]
+    - Why it's suitable: [explanation based on intent and data characteristics]
+    - Implementation: [brief technical guidance or tools to use]
+ """
+
+
+ class SimplePromptTemplate:
+     """Simple prompt template using string formatting"""
+
+     def __init__(self, template: str, input_variables: list):
+         self.template = template
+         self.input_variables = input_variables
+
+     def format(self, **kwargs) -> str:
+         """Format the template with provided variables"""
+         return self.template.format(**kwargs)
+
+
+ DESIGN_PROMPT = SimplePromptTemplate(
+     template=DESIGN_PROMPT_TEMPLATE,
+     input_variables=["query", "context"]
+ )
+
+
+ FOLLOWUP_PROMPT = SimplePromptTemplate(
+     template=FOLLOWUP_PROMPT_TEMPLATE,
+     input_variables=["chat_history", "query", "context"]
+ )
+
+
+ TECHNIQUE_RECOMMENDATION_PROMPT = SimplePromptTemplate(
+     template=TECHNIQUE_RECOMMENDATION_TEMPLATE,
+     input_variables=["query", "context"]
+ )
+
+
+ def get_design_prompt(include_system: bool = True) -> SimplePromptTemplate:
+     """Get the main design recommendation prompt template"""
+     return DESIGN_PROMPT
+
+
+ def get_followup_prompt() -> SimplePromptTemplate:
+     """Get the follow-up conversation prompt template"""
+     return FOLLOWUP_PROMPT
+
+
+ def get_technique_recommendation_prompt() -> SimplePromptTemplate:
+     """Get the technique recommendation prompt template"""
+     return TECHNIQUE_RECOMMENDATION_PROMPT
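The wrapper above is plain `str.format`; for instance (the context string here is an illustrative stand-in for `format_documents_for_context` output):

```python
from src.prompts import DESIGN_PROMPT

prompt = DESIGN_PROMPT.format(
    query="How should I visualize survey results?",
    context="Document 1: Diverging bar charts suit Likert-scale data...",
)
print(prompt[:200])
```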
src/rag_pipeline.py ADDED
@@ -0,0 +1,160 @@
+ """RAG pipeline for Graphics Guide / Design Assistant"""
+
+ from typing import Iterator, Optional, List, Tuple, Union
+ from .vectorstore import GraphicsVectorStore, create_vectorstore
+ from .llm_client import InferenceProviderClient, create_llm_client
+ from .prompts import (
+     SYSTEM_PROMPT,
+     DESIGN_PROMPT,
+     get_design_prompt
+ )
+
+
+ class GraphicsDesignPipeline:
+     """RAG pipeline for generating graphics and design recommendations"""
+
+     def __init__(
+         self,
+         vectorstore: Optional[GraphicsVectorStore] = None,
+         llm_client: Optional[InferenceProviderClient] = None,
+         retrieval_k: int = 5
+     ):
+         """
+         Initialize the RAG pipeline
+
+         Args:
+             vectorstore: Vector store instance (creates default if None)
+             llm_client: LLM client instance (creates default if None)
+             retrieval_k: Number of document chunks to retrieve for context
+         """
+         self.vectorstore = vectorstore or create_vectorstore()
+         self.llm_client = llm_client or create_llm_client()
+         self.retrieval_k = retrieval_k
+
+     def retrieve_documents(self, query: str, k: Optional[int] = None) -> List:
+         """
+         Retrieve relevant document chunks for a query
+
+         Args:
+             query: User's design query
+             k: Number of documents to retrieve (uses default if None)
+
+         Returns:
+             List of relevant document chunks
+         """
+         k = k or self.retrieval_k
+         return self.vectorstore.similarity_search(query, k=k)
+
+     def generate_recommendations(
+         self,
+         query: str,
+         stream: bool = False
+     ) -> Union[str, Iterator[str]]:
+         """
+         Generate design recommendations for a query
+
+         Args:
+             query: User's design query
+             stream: Whether to stream the response
+
+         Returns:
+             Generated recommendations (string or iterator)
+         """
+         # Retrieve relevant documents
+         relevant_docs = self.retrieve_documents(query)
+
+         # Format documents for context
+         context = self.vectorstore.format_documents_for_context(relevant_docs)
+
+         # Generate prompt
+         prompt_template = get_design_prompt()
+         full_prompt = prompt_template.format(query=query, context=context)
+
+         # Generate response
+         if stream:
+             return self.llm_client.generate_stream(
+                 prompt=full_prompt,
+                 system_prompt=SYSTEM_PROMPT
+             )
+         else:
+             return self.llm_client.generate(
+                 prompt=full_prompt,
+                 system_prompt=SYSTEM_PROMPT
+             )
+
+     def chat(
+         self,
+         message: str,
+         history: Optional[List[Tuple[str, str]]] = None,
+         stream: bool = False
+     ) -> Union[str, Iterator[str]]:
+         """
+         Handle a chat message with conversation history
+
+         Args:
+             message: User's message
+             history: Conversation history as list of (user_msg, assistant_msg) tuples
+             stream: Whether to stream the response
+
+         Returns:
+             Generated response (string or iterator)
+         """
+         # For now, treat each message as a new design query
+         # In the future, could implement follow-up handling
+         return self.generate_recommendations(message, stream=stream)
+
+     def get_relevant_examples(
+         self,
+         query: str,
+         k: int = 5
+     ) -> List[dict]:
+         """
+         Get relevant examples and knowledge with metadata
+
+         Args:
+             query: Design query
+             k: Number of examples to recommend
+
+         Returns:
+             List of document dictionaries with metadata
+         """
+         docs = self.retrieve_documents(query, k=k)
+
+         examples = []
+         for doc in docs:
+             example = {
+                 "source": doc.metadata.get("source_id", "Unknown"),
+                 "source_type": doc.metadata.get("source_type", "N/A"),
+                 "page": doc.metadata.get("page_number"),
+                 "content": doc.page_content,
+                 "similarity": doc.metadata.get("similarity")
+             }
+             examples.append(example)
+
+         return examples
+
+
+ def create_pipeline(
+     retrieval_k: int = 5,
+     model: str = "meta-llama/Llama-3.1-8B-Instruct",
+     temperature: float = 0.2
+ ) -> GraphicsDesignPipeline:
+     """
+     Factory function to create a configured RAG pipeline
+
+     Args:
+         retrieval_k: Number of documents to retrieve
+         model: LLM model identifier
+         temperature: LLM temperature
+
+     Returns:
+         Configured GraphicsDesignPipeline
+     """
+     vectorstore = create_vectorstore()
+     llm_client = create_llm_client(model=model, temperature=temperature)
+
+     return GraphicsDesignPipeline(
+         vectorstore=vectorstore,
+         llm_client=llm_client,
+         retrieval_k=retrieval_k
+     )
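End to end, the pipeline is a single call per query; a minimal sketch (requires the `SUPABASE_*`, `HF_TOKEN`, and `JINA_API_KEY` variables from `.env.example`):

```python
from dotenv import load_dotenv
from src.rag_pipeline import create_pipeline

load_dotenv()
pipeline = create_pipeline(retrieval_k=5, temperature=0.2)

# Non-streaming: returns the full recommendation text
print(pipeline.generate_recommendations("How do I show uncertainty in forecasts?"))

# Streaming variant, as consumed by the Gradio app
for chunk in pipeline.generate_recommendations(
    "How do I show uncertainty in forecasts?", stream=True
):
    print(chunk, end="", flush=True)
```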
src/vectorstore.py ADDED
@@ -0,0 +1,313 @@
+ """Supabase PGVector connection and retrieval functionality for graphics/design documents"""
+
+ import os
+ from typing import List, Dict, Any, Optional
+ from supabase import create_client, Client
+ from huggingface_hub import InferenceClient
+
+
+ class Document:
+     """Simple document class to match LangChain interface"""
+
+     def __init__(self, page_content: str, metadata: dict):
+         self.page_content = page_content
+         self.metadata = metadata
+
+
+ class GraphicsVectorStore:
+     """Manages connection to Supabase PGVector database with graphics/design document embeddings"""
+
+     def __init__(
+         self,
+         supabase_url: Optional[str] = None,
+         supabase_key: Optional[str] = None,
+         hf_token: Optional[str] = None,
+         jina_api_key: Optional[str] = None,
+         embedding_model: str = "jina-clip-v2"
+     ):
+         """
+         Initialize the vector store connection
+
+         Args:
+             supabase_url: Supabase project URL (defaults to SUPABASE_URL env var)
+             supabase_key: Supabase anon key (defaults to SUPABASE_KEY env var)
+             hf_token: HuggingFace API token (defaults to HF_TOKEN env var)
+             jina_api_key: Jina AI API key (defaults to JINA_API_KEY env var, required for Jina models)
+             embedding_model: Embedding model to use (default: jina-clip-v2)
+         """
+         # Get credentials from parameters or environment
+         self.supabase_url = supabase_url or os.getenv("SUPABASE_URL")
+         self.supabase_key = supabase_key or os.getenv("SUPABASE_KEY")
+         self.hf_token = hf_token or os.getenv("HF_TOKEN")
+         self.jina_api_key = jina_api_key or os.getenv("JINA_API_KEY")
+
+         if not self.supabase_url or not self.supabase_key:
+             raise ValueError("SUPABASE_URL and SUPABASE_KEY environment variables must be set")
+
+         # Check for the appropriate API key based on model
+         self.embedding_model = embedding_model
+         if "jina" in self.embedding_model.lower():
+             if not self.jina_api_key:
+                 raise ValueError("JINA_API_KEY environment variable must be set for Jina models")
+         else:
+             if not self.hf_token:
+                 raise ValueError("HF_TOKEN environment variable must be set for HuggingFace models")
+
+         # Initialize Supabase client
+         self.supabase: Client = create_client(self.supabase_url, self.supabase_key)
+
+         # Initialize HuggingFace Inference client for embeddings (if using HF models)
+         if self.hf_token:
+             self.hf_client = InferenceClient(token=self.hf_token)
+
+     def _generate_embedding(self, text: str) -> List[float]:
+         """
+         Generate an embedding for text using the Jina AI Embeddings API
+
+         Args:
+             text: Text to embed
+
+         Returns:
+             List of floats representing the embedding vector (1024 dimensions)
+         """
+         try:
+             # For Jina-CLIP-v2, use the Jina AI Embeddings API
+             import requests
+             import numpy as np
+
+             # Jina AI uses their own API endpoint
+             api_url = "https://api.jina.ai/v1/embeddings"
+             headers = {
+                 "Content-Type": "application/json",
+                 "Authorization": f"Bearer {self.jina_api_key}"
+             }
+             payload = {
+                 "model": self.embedding_model,
+                 "input": [text]
+             }
+
+             response = requests.post(api_url, headers=headers, json=payload, timeout=30)
+
+             if response.status_code != 200:
+                 raise Exception(f"API returned status {response.status_code}: {response.text}")
+
+             result = response.json()
+
+             # Jina API returns embeddings in a 'data' array
+             if isinstance(result, dict) and 'data' in result:
+                 embedding = result['data'][0]['embedding']
+                 return embedding
+
+             # Fallback to standard response parsing
+             result = result if not isinstance(result, dict) else result.get('embeddings', result)
+
+             # Convert to list (handles numpy arrays and nested lists)
+             if isinstance(result, np.ndarray):
+                 if result.ndim > 1:
+                     result = result[0]  # Take first row if 2D
+                 return result.tolist()
+
+             # If it's a nested list, flatten if needed
+             if isinstance(result, list) and len(result) > 0:
+                 if isinstance(result[0], list):
+                     return result[0]  # Take first embedding if batched
+                 # Handle nested numpy arrays in list
+                 if isinstance(result[0], np.ndarray):
+                     return result[0].tolist()
+                 return result
+
+             return result
+         except Exception as e:
+             raise Exception(f"Error generating embedding with {self.embedding_model}: {str(e)}")
+
+     def similarity_search(
+         self,
+         query: str,
+         k: int = 5,
+         match_threshold: float = 0.3
+     ) -> List[Document]:
+         """
+         Perform similarity search on the graphics/design document database
+
+         Args:
+             query: Search query
+             k: Number of results to return
+             match_threshold: Minimum similarity threshold (0.0 to 1.0)
+
+         Returns:
+             List of Document objects with relevant document chunks
+         """
+         # Generate embedding for query
+         query_embedding = self._generate_embedding(query)
+
+         # Call RPC function
+         try:
+             response = self.supabase.rpc(
+                 'match_documents',
+                 {
+                     'query_embedding': query_embedding,
+                     'match_threshold': match_threshold,
+                     'match_count': k
+                 }
+             ).execute()
+
+             # Convert results to Document objects
+             documents = []
+             for item in response.data:
+                 # Handle None chunk_text
+                 chunk_text = item.get('chunk_text') or ''
+
+                 doc = Document(
+                     page_content=chunk_text,
+                     metadata={
+                         'id': item.get('id'),
+                         'source_type': item.get('source_type'),
+                         'source_id': item.get('source_id'),
+                         'title': item.get('title', ''),
+                         'content_type': item.get('content_type'),
+                         'chunk_index': item.get('chunk_index'),
+                         'page_number': item.get('page_number'),
+                         'word_count': item.get('word_count'),
+                         'metadata': item.get('metadata', {}),
+                         'similarity': item.get('similarity')
+                     }
+                 )
+                 documents.append(doc)
+
+             return documents
+
+         except Exception as e:
+             raise Exception(f"Error performing similarity search: {str(e)}")
+
+     def similarity_search_with_score(
+         self,
+         query: str,
+         k: int = 5
+     ) -> List[tuple]:
+         """
+         Perform similarity search and return documents with relevance scores
+
+         Args:
+             query: Search query
+             k: Number of results to return
+
+         Returns:
+             List of tuples (Document, score)
+         """
+         # Generate embedding for query
+         query_embedding = self._generate_embedding(query)
+
+         # Call RPC function
+         try:
+             response = self.supabase.rpc(
+                 'match_documents',
+                 {
+                     'query_embedding': query_embedding,
+                     'match_threshold': 0.0,  # Get all matches
+                     'match_count': k
+                 }
+             ).execute()
+
+             # Convert results to Document objects with scores
+             results = []
+             for item in response.data:
+                 # Handle None chunk_text
+                 chunk_text = item.get('chunk_text') or ''
+
+                 doc = Document(
+                     page_content=chunk_text,
+                     metadata={
+                         'id': item.get('id'),
+                         'source_type': item.get('source_type'),
+                         'source_id': item.get('source_id'),
+                         'title': item.get('title', ''),
+                         'content_type': item.get('content_type'),
+                         'chunk_index': item.get('chunk_index'),
+                         'page_number': item.get('page_number'),
+                         'word_count': item.get('word_count'),
+                         'metadata': item.get('metadata', {})
+                     }
+                 )
+                 score = item.get('similarity', 0.0)
+                 results.append((doc, score))
+
+             return results
+
+         except Exception as e:
+             raise Exception(f"Error performing similarity search: {str(e)}")
+
+     def get_retriever(self, k: int = 5):
+         """
+         Get a retriever-like object for LangChain compatibility
+
+         Args:
+             k: Number of results to return
+
+         Returns:
+             Simple retriever object with a get_relevant_documents method
+         """
+         class SimpleRetriever:
+             def __init__(self, vectorstore, k):
+                 self.vectorstore = vectorstore
+                 self.k = k
+
+             def get_relevant_documents(self, query: str) -> List[Document]:
+                 return self.vectorstore.similarity_search(query, k=self.k)
+
+         return SimpleRetriever(self, k)
+
+     def format_documents_for_context(self, documents: List[Document]) -> str:
+         """
+         Format retrieved documents for inclusion in LLM context
+
+         Args:
+             documents: List of retrieved Document objects
+
+         Returns:
+             Formatted string with document information
+         """
+         formatted_docs = []
+
+         for i, doc in enumerate(documents, 1):
+             metadata = doc.metadata
+             source_info = f"Source: {metadata.get('source_id', 'Unknown')}"
+             if metadata.get('page_number'):
+                 source_info += f" (Page {metadata.get('page_number')})"
+
+             doc_info = f"""
+ Document {i}: {source_info}
+ Type: {metadata.get('source_type', 'N/A')} | Content: {metadata.get('content_type', 'text')}
+ {doc.page_content}
+ """
+             formatted_docs.append(doc_info.strip())
+
+         return "\n\n---\n\n".join(formatted_docs)
+
+     def get_source_types(self) -> List[str]:
+         """Get list of available source types from database"""
+         try:
+             response = self.supabase.table('document_embeddings')\
+                 .select('source_type')\
+                 .execute()
+
+             # Extract unique source types
+             source_types = set()
+             for item in response.data:
+                 if item.get('source_type'):
+                     source_types.add(item['source_type'])
+
+             return sorted(list(source_types))
+
+         except Exception:
+             # Return common source types as fallback
+             return [
+                 "pdf",
+                 "url",
+                 "image"
+             ]
+
+
+ def create_vectorstore() -> GraphicsVectorStore:
+     """Factory function to create and return a configured vector store"""
+     return GraphicsVectorStore()
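The embedding step above reduces to one REST request; a standalone sketch against the same endpoint and model (assumes `JINA_API_KEY` is set in the environment):

```python
import os

import requests

resp = requests.post(
    "https://api.jina.ai/v1/embeddings",
    headers={
        "Content-Type": "application/json",
        "Authorization": f"Bearer {os.environ['JINA_API_KEY']}",
    },
    json={"model": "jina-clip-v2", "input": ["bar charts for categorical comparison"]},
    timeout=30,
)
resp.raise_for_status()
embedding = resp.json()["data"][0]["embedding"]
print(len(embedding))  # expected: 1024, matching the VECTOR(1024) column
```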
test_vectorstore.py ADDED
@@ -0,0 +1,34 @@
+ """Test script to verify the vectorstore connection and retrieval"""
+
+ from dotenv import load_dotenv
+ from src.vectorstore import create_vectorstore
+
+ # Load environment variables
+ load_dotenv()
+
+ print("Initializing vectorstore...")
+ try:
+     vectorstore = create_vectorstore()
+     print("✓ Vectorstore created successfully")
+
+     # Test with default threshold (should now get good matches with Jina-CLIP-v2)
+     print("\nTesting similarity search with Jina-CLIP-v2 embeddings...")
+     query = "data visualization storytelling narrative"
+     results = vectorstore.similarity_search(query, k=5)
+
+     print(f"\n✓ Found {len(results)} documents")
+     print("\nSample results:")
+     for i, doc in enumerate(results[:3], 1):
+         print(f"\n--- Document {i} ---")
+         print(f"Source: {doc.metadata.get('source_id', 'Unknown')}")
+         print(f"Type: {doc.metadata.get('source_type', 'N/A')}")
+         print(f"Page: {doc.metadata.get('page_number', 'N/A')}")
+         print(f"Content preview: {doc.page_content[:150]}...")
+         print(f"Similarity: {doc.metadata.get('similarity', 'N/A')}")
+
+     print("\n✓ All tests passed!")
+
+ except Exception as e:
+     print(f"✗ Error: {e}")
+     import traceback
+     traceback.print_exc()