
🧠 RAG Architecture & Vector Embeddings

Overview

GigMatch AI uses Retrieval-Augmented Generation (RAG) with vector embeddings to perform intelligent semantic matching between workers and gigs. This goes far beyond simple keyword matching!

🏗️ Architecture

┌─────────────────────────────────────────────────────────────┐
│                    DATA INGESTION                            │
├─────────────────────────────────────────────────────────────┤
│  50 Workers + 50 Gigs (JSON)                                │
│         ↓                                                     │
│  Text Enrichment (skills, bio, location, etc.)             │
│         ↓                                                     │
│  HuggingFace Embeddings (all-MiniLM-L6-v2)                 │
│         ↓                                                     │
│  Vector Storage (ChromaDB)                                   │
└─────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────┐
│                    QUERY PIPELINE                            │
├─────────────────────────────────────────────────────────────┤
│  User Query (worker profile or gig post)                    │
│         ↓                                                     │
│  Convert to Search Query                                     │
│         ↓                                                     │
│  Embed Query (HuggingFace)                                  │
│         ↓                                                     │
│  Semantic Search (Vector Similarity)                        │
│         ↓                                                     │
│  Retrieve Top K Results                                      │
│         ↓                                                     │
│  Calculate Match Scores                                      │
│         ↓                                                     │
│  Return Results to Agent                                     │
└─────────────────────────────────────────────────────────────┘

🦙 LlamaIndex Integration

Why LlamaIndex?

  1. Sponsor Recognition - LlamaIndex is a hackathon sponsor 🎉
  2. Production-Ready - Battle-tested RAG framework
  3. Easy Integration - Simple API for vector operations
  4. Flexible - Supports multiple vector stores and embeddings

Implementation

from llama_index.core import VectorStoreIndex, Document, StorageContext
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.vector_stores.chroma import ChromaVectorStore

# Initialize embedding model
embed_model = HuggingFaceEmbedding(
    model_name="sentence-transformers/all-MiniLM-L6-v2"
)

# Create documents with rich text
worker_doc = Document(
    text=f"Name: {name}, Skills: {skills}, Location: {location}...",
    metadata=worker_data
)

# Wrap the Chroma vector store in a storage context so the index writes to it
storage_context = StorageContext.from_defaults(vector_store=vector_store)

# Create vector index
index = VectorStoreIndex.from_documents(
    documents,
    storage_context=storage_context,
    embed_model=embed_model
)

# Query
query_engine = index.as_query_engine(similarity_top_k=5)
response = query_engine.query("Looking for plumber in Rome...")

🤗 HuggingFace Embeddings

Model: all-MiniLM-L6-v2

Why this model?

  • ✅ Fast inference (only 23M parameters)
  • ✅ Good quality embeddings (384 dimensions)
  • ✅ Pre-trained on semantic similarity
  • ✅ HuggingFace sponsor recognition 🤗

Performance:

  • Embedding time: ~20ms per text
  • Vector size: 384 dimensions
  • Cosine similarity for matching
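A quick sanity check of the vector size (a minimal sketch, reusing the embed_model created in the snippet above; the sample text is arbitrary):

vec = embed_model.get_text_embedding("pipe repair in Rome")
print(len(vec))  # 384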

How Embeddings Work

  1. Text → Vector: Each worker/gig is converted to a 384-dimensional vector
  2. Semantic Meaning: Similar meanings = similar vectors
  3. Cosine Similarity: The cosine of the angle between vectors; for these sentence embeddings, scores typically fall between 0 and 1 (higher = more similar)
  4. Top K: Return K most similar vectors

Example:

text1 = "Experienced plumber, pipe repair, Rome"
text2 = "Looking for plumbing services, leak fix, Rome"

# After embedding:
vec1 = [0.23, -0.45, 0.67, ...]  # 384 dimensions
vec2 = [0.21, -0.43, 0.69, ...]  # 384 dimensions

# Cosine similarity: 0.94 (very similar!)

📊 ChromaDB Vector Store

Why ChromaDB?

  • ✅ Simple local setup (no server needed)
  • ✅ Fast vector search
  • ✅ Native Python API
  • ✅ Persistence support
  • ✅ Perfect for demo/hackathon

Collections

Workers Collection:

  • 50 worker profiles
  • Indexed by skills, experience, location
  • Searchable by semantic similarity

Gigs Collection:

  • 50 gig posts
  • Indexed by requirements, project details
  • Searchable by semantic similarity
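A minimal sketch of how these two collections might be created with a persistent ChromaDB client (the storage path here is an assumption; the collection names match those used elsewhere in this doc):

import chromadb

# A persistent client keeps the vectors on disk between runs
chroma_client = chromadb.PersistentClient(path="./chroma_db")

workers_collection = chroma_client.get_or_create_collection("workers")
gigs_collection = chroma_client.get_or_create_collection("gigs")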

🎯 Semantic Matching Algorithm

Traditional Keyword Matching (OLD)

# Problem: Only finds exact keyword matches
if "plumbing" in worker_skills and "plumbing" in gig_requirements:
    score += 1  # Match!

Semantic Matching with RAG (NEW)

# Solution: Understands meaning and context

Query: "Need someone to fix leaking pipes"
Embedding: [0.23, -0.45, 0.67, ...]

Worker 1: "Plumber, pipe repair specialist"
Embedding: [0.21, -0.43, 0.69, ...]
Similarity: 0.94 ← HIGH MATCH!

Worker 2: "Electrician, wiring expert"
Embedding: [-0.11, 0.52, -0.33, ...]
Similarity: 0.12 ← LOW MATCH

# Semantic search finds Worker 1 even though 
# the word "plumbing" wasn't explicitly mentioned!

Advantages

  1. Synonym Understanding: "plumber" ≈ "pipe specialist"
  2. Context Awareness: "fix pipes" ≈ "repair plumbing"
  3. Related Concepts: "garden" ≈ "landscaping" ≈ "outdoor"
  4. Phrasing Tolerance: Slight wording variations map to nearby vectors
  5. Fuzzy Matching: Typos and word-form variations still produce useful matches

🔬 Match Score Calculation

Components

  1. Semantic Similarity (70% weight)

    • Cosine similarity from vector embeddings
    • Range: 0.0 to 1.0
    • Higher = better semantic match
  2. Keyword Overlap (20% weight)

    • Exact skill matches
    • Experience level alignment
    • Calculated as: matched_skills / required_skills
  3. Location Match (10% weight)

    • Geographic proximity
    • Remote work consideration
    • Two levels: 1.0 (same location or remote-friendly) or 0.5 (different)

Final Formula

import numpy as np

def cosine_similarity(a, b):
    # Cosine of the angle between the query and document vectors
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

semantic_score = cosine_similarity(query_vec, doc_vec)
keyword_score = len(matched_skills) / len(required_skills)
location_score = 1.0 if location_match else 0.5

final_score = (
    semantic_score * 0.7 +
    keyword_score * 0.2 +
    location_score * 0.1
) * 100  # Convert to 0-100 scale
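Worked example with hypothetical inputs: a semantic score of 0.94, a keyword score of 0.8 (4 of 5 required skills matched), and a location score of 1.0 give (0.94 × 0.7 + 0.8 × 0.2 + 1.0 × 0.1) × 100 = 91.8.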

📈 Performance & Scalability

Current Setup (Demo)

  • 50 workers + 50 gigs = 100 vectors
  • Average query time: ~100ms
  • Embedding model loaded in memory: ~100MB
  • Total memory usage: ~200MB

Production Scaling

For 10,000 entries:

  • ✅ Still fast (<500ms per query)
  • ✅ ChromaDB handles this scale easily
  • ✅ Consider batch embedding for ingestion (see the sketch below)

For 100,000+ entries:

  • Use hosted vector DB (Pinecone, Weaviate)
  • Batch processing for embeddings
  • Caching layer for frequent queries
  • GPU acceleration for embedding
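A hedged sketch of batch embedding during ingestion, using LlamaIndex's batched embedding call on the documents built earlier (names reused from the snippets above):

# Embed many texts in one batched call instead of one call per document
texts = [doc.text for doc in documents]
vectors = embed_model.get_text_embedding_batch(texts, show_progress=True)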

🎨 Benefits for the Hackathon

Why This is WOW

  1. Not Just LLM Calls: Real vector database with semantic search
  2. Sponsor Integration: LlamaIndex 🦙 + HuggingFace 🤗
  3. Production Patterns: Proper RAG architecture
  4. Scalable: Easy to extend to 1000s of entries
  5. Explainable: Can show similarity scores

Demo Impact

Judges will see:

  • ✅ "Powered by LlamaIndex + HuggingFace"
  • ✅ Semantic similarity scores in results
  • ✅ Better matches than keyword search
  • ✅ 100 entries in vector database
  • ✅ Real-time vector search

🔮 Future Enhancements

Easy Wins

  • Add filters (location, budget, experience)
  • Implement hybrid search (semantic + keyword); see the sketch below
  • Add reranking with cross-encoders
  • Cache popular queries
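One way the hybrid search item might look, as a minimal sketch that blends the semantic score with exact keyword overlap (the function and its alpha weighting are hypothetical, not the current implementation):

def hybrid_score(query_skills, worker_skills, semantic_score, alpha=0.7):
    # Blend vector similarity with exact skill-keyword overlap
    overlap = len(set(query_skills) & set(worker_skills)) / max(len(query_skills), 1)
    return alpha * semantic_score + (1 - alpha) * overlap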

Advanced

  • Fine-tune embedding model on gig data
  • Multi-modal embeddings (add images)
  • Graph relationships between skills
  • Temporal embeddings (availability matching)

📚 Code Examples

Creating the Index

# 1. Load data
workers = load_workers_from_json()

# 2. Create documents
documents = []
for worker in workers:
    text = f"""
    Name: {worker['name']}
    Skills: {', '.join(worker['skills'])}
    Experience: {worker['experience']}
    Location: {worker['location']}
    """
    doc = Document(text=text, metadata=worker)
    documents.append(doc)

# 3. Create vector store (get_or_create avoids errors if the collection exists)
chroma_collection = chroma_client.get_or_create_collection("workers")
vector_store = ChromaVectorStore(chroma_collection=chroma_collection)
storage_context = StorageContext.from_defaults(vector_store=vector_store)

# 4. Build index
index = VectorStoreIndex.from_documents(
    documents,
    storage_context=storage_context,
    embed_model=embed_model
)

Querying the Index

# 1. Create query
query = f"""
Looking for: {', '.join(required_skills)}
Location: {location}
Experience: {experience_level}
"""

# 2. Get query engine
query_engine = index.as_query_engine(similarity_top_k=5)

# 3. Execute query
response = query_engine.query(query)

# 4. Extract results (each source node carries the matched worker + similarity score)
for node in response.source_nodes:
    worker_data = node.metadata
    similarity_score = node.score
    print(f"Match: {worker_data['name']}, Score: {similarity_score:.2f}")

🎯 Key Takeaways

  1. RAG = Better Matches: Semantic understanding > keyword matching
  2. LlamaIndex = Easy: Production RAG in <100 lines of code
  3. HuggingFace = Quality: Great embeddings, sponsor recognition
  4. ChromaDB = Fast: Local vector store, perfect for demo
  5. Scalable = Future-proof: Architecture works at scale

This is what makes GigMatch AI stand out in the hackathon! 🚀