D3MI4N commited on
Commit
b36ff59
Β·
1 Parent(s): 73b0655

clean up project repo

Browse files
.gitignore CHANGED
@@ -51,6 +51,17 @@ venv.bak/
51
  .pytest_cache/
52
  .coverage
53
  htmlcov/
 
 
 
 
 
 
 
 
 
 
 
54
 
55
  # Database files (if downloading local copies)
56
  *.db
 
51
  .pytest_cache/
52
  .coverage
53
  htmlcov/
54
+ .tox/
55
+ nosetests.xml
56
+ coverage.xml
57
+ *.cover
58
+ .hypothesis/
59
+
60
+ # Test artifacts and outputs
61
+ tests/output/
62
+ tests/results/
63
+ test_results/
64
+ *.test
65
 
66
  # Database files (if downloading local copies)
67
  *.db
DATABASE_README.md CHANGED
@@ -1,6 +1,6 @@
1
  # GAIA Agent with Database Search Integration
2
 
3
- This enhanced GAIA agent system includes semantic search against your Supabase database to find similar questions before processing new ones, improving both accuracy and efficiency.
4
 
5
  ## πŸ—οΈ Architecture
6
 
@@ -46,7 +46,7 @@ agents-course-v2/
46
  ```
47
 
48
  ### 2. Example Database Entries
49
- Your database contains 165 GAIA Q&A pairs like:
50
  ```json
51
  {
52
  "question": "A paper about AI regulation submitted to arXiv.org in June 2022...",
@@ -64,11 +64,11 @@ The system uses:
64
  ## πŸ› οΈ Setup
65
 
66
  ### 1. Environment Variables
67
- Add to your `.env` file:
68
  ```env
69
- OPENAI_API_KEY=your_openai_key
70
- SUPABASE_URL=your_supabase_url
71
- SUPABASE_SERVICE_KEY=your_SUPABASE_SERVICE_KEY
72
  ```
73
 
74
  ### 2. Install Dependencies
@@ -126,4 +126,4 @@ answer = answer_gaia_question(
126
  - **Strategy**: Database-enhanced agent coordination
127
  - **Focus**: Exact answer formatting and efficient tool usage
128
 
129
- This system leverages your existing 165 GAIA Q&A pairs to bootstrap better performance on new questions, making your agent more competitive on the leaderboard!
 
1
  # GAIA Agent with Database Search Integration
2
 
3
+ This enhanced GAIA agent system includes semantic search against a Supabase database to find similar questions before processing new ones, improving both accuracy and efficiency.
4
 
5
  ## πŸ—οΈ Architecture
6
 
 
46
  ```
47
 
48
  ### 2. Example Database Entries
49
+ The database contains 165 GAIA Q&A pairs like:
50
  ```json
51
  {
52
  "question": "A paper about AI regulation submitted to arXiv.org in June 2022...",
 
64
  ## πŸ› οΈ Setup
65
 
66
  ### 1. Environment Variables
67
+ Add to the `.env` file:
68
  ```env
69
+ OPENAI_API_KEY=openai_api_key
70
+ SUPABASE_URL=supabase_url
71
+ SUPABASE_SERVICE_KEY=supabase_service_key
72
  ```
73
 
74
  ### 2. Install Dependencies
 
126
  - **Strategy**: Database-enhanced agent coordination
127
  - **Focus**: Exact answer formatting and efficient tool usage
128
 
129
+ This system leverages existing 165 GAIA Q&A pairs to bootstrap better performance on new questions, making the agent more competitive on the leaderboard!
SUPABASE_SETUP.md CHANGED
@@ -9,7 +9,7 @@ This SQL function enables efficient vector similarity search:
9
  ```sql
10
  -- Create the similarity search function for LangChain integration
11
  create or replace function match_documents_langchain (
12
- query_embedding vector(1536), -- Adjust dimension based on your embedding model
13
  match_threshold float default 0.75,
14
  match_count int default 3
15
  )
@@ -74,9 +74,9 @@ end;
74
  $$;
75
  ```
76
 
77
- ### 3. Update Your Database Table Structure
78
 
79
- Ensure your `documents` table has the right structure:
80
 
81
  ```sql
82
  -- Check/create the documents table structure
@@ -96,17 +96,17 @@ WITH (lists = 100);
96
 
97
  ### 4. Environment Variables
98
 
99
- Update your `.env` file:
100
 
101
  ```env
102
  # Required for both approaches
103
- SUPABASE_URL=your_supabase_project_url
104
- SUPABASE_SERVICE_KEY=your_SUPABASE_SERVICE_KEY
105
  # Alternative key name (some setups use this)
106
- SUPABASE_KEY=your_SUPABASE_SERVICE_KEY
107
 
108
  # Optional: For OpenAI fallback
109
- OPENAI_API_KEY=your_openai_api_key
110
  ```
111
 
112
  ## Performance Comparison
@@ -123,7 +123,7 @@ OPENAI_API_KEY=your_openai_api_key
123
  ❌ **Costs money per embedding**
124
  ❌ **API rate limits**
125
 
126
- ## Testing Your Setup
127
 
128
  1. **Test the function exists:**
129
  ```sql
 
9
  ```sql
10
  -- Create the similarity search function for LangChain integration
11
  create or replace function match_documents_langchain (
12
+ query_embedding vector(1536), -- Adjust dimension based on embedding model
13
  match_threshold float default 0.75,
14
  match_count int default 3
15
  )
 
74
  $$;
75
  ```
76
 
77
+ ### 3. Update Database Table Structure
78
 
79
+ Ensure the `documents` table has the right structure:
80
 
81
  ```sql
82
  -- Check/create the documents table structure
 
96
 
97
  ### 4. Environment Variables
98
 
99
+ Update the `.env` file:
100
 
101
  ```env
102
  # Required for both approaches
103
+ SUPABASE_URL=supabase_project_url
104
+ SUPABASE_SERVICE_KEY=supabase_service_key
105
  # Alternative key name (some setups use this)
106
+ SUPABASE_KEY=supabase_service_key
107
 
108
  # Optional: For OpenAI fallback
109
+ OPENAI_API_KEY=openai_api_key
110
  ```
111
 
112
  ## Performance Comparison
 
123
  ❌ **Costs money per embedding**
124
  ❌ **API rate limits**
125
 
126
+ ## Testing the Setup
127
 
128
  1. **Test the function exists:**
129
  ```sql
agent.py CHANGED
@@ -33,8 +33,6 @@ os.environ["TOKENIZERS_PARALLELISM"] = "false"
33
  llm = ChatOpenAI(model="gpt-4o", temperature=0)
34
 
35
 
36
-
37
-
38
  # ─────────────────────────────────────────────────────────────────────────────
39
  # SIMPLE AGENT SETUP (following course pattern)
40
  # ─────────────────────────────────────────────────────────────────────────────
@@ -106,7 +104,7 @@ def should_continue(state: MessagesState):
106
  builder.add_node("agent", gaia_agent)
107
  builder.add_node("tools", ToolNode(ALL_TOOLS))
108
 
109
- # Add edges - much simpler!
110
  builder.add_edge(START, "agent")
111
  builder.add_conditional_edges("agent", should_continue)
112
  builder.add_edge("tools", "agent") # Return to agent after using tools
 
33
  llm = ChatOpenAI(model="gpt-4o", temperature=0)
34
 
35
 
 
 
36
  # ─────────────────────────────────────────────────────────────────────────────
37
  # SIMPLE AGENT SETUP (following course pattern)
38
  # ─────────────────────────────────────────────────────────────────────────────
 
104
  builder.add_node("agent", gaia_agent)
105
  builder.add_node("tools", ToolNode(ALL_TOOLS))
106
 
107
+ # Add edges
108
  builder.add_edge(START, "agent")
109
  builder.add_conditional_edges("agent", should_continue)
110
  builder.add_edge("tools", "agent") # Return to agent after using tools
app.py CHANGED
@@ -6,21 +6,9 @@ import pandas as pd
6
  from agent import graph
7
  from langchain_core.messages import HumanMessage
8
 
9
- # (Keep Constants as is)
10
- # --- Constants ---
11
  DEFAULT_API_URL = "https://agents-course-unit4-scoring.hf.space"
12
 
13
- # --- Basic Agent Definition ---
14
- # ----- THIS IS WERE YOU CAN BUILD WHAT YOU WANT ------
15
- # class BasicAgent:
16
- # def __init__(self):
17
- # print("BasicAgent initialized.")
18
- # def __call__(self, question: str) -> str:
19
- # print(f"Agent received question (first 50 chars): {question[:50]}...")
20
- # fixed_answer = "This is a default answer."
21
- # print(f"Agent returning fixed answer: {fixed_answer}")
22
- # return fixed_answer
23
-
24
  class GaiaAgent:
25
  def __init__(self):
26
  print("Graph-based agent initialized.")
 
6
  from agent import graph
7
  from langchain_core.messages import HumanMessage
8
 
9
+ # Constants
 
10
  DEFAULT_API_URL = "https://agents-course-unit4-scoring.hf.space"
11
 
 
 
 
 
 
 
 
 
 
 
 
12
  class GaiaAgent:
13
  def __init__(self):
14
  print("Graph-based agent initialized.")
tests/README.md ADDED
@@ -0,0 +1,37 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # Tests
2
+
3
+ This directory contains test files for the GAIA agent system.
4
+
5
+ ## Test Files
6
+
7
+ - `test_database.py` - Tests database search integration and similarity matching
8
+ - `test_single.py` - Single question test for debugging specific issues
9
+ - `test_routing.py` - Tests intelligent routing and agent decision-making
10
+
11
+ ## Running Tests
12
+
13
+ Make sure to activate the virtual environment first:
14
+
15
+ ```bash
16
+ source .venv/bin/activate
17
+ ```
18
+
19
+ Then run individual tests:
20
+
21
+ ```bash
22
+ python tests/test_database.py
23
+ python tests/test_single.py
24
+ python tests/test_routing.py
25
+ ```
26
+
27
+ ## Test Structure
28
+
29
+ All test files include the necessary path setup to import modules from the parent directory:
30
+
31
+ ```python
32
+ import sys
33
+ import os
34
+ sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
35
+ ```
36
+
37
+ This allows the tests to import from the main project modules while being organized in a separate directory.
test_database.py β†’ tests/test_database.py RENAMED
@@ -1,9 +1,14 @@
1
  """
2
  Example usage of the GAIA agent with database search integration.
3
- This shows how the system works with your Supabase database.
4
  """
5
 
6
  import os
 
 
 
 
 
7
  from agent import answer_gaia_question
8
  from tools.database_tools import get_retriever
9
 
 
1
  """
2
  Example usage of the GAIA agent with database search integration.
3
+ This shows how the system works with the Supabase database.
4
  """
5
 
6
  import os
7
+ import sys
8
+
9
+ # Add parent directory to path for imports
10
+ sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
11
+
12
  from agent import answer_gaia_question
13
  from tools.database_tools import get_retriever
14
 
test_routing.py β†’ tests/test_routing.py RENAMED
@@ -2,6 +2,12 @@
2
  Test the intelligent routing system to show how the orchestrator makes decisions.
3
  """
4
 
 
 
 
 
 
 
5
  from agent import answer_gaia_question
6
 
7
  def test_intelligent_routing():
 
2
  Test the intelligent routing system to show how the orchestrator makes decisions.
3
  """
4
 
5
+ import os
6
+ import sys
7
+
8
+ # Add parent directory to path for imports
9
+ sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
10
+
11
  from agent import answer_gaia_question
12
 
13
  def test_intelligent_routing():
test_single.py β†’ tests/test_single.py RENAMED
@@ -3,6 +3,11 @@ Test a single problematic question to debug the routing logic.
3
  """
4
 
5
  import os
 
 
 
 
 
6
  from agent import answer_gaia_question
7
  from tools.database_tools import get_retriever
8
 
 
3
  """
4
 
5
  import os
6
+ import sys
7
+
8
+ # Add parent directory to path for imports
9
+ sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
10
+
11
  from agent import answer_gaia_question
12
  from tools.database_tools import get_retriever
13
 
tools/__init__.py CHANGED
@@ -4,7 +4,7 @@ Import tools from their respective modules.
4
  """
5
 
6
  from .file_tools import read_excel_file, read_csv_file, calculate_column_sum
7
- from .research_tools import web_search, get_company_info, verify_fact
8
  from .math_tools import calculate_expression, percentage_calculation, currency_format, statistical_summary
9
  from .database_tools import search_similar_gaia_questions, get_exact_answer_if_highly_similar
10
 
@@ -12,7 +12,7 @@ from .database_tools import search_similar_gaia_questions, get_exact_answer_if_h
12
  FILE_TOOLS = [read_excel_file, read_csv_file, calculate_column_sum]
13
 
14
  # Research tools
15
- RESEARCH_TOOLS = [web_search, get_company_info, verify_fact]
16
 
17
  # Mathematical tools
18
  MATH_TOOLS = [calculate_expression, percentage_calculation, currency_format, statistical_summary]
 
4
  """
5
 
6
  from .file_tools import read_excel_file, read_csv_file, calculate_column_sum
7
+ from .research_tools import web_search
8
  from .math_tools import calculate_expression, percentage_calculation, currency_format, statistical_summary
9
  from .database_tools import search_similar_gaia_questions, get_exact_answer_if_highly_similar
10
 
 
12
  FILE_TOOLS = [read_excel_file, read_csv_file, calculate_column_sum]
13
 
14
  # Research tools
15
+ RESEARCH_TOOLS = [web_search]
16
 
17
  # Mathematical tools
18
  MATH_TOOLS = [calculate_expression, percentage_calculation, currency_format, statistical_summary]
tools/research_tools.py CHANGED
@@ -19,36 +19,8 @@ def web_search(query: str, max_results: int = 5) -> str:
19
  Returns:
20
  Search results as formatted text
21
  """
22
- # Implement with your preferred search API (DuckDuckGo, Serper, etc.)
23
- # This is a placeholder - replace with actual search implementation
24
  return f"Search results for: {query}"
25
 
26
- @tool
27
- def get_company_info(company_name: str) -> str:
28
- """
29
- Get basic information about a company.
30
-
31
- Args:
32
- company_name: Name of the company
33
-
34
- Returns:
35
- Company information
36
- """
37
- # Implement company lookup logic
38
- return f"Information about {company_name}"
39
-
40
- @tool
41
- def verify_fact(claim: str) -> str:
42
- """
43
- Verify a factual claim using multiple sources.
44
-
45
- Args:
46
- claim: The claim to verify
47
-
48
- Returns:
49
- Verification result
50
- """
51
- # Implement fact verification logic
52
- return f"Verification result for: {claim}"
53
 
54
- # Add more research tools as needed
 
19
  Returns:
20
  Search results as formatted text
21
  """
22
+ # TODO:Implement search API (Tavily or DuckDuckGo)
 
23
  return f"Search results for: {query}"
24
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
25
 
26
+ # TODO: Add more research tools as needed (e.g., Wikipedia, Arxiv, etc.)
utils/supbase_fill.py CHANGED
@@ -14,11 +14,11 @@ SUPABASE_SERVICE_KEY = os.getenv("SUPABASE_SERVICE_KEY")
14
  HF_TOKEN = os.getenv("HUGGINGFACE_API_TOKEN")
15
 
16
  if not SUPABASE_URL or not SUPABASE_SERVICE_KEY:
17
- raise RuntimeError("Please set SUPABASE_URL and SUPABASE_SERVICE_KEY in your .env")
18
 
19
  if not HF_TOKEN:
20
  raise RuntimeError(
21
- "Please set HUGGINGFACE_API_TOKEN in your .env and ensure you've been granted access to the GAIA dataset."
22
  )
23
 
24
  # -----------------------------------------------------------------------------
 
14
  HF_TOKEN = os.getenv("HUGGINGFACE_API_TOKEN")
15
 
16
  if not SUPABASE_URL or not SUPABASE_SERVICE_KEY:
17
+ raise RuntimeError("Set SUPABASE_URL and SUPABASE_SERVICE_KEY in your .env")
18
 
19
  if not HF_TOKEN:
20
  raise RuntimeError(
21
+ "Set HUGGINGFACE_API_TOKEN in your .env and ensure you've been granted access to the GAIA dataset."
22
  )
23
 
24
  # -----------------------------------------------------------------------------