Create .github\copilot-instructions.md
.github/copilot-instructions.md
ADDED
@@ -0,0 +1,39 @@
<!-- Use this file to provide workspace-specific custom instructions to Copilot. For more details, visit https://code.visualstudio.com/docs/copilot/copilot-customization#_use-a-githubcopilotinstructionsmd-file -->
# Web Scraper Project Instructions
This is a Python Gradio application for web scraping that:
- Scrapes text content from websites
- Formats content as markdown
- Generates sitemaps from page links
- Provides MCP (Model Context Protocol) server functionality
## Key Libraries
- gradio[mcp]: For the web interface and MCP server capabilities
- requests: For HTTP requests
- beautifulsoup4: For HTML parsing
- markdownify: For converting HTML to markdown
- urllib.parse: For URL handling (Python standard library)
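
As an illustration of the URL handling that `urllib.parse` covers, here is a minimal sketch that resolves the links found on a page and separates same-site links from external ones. The function name `resolve_links` and its return shape are assumptions made for this example, not names taken from the project.

```python
from urllib.parse import urldefrag, urljoin, urlparse


def resolve_links(base_url: str, hrefs: list[str]) -> dict[str, list[str]]:
    """Resolve hrefs against base_url and split them into internal and external.

    Args:
        base_url: URL of the page the links were found on.
        hrefs: Raw href values; they may be relative, absolute, or fragments.

    Returns:
        A dict with "internal" and "external" lists of absolute URLs,
        de-duplicated and kept in first-seen order.
    """
    base_host = urlparse(base_url).netloc
    result: dict[str, list[str]] = {"internal": [], "external": []}
    seen: set[str] = set()
    for href in hrefs:
        # urljoin handles relative paths; urldefrag drops any "#section" part.
        absolute, _fragment = urldefrag(urljoin(base_url, href))
        if urlparse(absolute).scheme not in ("http", "https") or absolute in seen:
            continue  # skip mailto:, javascript:, and duplicates
        seen.add(absolute)
        key = "internal" if urlparse(absolute).netloc == base_host else "external"
        result[key].append(absolute)
    return result


# Hypothetical input: hrefs as they might appear in a page's anchor tags.
links = resolve_links(
    "https://example.com/docs/",
    ["intro.html", "/about", "#top", "https://other.org/x", "mailto:a@b.c"],
)
```

Comparing `netloc` values is a deliberately simple notion of "same site"; a real scraper may also want to treat `www.` variants or subdomains as internal.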
## Project Structure
- `app.py`: Main web interface application
- `mcp_server.py`: MCP server that exposes tools for AI integration
## MCP Tools
The MCP server exposes three main tools:
- `scrape_content`: Extract website content as markdown
- `generate_sitemap`: Create sitemap from page links
- `analyze_website`: Complete analysis with content and sitemap
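
To make the shape of such a tool concrete, the sketch below shows a function whose docstring and type hints carry everything a tool framework would need to describe it. The signature of `generate_sitemap` here (a list of links in, markdown out) is a hypothetical one chosen for illustration; the actual tools in `mcp_server.py` may take different parameters.

```python
import inspect
from typing import get_type_hints


def generate_sitemap(links: list[str], title: str = "Sitemap") -> str:
    """Format a list of page links as a markdown sitemap.

    Args:
        links: Absolute URLs found on the page.
        title: Heading text for the sitemap.

    Returns:
        Markdown with a heading and one bullet per unique link.
    """
    unique = list(dict.fromkeys(links))  # de-duplicate, keep first-seen order
    lines = [f"# {title}", ""] + [f"- <{url}>" for url in unique]
    return "\n".join(lines)


# What a tool framework could read from the function itself:
description = inspect.getdoc(generate_sitemap).splitlines()[0]
parameters = get_type_hints(generate_sitemap)
```

Here `description` holds the docstring's summary line and `parameters` maps each argument name to its annotated type, which is why a clear first sentence and accurate type hints matter for anything exposed as a tool.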
## Code Style
- Use type hints where appropriate
- Handle errors for web requests: set timeouts and catch connection failures and HTTP error statuses
- Follow PEP 8 style guidelines
- Add docstrings for functions with clear parameter descriptions
- Give MCP functions descriptive docstrings, since the docstrings become the tool descriptions
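
A function written to these guidelines might look like the sketch below. It is an assumption-laden example, not the project's code: the name `fetch_page` is hypothetical, and it uses the standard-library `urllib.request` in place of `requests` so that it runs without extra dependencies. With `requests`, the same pattern would pass `timeout=` to the request and catch `requests.exceptions.RequestException`.

```python
import urllib.error
import urllib.request
from urllib.parse import urlparse


def fetch_page(url: str, timeout: float = 10.0) -> str:
    """Fetch a page and return its body as text, or an error message.

    Args:
        url: Absolute http(s) URL to fetch.
        timeout: Seconds to wait before giving up.

    Returns:
        The decoded response body, or a string starting with "Error:".
    """
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        return f"Error: invalid URL {url!r}"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            charset = response.headers.get_content_charset() or "utf-8"
            return response.read().decode(charset, errors="replace")
    except urllib.error.HTTPError as exc:  # 4xx / 5xx responses
        return f"Error: HTTP {exc.code} for {url}"
    except urllib.error.URLError as exc:  # DNS failure, refused connection
        return f"Error: could not reach {url} ({exc.reason})"
    except OSError as exc:  # timeouts and other socket-level failures
        return f"Error: request to {url} failed ({exc})"
```

Returning an error string instead of raising is one possible design choice: it keeps a tool call from failing outright and gives the caller a readable message. Raising a specific exception would be equally valid if the caller is expected to handle it.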