---
license: gemma
language:
- en
base_model:
- google/functiongemma-270m-it
pipeline_tag: text-generation
tags:
- function-calling
- infrastructure
- devops
- litertlm
---

# FunctionGemma Infrastructure Tools v8

A fine-tuned [FunctionGemma 270M](https://huggingface.co/google/functiongemma-270m-it) model for infrastructure error diagnosis and remediation. Achieves **100% accuracy** on 7 infrastructure tools when using the correct tool definitions.

## Model Details

- **Base Model**: google/functiongemma-270m-it
- **Format**: LiteRT-LM (`.litertlm`), optimized for on-device inference
- **Quantization**: INT8 (Q8)
- **Size**: ~271MB
- **Training**: 50 epochs on 10,500 examples (1,500 per tool)

## Supported Tools

| Tool | Description | Use Case |
|------|-------------|----------|
| `enableCors` | Enable CORS for a specific origin | CORS policy errors, blocked cross-origin requests |
| `updateConnectionUrl` | Update service connection URL | ECONNREFUSED errors, localhost connection issues in containers |
| `setEnvVar` | Set environment variable | Missing configuration, undefined env vars |
| `addHostMapping` | Add hostname-to-IP mapping | DNS resolution (ENOTFOUND) errors |
| `increaseMemory` | Increase memory limit | OOMKilled errors, out-of-memory crashes |
| `increaseTimeout` | Increase timeout value | 504 Gateway Timeout, connection timeout errors |
| `restartService` | Restart a service | Stuck processes, stale data after deployment |

## Usage with LiteRT-LM

### Download the Model

```bash
# Using huggingface-cli
huggingface-cli download macmacmacmac/functiongemma-nextjs functiongemma-infra-v8_q8_ekv1024.litertlm
```

```python
# Or using Python
from huggingface_hub import hf_hub_download

model_path = hf_hub_download(
    repo_id="macmacmacmac/functiongemma-nextjs",
    filename="functiongemma-infra-v8_q8_ekv1024.litertlm",
)
```

### Required Tool Definitions

**Important**: You must use these exact tool definitions for optimal accuracy.
The model was trained with these specific descriptions.

```javascript
const tools = [
  {
    type: "function",
    function: {
      name: "enableCors",
      description: "Enable CORS for a specific origin to fix blocked cross-origin requests.",
      parameters: {
        type: "object",
        properties: {
          origin: { type: "string", description: "The origin to allow (e.g., http://localhost:3000)" },
          methods: { type: "string", description: "Allowed HTTP methods (e.g., GET,POST,PUT,DELETE)" }
        },
        required: ["origin"]
      }
    }
  },
  {
    type: "function",
    function: {
      name: "updateConnectionUrl",
      description: "Update a service connection URL to fix ECONNREFUSED errors, typically changing localhost to the correct service hostname.",
      parameters: {
        type: "object",
        properties: {
          service: { type: "string", description: "The service to update (e.g., database, redis, api)" },
          hostname: { type: "string", description: "The correct hostname to connect to" },
          port: { type: "integer", description: "The port number to connect to" }
        },
        required: ["service", "hostname", "port"]
      }
    }
  },
  {
    type: "function",
    function: {
      name: "setEnvVar",
      description: "Set an environment variable to fix missing configuration errors.",
      parameters: {
        type: "object",
        properties: {
          name: { type: "string", description: "Environment variable name (e.g., DATABASE_URL, API_KEY)" },
          value: { type: "string", description: "The value to set" }
        },
        required: ["name", "value"]
      }
    }
  },
  {
    type: "function",
    function: {
      name: "addHostMapping",
      description: "Add a hostname to IP mapping to fix DNS resolution (ENOTFOUND) errors.",
      parameters: {
        type: "object",
        properties: {
          hostname: { type: "string", description: "The hostname to map" },
          ip: { type: "string", description: "The IP address to map to" }
        },
        required: ["hostname", "ip"]
      }
    }
  },
  {
    type: "function",
    function: {
      name: "increaseMemory",
      description: "Increase memory limit for a service to fix OOMKilled errors.",
      parameters: {
        type: "object",
        properties: {
          service: { type: "string", description: "The service/container/pod name" },
          memoryMb: { type: "integer", description: "Memory limit in megabytes" }
        },
        required: ["service", "memoryMb"]
      }
    }
  },
  {
    type: "function",
    function: {
      name: "increaseTimeout",
      description: "Increase timeout value to fix 504 Gateway Timeout or connection timeout errors.",
      parameters: {
        type: "object",
        properties: {
          service: { type: "string", description: "The service to configure" },
          timeoutMs: { type: "integer", description: "Timeout value in milliseconds" }
        },
        required: ["service", "timeoutMs"]
      }
    }
  },
  {
    type: "function",
    function: {
      name: "restartService",
      description: "Restart a service to apply configuration changes or fix a stuck process.",
      parameters: {
        type: "object",
        properties: {
          service: { type: "string", description: "The service/container/pod name to restart" }
        },
        required: ["service"]
      }
    }
  }
];
```

### Example Usage with dad-express

```javascript
const { FunctionGemmaEngine } = require('dad-express');

const engine = new FunctionGemmaEngine({
  modelPath: './functiongemma-infra-v8_q8_ekv1024.litertlm',
  tools: JSON.stringify(tools)
});

// Diagnose an error
const result = await engine.call('Container api was OOMKilled - out of memory');
console.log(result.tool_calls[0].function);
// { name: 'increaseMemory', arguments: { service: 'api', memoryMb: 1024 } }
```

## Training Data

The model was trained on 10,500 synthetic examples covering common infrastructure errors:

| Error Pattern | Tool | Examples |
|--------------|------|----------|
| CORS policy errors | enableCors | 1,500 |
| ECONNREFUSED errors | updateConnectionUrl | 1,500 |
| Missing env vars | setEnvVar | 1,500 |
| DNS/ENOTFOUND errors | addHostMapping | 1,500 |
| OOMKilled errors | increaseMemory | 1,500 |
| Timeout errors | increaseTimeout | 1,500 |
| Stuck services | restartService | 1,500 |

### Sample Training Examples

```
"CORS error: No 'Access-Control-Allow-Origin' header from http://localhost:3000" → enableCors
"Error: connect ECONNREFUSED 127.0.0.1:5432 - database connection failed" → updateConnectionUrl
"Missing required environment variable: DATABASE_URL" → setEnvVar
"getaddrinfo ENOTFOUND db" → addHostMapping
"Container api was OOMKilled" → increaseMemory
"504 Gateway Timeout from backend" → increaseTimeout
"nginx container is not responding" → restartService
```

## Fully Loaded Serving

**Fully Loaded Serving** is an end-to-end intelligent error remediation pipeline that runs entirely on-device. It combines:

1. **Low-latency vector embeddings** (EmbeddingGemma) for streaming log classification
2. **Semantic clustering** to group similar errors/issues/outliers
3. **Function calling** (FunctionGemma) to automatically diagnose and fix infrastructure issues
4. **Prompt optimization** via [Ax](https://github.com/ax-llm/ax) with MiPRO for continuous improvement

### Architecture

```
┌──────────────────────────────────────────────────────────┐
│                   Next.js Application                    │
├──────────────────────────────────────────────────────────┤
│ stdout/stderr ──▶ Log Stream ──▶ dad-express middleware  │
│                          │                               │
│                          ▼                               │
│           ┌──────────────────────────────────┐           │
│           │ EmbeddingGemma (~5ms)            │           │
│           │ 768-dim vector per log line      │           │
│           └──────────────┬───────────────────┘           │
│                          ▼                               │
│           ┌──────────────────────────────────┐           │
│           │ Semantic Clustering (cosine)     │           │
│           │ • Group similar errors           │           │
│           │ • Detect outliers                │           │
│           │ • Identify recurring patterns    │           │
│           └──────────────┬───────────────────┘           │
│                          ▼                               │
│           ┌──────────────────────────────────┐           │
│           │ FunctionGemma (~50ms/call)       │           │
│           │ → enableCors, setEnvVar, etc.    │           │
│           └──────────────┬───────────────────┘           │
│                          ▼                               │
│           ┌──────────────────────────────────┐           │
│           │ Auto-Remediation Layer           │           │
│           │ Execute fix or notify developer  │           │
│           └──────────────────────────────────┘           │
│                                                          │
│           LiteRT-LM (on-device, ~300MB RAM)              │
└──────────────────────────────────────────────────────────┘
```

### Ax Integration with MiPRO

[Ax](https://github.com/ax-llm/ax) is a TypeScript DSPy-style framework for declarative AI programming. dad-express provides `AxLiteRTProvider` to run Ax signatures entirely on-device:

```typescript
import { AxGen } from "@ax-llm/ax";
import { AxLiteRTProvider, EmbeddingEngine, FunctionGemmaEngine } from "dad-express";

// Create on-device provider with both embedding and chat models
const provider = new AxLiteRTProvider({
  chat: {
    modelPath: "./models/functiongemma-infra-v8_q8_ekv1024.litertlm",
    tools: infrastructureTools, // The 7 tools from this repo
  },
  embed: {
    modelPath: "./models/embedding_gemma.tflite",
    tokenizerPath: "./models/tokenizer.model",
  },
});

// Define Ax signature for error diagnosis (MiPRO-optimizable)
const diagnoseError = new AxGen(`
  errorMessage:string "The error log line",
  errorCluster:string?
    "Similar errors seen recently"
  -> diagnosis:string "Root cause analysis",
  toolName:string "Which infrastructure tool to call",
  confidence:class "high, medium, low"
`);

// Run inference on-device
const result = await diagnoseError.forward(provider, {
  errorMessage: "CORS error from http://localhost:3000",
  errorCluster: "3 similar CORS errors in last 5 minutes",
});

console.log(result);
// { diagnosis: "Frontend origin not in allowed list",
//   toolName: "enableCors",
//   confidence: "high" }
```

### Example: Hosting Next.js with Fully Loaded Serving

```typescript
// server.ts - Next.js with intelligent error remediation
import { createApp, FunctionGemmaEngine, EmbeddingEngine } from "dad-express";
import { spawn } from "child_process";

// Infrastructure tools (exact definitions for 100% accuracy)
const tools = [
  { type: "function", function: { name: "enableCors", description: "Enable CORS for a specific origin to fix blocked cross-origin requests.", parameters: { type: "object", properties: { origin: { type: "string", description: "The origin to allow" } }, required: ["origin"] } } },
  { type: "function", function: { name: "updateConnectionUrl", description: "Update a service connection URL to fix ECONNREFUSED errors.", parameters: { type: "object", properties: { service: { type: "string" }, hostname: { type: "string" }, port: { type: "integer" } }, required: ["service", "hostname", "port"] } } },
  { type: "function", function: { name: "setEnvVar", description: "Set an environment variable to fix missing configuration errors.", parameters: { type: "object", properties: { name: { type: "string" }, value: { type: "string" } }, required: ["name", "value"] } } },
  { type: "function", function: { name: "addHostMapping", description: "Add a hostname to IP mapping to fix DNS resolution errors.", parameters: { type: "object", properties: { hostname: { type: "string" }, ip: { type: "string" } }, required: ["hostname", "ip"] } } },
  { type: "function", function: { name: "increaseMemory", description: "Increase memory limit for a service to fix OOMKilled errors.", parameters: { type: "object", properties: { service: { type: "string" }, memoryMb: { type: "integer" } }, required: ["service", "memoryMb"] } } },
  { type: "function", function: { name: "increaseTimeout", description: "Increase timeout value to fix 504 Gateway Timeout errors.", parameters: { type: "object", properties: { service: { type: "string" }, timeoutMs: { type: "integer" } }, required: ["service", "timeoutMs"] } } },
  { type: "function", function: { name: "restartService", description: "Restart a service to apply changes or fix stuck processes.", parameters: { type: "object", properties: { service: { type: "string" } }, required: ["service"] } } },
];

// Initialize on-device models
const embedEngine = new EmbeddingEngine({
  modelPath: "./models/embedding_gemma.tflite",
  tokenizerPath: "./models/tokenizer.model",
});

const functionGemma = new FunctionGemmaEngine({
  modelPath: "./models/functiongemma-infra-v8_q8_ekv1024.litertlm",
  tools: JSON.stringify(tools),
});

// Error clustering state
const errorClusters = new Map();

async function classifyAndCluster(logLine: string): Promise<string | null> {
  // Skip non-error lines
  if (!logLine.match(/error|fail|exception|timeout|refused|denied/i)) {
    return null;
  }

  // Generate embedding (~5ms on CPU)
  const embedding = await embedEngine.encodeAsync(logLine);

  // Find similar errors via cosine similarity
  let bestMatch: string | null = null;
  let bestSimilarity = 0.85; // Threshold for clustering

  for (const [clusterId, cluster] of errorClusters) {
    const similarity = EmbeddingEngine.cosineSimilarity(embedding, cluster.embedding);
    if (similarity > bestSimilarity) {
      bestSimilarity = similarity;
      bestMatch = clusterId;
    }
  }

  if (bestMatch) {
    // Update existing cluster
    const cluster = errorClusters.get(bestMatch)!;
    cluster.count++;
    cluster.lastSeen = new Date();
    return bestMatch;
  }

  // Create new cluster
  const clusterId = `cluster_${Date.now()}`;
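  // (Illustrative addition, not part of the original example.) On a
  // long-running server, errorClusters grows without bound; before adding a
  // new cluster you may want to evict the least-recently-seen one past a cap:
  function evictOldest(clusters: Map<string, { lastSeen: Date }>, cap: number): void {
    if (clusters.size < cap) return;
    let oldestId: string | undefined;
    let oldestTime = Infinity;
    for (const [id, c] of clusters) {
      if (c.lastSeen.getTime() < oldestTime) {
        oldestTime = c.lastSeen.getTime();
        oldestId = id;
      }
    }
    if (oldestId !== undefined) clusters.delete(oldestId);
  }
  // evictOldest(errorClusters, 100); // the 100-cluster cap is an assumption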
  errorClusters.set(clusterId, { embedding, count: 1, lastSeen: new Date() });
  return clusterId;
}

async function diagnoseAndFix(errorLog: string, clusterId: string): Promise<void> {
  const cluster = errorClusters.get(clusterId);

  // Call FunctionGemma for diagnosis (~50ms)
  const result = await functionGemma.sendMessage(errorLog);

  if (result.functionCalls && result.functionCalls.length > 0) {
    const call = result.functionCalls[0];
    console.log(`[AutoFix] Detected ${cluster?.count || 1}x: ${call.name}`);
    console.log(`[AutoFix] Args: ${JSON.stringify(call.arguments)}`);

    // Execute remediation (in production, this would call actual infrastructure APIs)
    switch (call.name) {
      case "enableCors":
        console.log(`[AutoFix] Would enable CORS for: ${call.arguments.origin}`);
        break;
      case "restartService":
        console.log(`[AutoFix] Would restart: ${call.arguments.service}`);
        break;
      case "increaseMemory":
        console.log(`[AutoFix] Would increase memory for ${call.arguments.service} to ${call.arguments.memoryMb}MB`);
        break;
      // ... handle other tools
    }
  }
}

// Create dad-express app
const app = createApp();

// API routes
app.get("/health", () => ({ status: "ok", models: { embed: true, functionGemma: true } }));

app.get("/clusters", () => {
  const clusters: Array<{ id: string; count: number; lastSeen: Date }> = [];
  for (const [id, cluster] of errorClusters) {
    clusters.push({ id, count: cluster.count, lastSeen: cluster.lastSeen });
  }
  return clusters;
});

// Start Next.js as child process with log monitoring
const nextProcess = spawn("npx", ["next", "start"], {
  stdio: ["inherit", "pipe", "pipe"],
  env: { ...process.env, PORT: "3001" },
});

// Stream stdout
nextProcess.stdout.on("data", (data) => {
  const line = data.toString().trim();
  console.log(`[next] ${line}`);
});

// Stream stderr with intelligent processing
nextProcess.stderr.on("data", async (data) => {
  const line = data.toString().trim();
  console.log(`[next:err] ${line}`);

  // Classify and cluster error
  const clusterId = await classifyAndCluster(line);
  if (clusterId) {
    // Diagnose and auto-fix
    await diagnoseAndFix(line, clusterId);
  }
});

// Start dad-express on separate port for monitoring
app.listen(4000, () => {
  console.log("dad-express monitoring on http://localhost:4000");
  console.log("Next.js app on http://localhost:3001");
});
```

### Key Benefits

| Feature | Latency | Memory | Cloud Calls |
|---------|---------|--------|-------------|
| EmbeddingGemma | ~5ms/embed | ~50MB | 0 |
| FunctionGemma | ~50ms/call | ~271MB | 0 |
| Semantic clustering | <1ms | Varies | 0 |
| **Total pipeline** | **~60ms** | **~350MB** | **0** |

- **Zero cloud dependency**: All inference runs locally via LiteRT-LM
- **Sub-100ms latency**: Fast enough for real-time log processing
- **Privacy-preserving**: Error logs never leave the device
- **Continuous improvement**: Use Ax MiPRO to optimize prompts over time

## Limitations

- Optimized for the 7 specific infrastructure tools listed above
- Requires exact tool definitions for best accuracy
- May not generalize well to error patterns not seen in training

## License
This model inherits the [Gemma license](https://ai.google.dev/gemma/terms) from the base model.