FunctionGemma Infrastructure Tools v8

A fine-tuned FunctionGemma 270M model for infrastructure error diagnosis and remediation. It achieves 100% accuracy across its 7 supported infrastructure tools when used with the exact tool definitions listed below.

Model Details

  • Base Model: google/functiongemma-270m-it
  • Format: LiteRT-LM (.litertlm) - optimized for on-device inference
  • Quantization: INT8 (Q8)
  • Size: ~271MB
  • Training: 50 epochs on 10,500 examples (1,500 per tool)

Supported Tools

Tool                 Description                          Use Case
enableCors           Enable CORS for a specific origin    CORS policy errors, blocked cross-origin requests
updateConnectionUrl  Update service connection URL        ECONNREFUSED errors, localhost connection issues in containers
setEnvVar            Set environment variable             Missing configuration, undefined env vars
addHostMapping       Add hostname to IP mapping           DNS resolution (ENOTFOUND) errors
increaseMemory       Increase memory limit                OOMKilled errors, out of memory crashes
increaseTimeout      Increase timeout value               504 Gateway Timeout, connection timeout errors
restartService       Restart a service                    Stuck processes, stale data after deployment
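For comparison, the routing in this table can be approximated by a crude keyword baseline, useful as a sanity check or as a fallback when the model is unavailable. The patterns below are illustrative, not the model's actual decision rule:

```javascript
// Deterministic keyword fallback mirroring the tool table above.
// This is NOT the fine-tuned model, just a baseline for comparison.
function baselineTool(errorLine) {
  if (/cors|access-control-allow-origin/i.test(errorLine)) return "enableCors";
  if (/econnrefused/i.test(errorLine)) return "updateConnectionUrl";
  if (/environment variable/i.test(errorLine)) return "setEnvVar";
  if (/enotfound/i.test(errorLine)) return "addHostMapping";
  if (/oomkilled|out of memory/i.test(errorLine)) return "increaseMemory";
  if (/timeout|504/i.test(errorLine)) return "increaseTimeout";
  return "restartService"; // default for stuck/unresponsive services
}

baselineTool("Container api was OOMKilled"); // → "increaseMemory"
```

A baseline like this catches the obvious cases; the model's value is in extracting the correct arguments (service names, ports, origins) from free-form logs, which regexes alone cannot do reliably.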

Usage with LiteRT-LM

Download the Model

# Using huggingface-cli
huggingface-cli download macmacmacmac/functiongemma-nextjs functiongemma-infra-v8_q8_ekv1024.litertlm

# Or using Python
from huggingface_hub import hf_hub_download
model_path = hf_hub_download(
    repo_id="macmacmacmac/functiongemma-nextjs",
    filename="functiongemma-infra-v8_q8_ekv1024.litertlm"
)

Required Tool Definitions

Important: You must use these exact tool definitions for optimal accuracy. The model was trained with these specific descriptions.

const tools = [
  {
    type: "function",
    function: {
      name: "enableCors",
      description: "Enable CORS for a specific origin to fix blocked cross-origin requests.",
      parameters: {
        type: "object",
        properties: {
          origin: { type: "string", description: "The origin to allow (e.g., http://localhost:3000)" },
          methods: { type: "string", description: "Allowed HTTP methods (e.g., GET,POST,PUT,DELETE)" }
        },
        required: ["origin"]
      }
    }
  },
  {
    type: "function",
    function: {
      name: "updateConnectionUrl",
      description: "Update a service connection URL to fix ECONNREFUSED errors, typically changing localhost to the correct service hostname.",
      parameters: {
        type: "object",
        properties: {
          service: { type: "string", description: "The service to update (e.g., database, redis, api)" },
          hostname: { type: "string", description: "The correct hostname to connect to" },
          port: { type: "integer", description: "The port number to connect to" }
        },
        required: ["service", "hostname", "port"]
      }
    }
  },
  {
    type: "function",
    function: {
      name: "setEnvVar",
      description: "Set an environment variable to fix missing configuration errors.",
      parameters: {
        type: "object",
        properties: {
          name: { type: "string", description: "Environment variable name (e.g., DATABASE_URL, API_KEY)" },
          value: { type: "string", description: "The value to set" }
        },
        required: ["name", "value"]
      }
    }
  },
  {
    type: "function",
    function: {
      name: "addHostMapping",
      description: "Add a hostname to IP mapping to fix DNS resolution (ENOTFOUND) errors.",
      parameters: {
        type: "object",
        properties: {
          hostname: { type: "string", description: "The hostname to map" },
          ip: { type: "string", description: "The IP address to map to" }
        },
        required: ["hostname", "ip"]
      }
    }
  },
  {
    type: "function",
    function: {
      name: "increaseMemory",
      description: "Increase memory limit for a service to fix OOMKilled errors.",
      parameters: {
        type: "object",
        properties: {
          service: { type: "string", description: "The service/container/pod name" },
          memoryMb: { type: "integer", description: "Memory limit in megabytes" }
        },
        required: ["service", "memoryMb"]
      }
    }
  },
  {
    type: "function",
    function: {
      name: "increaseTimeout",
      description: "Increase timeout value to fix 504 Gateway Timeout or connection timeout errors.",
      parameters: {
        type: "object",
        properties: {
          service: { type: "string", description: "The service to configure" },
          timeoutMs: { type: "integer", description: "Timeout value in milliseconds" }
        },
        required: ["service", "timeoutMs"]
      }
    }
  },
  {
    type: "function",
    function: {
      name: "restartService",
      description: "Restart a service to apply configuration changes or fix a stuck process.",
      parameters: {
        type: "object",
        properties: {
          service: { type: "string", description: "The service/container/pod name to restart" }
        },
        required: ["service"]
      }
    }
  }
];

Example Usage with dad-express

const { FunctionGemmaEngine } = require('dad-express');

const engine = new FunctionGemmaEngine({
  modelPath: './functiongemma-infra-v8_q8_ekv1024.litertlm',
  tools: JSON.stringify(tools)
});

// Diagnose an error
const result = await engine.call('Container api was OOMKilled - out of memory');
console.log(result.tool_calls[0].function);
// { name: 'increaseMemory', arguments: { service: 'api', memoryMb: 1024 } }

Training Data

The model was trained on 10,500 synthetic examples covering common infrastructure errors:

Error Pattern         Tool                 Examples
CORS policy errors    enableCors           1,500
ECONNREFUSED errors   updateConnectionUrl  1,500
Missing env vars      setEnvVar            1,500
DNS/ENOTFOUND errors  addHostMapping       1,500
OOMKilled errors      increaseMemory       1,500
Timeout errors        increaseTimeout      1,500
Stuck services        restartService       1,500

Sample Training Examples

"CORS error: No 'Access-Control-Allow-Origin' header from http://localhost:3000" β†’ enableCors
"Error: connect ECONNREFUSED 127.0.0.1:5432 - database connection failed" β†’ updateConnectionUrl
"Missing required environment variable: DATABASE_URL" β†’ setEnvVar
"getaddrinfo ENOTFOUND db" β†’ addHostMapping
"Container api was OOMKilled" β†’ increaseMemory
"504 Gateway Timeout from backend" β†’ increaseTimeout
"nginx container is not responding" β†’ restartService

Fully Loaded Serving

Fully Loaded Serving is an end-to-end intelligent error remediation pipeline that runs entirely on-device. It combines:

  1. Low-latency vector embeddings (EmbeddingGemma) for streaming log classification
  2. Semantic clustering to group similar errors/issues/outliers
  3. Function calling (FunctionGemma) to automatically diagnose and fix infrastructure issues
  4. Prompt optimization via Ax with MiPRO for continuous improvement
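The four stages compose into a single pass per log line. Here is a minimal sketch with every engine stubbed out; the stubs merely stand in for the EmbeddingGemma, clustering, FunctionGemma, and remediation components described above:

```javascript
// Stage stubs; in the real pipeline these are the on-device engines.
const embed = async (line) => ({ line, vector: new Float32Array([1, 0]) }); // 1. EmbeddingGemma
const cluster = async (e) => ({ ...e, clusterId: "cluster_demo" });         // 2. semantic clustering
const diagnose = async (c) => ({ ...c, tool: "enableCors" });               // 3. FunctionGemma
const remediate = async (d) => `applied ${d.tool} for ${d.clusterId}`;      // 4. auto-remediation

// One pass per log line: embed → cluster → diagnose → remediate.
async function pipeline(logLine) {
  return remediate(await diagnose(await cluster(await embed(logLine))));
}

pipeline("CORS error from http://localhost:3000").then(console.log);
// logs: applied enableCors for cluster_demo
```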

Architecture

┌─────────────────────────────────────────────────────────────────────────┐
│                         Next.js Application                             │
├─────────────────────────────────────────────────────────────────────────┤
│  stdout/stderr ──▶ Log Stream ──▶ dad-express middleware                │
│                                          │                              │
│                    ┌─────────────────────┼──────────────────────┐       │
│                    │                     ▼                      │       │
│                    │  ┌──────────────────────────────────┐      │       │
│                    │  │      EmbeddingGemma (~5ms)       │      │       │
│                    │  │   768-dim vector per log line    │      │       │
│                    │  └──────────────┬───────────────────┘      │       │
│                    │                 │                          │       │
│                    │                 ▼                          │       │
│                    │  ┌──────────────────────────────────┐      │       │
│                    │  │   Semantic Clustering (cosine)   │      │       │
│                    │  │  • Group similar errors          │      │       │
│                    │  │  • Detect outliers               │      │       │
│                    │  │  • Identify recurring patterns   │      │       │
│                    │  └──────────────┬───────────────────┘      │       │
│                    │                 │                          │       │
│                    │                 ▼                          │       │
│                    │  ┌──────────────────────────────────┐      │       │
│                    │  │   FunctionGemma (~50ms/call)     │      │       │
│                    │  │  → enableCors, setEnvVar, etc.   │      │       │
│                    │  └──────────────┬───────────────────┘      │       │
│                    │                 │                          │       │
│                    │                 ▼                          │       │
│                    │  ┌──────────────────────────────────┐      │       │
│                    │  │      Auto-Remediation Layer      │      │       │
│                    │  │  Execute fix or notify developer │      │       │
│                    │  └──────────────────────────────────┘      │       │
│                    │                                            │       │
│                    │     LiteRT-LM (on-device, ~300MB RAM)      │       │
│                    └────────────────────────────────────────────┘       │
└─────────────────────────────────────────────────────────────────────────┘

Ax Integration with MiPRO

Ax is a TypeScript DSPy-style framework for declarative AI programming. dad-express provides AxLiteRTProvider to run Ax signatures entirely on-device:

import { AxGen } from "@ax-llm/ax";
import { AxLiteRTProvider, EmbeddingEngine, FunctionGemmaEngine } from "dad-express";

// Create on-device provider with both embedding and chat models
const provider = new AxLiteRTProvider({
  chat: {
    modelPath: "./models/functiongemma-infra-v8_q8_ekv1024.litertlm",
    tools: infrastructureTools,  // The 7 tools from this repo
  },
  embed: {
    modelPath: "./models/embedding_gemma.tflite",
    tokenizerPath: "./models/tokenizer.model",
  },
});

// Define Ax signature for error diagnosis (MiPRO-optimizable)
const diagnoseError = new AxGen(`
  errorMessage:string "The error log line",
  errorCluster:string? "Similar errors seen recently"
  ->
  diagnosis:string "Root cause analysis",
  toolName:string "Which infrastructure tool to call",
  confidence:class "high, medium, low"
`);

// Run inference on-device
const result = await diagnoseError.forward(provider, {
  errorMessage: "CORS error from http://localhost:3000",
  errorCluster: "3 similar CORS errors in last 5 minutes",
});

console.log(result);
// { diagnosis: "Frontend origin not in allowed list", 
//   toolName: "enableCors", 
//   confidence: "high" }

Example: Hosting Next.js with Fully Loaded Serving

// server.ts - Next.js with intelligent error remediation
import { createApp, FunctionGemmaEngine, EmbeddingEngine } from "dad-express";
import { spawn } from "child_process";

// Infrastructure tools (exact definitions for 100% accuracy)
const tools = [
  { type: "function", function: { name: "enableCors", description: "Enable CORS for a specific origin to fix blocked cross-origin requests.", parameters: { type: "object", properties: { origin: { type: "string", description: "The origin to allow" } }, required: ["origin"] } } },
  { type: "function", function: { name: "updateConnectionUrl", description: "Update a service connection URL to fix ECONNREFUSED errors.", parameters: { type: "object", properties: { service: { type: "string" }, hostname: { type: "string" }, port: { type: "integer" } }, required: ["service", "hostname", "port"] } } },
  { type: "function", function: { name: "setEnvVar", description: "Set an environment variable to fix missing configuration errors.", parameters: { type: "object", properties: { name: { type: "string" }, value: { type: "string" } }, required: ["name", "value"] } } },
  { type: "function", function: { name: "addHostMapping", description: "Add a hostname to IP mapping to fix DNS resolution errors.", parameters: { type: "object", properties: { hostname: { type: "string" }, ip: { type: "string" } }, required: ["hostname", "ip"] } } },
  { type: "function", function: { name: "increaseMemory", description: "Increase memory limit for a service to fix OOMKilled errors.", parameters: { type: "object", properties: { service: { type: "string" }, memoryMb: { type: "integer" } }, required: ["service", "memoryMb"] } } },
  { type: "function", function: { name: "increaseTimeout", description: "Increase timeout value to fix 504 Gateway Timeout errors.", parameters: { type: "object", properties: { service: { type: "string" }, timeoutMs: { type: "integer" } }, required: ["service", "timeoutMs"] } } },
  { type: "function", function: { name: "restartService", description: "Restart a service to apply changes or fix stuck processes.", parameters: { type: "object", properties: { service: { type: "string" } }, required: ["service"] } } },
];

// Initialize on-device models
const embedEngine = new EmbeddingEngine({
  modelPath: "./models/embedding_gemma.tflite",
  tokenizerPath: "./models/tokenizer.model",
});

const functionGemma = new FunctionGemmaEngine({
  modelPath: "./models/functiongemma-infra-v8_q8_ekv1024.litertlm",
  tools: JSON.stringify(tools),
});

// Error clustering state
const errorClusters = new Map<string, { embedding: Float32Array; count: number; lastSeen: Date }>();

async function classifyAndCluster(logLine: string): Promise<string | null> {
  // Skip non-error lines
  if (!logLine.match(/error|fail|exception|timeout|refused|denied/i)) {
    return null;
  }

  // Generate embedding (~5ms on CPU)
  const embedding = await embedEngine.encodeAsync(logLine);

  // Find similar errors via cosine similarity
  let bestMatch: string | null = null;
  let bestSimilarity = 0.85; // Threshold for clustering

  for (const [clusterId, cluster] of errorClusters) {
    const similarity = EmbeddingEngine.cosineSimilarity(embedding, cluster.embedding);
    if (similarity > bestSimilarity) {
      bestSimilarity = similarity;
      bestMatch = clusterId;
    }
  }

  if (bestMatch) {
    // Update existing cluster
    const cluster = errorClusters.get(bestMatch)!;
    cluster.count++;
    cluster.lastSeen = new Date();
    return bestMatch;
  }

  // Create new cluster
  const clusterId = `cluster_${Date.now()}`;
  errorClusters.set(clusterId, { embedding, count: 1, lastSeen: new Date() });
  return clusterId;
}

async function diagnoseAndFix(errorLog: string, clusterId: string): Promise<void> {
  const cluster = errorClusters.get(clusterId);
  
  // Call FunctionGemma for diagnosis (~50ms)
  const result = await functionGemma.sendMessage(errorLog);
  
  if (result.functionCalls && result.functionCalls.length > 0) {
    const call = result.functionCalls[0];
    console.log(`[AutoFix] Detected ${cluster?.count || 1}x: ${call.name}`);
    console.log(`[AutoFix] Args: ${JSON.stringify(call.arguments)}`);
    
    // Execute remediation (in production, this would call actual infrastructure APIs)
    switch (call.name) {
      case "enableCors":
        console.log(`[AutoFix] Would enable CORS for: ${call.arguments.origin}`);
        break;
      case "restartService":
        console.log(`[AutoFix] Would restart: ${call.arguments.service}`);
        break;
      case "increaseMemory":
        console.log(`[AutoFix] Would increase memory for ${call.arguments.service} to ${call.arguments.memoryMb}MB`);
        break;
      // ... handle other tools
    }
  }
}

// Create dad-express app
const app = createApp();

// API routes
app.get("/health", () => ({ status: "ok", models: { embed: true, functionGemma: true } }));

app.get("/clusters", () => {
  const clusters = [];
  for (const [id, cluster] of errorClusters) {
    clusters.push({ id, count: cluster.count, lastSeen: cluster.lastSeen });
  }
  return clusters;
});

// Start Next.js as child process with log monitoring
const nextProcess = spawn("npx", ["next", "start"], {
  stdio: ["inherit", "pipe", "pipe"],
  env: { ...process.env, PORT: "3001" },
});

// Stream stdout
nextProcess.stdout.on("data", (data) => {
  const line = data.toString().trim();
  console.log(`[next] ${line}`);
});

// Stream stderr with intelligent processing
nextProcess.stderr.on("data", async (data) => {
  const line = data.toString().trim();
  console.log(`[next:err] ${line}`);
  
  // Classify and cluster error
  const clusterId = await classifyAndCluster(line);
  
  if (clusterId) {
    // Diagnose and auto-fix
    await diagnoseAndFix(line, clusterId);
  }
});

// Start dad-express on separate port for monitoring
app.listen(4000, () => {
  console.log("dad-express monitoring on http://localhost:4000");
  console.log("Next.js app on http://localhost:3001");
});
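The clustering code above assumes an EmbeddingEngine.cosineSimilarity static helper. If your version of dad-express does not expose one, the plain implementation is only a few lines:

```javascript
// Cosine similarity over two equal-length vectors: dot(a, b) / (|a| * |b|).
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom; // guard against zero vectors
}

cosineSimilarity(new Float32Array([1, 0]), new Float32Array([1, 0])); // → 1
cosineSimilarity(new Float32Array([1, 0]), new Float32Array([0, 1])); // → 0
```

Note that the 0.85 clustering threshold used above is in cosine space, so it assumes embeddings are compared without any prior normalization step; normalizing the vectors first would let you substitute a plain dot product.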

Key Benefits

Feature              Latency     Memory  Cloud Calls
EmbeddingGemma       ~5ms/embed  ~50MB   0
FunctionGemma        ~50ms/call  ~271MB  0
Semantic clustering  <1ms        Varies  0
Total pipeline       ~60ms       ~350MB  0

  • Zero cloud dependency: All inference runs locally via LiteRT-LM
  • Sub-100ms latency: Fast enough for real-time log processing
  • Privacy-preserving: Error logs never leave the device
  • Continuous improvement: Use Ax MiPRO to optimize prompts over time

Limitations

  • Optimized for the 7 specific infrastructure tools listed above
  • Requires exact tool definitions for best accuracy
  • May not generalize well to error patterns not seen in training

License

This model inherits the Gemma license from the base model.
