---
license: gemma
language:
- en
base_model:
- google/functiongemma-270m-it
pipeline_tag: text-generation
tags:
- function-calling
- infrastructure
- devops
- litertlm
---

# FunctionGemma Infrastructure Tools v8

A fine-tuned [FunctionGemma 270M](https://huggingface.co/google/functiongemma-270m-it) model for infrastructure error diagnosis and remediation. It achieves **100% accuracy** across its 7 infrastructure tools when given the exact tool definitions listed under [Required Tool Definitions](#required-tool-definitions).

## Model Details

- **Base Model**: google/functiongemma-270m-it
- **Format**: LiteRT-LM (.litertlm) - optimized for on-device inference
- **Quantization**: INT8 (Q8)
- **Size**: ~271MB
- **Training**: 50 epochs on 10,500 examples (1,500 per tool)

## Supported Tools

| Tool | Description | Use Case |
|------|-------------|----------|
| `enableCors` | Enable CORS for a specific origin | CORS policy errors, blocked cross-origin requests |
| `updateConnectionUrl` | Update service connection URL | ECONNREFUSED errors, localhost connection issues in containers |
| `setEnvVar` | Set environment variable | Missing configuration, undefined env vars |
| `addHostMapping` | Add hostname to IP mapping | DNS resolution (ENOTFOUND) errors |
| `increaseMemory` | Increase memory limit | OOMKilled errors, out of memory crashes |
| `increaseTimeout` | Increase timeout value | 504 Gateway Timeout, connection timeout errors |
| `restartService` | Restart a service | Stuck processes, stale data after deployment |

## Usage with LiteRT-LM

### Download the Model

```bash
# Using huggingface-cli
huggingface-cli download macmacmacmac/functiongemma-nextjs functiongemma-infra-v8_q8_ekv1024.litertlm

# Or using Python (run from the shell via a heredoc)
python - <<'EOF'
from huggingface_hub import hf_hub_download

model_path = hf_hub_download(
    repo_id="macmacmacmac/functiongemma-nextjs",
    filename="functiongemma-infra-v8_q8_ekv1024.litertlm",
)
print(model_path)  # local path of the downloaded file
EOF
```

### Required Tool Definitions

**Important**: You must use these exact tool definitions for optimal accuracy. The model was trained with these specific descriptions.

```javascript
const tools = [
  {
    type: "function",
    function: {
      name: "enableCors",
      description: "Enable CORS for a specific origin to fix blocked cross-origin requests.",
      parameters: {
        type: "object",
        properties: {
          origin: { type: "string", description: "The origin to allow (e.g., http://localhost:3000)" },
          methods: { type: "string", description: "Allowed HTTP methods (e.g., GET,POST,PUT,DELETE)" }
        },
        required: ["origin"]
      }
    }
  },
  {
    type: "function",
    function: {
      name: "updateConnectionUrl",
      description: "Update a service connection URL to fix ECONNREFUSED errors, typically changing localhost to the correct service hostname.",
      parameters: {
        type: "object",
        properties: {
          service: { type: "string", description: "The service to update (e.g., database, redis, api)" },
          hostname: { type: "string", description: "The correct hostname to connect to" },
          port: { type: "integer", description: "The port number to connect to" }
        },
        required: ["service", "hostname", "port"]
      }
    }
  },
  {
    type: "function",
    function: {
      name: "setEnvVar",
      description: "Set an environment variable to fix missing configuration errors.",
      parameters: {
        type: "object",
        properties: {
          name: { type: "string", description: "Environment variable name (e.g., DATABASE_URL, API_KEY)" },
          value: { type: "string", description: "The value to set" }
        },
        required: ["name", "value"]
      }
    }
  },
  {
    type: "function",
    function: {
      name: "addHostMapping",
      description: "Add a hostname to IP mapping to fix DNS resolution (ENOTFOUND) errors.",
      parameters: {
        type: "object",
        properties: {
          hostname: { type: "string", description: "The hostname to map" },
          ip: { type: "string", description: "The IP address to map to" }
        },
        required: ["hostname", "ip"]
      }
    }
  },
  {
    type: "function",
    function: {
      name: "increaseMemory",
      description: "Increase memory limit for a service to fix OOMKilled errors.",
      parameters: {
        type: "object",
        properties: {
          service: { type: "string", description: "The service/container/pod name" },
          memoryMb: { type: "integer", description: "Memory limit in megabytes" }
        },
        required: ["service", "memoryMb"]
      }
    }
  },
  {
    type: "function",
    function: {
      name: "increaseTimeout",
      description: "Increase timeout value to fix 504 Gateway Timeout or connection timeout errors.",
      parameters: {
        type: "object",
        properties: {
          service: { type: "string", description: "The service to configure" },
          timeoutMs: { type: "integer", description: "Timeout value in milliseconds" }
        },
        required: ["service", "timeoutMs"]
      }
    }
  },
  {
    type: "function",
    function: {
      name: "restartService",
      description: "Restart a service to apply configuration changes or fix a stuck process.",
      parameters: {
        type: "object",
        properties: {
          service: { type: "string", description: "The service/container/pod name to restart" }
        },
        required: ["service"]
      }
    }
  }
];
```
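
Since the model's arguments are generated text, it can help to check each returned call against these definitions before acting on it. The following is a minimal sketch rather than part of dad-express: `validateToolCall` is a hypothetical helper, and the call shape `{ name, arguments }` is an assumption about what the engine returns.

```javascript
// Hypothetical helper: checks a tool call against definitions shaped like `tools` above.
// Returns a list of problems; an empty list means the call looks well-formed.
function validateToolCall(toolDefs, call) {
  const def = toolDefs.find((t) => t.function.name === call.name);
  if (!def) return [`unknown tool: ${call.name}`];

  const { properties, required = [] } = def.function.parameters;
  const args = call.arguments || {};
  const problems = [];

  // Every required parameter must be present
  for (const key of required) {
    if (!(key in args)) problems.push(`missing required argument: ${key}`);
  }

  // Every supplied argument must be declared and have the declared type
  for (const [key, value] of Object.entries(args)) {
    const spec = properties[key];
    if (!spec) {
      problems.push(`unexpected argument: ${key}`);
    } else if (spec.type === "integer" && !Number.isInteger(value)) {
      problems.push(`argument ${key} should be an integer`);
    } else if (spec.type === "string" && typeof value !== "string") {
      problems.push(`argument ${key} should be a string`);
    }
  }
  return problems;
}

// Self-contained demo with one definition copied from the list above
const demoTools = [
  {
    type: "function",
    function: {
      name: "increaseMemory",
      description: "Increase memory limit for a service to fix OOMKilled errors.",
      parameters: {
        type: "object",
        properties: {
          service: { type: "string", description: "The service/container/pod name" },
          memoryMb: { type: "integer", description: "Memory limit in megabytes" }
        },
        required: ["service", "memoryMb"]
      }
    }
  }
];

const okProblems = validateToolCall(demoTools, {
  name: "increaseMemory",
  arguments: { service: "api", memoryMb: 1024 }
});
const badProblems = validateToolCall(demoTools, {
  name: "increaseMemory",
  arguments: { service: "api", memoryMb: "1024" }
});
console.log(okProblems);  // []
console.log(badProblems); // [ 'argument memoryMb should be an integer' ]
```

Rejecting a malformed call is usually safer than coercing it, because a remediation step that runs with the wrong memory limit or port can make an outage worse.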

### Example Usage with dad-express

```javascript
const { FunctionGemmaEngine } = require('dad-express');

const engine = new FunctionGemmaEngine({
  modelPath: './functiongemma-infra-v8_q8_ekv1024.litertlm',
  tools: JSON.stringify(tools)
});

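// CommonJS modules do not support top-level await, so wrap the call
async function main() {
  // Diagnose an error
  const result = await engine.call('Container api was OOMKilled - out of memory');
  console.log(result.tool_calls[0].function);
  // { name: 'increaseMemory', arguments: { service: 'api', memoryMb: 1024 } }
}

main().catch(console.error);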
```
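
The engine only returns a structured call; acting on it is left to the application. One way to wire that up is a lookup table from tool name to handler. This is a sketch under assumptions: the handlers are hypothetical stubs that only describe what they would do, and the call shape follows the output shown above.

```javascript
// Hypothetical handlers: each returns a description instead of touching real infrastructure
const handlers = {
  enableCors: ({ origin }) => `enable CORS for ${origin}`,
  updateConnectionUrl: ({ service, hostname, port }) => `point ${service} at ${hostname}:${port}`,
  setEnvVar: ({ name }) => `set ${name}`,
  addHostMapping: ({ hostname, ip }) => `map ${hostname} to ${ip}`,
  increaseMemory: ({ service, memoryMb }) => `raise ${service} memory limit to ${memoryMb}MB`,
  increaseTimeout: ({ service, timeoutMs }) => `raise ${service} timeout to ${timeoutMs}ms`,
  restartService: ({ service }) => `restart ${service}`
};

// Look up the handler for a call; unknown tool names are rejected rather than guessed at
function dispatch(call) {
  // hasOwnProperty keeps inherited keys such as "toString" from being treated as tools
  if (!Object.prototype.hasOwnProperty.call(handlers, call.name)) {
    throw new Error(`No handler registered for tool "${call.name}"`);
  }
  return handlers[call.name](call.arguments);
}

const plan = dispatch({ name: "increaseMemory", arguments: { service: "api", memoryMb: 1024 } });
console.log(plan); // raise api memory limit to 1024MB
```

A table keeps the set of executable tools explicit, so a hallucinated tool name fails loudly instead of falling through silently.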

## Training Data

The model was trained on 10,500 synthetic examples covering common infrastructure errors:

| Error Pattern | Tool | Examples |
|--------------|------|----------|
| CORS policy errors | enableCors | 1,500 |
| ECONNREFUSED errors | updateConnectionUrl | 1,500 |
| Missing env vars | setEnvVar | 1,500 |
| DNS/ENOTFOUND errors | addHostMapping | 1,500 |
| OOMKilled errors | increaseMemory | 1,500 |
| Timeout errors | increaseTimeout | 1,500 |
| Stuck services | restartService | 1,500 |

### Sample Training Examples

```
"CORS error: No 'Access-Control-Allow-Origin' header from http://localhost:3000" → enableCors
"Error: connect ECONNREFUSED 127.0.0.1:5432 - database connection failed" → updateConnectionUrl
"Missing required environment variable: DATABASE_URL" → setEnvVar
"getaddrinfo ENOTFOUND db" → addHostMapping
"Container api was OOMKilled" → increaseMemory
"504 Gateway Timeout from backend" → increaseTimeout
"nginx container is not responding" → restartService
```
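
Pairs like these double as a small regression set. The sketch below shows one way tool-selection accuracy could be computed over labeled examples; `toolAccuracy` is a hypothetical helper, and `keywordPredictor` is a keyword-based stand-in used only so the example runs without the model. It does not describe how the model itself works or how it was evaluated.

```javascript
// Labeled pairs copied from the sample training examples above
const labeled = [
  ["CORS error: No 'Access-Control-Allow-Origin' header from http://localhost:3000", "enableCors"],
  ["Error: connect ECONNREFUSED 127.0.0.1:5432 - database connection failed", "updateConnectionUrl"],
  ["Missing required environment variable: DATABASE_URL", "setEnvVar"],
  ["getaddrinfo ENOTFOUND db", "addHostMapping"],
  ["Container api was OOMKilled", "increaseMemory"],
  ["504 Gateway Timeout from backend", "increaseTimeout"],
  ["nginx container is not responding", "restartService"]
];

// Fraction of examples for which predict(message) returns the expected tool name
function toolAccuracy(examples, predict) {
  let correct = 0;
  for (const [message, expected] of examples) {
    if (predict(message) === expected) correct++;
  }
  return correct / examples.length;
}

// Stand-in predictor (keyword rules); in practice this would wrap the model call
function keywordPredictor(message) {
  if (/cors/i.test(message)) return "enableCors";
  if (/ECONNREFUSED/.test(message)) return "updateConnectionUrl";
  if (/environment variable/i.test(message)) return "setEnvVar";
  if (/ENOTFOUND/.test(message)) return "addHostMapping";
  if (/OOMKilled/.test(message)) return "increaseMemory";
  if (/timeout/i.test(message)) return "increaseTimeout";
  return "restartService";
}

const accuracy = toolAccuracy(labeled, keywordPredictor);
console.log(accuracy); // 1
```

Swapping `keywordPredictor` for a function that calls the model and returns the name of the first tool call gives a quick check after changing tool definitions or prompts.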



## Fully Loaded Serving

**Fully Loaded Serving** is an end-to-end intelligent error remediation pipeline that runs entirely on-device. It combines:

1. **Low-latency vector embeddings** (EmbeddingGemma) for streaming log classification
2. **Semantic clustering** to group similar errors/issues/outliers  
3. **Function calling** (FunctionGemma) to automatically diagnose and fix infrastructure issues
4. **Prompt optimization** via [Ax](https://github.com/ax-llm/ax) with MiPRO for continuous improvement

### Architecture

```
┌─────────────────────────────────────────────────────────────────────────┐
│                         Next.js Application                             │
├─────────────────────────────────────────────────────────────────────────┤
│  stdout/stderr ──▶ Log Stream ──▶ dad-express middleware                │
│                                          │                              │
│                    ┌─────────────────────┼──────────────────────┐       │
│                    │                     ▼                      │       │
│                    │  ┌──────────────────────────────────┐      │       │
│                    │  │      EmbeddingGemma (~5ms)       │      │       │
│                    │  │   768-dim vector per log line    │      │       │
│                    │  └──────────────┬───────────────────┘      │       │
│                    │                 │                          │       │
│                    │                 ▼                          │       │
│                    │  ┌──────────────────────────────────┐      │       │
│                    │  │   Semantic Clustering (cosine)   │      │       │
│                    │  │  • Group similar errors          │      │       │
│                    │  │  • Detect outliers               │      │       │
│                    │  │  • Identify recurring patterns   │      │       │
│                    │  └──────────────┬───────────────────┘      │       │
│                    │                 │                          │       │
│                    │                 ▼                          │       │
│                    │  ┌──────────────────────────────────┐      │       │
│                    │  │   FunctionGemma (~50ms/call)     │      │       │
│                    │  │  → enableCors, setEnvVar, etc.   │      │       │
│                    │  └──────────────┬───────────────────┘      │       │
│                    │                 │                          │       │
│                    │                 ▼                          │       │
│                    │  ┌──────────────────────────────────┐      │       │
│                    │  │      Auto-Remediation Layer      │      │       │
│                    │  │  Execute fix or notify developer │      │       │
│                    │  └──────────────────────────────────┘      │       │
│                    │                                            │       │
│                    │     LiteRT-LM (on-device, ~300MB RAM)      │       │
│                    └────────────────────────────────────────────┘       │
└─────────────────────────────────────────────────────────────────────────┘
```
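
The clustering stage in the diagram comes down to comparing each new embedding with one representative vector per cluster. The sketch below shows that comparison in plain JavaScript. `cosineSimilarity` and `nearestCluster` are illustrative helpers written for this example, not the dad-express API, and the toy 3-dimensional vectors stand in for 768-dimensional EmbeddingGemma output. The default threshold of 0.85 is the value used in the server example in this document.

```javascript
// Cosine similarity: dot(a, b) / (|a| * |b|)
function cosineSimilarity(a, b) {
  if (a.length !== b.length) throw new Error("vectors must have the same length");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // zero vector: treat similarity as 0
  return dot / Math.sqrt(normA * normB);
}

// Return the id of the most similar cluster above the threshold, or null if none qualifies
function nearestCluster(embedding, clusters, threshold = 0.85) {
  let bestId = null;
  let bestScore = threshold;
  for (const [id, cluster] of clusters) {
    const score = cosineSimilarity(embedding, cluster.embedding);
    if (score > bestScore) {
      bestScore = score;
      bestId = id;
    }
  }
  return bestId;
}

// Toy cluster representatives
const toyClusters = new Map([
  ["cors", { embedding: Float32Array.from([1, 0, 0]) }],
  ["oom", { embedding: Float32Array.from([0, 1, 0]) }]
]);

console.log(nearestCluster(Float32Array.from([0.9, 0.1, 0]), toyClusters));   // cors
console.log(nearestCluster(Float32Array.from([0.5, 0.5, 0.7]), toyClusters)); // null
```

A `null` result corresponds to an outlier in the diagram: the log line is unlike anything seen so far and would start a new cluster.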

### Ax Integration with MiPRO

[Ax](https://github.com/ax-llm/ax) is a TypeScript DSPy-style framework for declarative AI programming. dad-express provides `AxLiteRTProvider` to run Ax signatures entirely on-device:

```typescript
import { AxGen } from "@ax-llm/ax";
import { AxLiteRTProvider } from "dad-express";

// Create on-device provider with both embedding and chat models
const provider = new AxLiteRTProvider({
  chat: {
    modelPath: "./models/functiongemma-infra-v8_q8_ekv1024.litertlm",
    tools: infrastructureTools,  // The 7 tool definitions from "Required Tool Definitions"
  },
  embed: {
    modelPath: "./models/embedding_gemma.tflite",
    tokenizerPath: "./models/tokenizer.model",
  },
});

// Define Ax signature for error diagnosis (MiPRO-optimizable)
const diagnoseError = new AxGen(`
  errorMessage:string "The error log line",
  errorCluster?:string "Similar errors seen recently"
  ->
  diagnosis:string "Root cause analysis",
  toolName:string "Which infrastructure tool to call",
  confidence:class "high, medium, low"
`);

// Run inference on-device
const result = await diagnoseError.forward(provider, {
  errorMessage: "CORS error from http://localhost:3000",
  errorCluster: "3 similar CORS errors in last 5 minutes",
});

console.log(result);
// { diagnosis: "Frontend origin not in allowed list", 
//   toolName: "enableCors", 
//   confidence: "high" }
```

### Example: Hosting Next.js with Fully Loaded Serving

```typescript
// server.ts - Next.js with intelligent error remediation
import { createApp, FunctionGemmaEngine, EmbeddingEngine } from "dad-express";
import { spawn } from "child_process";

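// Infrastructure tools, abbreviated here for space. For best accuracy, use the
// full definitions from "Required Tool Definitions", which the model was trained on.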
const tools = [
  { type: "function", function: { name: "enableCors", description: "Enable CORS for a specific origin to fix blocked cross-origin requests.", parameters: { type: "object", properties: { origin: { type: "string", description: "The origin to allow" } }, required: ["origin"] } } },
  { type: "function", function: { name: "updateConnectionUrl", description: "Update a service connection URL to fix ECONNREFUSED errors.", parameters: { type: "object", properties: { service: { type: "string" }, hostname: { type: "string" }, port: { type: "integer" } }, required: ["service", "hostname", "port"] } } },
  { type: "function", function: { name: "setEnvVar", description: "Set an environment variable to fix missing configuration errors.", parameters: { type: "object", properties: { name: { type: "string" }, value: { type: "string" } }, required: ["name", "value"] } } },
  { type: "function", function: { name: "addHostMapping", description: "Add a hostname to IP mapping to fix DNS resolution errors.", parameters: { type: "object", properties: { hostname: { type: "string" }, ip: { type: "string" } }, required: ["hostname", "ip"] } } },
  { type: "function", function: { name: "increaseMemory", description: "Increase memory limit for a service to fix OOMKilled errors.", parameters: { type: "object", properties: { service: { type: "string" }, memoryMb: { type: "integer" } }, required: ["service", "memoryMb"] } } },
  { type: "function", function: { name: "increaseTimeout", description: "Increase timeout value to fix 504 Gateway Timeout errors.", parameters: { type: "object", properties: { service: { type: "string" }, timeoutMs: { type: "integer" } }, required: ["service", "timeoutMs"] } } },
  { type: "function", function: { name: "restartService", description: "Restart a service to apply changes or fix stuck processes.", parameters: { type: "object", properties: { service: { type: "string" } }, required: ["service"] } } },
];

// Initialize on-device models
const embedEngine = new EmbeddingEngine({
  modelPath: "./models/embedding_gemma.tflite",
  tokenizerPath: "./models/tokenizer.model",
});

const functionGemma = new FunctionGemmaEngine({
  modelPath: "./models/functiongemma-infra-v8_q8_ekv1024.litertlm",
  tools: JSON.stringify(tools),
});

// Error clustering state
const errorClusters = new Map<string, { embedding: Float32Array; count: number; lastSeen: Date }>();

async function classifyAndCluster(logLine: string): Promise<string | null> {
  // Skip non-error lines
  if (!logLine.match(/error|fail|exception|timeout|refused|denied/i)) {
    return null;
  }

  // Generate embedding (~5ms on CPU)
  const embedding = await embedEngine.encodeAsync(logLine);

  // Find similar errors via cosine similarity
  let bestMatch: string | null = null;
  let bestSimilarity = 0.85; // Threshold for clustering

  for (const [clusterId, cluster] of errorClusters) {
    const similarity = EmbeddingEngine.cosineSimilarity(embedding, cluster.embedding);
    if (similarity > bestSimilarity) {
      bestSimilarity = similarity;
      bestMatch = clusterId;
    }
  }

  if (bestMatch) {
    // Update existing cluster
    const cluster = errorClusters.get(bestMatch)!;
    cluster.count++;
    cluster.lastSeen = new Date();
    return bestMatch;
  }

  // Create new cluster
  const clusterId = `cluster_${Date.now()}`;
  errorClusters.set(clusterId, { embedding, count: 1, lastSeen: new Date() });
  return clusterId;
}

async function diagnoseAndFix(errorLog: string, clusterId: string): Promise<void> {
  const cluster = errorClusters.get(clusterId);
  
  // Call FunctionGemma for diagnosis (~50ms)
  const result = await functionGemma.sendMessage(errorLog);
  
  if (result.functionCalls && result.functionCalls.length > 0) {
    const call = result.functionCalls[0];
    console.log(`[AutoFix] Detected ${cluster?.count || 1}x: ${call.name}`);
    console.log(`[AutoFix] Args: ${JSON.stringify(call.arguments)}`);
    
    // Execute remediation (in production, this would call actual infrastructure APIs)
    switch (call.name) {
      case "enableCors":
        console.log(`[AutoFix] Would enable CORS for: ${call.arguments.origin}`);
        break;
      case "restartService":
        console.log(`[AutoFix] Would restart: ${call.arguments.service}`);
        break;
      case "increaseMemory":
        console.log(`[AutoFix] Would increase memory for ${call.arguments.service} to ${call.arguments.memoryMb}MB`);
        break;
      // ... handle other tools
    }
  }
}

// Create dad-express app
const app = createApp();

// API routes
app.get("/health", () => ({ status: "ok", models: { embed: true, functionGemma: true } }));

app.get("/clusters", () => {
  const clusters = [];
  for (const [id, cluster] of errorClusters) {
    clusters.push({ id, count: cluster.count, lastSeen: cluster.lastSeen });
  }
  return clusters;
});

// Start Next.js as child process with log monitoring
const nextProcess = spawn("npx", ["next", "start"], {
  stdio: ["inherit", "pipe", "pipe"],
  env: { ...process.env, PORT: "3001" },
});

// Stream stdout
nextProcess.stdout.on("data", (data) => {
  const line = data.toString().trim();
  console.log(`[next] ${line}`);
});

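// Stream stderr with intelligent processing
nextProcess.stderr.on("data", async (data) => {
  // A chunk may contain several log lines; handle each one separately
  const lines = data.toString().split("\n").map((l: string) => l.trim()).filter(Boolean);

  for (const line of lines) {
    console.log(`[next:err] ${line}`);

    try {
      // Classify and cluster error
      const clusterId = await classifyAndCluster(line);

      if (clusterId) {
        // Diagnose and auto-fix
        await diagnoseAndFix(line, clusterId);
      }
    } catch (err) {
      // A failed diagnosis must not surface as an unhandled rejection
      console.error("[AutoFix] Failed to process log line:", err);
    }
  }
});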

// Start dad-express on separate port for monitoring
app.listen(4000, () => {
  console.log("dad-express monitoring on http://localhost:4000");
  console.log("Next.js app on http://localhost:3001");
});
```
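
A `data` event on a child process stream delivers arbitrary chunks, so a single chunk can hold several log lines or stop halfway through one. Buffering until a newline arrives keeps the classifier from seeing fragments. The following is a sketch; `createLineSplitter` is a hypothetical helper, not something dad-express provides.

```javascript
// Hypothetical helper: buffers stream chunks and calls onLine once per complete line
function createLineSplitter(onLine) {
  let buffer = "";
  return {
    // Feed a chunk (Buffer or string) as received from a "data" event
    push(chunk) {
      buffer += chunk.toString();
      const parts = buffer.split(/\r?\n/);
      buffer = parts.pop(); // last element is an incomplete line (or "")
      for (const line of parts) {
        if (line.trim() !== "") onLine(line.trim());
      }
    },
    // Call when the stream ends to emit trailing text that had no newline
    flush() {
      if (buffer.trim() !== "") onLine(buffer.trim());
      buffer = "";
    }
  };
}

const seen = [];
const splitter = createLineSplitter((line) => seen.push(line));

// The first log line is split across two chunks
splitter.push("Error: connect ECONN");
splitter.push("REFUSED 127.0.0.1:5432\n504 Gateway Timeout from backend\nContainer api");
splitter.flush();

console.log(seen);
// [
//   'Error: connect ECONNREFUSED 127.0.0.1:5432',
//   '504 Gateway Timeout from backend',
//   'Container api'
// ]
```

In the server above, the stderr handler could feed chunks through `splitter.push(chunk)` and run classification inside the `onLine` callback.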

### Key Benefits

| Component | Latency | Memory | Cloud Calls |
|---------|---------|--------|-------------|
| EmbeddingGemma | ~5ms/embed | ~50MB | 0 |
| FunctionGemma | ~50ms/call | ~271MB | 0 |
| Semantic clustering | <1ms | Varies | 0 |
| **Total pipeline** | **~60ms** | **~350MB** | **0** |

- **Zero cloud dependency**: All inference runs locally via LiteRT-LM
- **Sub-100ms latency**: Fast enough for real-time log processing
- **Privacy-preserving**: Error logs never leave the device
- **Continuous improvement**: Use Ax MiPRO to optimize prompts over time

## Limitations

- Optimized for the 7 specific infrastructure tools listed above
- Requires exact tool definitions for best accuracy
- May not generalize well to error patterns not seen in training

## License

This model inherits the [Gemma license](https://ai.google.dev/gemma/terms) from the base model.