---
license: gemma
language:
- en
base_model:
- google/functiongemma-270m-it
pipeline_tag: text-generation
tags:
- function-calling
- infrastructure
- devops
- litertlm
---

# FunctionGemma Infrastructure Tools v8

A fine-tuned [FunctionGemma 270M](https://huggingface.co/google/functiongemma-270m-it) model for infrastructure error diagnosis and remediation. It achieves **100% accuracy** across the 7 supported infrastructure tools when given the exact tool definitions listed below.

## Model Details

- **Base Model**: google/functiongemma-270m-it
- **Format**: LiteRT-LM (.litertlm) - optimized for on-device inference
- **Quantization**: INT8 (Q8)
- **Size**: ~271MB
- **Training**: 50 epochs on 10,500 examples (1,500 per tool)

## Supported Tools

| Tool | Description | Use Case |
|------|-------------|----------|
| `enableCors` | Enable CORS for a specific origin | CORS policy errors, blocked cross-origin requests |
| `updateConnectionUrl` | Update service connection URL | ECONNREFUSED errors, localhost connection issues in containers |
| `setEnvVar` | Set environment variable | Missing configuration, undefined env vars |
| `addHostMapping` | Add hostname to IP mapping | DNS resolution (ENOTFOUND) errors |
| `increaseMemory` | Increase memory limit | OOMKilled errors, out of memory crashes |
| `increaseTimeout` | Increase timeout value | 504 Gateway Timeout, connection timeout errors |
| `restartService` | Restart a service | Stuck processes, stale data after deployment |

## Usage with LiteRT-LM

### Download the Model

```bash
# Using huggingface-cli
huggingface-cli download macmacmacmac/functiongemma-nextjs functiongemma-infra-v8_q8_ekv1024.litertlm
```

```python
# Or using Python
from huggingface_hub import hf_hub_download

model_path = hf_hub_download(
    repo_id="macmacmacmac/functiongemma-nextjs",
    filename="functiongemma-infra-v8_q8_ekv1024.litertlm",
)
```

### Required Tool Definitions

**Important**: You must use these exact tool definitions for optimal accuracy. The model was trained with these specific descriptions.

```javascript
const tools = [
  {
    type: "function",
    function: {
      name: "enableCors",
      description: "Enable CORS for a specific origin to fix blocked cross-origin requests.",
      parameters: {
        type: "object",
        properties: {
          origin: { type: "string", description: "The origin to allow (e.g., http://localhost:3000)" },
          methods: { type: "string", description: "Allowed HTTP methods (e.g., GET,POST,PUT,DELETE)" }
        },
        required: ["origin"]
      }
    }
  },
  {
    type: "function",
    function: {
      name: "updateConnectionUrl",
      description: "Update a service connection URL to fix ECONNREFUSED errors, typically changing localhost to the correct service hostname.",
      parameters: {
        type: "object",
        properties: {
          service: { type: "string", description: "The service to update (e.g., database, redis, api)" },
          hostname: { type: "string", description: "The correct hostname to connect to" },
          port: { type: "integer", description: "The port number to connect to" }
        },
        required: ["service", "hostname", "port"]
      }
    }
  },
  {
    type: "function",
    function: {
      name: "setEnvVar",
      description: "Set an environment variable to fix missing configuration errors.",
      parameters: {
        type: "object",
        properties: {
          name: { type: "string", description: "Environment variable name (e.g., DATABASE_URL, API_KEY)" },
          value: { type: "string", description: "The value to set" }
        },
        required: ["name", "value"]
      }
    }
  },
  {
    type: "function",
    function: {
      name: "addHostMapping",
      description: "Add a hostname to IP mapping to fix DNS resolution (ENOTFOUND) errors.",
      parameters: {
        type: "object",
        properties: {
          hostname: { type: "string", description: "The hostname to map" },
          ip: { type: "string", description: "The IP address to map to" }
        },
        required: ["hostname", "ip"]
      }
    }
  },
  {
    type: "function",
    function: {
      name: "increaseMemory",
      description: "Increase memory limit for a service to fix OOMKilled errors.",
      parameters: {
        type: "object",
        properties: {
          service: { type: "string", description: "The service/container/pod name" },
          memoryMb: { type: "integer", description: "Memory limit in megabytes" }
        },
        required: ["service", "memoryMb"]
      }
    }
  },
  {
    type: "function",
    function: {
      name: "increaseTimeout",
      description: "Increase timeout value to fix 504 Gateway Timeout or connection timeout errors.",
      parameters: {
        type: "object",
        properties: {
          service: { type: "string", description: "The service to configure" },
          timeoutMs: { type: "integer", description: "Timeout value in milliseconds" }
        },
        required: ["service", "timeoutMs"]
      }
    }
  },
  {
    type: "function",
    function: {
      name: "restartService",
      description: "Restart a service to apply configuration changes or fix a stuck process.",
      parameters: {
        type: "object",
        properties: {
          service: { type: "string", description: "The service/container/pod name to restart" }
        },
        required: ["service"]
      }
    }
  }
];
```
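Because the model depends on these definitions, it can help to check each call it produces against the same schema before acting on it. The following is a minimal sketch of such a check, not part of dad-express: `validateToolCall` and `toolDefs` are hypothetical names, only two of the seven definitions are repeated to keep it short, and the call is assumed to have the shape `{ name, arguments }`.

```javascript
// Hypothetical helper: validate a model-produced call against the tool schema.
// Only two of the seven definitions are repeated here (descriptions omitted).
const toolDefs = [
  {
    type: "function",
    function: {
      name: "increaseMemory",
      parameters: {
        type: "object",
        properties: {
          service: { type: "string" },
          memoryMb: { type: "integer" }
        },
        required: ["service", "memoryMb"]
      }
    }
  },
  {
    type: "function",
    function: {
      name: "restartService",
      parameters: {
        type: "object",
        properties: { service: { type: "string" } },
        required: ["service"]
      }
    }
  }
];

// Returns a list of problems; an empty list means the call looks valid.
function validateToolCall(call, defs) {
  const def = defs.find((d) => d.function.name === call.name);
  if (!def) return [`unknown tool: ${call.name}`];

  const { properties, required } = def.function.parameters;
  const args = call.arguments || {};
  const problems = [];

  // Every required argument must be present.
  for (const key of required) {
    if (!(key in args)) problems.push(`missing required argument: ${key}`);
  }
  // Every supplied argument must be declared and have the declared type.
  for (const [key, value] of Object.entries(args)) {
    const spec = properties[key];
    if (!spec) {
      problems.push(`unexpected argument: ${key}`);
    } else if (spec.type === "integer" && !Number.isInteger(value)) {
      problems.push(`argument ${key} must be an integer`);
    } else if (spec.type === "string" && typeof value !== "string") {
      problems.push(`argument ${key} must be a string`);
    }
  }
  return problems;
}
```

A call that returns a non-empty list is a reasonable candidate for human review rather than automatic execution.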

### Example Usage with dad-express

```javascript
const { FunctionGemmaEngine } = require('dad-express');

const engine = new FunctionGemmaEngine({
  modelPath: './functiongemma-infra-v8_q8_ekv1024.litertlm',
  tools: JSON.stringify(tools) // the tool definitions from the previous section
});

// CommonJS modules cannot use top-level await, so wrap the call in an async function
async function main() {
  // Diagnose an error
  const result = await engine.call('Container api was OOMKilled - out of memory');
  console.log(result.tool_calls[0].function);
  // { name: 'increaseMemory', arguments: { service: 'api', memoryMb: 1024 } }
}

main().catch(console.error);
```
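Once a call of the shape `{ name, arguments }` comes back, something has to route it to code that applies the fix. One way to do that is a small handler registry, sketched below. The `handlers` object and `dispatchToolCall` function are hypothetical and not part of dad-express, and the handlers only describe the action (a dry run) instead of touching real infrastructure.

```javascript
// Hypothetical handler registry: each tool name maps to a function that
// describes the fix. Real handlers would call your own infrastructure APIs.
const handlers = {
  increaseMemory: ({ service, memoryMb }) =>
    `set memory limit of ${service} to ${memoryMb}MB`,
  restartService: ({ service }) => `restart ${service}`,
  enableCors: ({ origin }) => `allow CORS origin ${origin}`
};

// Route a tool call of the shape { name, arguments } to its handler.
function dispatchToolCall(call, registry, log = console.log) {
  // Only accept names defined directly on the registry, so inherited
  // properties such as "toString" are never treated as handlers.
  const known = Object.prototype.hasOwnProperty.call(registry, call.name);
  if (!known || typeof registry[call.name] !== "function") {
    log(`[dispatch] no handler for ${call.name}; escalating to a human`);
    return { handled: false, action: null };
  }
  const action = registry[call.name](call.arguments || {});
  log(`[dispatch] ${call.name}: ${action}`);
  return { handled: true, action };
}
```

Keeping the registry separate from the dispatch logic means a tool can be switched from dry run to live execution by replacing a single entry.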

## Training Data

The model was trained on 10,500 synthetic examples covering common infrastructure errors:

| Error Pattern | Tool | Examples |
|--------------|------|----------|
| CORS policy errors | enableCors | 1,500 |
| ECONNREFUSED errors | updateConnectionUrl | 1,500 |
| Missing env vars | setEnvVar | 1,500 |
| DNS/ENOTFOUND errors | addHostMapping | 1,500 |
| OOMKilled errors | increaseMemory | 1,500 |
| Timeout errors | increaseTimeout | 1,500 |
| Stuck services | restartService | 1,500 |

### Sample Training Examples

```
"CORS error: No 'Access-Control-Allow-Origin' header from http://localhost:3000" → enableCors
"Error: connect ECONNREFUSED 127.0.0.1:5432 - database connection failed" → updateConnectionUrl
"Missing required environment variable: DATABASE_URL" → setEnvVar
"getaddrinfo ENOTFOUND db" → addHostMapping
"Container api was OOMKilled" → increaseMemory
"504 Gateway Timeout from backend" → increaseTimeout
"nginx container is not responding" → restartService
```
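Pairs like these can double as a small regression set for checking tool selection after a model or prompt change. The sketch below computes overall and per-tool accuracy for any `predict(text)` function. It does not reproduce the evaluation behind the reported accuracy: `evaluate` is a hypothetical helper, and `keywordPredict` is a stand-in keyword matcher used only so the example runs without the model.

```javascript
// Labeled samples taken from the list above.
const samples = [
  { input: "CORS error: No 'Access-Control-Allow-Origin' header from http://localhost:3000", expected: "enableCors" },
  { input: "Error: connect ECONNREFUSED 127.0.0.1:5432 - database connection failed", expected: "updateConnectionUrl" },
  { input: "Missing required environment variable: DATABASE_URL", expected: "setEnvVar" },
  { input: "getaddrinfo ENOTFOUND db", expected: "addHostMapping" },
  { input: "Container api was OOMKilled", expected: "increaseMemory" },
  { input: "504 Gateway Timeout from backend", expected: "increaseTimeout" },
  { input: "nginx container is not responding", expected: "restartService" }
];

// Stand-in predictor (an assumption for this sketch): a keyword lookup.
// In practice, replace it with a function that calls the model.
function keywordPredict(text) {
  if (/CORS/i.test(text)) return "enableCors";
  if (/ECONNREFUSED/.test(text)) return "updateConnectionUrl";
  if (/environment variable/i.test(text)) return "setEnvVar";
  if (/ENOTFOUND/.test(text)) return "addHostMapping";
  if (/OOMKilled/i.test(text)) return "increaseMemory";
  if (/timeout/i.test(text)) return "increaseTimeout";
  return "restartService";
}

// Compute overall and per-tool accuracy for any predict(text) -> toolName.
function evaluate(predict, labeled) {
  const perTool = {};
  let correct = 0;
  for (const { input, expected } of labeled) {
    if (!perTool[expected]) perTool[expected] = { total: 0, correct: 0 };
    perTool[expected].total += 1;
    if (predict(input) === expected) {
      perTool[expected].correct += 1;
      correct += 1;
    }
  }
  const accuracy = labeled.length ? correct / labeled.length : 0;
  return { accuracy, perTool };
}
```

The per-tool breakdown shows which tool regressed when overall accuracy drops, which a single aggregate number would hide.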



## Fully Loaded Serving

**Fully Loaded Serving** is an end-to-end intelligent error remediation pipeline that runs entirely on-device. It combines:

1. **Low-latency vector embeddings** (EmbeddingGemma) for streaming log classification
2. **Semantic clustering** to group similar errors/issues/outliers  
3. **Function calling** (FunctionGemma) to automatically diagnose and fix infrastructure issues
4. **Prompt optimization** via [Ax](https://github.com/ax-llm/ax) with MiPRO for continuous improvement

### Architecture

```
┌─────────────────────────────────────────────────────────────────────────┐
│                         Next.js Application                             │
├─────────────────────────────────────────────────────────────────────────┤
│  stdout/stderr ──▶ Log Stream ──▶ dad-express middleware                │
│                                          │                              │
│                    ┌─────────────────────┼──────────────────────┐       │
│                    │                     ▼                      │       │
│                    │  ┌──────────────────────────────────┐      │       │
│                    │  │      EmbeddingGemma (~5ms)       │      │       │
│                    │  │   768-dim vector per log line    │      │       │
│                    │  └──────────────┬───────────────────┘      │       │
│                    │                 │                          │       │
│                    │                 ▼                          │       │
│                    │  ┌──────────────────────────────────┐      │       │
│                    │  │   Semantic Clustering (cosine)   │      │       │
│                    │  │  • Group similar errors          │      │       │
│                    │  │  • Detect outliers               │      │       │
│                    │  │  • Identify recurring patterns   │      │       │
│                    │  └──────────────┬───────────────────┘      │       │
│                    │                 │                          │       │
│                    │                 ▼                          │       │
│                    │  ┌──────────────────────────────────┐      │       │
│                    │  │   FunctionGemma (~50ms/call)     │      │       │
│                    │  │  → enableCors, setEnvVar, etc.   │      │       │
│                    │  └──────────────┬───────────────────┘      │       │
│                    │                 │                          │       │
│                    │                 ▼                          │       │
│                    │  ┌──────────────────────────────────┐      │       │
│                    │  │      Auto-Remediation Layer      │      │       │
│                    │  │  Execute fix or notify developer │      │       │
│                    │  └──────────────────────────────────┘      │       │
│                    │                                            │       │
│                    │     LiteRT-LM (on-device, ~300MB RAM)      │       │
│                    └────────────────────────────────────────────┘       │
└─────────────────────────────────────────────────────────────────────────┘
```
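The clustering stage in the diagram compares log-line embeddings by cosine similarity. As a rough illustration of that comparison, the sketch below implements it in plain JavaScript together with a nearest-cluster lookup. `cosineSimilarity` and `nearestCluster` are written here for illustration only; the 0.85 threshold matches the value used in the hosting example in this README, and the toy vectors in the usage note stand in for real 768-dimensional embeddings.

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) throw new Error("vectors must have the same length");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // Treat similarity with a zero vector as 0 instead of dividing by zero.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the id of the most similar cluster above the threshold, or null.
// `clusters` is a Map of clusterId -> representative embedding.
function nearestCluster(embedding, clusters, threshold = 0.85) {
  let bestId = null;
  let bestScore = threshold;
  for (const [id, representative] of clusters) {
    const score = cosineSimilarity(embedding, representative);
    if (score > bestScore) {
      bestScore = score;
      bestId = id;
    }
  }
  return bestId;
}
```

With clusters `cors → [1, 0, 0]` and `oom → [0, 1, 0]`, the vector `[0.9, 0.1, 0]` is assigned to `cors`, while `[1, 1, 0]` sits at about 0.71 from both and is treated as a new pattern.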

### Ax Integration with MiPRO

[Ax](https://github.com/ax-llm/ax) is a TypeScript DSPy-style framework for declarative AI programming. dad-express provides `AxLiteRTProvider` to run Ax signatures entirely on-device:

```typescript
import { AxGen } from "@ax-llm/ax";
import { AxLiteRTProvider } from "dad-express";

// `infrastructureTools` is the array of 7 tool definitions from "Required Tool Definitions" above

// Create on-device provider with both embedding and chat models
const provider = new AxLiteRTProvider({
  chat: {
    modelPath: "./models/functiongemma-infra-v8_q8_ekv1024.litertlm",
    tools: infrastructureTools,  // The 7 tools from this repo
  },
  embed: {
    modelPath: "./models/embedding_gemma.tflite",
    tokenizerPath: "./models/tokenizer.model",
  },
});

// Define Ax signature for error diagnosis (MiPRO-optimizable)
const diagnoseError = new AxGen(`
  errorMessage:string "The error log line",
  errorCluster:string? "Similar errors seen recently"
  ->
  diagnosis:string "Root cause analysis",
  toolName:string "Which infrastructure tool to call",
  confidence:class "high, medium, low"
`);

// Run inference on-device
const result = await diagnoseError.forward(provider, {
  errorMessage: "CORS error from http://localhost:3000",
  errorCluster: "3 similar CORS errors in last 5 minutes",
});

console.log(result);
// { diagnosis: "Frontend origin not in allowed list", 
//   toolName: "enableCors", 
//   confidence: "high" }
```
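The `confidence` field in this result can drive the choice between executing a fix and notifying a developer, the two outcomes of the auto-remediation layer. The sketch below shows one possible policy under stated assumptions: `decideAction` and `KNOWN_TOOLS` are hypothetical names, and treating `restartService` as requiring approval is an illustrative choice, not something the model or dad-express prescribes.

```javascript
// The seven tool names supported by this model.
const KNOWN_TOOLS = new Set([
  "enableCors",
  "updateConnectionUrl",
  "setEnvVar",
  "addHostMapping",
  "increaseMemory",
  "increaseTimeout",
  "restartService"
]);

// Hypothetical policy: auto-apply only high-confidence, non-disruptive fixes.
// `diagnosis` has the shape { diagnosis, toolName, confidence }.
function decideAction(diagnosis, { approvalRequired = ["restartService"] } = {}) {
  if (!KNOWN_TOOLS.has(diagnosis.toolName)) {
    return { action: "notify", reason: `unknown tool: ${diagnosis.toolName}` };
  }
  if (diagnosis.confidence !== "high") {
    return { action: "notify", reason: `confidence is ${diagnosis.confidence}` };
  }
  if (approvalRequired.includes(diagnosis.toolName)) {
    return { action: "notify", reason: `${diagnosis.toolName} requires approval` };
  }
  return { action: "execute", reason: "high confidence, non-disruptive tool" };
}
```

Returning a reason alongside the action makes it straightforward to log why a fix was or was not applied.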

### Example: Hosting Next.js with Fully Loaded Serving

```typescript
// server.ts - Next.js with intelligent error remediation
import { createApp, FunctionGemmaEngine, EmbeddingEngine } from "dad-express";
import { spawn } from "child_process";

// Infrastructure tools, abbreviated here for brevity. For best accuracy, use the
// exact definitions from "Required Tool Definitions" above.
const tools = [
  { type: "function", function: { name: "enableCors", description: "Enable CORS for a specific origin to fix blocked cross-origin requests.", parameters: { type: "object", properties: { origin: { type: "string", description: "The origin to allow" } }, required: ["origin"] } } },
  { type: "function", function: { name: "updateConnectionUrl", description: "Update a service connection URL to fix ECONNREFUSED errors.", parameters: { type: "object", properties: { service: { type: "string" }, hostname: { type: "string" }, port: { type: "integer" } }, required: ["service", "hostname", "port"] } } },
  { type: "function", function: { name: "setEnvVar", description: "Set an environment variable to fix missing configuration errors.", parameters: { type: "object", properties: { name: { type: "string" }, value: { type: "string" } }, required: ["name", "value"] } } },
  { type: "function", function: { name: "addHostMapping", description: "Add a hostname to IP mapping to fix DNS resolution errors.", parameters: { type: "object", properties: { hostname: { type: "string" }, ip: { type: "string" } }, required: ["hostname", "ip"] } } },
  { type: "function", function: { name: "increaseMemory", description: "Increase memory limit for a service to fix OOMKilled errors.", parameters: { type: "object", properties: { service: { type: "string" }, memoryMb: { type: "integer" } }, required: ["service", "memoryMb"] } } },
  { type: "function", function: { name: "increaseTimeout", description: "Increase timeout value to fix 504 Gateway Timeout errors.", parameters: { type: "object", properties: { service: { type: "string" }, timeoutMs: { type: "integer" } }, required: ["service", "timeoutMs"] } } },
  { type: "function", function: { name: "restartService", description: "Restart a service to apply changes or fix stuck processes.", parameters: { type: "object", properties: { service: { type: "string" } }, required: ["service"] } } },
];

// Initialize on-device models
const embedEngine = new EmbeddingEngine({
  modelPath: "./models/embedding_gemma.tflite",
  tokenizerPath: "./models/tokenizer.model",
});

const functionGemma = new FunctionGemmaEngine({
  modelPath: "./models/functiongemma-infra-v8_q8_ekv1024.litertlm",
  tools: JSON.stringify(tools),
});

// Error clustering state
const errorClusters = new Map<string, { embedding: Float32Array; count: number; lastSeen: Date }>();

async function classifyAndCluster(logLine: string): Promise<string | null> {
  // Skip non-error lines
  if (!logLine.match(/error|fail|exception|timeout|refused|denied/i)) {
    return null;
  }

  // Generate embedding (~5ms on CPU)
  const embedding = await embedEngine.encodeAsync(logLine);

  // Find similar errors via cosine similarity
  let bestMatch: string | null = null;
  let bestSimilarity = 0.85; // Threshold for clustering

  for (const [clusterId, cluster] of errorClusters) {
    const similarity = EmbeddingEngine.cosineSimilarity(embedding, cluster.embedding);
    if (similarity > bestSimilarity) {
      bestSimilarity = similarity;
      bestMatch = clusterId;
    }
  }

  if (bestMatch) {
    // Update existing cluster
    const cluster = errorClusters.get(bestMatch)!;
    cluster.count++;
    cluster.lastSeen = new Date();
    return bestMatch;
  }

  // Create new cluster
  const clusterId = `cluster_${Date.now()}`;
  errorClusters.set(clusterId, { embedding, count: 1, lastSeen: new Date() });
  return clusterId;
}

async function diagnoseAndFix(errorLog: string, clusterId: string): Promise<void> {
  const cluster = errorClusters.get(clusterId);
  
  // Call FunctionGemma for diagnosis (~50ms)
  const result = await functionGemma.sendMessage(errorLog);
  
  if (result.functionCalls && result.functionCalls.length > 0) {
    const call = result.functionCalls[0];
    console.log(`[AutoFix] Detected ${cluster?.count || 1}x: ${call.name}`);
    console.log(`[AutoFix] Args: ${JSON.stringify(call.arguments)}`);
    
    // Execute remediation (in production, this would call actual infrastructure APIs)
    switch (call.name) {
      case "enableCors":
        console.log(`[AutoFix] Would enable CORS for: ${call.arguments.origin}`);
        break;
      case "restartService":
        console.log(`[AutoFix] Would restart: ${call.arguments.service}`);
        break;
      case "increaseMemory":
        console.log(`[AutoFix] Would increase memory for ${call.arguments.service} to ${call.arguments.memoryMb}MB`);
        break;
      // ... handle other tools
    }
  }
}

// Create dad-express app
const app = createApp();

// API routes
app.get("/health", () => ({ status: "ok", models: { embed: true, functionGemma: true } }));

app.get("/clusters", () => {
  const clusters = [];
  for (const [id, cluster] of errorClusters) {
    clusters.push({ id, count: cluster.count, lastSeen: cluster.lastSeen });
  }
  return clusters;
});

// Start Next.js as child process with log monitoring
const nextProcess = spawn("npx", ["next", "start"], {
  stdio: ["inherit", "pipe", "pipe"],
  env: { ...process.env, PORT: "3001" },
});

// Stream stdout
nextProcess.stdout.on("data", (data) => {
  const line = data.toString().trim();
  console.log(`[next] ${line}`);
});

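// Stream stderr with intelligent processing
nextProcess.stderr.on("data", async (data) => {
  // A single chunk may carry several log lines, so handle each one separately
  const lines = data.toString().split("\n").map((l: string) => l.trim()).filter(Boolean);

  for (const line of lines) {
    console.log(`[next:err] ${line}`);
    try {
      // Classify and cluster the error, then diagnose and auto-fix
      const clusterId = await classifyAndCluster(line);
      if (clusterId) {
        await diagnoseAndFix(line, clusterId);
      }
    } catch (err) {
      // Keep the monitor alive if inference fails for one line
      console.error(`[AutoFix] Failed to process log line: ${err}`);
    }
  }
});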

// Start dad-express on separate port for monitoring
app.listen(4000, () => {
  console.log("dad-express monitoring on http://localhost:4000");
  console.log("Next.js app on http://localhost:3001");
});
```
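One practical detail when monitoring a child process: `data` events deliver arbitrary chunks, so a log line can be split across two events or several lines can arrive in one. The sketch below shows a small line buffer that reassembles complete lines before they reach the classifier. `createLineBuffer` is a hypothetical helper, not part of dad-express.

```javascript
// Hypothetical helper: turn arbitrary stream chunks into complete lines.
function createLineBuffer(onLine) {
  let pending = "";
  return {
    // Feed a chunk (string or Buffer); emits every complete line it contains.
    push(chunk) {
      pending += chunk.toString();
      const parts = pending.split(/\r?\n/);
      // The last element is an incomplete line (or "" if the chunk ended on a newline).
      pending = parts.pop();
      for (const line of parts) {
        const trimmed = line.trim();
        if (trimmed !== "") onLine(trimmed);
      }
    },
    // Emit whatever is left once the stream ends.
    flush() {
      const trimmed = pending.trim();
      if (trimmed !== "") onLine(trimmed);
      pending = "";
    }
  };
}
```

To use it with the example above, create one buffer per stream, call `push(chunk)` from the `data` handler, and call `flush()` from the stream's `end` handler so a final line without a trailing newline is not lost.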

### Key Benefits

| Feature | Latency | Memory | Cloud Calls |
|---------|---------|--------|-------------|
| EmbeddingGemma | ~5ms/embed | ~50MB | 0 |
| FunctionGemma | ~50ms/call | ~271MB | 0 |
| Semantic clustering | <1ms | Varies | 0 |
| **Total pipeline** | **~60ms** | **~350MB** | **0** |

- **Zero cloud dependency**: All inference runs locally via LiteRT-LM
- **Sub-100ms latency**: Fast enough for real-time log processing
- **Privacy-preserving**: Error logs never leave the device
- **Continuous improvement**: Use Ax MiPRO to optimize prompts over time
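
The memory used by semantic clustering varies because the cluster map in the hosting example grows with every new error pattern and is never trimmed. A minimal sketch of bounding it follows, assuming a `Map` of cluster id to an object with a `lastSeen` date as in that example. `pruneClusters` and its default limits are hypothetical.

```javascript
// Hypothetical pruning pass for a Map of clusterId -> { count, lastSeen }.
// Drops clusters idle longer than maxAgeMs, then evicts the least recently
// seen entries until at most maxClusters remain. Returns the new size.
function pruneClusters(
  clusters,
  { maxAgeMs = 60 * 60 * 1000, maxClusters = 500, now = Date.now() } = {}
) {
  // Deleting entries while iterating a Map is safe in JavaScript.
  for (const [id, cluster] of clusters) {
    if (now - cluster.lastSeen.getTime() > maxAgeMs) clusters.delete(id);
  }
  if (clusters.size > maxClusters) {
    const oldestFirst = [...clusters.entries()].sort(
      (a, b) => a[1].lastSeen.getTime() - b[1].lastSeen.getTime()
    );
    for (const [id] of oldestFirst.slice(0, clusters.size - maxClusters)) {
      clusters.delete(id);
    }
  }
  return clusters.size;
}
```

Running this on a timer, for example once a minute, keeps both memory and the per-line similarity scan bounded.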

## Limitations

- Optimized for the 7 specific infrastructure tools listed above
- Requires exact tool definitions for best accuracy
- May not generalize well to error patterns not seen in training

## License

This model inherits the [Gemma license](https://ai.google.dev/gemma/terms) from the base model.