---
license: gemma
base_model: google/functiongemma-270m-it
tags:
- function-calling
- litert
- on-device
- gemma
- infrastructure
- LiteRT-LM
datasets:
- custom
language:
- en
pipeline_tag: text-generation
---

# FunctionGemma Infrastructure LiteRT-LM

A fine-tuned [FunctionGemma-270M](https://huggingface.co/google/functiongemma-270m-it) model converted to the LiteRT-LM format for on-device inference, designed for self-healing infrastructure and automatic error remediation.

## Model Details

| Property | Value |
|----------|-------|
| Base Model | google/functiongemma-270m-it |
| Format | LiteRT-LM (.litertlm) |
| Quantization | Dynamic INT8 |
| File Size | 272 MB |
| Parameters | 270M |

## Intended Use

This model is designed for [dad-express](https://github.com/anthropics/dad-express), a self-healing gateway that monitors HTTP traffic and automatically fixes infrastructure configuration issues by calling the appropriate tools.

## Supported Tools

The model was fine-tuned on 9 infrastructure tools:

| Tool | Description | Parameters |
|------|-------------|------------|
| `addProxyRoute` | Add reverse proxy route | path, upstream, port |
| `addCorsHeaders` | Configure CORS headers | origin, credentials |
| `configureSsl` | Configure SSL certificate | hostname, selfSigned |
| `setEnvVariable` | Set environment variable | name, value |
| `exposePort` | Expose port in Docker/firewall | service, port |
| `addHostEntry` | Add hostname to /etc/hosts | hostname, ip |
| `restartService` | Restart a service | service |
| `clearCache` | Clear cache | cacheType |
| `modifyConfig` | Modify config file | file, key, value |
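On the gateway side, each tool name the model emits has to map to a concrete handler. A minimal registry sketch in Python (the handler bodies are hypothetical stand-ins, not the dad-express implementation):

```python
from typing import Callable, Dict

# Hypothetical stand-in handlers; the real implementations would edit configs,
# restart services, and so on.
def add_cors_headers(origin: str, credentials: str) -> None:
    print(f"CORS: allow {origin} (credentials={credentials})")

def restart_service(service: str) -> None:
    print(f"restarting {service}")

# Tool names exactly as the model emits them, mapped to their handlers.
TOOL_REGISTRY: Dict[str, Callable[..., None]] = {
    "addCorsHeaders": add_cors_headers,
    "restartService": restart_service,
    # ...and so on for the remaining tools in the table above
}
```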

## Training Details

### Dataset
- **Total Examples**: 10,216
- **Train/Eval Split**: 90/10 (9,194 train, 1,022 eval)
- **Format**: Prompt-completion pairs using the FunctionGemma chat template (see the sketch after this list)
- **Distribution**: ~1,000-1,200 examples per tool (balanced)
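
A sketch of what one training pair might look like in the prompt-completion shape TRL expects. The prompt string below is a stand-in: the real prompts are rendered with the FunctionGemma chat template (tool declarations plus the error turn), while the completion format matches the example output shown later in this card:

```python
# Illustrative only: one prompt-completion pair in the shape TRL expects.
# The prompt here is a placeholder; real prompts carry the full
# FunctionGemma chat-template markup with the tool declarations.
example = {
    "prompt": (
        "Error: CORS - No 'Access-Control-Allow-Origin' header "
        "from http://localhost:3000"
    ),
    "completion": (
        "<start_function_call>call:addCorsHeaders"
        "{origin:<escape>http://localhost:3000<escape>,"
        "credentials:<escape>true<escape>}"
        "<end_function_call><start_function_response>"
    ),
}
```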

### Training Configuration

Trained following [Google's official FunctionGemma fine-tuning notebook](https://colab.research.google.com/github/google-gemini/gemma-cookbook/blob/main/FunctionGemma/%5BFunctionGemma%5DFinetune_FunctionGemma_270M_for_Mobile_Actions_with_Hugging_Face.ipynb):

```python
from trl import SFTConfig

training_args = SFTConfig(
    num_train_epochs=2,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=8,
    learning_rate=1e-5,
    lr_scheduler_type="cosine",
    gradient_checkpointing=True,
    packing=False,
    optim="adamw_torch_fused",
    bf16=True,
    completion_only_loss=True,  # Critical: only train on completion tokens
)
```
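
A sketch of how this config is typically wired into TRL's `SFTTrainer`. The dataset file name and split handling below are assumptions for illustration, not the exact training script:

```python
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import SFTTrainer

# Assumed location/format: a local JSONL of prompt-completion pairs.
dataset = load_dataset("json", data_files="infrastructure_tools.jsonl", split="train")
split = dataset.train_test_split(test_size=0.1, seed=42)

model_id = "google/functiongemma-270m-it"
model = AutoModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

trainer = SFTTrainer(
    model=model,
    args=training_args,          # the SFTConfig shown above
    train_dataset=split["train"],
    eval_dataset=split["test"],
    processing_class=tokenizer,
)
trainer.train()
```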

### Hardware
- **GPU**: NVIDIA L4 (24GB)
- **Training Time**: ~23 minutes
- **Conversion Time**: ~10 minutes

### Performance Metrics

| Metric | Value |
|--------|-------|
| Final Eval Loss | 0.034 |
| Token Accuracy | 98.6% |
| Training Steps | 576 |

## LiteRT-LM Conversion

Converted using [ai-edge-torch](https://github.com/google-ai-edge/ai-edge-torch):

```python
converter.convert_to_litert(
    pytorch_model,
    prefill_seq_len=256,
    kv_cache_max_len=1024,
    quantize="dynamic_int8",
    output_format="litertlm",
)
```

### LLM Metadata
```protobuf
start_token: { token_ids: { ids: [ 2 ] } }
stop_tokens: { token_str: "<end_of_turn>" }
stop_tokens: { token_str: "<start_function_response>" }
llm_model_type: { function_gemma: {} }
```

## Usage

### With LiteRT-LM Runtime
```typescript
import { LiteRTLM } from 'litert-lm';

const model = await LiteRTLM.load('functiongemma-infrastructure_q8_ekv1024.litertlm');
const response = await model.generate(prompt);
```

### Example Input/Output

**Input:**
```
Error: CORS - No 'Access-Control-Allow-Origin' header from http://localhost:3000
```

**Output:**
```
<start_function_call>call:addCorsHeaders{origin:<escape>http://localhost:3000<escape>,credentials:<escape>true<escape>}<end_function_call><start_function_response>
```
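
The emitted call can be turned into a tool name and arguments with a small parser before being dispatched to a handler (for instance via a registry like the one sketched under Supported Tools). A minimal sketch, assuming arguments are always `<escape>`-delimited as in the example above:

```python
import re

# Matches the tool name and the raw argument blob inside the call markers.
CALL_RE = re.compile(r"<start_function_call>call:(\w+)\{(.*?)\}<end_function_call>", re.S)
# Matches individual key:<escape>value<escape> arguments.
ARG_RE = re.compile(r"(\w+):<escape>(.*?)<escape>")

def parse_function_call(output: str) -> tuple[str, dict[str, str]]:
    """Extract the tool name and arguments from the model's raw output."""
    match = CALL_RE.search(output)
    if match is None:
        raise ValueError("no function call found in model output")
    name, arg_blob = match.groups()
    return name, dict(ARG_RE.findall(arg_blob))

raw = ("<start_function_call>call:addCorsHeaders{origin:<escape>"
       "http://localhost:3000<escape>,credentials:<escape>true<escape>}"
       "<end_function_call><start_function_response>")
print(parse_function_call(raw))
# ('addCorsHeaders', {'origin': 'http://localhost:3000', 'credentials': 'true'})
```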

## Dependencies

- Python 3.11+
- transformers==4.57.1
- trl==0.25.1
- datasets==4.4.1
- ai-edge-torch-nightly
- ai-edge-litert-nightly

## License

This model inherits the [Gemma Terms of Use](https://ai.google.dev/gemma/terms).

## Citation

```bibtex
@misc{functiongemma-infrastructure-litertlm,
  title={FunctionGemma Infrastructure LiteRT-LM},
  author={dad-express contributors},
  year={2024},
  publisher={Hugging Face},
  url={https://huggingface.co/macmacmacmac/functiongemma-infrastructure-litertlm}
}
```