# MnemoCore Performance Documentation

## Performance Targets (SLOs)

| Metric | Target | Description |
|--------|--------|-------------|
| `store()` P99 latency | < 100ms | Store a single memory |
| `query()` P99 latency | < 50ms | Query for similar memories |
| Throughput | > 1000 req/s | Sustained request rate |
| Memory overhead | < 100MB per 100k memories | RAM usage for storage |
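
The latency targets above are P99 values, so checking them means collecting per-call timings and taking the 99th percentile. The following is a minimal sketch of that check using the nearest-rank method; the helper names (`percentile`, `measure_latencies_ms`, `check_slo`) are hypothetical and not part of MnemoCore, and the callable being timed stands in for whatever `store()` or `query()` call you want to measure.

```python
import math
import time


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    # Nearest-rank: smallest value with at least pct% of samples at or below it.
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def measure_latencies_ms(fn, iterations=1000):
    """Call fn() repeatedly and return per-call wall-clock latencies in ms."""
    latencies = []
    for _ in range(iterations):
        start = time.perf_counter()
        fn()
        latencies.append((time.perf_counter() - start) * 1000.0)
    return latencies


def check_slo(latencies_ms, target_ms):
    """Return (p99, passed), where passed means P99 is below the target."""
    p99 = percentile(latencies_ms, 99)
    return p99, p99 < target_ms
```

For example, `check_slo(measure_latencies_ms(my_store_call), target_ms=100)` would compare a hypothetical `my_store_call` against the `store()` target. Keep in mind that with only a few hundred samples the P99 is determined by a handful of calls, so longer runs give steadier numbers.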

## Baseline Measurements

### BinaryHDV Operations (1024 dimensions)

| Operation | Time (µs) | Notes |
|-----------|-----------|-------|
| `xor_bind()` | ~5 | XOR binding of two vectors |
| `permute()` | ~5 | Cyclic permutation |
| `hamming_distance()` | ~3 | Distance calculation |
| `similarity()` | ~4 | Normalized similarity |
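
To make the operations in this table concrete, here is a minimal NumPy sketch of what they could look like on bit-packed vectors. It is not the MnemoCore implementation: the free functions, the `uint8` packing, and the definition of similarity as `1 - hamming / dimension` are all assumptions made for illustration.

```python
import numpy as np

DIMENSION = 1024  # bits per vector, matching the table above


def random_hdv(rng, dimension=DIMENSION):
    """Random binary vector stored as packed bytes (8 bits per uint8)."""
    return rng.integers(0, 256, size=dimension // 8, dtype=np.uint8)


def xor_bind(a, b):
    """Bind two packed vectors with element-wise XOR."""
    return np.bitwise_xor(a, b)


def hamming_distance(a, b):
    """Number of bit positions where the two vectors differ."""
    return int(np.unpackbits(np.bitwise_xor(a, b)).sum())


def similarity(a, b, dimension=DIMENSION):
    """Assumed normalization: 1.0 for identical vectors, 0.0 for complements."""
    return 1.0 - hamming_distance(a, b) / dimension
```

XOR binding is its own inverse, so `xor_bind(xor_bind(a, b), b)` recovers `a`. Two independent random vectors should land near a similarity of 0.5 under this normalization.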

### permute() Benchmark Results

`BinaryHDV.permute()` uses a single production path (`unpackbits` + `roll` + `packbits`) for all dimensions.

| Dimension | permute() (µs) | Notes |
|-----------|----------------|-------|
| 512 | ~5.2 | Production path |
| 4096 | ~5.5 | Production path |
| 16384 | ~6.8 | Production path |
| 32768 | ~8.2 | Production path |
| 65536 | ~11.3 | Production path |
| 131072 | ~17.7 | Production path |

Run `python benchmarks/bench_permute.py` to get current numbers for your machine.
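
The `unpackbits` + `roll` + `packbits` path described above can be sketched as follows, assuming these are the NumPy functions of the same names and that vectors are stored as packed `uint8` arrays. The standalone `permute` function and its shift direction are illustrative assumptions, not the actual `BinaryHDV.permute()` code.

```python
import numpy as np


def permute(packed, shift=1):
    """Cyclically shift a bit-packed vector by `shift` bit positions."""
    bits = np.unpackbits(packed)     # one array element per bit
    rolled = np.roll(bits, shift)    # cyclic shift; bits wrap around the end
    return np.packbits(rolled)       # back to 8 bits per byte


def timed_permute_us(dimension, repeats=1000):
    """Rough average permute() time in microseconds for one dimension."""
    import time

    rng = np.random.default_rng(0)
    packed = rng.integers(0, 256, size=dimension // 8, dtype=np.uint8)
    start = time.perf_counter()
    for _ in range(repeats):
        permute(packed)
    return (time.perf_counter() - start) / repeats * 1e6
```

Because the shift is cyclic, permuting by the full dimension returns the original vector, and `permute(permute(v, k), -k)` undoes a shift of `k`. `timed_permute_us` is only a rough timer for quick comparisons; the benchmark script remains the reference for reported numbers.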

## Load Testing

### Using Locust

```bash
# Install Locust
pip install locust

# Run the load test
cd tests/load
locust -f locustfile.py --host http://localhost:8100
```

### Using the Benchmark Script

```bash
# Run the 100k-memory benchmark
python benchmarks/bench_100k_memories.py
```

## Performance Optimization Tips

1. Use BinaryHDV instead of float HDV.
2. Use batch operations for bulk work.
3. Keep Redis connection pools right-sized.
4. Enable Qdrant binary quantization for faster search.
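
As an illustration of tips 1 and 2, the sketch below compares one query against many bit-packed vectors in a single vectorized NumPy call rather than a Python loop. The function names and the `(n_memories, dimension // 8)` matrix layout are assumptions for this example and do not reflect the MnemoCore API.

```python
import numpy as np


def batch_hamming(query, memory_matrix):
    """Hamming distance from one packed vector to every row of a matrix.

    query:         uint8 array of shape (dimension // 8,)
    memory_matrix: uint8 array of shape (n_memories, dimension // 8)
    """
    # Broadcasting XORs the query against every row at once.
    xored = np.bitwise_xor(memory_matrix, query)
    # Unpack to individual bits per row, then count differing bits.
    return np.unpackbits(xored, axis=1).sum(axis=1)


def top_k(query, memory_matrix, k=5):
    """Indices and distances of the k nearest rows by Hamming distance."""
    distances = batch_hamming(query, memory_matrix)
    # Stable sort keeps the lower row index first when distances tie.
    order = np.argsort(distances, kind="stable")[:k]
    return order, distances[order]
```

Packed binary vectors also use 1 bit per dimension instead of 32 for `float32`, which is one reason the binary representation helps with the memory overhead target.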

## Monitoring

Prometheus metrics are exposed at the `/metrics` endpoint:
- `mnemocore_store_duration_seconds` - Store operation latency
- `mnemocore_query_duration_seconds` - Query operation latency
- `mnemocore_memory_count_total` - Total memories per tier
- `mnemocore_queue_length` - Subconscious queue length
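
For a quick look at these metrics without a Prometheus server, the text exposition output can be read directly. The sketch below uses only the Python standard library; `fetch_metrics` and `parse_metrics` are hypothetical helpers, and the base URL is assumed to be the same `http://localhost:8100` host used in the load test command.

```python
import urllib.request


def fetch_metrics(base_url="http://localhost:8100"):
    """Download the raw Prometheus text exposition from /metrics."""
    url = base_url.rstrip("/") + "/metrics"
    with urllib.request.urlopen(url, timeout=5) as response:
        return response.read().decode("utf-8")


def parse_metrics(text, prefix="mnemocore_"):
    """Map each matching series (name plus labels) to its float value."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, HELP/TYPE comments, and unrelated metrics.
        if not line or line.startswith("#") or not line.startswith(prefix):
            continue
        if "}" in line:
            # Labels may contain spaces, so split after the closing brace.
            end = line.rindex("}") + 1
            series, rest = line[:end], line[end:].split()
        else:
            parts = line.split()
            series, rest = parts[0], parts[1:]
        if rest:
            # rest[0] is the value; an optional timestamp may follow it.
            samples[series] = float(rest[0])
    return samples
```

Calling `parse_metrics(fetch_metrics())` against a running instance returns a dictionary keyed by series name. For latency histograms, the keys will include the `_bucket`, `_sum`, and `_count` series that Prometheus client libraries emit.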