# MnemoCore Performance Documentation

## Performance Targets (SLOs)
| Metric | Target | Description |
|---|---|---|
| store() P99 latency | < 100ms | Store a single memory |
| query() P99 latency | < 50ms | Query for similar memories |
| Throughput | > 1000 req/s | Sustained request rate |
| Memory overhead | < 100MB per 100k memories | RAM usage for storage |
## Baseline Measurements

### BinaryHDV Operations (1024 dimensions)
| Operation | Time (µs) | Notes |
|---|---|---|
| xor_bind() | ~5 | XOR binding of two vectors |
| permute() | ~5 | Cyclic permutation |
| hamming_distance() | ~3 | Distance calculation |
| similarity() | ~4 | Normalized similarity |
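The four operations above can be sketched on bit-packed NumPy arrays. This is an illustrative sketch of binary HDV arithmetic under the 1024-dimension assumption, not MnemoCore's actual implementation; the helper names are assumptions.

```python
import numpy as np

DIM = 1024  # bits per hypervector; packed into DIM // 8 uint8 bytes

def random_hdv(rng: np.random.Generator) -> np.ndarray:
    """Random binary hypervector, packed 8 bits per byte."""
    return rng.integers(0, 256, size=DIM // 8, dtype=np.uint8)

def xor_bind(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Elementwise XOR binding; self-inverse, so binding twice recovers the input."""
    return np.bitwise_xor(a, b)

def hamming_distance(a: np.ndarray, b: np.ndarray) -> int:
    """Number of differing bits between two packed vectors."""
    return int(np.unpackbits(np.bitwise_xor(a, b)).sum())

def similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Normalized similarity in [0, 1]: 1.0 = identical, ~0.5 = unrelated."""
    return 1.0 - hamming_distance(a, b) / DIM

rng = np.random.default_rng(0)
a, b = random_hdv(rng), random_hdv(rng)
assert np.array_equal(xor_bind(xor_bind(a, b), b), a)  # XOR bind is self-inverse
```

Packing 8 bits per byte is what keeps these operations in the microsecond range: a 1024-bit vector is only 128 bytes, so XOR and popcount stay cache-resident.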
### permute() Benchmark Results
BinaryHDV.permute() now uses a single production path (unpackbits + roll + packbits) across all dimensions.
| Dimension | permute() (µs) | Notes |
|---|---|---|
| 512 | ~5.2 | Production path |
| 4096 | ~5.5 | Production path |
| 16384 | ~6.8 | Production path |
| 32768 | ~8.2 | Production path |
| 65536 | ~11.3 | Production path |
| 131072 | ~17.7 | Production path |
Run `python benchmarks/bench_permute.py` to get current numbers for your machine.
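The unpackbits + roll + packbits path named above can be sketched in a few lines of NumPy. This is an illustrative version, not the project's actual code:

```python
import numpy as np

def permute(v: np.ndarray, shift: int = 1) -> np.ndarray:
    """Cyclic permutation of a packed binary hypervector:
    unpack to individual bits, roll by `shift` positions, repack."""
    bits = np.unpackbits(v)          # (dim,) array of 0/1
    return np.packbits(np.roll(bits, shift))

# Rolling by the full dimension is the identity.
v = np.random.default_rng(0).integers(0, 256, size=1024 // 8, dtype=np.uint8)
assert np.array_equal(permute(v, 1024), v)
```

Because the work is dominated by the unpack/pack passes rather than the roll itself, the cost grows only mildly with dimension, which matches the near-flat timings in the table above.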
## Load Testing

### Using Locust
```bash
# Install Locust
pip install locust

# Run the load test
cd tests/load
locust -f locustfile.py --host http://localhost:8100
```
### Using the Benchmark Script
```bash
# Run the 100k-memory benchmark
python benchmarks/bench_100k_memories.py
```
## Performance Optimization Tips
- Use BinaryHDV instead of float HDV.
- Use batch operations for bulk work.
- Keep Redis connection pools right-sized.
- Enable Qdrant binary quantization for faster search.
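To illustrate the batch-operations tip: comparing one packed query against every stored vector in a single vectorized pass avoids per-item Python overhead. The helper below is a sketch using a precomputed popcount lookup table, not part of MnemoCore:

```python
import numpy as np

DIM = 1024  # bits per hypervector

# Popcount lookup: number of set bits for each possible byte value 0..255.
POPCOUNT = np.unpackbits(np.arange(256, dtype=np.uint8)[:, None], axis=1).sum(axis=1)

def batch_hamming(query: np.ndarray, memory: np.ndarray) -> np.ndarray:
    """Hamming distance from one packed query vector to every row of a
    packed (n, DIM // 8) memory matrix, in one vectorized pass."""
    return POPCOUNT[np.bitwise_xor(memory, query)].sum(axis=1)

rng = np.random.default_rng(0)
memory = rng.integers(0, 256, size=(1000, DIM // 8), dtype=np.uint8)
q = memory[42]                 # query identical to stored row 42
d = batch_hamming(q, memory)
assert d[42] == 0              # exact match has distance 0
```

The same idea is what Qdrant's binary quantization exploits server-side: XOR plus popcount over packed bits is far cheaper than float dot products.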
## Monitoring

Prometheus metrics are exposed at the `/metrics` endpoint:
- `mnemocore_store_duration_seconds` - Store operation latency
- `mnemocore_query_duration_seconds` - Query operation latency
- `mnemocore_memory_count_total` - Total memories per tier
- `mnemocore_queue_length` - Subconscious queue length
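Assuming the duration metrics are Prometheus histograms (so the standard `_bucket` and `_count` series are exported, which is an assumption about the instrumentation), the latency and throughput SLOs above can be checked with PromQL queries such as:

```promql
# P99 store latency over the last 5 minutes (target: < 0.1s)
histogram_quantile(0.99,
  sum(rate(mnemocore_store_duration_seconds_bucket[5m])) by (le))

# Sustained store request rate (target, combined with queries: > 1000 req/s)
sum(rate(mnemocore_store_duration_seconds_count[5m]))
```

These make good candidates for alerting rules: fire when the P99 quantile exceeds the SLO target for a sustained window.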