---
license: mit
tags:
- sparseflow
- sparse-attention
- conversational
- efficient
---
# SparseFlow-Chat v5
An efficient conversational model built on **sparse attention**, cutting attention compute by 87.5% relative to a dense transformer.
## 🚀 Performance
| Metric | Value |
|--------|-------|
| Parameters | 39,840,002 |
| Perplexity | 1.00 |
| Token Sparsity | 87.5% |
| Attention Saved | 87.5% |
## 🏗️ Architecture
- **Sparse Token Router**: O(n×k) instead of O(n²) attention
- **Persistent Memory Banks**: Store and retrieve knowledge
- **Channel Sparsity**: Activates only top-k channels
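The sparse token routing above can be sketched as follows. This is an illustrative NumPy implementation, not the model's actual code: each query attends only to its `top_k` highest-scoring keys. For clarity the routing scores are computed densely here (which is itself O(n²)); the real router presumably narrows the candidate set first to reach O(n×k) end to end.

```python
import numpy as np

def sparse_attention(q, k, v, top_k):
    """Each query attends only to its top_k highest-scoring keys,
    so the weighted sum over values costs O(n*k) instead of O(n^2)."""
    scores = q @ k.T / np.sqrt(q.shape[-1])              # (n, n) routing scores
    idx = np.argpartition(-scores, top_k - 1, axis=-1)[:, :top_k]  # (n, top_k)
    sel = np.take_along_axis(scores, idx, axis=-1)       # selected scores
    w = np.exp(sel - sel.max(axis=-1, keepdims=True))    # stable softmax
    w /= w.sum(axis=-1, keepdims=True)
    return np.einsum('nk,nkd->nd', w, v[idx])            # v[idx]: (n, top_k, d)
```

With `top_k` equal to the sequence length this reduces exactly to dense softmax attention, which is a useful sanity check when experimenting.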
### Complexity Comparison
| Operation | Transformer | SparseFlow | Speedup |
|-----------|-------------|------------|--------|
| Attention | O(n²) | O(n×k) | 8x |
| FFN | O(n×d²) | O(n×k×d) | ~4x |
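To make the 8x attention figure concrete, here is a back-of-the-envelope check. The sequence length is an assumed example value; the 87.5% sparsity comes from the table above.

```python
n = 2048                       # assumed example sequence length
sparsity = 0.875               # token sparsity from the table above
k = int(n * (1 - sparsity))    # keys each query actually attends to: 256
dense_score_ops = n * n        # O(n^2) query-key score pairs, dense
sparse_score_ops = n * k       # O(n*k) query-key score pairs, sparse
speedup = dense_score_ops / sparse_score_ops
print(speedup)                 # -> 8.0
```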
## 💬 Usage
```python
import json
import torch

# Load model. `SparseFlowModel` is a placeholder name here: instantiate
# the model class shipped with this repo from config.json first.
with open("config.json") as f:
    config = json.load(f)
model = SparseFlowModel(config)
checkpoint = torch.load("model.pt", map_location="cpu")
model.load_state_dict(checkpoint["model"])
model.eval()

# Chat (chat() wraps tokenization and generation)
response = chat("What is the capital of France?")
# -> "The capital of France is Paris."
```
## 📝 Created By
**Logo (Mike Amega)** — [Ame Web Studio](https://github.com/AmeWebStudio)
February 2025