---
license: mit
tags:
- sparseflow
- sparse-attention
- conversational
- efficient
---
# SparseFlow-Chat v5

An efficient conversational AI model with sparse attention, skipping 87.5% of attention computation.
## Performance
| Metric | Value |
|---|---|
| Parameters | 39,840,002 |
| Perplexity | 1.00 |
| Token Sparsity | 87.5% |
| Attention Saved | 87.5% |
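The sparsity and savings figures above are mutually consistent: keeping k = n/8 keys per query means 87.5% of tokens are skipped and attention does one eighth of the work. A quick arithmetic check (n = 1024 here is an illustrative sequence length, not a documented model setting):

```python
# Illustrative check: with k = n/8 keys kept per query (87.5% token
# sparsity), sparse attention computes n*k scores instead of n*n.
n = 1024          # hypothetical sequence length
k = n // 8        # keys kept per query
full_ops = n * n
sparse_ops = n * k
print(full_ops // sparse_ops)  # 8 -> the 8x attention speedup
print(1 - k / n)               # 0.875 -> the 87.5% token sparsity
```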
## Architecture

- Sparse Token Router: O(n×k) attention instead of O(n²)
- Persistent Memory Banks: Store and retrieve knowledge
- Channel Sparsity: Activates only top-k channels
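The sparse token router can be illustrated with a minimal top-k attention sketch. This is not the actual SparseFlow implementation; the function name and shapes are hypothetical, and it only shows the idea of each query attending to its k highest-scoring keys:

```python
import torch

def sparse_route(q, k_mat, v, top_k=4):
    # Hypothetical sketch of a sparse token router: each query attends
    # only to its top-k highest-scoring keys instead of all n keys,
    # so scoring cost drops from O(n^2) to O(n*k).
    scores = q @ k_mat.transpose(-2, -1) / k_mat.shape[-1] ** 0.5  # (n, n)
    topk_scores, topk_idx = scores.topk(top_k, dim=-1)  # keep k per query
    weights = torch.softmax(topk_scores, dim=-1)        # (n, top_k)
    selected_v = v[topk_idx]                            # (n, top_k, d)
    return (weights.unsqueeze(-1) * selected_v).sum(dim=-2)  # (n, d)

n, d = 32, 16
q, k_mat, v = (torch.randn(n, d) for _ in range(3))
out = sparse_route(q, k_mat, v, top_k=4)
print(out.shape)  # torch.Size([32, 16])
```

A real kernel would avoid materializing the full (n, n) score matrix; this dense version just demonstrates the routing logic.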
### Complexity Comparison
| Operation | Transformer | SparseFlow | Speedup |
|---|---|---|---|
| Attention | O(n²) | O(n×k) | 8x |
| FFN | O(n×d²) | O(n×k×d) | ~4x |
## Usage
```python
# Load model
import torch

checkpoint = torch.load("model.pt")
# ... initialize model with config.json
model.load_state_dict(checkpoint['model'])

# Chat
response = chat("What is the capital of France?")
# -> "The capital of France is Paris."
```
## Created By

Logo (Mike Amega), Ame Web Studio
February 2025