---
language: en
license: mit
tags:
- text-generation
- pytorch
- causal-lm
- continuous-learning
- gqa
- swiglu
- rmsnorm
- rope
---
# 🤖 Fin.AI v2.0
**⚠️ EXPERIMENTAL - Continuously Learning Language Model**
Fin.AI v2 is an optimized transformer language model that retrains itself roughly every 85 minutes on diverse datasets via GitHub Actions. **This model is still in early training and currently produces mostly gibberish.**
## 🚀 What's New in v2
### Architecture Improvements
- **Grouped Query Attention (GQA)**: 40% faster inference with fewer KV heads (see the sketch after this list)
- **SwiGLU Activation**: Better learning dynamics (used in LLaMA, PaLM)
- **RMSNorm**: 20% faster than LayerNorm
- **Rotary Position Embeddings (RoPE)**: Better position encoding
- **Pre-norm Architecture**: More stable training
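For readers unfamiliar with these components, the snippet below is a minimal PyTorch sketch of RMSNorm, the SwiGLU feed-forward block, and the key/value-head sharing behind GQA. It is illustrative only, not the repository's actual implementation; module and function names are invented for this example.
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RMSNorm(nn.Module):
    """Root-mean-square normalization: rescales by the RMS of the features
    and skips LayerNorm's mean subtraction, which is part of why it is cheaper."""
    def __init__(self, dim: int, eps: float = 1e-6):
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        rms = torch.rsqrt(x.pow(2).mean(dim=-1, keepdim=True) + self.eps)
        return self.weight * (x * rms)

class SwiGLU(nn.Module):
    """SwiGLU feed-forward block: a SiLU-gated projection followed by a
    down projection (the style used in PaLM/LLaMA)."""
    def __init__(self, dim: int, hidden_dim: int):
        super().__init__()
        self.gate = nn.Linear(dim, hidden_dim, bias=False)
        self.up = nn.Linear(dim, hidden_dim, bias=False)
        self.down = nn.Linear(hidden_dim, dim, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.down(F.silu(self.gate(x)) * self.up(x))

def repeat_kv(kv: torch.Tensor, n_rep: int) -> torch.Tensor:
    """GQA: expand keys/values of shape (batch, n_kv_heads, seq, head_dim) so
    each KV head is shared by n_rep query heads (8 query / 4 KV heads -> n_rep = 2)."""
    batch, n_kv_heads, seq_len, head_dim = kv.shape
    kv = kv[:, :, None, :, :].expand(batch, n_kv_heads, n_rep, seq_len, head_dim)
    return kv.reshape(batch, n_kv_heads * n_rep, seq_len, head_dim)
```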
### Performance Gains (vs. v1)
- **40% faster training** on CPU
- **24% less memory** usage
- **Better model quality** from the improved architecture
- **More efficient** parameter usage
## 📊 Model Details
- **Architecture**: Custom GPT-style transformer with modern improvements (see the configuration sketch below)
- **Parameters**: ~40M (small preset)
- **Layers**: 8
- **Attention Heads**: 8 (4 KV heads for GQA)
- **Embedding Dimension**: 512
- **FFN Dimension**: 1792 (with SwiGLU)
- **Max Sequence Length**: 512 tokens
- **Vocabulary Size**: 50,257 (GPT-2 tokenizer)
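For orientation, these hyperparameters map onto a configuration roughly like the one below. The field names are illustrative for this model card only; the authoritative values live in the repository's `config.json`.
```python
from dataclasses import dataclass

@dataclass
class FinAIConfigSketch:
    """Illustrative hyperparameters for the small preset (not the real config schema)."""
    vocab_size: int = 50257   # GPT-2 tokenizer vocabulary
    max_seq_len: int = 512    # context window in tokens
    d_model: int = 512        # embedding dimension
    n_layers: int = 8         # transformer blocks
    n_heads: int = 8          # query heads
    n_kv_heads: int = 4       # shared key/value heads (GQA)
    ffn_dim: int = 1792       # SwiGLU hidden dimension
```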
## 🎯 Training
- **Schedule**: Trains every ~85 minutes (24/7)
- **Datasets**: Rotates through 24+ diverse datasets
- **Platform**: GitHub Actions (free tier, CPU)
- **Framework**: PyTorch
- **Tracking**: Weights & Biases
## 📥 Usage
### Download and Load
```python
from huggingface_hub import hf_hub_download

# Download the model weights and config from the Hub
hf_hub_download("MeridianAlgo/Fin.AI", "model.pt", local_dir="./model")
hf_hub_download("MeridianAlgo/Fin.AI", "config.json", local_dir="./model")

# Load the model (requires the fin_ai package from the GitHub repository)
from fin_ai.model import FinAIModel

model = FinAIModel.from_pretrained("./model")
model.eval()
```
### Generate Text
```python
from transformers import AutoTokenizer

# Fin.AI uses the standard GPT-2 tokenizer
tokenizer = AutoTokenizer.from_pretrained("gpt2")

prompt = "The future of AI is"
inputs = tokenizer(prompt, return_tensors="pt")

outputs = model.generate(
    inputs["input_ids"],
    max_new_tokens=100,
    temperature=0.8,
    top_k=50,
    top_p=0.9,
    repetition_penalty=1.1,
)
print(tokenizer.decode(outputs[0]))
```
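In this example, `temperature` scales the logits before sampling, `top_k` and `top_p` restrict sampling to the most likely tokens, and `repetition_penalty` discourages repeating tokens that have already been generated; lower the temperature for more deterministic output.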
## ⚠️ Limitations
- **Experimental**: This is a research project, not production-ready
- **Quality**: Model is continuously learning and may produce errors
- **Biases**: May reflect biases from training data
- **Size**: Small model (40M params) has limited capabilities
- **Context**: 512 token context window
## 🔗 Links
- **GitHub**: [MeridianAlgo/FinAI](https://github.com/MeridianAlgo/FinAI)
- **Training Logs**: [GitHub Actions](https://github.com/MeridianAlgo/FinAI/actions)
- **Metrics**: [Wandb Dashboard](https://wandb.ai/meridianalgo-meridianalgo/fin-ai)
- **Architecture**: [Technical Documentation](https://github.com/MeridianAlgo/FinAI/blob/main/docs/ARCHITECTURE_V2.md)
## 📄 License
MIT License - See [LICENSE](https://github.com/MeridianAlgo/FinAI/blob/main/LICENSE)
## 🙏 Acknowledgments
Architecture inspired by:
- **LLaMA** (Meta AI) - GQA, SwiGLU, RMSNorm, RoPE
- **PaLM** (Google) - SwiGLU
- **GPT-NeoX** (EleutherAI) - RoPE
---
**Last Updated**: 2026-01-03 01:16 UTC
*Built with ❤️ for continuous learning*