---
language: en
license: mit
tags:
- text-generation
- pytorch
- causal-lm
- continuous-learning
- gqa
- swiglu
- rmsnorm
- rope
---

# πŸ€– Fin.AI v2.0

**⚠️ EXPERIMENTAL - Continuously Learning Language Model**

Fin.AI v2 is an optimized transformer language model that retrains itself roughly every 85 minutes on diverse datasets via GitHub Actions. **This model is still in training and currently produces mostly gibberish.**

## πŸš€ What's New in v2

### Architecture Improvements

- **Grouped Query Attention (GQA)**: 40% faster inference with fewer KV heads
- **SwiGLU Activation**: Better learning dynamics (used in LLaMA, PaLM)
- **RMSNorm**: 20% faster than LayerNorm
- **Rotary Position Embeddings (RoPE)**: Better position encoding
- **Pre-norm Architecture**: More stable training
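
To make these components concrete, here is a minimal PyTorch sketch of RMSNorm, SwiGLU, and the KV-head sharing behind GQA. It is illustrative only; the actual implementations live in `fin_ai/model.py` and may differ in detail.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RMSNorm(nn.Module):
    """RMS normalization: scale by the root-mean-square, no mean subtraction or bias."""
    def __init__(self, dim: int, eps: float = 1e-6):
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        rms = torch.rsqrt(x.pow(2).mean(dim=-1, keepdim=True) + self.eps)
        return x * rms * self.weight

class SwiGLU(nn.Module):
    """SwiGLU feed-forward: silu(x W1) * (x W3), projected back down by W2."""
    def __init__(self, dim: int, hidden_dim: int):
        super().__init__()
        self.w1 = nn.Linear(dim, hidden_dim, bias=False)  # gate projection
        self.w3 = nn.Linear(dim, hidden_dim, bias=False)  # value projection
        self.w2 = nn.Linear(hidden_dim, dim, bias=False)  # down projection

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.w2(F.silu(self.w1(x)) * self.w3(x))

def repeat_kv(kv: torch.Tensor, n_rep: int) -> torch.Tensor:
    """GQA: expand (batch, n_kv_heads, seq, head_dim) so each KV head serves n_rep query heads."""
    return kv.repeat_interleave(n_rep, dim=1)
```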

### Performance Gains

- **40% faster training** on CPU
- **24% less memory** usage
- **Better model quality** with improved architecture
- **More efficient** parameter usage

## πŸ“Š Model Details

- **Architecture**: Custom GPT-style transformer with modern improvements
- **Parameters**: ~40M (small preset)
- **Layers**: 8
- **Attention Heads**: 8 (4 KV heads for GQA)
- **Embedding Dimension**: 512
- **FFN Dimension**: 1792 (with SwiGLU)
- **Max Sequence Length**: 512 tokens
- **Vocabulary Size**: 50,257 (GPT-2 tokenizer)
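
For reference, those hyperparameters correspond to a configuration along these lines. The field names are illustrative assumptions; the authoritative values ship in the repo's `config.json`.

```python
# Illustrative "small" preset matching the numbers above; key names are
# assumptions, not necessarily the exact keys used in config.json.
small_config = {
    "n_layers": 8,
    "n_heads": 8,
    "n_kv_heads": 4,       # GQA: two query heads share each KV head
    "d_model": 512,
    "d_ffn": 1792,         # SwiGLU hidden dimension
    "max_seq_len": 512,
    "vocab_size": 50257,   # GPT-2 BPE tokenizer
}
```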

## 🎯 Training

- **Schedule**: Trains every ~85 minutes (24/7)
- **Datasets**: Rotates through 24+ diverse datasets
- **Platform**: GitHub Actions (free tier, CPU)
- **Framework**: PyTorch
- **Tracking**: Weights & Biases
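
A minimal sketch of how that dataset rotation could be expressed is shown below; the dataset names and run counter are placeholders, and the real logic lives in the training workflow.

```python
# Hypothetical rotation: one dataset per scheduled run, cycling through the list.
DATASETS = [
    "wikitext", "openwebtext", "bookcorpus",  # ...24+ entries in practice
]

def pick_dataset(run_index: int) -> str:
    """Return the dataset for the given run, wrapping around the list."""
    return DATASETS[run_index % len(DATASETS)]
```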

## πŸ“₯ Usage

### Download and Load

```python
from huggingface_hub import hf_hub_download
import torch

# Download model files
hf_hub_download("MeridianAlgo/Fin.AI", "model.pt", local_dir="./model")
hf_hub_download("MeridianAlgo/Fin.AI", "config.json", local_dir="./model")

# Load model (FinAIModel comes from the fin_ai package in the GitHub repo)
from fin_ai.model import FinAIModel

model = FinAIModel.from_pretrained("./model")
model.eval()
```

### Generate Text

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
prompt = "The future of AI is"
inputs = tokenizer(prompt, return_tensors="pt")

outputs = model.generate(
    inputs["input_ids"],
    max_new_tokens=100,
    temperature=0.8,
    top_k=50,
    top_p=0.9,
    repetition_penalty=1.1
)

print(tokenizer.decode(outputs[0]))
```
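
For context, the sampling arguments above combine roughly as follows. This is a generic temperature / top-k / top-p sketch, not the exact code inside `FinAIModel.generate` (the repetition penalty, not shown, additionally down-weights tokens already generated).

```python
import torch

def sample_next_token(logits, temperature=0.8, top_k=50, top_p=0.9):
    """Generic temperature + top-k + top-p (nucleus) sampling over a logits row."""
    logits = logits / temperature                          # sharpen or flatten the distribution
    if top_k is not None:
        kth = torch.topk(logits, top_k).values[..., -1, None]
        logits = logits.masked_fill(logits < kth, float("-inf"))
    if top_p is not None:
        sorted_logits, sorted_idx = torch.sort(logits, descending=True)
        probs = torch.softmax(sorted_logits, dim=-1)
        cum = torch.cumsum(probs, dim=-1)
        cutoff = cum - probs > top_p                       # drop tokens once mass exceeds top_p
        sorted_logits = sorted_logits.masked_fill(cutoff, float("-inf"))
        logits = torch.full_like(logits, float("-inf")).scatter(-1, sorted_idx, sorted_logits)
    probs = torch.softmax(logits, dim=-1)
    return torch.multinomial(probs, num_samples=1)
```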

## ⚠️ Limitations

- **Experimental**: This is a research project, not production-ready
- **Quality**: Model is continuously learning and may produce errors
- **Biases**: May reflect biases from training data
- **Size**: Small model (40M params) has limited capabilities
- **Context**: 512 token context window

## πŸ”— Links

- **GitHub**: [MeridianAlgo/FinAI](https://github.com/MeridianAlgo/FinAI)
- **Training Logs**: [GitHub Actions](https://github.com/MeridianAlgo/FinAI/actions)
- **Metrics**: [Wandb Dashboard](https://wandb.ai/meridianalgo-meridianalgo/fin-ai)
- **Architecture**: [Technical Documentation](https://github.com/MeridianAlgo/FinAI/blob/main/docs/ARCHITECTURE_V2.md)

## πŸ“œ License

MIT License - See [LICENSE](https://github.com/MeridianAlgo/FinAI/blob/main/LICENSE)

## πŸ™ Acknowledgments

Architecture inspired by:
- **LLaMA** (Meta AI) - GQA, SwiGLU, RMSNorm, RoPE
- **PaLM** (Google) - SwiGLU
- **GPT-NeoX** (EleutherAI) - RoPE

---

**Last Updated**: 2026-01-03 01:16 UTC

*Built with ❀️ for continuous learning*