IshaanMan123 committed
Commit 444b56b · verified · 1 Parent(s): 7303fe2

🔥 v2.0.0 - Fresh model initialization - 2025-12-28 17:54

Files changed (3)
  1. README.md +343 -0
  2. config.json +9 -0
  3. model.pt +3 -0
README.md ADDED
@@ -0,0 +1,343 @@
---
license: mit
tags:
- pytorch
- gpt2
- text-generation
- fin-ai
- experimental
- in-training
- from-scratch
- automated-training
language:
- en
datasets:
- wikitext
- roneneldan/TinyStories
- openai/gsm8k
- squad
- imdb
- ag_news
- yelp_review_full
- cnn_dailymail
- billsum
- commonsense_qa
- hellaswag
- winogrande
- boolq
- race
- stanfordnlp/coqa
- allenai/c4
- Skylion007/openwebtext
- trivia_qa
- hotpot_qa
- microsoft/ms_marco
- duorc
- amazon_polarity
- zeroshot/twitter-financial-news-sentiment
- sciq
- quail
- wiki_qa
- paws
- medical_questions_pairs
- app_reviews
- rotten_tomatoes
metrics:
- perplexity
library_name: pytorch
pipeline_tag: text-generation
---

# 🤖 Fin.AI v2.0 - Continuously Trained Language Model

<div align="center">

![Status](https://img.shields.io/badge/status-training-yellow)
![Version](https://img.shields.io/badge/version-2.0.0-blue)
![Parameters](https://img.shields.io/badge/parameters-30M-green)
![License](https://img.shields.io/badge/license-MIT-blue)

**⚠️ EXPERIMENTAL MODEL - Training from scratch**

[GitHub](https://github.com/MeridianAlgo/FinAI) • [Training Logs](https://wandb.ai/meridianalgo-meridianalgo/fin-ai) • [Report Issue](https://github.com/MeridianAlgo/FinAI/issues)

</div>

---

## 🚨 Important Notice

**This model is training from scratch and outputs will be gibberish initially.**

- 🔴 **Brand new model** - Starting from random weights
- ⏳ **Training time needed**: 2-4 weeks for basic coherence
- 🤖 **Automated training**: Every 1 hour 10 minutes via GitHub Actions
- 📊 **Current quality**: Expect complete nonsense initially
- 🎯 **Purpose**: Research/experimental continuous learning

---

## 📊 Model Overview

| Specification | Value |
|--------------|-------|
| **Architecture** | GPT-2 style Transformer |
| **Parameters** | 30,142,848 (~30M) |
| **Layers** | 6 |
| **Attention Heads** | 6 |
| **Embedding Dimension** | 384 |
| **Feed-Forward Dimension** | 1,536 |
| **Max Sequence Length** | 512 tokens |
| **Vocabulary Size** | 50,257 (GPT-2 tokenizer) |
| **Position Encoding** | Rotary (RoPE) |
| **Activation** | GELU |
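
The stated parameter count can be reproduced with a quick budget, under a few assumptions the card does not spell out (tied input/output embeddings, biased linear layers, and a learned 512×384 position table in addition to RoPE):

```python
# Back-of-the-envelope parameter budget for the table above.
vocab, d, ff, layers, seq = 50257, 384, 1536, 6, 512

tok_emb  = vocab * d                     # 19,298,688 token embeddings (tied LM head assumed)
pos_emb  = seq * d                       #    196,608 learned position table (assumption)
attn     = 4 * (d * d + d)               # Q, K, V, output projections with biases
mlp      = (d * ff + ff) + (ff * d + d)  # two feed-forward projections with biases
norms    = 2 * (2 * d)                   # two LayerNorms (weight + bias) per block
block    = attn + mlp + norms            #  1,774,464 per layer
final_ln = 2 * d

print(f"{tok_emb + pos_emb + layers * block + final_ln:,}")  # 30,142,848
```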

---

## 🎯 Training Details

### Training Schedule
- **Frequency**: Every 1 hour 10 minutes (~20 cycles/day)
- **Steps per cycle**: 800 steps
- **Daily steps**: ~16,500 steps
- **Weekly steps**: ~115,200 steps
- **Batch size**: 8 (effective: 32 with gradient accumulation)
- **Learning rate**: 3e-4 with cosine decay (sketched below)
- **Warmup steps**: 100
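
A minimal sketch of the schedule these numbers imply (assuming the cosine decays to zero across the 800-step cycle; the actual decay horizon is not stated in the card):

```python
import math

def lr_at(step: int, max_lr: float = 3e-4, warmup: int = 100, total: int = 800) -> float:
    """Linear warmup to max_lr, then cosine decay, per the schedule above."""
    if step < warmup:
        return max_lr * step / warmup                    # linear warmup
    progress = (step - warmup) / (total - warmup)        # 0.0 -> 1.0 after warmup
    return 0.5 * max_lr * (1.0 + math.cos(math.pi * progress))

print(lr_at(50), lr_at(100), lr_at(800))  # ramping up, peak 3e-4, ~0 at cycle end
```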

### Training Infrastructure
- **Platform**: GitHub Actions (free tier)
- **Hardware**: CPU only
- **Training time**: ~15-20 minutes per cycle
- **Automatic upload**: To Hugging Face after each cycle

### Datasets (30 total, rotating per cycle)

The model trains on a diverse set of 30 datasets, cycling through one per training cycle (see the rotation sketch after the list):

**📚 Knowledge & Reference**
- WikiText-2, OpenWebText, C4

**✍️ Creative Writing**
- TinyStories

**📰 News & Articles**
- CNN/DailyMail, AG News, BillSum

**❓ Question Answering**
- SQuAD, CoQA, TriviaQA, HotpotQA, MS MARCO, WikiQA, QuAIL

**🧠 Reasoning & Logic**
- GSM8K (math), CommonsenseQA, HellaSwag, WinoGrande, BoolQ

**📖 Reading Comprehension**
- RACE, DuoRC

**💬 Reviews & Sentiment**
- IMDB, Yelp, Amazon Polarity, Rotten Tomatoes, App Reviews

**🔬 Scientific & Medical**
- SciQ, Medical Questions Pairs

**💰 Financial**
- Twitter Financial News

**🔄 Paraphrase & Similarity**
- PAWS
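
One straightforward way such a rotation could work (an illustrative sketch, not necessarily the repo's actual scheduler; `DATASETS` and `cycle_index` are made-up names):

```python
# Illustrative per-cycle dataset rotation over the 30 datasets listed above.
DATASETS = [
    "wikitext", "roneneldan/TinyStories", "openai/gsm8k", "squad", "imdb",
    # ... the remaining 25 Hub IDs from the list above
]

def dataset_for_cycle(cycle_index: int) -> str:
    """Each ~70-minute training cycle takes the next dataset, wrapping around."""
    return DATASETS[cycle_index % len(DATASETS)]
```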

---

## 📈 Training Progress

### Current Status
- **Version**: v2.0.0
- **Training started**: December 28, 2024
- **Model type**: fresh_init
- **Total parameters**: 30,142,848

### Expected Timeline

| Week | Expected Quality | Description |
|------|-----------------|-------------|
| 1 | 🔴 Gibberish | Random weights, no coherence |
| 2 | 🟠 Patterns | Some token patterns emerging |
| 3-4 | 🟡 Basic | Simple word sequences |
| 5-8 | 🟢 Improving | Short coherent phrases |
| 9-12 | 🔵 Decent | Usable for simple tasks |

### Monitoring
- **GitHub Actions**: [View Training Runs](https://github.com/MeridianAlgo/FinAI/actions)
- **Wandb Dashboard**: [View Metrics](https://wandb.ai/meridianalgo-meridianalgo/fin-ai)
- **Model Updates**: This page updates automatically

---

## 💻 Usage

### Installation

```bash
pip install torch transformers huggingface-hub
```

### Download Model

```python
from huggingface_hub import hf_hub_download
import os

# Create directory
os.makedirs("./fin_ai_model", exist_ok=True)

# Download model files
hf_hub_download("MeridianAlgo/Fin.AI", "model.pt", local_dir="./fin_ai_model")
hf_hub_download("MeridianAlgo/Fin.AI", "config.json", local_dir="./fin_ai_model")
```

### Generate Text (Experimental)

```python
# FinAIModel comes from the project's own fin_ai package
# (https://github.com/MeridianAlgo/FinAI), not from pip-installable transformers.
from fin_ai.model import FinAIModel
import torch
from transformers import AutoTokenizer

# Load the downloaded checkpoint
model = FinAIModel.from_pretrained("./fin_ai_model")
model.eval()

# The model uses the standard GPT-2 tokenizer (vocab size 50,257)
tokenizer = AutoTokenizer.from_pretrained("gpt2")

# Generate text (expect poor quality initially)
input_text = "Once upon a time"
input_ids = tokenizer.encode(input_text, return_tensors="pt")

with torch.no_grad():
    output = model.generate(
        input_ids,
        max_length=50,
        temperature=0.8,
        top_p=0.9,
        do_sample=True,
    )

generated_text = tokenizer.decode(output[0])
print(generated_text)

# Note: Output quality is poor initially and improves over weeks
```

---

## 🔬 Technical Details

### Architecture Improvements (v2.0)

Compared to v1.x:
- ✅ **3x more parameters** (10M → 30M)
- ✅ **Deeper network** (4 → 6 layers)
- ✅ **Larger embeddings** (256 → 384 dimensions)
- ✅ **More attention heads** (4 → 6 heads)
- ✅ **Longer training cycles** (600 → 800 steps/cycle)

### Training Configuration

```yaml
model:
  size_preset: "small"
  n_layers: 6
  n_heads: 6
  embed_dim: 384
  ff_dim: 1536
  max_seq_len: 512

training:
  batch_size: 8
  gradient_accumulation_steps: 4
  learning_rate: 3.0e-4
  weight_decay: 0.01
  warmup_steps: 100
  max_steps: 800
```
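
The effective batch size of 32 follows from accumulating gradients over 4 micro-batches of 8. A minimal PyTorch sketch of that pattern (with a stand-in model, since the real training loop lives in the repo):

```python
import torch
import torch.nn as nn

model = nn.Linear(384, 50257)                   # stand-in for the real model
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4, weight_decay=0.01)
accum_steps = 4                                 # gradient_accumulation_steps above

optimizer.zero_grad()
for i in range(8):                              # stream of micro-batches of size 8
    x = torch.randn(8, 384)
    loss = model(x).mean()                      # placeholder loss
    (loss / accum_steps).backward()             # scale so accumulated grads average
    if (i + 1) % accum_steps == 0:
        optimizer.step()                        # one update per 4 micro-batches -> effective batch 32
        optimizer.zero_grad()
```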

---

## 📊 Evaluation

### Metrics Tracked
- **Training Loss**: Cross-entropy loss
- **Perplexity**: exp(loss) (worked example below)
- **Tokens/Second**: Training throughput
- **Learning Rate**: Cosine schedule with warmup
- **Gradient Norm**: For stability monitoring
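
For a sense of scale: a freshly initialized model predicting over the 50,257-token GPT-2 vocabulary starts near the uniform-distribution ceiling, so early perplexity should hover around the vocabulary size and fall from there:

```python
import math

vocab_size = 50257
random_loss = math.log(vocab_size)   # cross-entropy of a uniform guess, ~10.82
print(math.exp(random_loss))         # perplexity ~50257 at random init
print(math.exp(4.0))                 # a loss of 4.0 would mean perplexity ~54.6
```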

### Benchmarks (Coming Soon)
Once the model reaches basic coherence, we'll evaluate on:
- HellaSwag (common sense)
- LAMBADA (reading comprehension)
- WikiText perplexity
- Custom generation quality tests

---

## ⚠️ Limitations

1. **Early Training**: Model is in very early training stages
2. **Output Quality**: Expect gibberish for several weeks
3. **CPU Training**: Slower than GPU training
4. **Small Model**: 30M parameters is relatively small
5. **Limited Context**: 512 token context window
6. **No Fine-tuning**: Base model only, not instruction-tuned
7. **English Only**: Trained primarily on English text

---

## 🤝 Contributing

This is an open research project! Contributions welcome:

- **Code**: [GitHub Repository](https://github.com/MeridianAlgo/FinAI)
- **Issues**: [Report Problems](https://github.com/MeridianAlgo/FinAI/issues)
- **Discussions**: [Join Discussion](https://github.com/MeridianAlgo/FinAI/discussions)

---

## 📜 License

MIT License - See [LICENSE](https://github.com/MeridianAlgo/FinAI/blob/main/LICENSE)

---

## 📚 Citation

```bibtex
@misc{finai2024,
  title={Fin.AI: Continuously Trained Language Model},
  author={MeridianAlgo},
  year={2024},
  publisher={Hugging Face},
  howpublished={\url{https://huggingface.co/MeridianAlgo/Fin.AI}},
  note={Experimental model in active training}
}
```

---

## 🔗 Links

- **Repository**: https://github.com/MeridianAlgo/FinAI
- **Training Logs**: https://wandb.ai/meridianalgo-meridianalgo/fin-ai
- **GitHub Actions**: https://github.com/MeridianAlgo/FinAI/actions
- **Issues**: https://github.com/MeridianAlgo/FinAI/issues

---

<div align="center">

**Last Updated**: 2025-12-28 17:54 UTC

**Status**: 🔴 Training from Scratch

**Quality**: ⚠️ Expect Gibberish (2-4 weeks needed)

</div>
config.json ADDED
@@ -0,0 +1,9 @@
{
  "vocab_size": 50257,
  "n_layers": 6,
  "n_heads": 6,
  "embed_dim": 384,
  "ff_dim": 1536,
  "max_seq_len": 512,
  "dropout": 0.1
}
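
These keys mirror the architecture table in the README, and the file is plain JSON (the path assumes the `local_dir` from the download example):

```python
import json

# Read the downloaded config and pull out a couple of architecture fields.
with open("./fin_ai_model/config.json") as f:
    cfg = json.load(f)

print(cfg["n_layers"], cfg["embed_dim"], cfg["max_seq_len"])  # 6 384 512
```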
model.pt ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:66bd59e6a00205816e9dad61f4d367367c6fdbea05c87936e56e47cd7f04ea2b
size 120596507
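
Since the LFS pointer records the blob's SHA-256 and byte size, a downloaded checkpoint can be verified against it (a small sketch; the path again assumes the download example's `local_dir`):

```python
import hashlib
import os

path = "./fin_ai_model/model.pt"  # wherever hf_hub_download placed the file
expected = "66bd59e6a00205816e9dad61f4d367367c6fdbea05c87936e56e47cd7f04ea2b"

sha = hashlib.sha256()
with open(path, "rb") as f:
    for chunk in iter(lambda: f.read(1 << 20), b""):  # hash in 1 MiB chunks
        sha.update(chunk)

assert os.path.getsize(path) == 120596507, "size mismatch"
assert sha.hexdigest() == expected, "hash mismatch"
print("model.pt matches the LFS pointer")
```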