Update README.md

README.md (changed)

@@ -19,8 +19,6 @@ library_name: transformers
This is a **base scientific language model** (not instruction-tuned).
## Overview
KiteFish-A1-1.5B explores what it takes to train a domain-specialized scientific language model directly from structured LaTeX archives.
@@ -35,8 +33,6 @@ KiteFish-A1-1.5B explores what it takes to train a domain-specialized scientific language model directly from structured LaTeX archives.
The focus of this project is *scientific language modeling robustness*, not benchmark optimization.
## Model Architecture
- 24 Transformer layers
@@ -57,8 +53,6 @@ The focus of this project is *scientific language modeling robustness*, not benchmark optimization.
**Validation Perplexity:** ~4.2 (held-out scientific corpus)
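
Since the model card declares `library_name: transformers`, a figure like this can be spot-checked with a short script. The following is a minimal sketch, not the evaluation pipeline used for this release; the repo id is a placeholder.

```python
# Hedged sketch: perplexity of the model on a held-out snippet.
# "your-org/KiteFish-A1-1.5B" is a placeholder repo id, not a published one.
import math

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("your-org/KiteFish-A1-1.5B")
model = AutoModelForCausalLM.from_pretrained("your-org/KiteFish-A1-1.5B")
model.eval()

text = r"We consider the Hamiltonian $H = H_0 + \lambda V$ and expand to second order."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    # Passing labels makes the model return the mean token-level
    # cross-entropy loss; perplexity is its exponential.
    loss = model(**inputs, labels=inputs["input_ids"]).loss

print(f"perplexity: {math.exp(loss.item()):.2f}")
```

The ~4.2 figure is a corpus-level average; any single snippet will vary around it.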
## Intended Use
KiteFish-A1-1.5B is suitable for:
@@ -76,8 +70,6 @@ It is **not optimized for:**
- General conversational AI
- Benchmark leaderboard performance
## Performance Notes
This model was trained under moderate compute constraints and without instruction tuning or alignment stages.
@@ -92,8 +84,6 @@ Observed characteristics:
Performance improves significantly with supervised fine-tuning (SFT), LoRA adaptation, or domain-specific instruction tuning.
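
As a sketch of what the LoRA route might look like with the `peft` library (the rank, alpha, and target module names below are illustrative assumptions, not settings validated for this model):

```python
# Illustrative LoRA setup via peft; hyperparameters and target_modules
# are assumptions, not values tested with KiteFish-A1-1.5B.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# Placeholder repo id.
model = AutoModelForCausalLM.from_pretrained("your-org/KiteFish-A1-1.5B")

lora_config = LoraConfig(
    r=16,                                 # adapter rank
    lora_alpha=32,                        # adapter scaling
    target_modules=["q_proj", "v_proj"],  # common attention projections; verify against the actual module names
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the adapter weights remain trainable
```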
## Limitations
- Not instruction-tuned
@@ -105,8 +95,6 @@ Performance improves significantly with supervised fine-tuning (SFT), LoRA adaptation, or domain-specific instruction tuning.
This release is intended primarily for research and experimentation.
## Example Usage
```python
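# Minimal sketch: load the checkpoint and sample a completion.
# "your-org/KiteFish-A1-1.5B" is a placeholder repo id, not a published one.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("your-org/KiteFish-A1-1.5B")
model = AutoModelForCausalLM.from_pretrained("your-org/KiteFish-A1-1.5B")
model.eval()

# This is a base model: prompt it with text to continue,
# not with an instruction.
prompt = "The proof of the main theorem proceeds by"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=64, do_sample=True, top_p=0.9)

print(tokenizer.decode(output[0], skip_special_tokens=True))
```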