Update README.md
README.md
CHANGED
@@ -51,8 +51,7 @@ This checkpoint was trained on data focused on character knowledge and lore:
 | Parameter | Value |
 |-----------|-------|
 | Base Model | `Qwen/Qwen3-14B-Base` |
-
-| Steps | 1000 / 3000 |
+| Steps | 3000 |
 | Batch Size | 1 |
 | Gradient Accumulation | 16 |
 | **Effective Batch Size** | **16** |
@@ -63,7 +62,6 @@ This checkpoint was trained on data focused on character knowledge and lore:
 | Precision | BF16 |
 | Optimizer | 8-bit Paged AdamW |
 | Gradient Checkpointing | ✓ |
-| Priority Repeat | 50× (character cards) |
 
 ### Hardware
 