Resa-Yi/STILL-Trained-from-Scratch-SAE-OctoThinker-3B-Long-Base-196k


9.67 GB

2 contributors

History: 2 commits

Latest commit: farukakgul, "add sae trained from scratch at layer 12" (b8db39b, 3 months ago)

Files:

model.layers.12
    Last commit: add sae trained from scratch at layer 12 (3 months ago)

.gitattributes
    Size: 1.52 kB
    Last commit: initial commit (3 months ago)

README.md
    Size: 31 Bytes
    Last commit: initial commit (3 months ago)

config.json
    Size: 851 Bytes
    Last commit: add sae trained from scratch at layer 12 (3 months ago)

optimizer_0.pt
    Size: 4.83 GB (xet)
    Detected Pickle imports (3): "torch.FloatStorage", "collections.OrderedDict", "torch._utils._rebuild_tensor_v2"
    Last commit: add sae trained from scratch at layer 12 (3 months ago)

rank_0_state.pt
    Size: 1.57 MB (xet)
    Detected Pickle imports (3): "collections.OrderedDict", "torch.LongStorage", "torch._utils._rebuild_tensor_v2"
    Last commit: add sae trained from scratch at layer 12 (3 months ago)

state.pt
    Size: 856 Bytes (xet)
    Pickle imports: no problematic imports detected
    Last commit: add sae trained from scratch at layer 12 (3 months ago)
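
The "Detected Pickle imports" entries above name the globals that a pickle stream would import when loaded. The following is a minimal sketch of how such a list could be produced with Python's standard `pickletools` module. It is not the scanner that generated this listing. The function name `find_pickle_imports` and the demo payload are hypothetical, and the sketch only handles the `GLOBAL` opcode (pickle protocols 2 and 3), not the `STACK_GLOBAL` opcode used by protocol 4 and later.

```python
import collections
import pickle
import pickletools


def find_pickle_imports(data: bytes) -> list[str]:
    """Return the sorted, de-duplicated globals referenced by GLOBAL opcodes.

    Hypothetical helper for illustration; it walks the opcode stream
    without executing it, so nothing is actually imported.
    """
    found = set()
    for opcode, arg, _pos in pickletools.genops(data):
        # GLOBAL carries "module name" as a single space-separated string.
        # STACK_GLOBAL (protocol 4+) is deliberately ignored in this sketch.
        if opcode.name == "GLOBAL":
            module, name = arg.split(" ", 1)
            found.add(f"{module}.{name}")
    return sorted(found)


# Stand-in payload (an assumption, not one of the files listed above):
# an OrderedDict pickled with protocol 2 and no Python 2 name remapping.
demo = pickle.dumps(collections.OrderedDict(step=1), protocol=2, fix_imports=False)
imports = find_pickle_imports(demo)
print(imports)
```

Because the stream is only disassembled and never unpickled, this kind of inspection can be run on an untrusted file before deciding whether to load it.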