tangledgroup/tangled-alpha-0.1-core
Text Generation, Transformers, 20 datasets, 107 languages, chat, core, base, instruct, reason, conversational
License: mit
Commit History
out/pretrain-core/final
c2fc44c
Marko Tasic
committed on
Feb 28
reqs
9b182f7
Marko Tasic
committed on
Feb 24
grokadamw.GrokAdamW
da80ae1
mtasic85
committed on
Feb 24
grokadamw.GrokAdamW
1386dd6
mtasic85
committed on
Feb 24
global_batch_size: 256; micro_batch_size: 2
afa8f6e
mtasic85
committed on
Feb 24
global_batch_size: 256; micro_batch_size: 4
bee417a
mtasic85
committed on
Feb 24
micro_batch_size: 3
0f5ef2e
mtasic85
committed on
Feb 24
micro_batch_size: 4
1c9b116
mtasic85
committed on
Feb 24
micro_batch_size: 1
056e2c6
mtasic85
committed on
Feb 24
class_path: torch.optim.AdamW
fcc1668
mtasic85
committed on
Feb 24
micro_batch_size: 2
578a7be
mtasic85
committed on
Feb 24
class_path: bitsandbytes.optim.AdamW8bit
ed2b433
mtasic85
committed on
Feb 24
class_path: bitsandbytes.optim.PagedAdamW8bit
14f2503
mtasic85
committed on
Feb 24
micro_batch_size: 4
50de401
mtasic85
committed on
Feb 24
class_path: torchao.prototype.low_bit_optim.AdamW8bit
dfc4418
mtasic85
committed on
Feb 22
class_path: torchao.prototype.low_bit_optim.AdamW4bit
4afec17
mtasic85
committed on
Feb 22
class_path: torchao.prototype.low_bit_optim.AdamW8bit
779fd25
mtasic85
committed on
Feb 22
torchao
fa76479
mtasic85
committed on
Feb 22
max_seq_length: 8192
b68070c
mtasic85
committed on
Feb 22
pretrain model
193a28c
mtasic85
committed on
Feb 22
git config
8432ba4
mtasic85
committed on
Feb 22
initial commit
3d391d5
verified
mtasic85
committed on
Feb 22