TRI-ML/DCLM-1B




Is this model supported for finetuning with flash attention?

#4 opened 5 months ago by

MMLU Performance After Token Training

#3 opened about 1 year ago by