
Checkpoints

#15
by borgr - opened

There are multiple checkpoints inside the OLMo-7B repo. How can one of them have the LR annealed to 0 while a later one in the same repo does not? And what does that imply about the rest of the checkpoints in the repo?

Hi @borgr, for the revisions from step 0 to step 556k we follow a linear LR schedule, and then in the last 1000 steps we anneal the LR to 0. We found this to be better for the performance of the final model.
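
For illustration, here is a minimal sketch of the shape of that schedule (a linear phase followed by a short anneal to 0). The peak and end LR values and the helper name are placeholders, not the actual OLMo-7B training config:

```python
# Sketch of the described schedule: a linear LR schedule over the main run
# (steps 0..556k), then annealing to 0 over the final ~1000 steps.
# PEAK_LR and END_LR are illustrative placeholders, not the real OLMo config.
PEAK_LR = 3.0e-4      # placeholder peak learning rate
END_LR = 3.0e-5       # placeholder LR reached at the end of the linear phase
MAIN_STEPS = 556_000  # steps following the linear schedule
ANNEAL_STEPS = 1_000  # final steps annealed to 0

def lr_at(step: int) -> float:
    """Learning rate at a given optimizer step under this sketch."""
    if step <= MAIN_STEPS:
        # Linear interpolation from PEAK_LR to END_LR over the main run.
        frac = step / MAIN_STEPS
        return PEAK_LR + frac * (END_LR - PEAK_LR)
    # Final phase: linear anneal from END_LR down to 0.
    frac = min(1.0, (step - MAIN_STEPS) / ANNEAL_STEPS)
    return END_LR * (1.0 - frac)
```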

I don't think I put the question well.

The differences between those checkpoints are still unclear to me, specifically the ones that are part of allenai/OLMo-7B: how can the non-annealed one be the one with more tokens, batches, and steps?
[attached screenshot: image.png]

@borgr This might make it clearer:

| Revision | Tokens | LR schedule |
| --- | --- | --- |
| OLMo-7B step452k | 2T | following linear schedule (not annealed) |
| OLMo-7B step556k | 2.460T | still following linear schedule (not annealed) |
| OLMo-7B step557k (main) | 2.464T | LR annealed to 0 |
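
A minimal sketch of loading one of these revisions with the `revision` argument of `from_pretrained` (the revision string below is an illustrative placeholder; use a branch name actually listed in the repo):

```python
# Load a specific intermediate checkpoint (revision) of OLMo-7B.
# The revision string is illustrative; pick one of the branches listed
# under "Files and versions" in the allenai/OLMo-7B repo.
from transformers import AutoModelForCausalLM, AutoTokenizer

REVISION = "step452000-tokens2007B"  # placeholder branch name, not verified

tokenizer = AutoTokenizer.from_pretrained(
    "allenai/OLMo-7B", revision=REVISION, trust_remote_code=True
)
model = AutoModelForCausalLM.from_pretrained(
    "allenai/OLMo-7B", revision=REVISION, trust_remote_code=True
)

# Omitting `revision` uses "main", which points at the final,
# LR-annealed checkpoint (step 557k).
annealed_model = AutoModelForCausalLM.from_pretrained(
    "allenai/OLMo-7B", trust_remote_code=True
)
```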

Maybe then add something to the name and the note that makes the difference between the second and third rows clear?

Ai2 org

Hi, thanks again for the inquiry! We’re currently working on closing out old tickets, so we’re closing this out for now, but if you require a follow-up response, please re-open this ticket or a new one and we will get back to you!

baileyk changed discussion status to closed
