Is ckpt000 the initialized model?

by jasonrqh - opened Mar 24, 2025

Really appreciate this great work on open-sourcing all the intermediate checkpoints! Just a quick question: is ckpt000 the initialized model, or the model after training on one chunk of data? The output from ckpt000 seems to follow some patterns (e.g., 'the first one is the first one is...') rather than being pure nonsense.
