How an LLM becomes more coherent as we train it
A researcher trained a GPT-2-style language model on 3.2 billion tokens, tracking its progress across 57 checkpoints. The model evolved from emitting incoherent text to producing coherent, motivational content after processing roughly one-third of the training data (about a billion tokens).
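The checkpoint-probing workflow itself is easy to reproduce. The sketch below is a minimal illustration, not the researcher's actual tooling: it assumes the checkpoints are saved in Hugging Face format under a hypothetical `checkpoints/step-NN` layout, and the prompt is invented. It loads a few checkpoints (early, roughly one-third in, and final) and samples a short continuation from each so the drift toward coherence can be read side by side.

```python
# Minimal sketch: sample text from saved training checkpoints to eyeball coherence.
# Assumptions: checkpoints are in Hugging Face format at checkpoints/step-NN,
# and the prompt is hypothetical. Any GPT-2-style model saved this way works.
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")

def sample_from_checkpoint(path: str, prompt: str, max_new_tokens: int = 60) -> str:
    """Load one training checkpoint and generate a short continuation."""
    model = GPT2LMHeadModel.from_pretrained(path)
    inputs = tokenizer(prompt, return_tensors="pt")
    output = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=True,          # nucleus sampling shows the model's own fluency
        top_p=0.9,
        pad_token_id=tokenizer.eos_token_id,
    )
    return tokenizer.decode(output[0], skip_special_tokens=True)

# Probe an early checkpoint, one near the one-third mark (~19 of 57), and the last.
for step in (1, 19, 57):
    text = sample_from_checkpoint(f"checkpoints/step-{step:02d}", "The key to success is")
    print(f"--- checkpoint {step} ---\n{text}\n")
```

Sampling with the same prompt and decoding settings at every checkpoint keeps the comparison fair, so any change in the output reflects the model rather than the generation parameters.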