Translation

The Fourth Scaling Law

The post refers to the "Fourth Scaling Law," a proposed scaling principle in AI that goes beyond the three established scaling laws (model size, data, and compute).

Background

- "Amazed Saint" is a pseudonymous AI/tech commentator. The tweet refers to a hot topic: the "fourth scaling law." The first three scaling laws (model size, data size, and training compute) drove LLM progress for years: bigger models, more data, and more compute reliably produced better performance. That trend is now showing diminishing returns.
- The "fourth scaling law" means scaling compute at *inference time* (when the model generates an answer) rather than only during training. The model is allowed to "think" longer: it explores multiple reasoning paths, critiques its own output, and backtracks. The approach was popularized by reasoning models such as OpenAI's o1 and DeepSeek's R1.
- Why it matters: it changes the AI race. Winning is no longer only about giant training clusters; architectures that use inference-time compute efficiently become the differentiator. It also shifts costs toward deployment, since each answer costs more the longer the model "thinks," and it raises the question of when more thinking stops helping.
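
One simple way to picture "exploring multiple reasoning paths" is majority voting over several independently sampled answers. The sketch below is a toy model, not how o1 or R1 work internally: it assumes each sample is correct with a fixed probability `p`, that samples are independent, and that there are only two possible answers. The function names `majority_vote` and `majority_accuracy` are made up for this illustration.

```python
from collections import Counter
from math import comb


def majority_vote(answers):
    """Return the most common answer among the sampled reasoning paths."""
    return Counter(answers).most_common(1)[0][0]


def majority_accuracy(p, n):
    """Exact probability that the majority of n samples is correct.

    Toy assumptions (not from the original post): every sample is correct
    independently with probability p, and there are only two candidate
    answers, so an odd n can never produce a tie.
    """
    if n % 2 == 0:
        raise ValueError("use an odd n to avoid ties")
    need = n // 2 + 1  # smallest number of correct samples that wins the vote
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(need, n + 1))


if __name__ == "__main__":
    # Five hypothetical sampled answers to the same question.
    print(majority_vote(["42", "41", "42", "42", "17"]))

    # More samples means more inference-time compute per question.
    for n in (1, 3, 9, 27, 81):
        print(n, round(majority_accuracy(0.6, n), 3), round(majority_accuracy(0.4, n), 3))
```

Under these assumptions, accuracy climbs toward 1 as `n` grows when a single sample is right more often than not (`p = 0.6`), with each extra sample eventually adding less. When a single sample is wrong more often than not (`p = 0.4`), voting over more samples makes the result worse, which is one concrete form of the question of when more thinking stops helping.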