First Token Cutoff LLM sampling
The blog post proposes a new sampling algorithm called First Token Cutoff (FTC) as an alternative to nucleus (top-p) sampling. FTC compares each candidate token's probability to that of the most likely token and rejects any candidate whose ratio falls below a cutoff threshold. The approach aims to preserve creativity while avoiding low-probability tokens that could lead to hallucination.
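
The ratio rule described above can be illustrated with a minimal Python sketch. The summary does not give the post's exact formulation, so several details here are assumptions: the function names (`softmax`, `ftc_filter`, `ftc_sample`), the default cutoff of 0.5, and the choice to renormalize the surviving tokens and sample among them in proportion to their probabilities.

```python
import math
import random


def softmax(logits):
    """Convert raw logits to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ftc_filter(probs, cutoff):
    """Keep tokens whose probability is at least cutoff * (top probability).

    Returns a dict mapping token index -> renormalized probability.
    Renormalization is an assumption of this sketch, not a detail
    confirmed by the summary.
    """
    if not 0.0 < cutoff <= 1.0:
        raise ValueError("cutoff must be in (0, 1]")
    p_max = max(probs)
    # The top token always survives because p_max >= cutoff * p_max.
    kept = {i: p for i, p in enumerate(probs) if p >= cutoff * p_max}
    total = sum(kept.values())
    return {i: p / total for i, p in kept.items()}


def ftc_sample(logits, cutoff=0.5, rng=random):
    """Sample one token index from the FTC-filtered distribution.

    The default cutoff of 0.5 is a hypothetical value for illustration.
    """
    dist = ftc_filter(softmax(logits), cutoff)
    tokens = list(dist)
    weights = list(dist.values())
    return rng.choices(tokens, weights=weights, k=1)[0]
```

With probabilities `[0.5, 0.3, 0.15, 0.05]` and a cutoff of 0.5, only the first two tokens survive, since the others fall below half of the top probability. A cutoff of 1.0 reduces to greedy decoding whenever the top token is unique, while a cutoff near 0 approaches unfiltered sampling.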