Can modern LLMs actually count the number of b's in "blueberry"?
This article examines whether modern large language models can accurately count the b's in the word "blueberry," a specific adversarial question.
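For reference, the ground truth that any model answer would be checked against can be computed directly. The snippet below is a minimal sketch, not part of the article's method; the helper name `count_letter` and the case-insensitive comparison are assumptions made for illustration.

```python
def count_letter(word: str, letter: str) -> int:
    """Count case-insensitive occurrences of a single letter in a word.

    Hypothetical helper for illustration; not taken from the article.
    """
    if len(letter) != 1:
        raise ValueError("letter must be a single character")
    # str.count returns the number of non-overlapping occurrences.
    return word.lower().count(letter.lower())


if __name__ == "__main__":
    # b-l-u-e-b-e-r-r-y: 'b' appears at positions 0 and 4.
    print(count_letter("blueberry", "b"))  # prints 2
```

A program answers this trivially because it sees individual characters; the question posed to a language model is whether it reaches the same count.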