From Hacker News

Can gzip be a language model?

A blog post explores whether gzip can function as a language model, using compression ratios to estimate text similarity and to perform classification tasks. The author finds that gzip is not a true language model, but that gzip-based methods achieve surprisingly competitive results on some text classification benchmarks, albeit with practical limitations.
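
To make the idea of compression-based similarity concrete, here is a minimal Python sketch. It assumes the normalized compression distance (NCD) as the similarity measure and a 1-nearest-neighbour rule as the classifier. The summary does not say which formula or classifier the post actually uses, so both choices, and the function names `compressed_len`, `ncd`, and `classify`, are illustrative assumptions rather than the author's method.

```python
import gzip


def compressed_len(text: str) -> int:
    """Size in bytes of the gzip-compressed UTF-8 encoding of `text`."""
    return len(gzip.compress(text.encode("utf-8")))


def ncd(a: str, b: str) -> float:
    """Normalized compression distance (an assumed similarity measure).

    If `a` and `b` share many substrings, compressing them together costs
    little more than compressing the larger one alone, so the value is small.
    """
    ca, cb = compressed_len(a), compressed_len(b)
    cab = compressed_len(a + " " + b)
    return (cab - min(ca, cb)) / max(ca, cb)


def classify(query: str, labeled: list[tuple[str, str]]) -> str:
    """Return the label of the training text closest to `query` under NCD.

    `labeled` is a list of (text, label) pairs; this is plain 1-nearest-neighbour.
    """
    _, label = min(labeled, key=lambda pair: ncd(query, pair[0]))
    return label
```

Calling `classify(query, [(text_1, label_1), (text_2, label_2)])` returns the label of the training text with the smallest distance to `query`. Two properties of this sketch are worth noting: every query is compressed once against each training text, so the cost grows linearly with the size of the training set, and on very short strings the fixed-size gzip header and trailer make up a large share of the compressed length, which makes the distances noisy.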