Direct I/O for Cassandra Compaction: Cutting p99 Read Latency by 5x
This article explains how running Cassandra compaction with Direct I/O instead of buffered I/O can cut p99 read latency by up to 5x. Direct I/O keeps compaction traffic out of the OS page cache, which avoids cache pollution and reduces memory pressure while compaction runs.
Background
- Apache Cassandra is a widely used open-source NoSQL database, designed for massive scale across many servers with no single point of failure. Companies like Netflix, Apple, and Instagram rely on it.
- "Compaction" is a background maintenance process that merges immutable on-disk data files (SSTables) into fewer, larger ones. It's necessary to keep reads fast, but it's also expensive: it does heavy disk I/O that can slow down normal user-facing read requests.
- "p99 read latency" is the 99th-percentile read latency: 99% of read requests complete at least this fast, and the slowest 1% take this long or longer. Cutting it by 5x is a big deal for applications that need consistently fast responses.
- This article shows how switching compaction from "Buffered I/O" (which goes through the OS page cache) to "Direct I/O" (which bypasses the cache) improves tail read latency. The key insight: compaction's access pattern doesn't benefit from caching, but the default buffered approach still fills the cache with compaction data and evicts the data that user reads depend on.
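The cache-pollution effect can be illustrated with a toy model. The sketch below is not Cassandra's implementation and not how the Linux page cache actually works (the real kernel uses a more elaborate policy than plain LRU); it assumes a simple LRU cache, and the names `LRUPageCache` and `simulate`, along with all the sizes, are hypothetical values chosen for illustration. It compares the hit rate of user reads after a compaction pass that either streams through the cache (buffered) or bypasses it (direct).

```python
import random
from collections import OrderedDict


class LRUPageCache:
    """Toy LRU cache standing in for the OS page cache (an assumption)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()
        self.hits = 0
        self.misses = 0

    def access(self, page, count=True):
        """Touch a page through the cache; return True on a hit."""
        hit = page in self.pages
        if hit:
            self.pages.move_to_end(page)        # mark as most recently used
        else:
            self.pages[page] = True
            if len(self.pages) > self.capacity:
                self.pages.popitem(last=False)  # evict least recently used
        if count:
            self.hits += hit
            self.misses += not hit
        return hit


def simulate(direct_io, cache_pages=1000, hot_pages=800,
             compaction_pages=5000, reads=2000, seed=0):
    """Return the user-read hit rate measured after one compaction pass."""
    rng = random.Random(seed)
    cache = LRUPageCache(cache_pages)

    # Warm the cache with the hot user working set.
    for p in range(hot_pages):
        cache.access(("user", p), count=False)

    # Compaction touches each of its pages exactly once. With buffered I/O
    # those pages pass through the cache; with direct I/O they bypass it.
    if not direct_io:
        for p in range(compaction_pages):
            cache.access(("compaction", p), count=False)

    # User reads that arrive after compaction.
    for _ in range(reads):
        cache.access(("user", rng.randrange(hot_pages)))
    return cache.hits / reads


if __name__ == "__main__":
    print("buffered hit rate:", simulate(direct_io=False))
    print("direct hit rate:  ", simulate(direct_io=True))
```

In this model the direct run keeps the whole hot set cached, so every user read hits. The buffered run has to fault each hot page back in after compaction has pushed it out, and those extra misses are what become disk reads and tail latency in a real system.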