SlateDB: An Object-Native LSM for Online Systems
SlateDB is a new open-source embedded database built as an "object-native" LSM tree: it stores its sorted string table (SST) files directly in cloud object storage (such as S3 or GCS) rather than on local disk, with the aim of simplifying stateful workloads in serverless and cloud environments.
Background
- SlateDB is an embedded database, meaning it runs as a library inside the application process. It persists data to cloud object storage (such as AWS S3) instead of a local disk, using a log-structured merge (LSM) tree architecture.
- It targets "online systems": applications that need low read/write latency, not just batch analytics, but that want to avoid managing local disks. Think serverless functions, edge workers, or multi-region apps that can't rely on a fast local SSD.
- LSM trees (popularized by LevelDB and RocksDB) are a storage engine pattern that buffers writes in memory, flushes them as immutable sorted files (SSTs), and periodically merges those files in a process called compaction. SlateDB adapts this pattern to work over cheap but high-latency object storage such as S3 rather than local NVMe drives.
- The project was started by former engineers from Redis and RocksDB, and is open source under the Apache 2.0 license. It's notable because existing LSM engines such as LevelDB and RocksDB assume a local disk, while SlateDB rethinks compaction and caching for cloud blob stores.
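
To make the LSM pattern described above concrete, here is a minimal sketch in Python. It is not SlateDB's API or implementation: `ToyObjectStore`, `ToyLSM`, the `sst-NNNNNN` object names, and the `flush_threshold` parameter are all hypothetical names invented for illustration, and an in-memory dictionary stands in for a real object store. It shows only the three ideas from the text: buffering writes in memory, flushing them as immutable sorted objects, and merging those objects during compaction.

```python
import json


class ToyObjectStore:
    """Hypothetical stand-in for an object store: whole-object put/get/delete only."""

    def __init__(self):
        self._objects = {}

    def put(self, name, data):
        self._objects[name] = data

    def get(self, name):
        return self._objects[name]

    def delete(self, name):
        del self._objects[name]


class ToyLSM:
    """Toy LSM tree (illustrative only, not SlateDB's design)."""

    def __init__(self, store, flush_threshold=2):
        self.store = store
        self.flush_threshold = flush_threshold
        self.memtable = {}   # in-memory buffer of recent writes
        self.sst_names = []  # immutable sorted objects, oldest first
        self._next_id = 0

    def _write_sst(self, entries):
        # Serialize the entries in key order as a single immutable object.
        name = f"sst-{self._next_id:06d}"
        self._next_id += 1
        self.store.put(name, json.dumps(sorted(entries.items())).encode())
        return name

    def _read_sst(self, name):
        return dict(json.loads(self.store.get(name).decode()))

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Turn the buffered writes into one new object; never rewrite old ones.
        if self.memtable:
            self.sst_names.append(self._write_sst(self.memtable))
            self.memtable = {}

    def get(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for name in reversed(self.sst_names):  # newest object wins
            entries = self._read_sst(name)
            if key in entries:
                return entries[key]
        return None

    def compact(self):
        # Merge all objects into one; later (newer) values overwrite older ones.
        merged = {}
        for name in self.sst_names:
            merged.update(self._read_sst(name))
        old_names = self.sst_names
        self.sst_names = [self._write_sst(merged)] if merged else []
        for name in old_names:
            self.store.delete(name)
```

In this sketch every `get` that misses the memtable fetches whole objects from the store, which is exactly the kind of cost that makes caching and compaction strategy matter when each fetch is a high-latency object storage request rather than a local disk read.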