Implement Queue Service Using RocksDB
Many users implement a queue service using RocksDB. In these services, new items are appended to each queue with an increasing sequence ID, and items are removed starting from the smallest sequence ID. Users usually read a queue in increasing sequence ID order.
You can simply encode each key as <queue_id, sequence_id>, where queue_id is encoded with a fixed length and sequence_id is encoded as big endian, so that bytewise key order matches sequence ID order.
To iterate over keys, a user can create an iterator, seek to <queue_id, target_sequence_id>, and iterate from there.
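The key layout above can be sketched in a few lines. This is a minimal sketch, not RocksDB API: the 4-byte queue_id width, the 8-byte sequence_id width, and the helper names `AppendBigEndian`, `EncodeQueueKey`, and `DecodeSequenceId` are assumptions chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical fixed widths: 4-byte queue_id, 8-byte sequence_id.
constexpr size_t kQueueIdBytes = 4;
constexpr size_t kSequenceIdBytes = 8;

// Appends `value` to `out` as `width` big-endian bytes (most significant first).
inline void AppendBigEndian(std::string* out, uint64_t value, size_t width) {
  for (size_t i = 0; i < width; ++i) {
    const size_t shift = 8 * (width - 1 - i);
    out->push_back(static_cast<char>((value >> shift) & 0xff));
  }
}

// Builds the key <queue_id, sequence_id>.
inline std::string EncodeQueueKey(uint32_t queue_id, uint64_t sequence_id) {
  std::string key;
  key.reserve(kQueueIdBytes + kSequenceIdBytes);
  AppendBigEndian(&key, queue_id, kQueueIdBytes);
  AppendBigEndian(&key, sequence_id, kSequenceIdBytes);
  return key;
}

// Reads back the sequence_id half of a key produced by EncodeQueueKey.
inline uint64_t DecodeSequenceId(const std::string& key) {
  assert(key.size() == kQueueIdBytes + kSequenceIdBytes);
  uint64_t seq = 0;
  for (size_t i = kQueueIdBytes; i < key.size(); ++i) {
    seq = (seq << 8) | static_cast<unsigned char>(key[i]);
  }
  return seq;
}
```

Because both parts are big endian and fixed width, a bytewise comparison of two keys orders them first by queue_id and then by sequence_id, which is what a seek to <queue_id, target_sequence_id> relies on.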
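Since the oldest items are deleted, there can be a large number of "tombstones" at the beginning of each queue_id. As a result, two queries might be exceptionally slow, because the iterator has to skip those tombstones before it reaches a live key:
- Seek(<queue_id, 0>)
- Calling Next() while you are at the last sequence ID of a queue_id, which moves the iterator into the tombstones at the beginning of the next queue_id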
To mitigate the problem, you can remember the first and last sequence ID of each queue_id and never iterate outside that range.
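One way to remember the live range of each queue is a small in-memory table. The sketch below is illustrative only: `QueueRange` and `QueueRangeTracker` are hypothetical names, it assumes sequence IDs within a queue increase by exactly one per item, and it keeps state only in memory, so a real service would need to rebuild or persist it across restarts.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

// Live range of one queue: items with IDs in [first, last] exist.
struct QueueRange {
  uint64_t first;  // smallest sequence ID not yet deleted
  uint64_t last;   // largest sequence ID written
};

// Hypothetical helper; assumes IDs in a queue grow by one per appended item.
class QueueRangeTracker {
 public:
  // Records that `sequence_id` was appended to the tail of `queue_id`.
  void OnAppend(uint32_t queue_id, uint64_t sequence_id) {
    auto it = ranges_.find(queue_id);
    if (it == ranges_.end()) {
      ranges_.emplace(queue_id, QueueRange{sequence_id, sequence_id});
    } else {
      assert(sequence_id > it->second.last);  // IDs only grow
      it->second.last = sequence_id;
    }
  }

  // Records that the head item of `queue_id` was deleted.
  void OnPopFront(uint32_t queue_id) {
    auto it = ranges_.find(queue_id);
    if (it == ranges_.end()) return;
    if (it->second.first == it->second.last) {
      ranges_.erase(it);  // the queue is now empty
    } else {
      ++it->second.first;
    }
  }

  // Copies the live range into *out; returns false if the queue is empty.
  bool GetRange(uint32_t queue_id, QueueRange* out) const {
    auto it = ranges_.find(queue_id);
    if (it == ranges_.end()) return false;
    *out = it->second;
    return true;
  }

 private:
  std::unordered_map<uint32_t, QueueRange> ranges_;
};
```

With such a table, a reader can seek directly to <queue_id, first> instead of <queue_id, 0> and stop at <queue_id, last>, so the iterator never walks through the deleted prefix.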
As another way to solve the second problem, you can set an end key for your iterator when you iterate inside a queue_id, by letting ReadOptions.iterate_upper_bound point to <queue_id + 1>. The upper bound is exclusive, so this covers every key of queue_id. We encourage you to always set it, whether or not you see the slowness caused by deletions.
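Computing the <queue_id + 1> bound is mostly a matter of encoding the next queue_id the same way as the key prefix. The sketch below assumes a 4-byte big-endian queue_id; `EncodeQueueId`, `QueueUpperBound`, and `KeyInQueue` are hypothetical helper names, not RocksDB functions.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <string>

// Encodes a queue_id as 4 big-endian bytes (assumed width; match your key layout).
inline std::string EncodeQueueId(uint32_t queue_id) {
  std::string out;
  for (int shift = 24; shift >= 0; shift -= 8) {
    out.push_back(static_cast<char>((queue_id >> shift) & 0xff));
  }
  return out;
}

// Writes the exclusive upper bound <queue_id + 1> to *bound.
// Returns false for the largest queue_id, where no such bound exists
// and the scan has to run without one.
inline bool QueueUpperBound(uint32_t queue_id, std::string* bound) {
  assert(bound != nullptr);
  if (queue_id == std::numeric_limits<uint32_t>::max()) return false;
  *bound = EncodeQueueId(queue_id + 1);
  return true;
}

// True if `key` starts with the encoded prefix of `queue_id`.
inline bool KeyInQueue(const std::string& key, uint32_t queue_id) {
  return key.compare(0, 4, EncodeQueueId(queue_id)) == 0;
}
```

In RocksDB, ReadOptions::iterate_upper_bound is a pointer to a Slice, so the bound string and the Slice wrapping it both have to stay alive for as long as the iterator is in use.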
If a user finishes processing the last sequence_id of a queue_id and keeps polling for new items, just Seek(<queue_id, last_processed_id>), call Next(), and check whether the next key still belongs to the same <queue_id>. Make sure ReadOptions.iterate_upper_bound points to <queue_id + 1> to avoid the slowness caused by item deletion.
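The polling step can be modeled without RocksDB to make the control flow concrete. In the sketch below a `std::map` stands in for the sorted key space, `lower_bound` plays the role of Seek(), and incrementing the map iterator plays the role of Next(); `OrderedStore`, `EncodeKey`, and `PollNext` are hypothetical names, and the explicit prefix check stands in for what iterate_upper_bound would enforce on a real iterator.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Stand-in for the sorted key space; not a RocksDB type.
using OrderedStore = std::map<std::string, std::string>;

// <queue_id, sequence_id> as 4 + 8 big-endian bytes (assumed widths).
inline std::string EncodeKey(uint32_t queue_id, uint64_t sequence_id) {
  std::string key;
  for (int shift = 24; shift >= 0; shift -= 8)
    key.push_back(static_cast<char>((queue_id >> shift) & 0xff));
  for (int shift = 56; shift >= 0; shift -= 8)
    key.push_back(static_cast<char>((sequence_id >> shift) & 0xff));
  return key;
}

// Returns true and sets *next_key if `queue_id` holds an item newer than
// `last_processed_id`; returns false if there is nothing new yet.
inline bool PollNext(const OrderedStore& store, uint32_t queue_id,
                     uint64_t last_processed_id, std::string* next_key) {
  assert(next_key != nullptr);
  const std::string target = EncodeKey(queue_id, last_processed_id);
  auto it = store.lower_bound(target);  // Seek(<queue_id, last_processed_id>)
  // Next(), but only if the processed item is still present; if it was
  // already deleted, the seek has landed on its successor.
  if (it != store.end() && it->first == target) ++it;
  if (it == store.end()) return false;
  // Same-queue check: compare the 4-byte queue_id prefix.
  if (it->first.compare(0, 4, target, 0, 4) != 0) return false;
  *next_key = it->first;
  return true;
}
```

The check for whether the seek landed exactly on the last processed key matters when that item has already been deleted: an unconditional Next() would skip the first unprocessed item.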
If you want to further optimize this use case and avoid a binary search of the whole LSM tree on each poll, consider using TailingIterator (called ForwardIterator in some parts of the code): https://github.com/facebook/rocksdb/blob/main/include/rocksdb/options.h#L1235-L1241
The queue service is a good use case for CompactOnDeletionCollector, which prioritizes ranges with more deletes when scheduling compactions. Set ImmutableCFOptions::table_properties_collector_factories to the factory defined here: https://github.com/facebook/rocksdb/blob/main/include/rocksdb/utilities/table_properties_collectors.h#L23-L27