forked from continuwuation/rocksdb
Summary:
To be compatible with some upcoming compression change/refactoring where we supply a fixed size buffer to CompressBlock, we need to support CompressedSecondaryCache storing uncompressed values when the compression ratio is not suitable. It seems crazy that CompressedSecondaryCache currently stores compressed values that are *larger* than the uncompressed value, and even explicitly exercises that case (almost exclusively) in the existing unit tests. But it's true.

This change fixes that with some other nearby refactoring/improvement:
* Update the in-memory representation of these cache entries to support uncompressed entries even when compression is enabled. AFAIK this also allows us to safely get rid of "don't support custom split/merge for the tiered case".
* Use more efficient in-memory representation for non-split entries
* For CompressionType and CacheTier, which are defined as single-byte data types, use a single byte instead of varint32. (I don't know if varint32 was an attempt at future-proofing for a memory-only schema or what.) Now using lossless_cast will raise a compiler error if either of these types is made too large for a single byte.
* Don't wrap entries in a CacheAllocationPtr object; it's not necessary. We can rely on the same allocator being provided at delete time.
* Restructure serialization/deserialization logic, hopefully simpler or easier to read/understand.
* Use a RelaxedAtomic for disable_cache_ to avoid race.

Suggested follow-up on CompressedSecondaryCache:
* Refine the exact strategy for rejecting compressions
* Still have a lot of buffer copies; try to reduce
* Revisit the split-merge logic and try to make it more efficient overall, more unified with non-split case

Pull Request resolved: https://github.com/facebook/rocksdb/pull/13797

Test Plan:
Unit tests updated to use actually compressible strings in many places, with more testing around non-compressible strings.
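The idea of falling back to an uncompressed entry, tagged with a single byte, can be sketched as follows. This is a minimal illustration under stated assumptions, not the RocksDB code: `EntryEncoding`, `ToSingleByte`, `EncodeEntry`, and `DecodeEntry` are hypothetical names, a toy run-length encoder stands in for a real compressor, and the "keep it only if strictly smaller" rule is an assumed acceptance strategy (the summary lists refining that strategy as follow-up work).

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical one-byte tag; a stand-in for types like CompressionType.
enum class EntryEncoding : unsigned char { kUncompressed = 0, kRunLength = 1 };

// Narrow an enum to one byte, failing at compile time if the type ever grows.
// This mirrors the spirit of lossless_cast; it is not that function.
template <typename E>
constexpr char ToSingleByte(E e) {
  static_assert(sizeof(E) == 1, "tag type no longer fits in a single byte");
  return static_cast<char>(e);
}

// Toy compressor: emits (run length, byte) pairs with runs capped at 255.
std::string RunLengthEncode(const std::string& in) {
  std::string out;
  for (std::size_t i = 0; i < in.size();) {
    std::size_t run = 1;
    while (i + run < in.size() && in[i + run] == in[i] && run < 255) ++run;
    out.push_back(static_cast<char>(run));
    out.push_back(in[i]);
    i += run;
  }
  return out;
}

std::string RunLengthDecode(const std::string& in) {
  assert(in.size() % 2 == 0);  // payload is a sequence of (count, byte) pairs
  std::string out;
  for (std::size_t i = 0; i + 1 < in.size(); i += 2) {
    auto count = static_cast<std::size_t>(static_cast<unsigned char>(in[i]));
    out.append(count, in[i + 1]);
  }
  return out;
}

// Layout: [1-byte encoding tag][payload]. The compressed form is kept only
// when it is strictly smaller than the original value (assumed rule).
std::string EncodeEntry(const std::string& value) {
  const std::string compressed = RunLengthEncode(value);
  const bool use_compressed = compressed.size() < value.size();
  std::string out;
  out.push_back(ToSingleByte(use_compressed ? EntryEncoding::kRunLength
                                            : EntryEncoding::kUncompressed));
  out += use_compressed ? compressed : value;
  return out;
}

std::string DecodeEntry(const std::string& entry) {
  if (entry.empty()) return std::string();
  const auto tag =
      static_cast<EntryEncoding>(static_cast<unsigned char>(entry[0]));
  const std::string payload = entry.substr(1);
  return tag == EntryEncoding::kRunLength ? RunLengthDecode(payload) : payload;
}
```

With this layout a stored entry is never more than one byte larger than the original value, which is the property the summary says the old behavior lacked.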
## Performance Test

There was a pre-existing issue causing decompression failures in compressed secondary cache with cache_bench that is somehow fixed in this change. These decompression failures were present before the new compression API, but since then they cause assertion failures rather than being quietly ignored. For the "before" test here, they are back to being quietly ignored, and the cache_bench changes here were back-ported to the "before" configuration.

### No compressed secondary (setting expectations)

```
./cache_bench --cache_type=auto_hyper_clock_cache -cache_size=8000000000 -populate_cache
```

Max key : 3906250

Before:
Complete in 12.784 s; Rough parallel ops/sec = 2503123
Thread ops/sec = 160329; Lookup hit ratio: 0.686771

After:
Complete in 12.745 s; Rough parallel ops/sec = 2510717 (in the noise)
Thread ops/sec = 159498; Lookup hit ratio: 0.68686

### Compressed secondary, no split/merge

Same max key and approximate total memory size

```
/usr/bin/time ./cache_bench --cache_type=auto_hyper_clock_cache -cache_size=4000000000 -populate_cache -resident_ratio=0.125 -compressible_to_ratio=0.4 --secondary_cache_uri=compressed_secondary_cache://capacity=4000000000
```

Before:
Complete in 18.690 s; Rough parallel ops/sec = 1712144
Thread ops/sec = 108683; Lookup hit ratio: 0.776683
Latency: P50: 4205.19 P75: 15281.76 P99: 43810.98 P99.9: 71487.41 P99.99: 165453.32
max RSS (according to /usr/bin/time): 9341856

After:
Complete in 17.878 s; Rough parallel ops/sec = 1789951 (+4.5%)
Thread ops/sec = 114957; Lookup hit ratio: 0.792998 (+0.016)
Latency: P50: 4012.70 P75: 14477.63 P99: 40039.70 P99.9: 62521.04 P99.99: 167049.18
max RSS (according to /usr/bin/time): 9235688

The improved hit ratio is probably from fixing the failed decompressions (somehow). My modifications could have improved CPU efficiency, or the throughput gain could come from avoiding the small penalty the benchmark naturally imposes on most misses (generate another value and insert it).
### Compressed secondary, with split/merge

```
/usr/bin/time ./cache_bench --cache_type=auto_hyper_clock_cache -cache_size=4000000000 -populate_cache -resident_ratio=0.125 -compressible_to_ratio=0.4 --secondary_cache_uri='compressed_secondary_cache://capacity=4000000000;enable_custom_split_merge=true'
```

Before:
Complete in 20.062 s; Rough parallel ops/sec = 1595075
Thread ops/sec = 101759; Lookup hit ratio: 0.787129
Latency: P50: 5338.53 P75: 16073.46 P99: 46752.65 P99.9: 73459.11 P99.99: 201318.75
max RSS (according to /usr/bin/time): 9049852

After:
Complete in 18.564 s; Rough parallel ops/sec = 1723771 (+8.1%)
Thread ops/sec = 110724; Lookup hit ratio: 0.813414 (+0.026)
Latency: P50: 5234.75 P75: 14590.43 P99: 41401.03 P99.9: 65606.50 P99.99: 157248.04
max RSS (according to /usr/bin/time): 8917592

Looks like an improvement

Reviewed By: anand1976

Differential Revision: D78842120

Pulled By: pdillinger

fbshipit-source-id: 5f754b160c37ebee789279178ebb5e862071bdb2

| | | |
|---|---|---|
| allocator.h | ||
| arena.cc | ||
| arena.h | ||
| arena_test.cc | ||
| concurrent_arena.cc | ||
| concurrent_arena.h | ||
| jemalloc_nodump_allocator.cc | ||
| jemalloc_nodump_allocator.h | ||
| memkind_kmem_allocator.cc | ||
| memkind_kmem_allocator.h | ||
| memory_allocator.cc | ||
| memory_allocator_impl.h | ||
| memory_allocator_test.cc | ||
| memory_usage.h | ||