rocksdb/table/block_based
Andrew Chang 31408c0d8d Set filesystem constructor parameter for FilePrefetchBuffer inside PrefetchTail (#13157)
Summary:
After https://github.com/facebook/rocksdb/pull/13118 was merged, I did some investigation to see whether the file system buffer reuse code was actually being used.

The good news is that I was able to see from the CPU profiling results that my code is getting invoked through the warm storage stress tests.

The bad news is that most of the time, the optimization is not being used, so we end up going through the regular old `RandomAccessFileReader::Read` path.

Here is the entire function call chain up to `FilePrefetchBuffer::Read`

1. rocksdb::DB::MultiGet
2. rocksdb::DBImpl::MultiGet
3. rocksdb::DBImpl::MultiGetCommon
4. rocksdb::DBImpl::MultiGetImpl
5. rocksdb::Version::MultiGet
6. rocksdb::Version::MultiGetFromSST
7. rocksdb::TableCache::MultiGet
8. rocksdb::TableCache::FindTable
9. rocksdb::TableCache::GetTableReader
10. rocksdb::BlockBasedTableFactory::NewTableReader
11. rocksdb::BlockBasedTable::Open
12. rocksdb::BlockBasedTable::PrefetchTail
13. rocksdb::FilePrefetchBuffer::Prefetch
14. rocksdb::FilePrefetchBuffer::Read

At this point, we split into `rocksdb::RandomAccessFileReader::Read` and
`rocksdb::FilePrefetchBuffer::FSBufferDirectRead`. `FSBufferDirectRead` gets called <3% of the time.

I think the root cause is that the `FileSystem* fs` parameter is not getting passed into the `FilePrefetchBuffer` constructor. When `fs` is `nullptr`, `UseFSBuffer()` will always return `false` and we do not end up calling `FSBufferDirectRead`.

Luckily, it does not seem like there are too many places I need to change. `BlockBasedTable` resets its `prefetch_buffer` in 3 separate places. When it disables the prefetch buffer (2/3 of the instances), we don't care about whether the `fs` parameter is there. This PR is addressing the third instance, where it is not trying to disable the buffer.

Note that there is another method, `PrefetchBufferCollection::GetOrCreatePrefetchBuffer` that creates new `FilePrefetchBuffer`s without the `fs` parameter. This method gets called by `compaction_iterator` and `merge_helper`. I think we can address this in a subsequent PR:
1. Each of these changes effectively "unlocks" the buffer reuse feature. Separating the changes would be helpful when I look at the profiling results again, since I can isolate what impact this PR had on the percentage of time that `rocksdb::FilePrefetchBuffer::FSBufferDirectRead` was invoked.
2. I still need to look into what exactly I would need to changes I need to make to `PrefetchBufferCollection`
3. This code seems to be for blob prefetching in particular, and I don't think it has the biggest ROI anyways.

```cpp
      const Status s = blob_fetcher_->FetchBlob(
          user_key(), blob_index, prefetch_buffer, &blob_value_, &bytes_read);
```
4. I am not sure if the current benchmark I am using for warm storage exercises this blob prefetching code, so I may need to find another way to assess the performance impact.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/13157

Test Plan: The existing unit test coverage guards against obvious bugs. I ran another set of performance tests to confirm there were no regressions in CPU utilization.

Reviewed By: anand1976

Differential Revision: D66464704

Pulled By: archang19

fbshipit-source-id: 260145cfcc05ac46cf2dd77a53a85e8808031dea
2024-12-06 15:37:44 -08:00
..
binary_search_index_reader.cc Eliminate some parameters redundant with ReadOptions (#12761) 2024-06-12 15:44:37 -07:00
binary_search_index_reader.h Extend Get/MultiGet deadline support to table open (#6982) 2020-06-29 14:53:17 -07:00
block.cc Add initial support for TimedPut API (#12419) 2024-03-14 15:44:55 -07:00
block.h Add some per key optimization for UDT in memtable only feature (#13031) 2024-10-03 17:57:50 -07:00
block_based_table_builder.cc Record newest_key_time as a table property (#13083) 2024-11-01 10:08:35 -07:00
block_based_table_builder.h Fix/cleanup SeqnoToTimeMapping (#12253) 2024-01-19 21:50:38 -08:00
block_based_table_factory.cc Fix race to make BlockBasedTableOptions effectively mutable (#13082) 2024-10-25 10:24:54 -07:00
block_based_table_factory.h Fix race to make BlockBasedTableOptions effectively mutable (#13082) 2024-10-25 10:24:54 -07:00
block_based_table_iterator.cc Trim readahead size based on prefix during prefix scan (#13040) 2024-10-17 15:52:55 -07:00
block_based_table_iterator.h Trim readahead size based on prefix during prefix scan (#13040) 2024-10-17 15:52:55 -07:00
block_based_table_reader.cc Set filesystem constructor parameter for FilePrefetchBuffer inside PrefetchTail (#13157) 2024-12-06 15:37:44 -08:00
block_based_table_reader.h Set filesystem constructor parameter for FilePrefetchBuffer inside PrefetchTail (#13157) 2024-12-06 15:37:44 -08:00
block_based_table_reader_impl.h Eliminate some parameters redundant with ReadOptions (#12761) 2024-06-12 15:44:37 -07:00
block_based_table_reader_sync_and_async.h Update MultiGet to respect the strict_capacity_limit block cache option (#13104) 2024-11-01 13:22:27 -07:00
block_based_table_reader_test.cc Update MultiGet to respect the strict_capacity_limit block cache option (#13104) 2024-11-01 13:22:27 -07:00
block_builder.cc internal_repo_rocksdb (-8794174668376270091) (#12114) 2023-12-01 11:10:30 -08:00
block_builder.h Logically strip timestamp during flush (#11557) 2023-06-29 15:50:50 -07:00
block_cache.cc Support compressed and local flash secondary cache stacking (#11812) 2023-09-21 20:30:53 -07:00
block_cache.h Support pro-actively erasing obsolete block cache entries (#12694) 2024-06-07 08:57:11 -07:00
block_prefetcher.cc Respect ReadOptions::read_tier in prefetching (#12782) 2024-06-19 09:53:59 -07:00
block_prefetcher.h Refactor FilePrefetchBuffer code (#12097) 2024-01-05 09:29:01 -08:00
block_prefix_index.cc Fix bug with kHashSearch and changing prefix_extractor with SetOptions (#10128) 2022-06-10 08:51:45 -07:00
block_prefix_index.h Fix bug with kHashSearch and changing prefix_extractor with SetOptions (#10128) 2022-06-10 08:51:45 -07:00
block_test.cc internal_repo_rocksdb (-8794174668376270091) (#12114) 2023-12-01 11:10:30 -08:00
block_type.h Remove deprecated block-based filter (#10184) 2022-06-16 15:51:33 -07:00
cachable_entry.h Support pro-actively erasing obsolete block cache entries (#12694) 2024-06-07 08:57:11 -07:00
data_block_footer.cc Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
data_block_footer.h Replace namespace name "rocksdb" with ROCKSDB_NAMESPACE (#6433) 2020-02-20 12:09:57 -08:00
data_block_hash_index.cc Format files under table/ by clang-format (#10852) 2022-10-25 11:50:38 -07:00
data_block_hash_index.h Fix build with gcc 13 by including <cstdint> (#11118) 2023-01-25 14:30:32 -08:00
data_block_hash_index_test.cc Record newest_key_time as a table property (#13083) 2024-11-01 10:08:35 -07:00
filter_block.h Option to decouple index and filter partitions (#12939) 2024-08-16 15:34:31 -07:00
filter_block_reader_common.cc Remove redundant no_io parameters to filter functions (#12762) 2024-06-12 18:47:11 -07:00
filter_block_reader_common.h Remove redundant no_io parameters to filter functions (#12762) 2024-06-12 18:47:11 -07:00
filter_policy.cc More valgrind fixes (#12990) 2024-09-06 10:11:34 -07:00
filter_policy_internal.h Optimize, simplify filter block building (fix regression) (#12931) 2024-08-14 15:13:16 -07:00
flush_block_policy.cc Change internal headers with duplicate names (#11408) 2023-05-17 11:27:09 -07:00
flush_block_policy_impl.h Change internal headers with duplicate names (#11408) 2023-05-17 11:27:09 -07:00
full_filter_block.cc Option to decouple index and filter partitions (#12939) 2024-08-16 15:34:31 -07:00
full_filter_block.h Option to decouple index and filter partitions (#12939) 2024-08-16 15:34:31 -07:00
full_filter_block_test.cc Optimize, simplify filter block building (fix regression) (#12931) 2024-08-14 15:13:16 -07:00
hash_index_reader.cc Eliminate some parameters redundant with ReadOptions (#12761) 2024-06-12 15:44:37 -07:00
hash_index_reader.h Extend Get/MultiGet deadline support to table open (#6982) 2020-06-29 14:53:17 -07:00
index_builder.cc Refactor IndexBuilder::AddIndexEntry (#12867) 2024-07-22 14:27:31 -07:00
index_builder.h Refactor IndexBuilder::AddIndexEntry (#12867) 2024-07-22 14:27:31 -07:00
index_reader_common.cc Eliminate some parameters redundant with ReadOptions (#12761) 2024-06-12 15:44:37 -07:00
index_reader_common.h Eliminate some parameters redundant with ReadOptions (#12761) 2024-06-12 15:44:37 -07:00
mock_block_based_table.h Remove deprecated block-based filter (#10184) 2022-06-16 15:51:33 -07:00
parsed_full_filter_block.cc Hide FilterBits{Builder,Reader} from public API (#9592) 2022-02-17 16:34:46 -08:00
parsed_full_filter_block.h Major Cache refactoring, CPU efficiency improvement (#10975) 2023-01-11 14:20:40 -08:00
partitioned_filter_block.cc Option to decouple index and filter partitions (#12939) 2024-08-16 15:34:31 -07:00
partitioned_filter_block.h Option to decouple index and filter partitions (#12939) 2024-08-16 15:34:31 -07:00
partitioned_filter_block_test.cc Option to decouple index and filter partitions (#12939) 2024-08-16 15:34:31 -07:00
partitioned_index_iterator.cc Refactor FilePrefetchBuffer code (#12097) 2024-01-05 09:29:01 -08:00
partitioned_index_iterator.h Format files under table/ by clang-format (#10852) 2022-10-25 11:50:38 -07:00
partitioned_index_reader.cc Eliminate some parameters redundant with ReadOptions (#12761) 2024-06-12 15:44:37 -07:00
partitioned_index_reader.h Support pro-actively erasing obsolete block cache entries (#12694) 2024-06-07 08:57:11 -07:00
reader_common.cc Prefer static_cast in place of most reinterpret_cast (#12308) 2024-02-07 10:44:11 -08:00
reader_common.h Remove unnecessary, confusing 'extern' (#12300) 2024-01-29 10:38:08 -08:00
uncompression_dict_reader.cc Eliminate some parameters redundant with ReadOptions (#12761) 2024-06-12 15:44:37 -07:00
uncompression_dict_reader.h Eliminate some parameters redundant with ReadOptions (#12761) 2024-06-12 15:44:37 -07:00