rocksdb/utilities
Levi Tamasi b339d089ed Write-side support for FAISS IVF indices (#13197)
Summary:
Pull Request resolved: https://github.com/facebook/rocksdb/pull/13197

The patch adds initial support for backing FAISS's inverted file based indices with data stored in RocksDB. It introduces a `SecondaryIndex` implementation called `FaissIVFIndex` which takes ownership of a `faiss::IndexIVF` object. During indexing, `FaissIVFIndex` treats the original value of the specified primary column as an embedding vector, and passes it to the provided FAISS index object to perform quantization. It replaces the original embedding vector with the result of the coarse quantizer (i.e. the inverted list id), and puts the result of the fine quantizer (if any) into the secondary index value. Note that this patch is only one half of the equation; it provides a way of storing FAISS inverted lists in RocksDB but there is currently no retrieval/search support (this will be a follow-up change). Also, the integration currently works only with our internal Buck build. I plan to add support for `cmake` / `make` based builds similarly to how we handle Folly.

Reviewed By: jowlyzhang

Differential Revision: D66907065

fbshipit-source-id: 63fdf29895d5feeffc230254a7ddfb0aac050967
2024-12-09 18:56:27 -08:00
..
agg_merge Run internal cpp modernizer on RocksDB repo (#12398) 2024-03-04 10:08:32 -08:00
backup Avoid unnecessary work in internal calls to GetSortedWalFiles (#12831) 2024-07-01 23:29:02 -07:00
blob_db Fix file deletions in DestroyDB not rate limited (#12891) 2024-08-02 19:31:55 -07:00
cassandra Run internal cpp modernizer on RocksDB repo (#12398) 2024-03-04 10:08:32 -08:00
checkpoint Fix failure to clean the temporary directory due to NotFound in crash test checkpoint creation (#12919) 2024-08-08 15:37:19 -07:00
compaction_filters Remove RocksDB LITE (#11147) 2023-01-27 13:14:19 -08:00
convenience Run clang-format on utilities/ (except utilities/transactions/) (#10853) 2022-10-24 16:38:09 -07:00
leveldb_options Put Cache and CacheWrapper in new public header (#11192) 2023-02-09 12:12:02 -08:00
memory Refactor table_factory into MutableCFOptions (#13077) 2024-10-17 14:13:20 -07:00
merge_operators Run internal cpp modernizer on RocksDB repo (#12398) 2024-03-04 10:08:32 -08:00
option_change_migration Disallow refitting more than 1 file from non-L0 to L0 (#12481) 2024-03-29 10:52:36 -07:00
options Fix race to make BlockBasedTableOptions effectively mutable (#13082) 2024-10-25 10:24:54 -07:00
persistent_cache Run internal cpp modernizer on RocksDB repo (#12398) 2024-03-04 10:08:32 -08:00
secondary_index Write-side support for FAISS IVF indices (#13197) 2024-12-09 18:56:27 -08:00
simulator_cache Run internal cpp modernizer on RocksDB repo (#12398) 2024-03-04 10:08:32 -08:00
table_properties_collectors Add public API definitions for surfacing data age (#13138) 2024-12-02 16:32:02 -08:00
trace Prefer static_cast in place of most reinterpret_cast (#12308) 2024-02-07 10:44:11 -08:00
transactions Support using secondary indices with write-committed transactions (#13180) 2024-12-05 19:05:56 -08:00
ttl Run internal cpp modernizer on RocksDB repo (#12398) 2024-03-04 10:08:32 -08:00
write_batch_with_index Introduce a transaction option to skip memtable write during commit (#13144) 2024-12-05 15:00:17 -08:00
cache_dump_load.cc Remove RocksDB LITE (#11147) 2023-01-27 13:14:19 -08:00
cache_dump_load_impl.cc Add dump all keys for cache dumper impl (#12500) 2024-04-12 10:47:13 -07:00
cache_dump_load_impl.h Add dump all keys for cache dumper impl (#12500) 2024-04-12 10:47:13 -07:00
compaction_filters.cc Remove FactoryFunc from LoadXXXObject (#11203) 2023-02-17 12:54:07 -08:00
counted_fs.cc Fix serious FSDirectory use-after-Close bug (missing fsync) (#10460) 2022-08-02 10:54:32 -07:00
counted_fs.h Explicitly closing all directory file descriptors (#10049) 2022-06-01 18:03:34 -07:00
debug.cc Add timestamp support in dump_wal/dump/idump (#12690) 2024-05-23 20:26:57 -07:00
env_mirror.cc Run internal cpp modernizer on RocksDB repo (#12398) 2024-03-04 10:08:32 -08:00
env_mirror_test.cc Remove RocksDB LITE (#11147) 2023-01-27 13:14:19 -08:00
env_timed.cc Remove RocksDB LITE (#11147) 2023-01-27 13:14:19 -08:00
env_timed.h Remove RocksDB LITE (#11147) 2023-01-27 13:14:19 -08:00
env_timed_test.cc Remove RocksDB LITE (#11147) 2023-01-27 13:14:19 -08:00
fault_injection_env.cc Run internal cpp modernizer on RocksDB repo (#12398) 2024-03-04 10:08:32 -08:00
fault_injection_env.h Remove extra semi colon from internal_repo_rocksdb/repo/util/filelock_test.cc 2024-03-19 16:17:57 -07:00
fault_injection_fs.cc Fix non-ASCII character (#12972) 2024-09-03 14:41:55 -07:00
fault_injection_fs.h Handle injected write error after successful WAL write in crash test + misc (#12838) 2024-07-29 13:51:49 -07:00
fault_injection_secondary_cache.cc Add some compressed and tiered secondary cache stats (#12150) 2023-12-15 11:34:08 -08:00
fault_injection_secondary_cache.h Remove 'virtual' when implied by 'override' (#12319) 2024-01-31 13:14:42 -08:00
memory_allocators.h Major Cache refactoring, CPU efficiency improvement (#10975) 2023-01-11 14:20:40 -08:00
merge_operators.cc Remove FactoryFunc from LoadXXXObject (#11203) 2023-02-17 12:54:07 -08:00
merge_operators.h Run clang-format on utilities/ (except utilities/transactions/) (#10853) 2022-10-24 16:38:09 -07:00
object_registry.cc Run internal cpp modernizer on RocksDB repo (#12398) 2024-03-04 10:08:32 -08:00
object_registry_test.cc Remove RocksDB LITE (#11147) 2023-01-27 13:14:19 -08:00
types_util.cc Add support in SstFileReader to get a raw table iterator (#12385) 2024-04-02 21:23:06 -07:00
types_util_test.cc Add support in SstFileReader to get a raw table iterator (#12385) 2024-04-02 21:23:06 -07:00
util_merge_operators_test.cc Run internal cpp modernizer on RocksDB repo (#12398) 2024-03-04 10:08:32 -08:00
wal_filter.cc Remove FactoryFunc from LoadXXXObject (#11203) 2023-02-17 12:54:07 -08:00