rocksdb/utilities/trie_index/bitvector.cc
zaidoon 16c81a27e9 Extend UDI trie index with seqno side-table for same-user-key block boundaries (#14412)
Summary:
When the same user key spans adjacent data blocks with different sequence
numbers, the trie index cannot distinguish them because it only stores
user keys. This adds a side-table that stores per-leaf sequence numbers
and overflow block metadata, enabling correct post-seek correction using
seqno comparison.

Design:
- Trie always stores user-key-only separators (unchanged)
- Duplicate separators from same-key boundaries are de-duplicated
- Side-table stores leaf_seqnos[], leaf_block_counts[], and overflow
  arrays (offsets, sizes, seqnos) serialized after the trie data
- Post-seek correction: after trie Seek lands on a leaf, compare
  target_seq vs leaf_seqno to decide whether to advance through
  overflow blocks or to the next trie leaf
- Zero overhead when no same-user-key boundaries exist (no side-table
  serialized, no seqno checks at seek time)

Public API changes (user_defined_index.h):
- AddIndexEntry: single pure virtual with IndexEntryContext parameter
  (replaces old 4-arg version). Context carries last_key_seq and
  first_key_seq for block boundaries.
- SeekAndGetResult: single pure virtual with SeekContext parameter
  (replaces old 2-arg version). Context carries target_seq.
- Fix trailing semicolons on NewBuilder/NewReader default overrides.

Wrapper layer (user_defined_index_wrapper.h):
- Extract sequence numbers from parsed internal keys and pass via
  context structs to UDI builder/iterator.
- Fix CurrentIndexSizeEstimate() to delegate to internal builder
  (was returning 0).
- Fix ApproximateMemoryUsage() to include UDI reader memory.
- Fix typos: lof_err_key -> log_err_key, COuld -> Could,
  "Bad index name" -> "Bad index name: ".

Trie builder (trie_index_factory.cc):
- Buffer separator entries during building; at Finish(), detect whether
  any same-user-key boundary was seen (sticky flag, same strategy as
  ShortenedIndexBuilder::must_use_separator_with_seq_).
- When seqno encoding is needed, re-encode all separators with seqno
  side-table metadata in the trie.
- Default null comparator to BytewiseComparator() to prevent crash.

Trie iterator (trie_index_factory.cc):
- Post-seek correction: compare target_seq vs leaf_seqno to advance
  through overflow runs.
- Overflow run state tracking: overflow_run_index_, overflow_run_size_,
  overflow_base_idx_ for O(1) access to overflow block handles.
- NextAndGetResult advances within overflow runs before moving to next
  trie leaf.
- Return kOutOfBound instead of kUnknown when Seek/Next finds no blocks.

LOUDS trie (louds_trie.h, louds_trie.cc):
- Seqno side-table: builder AddKeyWithSeqno/AddOverflowBlock methods,
  BFS reordering of seqno/block_count arrays, serialization/deserialization
  of per-leaf seqnos, block counts, overflow handles/seqnos, and
  overflow_base_ prefix sum for O(1) access.
- AppendKeySlot() helper with debug bounds assert on all key-append sites.

Dead code removal:
- LoudsTrieBuilder::NumKeys() (never called externally)
- sparse_leaf_count_ (serialized but never read by the reader)
- DenseChildNodeNum(pos), DenseLeafIndex(pos), DenseNodeNum(pos)
  (superseded by FromRank variants that avoid redundant Rank1 calls)

Deserialization hardening (all in InitFromData, cold path only):
- Move num_keys_ validation before any dependent arithmetic.
- Validate dense bitvector sizes match dense_node_count_ (d_labels
  must be node_count*256, d_has_child must equal d_labels.NumOnes(),
  d_is_prefix_key must equal node_count).
- Validate sparse bitvector sizes match s_labels_size_ (s_has_child
  and s_louds must have the same number of bits as the label array).
- Validate child position table values within s_labels_size_ bounds.
- Validate chain suffix offset+length within suffix data blob.
- Validate chain end child indices < num_internal or == UINT32_MAX.
- EliasFano: add count_ upper-bound check (<=2^30) to prevent
  count_*low_bits_ integer overflow.
- Bitvector: validate select hint values < num_rank_samples_ to
  prevent OOB in FindNthOneBit/FindNthZeroBit.
- EliasFano: custom move ctor/assignment to re-seat low_words_
  after owned_low_data_ move (fixes dangling pointer for SSO strings).

https://github.com/facebook/rocksdb/issues/14406

Pull Request resolved: https://github.com/facebook/rocksdb/pull/14412

Reviewed By: anand1976

Differential Revision: D95160933

Pulled By: xingbowang

fbshipit-source-id: f2c3681c1059c03d540ce1cb9cc14cc79cd9730c
2026-03-04 15:31:17 -08:00

562 lines
20 KiB
C++

// Copyright (c) Meta Platforms, Inc. and affiliates.
// This source code is licensed under both the GPLv2 (found in the
// COPYING file in the root directory) and Apache 2.0 License
// (found in the LICENSE.Apache file in the root directory).
#include "utilities/trie_index/bitvector.h"
#include <algorithm>
#include <cassert>
#include <cstring>
#include "port/lang.h"
#include "util/coding.h"
namespace ROCKSDB_NAMESPACE {
namespace trie_index {
// ============================================================================
// Bitvector serialization format:
//
// [num_bits : uint64_t (fixed 8 bytes)]
// [num_ones : uint64_t (fixed 8 bytes)]
// [words : num_words * 8 bytes, where num_words = ceil(num_bits/64)]
// [rank_lut : num_rank_samples * 4 bytes (uint32_t), padded to 8]
// [select1_hints : num_select1_hints * 4 bytes (uint32_t), padded to 8]
// [select0_hints : num_select0_hints * 4 bytes (uint32_t), padded to 8]
//
// num_rank_samples = num_bits / kBitsPerRankSample + 1
// (The +1 is for the sentinel entry at the end that stores total popcount.)
//
// num_select1_hints = ceil(num_ones / kOnesPerSelectHint) if num_ones > 0,
// else 0. Analogously for num_select0_hints with (num_bits - num_ones).
//
// Using uint32_t for rank LUT entries halves the LUT memory overhead compared
// to uint64_t. See bitvector.h for why this is safe for trie index bitvectors.
// ============================================================================
size_t Bitvector::SerializedSize() const {
// Header: 2 uint64_t (num_bits, num_ones).
// Words: num_words * 8 bytes.
// Rank LUT: num_rank_samples * 4 bytes, padded to 8.
// FindNthOneBit hints: num_select1_hints * 4 bytes, padded to 8.
// FindNthZeroBit hints: num_select0_hints * 4 bytes, padded to 8.
size_t rank_bytes = num_rank_samples_ * sizeof(uint32_t);
size_t rank_padded = (rank_bytes + 7) & ~size_t(7);
size_t s1_bytes = num_select1_hints_ * sizeof(uint32_t);
size_t s1_padded = (s1_bytes + 7) & ~size_t(7);
size_t s0_bytes = num_select0_hints_ * sizeof(uint32_t);
size_t s0_padded = (s0_bytes + 7) & ~size_t(7);
return 2 * sizeof(uint64_t) + num_words_ * sizeof(uint64_t) + rank_padded +
s1_padded + s0_padded;
}
void Bitvector::EncodeTo(std::string* output) const {
size_t old_size = output->size();
size_t needed = SerializedSize();
output->resize(old_size + needed);
char* dst = &(*output)[old_size];
// Write header: num_bits, num_ones
memcpy(dst, &num_bits_, sizeof(uint64_t));
dst += sizeof(uint64_t);
memcpy(dst, &num_ones_, sizeof(uint64_t));
dst += sizeof(uint64_t);
// Write words
if (num_words_ > 0) {
memcpy(dst, words_, num_words_ * sizeof(uint64_t));
dst += num_words_ * sizeof(uint64_t);
}
// Helper lambda to write a uint32_t array with 8-byte aligned padding.
auto write_u32_array = [&](const uint32_t* arr, uint64_t count) {
if (count > 0) {
size_t bytes = count * sizeof(uint32_t);
memcpy(dst, arr, bytes);
size_t padded = (bytes + 7) & ~size_t(7);
if (padded > bytes) {
memset(dst + bytes, 0, padded - bytes);
}
dst += padded;
}
};
// Write rank LUT, select1 hints, select0 hints.
write_u32_array(rank_lut_, num_rank_samples_);
write_u32_array(select1_hints_, num_select1_hints_);
write_u32_array(select0_hints_, num_select0_hints_);
}
Status Bitvector::InitFromData(const char* data, size_t data_size,
size_t* bytes_consumed) {
const char* start = data;
// Read header
if (data_size < 2 * sizeof(uint64_t)) {
return Status::Corruption("Bitvector: insufficient data for header");
}
memcpy(&num_bits_, data, sizeof(uint64_t));
data += sizeof(uint64_t);
memcpy(&num_ones_, data, sizeof(uint64_t));
data += sizeof(uint64_t);
data_size -= 2 * sizeof(uint64_t);
// Validate header fields from untrusted data.
if (num_ones_ > num_bits_) {
return Status::Corruption("Bitvector: num_ones exceeds num_bits");
}
// The rank LUT uses uint32_t entries, so num_bits must fit in uint32_t.
// This also prevents integer overflow in derived size computations below.
if (num_bits_ > UINT32_MAX) {
return Status::Corruption(
"Bitvector: num_bits exceeds uint32_t range for rank LUT");
}
// Compute derived sizes
num_words_ = (num_bits_ + 63) / 64;
num_rank_samples_ = num_bits_ / kBitsPerRankSample + 1;
// Select hints: ceil(count / kOnesPerSelectHint) entries for ones/zeros.
// If there are 0 ones (or 0 zeros), there are 0 hints.
num_select1_hints_ =
(num_ones_ > 0) ? (num_ones_ - 1) / kOnesPerSelectHint + 1 : 0;
uint64_t num_zeros = num_bits_ - num_ones_;
num_select0_hints_ =
(num_zeros > 0) ? (num_zeros - 1) / kOnesPerSelectHint + 1 : 0;
// Read words
size_t words_bytes = num_words_ * sizeof(uint64_t);
if (data_size < words_bytes) {
return Status::Corruption("Bitvector: insufficient data for words");
}
// The data pointer must be 8-byte aligned for safe uint64_t access.
// This is guaranteed by the serialization format (header is 16 bytes,
// and the overall buffer comes from aligned block cache allocations).
if (reinterpret_cast<uintptr_t>(data) % alignof(uint64_t) != 0) {
return Status::Corruption("Bitvector: words data not 8-byte aligned");
}
words_ = reinterpret_cast<const uint64_t*>(data);
data += words_bytes;
data_size -= words_bytes;
// Helper lambda: read a padded uint32_t array from the stream.
auto read_u32_array = [&](const uint32_t** out, uint64_t count,
const char* name) -> Status {
if (count == 0) {
*out = nullptr;
return Status::OK();
}
size_t bytes = count * sizeof(uint32_t);
size_t padded = (bytes + 7) & ~size_t(7);
if (data_size < padded) {
return Status::Corruption(
std::string("Bitvector: insufficient data for ") + name);
}
if (reinterpret_cast<uintptr_t>(data) % alignof(uint32_t) != 0) {
return Status::Corruption(std::string("Bitvector: ") + name +
" not 4-byte aligned");
}
*out = reinterpret_cast<const uint32_t*>(data);
data += padded;
data_size -= padded;
return Status::OK();
};
Status s;
s = read_u32_array(&rank_lut_, num_rank_samples_, "rank LUT");
if (!s.ok()) {
return s;
}
s = read_u32_array(&select1_hints_, num_select1_hints_, "select1 hints");
if (!s.ok()) {
return s;
}
s = read_u32_array(&select0_hints_, num_select0_hints_, "select0 hints");
if (!s.ok()) {
return s;
}
// Validate select hint values. Each hint is a rank LUT index used to
// narrow the FindNthOneBit/FindNthZeroBit search range. An out-of-bounds
// hint would cause rank_lut_[hint] to read past the LUT array.
for (uint64_t i = 0; i < num_select1_hints_; i++) {
if (select1_hints_[i] >= num_rank_samples_) {
return Status::Corruption(
"Bitvector: select1 hint value exceeds rank LUT size");
}
}
for (uint64_t i = 0; i < num_select0_hints_; i++) {
if (select0_hints_[i] >= num_rank_samples_) {
return Status::Corruption(
"Bitvector: select0 hint value exceeds rank LUT size");
}
}
*bytes_consumed = static_cast<size_t>(data - start);
return Status::OK();
}
void Bitvector::BuildFrom(const BitvectorBuilder& builder) {
num_bits_ = builder.NumBits();
num_words_ = (num_bits_ + 63) / 64;
num_rank_samples_ = num_bits_ / kBitsPerRankSample + 1;
// We need to build rank LUT first to know num_ones_, then compute hint
// counts. Two-pass: first allocate for words + rank LUT, build rank LUT
// to get num_ones_, then resize to include select hints.
size_t words_bytes = num_words_ * sizeof(uint64_t);
size_t rank_bytes = num_rank_samples_ * sizeof(uint32_t);
// Phase 1: allocate words + rank LUT.
owned_data_.resize(words_bytes + rank_bytes, '\0');
if (num_words_ > 0) {
memcpy(&owned_data_[0], builder.Words().data(), words_bytes);
}
words_ = reinterpret_cast<const uint64_t*>(owned_data_.data());
rank_lut_ =
reinterpret_cast<const uint32_t*>(owned_data_.data() + words_bytes);
assert(reinterpret_cast<uintptr_t>(rank_lut_) % alignof(uint32_t) == 0);
BuildRankLUT();
// Phase 2: compute hint counts and reallocate to include them.
num_select1_hints_ =
(num_ones_ > 0) ? (num_ones_ - 1) / kOnesPerSelectHint + 1 : 0;
uint64_t num_zeros = num_bits_ - num_ones_;
num_select0_hints_ =
(num_zeros > 0) ? (num_zeros - 1) / kOnesPerSelectHint + 1 : 0;
size_t s1_bytes = num_select1_hints_ * sizeof(uint32_t);
size_t s0_bytes = num_select0_hints_ * sizeof(uint32_t);
owned_data_.resize(words_bytes + rank_bytes + s1_bytes + s0_bytes, '\0');
// Recompute all pointers after resize (string may have reallocated).
RecomputePointers();
assert(reinterpret_cast<uintptr_t>(rank_lut_) % alignof(uint32_t) == 0);
BuildSelectHints();
}
void Bitvector::BuildRankLUT() {
// The rank LUT is mutable during construction, so we const_cast here.
// After BuildRankLUT() completes, rank_lut_ is treated as read-only.
uint32_t* lut = const_cast<uint32_t*>(rank_lut_);
// Guard: uint32_t can hold values up to ~4 billion. The maximum cumulative
// rank is num_ones_ (at most num_bits_), so asserting num_bits_ fits in
// uint32_t guarantees all rank LUT entries fit.
assert(num_bits_ <= UINT32_MAX);
uint64_t cumulative = 0;
for (uint64_t s = 0; s < num_rank_samples_; s++) {
lut[s] = static_cast<uint32_t>(cumulative);
// Sum popcount for the kWordsPerRankSample words in this sample interval.
uint64_t word_start = s * kWordsPerRankSample;
uint64_t word_end = std::min(word_start + kWordsPerRankSample, num_words_);
for (uint64_t w = word_start; w < word_end; w++) {
cumulative += Popcount(words_[w]);
}
}
// Total number of 1-bits is the cumulative count after all words.
num_ones_ = cumulative;
}
void Bitvector::BuildSelectHints() {
// Build select1 hints: for each k, select1_hints_[k] is the rank LUT
// index of the sample interval containing the (k * kOnesPerSelectHint)-th
// 1-bit.
if (num_select1_hints_ > 0) {
uint32_t* hints = const_cast<uint32_t*>(select1_hints_);
uint64_t sample_idx = 0;
for (uint64_t k = 0; k < num_select1_hints_; k++) {
uint64_t target = k * kOnesPerSelectHint;
// Advance sample_idx until rank_lut_[sample_idx+1] > target.
while (sample_idx + 1 < num_rank_samples_ &&
rank_lut_[sample_idx + 1] <= target) {
sample_idx++;
}
hints[k] = static_cast<uint32_t>(sample_idx);
}
}
// Build select0 hints analogously using zero counts.
if (num_select0_hints_ > 0) {
uint32_t* hints = const_cast<uint32_t*>(select0_hints_);
uint64_t sample_idx = 0;
for (uint64_t k = 0; k < num_select0_hints_; k++) {
uint64_t target = k * kOnesPerSelectHint;
while (sample_idx + 1 < num_rank_samples_ &&
((sample_idx + 1) * kBitsPerRankSample -
rank_lut_[sample_idx + 1]) <= target) {
sample_idx++;
}
hints[k] = static_cast<uint32_t>(sample_idx);
}
}
}
// FindNthOneBit is now inline in bitvector.h for hot-path performance.
uint64_t Bitvector::FindNthZeroBit(uint64_t i) const {
uint64_t num_zeros = num_bits_ - num_ones_;
if (i >= num_zeros) {
return num_bits_;
}
// Use select0 hints to narrow the search range.
uint64_t lo;
uint64_t hi;
if (num_select0_hints_ > 0) {
uint64_t hint_idx = i / kOnesPerSelectHint;
lo = select0_hints_[hint_idx];
hi = (hint_idx + 1 < num_select0_hints_) ? select0_hints_[hint_idx + 1]
: num_rank_samples_ - 1;
} else {
lo = 0;
hi = num_rank_samples_ - 1;
}
// Linear scan within the narrowed range.
while (lo < hi) {
uint64_t zeros_next = (lo + 1) * kBitsPerRankSample - rank_lut_[lo + 1];
if (zeros_next <= i) {
lo++;
} else {
break;
}
}
uint64_t zeros_at_lo = lo * kBitsPerRankSample - rank_lut_[lo];
uint64_t remaining = i - zeros_at_lo;
uint64_t word_start = lo * kWordsPerRankSample;
uint64_t word_end = std::min(word_start + kWordsPerRankSample, num_words_);
for (uint64_t w = word_start; w < word_end; w++) {
// For the last word, we may have fewer than 64 valid bits. We need to
// mask off the padding bits so they don't count as zeros.
uint64_t valid_bits = 64;
if (w == num_words_ - 1 && num_bits_ % 64 != 0) {
valid_bits = num_bits_ % 64;
}
uint64_t zeros_in_word = valid_bits - Popcount(words_[w]);
if (remaining < zeros_in_word) {
// The target 0-bit is in this word. Invert to get 1-bits at zero
// positions, mask off padding, then find the n-th set bit in the word.
uint64_t word = ~words_[w];
if (valid_bits < 64) {
word &= (uint64_t(1) << valid_bits) - 1;
}
return w * 64 + FindNthSetBitInWord(word, remaining);
}
remaining -= zeros_in_word;
}
assert(false);
return num_bits_;
}
uint64_t Bitvector::NextSetBit(uint64_t pos) const {
if (pos >= num_bits_) {
return num_bits_;
}
uint64_t word_idx = pos / 64;
uint64_t bit_idx = pos % 64;
// Check remaining bits in the current word.
uint64_t word = words_[word_idx] >> bit_idx;
if (word != 0) {
uint64_t result = pos + Ctz(word);
return (result < num_bits_) ? result : num_bits_;
}
// Scan subsequent words.
for (uint64_t w = word_idx + 1; w < num_words_; w++) {
if (words_[w] != 0) {
uint64_t result = w * 64 + Ctz(words_[w]);
return (result < num_bits_) ? result : num_bits_;
}
}
return num_bits_;
}
uint64_t Bitvector::DistanceToNextSetBit(uint64_t pos) const {
assert(pos < num_bits_);
assert(GetBit(pos)); // pos must be a set bit.
// Find the next set bit after pos (exclusive).
uint64_t next = NextSetBit(pos + 1);
return next - pos;
}
// ============================================================================
// EliasFano implementation
// ============================================================================
void EliasFano::BuildFrom(const uint64_t* values, uint64_t count,
uint64_t universe) {
count_ = count;
universe_ = universe;
if (count == 0) {
low_bits_ = 0;
low_mask_ = 0;
low_words_ = nullptr;
num_low_words_ = 0;
// Build an empty high bitvector so it serializes/deserializes correctly.
// Without this, high_bv_ stays default-constructed (num_rank_samples_=0),
// but InitFromData computes num_rank_samples_=1 for 0-bit bitvectors,
// causing a size mismatch on deserialization.
BitvectorBuilder empty_builder;
high_bv_.BuildFrom(empty_builder);
return;
}
// Compute low_bits = floor(log2(universe / count)).
// When universe <= count, low_bits = 0 (all bits are high).
// FloorLog2 returns floor(log2(x)) for x >= 1.
if (universe > count) {
low_bits_ = static_cast<uint64_t>(FloorLog2(universe / count));
} else {
low_bits_ = 0;
}
low_mask_ =
(low_bits_ < 64) ? ((uint64_t(1) << low_bits_) - 1) : ~uint64_t(0);
// Build high-bits bitvector.
// The high part of each value is (value >> low_bits_). The bitvector
// has a 1-bit at position high[i] + i for each element i, with 0-bits
// filling the gaps. Total length = max_high + count.
uint64_t max_high =
(count > 0 && low_bits_ < 64) ? (values[count - 1] >> low_bits_) : 0;
uint64_t high_len = max_high + count;
BitvectorBuilder high_builder;
high_builder.Reserve(high_len + 1);
uint64_t prev_high = 0;
for (uint64_t i = 0; i < count; i++) {
assert(i == 0 || values[i] >= values[i - 1]); // Must be monotone.
uint64_t high = (low_bits_ < 64) ? (values[i] >> low_bits_) : 0;
// Append (high - prev_high) zeros followed by a 1.
assert(high >= prev_high);
high_builder.AppendMultiple(false, high - prev_high);
high_builder.Append(true);
prev_high = high;
}
high_bv_.BuildFrom(high_builder);
// Build packed low-bits array.
uint64_t total_low_bits = count * low_bits_;
num_low_words_ = (total_low_bits + 63) / 64;
owned_low_data_.resize(num_low_words_ * sizeof(uint64_t), '\0');
uint64_t* low_buf = reinterpret_cast<uint64_t*>(&owned_low_data_[0]);
low_words_ = low_buf;
// Pack low bits: value i's low bits go at bit position i * low_bits_.
for (uint64_t i = 0; i < count; i++) {
uint64_t low = values[i] & low_mask_;
uint64_t bit_pos = i * low_bits_;
uint64_t word_idx = bit_pos / 64;
uint64_t bit_idx = bit_pos % 64;
low_buf[word_idx] |= low << bit_idx;
// Handle word boundary crossing.
if (bit_idx > 0 && bit_idx + low_bits_ > 64) {
low_buf[word_idx + 1] |= low >> (64 - bit_idx);
}
}
}
size_t EliasFano::SerializedSize() const {
// 3 uint64_t header + high bitvector + low words (8-byte aligned).
size_t low_bytes = num_low_words_ * sizeof(uint64_t);
return 3 * sizeof(uint64_t) + high_bv_.SerializedSize() + low_bytes;
}
void EliasFano::EncodeTo(std::string* output) const {
size_t old_size = output->size();
// Header: count, universe, low_bits
PutFixed64(output, count_);
PutFixed64(output, universe_);
PutFixed64(output, low_bits_);
// High bitvector
high_bv_.EncodeTo(output);
// Low words
size_t low_bytes = num_low_words_ * sizeof(uint64_t);
if (low_bytes > 0) {
size_t cur = output->size();
output->resize(cur + low_bytes);
memcpy(&(*output)[cur], low_words_, low_bytes);
}
(void)old_size;
}
Status EliasFano::InitFromData(const char* data, size_t data_size,
size_t* bytes_consumed) {
const char* start = data;
// Read header: count, universe, low_bits.
if (data_size < 3 * sizeof(uint64_t)) {
return Status::Corruption("EliasFano: insufficient data for header");
}
memcpy(&count_, data, sizeof(uint64_t));
data += sizeof(uint64_t);
memcpy(&universe_, data, sizeof(uint64_t));
data += sizeof(uint64_t);
memcpy(&low_bits_, data, sizeof(uint64_t));
data += sizeof(uint64_t);
data_size -= 3 * sizeof(uint64_t);
// Validate header fields from untrusted data.
if (low_bits_ > 63) {
return Status::Corruption("EliasFano: low_bits exceeds 63");
}
// Upper-bound on count_: a sequence with > 2^30 elements would require
// tens of GB of data. This prevents integer overflow in subsequent
// count_ * low_bits_ and count_ * sizeof(...) arithmetic. Block checksums
// are the first line of defense; this is defense-in-depth for the
// deserialization path.
static constexpr uint64_t kMaxReasonableCount = uint64_t(1) << 30;
if (count_ > kMaxReasonableCount) {
return Status::Corruption("EliasFano: count exceeds reasonable limit");
}
low_mask_ =
(low_bits_ < 64) ? ((uint64_t(1) << low_bits_) - 1) : ~uint64_t(0);
// Read high bitvector.
size_t consumed = 0;
Status s = high_bv_.InitFromData(data, data_size, &consumed);
if (!s.ok()) {
return s;
}
data += consumed;
data_size -= consumed;
// Read low words. The multiplication is safe from overflow because
// count_ <= 2^30 and low_bits_ <= 63, so count_ * low_bits_ <= 2^36.
uint64_t total_low_bits = count_ * low_bits_;
num_low_words_ = (total_low_bits + 63) / 64;
size_t low_bytes = num_low_words_ * sizeof(uint64_t);
if (data_size < low_bytes) {
return Status::Corruption("EliasFano: insufficient data for low words");
}
if (low_bytes > 0) {
if (reinterpret_cast<uintptr_t>(data) % alignof(uint64_t) != 0) {
return Status::Corruption("EliasFano: low words not 8-byte aligned");
}
low_words_ = reinterpret_cast<const uint64_t*>(data);
} else {
low_words_ = nullptr;
}
data += low_bytes;
*bytes_consumed = static_cast<size_t>(data - start);
return Status::OK();
}
} // namespace trie_index
} // namespace ROCKSDB_NAMESPACE