Bloom Filter Read
db_bench starts in the main function of db_bench.cc, parses its command-line parameters with sscanf, and then calls the member function benchmark.Run(). Run() first calls Open(), which opens the database, and then calls RunBenchmark(), which runs each requested benchmark, including the read benchmarks traced here.
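The parameter parsing step can be sketched as follows. This is a minimal illustration, not the code in db_bench.cc: the BenchFlags struct and the ParseFlag() helper are hypothetical names, and only two flags are handled. It shows the sscanf idiom of appending %c to the format so that trailing garbage after the number is rejected.

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>
#include <string>

// Hypothetical container for the two flags parsed below.
struct BenchFlags {
  int threads = 1;
  std::string benchmarks = "readrandom";
};

// Parses one command-line argument: a prefix check for the string flag
// and sscanf for the integer flag. Returns false if the argument is not
// recognized or is malformed.
bool ParseFlag(const char* arg, BenchFlags* flags) {
  int n;
  char junk;
  const char* kBenchPrefix = "--benchmarks=";
  const size_t prefix_len = std::strlen(kBenchPrefix);
  if (std::strncmp(arg, kBenchPrefix, prefix_len) == 0) {
    flags->benchmarks = arg + prefix_len;
    return true;
  }
  // The trailing %c catches characters after the number: sscanf returns
  // exactly 1 only when the integer is followed by nothing else.
  if (std::sscanf(arg, "--threads=%d%c", &n, &junk) == 1) {
    flags->threads = n;
    return true;
  }
  return false;
}
```

Calling ParseFlag() once per element of argv fills the struct; an argument such as --threads=4x is rejected because sscanf then converts two items instead of one.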
RunBenchmark() takes three parameters: num_threads, name, and method. num_threads is the number of threads to run and can be set with a parameter (the --threads flag) when starting db_bench. name is a string that identifies the benchmark to run; it is one entry from the comma-separated list of benchmark names given in the parameters. method is a pointer that holds the address of the benchmark function corresponding to name.
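How a comma-separated list of names can be turned into calls through a function pointer is sketched below. The Benchmark class here is a hypothetical stand-in with empty benchmark bodies and a log vector added for inspection; it runs everything on one thread and is not the real db_bench code.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the Benchmark class in db_bench.cc.
class Benchmark {
 public:
  // Pointer to a benchmark member function.
  typedef void (Benchmark::*Method)();

  void ReadRandom() { log.push_back("ReadRandom"); }
  void ReadHot() { log.push_back("ReadHot"); }

  // Walks the comma-separated list, resolves each name to a method,
  // and runs it. Unknown names are skipped.
  void Run(const std::string& benchmarks, int num_threads) {
    size_t start = 0;
    while (start <= benchmarks.size()) {
      size_t sep = benchmarks.find(',', start);
      if (sep == std::string::npos) sep = benchmarks.size();
      const std::string name = benchmarks.substr(start, sep - start);
      start = sep + 1;

      Method method = nullptr;
      if (name == "readrandom") {
        method = &Benchmark::ReadRandom;
      } else if (name == "readhot") {
        method = &Benchmark::ReadHot;
      }
      if (method != nullptr) RunBenchmark(num_threads, name, method);
    }
  }

  std::vector<std::string> log;  // records which methods ran

 private:
  // Single-threaded here: calls the method once per requested thread.
  void RunBenchmark(int n, const std::string& name, Method method) {
    (void)name;  // the name is only needed for reporting
    for (int i = 0; i < n; i++) (this->*method)();
  }
};
```

Passing the method as a pointer keeps RunBenchmark() independent of which benchmark it runs, which is why one function can serve every benchmark name.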
Next, RunBenchmark() stores method in each element of the arg[] array, one element per thread. It then calls StartThread() with the ThreadBody() function and the corresponding arg[] element as arguments. ThreadBody() runs the given benchmark on the thread it was started on.
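The hand-off from RunBenchmark() to ThreadBody() might look roughly like this. It is a sketch under assumptions: std::thread replaces StartThread(), the ThreadArg struct and the hits counter are hypothetical, and the synchronization that a real benchmark needs for timing is left out.

```cpp
#include <cassert>
#include <thread>
#include <vector>

class Benchmark;

// Hypothetical per-thread argument: which object, which method, which thread.
struct ThreadArg {
  Benchmark* bm;
  void (Benchmark::*method)(int thread_id);
  int thread_id;
};

class Benchmark {
 public:
  explicit Benchmark(int n) : hits(n, 0) {}

  // Placeholder benchmark body: each thread touches only its own slot.
  void ReadRandom(int thread_id) { hits[thread_id]++; }

  // Entry point for each thread: unpack the argument and call the method.
  static void ThreadBody(void* v) {
    ThreadArg* arg = static_cast<ThreadArg*>(v);
    (arg->bm->*(arg->method))(arg->thread_id);
  }

  void RunBenchmark(int n, void (Benchmark::*method)(int)) {
    std::vector<ThreadArg> arg(n);  // sized up front so addresses stay valid
    std::vector<std::thread> threads;
    for (int i = 0; i < n; i++) {
      arg[i].bm = this;
      arg[i].method = method;
      arg[i].thread_id = i;
      threads.emplace_back(ThreadBody, &arg[i]);
    }
    for (std::thread& t : threads) t.join();
  }

  std::vector<int> hits;  // one counter per thread
};
```

The thread entry point receives only an untyped pointer, so everything the thread needs, including the method pointer, has to travel inside the arg element.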
The benchmark code flows as follows: starting from a benchmark function such as ReadRandom() or ReadHot(), a chain of Get() calls gradually narrows the search from the database to the SSTables, and then to each level, table, filter block, and finally the bloom filter.
Each benchmark function performs its own benchmark-specific work and then calls db->Get(). Here, db is the pointer to the database that was obtained from DB::Open() when the database was opened in Open().
DBImpl::Get() searches the memtable, the immutable memtable, and the SSTables, in that order. The SSTable search is done by Version::Get(). (Note that the bloom filter is used only for SSTables.)
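The lookup order can be illustrated with a toy model. The MiniDB struct below is hypothetical: each store is a plain std::map, and all SSTables are collapsed into a single map, so it shows only the order of the search, not how any of the stores is implemented.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical in-memory stand-ins for the three places a key can live.
typedef std::map<std::string, std::string> Store;

struct MiniDB {
  Store mem;       // active memtable (newest writes)
  Store imm;       // immutable memtable waiting to be flushed
  Store sstables;  // on-disk tables, collapsed into one map here

  // Returns true and fills *value from the first store that holds the key.
  // *source reports which store answered.
  bool Get(const std::string& key, std::string* value,
           std::string* source) const {
    const Store* stores[] = {&mem, &imm, &sstables};
    const char* names[] = {"memtable", "immutable memtable", "sstable"};
    for (int i = 0; i < 3; i++) {
      Store::const_iterator it = stores[i]->find(key);
      if (it != stores[i]->end()) {
        *value = it->second;
        *source = names[i];
        return true;
      }
    }
    return false;
  }
};
```

Because newer data always sits in an earlier store, returning on the first hit yields the most recent value for the key.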
Version::Get() calls ForEachOverlapping() with the Match() function as an argument. ForEachOverlapping() first searches the level-0 SSTables and then the other levels. For each candidate table, Match() checks whether the key exists in that table and returns a boolean based on the result; this value tells ForEachOverlapping() whether to continue with the next table.
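A simplified version of this callback-driven search is sketched below. The FileMeta struct, the MatchFn signature, and the RecordAndStopAt7() callback are assumptions made for illustration. For brevity the sketch scans every level linearly, whereas a real implementation can locate the single candidate file in a deeper level by binary search.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical file metadata: each table covers keys in [smallest, largest].
struct FileMeta {
  int number;  // larger number = newer file
  std::string smallest, largest;
};

// Callback type: return true to keep searching, false to stop.
typedef bool (*MatchFn)(void* arg, int level, const FileMeta& f);

// Calls func for every file whose range may contain key: level 0 first
// (newest file first, because level-0 ranges can overlap), then deeper
// levels. Stops as soon as func returns false.
void ForEachOverlapping(const std::vector<std::vector<FileMeta> >& levels,
                        const std::string& key, void* arg, MatchFn func) {
  for (size_t level = 0; level < levels.size(); level++) {
    std::vector<FileMeta> hits;
    for (const FileMeta& f : levels[level]) {
      if (key >= f.smallest && key <= f.largest) hits.push_back(f);
    }
    if (level == 0) {
      std::sort(hits.begin(), hits.end(),
                [](const FileMeta& a, const FileMeta& b) {
                  return a.number > b.number;
                });
    }
    for (const FileMeta& f : hits) {
      if (!func(arg, static_cast<int>(level), f)) return;
    }
  }
}

// Example callback: records visited file numbers and stops at file 7.
bool RecordAndStopAt7(void* arg, int /*level*/, const FileMeta& f) {
  static_cast<std::vector<int>*>(arg)->push_back(f.number);
  return f.number != 7;
}
```

Level 0 is visited newest-first because its files may hold several versions of the same key; in deeper levels the key ranges do not overlap, so at most one file per level qualifies.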
Match() calls TableCache::Get() to search the table. From there, FilterBlockReader::KeyMayMatch() searches the filter block, and BloomFilterPolicy::KeyMayMatch() checks the bloom filter to decide whether the key may exist. InternalGet(), which sits between TableCache::Get() and the filter check, proceeds with reading the data only if the filter reports that the key may be present.
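The final check can be sketched with a small bloom filter. This is a minimal illustration, not the BloomFilterPolicy code: the MiniBloom class is hypothetical, FNV-1a is used as a stand-in hash, and the bit array is a std::vector<bool> rather than a serialized filter block. It does use the common double-hashing trick of deriving all probe positions from one hash value. The number of bits must be greater than zero.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// FNV-1a, used here only as a simple stand-in hash (an assumption).
static uint32_t HashKey(const std::string& key) {
  uint32_t h = 2166136261u;
  for (unsigned char c : key) {
    h ^= c;
    h *= 16777619u;
  }
  return h;
}

class MiniBloom {
 public:
  MiniBloom(size_t bits, int probes) : bits_(bits, false), k_(probes) {}

  // Sets k_ bit positions derived from one hash of the key.
  void Add(const std::string& key) {
    uint32_t h = HashKey(key);
    const uint32_t delta = (h >> 17) | (h << 15);  // rotate right 17 bits
    for (int i = 0; i < k_; i++) {
      bits_[h % bits_.size()] = true;
      h += delta;
    }
  }

  // false: the key is definitely absent, so the data read can be skipped.
  // true: the key may be present (false positives are possible).
  bool KeyMayMatch(const std::string& key) const {
    uint32_t h = HashKey(key);
    const uint32_t delta = (h >> 17) | (h << 15);
    for (int i = 0; i < k_; i++) {
      if (!bits_[h % bits_.size()]) return false;
      h += delta;
    }
    return true;
  }

 private:
  std::vector<bool> bits_;
  int k_;
};
```

A false result is what lets the read path skip the data block entirely; a true result still requires the block to be read, since the filter can report keys that were never added.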