Bloom Filter Read

db_bench starts in the main() function of db_bench.cc, parses its command-line parameters with sscanf, and then calls the benchmark.Run() member function.
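
The parameter parsing can be sketched as follows. This is a minimal illustration, not the db_bench source: the `Flags` struct and the `ParseFlag()` helper are hypothetical names, and only three flags are shown. The `%d%c` pattern rejects trailing garbage, because sscanf returns 1 only when the number is followed by nothing.

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>
#include <string>

// Hypothetical stand-in for the global flag variables of a benchmark tool.
struct Flags {
  std::string benchmarks = "readrandom";
  int threads = 1;
  int bloom_bits = -1;
};

// Parses one "--name=value" argument. Returns false if it is not recognized.
bool ParseFlag(const char* arg, Flags* flags) {
  assert(arg != nullptr && flags != nullptr);
  static const char kBenchmarks[] = "--benchmarks=";
  const std::size_t prefix_len = sizeof(kBenchmarks) - 1;
  int n;
  char junk;  // Only filled when extra characters follow the number.

  if (std::strncmp(arg, kBenchmarks, prefix_len) == 0) {
    flags->benchmarks = arg + prefix_len;
    return true;
  }
  if (std::sscanf(arg, "--threads=%d%c", &n, &junk) == 1) {
    flags->threads = n;
    return true;
  }
  if (std::sscanf(arg, "--bloom_bits=%d%c", &n, &junk) == 1) {
    flags->bloom_bits = n;
    return true;
  }
  return false;
}
```

In a main() function, this helper would be called once for each argv[i] with i >= 1 before the benchmark object is created.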

Run() first calls Open(), which opens the database used by the write process, and then calls RunBenchmark(), which is where the read process starts.

The RunBenchmark() function takes three parameters: num_threads, name, and method. num_threads is the number of threads to run and can be set with the --threads parameter when starting db_bench.

Next, name is a string variable that identifies the benchmark to run. The benchmarks are entered as a comma-separated list in the parameters, and name holds one entry from that list. method is a pointer to the benchmark member function that corresponds to name.
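
The mapping from a benchmark name to a member-function pointer can be sketched like this. It is a simplified model under stated assumptions: the `Benchmark` class below is a toy with two placeholder methods that only record that they ran, and the `log()` accessor is hypothetical and exists only so the behavior can be observed.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

class Benchmark {
 public:
  // Pointer-to-member-function type used to select a benchmark at run time.
  using Method = void (Benchmark::*)();

  void ReadRandom() { log_.push_back("ReadRandom"); }
  void ReadHot() { log_.push_back("ReadHot"); }

  // Splits "a,b,c" on commas and runs the method registered for each name.
  // Unknown names are skipped.
  void Run(const std::string& benchmarks) {
    std::size_t start = 0;
    while (start <= benchmarks.size()) {
      std::size_t comma = benchmarks.find(',', start);
      if (comma == std::string::npos) comma = benchmarks.size();
      const std::string name = benchmarks.substr(start, comma - start);
      start = comma + 1;

      Method method = nullptr;
      if (name == "readrandom") {
        method = &Benchmark::ReadRandom;
      } else if (name == "readhot") {
        method = &Benchmark::ReadHot;
      }
      if (method != nullptr) {
        assert(!name.empty());
        (this->*method)();  // Call through the member-function pointer.
      }
    }
  }

  const std::vector<std::string>& log() const { return log_; }

 private:
  std::vector<std::string> log_;
};
```

Selecting the method by pointer keeps the thread-launch code generic: it only needs to know how to call "some benchmark method", not which one.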

After that, RunBenchmark() stores method in each element of the arg[] array, one element per thread. StartThread() is then called with the ThreadBody() function and a pointer to that element as arguments. ThreadBody() runs the given benchmark on the thread that was started for it.
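
The hand-off from RunBenchmark() to ThreadBody() can be sketched as below. This is a simplified model, not the db_bench code: it uses std::thread in place of the StartThread() call, the `Benchmark` struct is a toy whose method only counts calls, and the shared start/stop synchronization that a real benchmark needs is left out.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Toy benchmark object: each call to ReadRandom() just bumps a counter.
struct Benchmark {
  std::atomic<int> reads{0};
  void ReadRandom(int thread_id) {
    (void)thread_id;
    reads.fetch_add(1);
  }
};

// Per-thread argument: which object, which method, which thread.
struct ThreadArg {
  Benchmark* bm;
  void (Benchmark::*method)(int);
  int thread_id;
};

// Entry point with the void* signature that a StartThread-style API expects.
void ThreadBody(void* v) {
  ThreadArg* arg = static_cast<ThreadArg*>(v);
  (arg->bm->*(arg->method))(arg->thread_id);
}

// Runs `method` once on each of n threads and waits for all of them.
void RunBenchmark(Benchmark* bm, int n, void (Benchmark::*method)(int)) {
  assert(bm != nullptr && n > 0);
  std::vector<ThreadArg> arg(n);
  std::vector<std::thread> threads;
  for (int i = 0; i < n; i++) {
    arg[i] = ThreadArg{bm, method, i};
    threads.emplace_back(ThreadBody, static_cast<void*>(&arg[i]));
  }
  for (std::thread& t : threads) t.join();
}
```

The arg vector is fully sized before any thread starts, so the pointers handed to the threads stay valid until join() returns.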

The code flow of the benchmark is as follows. Starting from a benchmark function such as ReadRandom() or ReadHot(), a chain of Get() functions narrows the search range step by step: from the database to the sstables, and then to a level, a table, its filter block, and finally the bloom filter.

Each benchmark function first does its own benchmark-specific work and then calls db->Get(). Here, db is the pointer to the database that DB::Open() returned when the database was opened during the write process.

The DBImpl::Get() function searches the memtable, the immutable memtable, and the sstables, in that order. The function used to search the sstables is Version::Get(). (Note that the bloom filter is used only for sstables.)
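
The lookup order can be illustrated with a toy model. This is only a sketch: `ToyDB` is a hypothetical type, each layer is represented by a std::map, and sequence numbers, deletions, and locking are ignored. The `source` output parameter is added here only to show which layer answered.

```cpp
#include <cassert>
#include <map>
#include <string>

using Layer = std::map<std::string, std::string>;

struct ToyDB {
  Layer mem;       // Active memtable: newest writes.
  Layer imm;       // Immutable memtable: waiting to be flushed.
  Layer sstables;  // Stand-in for all on-disk tables.

  // Searches the layers from newest to oldest and stops at the first hit.
  bool Get(const std::string& key, std::string* value,
           std::string* source) const {
    assert(value != nullptr && source != nullptr);
    const Layer* layers[] = {&mem, &imm, &sstables};
    const char* names[] = {"memtable", "immutable memtable", "sstable"};
    for (int i = 0; i < 3; i++) {
      Layer::const_iterator it = layers[i]->find(key);
      if (it != layers[i]->end()) {
        *value = it->second;
        *source = names[i];
        return true;
      }
    }
    return false;
  }
};
```

Because newer layers are checked first, a key that was overwritten recently is answered from memory and the sstable search (and therefore the bloom filter) is never reached for it.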

The Version::Get() function calls ForEachOverlapping() with the Match() function as an argument. ForEachOverlapping() first searches the sstables in level 0 and then the deeper levels. For each candidate table, Match() checks whether the key exists in that table and returns a boolean value based on the result, which tells ForEachOverlapping() whether to keep searching.
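
A callback-driven search of this shape can be sketched as follows. The types and helper names are assumptions made for illustration: `FileMeta`, `SearchState`, and `RecordMatch()` are hypothetical, keys are compared as plain strings, and deeper levels are scanned linearly instead of with a binary search.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct FileMeta {
  int number;            // Higher number means a newer file.
  std::string smallest;  // Smallest key in the file.
  std::string largest;   // Largest key in the file.
};

// Callback type: return true to keep searching, false to stop.
typedef bool (*MatchFn)(void* arg, int level, const FileMeta& f);

static bool Covers(const FileMeta& f, const std::string& key) {
  return key >= f.smallest && key <= f.largest;
}

// levels[0] may hold overlapping files; deeper levels are assumed disjoint.
void ForEachOverlapping(const std::vector<std::vector<FileMeta>>& levels,
                        const std::string& key, void* arg, MatchFn func) {
  assert(func != nullptr);
  if (levels.empty()) return;

  // Level 0: visit every file that covers the key, newest first.
  std::vector<const FileMeta*> candidates;
  for (const FileMeta& f : levels[0]) {
    if (Covers(f, key)) candidates.push_back(&f);
  }
  std::sort(candidates.begin(), candidates.end(),
            [](const FileMeta* a, const FileMeta* b) {
              return a->number > b->number;
            });
  for (const FileMeta* f : candidates) {
    if (!func(arg, 0, *f)) return;
  }

  // Deeper levels: at most one file per level can cover the key.
  for (std::size_t level = 1; level < levels.size(); level++) {
    for (const FileMeta& f : levels[level]) {
      if (Covers(f, key)) {
        if (!func(arg, static_cast<int>(level), f)) return;
        break;
      }
    }
  }
}

// Example callback state: records visited files, stops at file `stop_at`.
struct SearchState {
  std::vector<int> visited;
  int stop_at;
};

bool RecordMatch(void* arg, int /*level*/, const FileMeta& f) {
  SearchState* state = static_cast<SearchState*>(arg);
  state->visited.push_back(f.number);
  return f.number != state->stop_at;
}
```

Level 0 needs the newest-first ordering because its files can overlap, so the same key may appear in several of them and the most recent version must win.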

The TableCache::Get() function called by Match() searches the table. Then FilterBlockReader::KeyMayMatch() searches the filter block, and BloomFilterPolicy::KeyMayMatch() searches the bloom filter to check whether the key may exist. A bloom filter can report a key that is not actually stored (a false positive), but it never misses a key that is stored. The InternalGet() function, which sits in the middle of this chain, proceeds with the read only if the filter reports that the key may exist.
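
The filter check and the read it guards can be sketched as follows. This is an illustrative toy, not the LevelDB implementation: `ToyBloomFilter` and `ToyTable` are hypothetical types, FNV-1a is used as a stand-in hash, the filter is a plain bit vector rather than an encoded filter block, and the `data_reads` counter exists only to show when the read is skipped.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// FNV-1a: a stand-in hash for this sketch.
inline std::uint32_t ToyHash(const std::string& key) {
  std::uint32_t h = 2166136261u;
  for (unsigned char c : key) {
    h ^= c;
    h *= 16777619u;
  }
  return h;
}

class ToyBloomFilter {
 public:
  ToyBloomFilter(std::size_t num_keys, int bits_per_key)
      : bits_(std::max<std::size_t>(64, num_keys * bits_per_key), false),
        // About bits_per_key * ln(2) probes, clamped to [1, 30].
        probes_(std::min(30, std::max(1, static_cast<int>(bits_per_key * 0.69)))) {
    assert(bits_per_key > 0);
  }

  void Add(const std::string& key) {
    std::uint32_t h = ToyHash(key);
    const std::uint32_t delta = (h >> 17) | (h << 15);  // Double hashing.
    for (int j = 0; j < probes_; j++) {
      bits_[h % bits_.size()] = true;
      h += delta;
    }
  }

  // false: the key is definitely absent. true: the key may be present.
  bool KeyMayMatch(const std::string& key) const {
    std::uint32_t h = ToyHash(key);
    const std::uint32_t delta = (h >> 17) | (h << 15);
    for (int j = 0; j < probes_; j++) {
      if (!bits_[h % bits_.size()]) return false;
      h += delta;
    }
    return true;
  }

 private:
  std::vector<bool> bits_;
  int probes_;
};

struct ToyTable {
  ToyBloomFilter filter;
  std::map<std::string, std::string> data;  // Stand-in for data blocks.
  int data_reads = 0;                       // Counts simulated block reads.

  ToyTable(std::size_t num_keys, int bits_per_key)
      : filter(num_keys, bits_per_key) {}

  void Put(const std::string& key, const std::string& value) {
    filter.Add(key);
    data[key] = value;
  }

  // Consults the filter first and reads the data only on "may match".
  bool InternalGet(const std::string& key, std::string* value) {
    if (!filter.KeyMayMatch(key)) return false;  // Skip the read entirely.
    ++data_reads;
    std::map<std::string, std::string>::const_iterator it = data.find(key);
    if (it == data.end()) return false;  // The filter gave a false positive.
    *value = it->second;
    return true;
  }
};
```

The benefit shows up in data_reads: a lookup that the filter rejects costs a few bit tests and no block read, which is what makes lookups for absent keys cheap.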