LevelDB Explained - Understanding Advanced C++ Techniques

The overall code of LevelDB is quite understandable, without using many esoteric C++ techniques. However, there are some implementations that are relatively uncommon, such as flexible arrays, symbol exporting for linking, and Pimpl class design. This article will review these advanced C++ techniques to help better understand the implementation of LevelDB.

Flexible Arrays

In the LRUHandle structure definition in util/cache.cc, there’s a flexible array member char key_data[1], used to implement variable-length data structures in C/C++.

1
2
3
4
5
6
7
8
9
struct LRUHandle {
// ...
char key_data[1]; // Beginning of key

Slice key() const {
assert(next != this);
return Slice(key_data, key_length);
}
};

Read More

LevelDB Explained - Bloom Filter Implementation and Visualization

In LevelDB, data is stored in SSTable files. When using Get() to query a key, it may be necessary to read multiple blocks from the SST file. To reduce disk reads, LevelDB provides a FilterPolicy strategy. If it can determine that a key is not in the current SSTable file, it can skip reading that file, thus improving query efficiency.

LevelDB supports user-defined filter policies but provides a default Bloom filter implementation. A Bloom filter is a space-efficient data structure used to determine whether an element is a member of a set. It has a certain false positive rate but no false negatives. In simple terms, if a Bloom filter determines that an element does not exist, then the element definitely does not exist; if a Bloom filter determines that an element exists, then the element may not exist.

Read More

LevelDB Explained - Preparing the Development Environment

LevelDB is an excellent LSM Tree storage component developed in C++. Although its overall codebase is not large, its design is ingenious and worth studying. During the process of reading the source code, I have compiled a series of articles to gradually break down the implementation details of LevelDB. However, before reading the code, it’s best to prepare the entire development environment.

This article will start from the most basic step of pulling the code, recording my process of preparing the entire environment, including configuring the VSCode IDE and using the clangd plugin, as well as how to configure compilation options. Then, through a simple read and write code demo, we’ll briefly use LevelDB to gain a perceptual understanding of this library. Additionally, we’ll introduce how to run test cases. LevelDB’s test cases are well-written, and during the code reading process, we can use these cases to better understand the code.

Read More