LevelDB Explained - How To Read and Write WAL Logs

LevelDB uses Write-Ahead Logging (WAL) to ensure data durability. When a write operation occurs, LevelDB first writes the data to the log file, and then applies it to the in-memory data structure (such as MemTable). When the system or database restarts after a crash, LevelDB checks the records in the WAL log file. By reading and replaying these log records, LevelDB can rebuild the data state that had not been fully written to disk when the crash occurred.

LevelDB WAL Log Writing Process

The overall WAL log-related operation process is as follows:

  1. LevelDB first writes the data to the WAL log. This ensures that the data won’t be lost even in the event of a system crash.
  2. The data is written to the MemTable in memory, which is a fast memory operation.
  3. LevelDB confirms the write completion to the client.
  4. Over time, when the MemTable is full, it is flushed to SSTable files on disk.
  5. Once the MemTable has been successfully flushed to SSTable, the corresponding WAL log can be cleared.

Let’s take a detailed look at the implementation.

Read More

LevelDB Explained - Understanding Advanced C++ Techniques

The overall code of LevelDB is quite understandable, without using many esoteric C++ techniques. However, there are some implementations that are relatively uncommon, such as flexible arrays, symbol exporting for linking, and Pimpl class design. This article will review these advanced C++ techniques to help better understand the implementation of LevelDB.

Flexible Arrays

In the LRUHandle structure definition in util/cache.cc, there’s a flexible array member char key_data[1], used to implement variable-length data structures in C/C++.

1
2
3
4
5
6
7
8
9
struct LRUHandle {
// ...
char key_data[1]; // Beginning of key

Slice key() const {
assert(next != this);
return Slice(key_data, key_length);
}
};

Read More

LevelDB Explained - Bloom Filter Implementation and Visualization

In LevelDB, data is stored in SSTable files. When using Get() to query a key, it may be necessary to read multiple blocks from the SST file. To reduce disk reads, LevelDB provides a FilterPolicy strategy. If it can determine that a key is not in the current SSTable file, it can skip reading that file, thus improving query efficiency.

LevelDB supports user-defined filter policies but provides a default Bloom filter implementation. A Bloom filter is a space-efficient data structure used to determine whether an element is a member of a set. It has a certain false positive rate but no false negatives. In simple terms, if a Bloom filter determines that an element does not exist, then the element definitely does not exist; if a Bloom filter determines that an element exists, then the element may not exist.

Read More