LevelDB Explained - Static Thread Safety Analysis with Clang

LevelDB has some interesting macros that I rarely use in my daily coding. These macros are defined in thread_annotations.h and can be used to detect potential thread safety issues at compile time using Clang’s thread safety analysis tool.

Clang's Thread Safety Analysis Tool

Read More

LevelDB Explained - How to Design a High-Performance HashTable

Hash tables are a classic data structure that anyone who has written code should be familiar with. Every programming language has its own hash table implementation that’s ready to use out of the box. As a result, while many people have used hash tables, probably not many have implemented one themselves.

Designing a high-performance hash table actually requires considering quite a few details, such as how to handle hash collisions and how to handle hash table expansion. Some mature hash table implementations, like the one in the C++ standard library, have a large codebase and can be difficult to understand.

Fortunately, when implementing LRU Cache, LevelDB also implemented a simple and efficient hash table. The overall code is very concise - small but complete, making it very worth studying. Using LevelDB’s hash table implementation as an example, this article analyzes how to design a high-performance hash table.

Read More

LevelDB Explained - How to Analyze the Time Complexity of SkipLists?

In the previous article LevelDB Explained - How to implement SkipList, we analyzed in detail the implementation of skip lists in LevelDB. Then in LevelDB Explained - How to Test Parallel Read and Write of SkipLists?, we analyzed the test code for LevelDB skip lists. One question remains: how do we analyze the time complexity of skip lists?

After analyzing the time complexity of skip lists, we can understand the choice of probability value and maximum height in LevelDB, as well as why Redis chooses a different maximum height. Finally, this article will also provide a simple benchmark code to examine the performance of skip lists.

This article will not involve very advanced mathematical knowledge, only simple probability theory, so you can read on without worry. The performance analysis of skip lists has several approaches worth learning from. I hope this article can serve as a starting point and bring some inspiration to everyone.

Breaking Down Skip List Performance Analysis

Knowing the principles and implementation of LevelDB, we can deduce that in extreme cases, where the height of each node is 1, the time complexity of search, insertion, and deletion operations of the skip list will degrade to O(n). In this case, the performance is considerably worse than balanced trees. Of course, due to the randomness involved, no input sequence can consistently lead to the worst performance.

So, how about the average performance of skip lists? We’ve previously stated the conclusion that it’s similar to the average performance of balanced trees. Introducing a simple random height can ensure that the average performance of skip lists is comparable to balanced trees. Is there any analysis method behind this that can analyze the performance of skip lists?

Read More