Ending

Wow, that was much longer than I expected. Congratulations if you’ve made it this far. Hopefully, you left this post with a bit more knowledge about B-Tree. Despite being half a century old, this data structure still has some intriguing concepts and new developments.

If you have any queries or modification suggestions, please open an issue or PR on GitHub.

References

If you are interested in learning more about B-Tree, here are some resources that helped me tremendously:

  1. Organization and maintenance of large ordered indices (Rudolf Bayer and Edward M. McCreight, 1970): This is the original paper that introduced B-Tree and outlined the variation that we later know as B+ Tree.
  2. Prefix B-trees (Rudolf Bayer and Karl Unterauer, 1977): This paper presented the suffix truncation optimization.
  3. Efficient locking for concurrent operations on B-trees (Philip L. Lehman and Shibing Yao, 1981): This paper introduced the B Link Tree variation.
  4. B-Tree Indexes and CPU Caches (Goetz Graefe and Per-Åke Larson, 2001): This paper discussed techniques to leverage CPU caches to improve B-Tree performance.
  5. Modern B-Tree Techniques (Goetz Graefe, 2010): This book compiled various optimizations and variations of B-Tree. It touched multiple areas, like B-Tree in database transansations context, index creation/delete, or query procressing.
  6. Database Internals: A Deep-Dive Into How Distributed Data Systems Work (Alex Petrov, 2019): This book has few excellent chapters talking about specific implementation details of B-Tree. Highly recommend this.
  7. Postgres mailing list: I found a ton of hidden gems regarding B-Tree.

[This Space Intentionally Left Blank]

The bottom of every page is padded so readers can maintain a consistent eyeline.