www.youtube.com/watch?v=syPE...
You're probably sick of me saying "B-tree" but these impact SO MUCH of database performance. They're used all over the place in Postgres, MySQL, and SQLite.
This week I broke down B-tree lookups and how the page cache makes lookups faster.
Choose your storage layer carefully.
Good catch. Fixed.
Blood, sweat, and tears
But seriously, these were JS + GSAP, and in this case Cursor helped quite a bit too.
Introducing pg_strict for Postgres.
Our new extension adds a safety net to Postgres, catching dangerous queries before they run.
www.youtube.com/watch?v=noPn...
If databases fascinate you like they do me, this article's for you!
Every time you interact with a website, database transactions are keeping your data consistent, safe, and isolated.
I wrote an interactive guide to how they work ⬇️
Tuning your database just right can be counter-intuitive, unless you understand all levels of the system.
Intuitively, most would say "more work_mem = better" for building indexes, but this hurts performance due to L3 cache behavior.
Great article by Tomas Vondra.
vondra.me/posts/dont-g...
A key difference from B-trees: searching for a single rectangle may require descending multiple tree paths! B-trees offer O(log n) search in the ideal case, but because R-tree bounding rectangles can overlap, worst-case search degrades to O(n).
Moving up the tree, the bounding rectangles get larger and larger, up to the root node storing a small number of large bounding boxes.
Entries in an R-tree are bounding rectangles. At the leaves we store the minimum bounding rectangle (MBR) for each region being stored, with a reference to the full geometry stored on-disk elsewhere. The parent entry of a leaf MBR stores a bounding rectangle that fully bounds all children.
They function similarly to B-trees: a tree structure with multiple entries per page-aligned node. This generally keeps the tree nice and shallow, and allows for efficient lookups across millions of elements stored on-disk. Like B+trees, they generally store data values only at the leaves.
R-trees are a powerful structure for indexing geometric data.
They're used by MySQL, and Postgres uses an R-tree-like structure via GiST in PostGIS.
🧵
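The overlap-driven search described in this thread can be sketched in a few lines. This is not from any post above — just a minimal Python sketch with a hand-built tree and hypothetical data values ("park", "lake", "tower"); a real R-tree would also handle inserts and node splits.

```python
def intersects(a, b):
    """Axis-aligned overlap test; rectangles are (xmin, ymin, xmax, ymax)."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

class Node:
    def __init__(self, mbr, children=(), data=None):
        self.mbr = mbr              # bounding rectangle of everything below
        self.children = list(children)
        self.data = data            # geometry reference, leaves only

def search(node, query):
    """Descend EVERY child whose MBR overlaps the query rectangle.
    Unlike a B-tree lookup, several subtrees may qualify."""
    if not intersects(node.mbr, query):
        return []
    if node.data is not None:
        return [node.data]
    hits = []
    for child in node.children:
        hits.extend(search(child, query))
    return hits

# Two leaf MBRs overlap, so one point query can hit both paths.
park = Node((0, 0, 2, 2), data="park")
lake = Node((1, 1, 3, 3), data="lake")
tower = Node((8, 8, 9, 9), data="tower")
root = Node((0, 0, 9, 9), children=[
    Node((0, 0, 3, 3), children=[park, lake]),
    Node((8, 8, 9, 9), children=[tower]),
])

print(search(root, (1.5, 1.5, 1.6, 1.6)))  # both "park" and "lake" overlap
```

Note how the query rectangle falls inside two overlapping leaf MBRs, forcing the multi-path descent the thread describes.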
TLDR: io_uring won't help much if treated as a drop-in replacement for existing database I/O architectures. Better performance usually requires architectural changes, but when those changes are made, there are big performance gains to be had.
Here's the paper: arxiv.org/pdf/2512.04859
I'm excited about the database performance io_uring will unlock.
Last year I benchmarked Postgres 17 vs 18 to test the initial io_uring upgrades. I was surprised to see they weren't always a clear win for TPC-C.
This paper studies the potential, and the future looks good.
At PlanetScale we observe this first-hand for both MySQL and Postgres. We then get to go tackle these hard-but-fun engineering problems, so our customers don't have to.
Software at scale reveals the cracks.
Managing a system for a single use-case (databases or otherwise) can make it seem like a perfect solution. It just might be for that narrow environment!
At scale you see all the edge cases because you're operating on so many workloads.
This makes for fast change detection (O(log n)), and saves bandwidth since we only need to re-sync files that we know have been modified.
When syncing local file changes with a remote server (like Git) we can quickly tell if changes were made by comparing the client's root hash to the server's. If they differ, the tree is navigated to find the leaf node(s) with changes, and only those files need to be re-synced.
We then build a tree mirroring the directory structure. A parent node's hash is the hash of the concatenation of its children's hashes, so every inner node's hash depends on all of its descendants. This culminates in the root node, whose hash covers ALL the tracked source files.
What do Git, Cursor, and Dynamo have in common?
Merkle trees!
A great data structure for tracking file changes, facilitating incremental sync with remote servers.
Say we want to track changes to a codebase at a per-file level. We compute a hash for each source file, and these become leaf nodes.
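The leaf-and-root construction from this thread fits in a few lines. Not from Git or Dynamo themselves — a minimal Python sketch with hypothetical file contents, pairing child hashes upward until one root remains:

```python
import hashlib

def sha(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

def merkle_root(leaf_hashes):
    """Hash pairs of children upward until a single root hash remains."""
    level = list(leaf_hashes)
    while len(level) > 1:
        pairs = [level[i:i + 2] for i in range(0, len(level), 2)]
        level = [sha("".join(pair).encode()) for pair in pairs]
    return level[0]

# Hypothetical tracked files; each leaf is a per-file content hash.
files = {"a.py": b"print('a')", "b.py": b"print('b')", "c.py": b"print('c')"}
leaves = [sha(files[name]) for name in sorted(files)]
root_before = merkle_root(leaves)

# Edit one file: its leaf hash changes, so the root changes too.
files["b.py"] = b"print('B')"
leaves = [sha(files[name]) for name in sorted(files)]
root_after = merkle_root(leaves)

assert root_before != root_after  # differing roots signal "something changed"
```

Comparing the two roots is the single cheap check that tells a client and server whether any sync work is needed at all.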
Need a break from AI in the timeline?
Listen to me talk about data organization instead :)
Friday's stream was a fun one. Sequential writes, binary search trees, block I/O devices, and B-trees. The latest slice dropped this afternoon.
www.youtube.com/watch?v=84b_...
Equip yourself with the fundamental building-blocks of software systems. This combined with the right LLM tooling can take you very far.
Merkle trees, consistent hashing, vector clocks, gossip protocols, and quorum algorithms were all previously known and had been used to build other software. But their unique combination to build Dynamo worked super well for Amazon's use-case, and helped them to scale to millions of DAU.
Though it was published in 2007 (nearly 20 years ago!), it was revolutionary in its day, and it's an example of how already-known technologies can be combined into something new and extremely successful.
If 2026 is the year of AI, it's also the year to read more papers.
LLMs make writing code cheaper. This places greater emphasis on architectural choices, understanding design tradeoffs, ensuring security, and building things people actually need.
Great example: yesterday I read the Dynamo paper.
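One of those Dynamo building blocks, consistent hashing, can be sketched quickly. This isn't Dynamo's implementation — just a minimal Python ring with made-up node names, using virtual nodes to smooth out load between physical nodes:

```python
import hashlib
from bisect import bisect_right

def ring_pos(key: str) -> int:
    """Position on the hash ring for a key or virtual node."""
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

class ConsistentHashRing:
    """A key is owned by the first node clockwise from its ring position."""
    def __init__(self, nodes, vnodes=8):
        # Each physical node gets several virtual positions on the ring.
        self.ring = sorted(
            (ring_pos(f"{node}#{i}"), node)
            for node in nodes for i in range(vnodes)
        )

    def node_for(self, key: str) -> str:
        positions = [pos for pos, _ in self.ring]
        i = bisect_right(positions, ring_pos(key)) % len(self.ring)
        return self.ring[i][1]

ring = ConsistentHashRing(["node-a", "node-b", "node-c"])
owner = ring.node_for("user:42")  # deterministic owner for this key
```

The payoff is the property Dynamo relies on: removing a node only remaps the keys that node owned, while every other key keeps its current owner.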
2026 is the year to end TikTok brain.
Instead, learn database internals on YouTube.
Speaking of which, another dropped today (link below).
Cameras, lenses, framing, and everything in-between have fascinated me for many years.
This morning I read Bartosz Ciechanowski's article on the subject. It's the best explainer I've seen. The interactivity really sells it.
Great article to kick off your year with:
ciechanow.ski/cameras-and-...