SaaS & Software·Jun 5, 2026

The perils of UUID primary keys in SQLite

Hacker News16h ago3 min readSingle source

The gist

5-point summary · 1 min

The table's data rows are stored in the index's leaf pages, sorted by the indexed key.
Because of this:There can be only one clustered index per table (rows can only be physically sorted one way).A clustered index is the table.
The physical storage order of rows follows rowid sequence.Without rowidSQLite also supports WITHOUT ROWID tables.
What's happened here 10-12x slower?!ProfileThat's a big difference.
This matters because our profiles will run for different amounts of time.

05 Jun 2026It's common to use random UUIDs as a primary key in databases. One of the known downsides of random UUIDs is that their unordered nature (UUID4) can cause a lot of extra paging for the clustered index because you are inserting rows randomly into the Btree and having to re-balance it. This post tries to help us develop a more visceral understanding of the performance cost of all that extra paging.While this post is about SQLite specifically, the problem of random UUIDs also extends to other databases that use clustered indexes.What is a clustered index?A clustered index determines the physical storage order of the rows in a table. The table's data rows are stored in the index's leaf pages, sorted by the indexed key. Because of this:There can be only one clustered index per table (rows can only be physically sorted one way).A clustered index is the table. The leaf nodes contain the full row dataA non-clustered index, by contrast, stores only the indexed columns plus a pointer to the actual row data, which lives elsewhere.RowidEvery ordinary SQLite table has an implicit 64-bit integer primary key called rowid. The table's data is stored in a B-tree ordered by rowid. This is effectively SQLite's clustered index. The physical storage order of rows follows rowid sequence.Without rowidSQLite also supports WITHOUT ROWID tables. These tables have no implicit rowid. Instead, the primary key you declare becomes the clustered index.BaselineLet's establish a performance baseline with regular rowid int primary key. We'll insert 10 million rows in batches of 1 million. (d/q writer ["CREATE TABLE IF NOT EXISTS event(id INT PRIMARY KEY, data BLOB)"]) (dotimes [_ 100] (time (d/with-write-tx [db writer] (dotimes [_ 1000000] (d/q db ["INSERT INTO event (data) values (?)" data]))))) Results:total rowstime in ms1000000012082000000011023000000011774000000011385000000010866000000011017000000010708000000010699000000010791000000001081Roughly a million inserts per second.UUID4Now lets try UUID4.(d/q writer ["CREATE TABLE IF NOT EXISTS event(id BLOB PRIMARY KEY, data BLOB) WITHOUT ROWID"]) (dotimes [_ 10] (time (d/with-write-tx [db writer] (dotimes [_ 1000000] (d/q db ["INSERT INTO event (id, data) values (?,?)" (random-uuid4-bytes) data]))))) Results:total rowstime in ms10000000264920000000564430000000713740000000835250000000935960000000981770000000104908000000011130900000001166810000000012586Oh no! What's happened here 10-12x slower?!ProfileThat's a big difference. But, lets not guess when we can profile.Below is a normalised diffgraph. A diffgraph compares two profiling snapshots (in this case INT vs UUID4) and displays the differences in a flamegraph structure. Unlike a regular diffgraph that shows absolute changes, a normalised view adjusts the total number of samples between the two compared profiles to be the same. This means we can see the relative differences as a percentage. This matters because our profiles will run for different amounts of time. The colour signifies the direction of the change: a blue frame means less time was spent in this function in the second profile (UUID4) compared to the first (INT); a red frame means more time was spent in the second profile. The colour intensity indicates the relative change in the number of samples for the frame itself (self time delta).We can see from the diffgraph that we are spending a lot more time balancing the tree, reading and writing. This is because the unordered nature of UUID4 means they are ordered randomly which is forcing SQLite to constantly re-balance the Btree.UUID7We can theoretically fix this with UUID7 which is time ordered eliminating the ordering problem of UUID4. Let's see if this improves things.(d/q writer ["CREATE TABLE IF NOT EXISTS event(id BLOB PRIMARY KEY, data BLOB) WITHOUT ROWID"]) (dotimes [_ 10] (time (d/with-write-tx [db writer] (dotimes [_ 1000000] (d/q db ["INSERT INTO event (id, data) values (?,?)" (random-uuid7-bytes) data]))))) Results:total rowstime in ms1000000013722000000012803000000013654000000012505000000012566000000012707000000012468000000012579000000012451000000001258Back to a more reasonable number. Slightly slower than our baseline. UUID blob primary keys are 16 bytes vs int primary keys which are 8 bytes.ConclusionHopefully, this post helps illustrate some of the pitfalls with UUID primary keys in SQLite and how to navigate them.The full benchmark code can be found here.If you enjoyed this post you might like this one 100000 TPS with SQLiteFurther readingClustered IndexesClustered Indexes and the WITHOUT ROWID Optimizationclj-async-profilerExploring flamegraphsDiffgraphsThanks to Everyone on the Datastar discord who read drafts of this and gave me feedback.

Integrity note · Xela does not rewrite or paraphrase article content. The excerpt above is the source publication's own words, sanitized for display. For the full piece — including any quotes, charts, or images — read it at Hacker News. Xela's rewritten version is off for this story, so there's no editorial angle attached — you're getting the source's reporting unfiltered. When the rewrite is on, we add a What this means block underneath with the operator/trader takeaway.