UUIDs are way over used. There is almost always a better key to use, usually a bigint for databases. If you're making some kind of leaderless distributed data store, then maybe, but even then there are other ID sharding strategies I'd go for first depending on the constraints.
For a single database, bigints are smaller and faster, with less footguns.
UUIDs can be nice for an opaque public ID, however I'd still prefer something like a Sqid for space and usability.
show comments
adityaathalye
Thanks for the benching, Anders! So grateful for the stuff you've shared over the years. Invariably, every single post has been useful and/or educational to me.
I read this post more as an illustration of the *value* of UUIDv7 as primary key, over integer primary keys, in lieu of minimal loss of read/write performance, and marginally more data on disk bloat.
SQLite's automatic integer rowID primary key is a no-brainer, when the SQLite application is local-only, such as application storage format (mobile and desktop). Or is never intended to grow beyond a single server instance. Basically, where each SQLite file is private to a singular instance of the application.
However, if there is even an outside chance of needing to cooperate across application instances, e.g. the minimal limit case of a personal knowledge base that should seamlessly sync across a person's devices, as well as a hosted service, then a high-quality sequential random ID starts to make a lot more sense. (No-brainer arbitrary table merges / splits / remerges, de-duplication, etc.)
Random ID primary key is a bad idea period, whether it be the UU kind or the SQ kind, or any other kind. As far as my DB knowledge goes, this class of ID destroys all tree-algorithms, and we are stuck with the fact that there is no practically better way, than an appropriate tree-structure, to group and organise a meaningful amount of data, efficiently and effectively.
show comments
jdthedisciple
So UUID isn't the problem but UUID v4 is, just like any random ID-scheme, correct?
UUID v7 so far seems like the best solution if you want UUID benefits and ordering.
show comments
ItsBob
My rule for primary keys and id's is simple: Sequential integer (or bigint) as the PK and if I need to make it public, I have a GUID (or UUID) in the row too, e.g. tbl_person would have Id (int|bigint) and person_guid as (UUID).
The Integer id is used for joins and looks ups and such but that's it. If I need to send anything to the frontend or outside of the app/DB then that's the UUID.
andersmurphy
This is actually a draft. I Wanted to add more details about how this changes with row size etc. I might get time to update it later today.
show comments
bambax
Why would you use UUIDs a primary keys? Let SQLite use rowids internally (which is automatic and invisible), and have a different (indexed) column with UUID if you need that for publishing the ID somewhere.
show comments
pyuser583
Oh gosh the ints v uuids debate for pks. This is worse than vim v eMacs or brackets v braces.
ac50hz
I enjoy these carefully worded posts from Anders Murphy, illustrative and informative, not opinionated and preachy. Very useful, it’s great to see the process, and ofc bookmarkeable material for sharing with others.
w10-1
Isn't the solution just to use the rowid (after doing the read-id-after-insert dance)?
How much trouble does SQLite reysing rowid's actually cause?
show comments
kjgkjhfkjf
The script to create the benchmark numbers appears to be inserting 100 batches, not 10. (The benchmark numbers in the table appear to be consistent with the text, so I guess the actual script used to create them was correct.)
show comments
yepyoukno
Perils of “UUIDv4”. Everyone knows that’s what UUIDv7 was really for, and you should always convert that to binary to optimize everything.
show comments
cropcirclbureau
Is this relevant for other databases? For postgres for example, which supports concurrent writers, wouldn't sequential keys lead to contention on the page at the frontier?
show comments
dumbledorf
Wait how is sqlite doing a million inserts a second?
show comments
michaelcampbell
How much time is `(random-uuid7-bytes)` taking?
wood_spirit
If you need (or want the convenience of) a uuid and the time of creation is not secret then use ulids eg uuid v7.
UUIDs are way over used. There is almost always a better key to use, usually a bigint for databases. If you're making some kind of leaderless distributed data store, then maybe, but even then there are other ID sharding strategies I'd go for first depending on the constraints.
For a single database, bigints are smaller and faster, with less footguns.
UUIDs can be nice for an opaque public ID, however I'd still prefer something like a Sqid for space and usability.
Thanks for the benching, Anders! So grateful for the stuff you've shared over the years. Invariably, every single post has been useful and/or educational to me.
I read this post more as an illustration of the *value* of UUIDv7 as primary key, over integer primary keys, in lieu of minimal loss of read/write performance, and marginally more data on disk bloat.
SQLite's automatic integer rowID primary key is a no-brainer, when the SQLite application is local-only, such as application storage format (mobile and desktop). Or is never intended to grow beyond a single server instance. Basically, where each SQLite file is private to a singular instance of the application.
However, if there is even an outside chance of needing to cooperate across application instances, e.g. the minimal limit case of a personal knowledge base that should seamlessly sync across a person's devices, as well as a hosted service, then a high-quality sequential random ID starts to make a lot more sense. (No-brainer arbitrary table merges / splits / remerges, de-duplication, etc.)
Random ID primary key is a bad idea period, whether it be the UU kind or the SQ kind, or any other kind. As far as my DB knowledge goes, this class of ID destroys all tree-algorithms, and we are stuck with the fact that there is no practically better way, than an appropriate tree-structure, to group and organise a meaningful amount of data, efficiently and effectively.
So UUID isn't the problem but UUID v4 is, just like any random ID-scheme, correct?
UUID v7 so far seems like the best solution if you want UUID benefits and ordering.
My rule for primary keys and id's is simple: Sequential integer (or bigint) as the PK and if I need to make it public, I have a GUID (or UUID) in the row too, e.g. tbl_person would have Id (int|bigint) and person_guid as (UUID).
The Integer id is used for joins and looks ups and such but that's it. If I need to send anything to the frontend or outside of the app/DB then that's the UUID.
This is actually a draft. I Wanted to add more details about how this changes with row size etc. I might get time to update it later today.
Why would you use UUIDs a primary keys? Let SQLite use rowids internally (which is automatic and invisible), and have a different (indexed) column with UUID if you need that for publishing the ID somewhere.
Oh gosh the ints v uuids debate for pks. This is worse than vim v eMacs or brackets v braces.
I enjoy these carefully worded posts from Anders Murphy, illustrative and informative, not opinionated and preachy. Very useful, it’s great to see the process, and ofc bookmarkeable material for sharing with others.
Isn't the solution just to use the rowid (after doing the read-id-after-insert dance)?
How much trouble does SQLite reysing rowid's actually cause?
The script to create the benchmark numbers appears to be inserting 100 batches, not 10. (The benchmark numbers in the table appear to be consistent with the text, so I guess the actual script used to create them was correct.)
Perils of “UUIDv4”. Everyone knows that’s what UUIDv7 was really for, and you should always convert that to binary to optimize everything.
Is this relevant for other databases? For postgres for example, which supports concurrent writers, wouldn't sequential keys lead to contention on the page at the frontier?
Wait how is sqlite doing a million inserts a second?
How much time is `(random-uuid7-bytes)` taking?
If you need (or want the convenience of) a uuid and the time of creation is not secret then use ulids eg uuid v7.