Problem/Motivation
Drupal currently stores UUID values in the entity tables as plain strings, in varchar(128) columns.
Compared to the binary UUID format, this type of storage leads to significant performance issues:
Storing UUIDs as strings instead of binary in MySQL wastes storage and degrades performance. A string UUID occupies 36 bytes compared to 16 bytes for binary, so the string form is 2.25 times larger (the binary form is about 55% smaller). The larger values reduce buffer pool efficiency, increase disk I/O during reads, and contribute to index fragmentation and expensive, non-sequential "index thrashing" during writes. Furthermore, binary comparisons are computationally cheaper than character-based collation checks, making joins and lookups consistently faster when using the compact BINARY(16) format.
So, switching to the binary format should significantly improve performance. Here are some estimates:
Storage: A 55–70% reduction in storage for the UUID column itself. Because MySQL InnoDB secondary indexes also include the primary key, this space-saving is compounded across every index on your table.
Write Speed (Insert): Up to a 3x–5x faster insertion rate once your dataset exceeds the size of your RAM (Buffer Pool).
Read Speed (Select/Join): Joins and lookups typically see a 20–60% speed improvement because the database compares raw 128-bit numbers instead of performing character-by-character lexical collation checks.
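The per-value part of the storage estimate above can be checked with a few lines of Python. This is only a sketch of the arithmetic for a single UUID value; the example UUID is arbitrary, and it does not measure index overhead or any real MySQL table.

```python
import uuid

# An arbitrary example UUID, used only for illustration.
value = uuid.UUID("123e4567-e89b-12d3-a456-426614174000")

# Canonical hyphenated text form, as stored in a string column.
text_form = str(value).encode("ascii")
# Raw 128-bit value, as it would be stored in a BINARY(16) column.
binary_form = value.bytes

text_size = len(text_form)
binary_size = len(binary_form)

# Fraction of space saved by the binary form relative to the text form.
reduction = 1 - binary_size / text_size

print(text_size, binary_size, round(reduction * 100, 1))  # prints: 36 16 55.6
```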
Proposed resolution
To improve the overall performance of Drupal, we should find a way to store UUIDs as binary.
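As a rough illustration of what binary storage means in practice, here is a minimal Python sketch that converts a UUID string to 16 bytes, stores it, and reads it back. Everything in it is an assumption made for the example: SQLite's BLOB type stands in for MySQL's BINARY(16), and the table name demo_entity and the helpers uuid_to_bin and bin_to_uuid are hypothetical, not Drupal APIs. (MySQL 8.0 ships comparable UUID_TO_BIN() and BIN_TO_UUID() SQL functions.)

```python
import sqlite3
import uuid


def uuid_to_bin(text: str) -> bytes:
    """Convert a canonical UUID string to its 16-byte binary form (hypothetical helper)."""
    return uuid.UUID(text).bytes


def bin_to_uuid(raw: bytes) -> str:
    """Convert 16 raw bytes back to the canonical UUID string (hypothetical helper)."""
    return str(uuid.UUID(bytes=raw))


# In-memory SQLite database; BLOB is a stand-in for BINARY(16) here.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE demo_entity (id INTEGER PRIMARY KEY, uuid BLOB NOT NULL UNIQUE)"
)

original = "123e4567-e89b-12d3-a456-426614174000"

# Store the binary form instead of the 36-character string.
conn.execute("INSERT INTO demo_entity (uuid) VALUES (?)", (uuid_to_bin(original),))

# Look the row up by its binary UUID, then convert back for display.
row = conn.execute(
    "SELECT id, uuid FROM demo_entity WHERE uuid = ?", (uuid_to_bin(original),)
).fetchone()
restored = bin_to_uuid(row[1])

print(len(row[1]), restored == original)
```

The conversion at the storage boundary is the design point: code above the database layer keeps working with the familiar string form, while the column holds only the 16 raw bytes.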
Comments
Comment #2
murz
Also, switching the UUID generator from UUIDv4 to UUIDv7 should increase insertion throughput by 30% to 50% and reduce index fragmentation by up to 30% by enabling sequential, append-only database writes. But this sounds like a separate issue that is much easier to implement without breaking changes, so I created a separate issue: #3573736: Switch the UUID generator to use UUIDv7 by default instead of UUIDv4 to speed up the performance.
Comment #3
murz
It seems that implementing this is blocked by #1805576: Add a 'uuid' database schema type.
Comment #4
murz
Ah, I found a previous issue about UUID performance: #2491989: [PP-1] Use the 'uuid' database schema type (with native PostgreSQL implementation) for UUID fields. Let's continue the discussion there.