Problem/Motivation

Currently with Drupal 8 a UUID is useless. It may be universally unique, but it doesn't really identify anything on its own.

The Multiversion module has multiple entity indexes: UUID, revision hash, sequence, and revision tree. All but the UUID index are irrelevant for core at the moment.

In Multiversion all of the indexes are grouped by workspace, which is important for how Multiversion uses these indexes. If we put indexes into core before workspaces go into core, we will need to find a way for contrib modules like Multiversion to alter these indexes.

Proposed resolution

- Create an interface for all indexes
- Create a base class for all indexes
- Create an index of UUIDs with their entity type, entity id and revision id.
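The three items above could be sketched roughly as follows. This is an illustration in Python rather than Drupal PHP, and it is not the Multiversion implementation: the names `IndexRecord`, `EntityIndexInterface`, `EntityIndexBase`, and `UuidIndex` are hypothetical, and the in-memory dictionary stands in for whatever storage a real index would use.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class IndexRecord:
    """What the proposed index stores per UUID (hypothetical shape)."""
    entity_type: str
    entity_id: int
    revision_id: Optional[int] = None


class EntityIndexInterface(ABC):
    """Hypothetical common interface for all entity indexes."""

    @abstractmethod
    def get(self, key: str) -> Optional[IndexRecord]: ...

    @abstractmethod
    def add(self, key: str, record: IndexRecord) -> None: ...

    @abstractmethod
    def delete(self, key: str) -> None: ...


class EntityIndexBase(EntityIndexInterface):
    """Hypothetical base class holding the logic shared by every index."""

    def __init__(self) -> None:
        # In-memory storage for illustration only.
        self._records: Dict[str, IndexRecord] = {}

    def normalize(self, key: str) -> str:
        # Subclasses may override this to canonicalize their keys.
        return key

    def get(self, key: str) -> Optional[IndexRecord]:
        return self._records.get(self.normalize(key))

    def add(self, key: str, record: IndexRecord) -> None:
        self._records[self.normalize(key)] = record

    def delete(self, key: str) -> None:
        self._records.pop(self.normalize(key), None)


class UuidIndex(EntityIndexBase):
    """Index keyed by UUID; keys are compared case-insensitively."""

    def normalize(self, key: str) -> str:
        return key.strip().lower()
```

A contrib module that needs an extra dimension such as workspaces could, under this sketch, subclass `UuidIndex` and override `normalize()` to fold the workspace into the key.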

Remaining tasks

This is already implemented in the Multiversion module, so it mostly just needs porting over. However, Multiversion introduces workspaces, which would need to be removed from the UUID index code in a way that still allows Multiversion to add them.

Technical summary

- entity.index.uuid service
- getters and setters in service
- hook_install to add all existing entities to index
- hook_entity_insert to add new entities to index
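As a rough illustration of the install-time backfill and the insert hook listed above, here is a sketch using Python and SQLite instead of Drupal's database API. The table name `uuid_index`, its columns, and the functions `install()`, `index_entity()`, and `lookup()` are assumptions made for this example; they only mirror the roles of hook_install, hook_entity_insert, and the service getter.

```python
import sqlite3


def index_entity(db, entity_type, entity):
    """Stand-in for hook_entity_insert(): record one entity in the index."""
    db.execute(
        "INSERT OR REPLACE INTO uuid_index VALUES (?, ?, ?, ?)",
        (entity["uuid"], entity_type, entity["id"], entity.get("revision_id")),
    )


def install(db, existing):
    """Stand-in for hook_install(): create the table and backfill it.

    `existing` maps an entity type ID to a list of entity dicts.
    """
    db.execute(
        "CREATE TABLE IF NOT EXISTS uuid_index ("
        " uuid TEXT PRIMARY KEY,"
        " entity_type TEXT NOT NULL,"
        " entity_id INTEGER NOT NULL,"
        " revision_id INTEGER)"
    )
    for entity_type, entities in existing.items():
        for entity in entities:
            index_entity(db, entity_type, entity)


def lookup(db, entity_uuid):
    """Stand-in for the service getter: resolve a UUID without a type."""
    return db.execute(
        "SELECT entity_type, entity_id, revision_id"
        " FROM uuid_index WHERE uuid = ?",
        (entity_uuid,),
    ).fetchone()


db = sqlite3.connect(":memory:")
install(db, {
    "node": [{"uuid": "6f9619ff-8b86-d011-b42d-00c04fc964ff",
              "id": 1, "revision_id": 3}],
    "user": [{"uuid": "123e4567-e89b-12d3-a456-426614174000", "id": 1}],
})
# An entity created after installation goes through the insert hook.
index_entity(db, "taxonomy_term",
             {"uuid": "00112233-4455-6677-8899-aabbccddeeff", "id": 12})
```

With this in place, `lookup(db, some_uuid)` returns an `(entity_type, entity_id, revision_id)` tuple, or `None` when the UUID is not indexed.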

How the indexes in Multiversion are used

The main use case is the RELAXed Web Services module, which implements the CouchDB API (http://docs.couchdb.org/en/stable/http-api.html). This API focuses 100% on UUIDs, so when GETting or POSTing a URI we need a way to know which entity it relates to without looping through every entity type.

Other use cases

#2353611: Make it possible to link to an entity by UUID - Wouldn't it be cool if we didn't need the entity type in the URI, and our UUIDs were universally identifiable?
#2577923: MenuLinkContent entities pointing to nodes are not deployable: LinkItem should have a "target_uuid" computed property - Wouldn't it be cool if menu links could link to a UUID and we can look up what entity type that relates to?

Comments

timmillwood created an issue. See original summary.

amateescu’s picture

I was thinking about this in the past few days and I couldn't quite figure out what is the use case for the UUID index as it is currently provided by the Multiversion module. Is it really useful to find an entity by UUID without knowing the entity type?

For core, I think we could do something more like the taxonomy index but for each entity type. It would be something like this:

- each content entity type that is referenced by an entity reference field will get its own index table (this is needed in order to be able to provide extra columns per entity type, e.g. 'status', 'sticky', 'created' from the current taxonomy index)
- we do not exclude "inaccessible" entities like the taxonomy index does
- we include UUIDs alongside numeric IDs for both referencing and referenced entity

What do you think?

timmillwood’s picture

Issue summary: View changes

catch’s picture

Yes I'm also not clear on the use case here, tagging for issue summary update.

timmillwood’s picture

Title: Create an index of UUIDs » [PLAN] Create an index of UUIDs
Issue summary: View changes
Issue tags: -Needs issue summary update

The biggest use case for this in Multiversion is the Relaxed module.

Relaxed is an implementation of the CouchDB API (http://docs.couchdb.org/en/stable/http-api.html). A CouchDB database relates to a Workspace, and a CouchDB docid relates to a Drupal entity UUID. Therefore we need to be able to GET URLs like /{db}/{docid} without knowing the entity type. We also need to POST to these types of URL; for this we have a bunch of normalizers in the Replication module which handle the normalization and denormalization of entities.

There are many other places in these modules, especially in the normalizers, where we use the indexes to get entity information from just the UUID, revision hash, sequence ID, etc.

larowlan’s picture

Is there any technical reason those document IDs can't contain the entity type as well?

I.e. do they need to be UUIDs in the strict form of the word, or can they be in format {entity_type}:{uuid}?

timmillwood’s picture

I guess in theory they could be in the format {entity_type}:{uuid}
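A minimal sketch of what the {entity_type}:{uuid} format suggested above would mean for a consumer, written in Python for illustration. The function names `build_doc_id()` and `parse_doc_id()` are hypothetical and are not part of Relaxed or Replication; the point is only that the entity type can be read straight off the document ID, with no index lookup.

```python
import uuid
from typing import Tuple


def build_doc_id(entity_type: str, entity_uuid: str) -> str:
    """Compose a document ID of the form {entity_type}:{uuid}."""
    return f"{entity_type}:{entity_uuid}"


def parse_doc_id(doc_id: str) -> Tuple[str, str]:
    """Split a document ID into (entity_type, canonical UUID string).

    Raises ValueError if the prefix is missing or the UUID is malformed.
    """
    entity_type, separator, raw_uuid = doc_id.partition(":")
    if not separator or not entity_type:
        raise ValueError(f"Missing entity type prefix in {doc_id!r}")
    # uuid.UUID() validates the value and raises ValueError when malformed;
    # str() returns the lowercase hyphenated form.
    return entity_type, str(uuid.UUID(raw_uuid))
```

The trade-off is that such IDs are no longer UUIDs in the strict sense, which is exactly the question raised in the comment above.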

Version: 8.2.x-dev » 8.3.x-dev

Drupal 8.2.0-beta1 was released on August 3, 2016, which means new developments and disruptive changes should now be targeted against the 8.3.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.3.x-dev » 8.4.x-dev

Drupal 8.3.0-alpha1 will be released the week of January 30, 2017, which means new developments and disruptive changes should now be targeted against the 8.4.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.4.x-dev » 8.5.x-dev

Drupal 8.4.0-alpha1 will be released the week of July 31, 2017, which means new developments and disruptive changes should now be targeted against the 8.5.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.5.x-dev » 8.6.x-dev

Drupal 8.5.0-alpha1 will be released the week of January 17, 2018, which means new developments and disruptive changes should now be targeted against the 8.6.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.6.x-dev » 8.7.x-dev

Drupal 8.6.0-alpha1 will be released the week of July 16, 2018, which means new developments and disruptive changes should now be targeted against the 8.7.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.7.x-dev » 8.8.x-dev

Drupal 8.7.0-alpha1 will be released the week of March 11, 2019, which means new developments and disruptive changes should now be targeted against the 8.8.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.8.x-dev » 8.9.x-dev

Drupal 8.8.0-alpha1 will be released the week of October 14th, 2019, which means new developments and disruptive changes should now be targeted against the 8.9.x-dev branch. (Any changes to 8.9.x will also be committed to 9.0.x in preparation for Drupal 9’s release, but some changes like significant feature additions will be deferred to 9.1.x.). For more information see the Drupal 8 and 9 minor version schedule and the Allowed changes during the Drupal 8 and 9 release cycles.

Version: 8.9.x-dev » 9.1.x-dev

Drupal 8.9.0-beta1 was released on March 20, 2020. 8.9.x is the final, long-term support (LTS) minor release of Drupal 8, which means new developments and disruptive changes should now be targeted against the 9.1.x-dev branch. For more information see the Drupal 8 and 9 minor version schedule and the Allowed changes during the Drupal 8 and 9 release cycles.

sime’s picture

This is where Drupal can better provide internal/external interfaces to its library of things. It also provides a means to hide internal implementation details like entity types from external systems: general DX at the edge (or even "machine to machine experience"?).

Assumptions:

1. You would not want every entity's uuids in this index because that would include very low level entities. You would potentially compromise the value of the index on a site-by-site basis.
2. Following from that, the index is effectively a cache (could be rebuilt) and should not ever be some sort of primary key table thing.
3. While the index is by default in the database it should be readily put in memory or mongo or whatever.
4. Modules or site config may be able to define which entities are represented in the index, and may be able to extend the information in the index.

If this is written in an abstract way (think "uuid index api") then a module or subsystem could simply request its own index. The "uuid route" module (core or contrib, who cares) says "hey i want a uuid index of all these entity types and here is the storage class for it" and then define a /uuid/... route with the data storage looked after. Other modules could use it too, in the same way that modules share use of the default `cache` table, but a module could define its own for a specific purpose.
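To make the "uuid index api" idea a little more concrete, here is a hedged Python sketch. Everything in it is hypothetical (`IndexStorage`, `MemoryStorage`, `UuidIndex`, and the method names); it only illustrates a module requesting its own index for a chosen set of entity types, with swappable storage and a rebuild step that treats the index as a cache.

```python
from typing import Dict, Iterable, Optional, Protocol, Tuple

Record = Tuple[str, int]  # (entity_type, entity_id)


class IndexStorage(Protocol):
    """Hypothetical storage contract: database, memory, or anything else."""

    def read(self, key: str) -> Optional[Record]: ...
    def write(self, key: str, record: Record) -> None: ...
    def clear(self) -> None: ...


class MemoryStorage:
    """Simplest possible storage backend, for illustration."""

    def __init__(self) -> None:
        self._data: Dict[str, Record] = {}

    def read(self, key: str) -> Optional[Record]:
        return self._data.get(key)

    def write(self, key: str, record: Record) -> None:
        self._data[key] = record

    def clear(self) -> None:
        self._data.clear()


class UuidIndex:
    """An index a module requests for the entity types it cares about."""

    def __init__(self, entity_types: Iterable[str], storage: IndexStorage):
        self.entity_types = frozenset(entity_types)
        self.storage = storage

    def track(self, entity_type: str, entity_id: int, entity_uuid: str) -> bool:
        # Entity types outside the configured set are ignored.
        if entity_type not in self.entity_types:
            return False
        self.storage.write(entity_uuid, (entity_type, entity_id))
        return True

    def resolve(self, entity_uuid: str) -> Optional[Record]:
        return self.storage.read(entity_uuid)

    def rebuild(self, entities: Iterable[Tuple[str, int, str]]) -> None:
        # The index is a cache: it can be dropped and regenerated.
        self.storage.clear()
        for entity_type, entity_id, entity_uuid in entities:
            self.track(entity_type, entity_id, entity_uuid)
```

A "uuid route" module would then construct something like `UuidIndex({"node", "media"}, MemoryStorage())` and back its /uuid/... route with `resolve()`.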

sime’s picture

I'm interested if anyone has the view that Drupal should be centralising all entity uuids into a primary table: `uuid`, `entity_type`, `eid`, removing uuid completely from the entity tables, and whether that would be a good thing ™.

colan’s picture

There's some related discussion on this at #1637370-56: Add UUID support to core entity types. I feel like folks are afraid of using a single primary table because of the size, but I wouldn't mind hearing other opinions on that myself. Maybe this is less of an issue with the DBs now than it was 8 years ago.

What you've written above makes sense to me so far, but I'd feel better about it if we had an explicit reason to rule out the single-table approach. Maybe it's not such a terrible idea? I'm not sure.

berdir’s picture

I'm fine with a centralized single index table, but not as the only storage for it. We definitely still need the UUIDs to be stored as a field for content entities too.

Note that we already did a custom optimization for block content entities in \Drupal\block_content\BlockContentUuidLookup. That *might* not be required anymore then, but it's a cache collector, so it stores all block content UUIDs in a single cache entry and looks them up if necessary.

sime’s picture

My instinct is that an all in approach might not be practical, or might simply impair the performance wins we get from a central index. More and more we will see developers using entities for things - specifically things that get created without human intervention. I had an issue where broken_link module filled up the db with 50k entities in a day. Or the site where a developer decided to use entities as log entries for machine-to-machine chatty communications. All these uuids would go into an "all in" index I assume.

aaronmchale’s picture

Reading over the issue summary and comments I started to think that this could be a good idea, but (and #21 illustrates this perfectly) my gut feeling is that the code which acts on an entity create/save should live in one or more methods in the EntityBase class (or maybe ContentEntityBase? Do we really need this for config entities?), and be called at some point in the entity save process. This would be instead of relying on hook_entity_insert, which is currently proposed in the issue summary. As said, comment #21 provides some rationale for this, and this approach means devs can effectively opt out entity types where inclusion in the index would not be appropriate. We could even go a step further and provide an optional entity annotation key to allow configuring this.
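The save-time approach described above might look roughly like this. It is a Python sketch, not Drupal code: the class attribute `uuid_index` stands in for the proposed annotation key, `_update_uuid_index()` and the module-level `UUID_INDEX` dictionary are invented for the example, and only the class names echo the ones mentioned in the comment.

```python
import uuid
from typing import Dict, Tuple

# Hypothetical global index: UUID -> (entity_type_id, entity_id).
UUID_INDEX: Dict[str, Tuple[str, int]] = {}


class ContentEntityBase:
    entity_type_id = "entity"
    # Stand-in for an optional annotation key; indexing is opt-in.
    uuid_index = False

    def __init__(self, entity_id: int) -> None:
        self.id = entity_id
        self.uuid = str(uuid.uuid4())

    def save(self) -> "ContentEntityBase":
        # A real implementation would write to entity storage here.
        self._update_uuid_index()
        return self

    def _update_uuid_index(self) -> None:
        # Called from save() instead of relying on an insert hook.
        if self.uuid_index:
            UUID_INDEX[self.uuid] = (self.entity_type_id, self.id)


class Node(ContentEntityBase):
    entity_type_id = "node"
    uuid_index = True


class LogEntry(ContentEntityBase):
    # High-volume, machine-generated entities stay out of the index.
    entity_type_id = "log_entry"
```

Saving a `Node` adds it to `UUID_INDEX`, while saving a `LogEntry` leaves the index untouched, which addresses the volume concern from #21.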

Version: 9.1.x-dev » 9.2.x-dev

Drupal 9.1.0-alpha1 will be released the week of October 19, 2020, which means new developments and disruptive changes should now be targeted for the 9.2.x-dev branch. For more information see the Drupal 9 minor version schedule and the Allowed changes during the Drupal 9 release cycle.

Version: 9.2.x-dev » 9.3.x-dev

Drupal 9.2.0-alpha1 will be released the week of May 3, 2021, which means new developments and disruptive changes should now be targeted for the 9.3.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Version: 9.3.x-dev » 9.4.x-dev

Drupal 9.3.0-rc1 was released on November 26, 2021, which means new developments and disruptive changes should now be targeted for the 9.4.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

ghost of drupal past’s picture

To expand on #6, we could create UUIDs that contain the entity type. For example, the first 64 bits of our UUIDs could be the first 64 bits of sha1("drupal:$entity_type"). To find out the entity type, one would then need to run one SHA1 operation per entity type, but that is very cacheable as that table almost never changes.
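A small Python sketch of this scheme, under stated assumptions: the function names are hypothetical, the remaining 64 bits are simply random, and the RFC 4122 version and variant bits are not set (setting them would overwrite part of the first 64 bits, so the comparison would then need a mask).

```python
import hashlib
import os
import uuid
from typing import Dict, Iterable, Optional


def type_prefix(entity_type: str) -> bytes:
    """First 64 bits (8 bytes) of sha1("drupal:$entity_type")."""
    digest = hashlib.sha1(f"drupal:{entity_type}".encode("utf-8")).digest()
    return digest[:8]


def make_uuid(entity_type: str) -> uuid.UUID:
    """Build a 128-bit value: 64-bit type prefix plus 64 random bits.

    Note: version/variant bits are left as-is, so this is not a valid
    RFC 4122 version-4 UUID.
    """
    return uuid.UUID(bytes=type_prefix(entity_type) + os.urandom(8))


def build_prefix_table(entity_types: Iterable[str]) -> Dict[bytes, str]:
    """One SHA-1 per entity type; the result is easy to cache."""
    return {type_prefix(t): t for t in entity_types}


def entity_type_of(value: uuid.UUID, table: Dict[bytes, str]) -> Optional[str]:
    """Recover the entity type from the first 8 bytes, if known."""
    return table.get(value.bytes[:8])
```

For example, after `table = build_prefix_table(["node", "user"])`, calling `entity_type_of(make_uuid("user"), table)` yields `"user"`, while a UUID generated under the existing scheme would resolve to `None`.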

bircher’s picture

RE #26: This is the second issue I am aware of which would benefit from predictable or semi-predictable UUIDs.
The other one is #3208766: Add UUID to sections. If there are more, then maybe we can create a new issue to do this in a reusable way.

Though as with the other issue, the upgrade path would be a bit difficult. Here I guess it is a new kind of uuid we would save?

Version: 9.4.x-dev » 9.5.x-dev

Drupal 9.4.0-alpha1 was released on May 6, 2022, which means new developments and disruptive changes should now be targeted for the 9.5.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

geek-merlin’s picture

#21 (Performance penalty from huge entity tables):
Yes it should be possible to exclude entity types from indexing.

Also there may be an alternative implementation that replaces the separate index with sql union magick:

SELECT 'node' AS type, nid AS id FROM node WHERE uuid = :uuid
UNION ALL
SELECT 'user' AS type, uid AS id FROM users WHERE uuid = :uuid
...

I don't see a good use case for that though, but maybe others do.
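For anyone who wants to try the UNION idea, here is a runnable sketch in Python with SQLite. The two-table schema is a simplified assumption (real entity base tables have more columns), `make_demo_db()` and `find_by_uuid()` are hypothetical helpers, and the filter is applied inside every branch so each table is searched by UUID.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Create two simplified base tables with a uuid column each."""
    db = sqlite3.connect(":memory:")
    db.executescript(
        """
        CREATE TABLE node (nid INTEGER PRIMARY KEY, uuid TEXT UNIQUE);
        CREATE TABLE users (uid INTEGER PRIMARY KEY, uuid TEXT UNIQUE);
        INSERT INTO node VALUES (1, '6f9619ff-8b86-d011-b42d-00c04fc964ff');
        INSERT INTO users VALUES (1, '123e4567-e89b-12d3-a456-426614174000');
        """
    )
    return db


def find_by_uuid(db, tables, entity_uuid):
    """Look up a UUID across several base tables with UNION ALL.

    `tables` maps entity type -> (table name, ID column). These are
    trusted identifiers from code, never user input, so they are
    interpolated directly; only the UUID is a bound parameter.
    """
    branches = [
        f"SELECT '{entity_type}' AS type, {id_column} AS id "
        f"FROM {table} WHERE uuid = ?"
        for entity_type, (table, id_column) in tables.items()
    ]
    sql = " UNION ALL ".join(branches)
    # One bound parameter per branch.
    return db.execute(sql, [entity_uuid] * len(branches)).fetchone()


TABLES = {"node": ("node", "nid"), "user": ("users", "uid")}
```

Calling `find_by_uuid(make_demo_db(), TABLES, some_uuid)` returns a `(type, id)` tuple or `None`. The query grows by one branch per entity type, so it trades the separate index table for a wider query.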

c-logemann’s picture

Issue tags: +UUID

Version: 9.5.x-dev » 10.1.x-dev

Drupal 9.5.0-beta2 and Drupal 10.0.0-beta2 were released on September 29, 2022, which means new developments and disruptive changes should now be targeted for the 10.1.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Version: 10.1.x-dev » 11.x-dev

Drupal core is moving towards using a “main” branch. As an interim step, a new 11.x branch has been opened, as Drupal.org infrastructure cannot currently fully support a branch named main. New developments and disruptive changes should now be targeted for the 11.x branch, which currently accepts only minor-version allowed changes. For more information, see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

claudiu.cristea’s picture

Also, EntityRepository::loadEntityByUuid() can be deprecated in favour of a method that only needs the UUID, without the entity type argument.

c-logemann’s picture

I completely disagree with the starting argument of 2016:
> Currently with Drupal 8 a UUID is useless.

There are so many things we can solve with UUIDs, like config entity conflicts and hiding serial IDs (see my module U3ID). And since 2016 we have many more entity types, including logs etc., and the number is still growing. On larger systems this index would be a very large table and should probably be handled like a cache table which can be excluded from backups etc.
Personally, I only see this as an interesting idea for finding an entity via UUID alone. But I hope this is not a feature we will depend on in the future, because I see a lot of performance issues coming with it. So in my opinion this should be entirely optional, or at least optional on any (!) entity type as an OPT IN setting.

Version: 11.x-dev » main

Drupal core is now using the main branch as the primary development branch. New developments and disruptive changes should now be targeted to the main branch.

Read more in the announcement.