Problem/Motivation

On an "API-first" site, most/all of our client side interactions with entities is over JSON:API. Entities exposed over JSON:API require (I think?!) a UUID field. We also take steps to hide serialized entity IDs, where Drupal is apt to send them, to prevent enumeration and make the API less "Drupally."

AFAIK there is no strict requirement that content entities must implement a serial integer ID; they simply need an ID because, well, that's what is used around Drupal's core entity API to load and, uh, identity entities.

I posed the question in Drupal Slack, what if we just used the UUID for both? (This is for a custom content entity, obviously.)

This seems to work in theory, however DefaultTableMapping.php assumes these entity keys are unique field definitions, so if they refer to the same field, you get a deeply merged combination of duplicate values, which screws up the initialization of the entity tables.

Steps to reproduce

N/A

Proposed resolution

in DefaultTableMapping make sure $key_fields contains unique values.

Remaining tasks

Review

User interface changes

N/A

API changes

N/A

Data model changes

N/A

Release notes snippet

Content entities may now declare UUIDs as a content entity's ID (though entity key mappings), removing the prior implicit requirement by core that all entities have separate UUIDs and IDs.

CommentFileSizeAuthor
#32 3310170-nr-bot.txt2.18 KBneeds-review-queue-bot

Issue fork drupal-3310170

Command icon Show commands

Start within a Git clone of the project using the version control instructions.

Or, if you do not have SSH keys set up on git.drupalcode.org:

Comments

bradjones1 created an issue. See original summary.

bradjones1’s picture

Let's see if array_unique() breaks anything.

bbrala’s picture

Tbh, it feels like that might work.

bbrala’s picture

So, if we have a failing test only using a uuid as id and then this + test green, I'd be happy to review and probably this.

bradjones1’s picture

Here is the post from Percona regarding performance; honestly I'm not quite sure what this means, if anything, for this issue because it's not like we're saying you must or even should do this; in cases of relatively few entries or developer preference or... whatever, the performance question might not matter. I think this is only really about Drupal being able to support UUID-only identified entities. Whether to do so is up to you.

https://www.percona.com/blog/2019/11/22/uuids-are-popular-but-bad-for-pe...

wim leers’s picture

bbrala’s picture

Your comment is a bit ambiguous, what didn't you realize? This issue or the performance 🤔

tstoeckler’s picture

Thanks for opening this!

Hit this as well before, but never got around to opening an issue. The fix is not quite complete, however: For translatable entities we explicitly remove the UUID field from the data table (because the UUID field comes with a unique key but the data table would have multiple identical entries (namely one for each language for a given entity)). If the UUID and ID keys are the same, however this removal does not work as then we also don't have an ID in the data table.

In case this is useful to anyone: We use the following storage class with our custom entities to handle this issue:

class EntityStorage extends SqlContentEntityStorage {

  /**
   * {@inheritdoc}
   */
  public function getCustomTableMapping(ContentEntityTypeInterface $entityType, array $storageDefinitions, $prefix = '') {
    $tableMapping = parent::getCustomTableMapping($entityType, $storageDefinitions, $prefix);
    assert($tableMapping instanceof DefaultTableMapping);

    // Fix the table mapping for the case that the ID and UUID keys point to the
    // same field.
    $idKey = $entityType->getKey('id');
    if ($idKey === $entityType->getKey('uuid')) {
      // The parent implementation will add the ID and UUIDs keys to the base
      // table independently, resulting in the field being listed twice.
      $baseTable = $tableMapping->getBaseTable();
      $baseFieldNames = array_unique($tableMapping->getFieldNames($baseTable));
      $tableMapping->setFieldNames($baseTable, $baseFieldNames);

      if ($entityType->isTranslatable()) {
        // When removing the UUID key from the data table, the parent
        // implementation also removes the ID key.
        $dataTable = $tableMapping->getDataTable();
        $dataFieldNames = array_merge([$idKey], $tableMapping->getFieldNames($dataTable));
        $tableMapping->setFieldNames($dataTable, $dataFieldNames);
      }
    }

    return $tableMapping;
  }

}

Also note that due to the unique key issue mentioned above, you cannot actually use a field of type uuid for the ID/UUID field, but you need to use a string field. We use the following snippet in our entity base class's baseFieldDefinitions(), which comes out effectively the same as a uuid field but for the unique key (and that we don't set it as read only because we want to use a widget for it, but that could/should be done depending on the use-case):

  public static function baseFieldDefinitions(EntityTypeInterface $entity_type) {
    assert($entity_type instanceof ContentEntityTypeInterface);

    $fields = parent::baseFieldDefinitions($entity_type);

    if ($entity_type->hasKey('id') && $entity_type->hasKey('uuid') && $entity_type->getKey('id') === $entity_type->getKey('uuid')) {
      // Configure a string field to match the UUID field configuration and use it
      // for both the ID and the UUID key.
      $fields[$entity_type->getKey('uuid')] = BaseFieldDefinition::create('string')
        ->setLabel(new TranslatableMarkup('UUID'))
        /* @see \Drupal\Core\Field\Plugin\Field\FieldType\UuidItem::defaultStorageSettings() */
        ->setSetting('max_length', 128)
        ->setSetting('is_ascii', TRUE)
        /* @see \Drupal\Core\Field\Plugin\Field\FieldType\UuidItem::applyDefaultValue() */
        ->setDefaultValueCallback(static::class . '::generateUuid');
    }

    return $fields;
  }

  /**
   * Statically generates a UUID.
   *
   * @return string
   *   A newly generated UUID.
   */
  public static function generateUuid() {
    $uuid = \Drupal::service('uuid');
    assert($uuid instanceof UuidInterface);
    return $uuid->generate();
  }

And lastly linking some related issues where there has been an effort to avoid (some of) the performance hit of using UUIDs at least on PostgreSQL by using native UUID database columns.

thursday_bw’s picture

I just recently discovered this same issue when writing a unit test that relied on a contrib entity that was also patched to add the UUID key to work with JSON:API.

I dug in and applied the same solution as in this patch.

@tstockler Thanks for all those details about translatable entitie's and gotchas.

We are using postgres too, so that perfmormance information is relevant.

thursday_bw’s picture

The following issue on on the apigee_edge contrib module requires the fix for UUIDs as entity ids too. Apigee edge issue: https://github.com/apigee/apigee-edge-drupal/issues/727

bradjones1’s picture

Thanks for the notes regarding PgSQL as well as translations... I'm not sure I entirely follow the issue but I'll need to dig into the code more to understand.

tstoeckler’s picture

Sorry for being a bit brief @bradjones1. I'll take another stab at explaining this:

I'll leave both revisions and fields with dedicated tables (i.e. configurable fields) completely out of this for now and only consider a non-revisionable entity type with a bunch of base fields.

In the non-translatable case we just put all of the fields in the base table and call it a day. As you found there is a bug if the ID key and the UUID key are the same, but let's assume that had been fixed, i.e. the MR above had been merged.

If we make the entity type translatable the table layout changes a fair bit. We still have the base table to generate the serial ID for us and we have a few (other) key fields in the base table as well, but all of the fields are also in the data table which has ID and langcode as the primary key as it has a row for each entity translation. So given that a very simple entity table (where the ID and UUID fields are separate, however) would look like this (spoiler alert: it doesn't):

id langcode uuid label uid changed
1 en 6901e486-40a2-4894-b3e8-b372f21e4439 Example 1 1663666559
1 de 6901e486-40a2-4894-b3e8-b372f21e4439 Beispiel 1 1663666609

The UUID field, however, comes with a unique key in its schema definition, which makes sense for a UUID field ;-). Adding the field to the data table, however, would then also add that unique key to the data table. The example above is already proof that that would not be a great idea. That* is why the UUID field is in fact removed from the list of fields in the data table and it remains solely in the base table.

And the reason I explained this in such a convoluted way is because that is in fact how the code in DefaultTableMapping::create() works:

      $table_mapping
        ->setFieldNames($table_mapping->baseTable, $key_fields)
        ->setFieldNames($table_mapping->dataTable, array_values(array_diff($all_fields, [$uuid_key])));

(This is the non-revisionable case, the revisionable case is different but similar enough.)

So with that in mind when you take a look at what then happens if the ID and UUID field are one and the same, you can see that there's a problem. Let's say the ID/UUID field is called uuid, then:

  • $all_fields will be ['uuid', 'langcode', 'label', 'uid', 'changed']
  • $uuid_key will be 'uuid'

Therefore the end result of fields that will be added to the data table will be: ['langcode', 'label', 'uid', 'changed'] notably without the ID/UUID field which is a problem.

* At least I think that is why. I would be a bit surprised, but not really shocked to learn that that is not actually the reason. I have been wrong before with things about the entity system that I was more sure about than this...

Version: 10.1.x-dev » 11.x-dev

Drupal core is moving towards using a “main” branch. As an interim step, a new 11.x branch has been opened, as Drupal.org infrastructure cannot currently fully support a branch named main. New developments and disruptive changes should now be targeted for the 11.x branch, which currently accepts only minor-version allowed changes. For more information, see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

tstoeckler’s picture

Issue tags: +DevDaysBurgas2024
tstoeckler’s picture

Status: Needs work » Needs review
Issue tags: -Needs tests

After talking to @rnsrk abou this in Burgas, spent some more time on this.

Fixed the issue mentioned in #9. Also added some inline comments about the changes and finally added some test coverage. I pushed those to individual commits in case there are any objections to any of those.

Note that I thought a bit about how to avoid all this array madness (i.e. array_values(array_unique(array_filter(...))) and friends) and I think the solution would be to make DefaultTableMapping provide an addFields() method, which would internally check for duplicate fields. Will open a follow-up about that, so we can avoid that discussion here (even though it does make the situation (inherently) slightly worse).

Not exactly why this was tagged "Needs change record" or what the change record should be. I mean we could literally write a change notice with "Content entities may now use the same field for the ID and UUID keys" if people think that's useful. (I don't, in particular, but it also doesn't hurt, so I wouldn't mind it.)

tstoeckler’s picture

Not sure why the testbot doesn't run for the updated commits, but I opened #3458145: Replace DefaultTableMapping::setFieldNames() with ::add*() and ::remove*() in the meantime.

tstoeckler’s picture

...ahh maybe because the MR is targeting 10.1.x. @bradjones1 do you have access to change the target branch? Or should we create a new MR? Not sure...

bradjones1’s picture

MR target changed.

bradjones1’s picture

Needed a rebase and now GitLab CI is running.

bradjones1’s picture

Issue tags: -Needs change record

Added a basic CR. Tests green now.

tstoeckler’s picture

Awesome, thanks. You dropped my commits from the MR, though, I'm guessing that was accidental? 🤔

bradjones1’s picture

That is so strange, I think something weird happened with changing the merge target. I thought it was strange I didn't need to do a rebase --onto... and clearly that would be necessary. I've fixed this. Thanks for catching it.

bradjones1’s picture

Title: Cannot use UUID as entity ID » Use UUID as entity ID
tstoeckler’s picture

Great, thanks! 👍

joachim’s picture

Status: Needs review » Needs work

Docs will need changing.

For example, on EntityTypeInterface::getKeys()

   *   - id: The name of the property that contains the primary ID of the
   *     entity. Every entity object passed to the Field API must have this
   *     property and its value must be numeric.
tstoeckler’s picture

Status: Needs work » Needs review

That documentation is actually already incorrect prior to this issue, so opened #3467540: Update EntityTypeInterface::getKeys() docs for string IDs about that. We are not changing anything about being able to use strings in the first place, this is just about using UUIDs, in particular, and to be more precise to be able to use the same field as UUID key and ID key.

wim leers’s picture

smustgrave’s picture

There a recommended way for testing this one?

bradjones1’s picture

Beyond the test that's in the MR? AFAIK this isn't test-able in the UI since you can't create a custom entity type in core via the UI.

smustgrave’s picture

Issue summary: View changes
Status: Needs review » Reviewed & tested by the community
Issue tags: +Needs Review Queue Initiative
1 test triggered 2 PHP warnings:
1) /builds/issue/drupal-3310170/core/modules/mysql/src/Driver/Database/mysql/Schema.php:218
Array to string conversion
Triggered by:
* Drupal\FunctionalTests\Entity\EntityUuidIdTest::testUi
  /builds/issue/drupal-3310170/core/tests/Drupal/FunctionalTests/Entity/EntityUuidIdTest.php:45
2) /builds/issue/drupal-3310170/core/modules/mysql/src/Driver/Database/mysql/Schema.php:218
Undefined array key "Array:normal"
Triggered by:
* Drupal\FunctionalTests\Entity\EntityUuidIdTest::testUi
  /builds/issue/drupal-3310170/core/tests/Drupal/FunctionalTests/Entity/EntityUuidIdTest.php:45
ERRORS!

In that case I ran the test-only feature to show the test coverage.

Cleaned up the issue summary some, copied the CR to the release notes as seems like something that probably should be well announced.

Code review wise everything seems to have proper typehints and return hints. The comments added to DefaultTableMapping about why to get unique is a nice touch.

Going to go out on a limb and say this one is good.

needs-review-queue-bot’s picture

Status: Reviewed & tested by the community » Needs work
StatusFileSize
new2.18 KB

The Needs Review Queue Bot tested this issue. It fails the Drupal core commit checks. Therefore, this issue status is now "Needs work".

This does not mean that the patch necessarily needs to be re-rolled or the MR rebased. Read the Issue Summary, the issue tags and the latest discussion here to determine what needs to be done.

Consult the Drupal Contributor Guide to find step-by-step guides for working with issues.

tstoeckler’s picture

Status: Needs work » Reviewed & tested by the community

Fixed that, thanks!

  • longwave committed 3085a932 on 10.4.x
    Issue #3310170 by bradjones1, tstoeckler, smustgrave, thursday_bw: Use...

  • longwave committed 0bff6d7e on 10.5.x
    Issue #3310170 by bradjones1, tstoeckler, smustgrave, thursday_bw: Use...

  • longwave committed 9b1a9796 on 11.1.x
    Issue #3310170 by bradjones1, tstoeckler, smustgrave, thursday_bw: Use...
longwave’s picture

Version: 11.x-dev » 10.4.x-dev
Status: Reviewed & tested by the community » Fixed

A simple solution to the problem, I don't foresee any side effects but also a little bit wary of touching DefaultTableMapping; as this also has a change record I only committed this to 11.1.x and 10.4.x and higher so it will be available in the next minor releases.

  • longwave committed da2c85be on 11.x
    Issue #3310170 by bradjones1, tstoeckler, smustgrave, thursday_bw: Use...

godotislate’s picture

Looks like after these changes on 11.x, PHPStan is failing the build: https://git.drupalcode.org/project/drupal/-/jobs/3330912.

------ ---------------------------------------------------------------------------- 
  Line   core/modules/language/tests/src/Traits/LanguageTestTrait.php (in            
         context of class Drupal\FunctionalTests\Entity\EntityUuidIdTest)            
 ------ ---------------------------------------------------------------------------- 
  65     Method                                                                      
         Drupal\FunctionalTests\Entity\EntityUuidIdTest::disableBundleTranslation()  
         has no return type specified.                                               
 ------ ---------------------------------------------------------------------------- 

nicxvan made their first commit to this issue’s fork.

nicxvan’s picture

I regenerated baseline.

  • longwave committed 07eea76d on 11.1.x
    Issue #3310170 follow-up by nicxvan, godotislate: Use UUID as entity ID...

  • longwave committed b5c5a2a3 on 11.x
    Issue #3310170 follow-up by nicxvan, godotislate: Use UUID as entity ID
    
longwave’s picture

Thanks for the followup, not sure what happened there as PHPStan didn't complain locally on commit.

Committed and pushed b5c5a2a3ef8 to 11.x and 07eea76d204 to 11.1.x. Thanks!

wim leers’s picture

berdir’s picture

@wim: The point of this issue is that the UUID can be the ID, at the same time. That means these entities do not need to link by UUID because it's literally the same thing as linking by ID.

This got it in just before #3396166: Convert entity type discovery to PHP attributes, so the new EntityTestUuidId entity type is still using annotation instead of attributes, should we open a follow-up issue or do a quick fix here?

longwave’s picture

@berdir at some point we will deprecate annotations so can probably mop it up then?

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.

c-logemann’s picture

I just successfully created an "uuid only entity" with just using "entity_keys = { "id" = "uuid", ...". That's awesome and many thanks to everybody who made this possible. I will use this from now on many upcoming contrib and custom entities.

But when there is already content and other code handling with the serial ID? On some other uuid discussions I read many comments of people loving the old serial user and node IDs. For users I created the module "u3id" to handle a mixed usage. So I see a need of mixed usage and transformation of existing content etc. How to start in contrib/custom and how at core entities?

ressa’s picture

This is great! But how do you use it?

As I understand this change, content entities ("nodes") may now use UUID's instead of integers, so I was hoping to see a new option, like "Use UUID's in stead of classic node ID's" under /admin/structure/types/add, when creating a new content type ...

Do you need to create a custom module to use this new feature, and if yes, is there a documentation page?

berdir’s picture

This is an option for new entity types, it is not something that can be enabled for existing entity types such as nodes. It's a storage/base field decision, not something that can be configured. Existing core entity types such as nodes do not and will never* use this.

It's an interesting option for some use cases, but has its drawbacks, database indexes for v4 UUIDs perform much worse than integers for example.

* never say never, but I don't see how that's possible to migrate, all kinds of things reference nodes by ID and there are still things that do not support string entity IDs in core, such as comments.

ressa’s picture

Thanks for clarifying @berdir. I wonder if a regular old school nid-based node could reference content with a UUID-based entity type, or perhaps as @c-logemann mentions, he created "... the module "u3id" to handle a mixed usage.", and this is needed?

If the basics of setting up a basic entity type using UUID as identifier hasn't already been created, it would be fantastic if a basic documentation page was made, maybe also covering the subject of mixing nid's and UUID's?

berdir’s picture

I think you still mix up entity types, content types and bundles a bit. Which is understandable, because the terminology is confusing.

This allows to set both the id and the uuid to the same field for an entity type. That's all. In practice, all the change really does is fix a problem that the storage got confused when the same field ended up multiple times in automatically created indexes in the table.

See the committed example test entity type: https://git.drupalcode.org/project/drupal/-/commit/9b1a9796b42d6bd581b5c..., I also mentioned it in the change record a bit.

There is no mixing of nid and uuid, because a custom entity type doesn't have a nid (because only nodes do) or any other integer id, both id() and uuid() return the UUID. They can be referenced by and reference other entity types because entity reference fields understand when the target entity type has a string ID.

ressa’s picture

Ah yes, I do have a hard time differentiating between them, thank you for your patience :)

Maybe it would have been easier if I had shared my vision: I was hoping to be able to create content, like we can with nodes now, but on top of NID's use UUID's as well (or solely UUID's) specifically to be able to manage most content as configuration, allowing some (or all) content to use UUID's as identifier. The point would be to not use the standard auto generated incrementing integer as unique identifier, as we do now with NID's. But maybe that's not possible ...

berdir’s picture

All nodes also have a UUID, always had since Drupal 8.0. You can rely on that.

The default_content project (including recipe default content in core which is the same format and implementation) uses the UUID as primary identifier for dependency resolution. Both REST and JSON:API modules use the UUID of entities the primary external identifier.

In both cases because the UUID is reliably unique and can be explicitly set, while it is not possible to rely on using a specific serial ID as that may already be in use in a given system.

ressa’s picture

That's awesome! I didn't realize that, thanks for making me aware, and of modules using UUID as identifier, I'll take a closer look at them. I am very grateful for your detailed answers, it helped me a lot. ... and now, I won't be adding more noise to this issue :)