Updated: Comment #N

Problem/Motivation

While working on #2208617: Add key value entity storage, I was amazed at the amount of code every entity storage needed in order to maintain the assumptions of core.
There are hooks to invoke, exceptions to throw, and very little of the resulting code was actually specific to the storage itself.

It is very easy to miss some of the assumptions, and/or duplicate large chunks of code from the various storage-specific core entity storage controllers.

Proposed resolution

Enhance EntityStorageBase by adding abstract protected methods to wrap just the storage-specific needs, moving as much of the boilerplate upstream as possible.

Remaining tasks

N/A

User interface changes

N/A

API changes

Added the following methods to EntityStorageBase:

protected function doCreate($entity_class, array $values);
protected function doLoad(array $ids = NULL);
protected function doSave($id, EntityInterface $entity);
protected function has($id, EntityInterface $entity);
protected function doDelete($entities);
Support from Acquia helps fund testing for Drupal Acquia logo

Comments

tim.plunkett’s picture

This currently incorporates code from #2224833: Remove redundant ID manipulation in ConfigEntityStorage::save(). Attaching a do-not-test patch without that for reviewing.

The diffstat of relevant changes is
+493, -527

But that's really
+243, -2
for the base class, and
+250, -525
for the actual implementations

tim.plunkett’s picture

When I wrote this, I used the prefix "do", but for some reason (lack of sleep?) I changed it to "perform". @Berdir suggested switching it back, I will next time through.

tim.plunkett’s picture

Issue summary: View changes
FileSize
53.08 KB

This still has that other issue rolled in, but I had to reroll around the entity storage class renames anyway. I also changed perform to do.

Berdir’s picture

As mentioned in IRC, have a look at how we're refactoring loadMultiple() in #597236: Add entity caching to core. The ensureOrder() helper might not be necessary anymore as it only exists once, but splitting into different method calls might make a lot of sense for all storage controller, with an empty implementation for the persistent cache or maybe that could still be custom to the content entity implementation.

the uuid stuff in create() shouldn't be in the base class as content entities get that through the field default value.

Overall, this looks interesting, I'm a bit worried about making too many assumptions, based on the diff it seems like we're saving more code in config entity storage than for content entity. But you can always override the methods...

I would prefer to get the entity cache issue in first, though, because the current doLoad() implementation of ContentEntityStorage is a lie and only works with the postLoad() hack, I'm cleaning up a lot there.

tim.plunkett’s picture

Status: Needs review » Postponed
Related issues: +#597236: Add entity caching to core

the uuid stuff in create() shouldn't be in the base class as content entities get that through the field default value.

It's not really a problem since those storage classes don't define the key, so this code never runs...

I'm a bit worried about making too many assumptions, based on the diff it seems like we're saving more code in config entity storage than for content entity. But you can always override the methods...

Well ConfigEntityStorage is much much simpler, without fields and bundles and translations on this level, we have less to worry about. But I did not write this with ConfigEntityStorage in mind, but actually from the perspective of writing a new entity storage class. Not needing all that boilerplate is really really helpful.
And yes, you can just override the methods. It doesn't make anything harder for the content entity classes.

I would prefer to get the entity cache issue in first

Fine by me!

tim.plunkett’s picture

FileSize
44.04 KB

Rerolling now that the other blocker went in. Still postponed.

tim.plunkett’s picture

Status: Postponed » Needs review
FileSize
49.26 KB
5.42 KB

I agreed to postpone this because the entity cache issue seemed likely to go in at any minute. But it's been a couple weeks, and doesn't seem as imminent. I checked with @Berdir, and he's okay with me resuming work on this, and whichever one gets in first wins.

Note that the amount of code savings will go up even more when taking #2208617: Add key value entity storage into account, which was RTBC but is now blocked on a bit of unrelated fragile code (#2240709: ConfigImportUITest::testImport fails when the module list changes).

Updated now that there are unit tests for config entity storage.

Status: Needs review » Needs work

The last submitted patch, 7: entitystorage-2225955-7.patch, failed testing.

tim.plunkett’s picture

tim.plunkett’s picture

7: entitystorage-2225955-7.patch queued for re-testing.

The last submitted patch, 7: entitystorage-2225955-7.patch, failed testing.

tim.plunkett’s picture

Status: Needs work » Needs review
FileSize
49.29 KB

Now I conflicted with my own fixes :)
Rerolled.

Berdir’s picture

While I would love to unify things and make it easier for actual implementations, I'm worried a lot that this is going into a rather different direction than #597236: Add entity caching to core right now. It is going to be a crazy re-roll anyway, that's OK, but I'm hoping that we can find a way that will make my patch easier, not harder :)

Did you have a look at the flow there already? See also below for some snippets, but you probably want to have a look at the applied patch, I will do the same with this, it's hard to grasp it from looking at the patch.

  1. +++ b/core/lib/Drupal/Core/Entity/EntityStorageBase.php
    @@ -155,12 +155,128 @@ protected function invokeHook($hook, EntityInterface $entity) {
    +      $queried_entities = $this->doLoad($ids);
    

    Why not doLoadMultiple()? That's what I'm doing in #597236: Add entity caching to core.

    This is how loadMultiple() looks over there for ContentEntityDatabaseStorage:

      /**
       * {@inheritdoc}
       */
      public function loadMultiple(array $ids = NULL) {
        // Create a new variable which is either a prepared version of the $ids
        // array for later comparison with the entity cache, or FALSE if no $ids
        // were passed. The $ids array is reduced as items are loaded from cache,
        // and we need to know if it's empty for this reason to avoid querying the
        // database when all requested entities are loaded from cache.
        $passed_ids = !empty($ids) ? array_flip($ids) : FALSE;
    
        // Try to load entities from the static cache, if the entity type supports
        // static caching. This will remove IDs that were loaded from $ids.
        $entities_from_static_cache = $this->getFromStaticCache($ids);
    
        // Load remaining entities either from the persistent cache or storage.
        $entities = $this->doLoadMultiple($ids);
        $this->invokeLoadUncachedHook($entities);
        $this->setStaticCache($entities);
    
        $entities += $entities_from_static_cache;
        return $this->ensureOrder($passed_ids, $entities);
      }
    

    It no longer makes sense to use that 1:1, because for example the ensureOrder() method added there was only useful as long as we still had different implementations of that. However, renaming the cache methods to StaticCache() would help my issue a lot, then I won't have to do that anymore and right now, this change only happens within my own storage controller but now, I would have to rename it globally, to introduce a new caching layer locally. The return by reference pattern is I think also useful but I don't care much about that.

  2. +++ b/core/lib/Drupal/Core/Entity/EntityStorageBase.php
    @@ -155,12 +155,128 @@ protected function invokeHook($hook, EntityInterface $entity) {
        */
       protected function postLoad(array &$queried_entities) {
    +    $queried_entities = $this->mapFromStorageRecords($queried_entities);
    +
    

    I think this is wrong, this should be part of load, not postLoad(). This will not work for me in the entity cache issue, because doLoad(Multiple)() does two things, load from persistent cache and load from storage, and I need to treat the responses differently and even invoke different hooks on them. Hooks that obviously already need the entity objects.

    The fact that the content entity database storage does that right now is a forgotten left-over that I have to clean up in the entity cache issue.

    This is my doLoadMultiple() implementation there, including getFromStorage() and getFromPersistentCache():

      protected function doLoadMultiple(array $ids = NULL) {
        // There is nothing to do if $ids is an empty array.
        if ($ids === array()) {
          return array();
        }
    
        // Attempt to load entities from the persistent cache. This will remove IDs
        // that were loaded from $ids.
        $entities_from_cache = $this->getFromPersistentCache($ids);
    
        // Load any remaining entities from the database.
        $entities_from_storage = $this->getFromStorage($ids);
        $this->setPersistentCache($entities_from_storage);
    
        return $entities_from_cache + $entities_from_storage;
      }
    
      protected function getFromStorage(array $ids = NULL) {
        $entities = array();
    
        if ($ids === NULL || $ids) {
          // Build and execute the query.
          $query_result = $this->buildQuery($ids)->execute();
          $records = $query_result->fetchAllAssoc($this->idKey);
    
          // Map the loaded records into entity objects and according fields.
          if ($records) {
            $entities = $this->mapFromStorageRecords($records);
    
            // Attach field values.
            if ($this->entityType->isFieldable()) {
              $this->loadFieldItems($entities);
            }
          }
        }
    
        // Pass all entities loaded from the database through $this->postLoad(),
        // to invoke the load hooks and postLoad() method on the entity class.
        $this->postLoad($entities);
    
        return $entities;
      }
    
      protected function getFromStaticCache(&$ids) {
        $entities = array();
    
        if ($this->entityType->isStaticallyCacheable() && $ids) {
          $entities = array();
          // Load any available entities from the internal cache.
          if (!empty($this->entities)) {
            foreach ($ids as $index => $id) {
              if (isset($this->entities[$id])) {
                $entities[$id] = $this->entities[$id];
                // Remove the ID from the list.
                unset($ids[$index]);
              }
            }
          }
        }
        return $entities;
      }
    

    Again, some things don't match, like calling postLoad() in getStorage(), we will need to find a different solution for that (@fago is proposing there to introduce a new hook_entity_storage_load() hook instead of the normal one) but still, I need to find a way to do what I need to do, preferably without overriding everything from the base class :)

Berdir’s picture

Ok, how do you feel about this?

This does a few more changes that aren't strictly required to introduce the do*() method pattern, but it brings this back in line with the refactorings in the other issue.

- Added an additional check that we have an uuidKey and an uuidService (also moved the property definition up), because it only didn't trigger for ContentEntityDatabaseStorage as isset() on the uuid key was TRUE as it is an object. Also, it assumed a public property, so I put it int the $values instead. Still not completely happy about this, but that works for now I think.
- renamed the static methods and removed $this->cache (was very confusingly named)
- Moved up $this->entityCache from ContentEntityStorageBase, it was still there but not used anymore, I think this is helpful to have everywhere, then doCreate() can use $this->entityClass instead of having to pass it around.
- renamed doLoad() to doLoadMultiple()
- Moved the mapFromStorageRecords() into doLoadMultiple() and documented that as returning entities, because IMHO, what happens in between those two methods is an implementation detail of the storage, which is important for the persistent cache refactoring. This is one more line in that method, so I think it's OK, the other alternative would be to have a default doLoadMultiple() implementation that would call fetchRecords() or something like that which would return the raw results. But given that for example the content entity storage returns objects while the others return arrays is a nice example why this should be internal.
- Some changes to the User and Comment storage that are necessary due to postLoad() now longer receiving content entities, moving those hacks to mapFromStorageRecords() instead.

Another thing is that this now does what we came to agree on in #1885830: Enable static caching for config entities (more or less), support static caching for config entities but don't enable it by default. It however is missing the static cache changes that are important to make the override stuff (and runtime changes of overrides) work.

This should now simplify the entity cache issue *a lot* and allow us to focus on doing persistent caching instead of refactoring the content entity storage :)

Status: Needs review » Needs work

The last submitted patch, 14: entitystorage-2225955-14.patch, failed testing.

tim.plunkett’s picture

Status: Needs work » Needs review
FileSize
4.52 KB
62.2 KB

I'm okay with those changes. Using ConfigEntityStorage as a barometer for the DX implications, the only actual difference was this:

+++ b/core/lib/Drupal/Core/Config/Entity/ConfigEntityStorage.php
@@ -179,20 +171,20 @@ protected function doLoad(array $ids = NULL) {
-    return $result;
+    return $this->mapFromStorageRecords($records);

Updated the unit tests (now getClass is called in the constructor).

Thanks @Berdir!

Berdir’s picture

Sorry about not updating the unit tests, thanks for the fixes :)

Looks like there's a bunch of failing tests, probably messed something up, I'll look into it tomorrow.

Status: Needs review » Needs work

The last submitted patch, 16: entitystorage-2225955-16.patch, failed testing.

Berdir’s picture

Ok, most test fails are because UserStorage::mapFromStorageRecords()/addRoles() doesn't check if the array is empty, I earlier had a check that only called that method when there where $records but removed it as it should work for an empty array, but either within addRoles() or in the call to it, we need to check if we have results.

tim.plunkett’s picture

Assigned: tim.plunkett » Berdir

I saw the same thing. That should be the job of addRoles.

But that's not the only fail, for example FieldAttachStorageTest is unrelated.

I have no idea what you broke, but it definitely is around the timing of the call to mapFromStorageRecords.

Berdir’s picture

Status: Needs work » Needs review
FileSize
63.75 KB
2.1 KB

Yeah, I forgot to update loadRevision().

Status: Needs review » Needs work

The last submitted patch, 21: entitystorage-2225955-21.patch, failed testing.

Berdir’s picture

Status: Needs work » Needs review
FileSize
64.09 KB
672 bytes

This should fix the last test fails.

tim.plunkett’s picture

I'm +1 one on the changes since my patch. Now who do we get to RTBC? :)

dawehner’s picture

Wow, this really simplifies stuff and makes it way better for people writing stuff.

  1. +++ b/core/lib/Drupal/Core/Entity/EntityStorageBase.php
    @@ -155,27 +162,265 @@ protected function invokeHook($hook, EntityInterface $entity) {
    +   * Performs storage-specific loading of entities.
    +   *
    +   * @param array|null $ids
    +   *   (optional) An array of entity IDs, or NULL to load all entities.
    +   *
    +   * @return \Drupal\Core\Entity\EntityInterface[]
    +   *   Associative array of entities, keyed on the entity ID.
    +   */
    +  abstract protected function doLoadMultiple(array $ids = NULL);
    +
    

    Given that most people will not extend from that class directly I would suggest to tell people more actively that this is THE method you would to work with. Maybe add some sentence about that

  2. +++ b/core/lib/Drupal/Core/Entity/EntityStorageBase.php
    @@ -155,27 +162,265 @@ protected function invokeHook($hook, EntityInterface $entity) {
    +  abstract protected function doSave($id, EntityInterface $entity);
    ...
    +  abstract protected function has($id, EntityInterface $entity);
    ...
    +  abstract protected function doDelete($entities);
    

    do load multiple is directly under loadMultiple but these ones are grouped at the bottom. Is there a specific reason we did that?

  3. +++ b/core/tests/Drupal/Tests/Core/Config/Entity/ConfigEntityStorageTest.php
    @@ -110,6 +110,8 @@ public static function getInfo() {
    +   * @covers ::__construct()
    

    <3

tim.plunkett’s picture

1) Added a line explaining that
2) Rearranged the new methods, thanks
3) :)

dawehner’s picture

Status: Needs review » Reviewed & tested by the community

Thank you for these changes

tim.plunkett’s picture

#2208617: Add key value entity storage will need to be reworked slightly after this goes in.

xjm’s picture

Will this issue require any change record updates? I think it might also merit an 8.x to 8.x change record, since it will get rid of a lot of redundant code for most entity storage handlers, right?

tim.plunkett’s picture

andypost’s picture

+++ b/core/lib/Drupal/Core/Entity/EntityStorageBase.php
@@ -155,27 +162,269 @@ protected function invokeHook($hook, EntityInterface $entity) {
+   * Determines if this entity already exists in storage.
+   *
+   * @param int|string $id
+   *   The original entity ID.
+   * @param \Drupal\Core\Entity\EntityInterface $entity
+   *   The entity being saved.
...
+  abstract protected function has($id, EntityInterface $entity);

Why the order is not a ($entity, $id) most of time entity goes first

tim.plunkett’s picture

I put ID first because we explicitly cannot rely on $entity->id(), the $id was specially calculated for this purpose.
Since we have no other places where this is done, putting $id first is fine (and preferred).

tim.plunkett’s picture

Priority: Normal » Major

This is blocking a major (#597236: Add entity caching to core)

alexpott’s picture

Title: Improve the DX of writing entity storage controllers » Improve the DX of writing entity storage classes
Status: Reviewed & tested by the community » Fixed

Reviewed this asking a few questions of @tim.plunkett and @Berdir in IRC. I raised questions about:

  1. the implementation of ContentEntityDatabaseStorage::has() since it does not actually check the db
  2. the lack of injecting uuidService into EntityStorageBase
  3. the doSomething pattern

The answers were:

  1. This is how content entity storages currently work
  2. Content entities do not need it and we check before using the uuidService
  3. the pattern is used elsewhere and all are abstracted protected apart from doCreate so implementation know to implement these methods

Committed 1a5d1fc and pushed to 8.x. Thanks!

diff --git a/core/lib/Drupal/Core/Entity/EntityDatabaseStorage.php b/core/lib/Drupal/Core/Entity/EntityDatabaseStorage.php
index 7fb61a7..33eb34c 100644
--- a/core/lib/Drupal/Core/Entity/EntityDatabaseStorage.php
+++ b/core/lib/Drupal/Core/Entity/EntityDatabaseStorage.php
@@ -9,7 +9,6 @@
 
 use Drupal\Component\Uuid\UuidInterface;
 use Drupal\Core\Database\Connection;
-use Drupal\Core\Entity\Query\QueryInterface;
 use Symfony\Component\DependencyInjection\ContainerInterface;
 
 /**

Fixed in commit.

  • Commit 1a5d1fc on 8.x by alexpott:
    Issue #2225955 by tim.plunkett, Berdir: Improve the DX of writing entity...

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.