Dividing responsibilities

For maximum flexibility we want to split up the entity controller into multiple specialized entity controllers (storage, cache, display, forms, etc.).

  • We would be able to simply override the cache controller to include a persistent (database?) cache on top of the static caching or the like.
  • It is also a matter of purity: Each entity controller should really just do one thing: Caching // Storage // Display // Form // Access or module integrations like Views integration … and so on.
  • By providing specialized controllers we provide clean interfaces and allow the “end-developer” (heh) to easily access and manipulate how entities are integrated in the system. Developers can choose among the provided controllers or customize them to their specific needs.
  • Separating entity controllers is going to help us with disentangling the mess in core.

The logic around the entity controllers

The entity_get_controller() function is going to be the universal entity controller fetcher with the centralized static caching of the instantiated entity controllers. It will accept two arguments: a) the entity type and b) the name of the controller. In addition, we are going to define wrappers for this function for every specific controller type.

// Examples:
entity_cache_controller('node')->flush();
entity_storage_controller('node')->loadMultiple(array(1, 2, 3));
entity_storage_controller('node')->save($entity);

// Or consider other possible controllers:
entity_display_controller('node')->view($entity);
entity_form_controller('node')->build($entity);
entity_form_controller('node')->validate($entity);
entity_access_controller('node')->check('view', $entity);

Using wrapper functions like that will give us auto-complete as well as static code validation, and so on.

Separating and combining entity controllers

We agreed that it is important to separate the single controllers from each other. Therefore collaboration should really happen on a higher level and outside of the controllers. A good example for this is entity_load(). This function would first try to retrieve as many entities as possible from the cache via the cache controller, and then continue to load all remaining entities via the storage controller.

Deletion and saving of entities would work similar: Every time we delete an entity via something like an entity_delete() function we first delete it via the storage controller and then remove it from the cache controller.

Conclusion: Never use a remote entity controller and its methods inside of an entity controller.

E.g. the entity_load() function could look like this:

function entity_load($entity_type, $ids, $conditions) {
  // Initialize an array to be populated with entities by the cache and storage controllers.
  $entities = array();
  if (!entity_cache_controller($entity_type)->getMultiple($ids, $conditions, $entities)) {
    // The cache controller was not able to retrieve all requested entities. Retrieve the remaining
    // entities via the storage controller.
    entity_storage_controller($entity_type)->loadMultiple($ids, $conditions, $entities);
  }
  return $entities;  
}

Another example of how much simpler and easier our code would become is entity_load_unchanged():

entity_load_unchanged() first resets the cache for the entity that is requested. This is bad, because we don’t really want to lose the cache here... We just want to load the original (unchanged) entity from the database. With the concept outlined here we could just skip the cache controller and load the requested entity directly from the database:

// Before
function entity_load_unchanged($entity_type, $id) {
  entity_get_controller($entity_type)->resetCache(array($id));
  $result = entity_get_controller($entity_type)->load(array($id));
  return reset($result);
}

// After
function entity_load_unchanged($entity_type, $id) {
  return entity_storage_controller($entity_type)->load($id);
}

entity_load_unchanged() is a wonky workaround currently as there is no (clean) way to skip the cache while loading an entity, since that is a part of the current load method in the controller.

There surely are more examples that prove this point but as you can see we would be able to control entities much easier by invoking completely specialized operations instead of functions with 2, 3 or 4 different purposes.

Declaring responsibility

We would like to change the structure of hook_entity_info() to reflect the changes in the entity controller concept. Here is an example (simplified):

function node_entity_info() {
  $info['node'] = array(
    'label' => t('Node'),
    'controllers' => array(
      'storage' => array(
        'class' => 'NodeStorageController',  // Custom storage controller
        'base table' => 'node',
        'revision table' => 'node_revision',
      ),
      'cache' => array(
        // It’s debatable whether we need an intermediate controller or
        // whether we can directly rely on the already defined cache system interface.
        // For advanced caching functionality (like LRU caching) we would still need a custom
        // implementation of course.
        'class' => 'EntityDefaultCacheController', // Permanent cache() + static cache controller
        'class' => 'DrupalDefaultDatabaseCache', // As known as cache($bin)->get()/->set()/etc
        'bin' => 'node', // Custom cache bin
        'static cache' => TRUE,
      ),
    ),
  );
  return $info;
}

The settings of each controller should be defined together with the controller class in a sub-array to keep the parent array level clean. Each controller has its own, unique array key to work with the entity() function. Each controller is initialized with a) the entity type and b) the settings that are defined for it (additionally the controller could still pull the whole entity info by invoking entity_get_info($entity_type).)

A way forward

As this is a massive change in the architecture and would affect all entities that are currently defined in core I would suggest this:

  1. Start by creating the controller interfaces and their default implementations.
  2. Change the hook_entity_info() structure to work with the entity() function (as suggested).
  3. Split the current entity controllers to use the new pattern.
  4. Change all functions like entity_load() etc. to work with the new controllers and assign the new controllers to all core entities.

The working branch is this: http://drupalcode.org/sandbox/heyrocker/1145636.git/shortlog/refs/heads/...

Comments

moshe weitzman’s picture

Looks like a good idea, but I don't really like the example below. It it non obvious that getMultiple hits cache while loadMultiple does not.

function entity_load($entity_type, $ids, $conditions) {
  // Initialize an array to be populated with entities by the cache and storage controllers.
  $entities = array();
  if (!entity_cache_controller($entity_type)->getMultiple($ids, $conditions, $entities)) {
    // The cache controller was not able to retrieve all requested entities. Retrieve the remaining
    // entities via the storage controller.
    entity_storage_controller($entity_type)->loadMultiple($ids, $conditions, $entities);
  }
  return $entities;  
}
fago’s picture

ad #4: You forget to mention that getMultiple() is a method of the cache controller, and loadMultiple of the storage controller. So that the first one hits the cache should be clear, that the second one doesn't is indeed not immediately clear.
Still I don't think it's a big issue. You have to grasp one time that the storage controller is about storage and nothing else (why should it?), then you got it. I'd even say this is what a non-Drupal person, who does not know the d7 design, expect from the controller naming. That + clearly documenting what which controller is for should do the job imho.

catch’s picture

That example doesn't have any setting back to cache in it. We'd need to know which ids were returned from the cache controller, which from storage, then set back only the ones loaded from storage back to the cache.

catch’s picture

plach’s picture

Entity form controller work in #1499596: Introduce a basic entity form controller. Early reviews welcome :)

fago’s picture

That example doesn't have any setting back to cache in it. We'd need to know which ids were returned from the cache controller, which from storage, then set back only the ones loaded from storage back to the cache.

That's true. I think the storage controller we'll have to take care of updating the cache controller anyways, i.e. always when the storage controller loads, update the cache controller. Respectively, where we currently call resetCache() we'll have to call reset the cache controller's cache.

That way we ensure the cache controller remains up2date, but we can still talk to each of the controller on its own.

mitchell’s picture

> Remote Entity Controllers
See also: #1125024: Enabling Schema API with remote databases

fubhy’s picture

We also got #1696660: Add an entity access API for single entity access... We should start working on that again.

mitchell’s picture

There's also #1763974: Convert entity type info into plugins.

How else could entity controllers use plugins?

klausi’s picture

Status: Active » Fixed

yes we do use specialized entity controllers now, calling this one fixed?

Automatically closed -- issue fixed for 2 weeks with no activity.

Anonymous’s picture

Issue summary: View changes

Add link to branch