Updated: Comment #N

Problem/Motivation

The order in which data is saved, caches are cleared and hooks invoked is inconsistent and can result in bugs, e.g. calls in field update methods resulting in loading a partially saved entity because the configurable fields are saved after the cache clear.

Proposed resolution

Unify the call order to this:
1. Save/delete base stuff
2. save/delete configurable fields
3. reset caches
4. invoke entity method and hooks.

Remaining tasks

Check if this results in failing tests, review.

User interface changes

None.

API changes

Not really, other than the slightly different behavior of when caches are cleared.

Original report by @username

I was modifying a module which required me to modify a node and save it. However, the changes to that node kept reverting back to what it was before I did the save. Upon investigation I determined that it was the node_load cache and when I did a load just after the save with the flag to reset the cache, all worked as it should.

So first, is there a deliberate reason why node_save can save without keeping the node_load cache in sync?

Second, if it is not deliberate then perhaps a change to the node system along the lines of the following would be useful:

In node_load:

function node_load($param = array(), $revision = NULL, $reset = NULL) {
  global $user;
  static $nodes = array();

  if ($reset) {
  	if (is_numeric($reset)) {
  		if (array_key_exists($reset, $nodes)) {
  			unset($nodes[$reset]);
  			return true;
  		} else {
  			return false;
  		}
	 	} else {
	    $nodes = array();
		}
  }

In node_save (about 5 lines from the bottom):

  // Call the node specific callback (if any):
  if ($node->is_new) {
    node_invoke($node, 'insert');
    node_invoke_nodeapi($node, 'insert');
  }
  else {
    node_invoke($node, 'update');
    node_invoke_nodeapi($node, 'update');
    node_load(null, null, $node->nid);
  }

  // Update the node access table for this node.
  node_access_acquire_grants($node);

  // Clear the cache so an anonymous poster can see the node being added or updated.
  cache_clear_all();
}
Support from Acquia helps fund testing for Drupal Acquia logo

Comments

Reg’s picture

Title: Node_load Cache » node_load cache out of sync with node_save

This is a little cleaner for the node_load part:

  if ($reset) {
    if (is_numeric($reset)) {
      $result = array_key_exists($reset, $nodes);
      if ($result) unset($nodes[$reset]);
      return $result;
     } else {
      $nodes = array();
    }
  }
Reg’s picture

Category: task » bug

I'm changing this to a bug as there is no way that I can see where what you save should be allowed to be out of sync with what you load at such a fundamental level.

moshe weitzman’s picture

Yes, it would be smart to invalidate this cache from within node_save()

moshe weitzman’s picture

This is fixed the patch proposed at http://drupal.org/node/111127#comment-768053. That patch is much bigger and intended for D7.

David_Rothstein’s picture

Version: 5.2 » 6.x-dev
Status: Active » Needs review
FileSize
614 bytes

Here is a patch for Drupal 6 that should fix the bug.

Note, I wrote this for Drupal 6 rather than Drupal 7 because:
(a) the bug appears to already be fixed in Drupal 7, although I think it may have been fixed inadvertently (see #424602: Unsaved node changes are reflected on subsequent calls to node_load()), and
(b) static caching in Drupal 7 is currently in a state of flux, so I wasn't even positive what the correct fix would be right now

schwern’s picture

I just hit this bug in 6.x.

@#5 node_load() flushes the entire node cache. That seems a bit extreme. node_save() could instead insert the node it just saved into the node cache. That would require exposing the node_load() cache to node_save() as well. Sounds worth it.

schwern’s picture

FileSize
1.37 KB

FWIW I didn't see a test for this in 7, so here's a test patch.

schwern’s picture

FileSize
995 bytes

Digging into this, there's no special logic in 7 to clear the cache on save. The difference between 6 and 7 is that in 7 node_load() returns the node from the cache and 6 stores and returns a COPY of the node from the cache. In 7 when you call node_load() and fiddle with the node you've already updated the cache.

There's two solutions for 6: stop copying the node before returning it from the cache; update the node cache in node_save(). The former has the advantage of compatibility.

In 7, the consequences of node_load() returning the real cached object means that this now becomes unstable:

    $node = node_load($nid);
    $node->body = "Something";
    node_save($node);

    $node->body = "Something else";

    // code passes

    $node = node_load($nid);
    print $node->body;

What is $node->body? If the node cache happened to have been cleared between the two node_load() calls its going to be its "Something". If not, its "Something else". This exposes the existence of the cache to the user and allows action at a distance. ++ungood

The attached test class demonstrates the bug. It passes in 6 and fails in 7 (remove setUp to run it in 6).

David_Rothstein’s picture

Yup, exactly :) As I already noted above, please see #424602: Unsaved node changes are reflected on subsequent calls to node_load()

schwern’s picture

FileSize
7 KB
4.94 KB

Here's full patches with tests to solve this bug and #424602 in both 6.11 and 7.x. I don't think #424602 is really a duplicate of #154859 as nodes have their own in memory cache unrelated to the database caches.

I'm new to PHP and Drupal, so please excuse any non-idiomatic stylings.

I turned on all error_reporting() in the tests because it seems like a good idea to run the tests with all the warnings flipped on. It exposes some issues in simpletest and the node code. Feel free to turn them off if you don't want to deal with them right now.

yhager’s picture

Due to the nature of patches for 6.x, I assume this won't go in as is. However, this is rather nasty bug to fall on... Maybe a clean workaround could be laid out for 6.x developers, something along the lines of:

if you node_save() something you might want to node_load() later, make sure you clean the nodes cache by calling node_load(array(), NULL, TRUE).

Subscribing.

hanoii’s picture

I just got to this issue and I think the scope of it is even wider, probably covered in other issues but worth mentioning, maybe the same happens on D7?

If you are in any node/xxx/yyy page (node/123/edit for instance), the menu system will call node_load(123) at the very beginning of the page request, and the node will eventually be statically cached in the node_load() function.

This means that after clicking SAVE in the node/123/edit page, the menu will first call node_load(123) function but the node has not been updated at all yet, so it's really caching the old node. So every node_load(123) function called along the update process will always get the old node.

Maybe the patches already solved this, but thought it worth mentioning.

David_Rothstein’s picture

Version: 6.x-dev » 7.x-dev
FileSize
3.64 KB

I think this issue is becoming too confused with #424602: Unsaved node changes are reflected on subsequent calls to node_load(). Please move any discussion related to that problem there (or really to the meta-issue which it was marked duplicate of). Probably I made a mistake by even referring to it in the first place -- although it (coincidentally) helps fix the bug here, it doesn't actually completely fix it, and plus, we shouldn't rely on it but rather should have an actual direct fix.

The attached patch for D7 combines the fix from #5 with a reworked version of the test from #7 (thanks @schwern, and don't be discouraged that your patch sat for so long! - back in April, many people weren't yet focused on the "bug-fixing" part of the D7 development cycle...) The test works with cloned nodes so that it will fail without the actual fix, regardless of what happens with #424602: Unsaved node changes are reflected on subsequent calls to node_load().

Regarding the comment above in #6, I think we actually do need to clear the entire node cache, rather than just the node being saved. Instead of explaining it here, I figured I'd just explain it in a code comment in the patch :)

moshe weitzman’s picture

Not so sure about the clear all nodes upon saving one. Couldn't that be implemented by the text filter module or the node ref module in your examples? I don't think node.module itself needs to be so aggressive, since core does not support these things. My main concern is that we blow away the whole persistent entity cache - http://drupal.org/project/entitycache. Perhaps catch can chime in here.

catch’s picture

Status: Needs review » Needs work

This should use entity_get_controller('node')->resetCache(); to clear the static.

entitycache.module doesn't clear the persistent cache when you do ->resetCache(); so it wouldn't break that particularly. We could add an optional $nid argument to resetCache() too I guess.

David_Rothstein’s picture

Status: Needs work » Needs review
FileSize
3.71 KB

OK, using the resetCache() function now instead.

Seems to me that's probably far enough to go for this patch, since it makes the behavior consistent with http://api.drupal.org/api/function/node_delete_multiple/7... Trying to deal with only clearing a single $nid or group of $nids really sounds more complicated, at least off the top of my head (can a text field easily figure out the cache clearing function for the type of object it is attached to and know how to call it?). Plus it seems like a lot of work for minimal gain, no? The code right above this clears the entire page cache, so clearing the static cache for the rest of the current request seems like nothing compared to that...

moshe weitzman’s picture

Makes sense.

Could save a few cycles by having NodeSaveTestCase extend DrupalUnitTestCase instead of DrupalWebTestCase? Or will the node_save() fail in that state?

catch’s picture

Status: Needs review » Reviewed & tested by the community

Yeah this is fine, we'll just need to be very careful in entitycache to keep the persistent and static cache clearing separate, but that's already the case.

Cross posted with Moshe - UnitTestCase has no database available, at all, so anything which could possibly trigger a database hit is a no go.

webchick’s picture

Status: Reviewed & tested by the community » Needs work

No longer applies. :(

David_Rothstein’s picture

Status: Needs work » Needs review
FileSize
3.74 KB

Rerolled.

The reroll was completely trivial, so should be RTBC again assuming the tests pass.

David_Rothstein’s picture

Status: Needs review » Reviewed & tested by the community

The tests passed :)

fago’s picture

Status: Reviewed & tested by the community » Needs work

The node_load() cache being still there after doing a node_save() is currently the only possibility to retrieve the unchanged node after an update. To be able to get the unchanged node is really important when you want to react on changes, thus most rules rely on reacting on changes! This patch makes it impossible to react on this changes on API level!

Ideally the API would make the unchanged node object available in hook_node_presave and hook_node_update. Thus when we remove the only previously possibility to get the unchanged node, we really need to make it available properly instead.

Also FIELD_LANGUAGE_NONE is LANGUAGE_NONE now.

fago’s picture

With the current behaviour of objects being shared #154859: Document that cached entity objects are not necessarily cloned any more we are already close to the desired behaviour no? If the changed was previously loaded with node_load() the node is already updated in the cache. However in some situations, the updated node is created fresh e.g. from form values or comes from somewhere else, e.g. the form cache - then the node_load() cache hasn't been updated yet. So what about just updating the cached node with the new one? That way we would save further lookups of a node we already had in memory.

+  // Clear the static cache of nodes. Since the content of one node can depend
+  // arbitrarily on the content of another (for example, via a text filter or
+  // field that renders part of one node inside of another), it is not
+  // sufficient to refresh only the node that is being saved; instead, all
+  // nodes must be refreshed.

If a node depends on another node it should really only save a pointer to it - thus the nid or vid. In that case everything is fine as any further request to get the the node will retrieve the updated node anyway. So fixing the cache of the changed node only should be fine.

fago’s picture

For the issue of being able to react on changes I created #651240: Allow modules to react to changes to an entity

fago’s picture

Status: Needs work » Needs review
FileSize
3.35 KB

I fixed the FIELD_LANGUAGE_NONE define and re-rolled the patch.

While I'm still not sure about the complete (static) cache clear I trust the opinion of performance specialists like catch. :)

fago’s picture

#25: node-save-cache-221081-20.patch queued for re-testing.

Status: Needs review » Needs work

The last submitted patch, node-save-cache-221081-20.patch, failed testing.

David_Rothstein’s picture

Title: node_load cache out of sync with node_save » entity_load cache out of sync with entity save
Version: 7.x-dev » 8.x-dev
Status: Needs work » Needs review
FileSize
868 bytes

This bug was mostly fixed as a result of #651240: Allow modules to react to changes to an entity, but I'm running into one scenario where it still occurs.

The issue is that if you call node_load() from within an implementation of (let's say) hook_node_update(), you get the old node rather than the new one that was just saved to the database. Someone else already pointed this out in the comments on api.drupal.org, and a realistic case where I just ran into it is #1541142: Immediately deploying an edited entity sends the old version of the entity rather than the new one.

The fix would be something like in the attached patch, which I'm sending to the testbot to see what happens. (However, it's not just nodes that this issue occurs for; other entities have the same problem.) Also, it appears that $entity->save() in Drupal 8 already does this correctly, so for Drupal 8 it may fix itself on its own once everything is converted to the new entity system.

I think this is a clear bug, but I'm not sure if it's backportable to Drupal 7 or not? It makes the hooks behave the way you'd expect them to, but I guess I can imagine it breaking some existing code...

David_Rothstein’s picture

Another issue still remaining is the inconsistency between save and delete.

In node_save(), we do this:

    // Clear the static loading cache.
    entity_get_controller('node')->resetCache(array($node->nid));

But in node_delete(), we do this:

    // Clear the page and block and node_load_multiple caches.
    entity_get_controller('node')->resetCache();

(And a similar thing for other entities.)

It's not clear to me that this discrepancy exists for a reason.

(Not to mention that the code comment about clearing the page and block caches looks to be out of date, and if we don't clear the page cache when a node is deleted anymore I bet that's a bug too.)

Berdir’s picture

Status: Needs review » Needs work

The last submitted patch, entity-reset-cache-ordering-221081-28.patch, failed testing.

Fabianx’s picture

Status: Needs work » Needs review

Status: Needs review » Needs work

The last submitted patch, entity-reset-cache-ordering-221081-28.patch, failed testing.

Berdir’s picture

#597236: Add entity caching to core shows another problem that we have right now, but the other way round.

We first clear the static cache after we clear the cache, so in case someone calls entity load in e.g. a field update method gets a partially saved entity because the new field values aren't there yet.

Berdir’s picture

Title: entity_load cache out of sync with entity save » Entity cache out of sync and inconsistent when saving/deleting entities
Issue summary: View changes
Status: Needs work » Needs review
Related issues: +#597236: Add entity caching to core
FileSize
2.22 KB

Ok, what about this?

This unifies the calls to this order:
1. Save base stuff
2. save configurable fields
3. reset caches
4. invoke entity method and hooks.

This will also help with moving the hooks into the entity pre/post methods as the order won't change then.

amateescu’s picture

Status: Needs review » Reviewed & tested by the community

That unified order looks fine.

fago’s picture

I agree the ordering makes sense - first do the storage, then reset caches and invoke methods/hooks.

catch’s picture

Not sure about clearing caches before the insert etc. hooks fire - there's some modules that change data in those hooks (iirc ERS does). That's usually a hack to work around core entity storage limitations, some of which we've fixed so maybe it won't be necessary, but would be good to at least add a comment to those hooks to note they shouldn't change the stored data.

fago’s picture

What's ERS?

..but would be good to at least add a comment to those hooks to note they shouldn't change the stored data

Totally, we should finally become clear on that. Please see #1729812: Separate storage operations from reactions

Xano’s picture

catch’s picture

Berdir’s picture

Not sure what's the state here now, should we move cache clear further down or not? I'm fine either way, as long as we we don't keep it like it is now in save, which is broken, see #34.

fago’s picture

Component: node system » entity system
Status: Reviewed & tested by the community » Needs review
FileSize
2.95 KB
5.17 KB

I think catch was requesting to improve comments:

..but would be good to at least add a comment to those hooks to note they shouldn't change the stored data

So, let's do. Attached patch improved entity-crud hooks docs based upon the existing node-crud hook docs which slightly differ. The interdif
unifies things such that pre-operation hook us the "act on .." wording while the post-operation hook use the "respond to" wording

  • states that implementations may not alter entity storage
  • adds consistent notes to post-operation hooks to clarify they are post-operation.
  • removes the note for post-deletion hooks to run after per-entity-type hooks and replaces it with the usual post-storage statement. All entity hooks run after entity-type specific hooks, what's not mentioned anywhere else either.
  • does not change all per-entity type documentation to be consistent with that - although the new docs are more inline with what it's there. Overhauling all per entity-type documentation would deserve its own issue imho.

Given that changes we already be clear on entity-insert/update hooks are for reactions, i.e. it mostly solves #1729812: Separate storage operations from reactions already. Imo a logical consequence would be that we change the scope of the database transaction, but I'd suggest deferring this to #1729812: Separate storage operations from reactions.

Berdir’s picture

Status: Needs review » Reviewed & tested by the community

Looks good to me!

webchick’s picture

Assigned: Unassigned » catch

Sounds like catch already was involved in this one...

catch’s picture

Status: Reviewed & tested by the community » Fixed

Yep extra comments are plenty. Committed/pushed to 8.x, thanks!

yched’s picture

+++ b/core/modules/system/entity.api.php
@@ -286,7 +286,10 @@ function hook_entity_presave(Drupal\Core\Entity\EntityInterface $entity) {
+ * This hook runs once the entity has been stored. Note that hook
+ * implementations may not alter the stored entity data.

This means hook_entity_presave() is the only place where you can alter the entity values then, right ?
Can implementations of this hook tell between an insert and a delete at this point in the saving flow ?

catch’s picture

Should have isNew() (or whatever the 8.x equivalent is now) available in presave().

yched’s picture

Well, currently the various StorageControllers only do ->enforceIsNew() *after* calling invokeHook('presave').

Berdir’s picture

That call is only a trick to make sure the entity is still seen as new while saving data. Above that, they use isNew() themself to figure out if this is an insert or update. So that's safe to use.

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.

ndobromirov’s picture

This is a bug reported as fixed in for 8. Should there be a back-port to 7?