A node access rebuild immediately removes access to all nodes on the site, then goes through each one individually and basically node_load()s and then node_save()s them. This can take an incredibly long period of time on a site with tons and tons of content. And because the query goes from oldest to newest, the site is stuck with old crappy content showing up in views and stuff until the node access rebuild is complete.
It seems pretty obvious that it should go newest to oldest instead so at least while most of the content was invisible it would be old crappy stuff no one cares about.
I was surprised not to find a pre-existing issue about this. I might've missed it.
Comment | File | Size | Author |
---|---|---|---|
#42 | D8-993186-42.patch | 4.94 KB | vlad.n |
#5 | order.patch | 735 bytes | moshe weitzman |
#16 | D7-order-node-access.patch | 715 bytes | joshi.rohit100 |
#39 | interdiff-993186-36-39.txt | 846 bytes | hctom |
#39 | D8-993186-39.patch | 4.89 KB | hctom |
Comments
Comment #1
Damien Tournoud CreditAttribution: Damien Tournoud commentedWhat about forcing a maintenance mode in that case? Changing from
oldest to newest
tonewest to oldest
feels like it's only solving the symptom without solving the underlying issue.Comment #2
webchickForcing maintenance mode for 1.5 hours, vs. having a usable front page + sidebars in 30-60 seconds? That doesn't sound like a good idea. :)
Comment #3
nielsvm CreditAttribution: nielsvm commentedTotally agreeing with Webchick here, sometimes sites simply can not afford this. Having a database with 40K+ nodes this process will run for over 4 hours on the command line alone and thus having the first 500 nodes processed in the first 30 minutes means a great deal to the perception of stability of a website.
Comment #4
mdupontProcessing newest nodes first would definitively be an improvement for most installations, however as Damien states it the real issue is that node access rebuild doesn't scale.
Just thinking out loud here. I see 2 obvious ways to help it scale better:
Comment #5
moshe weitzman CreditAttribution: moshe weitzman commentedFor me, this is a great improvement. I can't think of any downside, even for D7. Patch attached.
Comment #6
agentrickardThe desire for perfect should not block incremental improvement.
Comment #7
Dries CreditAttribution: Dries commentedCommitted to 8.x. Moving to 7.x.
Comment #8
Lars Toomre CreditAttribution: Lars Toomre commentedJust saw this in the issue queue so sorry for the late response...
For large sites with lots of content, as a short term fix, wouldn't it make sense to sort on a) promote column and b) status column before the c) nid column? I do not foresee this being called that often, but several sites I am aware of do not necessarily have most recent nodes on the front page, which I would think should be first thing available.
Comment #9
agentrickardThat also makes sense.
Comment #10
Dries CreditAttribution: Dries commentedMoving to 'needs work' as I committed the patch in #5.
Comment #11
miiimoooLooking at the implemntation in D7 I think the main issue is the delete at the start of node_access_rebuild:
I haven't tested this thoroughly but wouldn't it make more sense to remove entries for nodes from the node_access table incrementally?
Something like this
I guess it's not complete because you still have the issue of modules storing data in other tables.
Comment #12
texas-bronius CreditAttribution: texas-bronius commentedI see @Dries' comment in #7 to backport to 7.x, but the version on this issue queue was not updated.
Also, weighing in to say: I like the thinking behind deletes as we go (@miiimooo in #11) and great, harmless but likely effective considerations of more creative sorting (@Lars #8), but the patch as applied to 8 was already the simple sort. But along the lines of "let's not block progress by getting caught in the weeds" (@agentrickard in #6), let's try different stories for each:
#2292113: Improve node access rebuild immediate effect by further improving the sort critera
#2292117: Improve node access rebuild by deleting record-by-record instead of the delete-all at the beginning each time it's run
and move forward with commit of this current improvement to 7.x-dev.
For the core issue and originally committed patch to review, see @moshe's comment in #5.
EDIT:
Oh wow. And after all that, I still appear to see that merely changing the sort of the select did not affect the order of creation of records: Why is that? Here's what I see on 30+k nodes site:
drush sqlq "select max(nid) from node_access"
shows numbers in the 100s and then low 1000s after a little while. It should be just the opposite, and in fact, it should always return the max nid in the node table. :( Any ideas what's going on?
EDIT 2:
Ah. It's because when running node_access_rebuild() from the browser, you're in batch mode. When executed directly from code, like via drush, only then do you get the specific select that's been ordered DESC. Cool.
Comment #14
texas-bronius CreditAttribution: texas-bronius commentedAnd so, to be clear, Moshe's fix in #5 works great for directly executed node_access_rebuild() but is wholly ineffective against batch (browser-clicked Rebuild). The batch bit is a bit more spidery: not just a matter of sorting in one place bc of the way it monitors progress and progress completion with a ++ and a max(nid) I think.
Comment #15
dcam CreditAttribution: dcam commentedConsidering that this went into D8 almost two years ago, I'd agree that #5 should be backported into D7 sooner rather than later despite the stagnant discussion surrounding additional performance optimization in D8.
I'm setting the status as Patch (to be ported) so that it will hopefully have higher visibility for any novices looking for easy things to port.
The issue should probably be set back to active for D8 after the patch is ported.
Comment #16
joshi.rohit100Updated patch for D7 to rebuild access for new content first.
Please review.
Comment #18
joshi.rohit10016: D7-order-node-access.patch queued for re-testing.
Comment #19
dcam CreditAttribution: dcam commentedThis is the exact same change that went into D8. After applying, rebuilding node access permissions still worked on my D7 sites. RTBC.
Comment #22
dcam CreditAttribution: dcam commentedComment #25
dcam CreditAttribution: dcam commentedComment #30
dcam CreditAttribution: dcam commentedComment #31
David_Rothstein CreditAttribution: David_Rothstein commentedCommitted to 7.x - thanks!
And back to Drupal 8 to fix it for the batch situation also.
Comment #33
David_Rothstein CreditAttribution: David_Rothstein commentedAs far as I can tell, actually, the code in Drupal 8 was already changed to try to make this work at some point, but wasn't done correctly... see #2331113: Node access rebuilds are completely broken when being rebuilt through a batch process?. That may need to be resolved before proceeding here.
Comment #34
Fabianx CreditAttribution: Fabianx commentedComment #35
hctomI just gave this a try... I liked the incremental approach from #11 and i figured out that
\Drupal::entityManager()->getAccessControlHandler('node')->writeGrants();
already provides adelete
parameter, so that old records of a node are deleted before saving new ones.In addition to that I extended the following items to allow easy deletion of the default node grant records:
... and of course, I preserved the node order change (nid DESC) for the batch processing.
Attached you can find a patch with all these changes. Please have a look and tell me what you think.
Comment #36
hctomDamn... I forgot to execute the deletion of the default grant! Forget about the patch in #35 - here is a new patch. Sorry for the failure ;)
Comment #38
Fabianx CreditAttribution: Fabianx commentedThis won't fly and desperately needs a test to not regress again:
You test for current_node >, but sort DESC ...
See #2331113: Node access rebuilds are completely broken when being rebuilt through a batch process? for more information.
Thanks for the patch!
Comment #39
hctomWhoops... of course you are right. Attached is a new patch and an interdiff. This fixes the sorting/selection problem, but I was not able to create a full test for
node_access_rebuild()
- perhaps someone else is more into this and may help out on that. As far as I see, only the process of clicking "Rebuild permissions" is covered with a test right now, or did I miss something?Comment #42
vlad.n CreditAttribution: vlad.n commentedRe-roll. Patch applies cleanly at commit f07bcff .
Comment #49
mgiffordInstalls fine in SimplyTest.me - there's a big error there, but thats been there for maybe a week now.
Comment #51
Patrick Storey CreditAttribution: Patrick Storey at Hook 42 commentedI am removing the Novice tag from this issue. It looks like it was added in comment #15 for backporting to D7. It looks like that was completed on comment #31.
I’m using this documentation as a source: https://www.drupal.org/core-mentoring/novice-tasks#avoid
I'm also removing the 'needs backport to D7' tag as it looks like that is completed.
I'm not sure what's remaining on this issue. @mgifford, looks like you said there's a big error? Could you give more information on that. I think we should update the issue summary.
Comment #52
David_Rothstein CreditAttribution: David_Rothstein commentedThink we might backport the batch fix to Drupal 7 too (depending on exactly what it looks like in the end).
Comment #68
quietone CreditAttribution: quietone at PreviousNext commentedThis is a challenging issue to triage.
This was committed to 8.x, then moved to D7 and committed with a different commit message, then moved back to D8 for 'the batch situation'. So, what to do?
I am going to open a new issue for the batch work. I'll upload the latest patch to that issue and link to this issue so anyone can read the comments. And add credit too. For this issue, I'm restoring the meta data to the time of the D8 commit and setting the status to fixed.
New issue created to look into the batch part of this, #3270646: Node access rebuilds should go newest to oldest (fix for rebuilds that are done via a batch).