Updated: Comment #153
Problem/Motivation
Drupal's current method of determining whether a specified user has access to a node is reported to have significant performance issues on a site using node access with a large number of grants.
Proposed resolution
A patch has been prepared, however it appears to be unclear as to what functionality/performance testing might be required or how it would be undertaken.
Remaining tasks
- (todo) commit patch #133
------------------------------------------------------------
Not sure if its faster, but I thought it might:
function node_access($op, $node = NULL) {
// ...
// If the module did not override the access rights, use those set in the
// node_access table.
if ($op != 'create' && $node->nid && $node->status) {
$grants = array();
foreach (node_access_grants($op) as $realm => $gids) {
foreach ($gids as $gid) {
$grants[] = "(gid = $gid AND realm = '$realm')";
}
}
// ...
}
function node_access($op, $node = NULL) {
// ...
// If the module did not override the access rights, use those set in the
// node_access table.
if ($op != 'create' && $node->nid && $node->status) {
$grants = array();
foreach (node_access_grants($op) as $realm => $gids) {
$grants[] = "((realm = '$realm') AND (gid = ".implode(' OR gid = ', $gids)."))";
}
// ...
}
Difference is that realm is just checked once; not per gid. But it is possible that mysql/postgre server already do this, I don't know.
Comments
Comment #1
casey CreditAttribution: casey commentedOww, another optimization might be adding a LIMIT. Since it's only quering whether there is at least 1 match it should be possible to add "LIMIT 1": query is finished when a match is found.
Comment #2
catchrolled into patch, upped version, setting to needs review, didn't add limit in to this patch.
Comment #3
moshe weitzman CreditAttribution: moshe weitzman commentedinteresting. your probably need a very large node access table for this to matter. groups.drupal.org has pretty fast node access queries so there isn't much room to improve. just one example though.
Comment #4
catchNo longer applies.
Comment #5
bdragon CreditAttribution: bdragon commentedFree reroll courtesy of patch bingo.
Comment #6
moshe weitzman CreditAttribution: moshe weitzman commentedgood idea, but a bit late now ... benchmarks might help people get more interested, though I'm not sure that is a prerequisite.
Comment #7
moshe weitzman CreditAttribution: moshe weitzman commentedgood idea, but a bit late now.
Comment #8
catchRe-roll to remove some offset. I'm not set up for benchmarks, yet, but agree that'd be nice here.
Comment #9
lilou CreditAttribution: lilou commentedPatch still apply.
Comment #10
cburschkaIf you tell me what real-world conditions need to apply for this to matter, I can try creating them with the devel generator. I'd stress test this with a large database and see if it makes an appreciable difference.
I very much doubt this will be committed without some benchmarks to prove it is worthwhile (or even that it doesn't slow it down).
Comment #12
lilou CreditAttribution: lilou commentedSee: #335122: Test clean HEAD after every commit and http://pastebin.ca/1258476
Comment #13
David StraussSubscribing
Comment #14
dawehnerjust a reroll to the latest version and a fix of codestyle
does anyone know how to test this intelligent
Comment #15
catchWe should probably have a test_node_grants module, set some grants and ensure that users have the appropriate viewing rights. Maybe two modules to handle the interactions. I don't think this patch should go in without a test like that. Additionally, this query needs converting to dbtng.
Comment #16
ajayg CreditAttribution: ajayg commented1. just want to point out the above patch is not sufficient. You will need to make similar changes to "_node_access_where_sql" function below as well as it overrides for modules using hook_dbrewrite_sql.
2. Another thought
instead of
This may be better perhaps? (but I can not prove with data.)
I tried to benchmark all 3 conditions (A. WIthout the patch B. With the patch C. With "In" clause instead of OR clause as above.) Unfortunately my data is not conclusive. For all first invocations I would get 3 digits (190 millisecond) query time and with subsequent invocation it would be in in single digits(3 milliseconds). SO I am thinking my data/environment is not pristine. Also the difference between all 3 tests (ABC) was not significant.
Comment #17
ajayg CreditAttribution: ajayg commentedOk here is my second try with a clean environment with mysql query cache turned off.
I tried patch but as used "IN" clause from my comments above, For tracker there is slight improvement
before patch
After patch
However for a normal node page the difference is less obvious. The page has a related content view block.
before patch
After patch
Comment #18
casey CreditAttribution: casey commentedDid you also try using LIMIT 1? I think this would also improves performance since the query is finished on the first match.
Comment #19
ajayg CreditAttribution: ajayg commentedwhere exactly (how) you would add the LIMIT clause? I could not find it in any of the code above other than your comment? can you write some code? Also if we add LIMIT here, whats happens to the limit added by query , say pager query?
Comment #20
casey CreditAttribution: casey commentedWell, instead of using COUNT(*) use LIMIT 1. See http://www.dbawill.org/2008/08/08/in-mysql-count-or-limit-1-to-determine... for more info.
Comment #21
ajayg CreditAttribution: ajayg commentedBut there is no Count(*) in this particular patch. So if you need to change the query, which query need to change? I just tested the patch which dynamically adds the criteria for an existing query.
Do you mean something like
But then how you will determine where the preceding query uses select fieldname1, fieldname 2 or select count(*).
Your idea may be correct but I am not understanding implementation details for all use cases. Could you please submit exact patch?
Comment #22
casey CreditAttribution: casey commentedAh I now see what you mean :) I originally posted this issue about the function node_access() which does use COUNT(*) in its query. But in the comments you also mentioned that _node_access_where_sql() needs similar changes. And you are right of course. I haven't looked very well to the patches and was under the impression that they applied to node_access().
Comment #23
erikwebb CreditAttribution: erikwebb commentedIt looks like this issue hasn't been touched in quite a while. My patch is essentially a re-roll with DBTNG in the mix. I ran into this issue with a client running tac_lite with a large number of terms. I want to submit this patch for 7.x for d.o testing (along with the client testing this), then push into 8.x afterwards.
Comment #24
erikwebb CreditAttribution: erikwebb commentedBumping the priority because this change has a massive improvement on any site using node access with a large number of grants.
Comment #25
tim.plunkettThis makes sense, but it should go into D8 first.
EDIT: I see you said that in #23 :)
Comment #26
erikwebb CreditAttribution: erikwebb commented#23: drupal-optimize_node_access_queries-106721-23.patch queued for re-testing.
Comment #28
lotyrin CreditAttribution: lotyrin commentedRolled for 8.x and made the same change to a fourth query.
Comment #30
lotyrin CreditAttribution: lotyrin commentedDerp. Guess I should run more of the test suite and/or wait til I'm sober and rested to write patches.
Comment #31
lotyrin CreditAttribution: lotyrin commentedOr maybe that's not all my problems? :P
Comment #32
lotyrin CreditAttribution: lotyrin commentedComment #33
lotyrin CreditAttribution: lotyrin commentedComment #34
lotyrin CreditAttribution: lotyrin commentedSo, I had planned to test this on other database drivers (just in case) but it looks like that's not going to happen because of issues like #1799310: WebTestBase->setUp() broken for sqlite.
Comment #35
YesCT CreditAttribution: YesCT commented#32: 106721-optimize-node-access-query-building-30.patch queued for re-testing.
Comment #37
lotyrin CreditAttribution: lotyrin commented#32: 106721-optimize-node-access-query-building-30.patch queued for re-testing.
Comment #38
kerasai CreditAttribution: kerasai commented#32: 106721-optimize-node-access-query-building-30.patch queued for re-testing.
Comment #40
jrglasgow CreditAttribution: jrglasgow commentedComment #41
jrglasgow CreditAttribution: jrglasgow commentedrerolled the patch in #32
Comment #42
jrglasgow CreditAttribution: jrglasgow commentedComment #44
bdragon CreditAttribution: bdragon commented#41: 106721-optimize-node-access-query-building-41.patch queued for re-testing.
Comment #45
zzrwood CreditAttribution: zzrwood commented#41: 106721-optimize-node-access-query-building-41.patch queued for re-testing.
Comment #46.0
(not verified) CreditAttribution: commentedUpdated issue summary.
Comment #47
joelpittetThis one needs a re-roll.
@see http://drupal.org/patch/reroll
Comment #48
jeanfei CreditAttribution: jeanfei commentedI think, that the patch doesn't need to be re-roll anymore.
The node access queries were replaced by checkAllGrants() method in NodeAccessController.
I think that this issue should be closed, but I suggest somebody else to check again.
Comment #49
kristiaanvandeneyndeSo let's put it in D7 then?
Comment #50
erikwebb CreditAttribution: erikwebb commentedI believe my patch was the last for D7. We can re-start over from here.
Comment #51
kristiaanvandeneyndeYou should check whether the $gids array contains any items before using it in an IN query.
Otherwise, PDO will throw exceptions whenever $gids is empty.
Comment #51.0
kristiaanvandeneyndeIssue summary updated
Comment #52
oleg.medvedev CreditAttribution: oleg.medvedev commentedComment #53
kristiaanvandeneyndeLooks good now
Comment #54
andyceo CreditAttribution: andyceo commentedComment #55
andyceo CreditAttribution: andyceo commented52: drupal-optimize_node_access_queries-106721-52.patch queued for re-testing.
Comment #56
David_Rothstein CreditAttribution: David_Rothstein commentedStill needs to go in Drupal 8 first (code is in core/modules/node/lib/Drupal/node/NodeGrantDatabaseStorage.php).
Comment #57
joelpittetHere's the d8 version.
Comment #58
msonnabaum CreditAttribution: msonnabaum commentedHere's a new version for D8.
I ran into this on a D7 site with a many node grants, which resulted in severael seconds worth of DatabaseCondition::compile(). This approach doesn't really fix the query much (I think mysql's parser will figure that out just fine), but it helps a ton in assembling the query.
I also extracted this logic into it's own static helper method, since we repeat it three times. I made it static because it's a pure function and it's possible that contrib will need it (views would in D7).
Comment #59
msonnabaum CreditAttribution: msonnabaum commentedAnd here's a D7 version.
Comment #61
kristiaanvandeneyndeD7 looks good, dunno why 8 fails
Comment #62
msonnabaum CreditAttribution: msonnabaum commentedThink I see what happened.
Comment #63
BerdirI know you don't like you write documentation ;), but this will need a proper docblock and $node_access_grants should be type hinted as an array.
Comment #66
msonnabaum CreditAttribution: msonnabaum commentedFair enough.
Comment #67
msonnabaum CreditAttribution: msonnabaum commentedAlso reuploading the D7 version, the one I posted earlier was bad.
Comment #68
Wim LeersNitpick review.
This doesn't add anything, it *builds* an object. So what about
buildGrantsCondition()
?Furthermore, all function/method docblocks should begin with a verb AFAIK.
Missing the typehint.
s/node_access_grants/node_access_grants()/
s/$query->condition/$query->condition()/
Without the parentheses, it looks like you're talking about a DB table/variable/concept and an object property, respectively, rather than functions.
Also, maybe:
s/matching the return value of/as returned by/
?
Comment #69
msonnabaum CreditAttribution: msonnabaum commentedGood call on the method name. I originally had it take the query as an arg and add the condition, but didnt update the name when I changed it.
Everything else in #68 should be addressed.
Comment #70
Crell CreditAttribution: Crell commentedAnother reason I dislike static methods: You can't inject the connection here properly.
At minimum, we should use \Drupal or Database:: here rather than the function.
Although I'd prefer to just make it a normal method and inject the connection. I don't know if AND/OR handling varies by database, but in practice I think we have it setup such that it could.
Comment #71
msonnabaum CreditAttribution: msonnabaum commenteddb_and is an inconsistency considering that i used the class version above it, so I'm fixing that.
The rest I disagree with because no connection should be involved here.
Comment #72
ezra-g CreditAttribution: ezra-g commentedHere's a re-roll of #59 (D7) that fixes some issues applying (contains a /docroot path and some changes to the 7.x contrib Views module), so that we can include this improvement in Commons 7.x.
Comment #73
kristiaanvandeneyndeLooks good to me.
(Same code for a while now, so I can keep repeating this :) )
Comment #74
Crell CreditAttribution: Crell commentedI will compromise on new Condition().
Comment #75
andypostRelated #1349080: node_access filters out accessible nodes when node is left joined
Comment #76
andypostComment #77
andypostsorry
Comment #78
catchkilles did some benchmarks of this approach for a 6.x site but I can't find them on this issue. iirc the results were it made some queries worse, some better, and was neutral for others. That sounds somewhat similar to Mark's findings as well.
I think this is worth it for less time spent in PHP and we can call the query changes neutral, so going ahead and committing, but retitling so we don't make any exciting claims.
Comment #79
Wim LeersIt wasn't 100% clear to me whether this was indeed committed & pushed, but it is: http://drupalcode.org/project/drupal.git/commit/d1d797046ffe79faf45d0aa0.... Yay :)
Comment #80
Crell CreditAttribution: Crell commentedThe patch in #72 needs to be rerolled so testbot can find it. If it passes, that's also RTBC IMO.
Comment #81
erikwebb CreditAttribution: erikwebb commentedRerolled
Comment #82
Crell CreditAttribution: Crell commentedThanks, Erik!
Comment #83
Wim LeersMissing docs!
Comment #85
longwave81: drupal-106721-optimize_node_access_queries-81.patch queued for re-testing.
Comment #86
mgiffordBack to rtbc. Bad bot!
Comment #88
joelpittetWill it blend?
Comment #90
dcam CreditAttribution: dcam commentedClosed #2326095: Optimize queries with node grants. as a duplicate.
Comment #91
donquixote CreditAttribution: donquixote commentedSorry, I still had no chance to test this - but it looks good.
A test I ran with a smaller version of this patch reduced the memory load from ~828MB to ~225MB on a particular site, but this was an extreme situation with I think ~500 Organic groups on one user.
@catch (#78)
I could imagine that smaller queries with only one or a few gids are better off with OR instead of IN.
If we care about this, we could add a simple if() to distinguish these cases.
Also, I wonder if the "gid IN (..)" should not rather happen AFTER the "realm = .." check, since I imagine the former to be more expensive.
On the other hand, the "gid IN (..)" is more likely to evaluate to FALSE, making it more likely that MySQL can skip the "realm = ...".
We would have to find the ideal number for THRESHOLD. Maybe it is just 1.
This being said: Since this is already committed to D8, we might go ahead as planned and do this fine-tuning in a follow-up.
Comment #94
mgiffordBad bot.
Comment #96
joelpittetComment #98
weri CreditAttribution: weri commentedComment #101
dcam CreditAttribution: dcam commentedComment #104
dcam CreditAttribution: dcam commentedComment #105
sheldonkreger CreditAttribution: sheldonkreger commentedI rolled 81 against 7.32. Here is the patch. I suspect many people will like to use this. Please double check my work!
Comment #107
sheldonkreger CreditAttribution: sheldonkreger commentedSorry :-/
Comment #108
sheldonkreger CreditAttribution: sheldonkreger commentedDo not use patch in 105, it breaks node access checks.
Comment #109
David_Rothstein CreditAttribution: David_Rothstein commentedLooking at #81:
(#1 looks like a D7-only issue, but I think #2 might apply to D8 as well.)
Also, #83 points out the node_add_node_grants_to_query() function needs PHPDoc.... (If that were the only issue I would have fixed it myself on commit or what not, but given the other issues we should take care of this one too.)
Comment #110
hefox CreditAttribution: hefox commentedReroll addressing above points
Comment #112
hefox CreditAttribution: hefox commentedo.O
Comment #115
hefox CreditAttribution: hefox commentedoops, forgot to default new arg for table alias
Comment #116
hefox CreditAttribution: hefox commentedNeed a version that will apply with #1349080: node_access filters out accessible nodes when node is left joined
The changes that need to be done should work fine without #1349080: node_access filters out accessible nodes when node is left joined, so sending this in for testing also (e.g. did not add do-not-test)
Comment #118
hefox CreditAttribution: hefox commentedOr maybe it does conflict
Comment #120
hefox CreditAttribution: hefox commentedSorry for the noise, trying to get this back to needs review
Comment #121
hefox CreditAttribution: hefox commentedAgain, sorry the noise but had an oops above of course
Comment #122
hefox CreditAttribution: hefox commentedAfter looking into further, the part that conflicted with this patch was added after a working patch #1349080: node_access filters out accessible nodes when node is left joined to fix an conflict between two modules. It sounds like reason it was added was to fix a bug that that conflict produced, so instead for openatrium went with patch 195 for #1349080: node_access filters out accessible nodes when node is left joined and 115 for this issue.
Comment #125
ohthehugemanatee CreditAttribution: ohthehugemanatee at Forum One commentedPatch 115 on this issue applies cleanly and appears to fix the issue. Can we get more reviewers?
Comment #126
kristiaanvandeneyndeLooks fine except:
Trailing whitespace.
Comment #128
gifad CreditAttribution: gifad commentedI'm not php specialist, but the code for :
should read :
or :
or am I missing something ?
Comment #129
pounardNope, '' is false if you if around.
Comment #130
kristiaanvandeneyndeFor further info, see: http://php.net/manual/en/language.types.boolean.php#language.types.boole...
Comment #131
gifad CreditAttribution: gifad commentedOk, thanks, and sorry for the noise...
Comment #132
pounardNo problem.
Comment #133
q0rban CreditAttribution: q0rban at Lullabot for Cisco Systems commentedThis is identical to #115 with the trailing white space mentioned in #126 fixed.
Comment #134
Chris CharltonAnything holding this back?
Comment #135
Vergil.K CreditAttribution: Vergil.K commentedThis fixes performance issue for large sites which live on grants. Can we get more reviewers to review and close this ASAP?
Comment #136
pounardSo far so good for me, I just need to apply it on a live project and run its tests to really say it's RTBC but I will need some days maybe weeks to confirm.
Comment #137
dcam CreditAttribution: dcam commentedI hesitate to discourage additional reviews with performance testing, but I would like to point out that the basic patch was already RTBC and was rejected based on a code review, not because it didn't fix the problem. If someone can verify that:
1. This is the functionally the same fix that went into D8.
2. The issues in #109 have been addressed.
...then I'd say it could go back to RTBC.
From my brief perspective, I'd have to say that the issues have somewhat been addressed. Yes, a docblock was written, but it looks like it was completely written from scratch. This function had a docblock in D8. I think we should re-use it instead, but I don't care so much that I'll set this back to NW for that. @David_Rothstein could change it on commit, if he wanted.
Comment #140
ajayg CreditAttribution: ajayg commentedAnything holding this back to be committed in 7.x? Looking at the issue I see some of my earlier comments going as back as 8 years ago. :)
Comment #141
kristiaanvandeneyndeSomeone needs to set it to RTBC to get it on a committer's radar, I guess.
Comment #142
ajayg CreditAttribution: ajayg as a volunteer commented@Pounard Is there any chance you tried the patch, as you mentioned? I am in the middle of a migration to 7x for a large install, so can't try till it is completely migrated.
Comment #145
joseph.olstadsubscribing***EDIT*** Followup soon.Comment #146
colan@joseph.olstad: It's no longer necessary to add a new comment to subscribe to an issue. Simply click on the Follow link at the top-right.
Comment #147
joseph.olstadRTBC #133
FYI: patch #81 is in use by over 600 acquia commons distribution installs.
Remaining tasks: commit patch #133
Comment #151
joseph.olstadComment #152
joseph.olstadRTBC patch 133
FYI: patch #81 is in use by over 600 acquia commons distribution installs.
Remaining tasks: commit patch #133
Comment #153
joseph.olstadComment #154
joseph.olstadComment #155
joseph.olstadBumping to 7.70. This didn't make it into 7.60
Comment #156
joseph.olstadNeed for speed, this issue needs to go in, is blocking #2204257: Update Views Content access filter per core performance improvements
Comment #157
joseph.olstadComment #158
joseph.olstadComment #159
joseph.olstadComment #160
MustangGB CreditAttribution: MustangGB commentedComment #161
MustangGB CreditAttribution: MustangGB commentedComment #162
joseph.olstadComment #163
mcdruidPatch #133 seems to have problems with the PHP 5.3 and MySQL 8 tests.
We'll need to address those before we can consider committing this.
Comment #164
joseph.olstad@mcdruid , all those tests came out green, please refresh this page.
Please have another look at the test results.
Comment #165
joseph.olstadback to RTBC
patch #133 is testing all green on all tests after tests were re-triggered. Testbot issue on previous runs is flushed out.
Comment #166
mcdruidYup, you're right @joseph.olstad - the test fails were unrelated and they both passed when re-tested.
The patch looks very similar to what's now in 9.2.x here:
https://git.drupalcode.org/project/drupal/-/blob/9.2.0-alpha1/core/modul...
I might tweak the docblock on commit to match the above in order to avoid the "North American past participle of get" :) but then Drupal does use American English...
Anyway, I think the patch looks good, and I'll ask Fabianx for final review in the next few days.
Comment #167
Fabianx CreditAttribution: Fabianx as a volunteer and at Tag1 Consulting commentedRTBC + 1 -- patch is the same as D9
IMHO it fixes a bug when an empty array is passed to IN() -- it might be that this somehow leads to that condition not being added, because I would have expected a SQL error in that case.
Anyway, this is good to go and I see no harm to match D8/D9/D10 behavior here.
Let's get this in after 14 years ;).
Comment #168
MustangGB CreditAttribution: MustangGB commentedIf you're feeling in the node_access mood, also checkout #3176634: [D7] node_access filters out accessible nodes when node is left joined.
Comment #169
renatogPlease could you use early return on node_add_node_grants_to_query?
From
To:
Comment #170
renatogFollow the new patch using early return
Comment #171
mcdruid@RenatoG thanks, but we'll stick with #133 as that's very close to what's in D9 (see #166).
If you feel strongly that this can be improved, you should submit an issue for D9 per the backport policy.
Comment #172
izmeez CreditAttribution: izmeez commented@RenatoG if you do open an issue for D9 please put a link here so we can follow. Thanks.
Comment #173
renatogMakes sense, let's do this. Thank you people
Comment #174
renatogOpened at the 9.x version
Comment #175
izmeez CreditAttribution: izmeez commentedok, is this pending commit? Thanks
Comment #176
mcdruidYes, #133 should be committed before the next release.
Comment #177
izmeez CreditAttribution: izmeez commentedThanks, just wasn't sure if it needed the Pending commit tag.
Comment #179
mcdruidThank you everybody!
Comment #181
joseph.olstadWould be cool to see a performance benchmark of 7.0 vs 7.81
Nice milestone here!