Hi!
Last night I just discovered, on our live website, a highly-time-consuming code on the node revision history page (the page you get when you click moderate on a node). The fact is the more you have nodes and revisions of nodes, the more this page takes to be rendered. In my case we have a large website with thousands of nodes and revisions, this goes as far as it reaches the max execution time limit set by our hosting company.
I set this report as "feature request" as this is not a bug, as it worked until now, but there is no "code suggestion" category... Feel free to move it to another category if needed ;) I'm writing this to discuss which way must be followed to avoid the problem before writing a patch to fix it (need your opinion).
I found the problem resides in the file workbench_moderation.node.inc in the function workbench_moderation_node_history_view(). At line 135 of the file starts the revision operations part (that ends at line 150). Commenting out these lines makes the page render in milliseconds instead of reaching the max execution time. It seems the time consuming method called is workbench_moderation_access_link().
The function workbench_moderation_access_link() is set at line 1821 of the workbench_moderation.module file. It's a very simple method which just call Drupal's menu_get_item() (in fact plenty of times it there is plenty of revisions).
So the question is, what's your suggestion to replace this call by a more efficient function ? Sorry the my english, I hope it will be understandable ;)
Thank you!
| Comment | File | Size | Author |
|---|---|---|---|
| #67 | workbench_moderation-optimize_node_revision_history-1408838-67.patch | 7.57 KB | colan |
| #67 | interdiff-1408838-61-67.txt | 2.86 KB | colan |
| #61 | workbench_moderation-n1408838-61.patch | 7.26 KB | damienmckenna |
Comments
Comment #1
acrollet commentedI've delved into this a bit, as we've experienced the same problem. Basically the moderate tab for any node with a large number of revisions takes a long time to render, or can even time out. I've found that at least in our case, this seems to be entirely due to the calls to workbench_moderation_access_link() that generate the View/Revert/Delete links. Would it be feasible to replace this function call with a simple permissions check on the relevant ({View,Revert,Delete} Revisions) permissions, and generate the links if the user has the permission? Happy to submit a patch for this approach if a maintainer approves it, or any other suggested approach.
Comment #2
hdennen commentedSomething like revision_deletion module would be even better. I've just implemented Workbench and can already see how my DB will soon grow exponentially due to so much revision data...
Comment #3
jwilson3Here is a patch that applies the suggestion in #1 by using _workbench_moderation_revisions_access() to test for access before building the link, instead of depending on the menu system to build the links. This has incredible performance gains.
In the process of solving this issue, I actually also fixed and improved the functionality of the _workbench_moderation_revisions_access() function itself, which was doing nothing to prevent people from accessing node/%node/revisions/%/revert or node/%node/revisions/%/delete on current and published revisions. This is now remedied (you get an Access Denied message).
I stumbled upon http://drupal.org/node/1064954, which has now been fixed in Drupal core, and so we don't need the temporary -1 / -2 vid's that were being used. This was actually breaking the usage of this function for successive calles of different revisions (such as what is needed for the moderation page).
Note also, that core has changed the $op in _node_revisions_access() from 'revert' to 'update', which was also causing issues.
Though there is a lot going on here, I feel like this is an important performance issue that should be taken into serious consideration.
Comment #4
jwilson3Regarding #2, deletion of stale revisions might be a work around for some use cases, but its is not a valid solution to this performance problem because it doesn't address the root of the issue.
Please file bulk revision pruning as feature request on a separate issue if it interests you.
Comment #5
acrollet commentedI've reviewed the patch in #3, and it definitely fixes the performance issues noted by the OP. Coding style is good and I like the approach, setting to RTBC.
Comment #6
istryker commentedWas there actually a security hole. I tried to find it by playing around with the permissions and doing direct url links to
node/%/revisions/%/[revert, delete]. Every single time I recieved a 403 (access denied) when I did not have the Revert content revisions and/or Delete content revisions perrmission.Comment #7
jwilson3I think you misunderstood this. It wasn't a security hole and nothing to do with edit/revert permissions, but a logic hole: you shouldn't be able to "revert" or "delete" the current revision (ie $node->workbench_moderation['current']) or the published revision (ie $node->workbench_moderation['published']).
You can see this functionality right now on 7.x-1.x-dev:
1) Create a content type, enable revisions and enable workbench default to "Draft".
2) Create a node:
node/13) Edit the node, to create a new draft:
node/1/revisions/24) Edit the node again, to create a second revision
node/1/revisions/35) Try to revert / delete the current published revision:
node/1/revisions/1/revert,node/1/revisions/1/deleteResult: You can access the page! There are even direct links to these pages via the "revert" and "delete" link on the Revisions operations column listings at node/1/moderation for the published version. The odd think is that clicking through the confirmation button for either operation does nothing (the node nor the revision is reverted/deleted). In other words, these are useless links to non-functional pages. IMHO better to just block access and not show a link.
6) Try to revert / delete the current draft revision:
node/1/revisions/3/revert,node/1/revisions/3/deleteNote: there is no link in the Revisions operations for this, but you can still access the page. The result is that it asks for confirmation to revert/delete, and if you click through, it does absolutely nothing.
7) Finally, try to delete the old draft revision:
node/1/revisions/2/deleteThis is permitted (as normal) and the draft is then removed from the Revisions listing.
8) Now, apply this code, and try this same routine and note two differences:
8.a) there is no delete/revert link for current drafts or published revisions.
8.b) accessing these two pages (based on grabbing the revision id from the Revisions table and building the link) provide Access Denied for current drafts or published revisions, whereas old revisions are reversible/deletable.
Does this make sense at all?
Comment #8
jwilson3Thinking on #7 a little further, there may be one case where reverting does make sense:
You should be able to revert to the published version if the there are newer 'current' drafts. This would effectively create a new "current draft" resetting its contents back to the current published version, essentially abandoning the current edits. This is, in principle, the same thing as adding an edit button (in place of revert).
Comment #9
hyperglide commentedHas #3 been reviewed by anyone with commit rights. Looks like a plus.
Comment #10
hyperglide commented#3: workbench_moderation-revision-operations-performance-1408838.diff queued for re-testing.
Comment #11
jwilson3This code fails if there is no published version of the node, giving the following error message:
Fix on the way.
Comment #12
jwilson3Oops.
Comment #13
jwilson3This is pretty much the same patch as #3, except I've added an isset() check for the 'published' array index first. See the interdiff for the lo-down.
Comment #14
jwilson3Saved the wrong patch :( heres another try.
Comment #15
hyperglide commentedjwilson3 Glad to see the updated patch.
Is this ready to be tested on any scale or need a review prior?
Related to #2 issue #1875468: Integrate support for node_revision_delete and workbench_moderation. was created.
Comment #16
hyperglide commentedWith all the great movement in the WorkBench Issue Queue, wanted to stir the pot and see if this or the issue in #15 can grab any assistance. thx
Comment #17
jwilson3Re #15, the current issue status demonstrates that this is ready to be tested. We're using it in at least 2 production projects now.
Comment #18
hyperglide commented@jwilson3 okay. I can test. Can you advise what to look for as this is mostly a back end update? We can run it. .See it things break but any other guidance to help focus our testing?
Comment #19
jwilson3The change in this issue can be tested by:
1) Create a node with a LOT of revisions (eg, like 20 or 30).
2) Load the "moderate" tab for the node, taking note of how long the page takes to load. If its instantaneous, then create more node revisions until you start to seriosuly notice the load of the page slow down.
3) Then apply the patch to the codebase, and reload the "moderate" tab for the same node, taking note of how long it takes to load the page.
4) Test permissions model, by setting up different user roles some without the ability to delete nodes, confirm that the revert and delete links work the same way before and after the patch being applied.
UPDATE:
5) Test various moderation state configurations. Eg when you unpublish an older published revision, and have newer unpublished revisions, does everything still work as expected.
Comment #20
hass commentedMarked #1512806: Add support for limiting total number of rows displayed on moderation page (aka: add a pager) as duplicate.
Comment #21
hass commentedMarked #1836324: Moderate tab on node page requires pager as duplicate.
Comment #22
jwilson3Even if the original motivation for #1512806 was to solve the performance problem that this issue resolves, I would still consider leaving it open, because adding a pager to the Moderate tab seems like a valid feature request (Eg, imaging a node with 1000 revisions). What do you think?
Comment #23
hass commentedYes that's correct, but I thought to add the 2-3 lines required for a pager here into the patch.
Comment #24
hass commentedRe #14: all codewise...
* Why are you using addJoin left outer if we have leftJoin that is an left outer join?
* We shoud fix the singe word operations to be ucfirst as well known. So we do not need to change the lines twice.
*I would use an alias wbnh or nh for the left joined table, but this may be nitpicking.
Comment #25
hass commentedComment #26
jwilson3This re-roll addresses everything from #24, except for the change to the alias.
As for the "LEFT OUTER" and the alias "m", I was just following existing code examples in workbench_moderation.module (eg see function
workbench_moderation_node_data()).Done!
Again, I was following existing code examples in workbench_moderation.module (eg see function
workbench_moderation_node_data()).I feel in general that this is pointless because a) it varies from the existing alias used in workbench_moderation.module, and b) the alias is not used anywhere else besides this single query and its plainly obvious what it means. wbnh adds nothing to the readability of the code, and with that logic the node_revisions alias should also be updated to 'nr' from 'r'. Finally, I must insist on leaving this as is, because the three aliases line up vertically to spell 'RUM' (my favorite spirit) :P
imho, the pager request is a completely separate chunk of code, functionally independent from what this issue addresses, i.e. the fundamentally slow performance of calling workbench_moderation_access_link() repeatedly. The current patch solves the problem in question without the need for a pager. I believe these "2-3 lines would be better served as a separate feature request and separate patch that can be added and tested on its own. That being said, if you want to add the pager to this patch, have a ball. However, hijacking the progress with additional feature requests (ie scope creep) and leaving it marked Needs Work is not productive to moving this forward. Baby steps. Atomic changes. Forward progress. #win.
Comment #27
jwilson3And now.... for the actual patch!
Comment #28
thekevinday commentedThe patch works properly and solves the problem.
I have not noticed any new bugs as a result of this.
A moderation view page that takes ~10,000 miliseconds (according to the dev module) prior to this patch now takes ~700ms (which is the ~default for most pages on this particular site).
Comment #29
BarisW commentedSame here, patch works great. Thanks!
Comment #30
BarisW commentedPlease commit. 2 months without action..
Comment #31
kristen polBump :)
Comment #32
hyperglide commentedIs this able to be committed?
Comment #33
kristen polI'm bumping to major since it can cause serious performance problems.
Comment #34
vacilando commentedThe patch seems to work well!
Comment #35
dave reidNeed spaces inbetween array values as per coding standards.
Unnecessary additional whitespace.
--
With the patch applied, are there any more uses of workbench_moderation_access_link? If not, we should remove that function.
Comment #36
kristen polHere's the patch with changes from #35.
Comment #38
kristen polTrying again.
Comment #39
drupalmonkey commentedPatch from #38 worked for me. Had a node/%/moderate page that was giving a 500 error due to timing out, page loads very quickly now with the patch.
Comment #40
hyperglide commented@Dave Reid -- How things look now?
Comment #41
hyperglide commented@Dave Reid -- How things look now?
Comment #42
hyperglide commentedPIng any status on this patch?
Comment #43
jwilson3Status hasn't changed it is still RTBC, waiting on free time from maintainers. I'd love to move this up to critical, because without the patch the module doesn't scale at all for any site that uses revisioning and where simply pruning or throwing away old revisions is not an option.
Comment #44
hyperglide commentedMoving to critical in hopes of getting another review.
Comment #45
jwilson3I meant to complete my sentence above with "but I don't think moving to critical will get this in any faster". Let's leave it as is and let the maintainer decide if it should be on the short list or not.
Comment #46
fullerja commentedMarked #2117927: visiting Moderate tab exceeds PHP memory_limit of 256M as duplicate
Comment #47
fullerja commentedPatch in 38 works well for me
Comment #48
fullerja commentedCame across an issue with the patch in #38, the "Action" links (In particular the "Create a new draft from the published revision." on the node/*/edit page disappear with this patch
Comment #49
miteshmapHere is another solution by adding paging.
patch created at Coworks
Comment #50
jwilson3The pager limits each page to thirty elements, but iirc, my experience 2 years ago, was that anything over 10 or 15 revisions started to produce a noticeable lag in page display.
Also, this solution does nothing for the bug mentioned in #3 and elaborated in #7 where you can still access non-functional pages to revert/delete revisions that logically cannot be reverted or deleted.
Comment #51
jwilson3This is not to say the idea for a pager is a bad one. If anything the two solutions should be combined.
Comment #52
miteshmapI have merged both patches and tha would works great.
patch created at Coworks
Comment #53
gabriel.achille commentedI'm using SQL Server 2012 and once i applied this patch, i have the following error message when trying to access to the 2nd page or more:
PDOException: SQLSTATE[42000]: [Microsoft][SQL Server Native Client 11.0][SQL Server]The column 'nid' was specified multiple times for 'sub1'.: SELECT * FROM ( SELECT sub2.*, ROW_NUMBER() OVER(ORDER BY sub2.__line2) AS __line3 FROM ( SELECT 1 AS __line2, sub1.* FROM (SELECT TOP(60) m.*, n.[vid] AS [live_revision], r.[nid] AS [nid], r.[vid] AS [vid], r.[title] AS [title], r.[log] AS [log], r.[uid] AS [uid], r.[timestamp] AS [timestamp], u.[name] AS [name] FROM {node} n LEFT OUTER JOIN {node_revision} r ON n.nid = r.nid LEFT OUTER JOIN {users} u ON r.uid = u.uid LEFT OUTER JOIN {workbench_moderation_node_history} m ON r.vid = m.vid WHERE ( ([n].[nid] = :db_condition_placeholder_0) ) ORDER BY r.vid DESC) AS sub1 ) as sub2 ) AS sub3 WHERE __line3 BETWEEN 31 AND 60; Array ( [:db_condition_placeholder_0] => 3447 ) in PagerDefault->execute()AFAIU This is cause by the way Drupal SQL Driver is rewriting the range options in a query (see DatabaseConnection_sqlsrv->addRangeToQuery()) AND the fact that there are fields doublon in the query defined in this patch: nid, vid and uid are retrieved from both tables node_revision and workbench_moderation_node_history.
The suggested fix is to retrieved those fields only from the workbench_moderation_node_history table in this query.
I couldn't generate an interdiff from #52 but the only difference is:
Comment #54
rv0 commentedbump
Comment #55
jwilson3RTBC. The changes introduced in #53, didn't have any adverse effects for me, and makes sense.
Also to address Dave's question in #35, since no one else answered it:
Yes, the workbench_moderation_access_link() is still used in a couple other places (albeit much less than before) so we can't remove it.
Comment #56
hass commentedComment #57
dealancer commentedThanks, patch #53 worked well and improved loading time for history page dramatically!
Comment #58
colanRequires a re-roll for latest dev.
Comment #61
damienmckennaRerolled, hopefully. Two of the original changes failed - one was the link building in workbench_moderation_node_history_view(), and the other was _workbench_moderation_revision_access(), I updated both.
Comment #62
gabriel.achille commentedHow could a patch be in queue for testing for more than a week: something wrong with the test website?
On Drupal QA it says status: postponed ?...
Comment #63
damienmckenna@gabriel.achille: That can happen when the branch tests are failing, it basically means that the existing tests need to be fixed and then future tests will test correctly.
Comment #64
gabriel.achille commentedok, thank you for the explanation.
Comment #66
colanSome issues with #61:
There's also some extra whitespace on lines 136 and 144 (visible with Dreditor), but that's minor. I'll see if I can get these fixed in a new patch.
Comment #67
colanHere's what I changed:
See the interdiff for details.
Comment #68
joelpittetThe changes in #67 look great, going to give this patch a test drive. That pager idea may also work on the compare revisions page?
Comment #69
colanI didn't have enough revisions to test the pager. I suppose I could have lowered the cut-off, but I didn't think of that until now. ;) I wasn't really worried about it because I just needed to get the basic functionality working, but the pager would be a happy side effect.
Comment #70
joelpittetI tested it, it seemed to work:) did the lowering to 3 because the same.
Comment #71
colanIf someone other than me RTBCs this, I can commit it.
Comment #72
BarisW commentedGo ahead. Thanks!
Comment #74
colanThanks all!
Comment #75
Leeteq commentedComment #76
jwilson3Nice. Happy to see this has finally made its way in! a long time coming :)
Comment #78
charles belovThank you for your work on this. Is this going to be released in a 7.x-1.5 version soon? We are running into this issue on a 3-year-old site that has some pages with 100+ revisions, but I prefer to avoid patches wherever possible.
Comment #79
colanWe probably should cut a new release soon, but for now, just use the dev version. You don't need to patch anything.
Comment #80
sgdev commentedSo how does this relate to the new 7.x-3.0 release? Is this patch no longer relevant?
Comment #81
das-peter commented@ron_s According to comment #77: "colan committed c1573b6 on 7.x-3.x authored by jwilson3" the patch was already committed to 7.x-3.x - and this issue defines 7.x-2.x as target version. So I'd say it's fair to assume the patch is not relevant for 7.x-3.0.
Comment #82
sgdev commented@das-peter, yes, seemed confusing that there were commits to 7.x-1.x and 7.x-3.x, and tagged to 7.x-2.x. Should the issue be closed?
Comment #83
sgdev commentedSorry, I would assume should stay open since 7.x-2.x does still exist (minimally maintained) as noted on the project page.
Comment #84
das-peter commented@ron_s Yeah I think we leave it open for the moment. Not sure if the issue applies to 2.x since that branch integrates with State Machine to solve some of the tasks.