Hi!

Last night I just discovered, on our live website, a highly-time-consuming code on the node revision history page (the page you get when you click moderate on a node). The fact is the more you have nodes and revisions of nodes, the more this page takes to be rendered. In my case we have a large website with thousands of nodes and revisions, this goes as far as it reaches the max execution time limit set by our hosting company.

I set this report as "feature request" as this is not a bug, as it worked until now, but there is no "code suggestion" category... Feel free to move it to another category if needed ;) I'm writing this to discuss which way must be followed to avoid the problem before writing a patch to fix it (need your opinion).

I found the problem resides in the file workbench_moderation.node.inc in the function workbench_moderation_node_history_view(). At line 135 of the file starts the revision operations part (that ends at line 150). Commenting out these lines makes the page render in milliseconds instead of reaching the max execution time. It seems the time consuming method called is workbench_moderation_access_link().

The function workbench_moderation_access_link() is set at line 1821 of the workbench_moderation.module file. It's a very simple method which just call Drupal's menu_get_item() (in fact plenty of times it there is plenty of revisions).

So the question is, what's your suggestion to replace this call by a more efficient function ? Sorry the my english, I hope it will be understandable ;)

Thank you!

Comments

acrollet’s picture

Version: 7.x-1.1 » 7.x-1.x-dev

I've delved into this a bit, as we've experienced the same problem. Basically the moderate tab for any node with a large number of revisions takes a long time to render, or can even time out. I've found that at least in our case, this seems to be entirely due to the calls to workbench_moderation_access_link() that generate the View/Revert/Delete links. Would it be feasible to replace this function call with a simple permissions check on the relevant ({View,Revert,Delete} Revisions) permissions, and generate the links if the user has the permission? Happy to submit a patch for this approach if a maintainer approves it, or any other suggested approach.

hdennen’s picture

Something like revision_deletion module would be even better. I've just implemented Workbench and can already see how my DB will soon grow exponentially due to so much revision data...

jwilson3’s picture

Status: Active » Needs review
StatusFileSize
new6.01 KB

Here is a patch that applies the suggestion in #1 by using _workbench_moderation_revisions_access() to test for access before building the link, instead of depending on the menu system to build the links. This has incredible performance gains.

In the process of solving this issue, I actually also fixed and improved the functionality of the _workbench_moderation_revisions_access() function itself, which was doing nothing to prevent people from accessing node/%node/revisions/%/revert or node/%node/revisions/%/delete on current and published revisions. This is now remedied (you get an Access Denied message).

I stumbled upon http://drupal.org/node/1064954, which has now been fixed in Drupal core, and so we don't need the temporary -1 / -2 vid's that were being used. This was actually breaking the usage of this function for successive calles of different revisions (such as what is needed for the moderation page).

Note also, that core has changed the $op in _node_revisions_access() from 'revert' to 'update', which was also causing issues.

Though there is a lot going on here, I feel like this is an important performance issue that should be taken into serious consideration.

jwilson3’s picture

Regarding #2, deletion of stale revisions might be a work around for some use cases, but its is not a valid solution to this performance problem because it doesn't address the root of the issue.

Please file bulk revision pruning as feature request on a separate issue if it interests you.

acrollet’s picture

Status: Needs review » Reviewed & tested by the community

I've reviewed the patch in #3, and it definitely fixes the performance issues noted by the OP. Coding style is good and I like the approach, setting to RTBC.

istryker’s picture

In the process of solving this issue, I actually also fixed and improved the functionality of the _workbench_moderation_revisions_access() function itself, which was doing nothing to prevent people from accessing node/%node/revisions/%/revert or node/%node/revisions/%/delete on current and published revisions. This is now remedied (you get an Access Denied message).

Was there actually a security hole. I tried to find it by playing around with the permissions and doing direct url links to node/%/revisions/%/[revert, delete]. Every single time I recieved a 403 (access denied) when I did not have the Revert content revisions and/or Delete content revisions perrmission.

jwilson3’s picture

Was there actually a security hole. I tried to find it by playing around with the permissions and doing direct url links to node/%/revisions/%/[revert, delete]. Every single time I recieved a 403 (access denied) when I did not have the Revert content revisions and/or Delete content revisions permission.

I think you misunderstood this. It wasn't a security hole and nothing to do with edit/revert permissions, but a logic hole: you shouldn't be able to "revert" or "delete" the current revision (ie $node->workbench_moderation['current']) or the published revision (ie $node->workbench_moderation['published']).

You can see this functionality right now on 7.x-1.x-dev:

1) Create a content type, enable revisions and enable workbench default to "Draft".

2) Create a node: node/1

3) Edit the node, to create a new draft: node/1/revisions/2

4) Edit the node again, to create a second revision node/1/revisions/3

5) Try to revert / delete the current published revision: node/1/revisions/1/revert, node/1/revisions/1/delete

Result: You can access the page! There are even direct links to these pages via the "revert" and "delete" link on the Revisions operations column listings at node/1/moderation for the published version. The odd think is that clicking through the confirmation button for either operation does nothing (the node nor the revision is reverted/deleted). In other words, these are useless links to non-functional pages. IMHO better to just block access and not show a link.

6) Try to revert / delete the current draft revision: node/1/revisions/3/revert, node/1/revisions/3/delete

Note: there is no link in the Revisions operations for this, but you can still access the page. The result is that it asks for confirmation to revert/delete, and if you click through, it does absolutely nothing.

7) Finally, try to delete the old draft revision: node/1/revisions/2/delete

This is permitted (as normal) and the draft is then removed from the Revisions listing.

8) Now, apply this code, and try this same routine and note two differences:

8.a) there is no delete/revert link for current drafts or published revisions.

8.b) accessing these two pages (based on grabbing the revision id from the Revisions table and building the link) provide Access Denied for current drafts or published revisions, whereas old revisions are reversible/deletable.

Does this make sense at all?

jwilson3’s picture

Thinking on #7 a little further, there may be one case where reverting does make sense:

You should be able to revert to the published version if the there are newer 'current' drafts. This would effectively create a new "current draft" resetting its contents back to the current published version, essentially abandoning the current edits. This is, in principle, the same thing as adding an edit button (in place of revert).

hyperglide’s picture

Has #3 been reviewed by anyone with commit rights. Looks like a plus.

hyperglide’s picture

jwilson3’s picture

Status: Reviewed & tested by the community » Needs review
+++ b/workbench_moderation.moduleundefined
@@ -429,36 +431,22 @@ function _workbench_moderation_access($op, $node) {
-  // Prevent reverting to or deleting the current revision.
-  if ($node->workbench_moderation['current']->vid == $node->workbench_moderation['my_revision']->vid) {
+  // Prevent reverting to (ie, update) or deleting the current and published revisions.
+  if (($node->workbench_moderation['current']->vid == $node->workbench_moderation['my_revision']->vid
+    || $node->workbench_moderation['published']->vid == $node->workbench_moderation['my_revision']->vid)
+      && ($op == 'update' || $op == 'delete')) {
     return FALSE;
   }

This code fails if there is no published version of the node, giving the following error message:

Notice: Undefined index: published in _workbench_moderation_revision_access() (line 501 of sites/all/modules/contrib/workbench_moderation/workbench_moderation.module).

Fix on the way.

jwilson3’s picture

Status: Needs review » Needs work

Oops.

jwilson3’s picture

This is pretty much the same patch as #3, except I've added an isset() check for the 'published' array index first. See the interdiff for the lo-down.

jwilson3’s picture

Saved the wrong patch :( heres another try.

hyperglide’s picture

jwilson3 Glad to see the updated patch.

Is this ready to be tested on any scale or need a review prior?

Related to #2 issue #1875468: Integrate support for node_revision_delete and workbench_moderation. was created.

hyperglide’s picture

With all the great movement in the WorkBench Issue Queue, wanted to stir the pot and see if this or the issue in #15 can grab any assistance. thx

jwilson3’s picture

Re #15, the current issue status demonstrates that this is ready to be tested. We're using it in at least 2 production projects now.

hyperglide’s picture

@jwilson3 okay. I can test. Can you advise what to look for as this is mostly a back end update? We can run it. .See it things break but any other guidance to help focus our testing?

jwilson3’s picture

The change in this issue can be tested by:

1) Create a node with a LOT of revisions (eg, like 20 or 30).
2) Load the "moderate" tab for the node, taking note of how long the page takes to load. If its instantaneous, then create more node revisions until you start to seriosuly notice the load of the page slow down.
3) Then apply the patch to the codebase, and reload the "moderate" tab for the same node, taking note of how long it takes to load the page.
4) Test permissions model, by setting up different user roles some without the ability to delete nodes, confirm that the revert and delete links work the same way before and after the patch being applied.

UPDATE:

5) Test various moderation state configurations. Eg when you unpublish an older published revision, and have newer unpublished revisions, does everything still work as expected.

hass’s picture

jwilson3’s picture

Even if the original motivation for #1512806 was to solve the performance problem that this issue resolves, I would still consider leaving it open, because adding a pager to the Moderate tab seems like a valid feature request (Eg, imaging a node with 1000 revisions). What do you think?

hass’s picture

Yes that's correct, but I thought to add the 2-3 lines required for a pager here into the patch.

hass’s picture

Re #14: all codewise...

* Why are you using addJoin left outer if we have leftJoin that is an left outer join?
* We shoud fix the singe word operations to be ucfirst as well known. So we do not need to change the lines twice.
*I would use an alias wbnh or nh for the left joined table, but this may be nitpicking.

hass’s picture

Status: Needs review » Needs work
jwilson3’s picture

Status: Needs work » Needs review

This re-roll addresses everything from #24, except for the change to the alias.

* Why are you using addJoin left outer if we have leftJoin that is an left outer join?

As for the "LEFT OUTER" and the alias "m", I was just following existing code examples in workbench_moderation.module (eg see function workbench_moderation_node_data()).

* We shoud fix the singe word operations to be ucfirst as well known. So we do not need to change the lines twice.

Done!

*I would use an alias wbnh or nh for the left joined table, but this may be nitpicking.

Again, I was following existing code examples in workbench_moderation.module (eg see function workbench_moderation_node_data()).

I feel in general that this is pointless because a) it varies from the existing alias used in workbench_moderation.module, and b) the alias is not used anywhere else besides this single query and its plainly obvious what it means. wbnh adds nothing to the readability of the code, and with that logic the node_revisions alias should also be updated to 'nr' from 'r'. Finally, I must insist on leaving this as is, because the three aliases line up vertically to spell 'RUM' (my favorite spirit) :P

I thought to add the 2-3 lines required for a pager here into the patch.

imho, the pager request is a completely separate chunk of code, functionally independent from what this issue addresses, i.e. the fundamentally slow performance of calling workbench_moderation_access_link() repeatedly. The current patch solves the problem in question without the need for a pager. I believe these "2-3 lines would be better served as a separate feature request and separate patch that can be added and tested on its own. That being said, if you want to add the pager to this patch, have a ball. However, hijacking the progress with additional feature requests (ie scope creep) and leaving it marked Needs Work is not productive to moving this forward. Baby steps. Atomic changes. Forward progress. #win.

jwilson3’s picture

thekevinday’s picture

Status: Needs review » Reviewed & tested by the community

The patch works properly and solves the problem.
I have not noticed any new bugs as a result of this.

A moderation view page that takes ~10,000 miliseconds (according to the dev module) prior to this patch now takes ~700ms (which is the ~default for most pages on this particular site).

BarisW’s picture

Same here, patch works great. Thanks!

BarisW’s picture

Please commit. 2 months without action..

kristen pol’s picture

Bump :)

hyperglide’s picture

Is this able to be committed?

kristen pol’s picture

Priority: Normal » Major

I'm bumping to major since it can cause serious performance problems.

vacilando’s picture

The patch seems to work well!

dave reid’s picture

Status: Reviewed & tested by the community » Needs work
+++ b/workbench_moderation.moduleundefined
@@ -193,7 +193,9 @@ function workbench_moderation_menu_alter(&$items) {
   $items['node/%node/revisions/%/revert']['access callback'] = '_workbench_moderation_revision_access';
+  $items['node/%node/revisions/%/revert']['access arguments'] = array(1,'update');
   $items['node/%node/revisions/%/delete']['access callback'] = '_workbench_moderation_revision_access';
+  $items['node/%node/revisions/%/delete']['access arguments'] = array(1,'delete');
 

Need spaces inbetween array values as per coding standards.

+++ b/workbench_moderation.moduleundefined
@@ -494,36 +496,22 @@ function _workbench_moderation_access($op, $node) {
 function _workbench_moderation_revision_access($node, $op) {
+
   // Normal behavior for unmoderated nodes.

Unnecessary additional whitespace.

--

With the patch applied, are there any more uses of workbench_moderation_access_link? If not, we should remove that function.

kristen pol’s picture

Here's the patch with changes from #35.

Status: Needs review » Needs work
kristen pol’s picture

Trying again.

drupalmonkey’s picture

Status: Needs review » Reviewed & tested by the community

Patch from #38 worked for me. Had a node/%/moderate page that was giving a 500 error due to timing out, page loads very quickly now with the patch.

hyperglide’s picture

@Dave Reid -- How things look now?

hyperglide’s picture

@Dave Reid -- How things look now?

hyperglide’s picture

PIng any status on this patch?

jwilson3’s picture

Status hasn't changed it is still RTBC, waiting on free time from maintainers. I'd love to move this up to critical, because without the patch the module doesn't scale at all for any site that uses revisioning and where simply pruning or throwing away old revisions is not an option.

hyperglide’s picture

Priority: Major » Critical

Moving to critical in hopes of getting another review.

jwilson3’s picture

Priority: Critical » Major

I meant to complete my sentence above with "but I don't think moving to critical will get this in any faster". Let's leave it as is and let the maintainer decide if it should be on the short list or not.

fullerja’s picture

fullerja’s picture

Patch in 38 works well for me

fullerja’s picture

Status: Reviewed & tested by the community » Needs work

Came across an issue with the patch in #38, the "Action" links (In particular the "Create a new draft from the published revision." on the node/*/edit page disappear with this patch

miteshmap’s picture

Here is another solution by adding paging.

patch created at Coworks

jwilson3’s picture

The pager limits each page to thirty elements, but iirc, my experience 2 years ago, was that anything over 10 or 15 revisions started to produce a noticeable lag in page display.

Also, this solution does nothing for the bug mentioned in #3 and elaborated in #7 where you can still access non-functional pages to revert/delete revisions that logically cannot be reverted or deleted.

jwilson3’s picture

This is not to say the idea for a pager is a bad one. If anything the two solutions should be combined.

miteshmap’s picture

I have merged both patches and tha would works great.

patch created at Coworks

gabriel.achille’s picture

Status: Needs work » Needs review
StatusFileSize
new7.35 KB

I'm using SQL Server 2012 and once i applied this patch, i have the following error message when trying to access to the 2nd page or more:

PDOException: SQLSTATE[42000]: [Microsoft][SQL Server Native Client 11.0][SQL Server]The column 'nid' was specified multiple times for 'sub1'.: SELECT * FROM ( SELECT sub2.*, ROW_NUMBER() OVER(ORDER BY sub2.__line2) AS __line3 FROM ( SELECT 1 AS __line2, sub1.* FROM (SELECT TOP(60) m.*, n.[vid] AS [live_revision], r.[nid] AS [nid], r.[vid] AS [vid], r.[title] AS [title], r.[log] AS [log], r.[uid] AS [uid], r.[timestamp] AS [timestamp], u.[name] AS [name] FROM {node} n LEFT OUTER JOIN {node_revision} r ON n.nid = r.nid LEFT OUTER JOIN {users} u ON r.uid = u.uid LEFT OUTER JOIN {workbench_moderation_node_history} m ON r.vid = m.vid WHERE ( ([n].[nid] = :db_condition_placeholder_0) ) ORDER BY r.vid DESC) AS sub1 ) as sub2 ) AS sub3 WHERE __line3 BETWEEN 31 AND 60; Array ( [:db_condition_placeholder_0] => 3447 ) in PagerDefault->execute()

AFAIU This is cause by the way Drupal SQL Driver is rewriting the range options in a query (see DatabaseConnection_sqlsrv->addRangeToQuery()) AND the fact that there are fields doublon in the query defined in this patch: nid, vid and uid are retrieved from both tables node_revision and workbench_moderation_node_history.

The suggested fix is to retrieved those fields only from the workbench_moderation_node_history table in this query.

I couldn't generate an interdiff from #52 but the only difference is:

-    ->fields('r', array('nid', 'vid', 'title', 'log', 'uid', 'timestamp'))
+    ->fields('r', array('title', 'log', 'timestamp'))
rv0’s picture

bump

jwilson3’s picture

RTBC. The changes introduced in #53, didn't have any adverse effects for me, and makes sense.

Also to address Dave's question in #35, since no one else answered it:

With the patch applied, are there any more uses of workbench_moderation_access_link? If not, we should remove that function.

Yes, the workbench_moderation_access_link() is still used in a couple other places (albeit much less than before) so we can't remove it.

dealancer’s picture

Thanks, patch #53 worked well and improved loading time for history page dramatically!

colan’s picture

Status: Reviewed & tested by the community » Needs work

Requires a re-roll for latest dev.

Status: Needs work » Needs review

Status: Needs review » Needs work
damienmckenna’s picture

Status: Needs work » Needs review
StatusFileSize
new7.26 KB

Rerolled, hopefully. Two of the original changes failed - one was the link building in workbench_moderation_node_history_view(), and the other was _workbench_moderation_revision_access(), I updated both.

gabriel.achille’s picture

How could a patch be in queue for testing for more than a week: something wrong with the test website?
On Drupal QA it says status: postponed ?...

damienmckenna’s picture

@gabriel.achille: That can happen when the branch tests are failing, it basically means that the existing tests need to be fixed and then future tests will test correctly.

gabriel.achille’s picture

ok, thank you for the explanation.

colan’s picture

Status: Needs review » Needs work

Some issues with #61:

  • Revisions that existed before WM are not listed. Surfing to node/ID/revisions simply redirects to the moderation page, so it's not even possible to get there that way. Even though these may not be relevant to the new workflow, they still exist, and should still be shown. Also, there's no way to delete these from the Web UI. This is due to the join on the workbench_moderation_node_history table. Can we remove it?
  • The one exception to my other point is the node's first revision. That's the only non-WM revision that's shown. Not sure why this one is showing up. Is this intentional? The links for it are broken too as the IDs are missing:
    • node//revisions//view
    • node//revisions//revert
    • node//revisions//delete

There's also some extra whitespace on lines 136 and 144 (visible with Dreditor), but that's minor. I'll see if I can get these fixed in a new patch.

colan’s picture

Status: Needs work » Needs review
StatusFileSize
new2.86 KB
new7.57 KB

Here's what I changed:

  1. I dropped the left join. We don't actually need anything from that table. There's another query later which gets the information that's needed.
  2. Removed the unnecessary whitespace.
  3. Moved the Delete link to the end of the list so that it's the last action, as it was before this issue.

See the interdiff for details.

joelpittet’s picture

The changes in #67 look great, going to give this patch a test drive. That pager idea may also work on the compare revisions page?

colan’s picture

I didn't have enough revisions to test the pager. I suppose I could have lowered the cut-off, but I didn't think of that until now. ;) I wasn't really worried about it because I just needed to get the basic functionality working, but the pager would be a happy side effect.

joelpittet’s picture

I tested it, it seemed to work:) did the lowering to 3 because the same.

colan’s picture

If someone other than me RTBCs this, I can commit it.

BarisW’s picture

Status: Needs review » Reviewed & tested by the community

Go ahead. Thanks!

  • colan committed c1573b6 on 7.x-1.x authored by jwilson3
    Issue #1408838 by jwilson3, Kristen Pol, miteshmap, gabriel.achille,...
colan’s picture

Version: 7.x-1.x-dev » 7.x-2.x-dev
Status: Reviewed & tested by the community » Patch (to be ported)

Thanks all!

Leeteq’s picture

jwilson3’s picture

Nice. Happy to see this has finally made its way in! a long time coming :)

  • colan committed c1573b6 on 7.x-3.x authored by jwilson3
    Issue #1408838 by jwilson3, Kristen Pol, miteshmap, gabriel.achille,...
charles belov’s picture

Thank you for your work on this. Is this going to be released in a 7.x-1.5 version soon? We are running into this issue on a 3-year-old site that has some pages with 100+ revisions, but I prefer to avoid patches wherever possible.

colan’s picture

We probably should cut a new release soon, but for now, just use the dev version. You don't need to patch anything.

sgdev’s picture

So how does this relate to the new 7.x-3.0 release? Is this patch no longer relevant?

das-peter’s picture

@ron_s According to comment #77: "colan committed c1573b6 on 7.x-3.x authored by jwilson3" the patch was already committed to 7.x-3.x - and this issue defines 7.x-2.x as target version. So I'd say it's fair to assume the patch is not relevant for 7.x-3.0.

sgdev’s picture

@das-peter, yes, seemed confusing that there were commits to 7.x-1.x and 7.x-3.x, and tagged to 7.x-2.x. Should the issue be closed?

sgdev’s picture

Sorry, I would assume should stay open since 7.x-2.x does still exist (minimally maintained) as noted on the project page.

das-peter’s picture

@ron_s Yeah I think we leave it open for the moment. Not sure if the issue applies to 2.x since that branch integrates with State Machine to solve some of the tasks.