Order locking is a hard subject, and history has shown that the implementation we have in Drupal Commerce is severely lacking.

Background

Commerce by default locks the order every time, because it cannot know what the order will be used for, and whether it will be saved.
However, a vast number of order loads is for reading purposes only, and never results in a save.
This includes all views, access checking, link outputting, token generation, etc.
For example, our code ensures that an admin viewing the order view makes the order unavailable for the customer to view, and vice versa.
All this results in a number of deadlocks, and people having to disable locking in production (two examples I know are RoyalMail and since today, CommerceGuys Marketplace). Of course, locking is there for a reason, so disabling it opens you to subtle errors and can increase problems.

Related issues:
#1514618: Developer documentation: Loading order entities without considering locking can cause problems
#1320062: SQLSTATE[HY000]: General error: 1205 Lock wait timeout exceeded
#1608294: Provide a way to load an order without locking

Proposed solution

1) Allow orders to be loaded read-only.
2) Don't allow unlocked orders to be saved or deleted, throw an exception in that case.
3) Convert the admin/user order view pages to load the orders read-only
4) Disable locking for safe order views (any other than the cart one, basically).

CommentFileSizeAuthor
#218 commerce-improve-order-locking-2240427-218.patch1.5 KBJorgenSandstrom
#209 commerce-improve-order-locking-2240427-209.patch1.46 KBiSoLate
#189 interdiff-2240427-143-188-do-not-test.diff12.52 KBjsacksick
#189 commerce-improve-order-locking-2240427-188.patch28.13 KBjsacksick
#186 interdiff-2240427-143-186-do-not-test.diff11.58 KBjsacksick
#186 commerce-improve-order-locking-2240427-186.patch27.71 KBjsacksick
#183 interdiff-2240427-143-183-do-not-test.diff11.58 KBjsacksick
#183 commerce-improve-order-locking-2240427-183.patch27.7 KBjsacksick
#180 interdiff-2240427-143-180-do-not-test.diff2.31 KBjsacksick
#180 commerce-improve-order-locking-2240427-180.patch26.6 KBjsacksick
#168 interdiff.txt2.29 KBthePanz
#168 improve_commerce_entities_locking-2240427-168.patch65.67 KBthePanz
#165 interdiff.txt10.74 KBthePanz
#165 improve_commerce_entities_locking-2240427-165.patch63.39 KBthePanz
#164 interdiff.txt2.65 KBthePanz
#164 improve_commerce_entities_locking-2240427-164.patch53.04 KBthePanz
#162 interdiff-159.txt4.33 KBthePanz
#162 improve_commerce_entities_locking-2240427-162.patch51.93 KBthePanz
#160 improve_commerce_entities_locking-2240427-160.patch51.91 KBthePanz
#160 interdiff-159.txt4.18 KBthePanz
#159 interdiff.txt28.76 KBthePanz
#159 improve_commerce_entities_locking-2240427-159.patch50.39 KBthePanz
#156 interdiff.txt27.29 KBthePanz
#156 improve_commerce_entities_locking-2240427-155.patch50.33 KBthePanz
#152 interdiff.txt22.81 KBthePanz
#152 improve_commerce_entities_locking-2240427-152.patch47.81 KBthePanz
#149 interdiff.txt11.77 KBthePanz
#149 improve_order_locking-2240427-149.patch38.65 KBthePanz
#146 interdiff.txt1.54 KBthePanz
#146 improve_order_locking-2240427-146.patch26.54 KBthePanz
#143 improve_order_locking-2240427-143.patch25.2 KBmglaman
#139 improve_order_locking-2240427-139.patch25.2 KBmglaman
#137 improve_order_locking-2240427-137.patch25.49 KBmglaman
#137 interdiff-2240427-137-134.txt7.08 KBmglaman
#134 interdiff-2240427-134-130.txt831 bytesmglaman
#134 improve_order_locking-2240427-134.patch22.78 KBmglaman
#132 improve_order_locking-2240427-132.patch21.97 KBmglaman
#130 interdiff-2240427-130-125.txt23.56 KBmglaman
#130 improve_order_locking-2240427-130.patch22.4 KBmglaman
#126 interdiff-99-124.txt5.51 KBjoelpittet
#124 2240427-124-improve-order-locking.patch22.78 KBsmccabe
#121 2240427-121-improve-order-locking.patch21.89 KBsmccabe
#103 improve_order_locking_7.11-2240427-103.patch22.25 KBboyan.borisov
#99 2240427-99-improve-order-locking.patch22.27 KBsmccabe
#99 interdiff-2240427-90-99.txt805 bytessmccabe
#78 2240427-78-improve-order-locking.patch17.02 KBsmccabe
#78 interdiff-2240427-76-78.txt3.29 KBsmccabe
#77 Screenshot 2015-12-26 19.38.58.png342.35 KBtorgosPizza
#1 2240427-1-rework-order-locking.patch26.1 KBbojanz
#5 2240427-5-rework-order-locking.patch26.1 KBpjcdawkins
#20 2240427-20-rework-order-locking.patch30.48 KBpjcdawkins
#22 2240427-22-rework-order-locking.patch33.14 KBpjcdawkins
#24 2240427-24-rework-order-locking.patch33.99 KBpjcdawkins
#29 2240427-26-rework-order-locking.patch38.33 KBpjcdawkins
#31 2240427-31-rework-order-locking.patch39.11 KBpjcdawkins
#33 2240427-33-rework-order-locking.patch39.08 KBpjcdawkins
#40 2240427-40-rewrite-order-locking.patch13.06 KBbojanz
#42 2240427-42-improve-order-locking.patch14.11 KBbojanz
#44 2240427-44-improve-order-locking.patch18.67 KBnanak
#54 2240427-54-improve-order-locking.patch17.62 KBnanak
#54 interdiff-2240427-42-54.txt3.72 KBnanak
#69 2240427-69-improve-order-locking.patch17.63 KBsmccabe
#70 interdiff-2240427-54-69.txt567 bytessmccabe
#76 interdiff-2240427-69-76.txt9.09 KBsmccabe
#76 2240427-76-improve-order-locking.patch15.11 KBsmccabe
#80 2240427-80-improve-order-locking.patch23.46 KBsmccabe
#80 interdiff-2240427-78-80.txt6.44 KBsmccabe
#87 Screenshot 2015-12-28 17.05.39.png483.64 KBtorgosPizza
#90 interdiff-2240427-80-90.txt1.62 KBsmccabe
#90 2240427-90-improve-order-locking.patch22.63 KBsmccabe
Support from Acquia helps fund testing for Drupal Acquia logo

Comments

bojanz’s picture

Status: Active » Needs review
FileSize
26.1 KB

Here's an initial patch, for your commenting pleasure.

Status: Needs review » Needs work

The last submitted patch, 1: 2240427-1-rework-order-locking.patch, failed testing.

bojanz’s picture

Issue summary: View changes
pounard’s picture

I don't think that the option of throwing exceptions when saving a non locked order is really necessary even useful at all. In the end, you can still reload the order locking it into the save method, actually it will be exactly the same as if you'd loaded it locking it in the first place. The only thing it changes is that you'll lock the order a tiny shorter amount of time. I don't see use cases where it is useful to proceed this way, do you have any?

pjcdawkins’s picture

Status: Needs work » Needs review
FileSize
26.1 KB

New patch with minor syntax fix.

I'm not sure about #4. As far as I can tell, the point of locking is to ensure the order and its related entities are the same as they were at the moment you decided whether you want to make a change. If that's correct, locking an unlocked order right before saving it would be too late.

Status: Needs review » Needs work

The last submitted patch, 5: 2240427-5-rework-order-locking.patch, failed testing.

pounard’s picture

@#5 Right, indeed, another thread could attempt a save between this thread's load and save operations.

Damien Tournoud’s picture

For consistency, the locking is pretty much not optional. We really want to trigger an exception if you are trying to save an unlocked order instead of letting people with unknown bugs that will cause trouble at the worse possible moment (in production during high traffic).

Damien Tournoud’s picture

I like the look of the test results. Now we just need to fix those :)

mibfire’s picture

Is there any workaround till it is evolving cos my sites are collapsing. Would the cron run be solution for this?

a.ross’s picture

@mibfire: try the patch in https://drupal.org/node/1608294#comment-8000455 and see if that works for you. Clear all caches after applying the patch, then it should basically disable locking for all views. We're using it in production out of sheer necessity. Others have disable locking altogether, but I'm not sure how to do it and how big of a chance there is of side-effects.

mibfire’s picture

Thx ur reply a.ross

I tryed this patch https://drupal.org/node/1514618 but didnt solve all cases. What is the difference between these patches? Should i use both of them?

bojanz’s picture

Nobody should be using the patch in this issue for anything, it's not close to being done and will break your entire site.

a.ross’s picture

Basically what my patch does is prevent locking only for Views, as that is the highest impact scenario. If you have problems elsewhere you should probably alter the entity info to disable pessimistic locking entirely. I think I have an idea how to do it, but maybe someone else here can help you with a tested piece of code.

Hopefully the refactoring will be done sooner rather than later, so we won't have to be juggling patches anymore.

mibfire’s picture

Thx, but should i use both patches or just yours?

a.ross’s picture

Please follow Bojanz' advice. The patch here fails tests and it has been clearly mentioned that you shouldn't use it. Like I said, if you still experience problems using my patch in the other issue, then disable locking altogether.

If your hosting allows you to tweak MySQL configuration, that might help as well.

pjcdawkins’s picture

   protected function buildQuery($ids, $conditions = array(), $revision_id = FALSE) {
     $query = parent::buildQuery($ids, $conditions, $revision_id);
 
-    if (isset($this->entityInfo['locking mode']) && $this->entityInfo['locking mode'] == 'pessimistic') {
+    if ($this->requiresLocking() && !empty($conditions['_lock'])) {

The _lock element should be removed from $conditions before passing to parent::buildQuery(), otherwise you get an SQL exception.

I think the other tasks are just to pass $lock = TRUE on the end of a lot of (30 or so) commerce_order_load() and commerce_cart_order_load() calls.

mibfire’s picture

a.ross, I did never say that i wanna use the patch from this issue(https://drupal.org/node/2240427). I said: There are already two patches here https://drupal.org/node/1514618#comment-5825868 and here(it is yours) https://drupal.org/node/1608294#comment-8000455

So which patches should i use or both?

a.ross’s picture

I didn't notice that you were talking about another patch. My bad.
I can't comment on the other one as I don't know what it does. But it's quite likely that combining the patches won't work. I suggest you solely try my patch before trying anything else.

pjcdawkins’s picture

Status: Needs work » Needs review
FileSize
30.48 KB

There's a lot more to do, I'm just taking advantage of the test bot here (to check my local tests are working OK)

Status: Needs review » Needs work

The last submitted patch, 20: 2240427-20-rework-order-locking.patch, failed testing.

pjcdawkins’s picture

Status: Needs work » Needs review
FileSize
33.14 KB

Status: Needs review » Needs work

The last submitted patch, 22: 2240427-22-rework-order-locking.patch, failed testing.

pjcdawkins’s picture

FileSize
33.99 KB

It looks like (possibly) all the recent exceptions were coming from commerce_cart_product_add(), which is the only function that calls commerce_cart_order_new() if a cart order doesn't yet exist. It seems the cart order then needs to be re-loaded from the DB so that it can be locked. So, in commerce_cart_product_add():

 // ...
  // If no order existed, create one now.
  if (empty($order)) {
    $order = commerce_cart_order_new($uid);
    // Reload the order from the database, to enable locking.
    $order = commerce_order_load($order->order_id, TRUE);
    $order->data['last_cart_refresh'] = REQUEST_TIME;
  }
 // ...
pjcdawkins’s picture

Status: Needs work » Needs review

Status: Needs review » Needs work

The last submitted patch, 24: 2240427-24-rework-order-locking.patch, failed testing.

Damien Tournoud’s picture

Status: Needs work » Needs review

Status: Needs review » Needs work

The last submitted patch, 24: 2240427-24-rework-order-locking.patch, failed testing.

pjcdawkins’s picture

Status: Needs work » Needs review
FileSize
38.33 KB

Two issues addressed, one obvious, one less so:

1. If we are loading an order with locking, and it has already been loaded unlocked in the same request, then the entity controller's cache needs to be bypassed. So I added to DrupalCommerceEntityController:

  /**
   * Overrides DrupalDefaultEntityController::load().
   */
  public function load($ids = array(), $conditions = array()) {
    // If locking has been required, then bypass the internal cache for any
    // entities that are not already locked.
    if (!empty($conditions['_lock']) && $this->requiresLocking()) {
      foreach (array_diff($ids, $this->lockedEntities) as $id) {
        unset($this->entityCache[$id]);
      }
    }
    elseif (isset($conditions['_lock']) && !$conditions['_lock']) {
      // Remove the pseudo-condition _lock if it is FALSE.
      unset($conditions['_lock']);
    }
    return parent::load($ids, $conditions);
  }

2. It turns out the function commerce_cart_order_product_line_item_delete() always saves the order, via commerce_line_item_delete() then commerce_line_item_delete_references(). So I removed its $skip_save parameter.

Status: Needs review » Needs work

The last submitted patch, 29: 2240427-26-rework-order-locking.patch, failed testing.

pjcdawkins’s picture

Status: Needs work » Needs review
FileSize
39.11 KB

I wish the testing system didn't create a new comment for every failed patch...

Status: Needs review » Needs work

The last submitted patch, 31: 2240427-31-rework-order-locking.patch, failed testing.

pjcdawkins’s picture

FileSize
39.08 KB

Leaving 'Needs work' as there will still be some test failures. It seems that up to 11 of them are a direct consequence of these changes - the others appear to be indirect effects - and I'm a bit stuck (partly because running tests locally is taking me 1-2 hours).

bojanz’s picture

We should convert the Commerce tests to run under the "testing" install profile (works for me on commerce_license_billing without any additional changes other than the protected $profile = 'testing';), that should speed them up dramatically.

amidda1217’s picture

I have a site deadline in just a few days. Has any progress been made on this?

My issue is that I'm trying to use the Commerce module on a CDN. As soon as the second server connects to the database the link for "add product" and "create product" disappear as well as billing and shipping addresses cannot be added to orders.

Has anyone experienced these problems? Is there a fix for the database locking?

bojanz’s picture

@amidda1217
Order locking has nothing to do with those operations. Your problem is elsewhere.

sepehr.sadatifar’s picture

I want to work on this, does anyone have worked on this after #33 but hasn't posted it here?
tnx

Jurgen8en’s picture

Hi guys,

(I took the latest dev version and applied the patch.)
When trying to save an order at '/admin/commerce/orders/78/edit',
gives the following error. That is the main problem at the moment? Or I did something wrong

Exception: Cannot save unlocked commerce_order entity #78 in DrupalCommerceEntityController->save() (line 222 of C:\xampp\htdocs\coolgift\profiles\commerce_kickstart\modules\contrib\commerce\includes\commerce.controller.inc).

pjcdawkins’s picture

Re. #37: I'm not working on this at the moment and I don't think anyone else is.

Re. #38: the patch is going to need quite a lot of extra work, the problem is there are so many unpredictable ways an order can be updated so it's hard to catch all cases.

bojanz’s picture

New plan, explanation in the next comment.

Note that tests are currently broken because of an addressfield regression, I'll need to fix that tomorrow before pinging the testbot.

bojanz’s picture

Issue summary: View changes

So, this patch went nowhere because trying to load orders unlocked by default is too big of a change for Commerce 1.x.
So instead we keep locking by default, but allow orders to be loaded unlocked, and use that for the admin/user order view pages.
We also implement hook_views_pre_execute() to skip locking on safe order views (such as the admin/customer views, all except the cart view basically).
That's as far as we can go in 1.x.

The potentially problematic change is the exception throwing in save. We never actually checked if the order being saved is locked, so there were edge cases where unlocked orders were saved:
1) Once an order is saved for the first time, it is unlocked, running commerce_order_save($order) on it again saves it unlocked. This was being done by the commerce_cart_product_add() function.
2) Once an order is saved once, the lock is released, so if by any chance commerce_order_save($order) runs again, the order will be saved unlocked.

I've addressed both of these cases in my patch, but I'm afraid more could pop up. For example, if a form handler tries to save an order retrieved from form state (without reloading it first like Commerce does).

bojanz’s picture

Title: Rework order locking » Improve order locking
FileSize
14.11 KB

I had to remove the exception throwing in save() / delete() because there was no way to make tests pass, even Commerce core isn't strict about only saving the order once (especially around checkout, panes&payments...).

So the summary of the patch is:
1) Introduce a way to skip locking in a single request (global switch on the controller)
2) Skip locking for safe order views using #1
3) Provide a way to load an unlocked order, and add a function for checking whether the order is locked.
4) Load the unlocked order on the order view pages.

Test coverage has been added. The patch passes tests via Travis, so it is committable.

A future patch could try to fix Commerce order saving, and then introduce logging warnings in save() / delete() for unlocked orders, I ran out of time.

Marko B’s picture

Wondering did people implemented latest patch and how does it work for them? Comments?

nanak’s picture

Yep I did and found an issue, fixed with the attached patch.
Adding a condition in $conditions['_lock'] caused the DrupalDefaultEntityController:load() to load all the line items stored in the db (causing the memory being completely eaten) on each page call because of
if ($ids === FALSE || $ids || $revision_id || (($conditions && !$passed_ids) in (includes/entity.inc:187) (I noticed it only for line item, but I guess all Commerce entities are affected)

I've updated the DrupalCommerceEntityController::load() method by the complete DrupalDefaultEntityController, but adding an extra condition in the if that checks if _lock is the only key in the $conditions array.

torgosPizza’s picture

We're starting to see lock_wait a lot now as well, and it makes me wonder if this could cause other issues such as line items becoming out of sync and discounts/coupons being erroneously removed from an order? Interested to know what kinds of side effects there are from simply removing locking for order entities. Thanks for the work on this issue - as soon as a patch is determined to be "ready for production" I will test it on our production server.

bojanz’s picture

@torgosPizza
It would be great if you could figure out where the lock_wait is happening, the patch is supposed to make the system lock less, not more.

torgosPizza’s picture

So far it seems like simply loading a Commerce Order that allows its line items to be refreshed make the system prone to lots of lock_wait, sometimes in the order of 15 seconds.

I'm still digging in, but it seems related to this issue I created earlier: #2384331: Do not look for Checkout panes on non-Checkout pages - see my latest comment .. and it is exacerbated with the new Memcache Stampede Protection (which we need to avoid other issues!).

If you think this patch could help, I'd definitely like to test it.

@nanak: Do you think you could re-roll this patch with --relative?

joelpittet’s picture

@bojanz the improvements are huge with the locking, but my number of queries jumped too and made it slower.
So the locking query was eating up 318ms and only called 5 times and after it was 1ms, but there seemed to be more calls at 19 times and over all the those extra transactions added up to a slower page load from anywhere between 100ms to 1s. So I'm seeing slower results kinda of like @torgosPizza. There is also a bit more on the unserializing for some reason...

@nanak would you mind putting an interdiff on there so we can see what you changed from #42 It would really save us on review time when trying to decide if your changes are all ok. If you haven't made one before it's just a diff between the first patch and the second, here's instructions https://www.drupal.org/documentation/git/interdiff

joelpittet’s picture

Also I had a large order, 42 line items in the cart, 2 discounts, 5 product variation types

joelpittet’s picture

@nanak's patch has a slight improvement but not much, the biggest drawback I see is there are >1000 more calls to DrupalCommerceEntityController::load

joelpittet’s picture

Ok using @nanak's patch here's an unscientific performance test.

This is on the cart page with Devel SQL query log (refreshed the page 3 times to warm the caches after swapping patches, probably not needed but did it anyways:P):

No patch:

Executed 364 queries in 212.3 ms.
Page execution time was 3720.5 ms.
Memory used at: devel_boot()=2.9 MB, devel_shutdown()=59.44 MB, PHP peak=60.75 MB.

With @nanak's patch: #44

Executed 395 queries in 225.44 ms.
Page execution time was 3683.17 ms.
Memory used at: devel_boot()=2.9 MB, devel_shutdown()=59.68 MB, PHP peak=60.75 MB.

With @bojanz's patch in #42

Executed 395 queries in 226.07 ms
Page execution time was 3708.6 ms.
Memory used at: devel_boot()=2.9 MB, devel_shutdown()=59.68 MB, PHP peak=60.75 MB.

Looks like there are 15 more DrupalEntityController::load calls that generate SQL queries.
3 extra calls to DrupalCommerceEntityController::buildQuery (but they seem like they just moved around a bit in their execution)
1 extra call to field_sql_storage_field_storage_load

Marko B’s picture

I am bit confused, first you said over 1000, now you say there are 15 and 3 extra calls. What is the conclusion. No difference with performance, but better with deadlocking?

joelpittet’s picture

Those were different tests, granted that technique is horrible though the second test is a bit more consistent.

The first one was using blackfire.io and comparing the profiles. The second was just devel query output in the footer.

But with the second was more careful to make sure the results were similar after cache warmed.

In general there were more queries run and no less SELECT ... FOR UPDATE calls but more which I don't think is helping the problem.

I should run xhprofkit on this to get super accurate results, but I'm quite sure I'd see any better results with these patches.

nanak’s picture

Here's the patch made relative and its interdiff

Marko B’s picture

I have to say I am confused with this :) tempted to apply it to production where order locking happens 2-4 a day but not sure will it bring more problems or help.

nanak’s picture

It's already on production on my side on a large corpo site, it did not introduced more issues (but do not fix them all of course)

enorniel’s picture

Nanak, can you do your patches relative to the commerce module only ?

You make them relative to a profile install. I have to rewrite all paths before apply.

Thanks.

nanak’s picture

@enorniel: the patch #54 have been made relative, do not use #44

Anybody’s picture

Can we get an update what's the current status and how we can load an order without locking until there is a final solution? This is really major but things don't seem to move forward. What can we do?

lildragon77’s picture

anyone with luck solving this issue? thanks.

torgosPizza’s picture

Issue tags: +commerce-sprint

Tagging for Sprint to try and at least get some eyes on it.

smccabe’s picture

Using the newest #54 patch I am able to reproduce @joelpittet's results. Using a simple view with 1 order I get the following without and with the patch.

	                   Queries 	Execute Time	 For Update Calls
Without patch	       57	               ~4.9ms	       2
With patch	       72	               ~6.5ms	       9

We also tested the patch on a larger production site of ours and noticed slight speed decreases as well. I'm not sure what the problem is yet, but testing show the new patch is clearly worse.

smccabe’s picture

Ok, more info. Turns out I had missed my views cache clear on that test and the performance is actually identical, except with no more "for update" calls, which is what this patch intends to do so makes sense.

	                   Queries 	Execute Time	 For Update Calls
Without patch	       57	               ~4.9ms	       2
With patch	       57	               ~4.9ms	       0

This mistake actually proved useful, at it turns out that this patch is better if locking is not required, but worse if it is. I did some further investigation and this appears to be the offending piece.

+++ b/includes/commerce.controller.inc
@@ -21,14 +44,157 @@ class DrupalCommerceEntityController extends DrupalDefaultEntityController imple
+    // If locking has been required, then bypass the internal cache for any
+    // entities that are not already locked.
+    if (!empty($conditions['_lock']) && $this->requiresLocking()) {
+      foreach (array_diff($ids, $this->lockedEntities) as $id) {
+        unset($this->entityCache[$id]);
+      }
+    }

The cache gets unset for locking entities, which is not something we previously used to do, so it makes the existing functionality worse and requires more entity loads. I don't understand why this part is here, it doesn't seem like it is needed for this patch as it would be an existing problem with locking. Hopefully someone with more insight can clue me in to what is going on.

smccabe’s picture

Status: Needs work » Needs review
santerref’s picture

I'm wondering if someone is currently testing #54 or if anyone have more details about this issue?

skek’s picture

Subscribe

mglaman’s picture

This looks good. There's a handful of coding standard changes which make the patch a bit harder to discern what existing in regards to locking workflow and what was changed, but not horrible.

+++ b/includes/commerce.controller.inc
@@ -21,14 +44,157 @@ class DrupalCommerceEntityController extends DrupalDefaultEntityController imple
+    // If locking has been required, then bypass the internal cache for any
+    // entities that are not already locked.
+    if (!empty($conditions['_lock']) && $this->requiresLocking()) {
+      foreach (array_diff($ids, $this->lockedEntities) as $id) {
+        unset($this->entityCache[$id]);
+      }
+    }

I wonder if this tidbit right here could help mitigate issues we have with orders in AJAX because we're always hitting entity cache. When loading an order that needs a lock we can then bypass.

This looks really promising, but just needs some manual testing and verification.

smccabe’s picture

@mglaman the problem with that code snippet is it doesn't work right though, it bypasses caching way too often and actually makes this patch perform worse instead of better. I haven't got a chance to dissect exactly what is wrong with it though, hopefully this weekend.

smccabe’s picture

Aight, so I got this bastard figured out:

This little bit is at fault here, at first glance it looks fine, but lockedEntities is actually array_flip()'ed when it is set, so this never matches, thus causing us to uncache the entity each time we call load, which on my current test page resulted 10 addition queries each page load.

+      foreach (array_diff($ids, $this->lockedEntities) as $id) {
+        unset($this->entityCache[$id]);
+      }

Updated patch is attached, I have tested it a bunch and it now seems to have the expected effects, speed remains the same but FOR UPDATES are removed in some instances. I think there are further instances we could remove locking, but that is for another issue.

smccabe’s picture

FileSize
567 bytes

Interdiff as per request

Anybody’s picture

Whao great work! Do you have performance results like in #63?
I'll have a look at it soon on course. At first glance the patch looks good.

Let's get some more feedback on this. We should fix this as soon as ever possible. Thank you very much!!

smccabe’s picture

@anybody yes, performance is the same, I don't have the same testing environment setup, but here are some stats on my current testing setup. I included a bit more detail this time, as you can see all the queries numbers and times remain the same and only the FOR UPDATEs decrease. I also included the edit page to confirm it remains unchanged as it should still be using a FOR UPDATE.

Without Patch			
URL                              # of Queries      Query Time       FOR UPDATEs
admin/commerce/orders                182	           20ms	          1
admin/commerce/orders/1              187	           16ms	          1
admin/commerce/orders/1/edit         139	           16ms	          1
			
With Patch			
URL	                           # of Queries      Query Time     FOR UPDATEs	
admin/commerce/orders                182	           20ms	          0
admin/commerce/orders/1              187	           16ms	          0
admin/commerce/orders/1/edit         139	           16ms	          1
torgosPizza’s picture

We just had a shippable order take 50 seconds to complete a SQL query. It happened while calculating shipping line items, so I'm not sure if there is something in the Shipping module that is causing a new bottleneck or what. Thoughts?

smccabe’s picture

This patch would only effect orders themselves, it doesn't change anything for line items. Also it only fixes some of the locking, there are still lots of instances where locking occurs, but those should probably be handled by future patches.

Do you happen to know what the query was? I can look into it but it should probably be split off into a separate issue.

Damien Tournoud’s picture

+    // Lock by default if the caller didn't indicate a preference.
+    $conditions += array('_lock' => TRUE);

^ You can set this directly to $this->requiresLocking() so that you don't have to recheck that everytime down the stack... something like:

$conditions += array('_lock' => TRUE);
$conditions["_lock"] = $this->requiresLocking() && $conditions["_lock"];

Also, we do control the stack from there, so let's pass this condition as a protected property to ->buildQuery() and ->cacheGet() so that we don't have to hack around $conditions everywhere.

+    // If locking has been required, then bypass the internal cache for any
+    // entities that are not already locked.
+    if (!empty($conditions['_lock']) && $this->requiresLocking()) {
+      foreach (array_diff(array_flip($ids), $this->lockedEntities) as $id) {
+        unset($this->entityCache[$id]);
+      }
+    }

^ I assume this means to use array_diff_key(). Please add an explicit test case for this branch.

+    $entities = array();
+
+    // Revisions are not statically cached, and require a different query to
+    // other conditions, so separate the revision id into its own variable.
+    if ($this->revisionKey && isset($conditions[$this->revisionKey])) {
[... a ton more code...]

^ If we start passing around the lock property as a protected property, we can remove this nearly verbatim copy of the ancestor here, no?

       // Maintain the list of locked entities and release the lock if possible.
-      unset($this->lockedEntities[$entity->{$this->idKey}]);
-      $this->releaseLock();
+      if ($this->requiresLocking()) {
+        unset($this->lockedEntities[$entity->{$this->idKey}]);
+        $this->releaseLock();
+      }

^ Given that ->requiresLocking() can change value between load and save, please just do that unconditionally (i.e. just remove this hunk).

smccabe’s picture

Hey Damien, thanks for all the good feedback!

#1 - With changing to a protected property, I can just include most of this in requriesLocking
#2 - Changed to protected property, I was able to clean things up a LOT because of this
#3 - You are correct sir, also the unset was not correctly using the key, so I fixed that as well. Also added some isCached checking to the tests to confirm the locking is not breaking caching
#4 - Removed 90% of it and replaced with parent::load, only the custom bit at the start is now there
#5 - Removed

I posted an interdiff, but tbh it is probably just easier to look at the new patch since lots was removed.

torgosPizza’s picture

@smccabe: Thanks, indeed there seems to be some weirdness around orders being saved/loaded any time a line item is added to an order; in this case it was a shipping line item. I'll have to dig in a bit further with XHprof (see my New Relic screengrab below):

Screenshot

I'll update my patch with the latest too just to keep parity (and to continue testing)!

smccabe’s picture

Updated patch with change to use a single transaction object for loading and saving, instead of 2 separate objects. Also separated clearing cache and releasing locks, since you may want to do one and not the other, for example loading an original from the database for comparison.

The last submitted patch, 78: 2240427-78-improve-order-locking.patch, failed testing.

smccabe’s picture

Turns out resetCache was used in a few tests as an unlock hack, replaced with releaseLocking() function.

torgosPizza’s picture

Green! That is awesome. Fantastic work, I'll take a look and test asap.

smccabe’s picture

Additional performance testing data.

If everything is running quickly, locking is unlikely to become an issue as the locks will clear out quicker than the page loads and not become an issue. I ran a simple test at 30 concurrency both with and without the patch and times were essentially the same, about 10 trans/sec on my little test server.

To simulate some less than idea circumstances where locking would be more problematic and test the patch, I added a 1sec delay into the page generation during the "locked" phase. Without the patch applied this brings things essentially to a halt, I get 0.04 trans/sec and almost everything ends in a timeout. With the patch enabled I get roughly the same at just under 10 trans/sec, although with an expected ~1 second longer response time.

                                 trans/sec
7.x-1.x - No Delay                 ~10/sec
W/Patch - No Delay                 ~10/sec
7.x-1.x - 1sec Delay               ~0/sec
W/Patch - 1sec Delay               ~10/sec

Testing command for those interested:

siege -c 30 -b -t120s -v --header="Cookie: SESSa84...=wREq..." drupalvm.dev/admin/commerce/orders
torgosPizza’s picture

I have pushed the new locking code to Production, and so far so good. In Dev I was noticing our load time for Cart page cut in half (1.5 seconds from ~3 seconds) using identical code except one with locking and one without. So I'm cautiously optimistic that we'll see some good improvements on production as well. I'll be watching in New Relic as we push a new product release later today!

torgosPizza’s picture

@smccabe and I were discussing in IRC, it seems that the commerce_line_item_delete() function does not pass its transactions back to its subsequent load() and save() calls, resulting in potential locked tables. We saw this in a 15 second SELECT ... FOR UPDATE query when a user clicked "Remove" on an item in /cart.

Hoping to see a new version of the patch later today, and once we have it I'll put that up on production as well.

Damien Tournoud’s picture

   public function save($order, DatabaseTransaction $transaction = NULL) {
-    if (!isset($transaction)) {
-      $transaction = db_transaction();
+    //we don't really ever want to have 2 transactions, should we really even accept passed in transactions? or at least flush our internal one immediately?
+    if (isset($transaction)) {
+      $this->controllerTransaction = $transaction;
+    }
+
+    //check if we already have a transaction, else create one
+    if(!isset($this->controllerTransaction)){
+      $this->controllerTransaction = db_transaction();
       $started_transaction = TRUE;
     }

^ This is not a good idea. Transactions are nested, and inner transactions are simply SAVEPOINTs. In case this save operation fails, you want only that operation to be rolled-back, not the whole transaction stack. (Plus, the database layer will freak-out if you try to rollback out of order.)

Damien Tournoud’s picture

@torgosPizza: I don't understand what you are describing. From the look of it, it seems like *something else* is causing the lock to be held. A single request cannot by definition lock itself. You should track what is locking... on production systems, admin pages and background processes (like advanced queue, etc.) are common culprit.

torgosPizza’s picture

@Damien Tournoud: It was definitely a SELECT ... FOR UPDATE in the commerce_order table after a customer deleted an item from their cart:

Based on what you're saying in #85 perhaps I'll revert our lock patch for now, but I will admit that the locking fixes have shown performance improvements and (so far) no side effects that I can tell.

smccabe’s picture

Hey @Damien, Yah I spent all last night learning about Drupal's transactions, takes a bit cause they are not really documented. I now know about how it sets an initial transaction, then everything else within that is just save points, when you pass a name it is just for save points, etc. etc. What I was trying to do with that piece of code is make sure both transactions close, so it worked by making sure everything got closed off, but as you pointed out it did so incorrectly. I will take another go at adjusting the patch to do what I intended correctly.

@torgosPizza, I looked into delete line item stuff a bunch last night, it does lock for a much longer time than loads or adds, but @Damien is correct in that a single thread can't conflict with itself since it just uses savepoints within the same transactions. What is somewhat of a problem is that order_load usually starts a transaction early on, so a transactions is held for most of the pageload. By itself though, this is ok, as this should be the only instance of this order being accessed, so the lock should not cause issues. There has to be something else that is also locking the order to conflict, admin views, cron jobs, ajax requests, multiple tabs open etc. The problem presents itself on line_item_delete likely because that holds a lock for longer than some other actions, so it just has a higher chance of conflicting with some other process that is running and holding a lock.

I know one another site we have locking issues on, here are some big admin views and reports that would lock everything up and cause huge deadlocking issues. Do you have anything that runs and would likely be hitting orders that are still in cart status?

torgosPizza’s picture

Ah okay. That does make sense. I'll wrack my brain to see if I can figure out what else might be causing this.

smccabe’s picture

@Damien, updated patch put up, I just removed that piece as on further inspections no replacement was needed, it succeeded only in messing with the rollback. Also a couple of other small fixes added, easy to see in the interdiff.

vadym.kononenko’s picture

@smccabe
1) #78 patch has unfinished transaction related changes at 'modules/order/includes/commerce_order.controller.inc' file.
See '->rollback()' method call which must be called as $this->controllerTransaction->rollback() in your logic.
2) I think this part of the patch could be ommited as new transaction opening will just add 'SAVEPOINT' at DB level.
3) Each commerce entity type controller uses identical transaction behaviour. Just keep it similar.
4) 'commerce_entitycache' module requires to be patched as well to be compatible with your changes. Per every 'includes/commerce_entitycache.*.inc' file:

public function load($ids = array(), $conditions = array()) {
+    unset($conditions['_lock']);
     return EntityCacheControllerHelper::entityCacheLoad($this, $ids, $conditions);
   }
Damien Tournoud’s picture

@vadym.kononenko: if commerce_entitycache replaces the commerce_order entity controller, it probably doesn't provide locking, which is *extremely* unsafe. Don't use it.

smccabe’s picture

@vadym.kononenko, patch 78 is outdated and that controller stuff is removed as per #90, transactionController is used for locking and not for rollbacks.

smccabe’s picture

@vadym.kononenko @Damien Tournoud, looking over commerce_entitycache, it calls the buildQuery parent controller function still, so on clean loads locking would still get turned on. The big problem seems to be that since it overrides the load function, it doesn't call this:

    // If locking has been required, then bypass the internal cache for any
    // entities that are not already locked.
    if ($this->requiresLocking()) {
      foreach (array_diff_key(array_flip($ids), $this->lockedEntities) as $id => $value) {
        unset($this->entityCache[$id]);
      }
    }

Which means that if an order gets loaded and then needs to be locked, it will already be cached and not be cleared out and locked and instead the unlocked version from cache will be used. There may be other problems as well, as it replaces a lot of code and I only did a quick scan, but this was the obvious problem that jumped out at me related to this patch.

Additionally, since commerce_entitycache can cache orders across page loads, it seems like it would stomp on locking completely, since commerce's locking has no concept of locking once the database connection has been closed. Commerce will assume it is receiving a locked order when it could be receiving from cache. I would agree with Damien that commerce_entitycache disregards order locking completely.

Damien, correct me if I said anything stupid above plz :)

vadym.kononenko’s picture

@smccabe, thank you for your attention and explanation about 'commerce_entitycache'.

torgosPizza’s picture

As with the previous patches this one works very well for me. I did notice an almost immediate improvement to checkout performance. I'd set this to RTBC but I figure others may want to look at this and test more deeply, but for us - using this in production - the results have been very positive so far.

joelpittet’s picture

  1. +++ b/modules/order/commerce_order.module
    @@ -774,6 +784,8 @@ function commerce_order_load_by_number($order_number) {
    + * @param $lock
    + *   Whether to lock the loaded order.
    ...
    @@ -782,6 +794,32 @@ function commerce_order_load_multiple($order_ids = array(), $conditions = array(
    

    Was this param to be added missed or is it an accident?

  2. +++ b/modules/order/commerce_order.module
    @@ -1472,6 +1510,34 @@ function commerce_order_entity_query_alter($query) {
    +  $previous_result = &drupal_static(__FUNCTION__, NULL);
    

    nit: the NULL second argument is the default, not needed.

  3. +++ b/modules/order/commerce_order_ui.module
    @@ -35,6 +35,8 @@ function commerce_order_ui_menu() {
    +    'load arguments' => array(FALSE),
    
    +++ b/modules/payment/commerce_payment_ui.module
    @@ -13,6 +13,7 @@ function commerce_payment_ui_menu() {
    +    'load arguments' => array(TRUE),
    

    Is this an undocumented trick with 'load arguments'? The docs don't seem to mention this is how it works...
    https://api.drupal.org/api/drupal/developer!hooks!core.php/function/hook...

joelpittet’s picture

Status: Needs review » Needs work

Sorry should have changed the status on #97

smccabe’s picture

Status: Needs work » Needs review
FileSize
805 bytes
22.27 KB

#1 - parameter was added, then change to protected property, I forgot to remove comment, fixed
#2 - fixed
#3 - this was done before I started working on the patch, but yes it basically passes that parameter along to the automatic load function coming from the %order option. ctrl+f for "load arguments" here, it is down in the docs a bit https://api.drupal.org/api/drupal/modules!system!system.api.php/function...

joelpittet’s picture

Thanks @smccabe re #3 ah that's my bad I didn't know how that magic offset 'load arguments' worked.

joelpittet’s picture

Status: Needs review » Reviewed & tested by the community

Using the siege command from #82 I got the following with a production site.

There are 2200 orders and showing 50 at a time in the view. Turn on order viewing permissions for anonymous so I didn't have to pass in the session header (cheating). xdebug off.

siege -c 30 -b -t120s -v http://eminence.dev/admin/commerce/orders

Without Patch
=============
Transactions:             140 hits
Availability:             100.00 %
Elapsed time:             119.18 secs
Data transferred:         18.05 MB
Response time:            23.29 secs
Transaction rate:         1.17 trans/sec
Throughput:               0.15 MB/sec
Concurrency:              27.36
Successful transactions:  140
Failed transactions:      0
Longest transaction:      35.12
Shortest transaction:     5.42

With Patch
==========
Transactions:             195 hits
Availability:             100.00 %
Elapsed time:             119.60 secs
Data transferred:         25.13 MB
Response time:            16.83 secs
Transaction rate:         1.63 trans/sec
Throughput:               0.21 MB/sec
Concurrency:              27.43
Successful transactions:  195
Failed transactions:      0
Longest transaction:      23.41
Shortest transaction:     10.32

Though maybe not amazing (and I probably have other problems I should look at here). This seems to be getting nearly 1.4 times the transactions in the same amount of time. And I ran this a few times with similar results.

EDIT: FYI those are apache transactions not SQL transactions (because we are talking about transactions in this issue)

mglaman’s picture

boyan.borisov’s picture

I'm going to test the latest patch against version 7.11.
I've attached the relevant backport patch.

Status: Reviewed & tested by the community » Needs work

The last submitted patch, 103: improve_order_locking_7.11-2240427-103.patch, failed testing.

mglaman’s picture

Status: Needs work » Reviewed & tested by the community

You can out "do-not-test" in file name so that it doesn't get run by DrupalCI

joelpittet’s picture

I've been getting lock timeouts this weekend, may be just me and they seem sporadic but thought I'd mention it. Mostly after I did some updates to rules.

smccabe’s picture

@joelpittet, with or without patch?

joelpittet’s picture

With the patch

torgosPizza’s picture

So far we haven't seen any issues like this in production, but I also have these patches applied: #2303721: make variable_initialize safe when the lock system fails and #1191290: system_cron() should not run hook_flush_caches(), but use a cached version. Might take a look (especially if you use Memcache).

The lock_wait issues we were seeing were mostly happening during the variable_initialize function during bootstrap. It would show up sometimes during PUT requests in Services, and sometimes in the checkout_router, which is what made it hard to track down. I think it was just part of cron/time-based cache rebuilds.

Between those two patches and the order locking patch the most time-consuming calls in our site now on the back-end have been all the external API calls. Right now I'm very happy with our performance!

joelpittet’s picture

Just to clarify where this is happening, it only seems to be with one of my menu callbacks, also in dc_cart_ajax which is loaded through authcache ajax too and a custom context order load in panels to check if the cart is empty. So at least 2 of those I'll be able to change to load without locking and see if it helps.

joelpittet’s picture

Not removing my RTBC, because this patch doesn't fix all locking, it just gives us the tools to do so, so I've tried to change my custom stuff to unlocked read-only requests.

torgosPizza’s picture

@joelpittet: Intriguing! I'd love to learn more about your custom code, since that optimization might be something I could use on our site as well.

torgosPizza’s picture

Oddly enough two days ago we had an order that got stuck at "Checkout: Complete" and did not move into "Completed" (which happens with Rules). Looking at syslog, the only thing I can find is a deadlock error when attempting to get a lock during the INSERT for field_revision_commerce_customer_billing.

Jan 17 15:56:25 10.45.15.248 rifftrax7: https://www.rifftrax.com|1453046185|php|https://www.rifftrax.com/checkout/.../paypal_ec|https://www.rifftrax.com/checkout/.../review|||PDOException: SQLSTATE[40001]: Serialization failure: 1213 Deadlock found when trying to get lock; try restarting transaction: INSERT INTO {field_revision_commerce_customer_billing} (entity_type, entity_id, revision_id, bundle, delta, language, commerce_customer_billing_profile_id) VALUES (:db_insert_placeholder_0, :db_insert_placeholder_1, :db_insert_placeholder_2, :db_insert_placeholder_3, :db_insert_placeholder_4, :db_insert_placeholder_5, :db_insert_placeholder_6); Array

So far that's the only issue we have had - haven't had any other orders stuck since the beginning of the year, which is definitely a good thing. But obviously I'm curious as to what could've caused the deadlock.

Damien Tournoud’s picture

@torgosPizza: are you running in READ COMMITTED transaction isolation mode? This sounds to me like a deadlock that really should not happen.

Most likely, there is something on your site that is causing these concurrency issues (long running background processing? or drush command lines? weird things happening in the admin UI? API callbacks from external systems?). We have had very large Drupal Commerce websites in production for years without seeing this type of issues.

torgosPizza’s picture

@Damien: You're right, this is probably unrelated. I think there are some performance improvements we can make on the hosting side. Thanks!

joelpittet’s picture

Status: Reviewed & tested by the community » Needs work

Ok this caused me some grief lately so I'm knocking back to NW, sorry @smccabe.

So commerce discount was creating new line items, then trying to read them and the new line item was in entityCache but not in the database yet because it was still in a nested transaction. So when reading from entityCache it would be fine but as soon as something asked for it from the DB it would bail saying 'bundle missing in entity'. This caused me 2-3 days of entity metadata wrapper debugging:(. Thought it was an AJAX related race condition but nope, also thought it was entityCache problem (kinda was because it was made available to entityCache before it was saved).

The scenario: I had a views_handler_field_field::post_execute() loading the order. Which is a views listing in a panel. That started the transaction DrupalCommerceEntityController::buildQuery, and never released it.

I couldn't seem to figure out how or where it should've released completed the transaction, ping me on IRC to discuss further.

joelpittet’s picture

Status: Needs work » Reviewed & tested by the community

I think it may be my version of redis/memcache, I'm using apdqc which is supposed to help deadlocks entirely.
#2657922: Commerce - module incompatibilities

If I add this patch in, get my errored up page again it still fails, when I turn off that that cachebackened it does not fail.

My related patch #2657792: Cart order totals incorrect after updating line item quanity needs to be fixed still I think but it doesn't affect the outcome of the bundle missing fatal error, though it is also brought to light from apdqc's transaction handling.

rszrama’s picture

Assigned: Unassigned » rszrama
Status: Reviewed & tested by the community » Needs work

This is gonna need a bit of comment clean-up and potential refactoring before I commit it, but I'm committing myself to do it. Basically, some docblocks have copypasta errors, and I'm not convinced by the names of the protected properties and methods (e.g. requestLocking, given the fact that we actually default that to TRUE and disable locking per the $conditions array, it makes more sense to talk about releasing locks not locking, etc.).

Otherwise, the patch looks good to me. I don't have time to do the refactor right now but will try to get 'er done and re-tested this weekend.

arefen’s picture

After apply patch #99 i still get bellow error

PDOException: SQLSTATE[HY000]: General error: 1205 Lock wait timeout exceeded; try restarting transaction: SELECT revision.order_number AS order_number, revision.revision_id AS revision_id, revision.revision_uid AS revision_uid, revision.mail AS mail, revision.status AS status, revision.log AS log, revision.revision_timestamp AS revision_timestamp, revision.revision_hostname AS revision_hostname, revision.data AS data, base.order_id AS order_id, base.type AS type, base.uid AS uid, base.created AS created, base.changed AS changed, base.hostname AS hostname FROM {commerce_order} base INNER JOIN {commerce_order_revision} revision ON revision.revision_id = base.revision_id WHERE (base.order_id IN (:db_condition_placeholder_0)) FOR UPDATE; Array ( [:db_condition_placeholder_0] => 3669 ) in DrupalDefaultEntityController->load() (line 198 of includes/entity.inc). Backtrace:

PDOStatement->execute(Array) database.inc:2171
DatabaseStatementBase->execute(Array, Array) database.inc:683
DatabaseConnection->query('SELECT revision.order_number AS order_number, revision.revision_id AS revision_id, revision.revision_uid AS revision_uid, revision.mail AS mail, revision.status AS status, revision.log AS log, revision.revision_timestamp AS revision_timestamp, revision.revision_hostname AS revision_hostname, revision.data AS data, base.order_id AS order_id, base.type AS type, base.uid AS uid, base.created AS created, base.changed AS changed, base.hostname AS hostname
FROM 
{commerce_order} base
INNER JOIN {commerce_order_revision} revision ON revision.revision_id = base.revision_id
WHERE  (base.order_id IN  (:db_condition_placeholder_0))  FOR UPDATE', Array, Array) select.inc:1265
SelectQuery->execute() entity.inc:198
DrupalDefaultEntityController->load(Array, Array) commerce.controller.inc:148
DrupalCommerceEntityController->load(Array, Array) common.inc:7978
entity_load('commerce_order', Array, Array, ) commerce_order.module:792
commerce_order_load_multiple(Array, Array, ) commerce_order.module:756
commerce_order_load('3669') commerce_cart.module:813
commerce_cart_order_load('174') commerce_cart.module:548
commerce_cart_commerce_product_calculate_sell_price_line_item_alter(Object, NULL, NULL, NULL) module.inc:1157
drupal_alter('commerce_product_calculate_sell_price_line_item', Object) commerce_product_pricing.module:118
commerce_product_calculate_sell_price(Object) commerce_cart.module:2214
commerce_cart_add_to_cart_form(Array, Array, Object, 0, Array) 
call_user_func_array('commerce_cart_add_to_cart_form', Array) form.inc:841
drupal_retrieve_form('commerce_cart_add_to_cart_form_76_80_229', Array) form.inc:350
drupal_build_form('commerce_cart_add_to_cart_form_76_80_229', Array) form.inc:130
drupal_get_form('commerce_cart_add_to_cart_form_76_80_229', Object, 0, Array) commerce_cart.module:2695
commerce_cart_field_attach_view_alter(Array, Array, NULL, NULL) module.inc:1157
drupal_alter('field_attach_view', Array, Array) field.attach.inc:1244
field_attach_view('node', Object, 'teaser', 'fa') node.module:1415
node_build_content(Object, 'teaser', 'fa') node.module:1336
node_view(Object, 'teaser', NULL) node.module:2669
node_view_multiple(Array) node.module:2708
node_page_default() 
call_user_func_array('node_page_default', Array) menu.inc:527
menu_execute_active_handler() index.php:21

smccabe’s picture

Hey @arefen, it won't eliminate all order locking, as commerce still uses pessimistic order locking. All it does is add a method of loading orders without locking and default some instances to non-locking loading. If you have locking views or other types of batch jobs can still cause pretty bad locking. Do you happen to know what is running when you get locking?

smccabe’s picture

Sooooo... the cleanup sort of turned into a refactor of all the set/request/require locking methods to clean up the duplication and hopefully make it a lot cleaner.

smccabe’s picture

Status: Needs work » Needs review

Status: Needs review » Needs work

The last submitted patch, 121: 2240427-121-improve-order-locking.patch, failed testing.

smccabe’s picture

Scaled back the refactoring and this patch is just naming and commenting cleanup. With the weird way we're sort of cheating when using conditions, a further refactor isn't possible without creating the possibility of defaulting to unlocked by mistake.

smccabe’s picture

Status: Needs work » Needs review
joelpittet’s picture

FileSize
5.51 KB

Here's an interdiff between #99 and #124 to see the difference between the patches.

rszrama’s picture

Status: Needs review » Needs work

So, two questions:

  1. Do we really need the 'load arguments' parameters in the event that it's TRUE? That's the default value of the argument in the loader, so just including it for FALSE should suffice, no?
  2. commerce_order_views_pre_execute() kind of concerns me, because it disables locking at the controller level but never re-enables it after the View has finished executing. What if there is an operation on the page after an order View is built that actually should result in a locked order load?

Additionally, I'm not sure certain comments add up. In the docs for $requestSkipLocking it says,

  /**
   * Stores whether a request for skipping locking has been set.
   * If locking has been requested as well it will take preference and the entity
   * load will default to locking
   */
  protected $requestSkipLocking = FALSE;

However, when I look at requireLocking(), I see:

  protected function requireLocking() {
    $enabled = isset($this->entityInfo['locking mode']) && $this->entityInfo['locking mode'] == 'pessimistic';
    $not_skipped = empty($this->requestSkipLocking);

    return $enabled && $not_skipped && $this->requestLocking;
  }

Note that locking will not be required even if pessimistic locking is enabled for an entity type and locking was requested if $requestSkipLocking has been set to TRUE. This seems to contradict the earlier statement that if locking is requested, it will override any attempt to skip locking.

That said, this may just be a documentation problem, not a logic problem. If I get a confirmation one way or the other, I can re-roll. Would still need to address points 1 and 2 above and fix some copypasta like this:

  /**
   * Implements DrupalCommerceEntityControllerInterface::isLocked().
   */
  public function isCached($entity) {
    return isset($this->entityCache[$entity->{$this->idKey}]);
  }

(Note the docblock vs. function name.)

fearlsgroove’s picture

Also running into this issue. I'm definitely entering this conversation late, but I gotta point out that database locking via transactions is a really really terrible idea. If a process dies/is killed/etc the whole table gets locked indefinitely. You should have to explicitly specify when you're going to update an order.

In our case we have a background task that iterates over the list of orders in certain states. It loads an order, runs a buncha rules and calls some external services that might take maybe 1s real time (in a background task), then updates the order (or not) and bails. it's continually fetching new locks, so it can basically block almost anything on the site from showing up.

I can't imagine an architectural decision that would have a worse impact on concurrency

mglaman’s picture

Assigned: rszrama » mglaman

Going to take a look at this. Did some discussion with rszrama and we think this should be split: this patch implements ability to load an order as "read only" without locking, or with locking. Then a follow up to alter the hook menu and such.

This way we can get the implementation and test in.

mglaman’s picture

Here is an updated patch. I fixed up a bunch of the doc blocks. I also expanded the tests and broke the methods up -- to test locking, releasing locks, and caching.

This patch also removes any adjustments to implement this measure. That makes this patch, and issue, an API change to allow loading a non-locked order for read-only purposes. We can then open a follow up to make sure that Views and the menu system do not erroneously lock orders.

Status: Needs review » Needs work

The last submitted patch, 130: improve_order_locking-2240427-130.patch, failed testing.

mglaman’s picture

Status: Needs work » Needs review
FileSize
21.97 KB
+++ b/tests/commerce_base.test
@@ -11,6 +11,8 @@
 
+  protected $profile = 'minimal';
+

Whoopsies.

Status: Needs review » Needs work

The last submitted patch, 132: improve_order_locking-2240427-132.patch, failed testing.

mglaman’s picture

Status: Needs work » Needs review
FileSize
22.78 KB
831 bytes

Looks like we need to implement the load functions.

mglaman’s picture

Adding torgosPizza and joelpittet to credits, thanks for all the testing and performance tests.

torgosPizza’s picture

Nice, thanks!

@fearlsgroove: We were getting locking errors way more than we should have prior to this patch; but with the patch and with moving to READ-COMMITTED transaction mode in MySQL, the error hasn't been seen in quite some time.

It'd be great if you could help us test the patch and any subsequent follow-ups written to implement an improved locking mechanism to see if your issues go away as well. That sort of data is super valuable to making sure our Commerce sites are as performant as possible.

mglaman’s picture

Ran into an issue with the way things are currently implemented and suggested in the patch.

1) The order is loaded by us locked
2) The cart load fires and refreshes the cart
3) At the end the order is saved by the cart module
4) After order_load our API tells us the order is unlocked

Cart refresh will essentially nullify the lock almost immediately by saving the order.

Status: Needs review » Needs work

The last submitted patch, 137: improve_order_locking-2240427-137.patch, failed testing.

mglaman’s picture

Disregard #137. Here is a bump of #134 to add a releaseLock($entity); method and a test for cart. It proves that saving an order will remove the lock the order was loaded with.

torgosPizza’s picture

Status: Needs work » Needs review

Status: Needs review » Needs work

The last submitted patch, 139: improve_order_locking-2240427-139.patch, failed testing.

mglaman’s picture

I posted a bad patch. I'll have an updated one later.

mglaman’s picture

This should be the proper patch.

Status: Needs review » Needs work

The last submitted patch, 143: improve_order_locking-2240427-143.patch, failed testing.

The last submitted patch, 143: improve_order_locking-2240427-143.patch, failed testing.

thePanz’s picture

Updated patch, using un-locked loading for orders in:

commerce_order_form_commerce_customer_customer_profile_delete_form_alter()
and
commerce_order_handler_area_order_total views handler.

Still not able to understand why test as failing with the message:

exception: [Notice] Line 100 of sites/all/modules/commerce/includes/commerce.controller.inc:
Undefined property: CommerceOrderEntityController::$controllerTransaction

torgosPizza’s picture

Status: Needs work » Needs review

There's a new patch so setting to Needs Review.

Status: Needs review » Needs work

The last submitted patch, 146: improve_order_locking-2240427-146.patch, failed testing.

thePanz’s picture

Improved patch: I defined a custom `CommerceOrderViewsQuery` to allow admins to selectively define if orders should be locked or not during the SQL execution in Views (just use the View -> Advanced -> Query settings).

The patch also extends the `views_handler_field_field` used by the `commerce_price_handler_field_commerce_price` handler to respect the CommerceOrderViewsQuery locking setting.

The attached interdiff is relate to the patch in comment in #2240427-146: Improve commerce entities locking.

mglaman’s picture

Status: Needs work » Needs review

Awesome patch! Marking it for testing

Status: Needs review » Needs work

The last submitted patch, 149: improve_order_locking-2240427-149.patch, failed testing.

thePanz’s picture

Title: Improve order locking » Improve commerce entities locking
FileSize
47.81 KB
22.81 KB

Updated patch:

  • moved newly added CommerceViewsQuery to commerce module itself as it could be used by sub modules (commerce_order, commerce_product, commerce_line_item).
  • Refactored settings added in patch #149
  • Using CommerceViewsQuery in Products and LineItem
  • Added "$lock" parameter to Product and LineItem `*_load()` ad `*_load_multiple()` functions.
czigor’s picture

  1. +++ b/includes/views/commerce_views_query.inc
    @@ -0,0 +1,115 @@
    +   * Returns the according entity objects for the given query results.
    

    The word 'according' is not appropriate here and is also superfluous.

  2. +++ b/includes/views/commerce_views_query.inc
    @@ -0,0 +1,115 @@
    +    // Bail out if the table has not specified the according entity-type.
    

    The word 'according' is not appropriate here and is also superfluous.

  3. +++ b/includes/views/commerce_views_query.inc
    @@ -0,0 +1,115 @@
    +   * Mimics the same behaviour of views_plugin_query, but allows admins to define
    

    The word 'same' is not needed here.

joelpittet’s picture

Status: Needs work » Needs review

@czigor feel free to make those changes with an interdiff. Setting to needs review so Testbot picks it up.

Status: Needs review » Needs work

The last submitted patch, 152: improve_commerce_entities_locking-2240427-152.patch, failed testing.

thePanz’s picture

Updated patch with @czigor comments. I also had a chat with him and I fixed few more issues he discovered.
Interdiff against patch in #143

thePanz’s picture

Status: Needs work » Needs review

Go testbot, go!

Status: Needs review » Needs work

The last submitted patch, 156: improve_commerce_entities_locking-2240427-155.patch, failed testing.

thePanz’s picture

Status: Needs work » Needs review
FileSize
50.39 KB
28.76 KB

Updated patch, fixed wrong testCartRefreshOrderLock() assertions.

thePanz’s picture

Updated patch, reworked the locking and caching handling functions

Status: Needs review » Needs work

The last submitted patch, 160: improve_commerce_entities_locking-2240427-160.patch, failed testing.

thePanz’s picture

Locking fixes (part 2)

Status: Needs review » Needs work

The last submitted patch, 162: improve_commerce_entities_locking-2240427-162.patch, failed testing.

thePanz’s picture

Status: Needs work » Needs review
FileSize
53.04 KB
2.65 KB

Removed latest 2 patches (160 and 162) as over-complicated code was added.

This patch is a followup of #159 tries to add the locking/unlocking feature to the Cart sub-module

thePanz’s picture

Further analyzed where the commerce_order_load() is used.
I spotted some places where the $order is never saved after loading (with locking enabled), I then changed such invocations with a "non-locking" call.

czigor’s picture

Status: Needs review » Needs work
+++ b/modules/cart/commerce_cart.module
@@ -605,22 +606,22 @@ function commerce_cart_user_login(&$edit, $account) {
+        commerce_order_save($order);

The whole commerce_cart_user_login() part seems to me a completely unrelated performance fix. Since the patch is big enough already can we fix this in a separate issue?

thePanz’s picture

@czigor: the commerce_cart_user_login() is slightly related to the performance itself.
The current implementation is querying all the orders (and locking all of them) of a specific user, subsequently the order email is checked, and only a subset of those orders saved. The non-altered orders would stay locked.

The patch fixes that.

thePanz’s picture

Status: Needs work » Needs review
FileSize
65.67 KB
2.29 KB

Improved commerce_order loading and locking in "Checkout" submodule

mglaman’s picture

Thanks for all the work on this, but I do want to remind that we made #2713193: Implement loading orders without locking so the patch is easier to review.

+++ b/modules/checkout/commerce_checkout.module
@@ -15,6 +15,7 @@ function commerce_checkout_menu() {
+    'load arguments' => array(FALSE),

@@ -23,6 +24,7 @@ function commerce_checkout_menu() {
+    'load arguments' => array(FALSE),

Shouldn't checkout always load a locked order? It's going to be modified.

+++ b/modules/checkout/includes/commerce_checkout.pages.inc
@@ -351,9 +351,6 @@ function commerce_checkout_form_validate($form, &$form_state) {
-  // Load a fresh copy of the order stored in the form.
-  $order = commerce_order_load($form_state['order']->order_id);
-

@@ -368,6 +365,9 @@ function commerce_checkout_form_submit($form, &$form_state) {
+      // Load a fresh copy of the order stored in the form.
+      $order = commerce_order_load($form_state['order']->order_id);
+

Why was this moved? Just to prevent an order load if there is not a next page? Which technically that probably doesn't run (You can't actually submit the last page of checkout.)

czigor’s picture

@mglaman @thePanz is away, but I can answer:
1. It's enough to load the orders lockingly in the checkout pane's submit callback.
2. Yes, the reason of moving that piece of code was to prevent another locking order load.

We are getting PDOExceptions due to locked orders. It's hard to say where it's exactly coming from so we are trying to convert locking order loads to non-locking ones whenever possible.

mglaman’s picture

Assigned: mglaman » Unassigned

Gotcha. Also, have you tried using READ-COMMITTED mode for MySQL? it seems to resolve 90% of the problems.

joelpittet’s picture

Shamless plug for PDOExceptions: #2690765: Prevent entity_load_unchanged() from changing orders through order refresh
Also it's blocked on this likely.

thePanz’s picture

Thanks @mglaman, but our MySQL is already using the READ-COMMITTED mode and the error is still thrown.

fearlsgroove’s picture

Just throwing this out there .. database locking to solve this "problem" is nuts. It only applies to the order revision, not line items, payments or any other related objects, and it won't even stop subsequent requests from unintentionally stomping data. Database level locking only makes sense in the context of fetch and update operations, but this is an application level lock. Starting the transaction and the database itself ensure the data for each save is atomic, which seems to be what this naive locking is trying to solve.

Is there a test case where a concurrency issue can occur anywhere? In addition to being (what seems to me to be) conceptually wrong, I don't even think it's solving a real problem that actually occurs, but instead is attempting to circumvent a perceived or potential problem (that isn't in fact valid).

This change just makes the commerce 1.x api even less understandable as a workaround to the workaround that doesn't need to be there in the first place.

I'd propose dropping the database locking FOR UPDATE altogether and replacing it with a proper application level lock that would start as soon as the application code determines the order needs to be updated in an atomic way (with lock_acquire).

wesleymusgrove’s picture

I haven't applied the patch yet, but switching MySQL to READ-COMMITTED seemed to fix most of the errors related to lock wait timeouts, at least on the front end of the store (adding products to cart, checking out).

//added to settings.php
$databases['default']['default']['init_commands'] = array(
    'isolation' => "SET SESSION tx_isolation='READ-COMMITTED'",
);

However I'm still having lock wait timeout issues with Views used by Backoffice Orders and Reporting:

admin https://www.example.com/admin/commerce/reports [ip address] PDOException: SQLSTATE[HY000]: General error: 1205 Lock wait timeout exceeded; try restarting transaction: SELECT revision.order_number AS order_number, revision.revision_id AS revision_id, revision.revision_uid AS revision_uid, revision.mail AS mail, revision.status AS status, revision.log AS log, revision.revision_timestamp AS revision_timestamp, revision.revision_hostname AS revision_hostname, revision.data AS data, base.order_id AS order_id, base.type AS type, base.uid AS uid, base.created AS created, base.changed AS changed, base.hostname AS hostname FROM {commerce_order} base INNER JOIN {commerce_order_revision} revision ON revision.revision_id = base.revision_id WHERE  (base.order_id IN  (:db_condition_placeholder_0, :db_condition_placeholder_1, :db_condition_placeholder_2, :db_condition_placeholder_3, :db_condition_placeholder_4, :db_condition_placeholder_5, :db_condition_placeholder_6, :db_condition_placeholder_7, :db_condition_placeholder_8, :db_condition_placeholder_9, :db_condition_placeholder_10, :db_condition_placeholder_11, :db_condition_placeholder_12, :db_condition_placeholder_13, :db_condition_placeholder_14, :db_condition_placeholder_15, :db_condition_placeholder_16, :db_condition_placeholder_17, :db_condition_placeholder_18, :db_condition_placeholder_19, :db_condition_placeholder_20, :db_condition_placeholder_21, :db_condition_placeholder_22, :db_condition_placeholder_23, :db_condition_placeholder_24, :db_condition_placeholder_25, :db_condition_placeholder_26, :db_condition_placeholder_27, :db_condition_placeholder_28, :db_condition_placeholder_29))  FOR UPDATE; Array (
    [:db_condition_placeholder_0] => 67961
    [:db_condition_placeholder_1] => 67941
    [:db_condition_placeholder_2] => 68476
    [:db_condition_placeholder_3] => 68661
    [:db_condition_placeholder_4] => 68436
    [:db_condition_placeholder_5] => 69241
    [:db_condition_placeholder_6] => 69401
    [:db_condition_placeholder_7] => 69556
    [:db_condition_placeholder_8] => 69746
    [:db_condition_placeholder_9] => 69981
    [:db_condition_placeholder_10] => 70381
    [:db_condition_placeholder_11] => 70346
    [:db_condition_placeholder_12] => 70806
    [:db_condition_placeholder_13] => 71111
    [:db_condition_placeholder_14] => 71221
    [:db_condition_placeholder_15] => 71311
    [:db_condition_placeholder_16] => 71591
    [:db_condition_placeholder_17] => 71791
    [:db_condition_placeholder_18] => 72061
    [:db_condition_placeholder_19] => 72546
    [:db_condition_placeholder_20] => 72666
    [:db_condition_placeholder_21] => 72866
    [:db_condition_placeholder_22] => 74376
    [:db_condition_placeholder_23] => 74661
    [:db_condition_placeholder_24] => 74946
    [:db_condition_placeholder_25] => 74961
    [:db_condition_placeholder_26] => 75611
    [:db_condition_placeholder_27] => 75776
    [:db_condition_placeholder_28] => 75916
    [:db_condition_placeholder_29] => 76466
)
 in DrupalDefaultEntityController->load() (line 198 of /mnt/www/html/example/docroot/includes/entity.inc). 

Also while the lock wait timeouts are occurring, cron jobs related to loading orders are negatively affected:

Anonymous http://www.example.com/index.php 127.0.0.1 Exception: exception 'PDOException' with message 'SQLSTATE[HY000]: General error: 1205 Lock wait timeout exceeded; try restarting transaction' in /mnt/www/html/example/docroot/includes/database/database.inc:2171
Stack trace:
#0 /mnt/www/html/example/docroot/includes/database/database.inc(2171): PDOStatement->execute(Array)
#1 /mnt/www/html/example/docroot/includes/database/database.inc(683): DatabaseStatementBase->execute(Array, Array)
#2 /mnt/www/html/example/docroot/includes/database/select.inc(1265): DatabaseConnection->query('SELECT revision...', Array, Array)
#3 /mnt/www/html/example/docroot/includes/entity.inc(198): SelectQuery->execute()
#4 /mnt/www/html/example/docroot/includes/common.inc(7963): DrupalDefaultEntityController->load(Array, Array)
#5 /mnt/www/html/example/docroot/sites/all/modules/contrib/commerce/modules/order/commerce_order.module(782): entity_load('commerce_order', Array, Array, true)
#6 /mnt/www/html/example/docroot/sites/all/modules/custom/example_store/example_store.module(3861): commerce_order_load_multiple(Array, Array, true)
#7 [internal function]: _example_store_external_api_error_cron('execute', 'example_store...')
#8 /mnt/www/html/example/docroot/sites/all/modules/contrib/elysia_cron/elysia_cron.module(1264): call_user_func_array('_example_store...', Array)
#9 /mnt/www/html/example/docroot/sites/all/modules/contrib/elysia_cron/elysia_cron.module(1170): elysia_cron_internal_execute_job('example_store...')
#10 /mnt/www/html/example/docroot/sites/all/modules/contrib/elysia_cron/elysia_cron.drush.inc(153): elysia_cron_run_job('example_store...', false, false, false)
#11 [internal function]: drush_elysia_cron_run_wrapper('run', 'example_store...')
#12 /usr/local/drush6/includes/command.inc(362): call_user_func_array('drush_elysia_cr...', Array)
#13 /usr/local/drush6/includes/command.inc(214): _drush_invoke_hooks(Array, Array)
#14 [internal function]: drush_command('run', 'example_store...')
#15 /usr/local/drush6/includes/command.inc(182): call_user_func_array('drush_command', Array)
#16 /usr/local/drush6/drush.php(92): drush_dispatch(Array)
#17 /usr/local/drush6/drush.php(61): _drush_bootstrap_and_dispatch()
#18 /usr/local/drush6/drush.php(16): drush_main()
#19 {main} 

Are these remaining issues addressed by the patch even after switch to READ-COMMITTED mode?

thePanz’s picture

@wesleymusgrove we were always getting the "locking timeout" errors even with the "READ-COMMIT" setting for MySQL.
I personally give up digging into this issue as I totally agree with fearlsgroove (comment #174).

You can use `hook_entity_info_alter()` to disable the pessimistic locking for Commerce entities, of course, you need to be aware that you can have data loss if, as an example, two orders are saved concurrently.

/**
 * Implements hook_entity_info_alter().
 */
function hook_entity_info_alter(&$entity_info) {
  // Altering Commerce entities to NOT use "pessimistic" locking
  $entity_types = ['commerce_order'];
  foreach ($entity_types as $type) {
    $entity_info[$type]['locking mode'] = NULL;
  }
}
wesleymusgrove’s picture

@thePanz thanks for the example! Are you doing this in conjunction with your patch from #168 or only disabling pessimistic locking without the patch?

thePanz’s picture

@wesleymusgrove: no patch applied, just the hook.

rszrama’s picture

Status: Needs review » Needs work

Just chiming in that Matt, Bojan, and I were discussing this last week and were also wondering about the use of lock_acquire() for application level locking. Didn't even realize that'd been suggested up above as well. Seems like the most worthy thread to pursue.

jsacksick’s picture

The attached patch is a follow-up to Matt's patch from comment #143, it adds a isReadOnly() method to the controller used to track the intent.
Thus, if you disable the order locking by altering the entity info array, you'd still be able to track the read-only orders although we're not preventing the orders to be saved.

The last submitted patch, 180: commerce-improve-order-locking-2240427-180.patch, failed testing.

mglaman’s picture

+++ b/includes/commerce.controller.inc
@@ -181,6 +206,13 @@
+      if (!$this->requestLocking && $ids) {
+        $this->readOnlyEntities += array_flip($ids);

What if it gets loaded for writing? How to we unset this.

jsacksick’s picture

The attached patch implements a different approach.
By default, we assume an order is loaded to perform a "write", if that's the case then we try to lock the order unless the lock is explicitly skipped via the requestSkipLocking() method.

We also skip the cart refresh for read-only orders.

The last submitted patch, 183: commerce-improve-order-locking-2240427-183.patch, failed testing.

torgosPizza’s picture

I'd be interested in also seeing a patch that uses the lock_acquire() approach mentioned by Ryan and @fearlsgroove.

jsacksick’s picture

The last submitted patch, 186: commerce-improve-order-locking-2240427-186.patch, failed testing.

smccabe’s picture

I might recommend running tests locally and not on an issue with 67 followers getting a notification each time they fail :)

jsacksick’s picture

add a isWriteable() method to the controller.

rszrama’s picture

hah, yeah, we'll be splitting out of this issue ASAP, but to @torgosPizza's point, we decided to simplify the resolution for this issue and then provide additional potential locking controllers in follow-ups. Some folks had reservations about using the locking API too aggressively.

DamienMcKenna’s picture

So what's the plan here? Besides writing more tests to cover other scenarios (commerce_discount, etc).

DamienMcKenna’s picture

Issue tags: +entitycache
nico.knaepen’s picture

What is the status on this?
Is it possible to apply the latest patch in a production environment? Which patch should be applied (.diff and .patch)?

DamienMcKenna’s picture

@nico.knaepen: You should always test a code change before uploading it to production to make sure there aren't any regressions. That said, a file with "interdiff" in the name is the INTERmediate DIFF between two patches, i.e. the interdiff in #189 contains the changes between the patch in #186 and the patch in #189; therefore, you should always start with the patch file.

DamienMcKenna’s picture

@rszrama: Any updates on this and related issues?

DamienMcKenna’s picture

After applying the patch in #189 I'm still having problems with lock timeouts on a site - if I don't touch my local install for a few days the first page request is fine, but the next request times out. It's kinda weird. FYI there are a few other Commerce contrib modules installed.

a.ross’s picture

It might be worth pointing out that in my experience, the commerce coupon module (1.x) performs pretty bad. I've experienced serious problems (including lock timeouts) when there were a lot of coupons.

DamienMcKenna’s picture

Related to #196, fyi I have MySQL set to 'READ-COMMITTED' and it's using the database cache when it hits this problem.

DamienMcKenna’s picture

Ok, adding the code to from #176 seems to have stopped the locking problem.. more details after I test it some more.

Sebastien M.’s picture

Hi everybody,
We encountered this issue in production environnement too.
@DamienMcKenna any news of your tests ?
Many thanks

jordan.gillet’s picture

Hi,

We are also interested in your tests.
We are encoutering this issue in production too (using PostgreSQL).

Thanks !

hgoto’s picture

I tested the patch #189 a little in my environment and it seems to work well as expected. (For example, I confirmed using the $write = FALSE option for commerce_order_load() successfully avoids blocking other requests.)

  1. Orders loaded by $write = FALSE can be saved and deleted with the patch #189. Is this behavior intended or just not implemented yet?
  2. It may be better to add some description in the doc comment for DrupalCommerceEntityController::load() for the new $conditions['_write'] option.
Artros’s picture

hello all !

I have add patch #189 on my local environment.
I have create new column _write type Boolean default value on TRUE in commerce_order table in my PostgreSQL.

after rebuild my environnement and clear cache with drush. I have this error :

Notice : Undefined property: stdClass::$_write dans DrupalDefaultEntityController->cacheGet() (ligne 414 dans /srv/alaska/dev/www/drupal/includes/entity.inc).

if you have a solution for this error . your help is welcome !

thanks for your reply

Artros’s picture

Fabianx’s picture

Here is a much simpler solution, which I will implement on a commerce site now:

diff --git a/sites/all/modules/contrib/commerce/includes/commerce.controller.inc b/sites/all/modules/contrib/commerce/includes/commerce.controller.inc
index 1cdd605..3f29adb 100644
--- a/sites/all/modules/contrib/commerce/includes/commerce.controller.inc
+++ b/sites/all/modules/contrib/commerce/includes/commerce.controller.inc
@@ -28,6 +28,11 @@ class DrupalCommerceEntityController extends DrupalDefaultEntityController imple
   protected function buildQuery($ids, $conditions = array(), $revision_id = FALSE) {
     $query = parent::buildQuery($ids, $conditions, $revision_id);
 
+    // In case this is a GET request, we directly return without locking.
+    if ($_SERVER['REQUEST_METHOD'] == 'GET') {
+      return $query;
+    }
+
     if (isset($this->entityInfo['locking mode']) && $this->entityInfo['locking mode'] == 'pessimistic') {
       // In pessimistic locking mode, we issue the load query with a FOR UPDATE
       // clause. This will block all other load queries to the loaded objects
@@ -173,6 +178,12 @@ class DrupalCommerceEntityController extends DrupalDefaultEntityController imple
    *   SAVED_NEW or SAVED_UPDATED depending on the operation performed.
    */
   public function save($entity, DatabaseTransaction $transaction = NULL) {
+    // In case this is a GET request, we log this incident.
+    if ($_SERVER['REQUEST_METHOD'] == 'GET') {
+      $e = new \Exception();
+      watchdog('commerce', 'Entity with type "%type" saved during GET request. Backtrace: %bt', array('%type' => $this->entityType, '%bt' => $e->getTraceAsString()), WATCHDOG_WARNING);
+    }
+
     if (!isset($transaction)) {
       $transaction = db_transaction();
       $started_transaction = TRUE;

Unless I am missing something all operations with save() should be on POST requests. Most POST requests will only load the entity for saving anyway.

While there could be edge cases where the transaction is still active for the whole request, the problem should still be way less prominent than doing the transaction for every page request, which only reads.

And as GET requests should generally not write persistent data (caching is okay obviously), the condition is logged so it can be monitored.

While this solution is not perfect, I think it is a good compromise between data integrity (which is improved via the FOR UPDATE obviously) and not locking the database for complex read operations.

Anybody’s picture

Absolutely great idea, Fabianx! That might be a good intermediate solution for Drupal Commerce. As you said perhaps not the perfect fix but should save all of us from a lot of trouble. What do the DC maintainers think about this? Better than the current state, I think?

Fabianx’s picture

commerce_cart_commerce_order_load makes my solution difficult as it generally just refreshes the order on the next page request.

I was able to work around it by generally refreshing carts after saving at the end of the page request. (hook_exit() / hook_drupal_goto_alter()).

In general however my proposition above holds, orders should be refreshed after save() during the POST and just refreshed for the cart if anything in the system has changed while the order was not loaded.

That is still a legitimate case, but as it happens so seldom I will ignore it for my own solution.

Diff is here:

diff --git a/sites/all/modules/contrib/commerce/modules/cart/commerce_cart.module b/sites/all/modules/contrib/commerce/modules/cart/commerce_cart.module
index 002b263..6d10edc 100644
--- a/sites/all/modules/contrib/commerce/modules/cart/commerce_cart.module
+++ b/sites/all/modules/contrib/commerce/modules/cart/commerce_cart.module
@@ -751,6 +751,14 @@ function commerce_cart_commerce_order_load($orders) {
         commerce_cart_order_is_cart($order) &&
         commerce_order_is_latest_revision($order) &&
         commerce_cart_order_can_refresh($order)) {
+      // Refresh the order and add its ID to the refreshed array.
+      $refreshed[$order->order_id] = TRUE;
+      commerce_cart_refresh_cart($order);
+    }
+  }
+}
+
+function commerce_cart_refresh_cart($order) {
   // Update the last cart refresh timestamp and record the order's current
   // changed timestamp to detect if the order is actually updated.
   $order->data['last_cart_refresh'] = REQUEST_TIME;
@@ -758,8 +766,6 @@ function commerce_cart_commerce_order_load($orders) {
   $unchanged_data = $order->data;
   $last_changed = $order->changed;
 
-      // Refresh the order and add its ID to the refreshed array.
-      $refreshed[$order->order_id] = TRUE;
   commerce_cart_order_refresh($order);
 
   // If order wasn't updated during the refresh, we need to manually update
@@ -777,7 +783,39 @@ function commerce_cart_commerce_order_load($orders) {
       ->execute();
   }
 }
+
+function commerce_cart_refresh_saved_orders() {
+  $refresh_orders = &drupal_static('commerce_cart_commerce_order_update', array());
+  if ($refresh_orders === FALSE) {
+    return;
   }
+
+  $orders = $refresh_orders;
+  $refresh_orders = FALSE;
+
+  foreach ($orders as $order_id => $order) {
+    commerce_cart_refresh_cart($order);
+  }
+}
+
+function commerce_cart_drupal_goto_alter(&$path, &$options, &$http_response_code) {
+  commerce_cart_refresh_saved_orders();
+}
+
+function commerce_cart_exit() {
+  commerce_cart_refresh_saved_orders();
+}
+
+function commerce_cart_commerce_order_update($order) {
+  $refresh_orders = &drupal_static(__FUNCTION__, array());
+
+  if ($refresh_orders !== FALSE && $_SERVER['REQUEST_METHOD'] == 'POST') {
+    $refresh_orders[$order->order_id] = $order;
+  }
+}
+
+function commerce_cart_commerce_order_insert($order) {
+  commerce_cart_commerce_order_update($order);
 }
 
 /**

Missing doxygen, but just posting that for others right now - not as a final solution.

fearlsgroove’s picture

leaking web request information as a decider seems like a bad idea. It's very implicit and magical. What happens if I'm running a long running script that queries many objects and updates some? This function is API level and should not implicitly rely on aspects of a web request, as there may be no web request.

iSoLate’s picture

Added patch based on Fabianx his comments. It fixed it for us for now.

DamienMcKenna’s picture

Status: Needs review » Needs work

At the very least it should check that $_SERVER['REQUEST_METHOD'] exists before using it.

jsacksick’s picture

I don't really get the idea of not locking the order simply because it's a $_GET request... What about cron jobs, drush commands? In case you're manipulating orders in a drush command, the request is not going to be a POST request and that script wouldn't be able to save orders?

pounard’s picture

Not locking does not mean not being able to manipulate it. If you want to keep your frontend responsive I'd frankly advise you NOT do locking in such cron tasks, or do it only the time you treat a single entity then release it right after. It happens to me once, a one hour cron each day made all carts locked for all its lifetime: site down. Problem is, Drupal entity system is so slow and MySQL so poor with performances when doing real SQL that locking will always come with trouble. Right solution is to use isolated transactions whenever necessary, but make them the shorter you can, at least to diminish the timespan where conflicts can happen and avoid transaction errors (but still, with rightly done transactions, your data is safe, if it breaks just rerun it until it works).

torgosPizza’s picture

The patch in #209 seems to be a very interesting solution which appears to help us, as one thing we're seeing on our site, is that even Ajax calls to and endpoint which loads the cart and returns the number of items in it - were producing FOR UPDATE queries in the database, leading to occasional db locks, and occasional issues with the site being slow to respond. We're going to test it in production to see if it helps with some scalability issues we've been seeing.

I personally agree that "reading" an order should not result in the order changing; one potential edge case might be instances where a site-wide Commerce Discount has been applied, which doesn't require any user interaction, since it would get picked up during the refresh phase. That might be an edge case worth considering, but in my testing, everything worked as expected.

@Fabianx: I did find an issue with the diff you posted in #207, which might actually be Commerce Discount-related:

- commerce_cart_refresh_cart() calls commerce_cart_order_refresh(),
- which invokes the 'commerce_product_calculate_sell_price' Rules callbacks,
- causing Rules to unwrap data values, including that of the commerce_order entity in question;
- which requires another call to commerce_cart_commerce_order_load(),
- leading to infinite recursion.

I'm not quite sure what exactly the workaround described is attempting to avoid, but (at least in my local testing) it didn't seem to be necessary. This could be another side effect of other modules we have in use, but I can't be sure. However that makes me a bit hesitant since there might be other Rules doing similar calculations with orders, that could potentially cause more recursion.

So while I think the solution to simply return a non-FOR UPDATE query is sensible, I think the commerce_cart_commerce_order_load() workaround needs more time.

darnzen’s picture

I've been working on this issue for some time. There is a better method than using " FOR UPDATE" in the build query. The MySQL InnoDB engine has the ability to lock rows for updates while leaving them unlocked for reads. The issue seems to hit the commerce module most because the DrupalCommerceEntityController class adds a $query->forUpdate() to the DrupalCommerceEntityController::buildQuery function. So every single database read results in hundreds of locked rows across dozens of tables.

Instead of appending " FOR UPDATE" to the query, you can append " LOCK IN SHARE MODE" instead.

The problem is this is defined in the _toString function of the Drupal core SelectQuery class. I have avoided hacking drupal core for 5 years, but this was a lot easier than adding entity_get_controller('commerce_order')->resetCache(array($order->order_id)) everywhere an entity is loaded for display purposes. Once I started finding the commerce_order_load function being called in a lot of the views controllers I broke down and implemented the change mentioned above.

adam1’s picture

I tried to patch most recent commerce module version 7.x-1.14 with no success. If someone could make a working patch, this would be really great!

Neo13’s picture

@darnzen Could you please provide a patch for your proposed solution?

dotist’s picture

@darnzen i second that request for your patch, or more explicit info on what you changed. This bug has been around for years already! I can't believe there's no better solution.

JorgenSandstrom’s picture

Hi
Patch in #209 seems to work for us. Although, we process our orders with drush and so we need to check if it is run in cli. Attached is the patch we use with just drupal_is_cli() added to conditions.

itsekhmistro’s picture

Status: Needs work » Needs review
AlfTheCat’s picture

#218 works against latest stable version of commerce here. Much thanks for this patch!

I have nodes that reference a commerce order entity. Before this patch, the database grinded to a halt when viewing one of those nodes and threw SQL State errors as described in #1320062: SQLSTATE[HY000]: General error: 1205 Lock wait timeout exceeded.

johnennew’s picture

We're still seeing lock wait timeouts after applying the patch in #218, though less of them. For those asking about the @darnzen patch described in #214, I think they are suggesting the following hack on Drupal core:

edit `includes/database/select.inc` (currently line 1609)

replacing

if ($this->forUpdate) {
      $query .= ' FOR UPDATE';
    }

with

  if ($this->forUpdate) {
      $query .= ' LOCK IN SHARE MODE';
  }

We may test this if we don't find any other solution.

matthensley’s picture

It seems like progress made on D8/D9 is unlikely to flow back downstream at this point, looking at the types of solutions being discussed. I still have this problem show up from time to time, despite implementing some of the patches suggested above. Most problematically, it occasionally happens in the transition from checkout review to completion which means payment transactions happen, but the order doesn't move to "completed" and the user is presented with the review page again. This has led to customers paying for their orders multiple times.

Is it likely that any sql / hosting tuning could have a mitigating impact here? If transactions take longer to complete, are they locked for longer and increase the risk of overlap? Has anyone attempted the "SHARE MODE" approach indicated above as a core hack, and if so were there any downstream effects to be aware of?

kjl’s picture

If you want to test changing the commerce_order_load() Locking Mode in D7 without hacking core, you can do something like the following:

class ExampleExtender extends SelectQueryExtender {
  public function __toString() {
  $query = parent::__toString();
  $query .= ' LOCK IN SHARE MODE';
  return $query;
  }
}

function my_module_query_alter(QueryAlterableInterface &$query) {
  if (array_key_exists('commerce_order_load_multiple', $query->alterTags)) { 
    $query->forUpdate(FALSE);
    $query = $query->extend('ExampleExtender');
  }
}

Note that assigning the extender result back to $query is required, as a new object is returned from extend().