public function query() {
    $order_ids = $this->getOrderIds();
    $field_names = array_keys($this->fields());
    $query = $this->select('uc_orders', 'uo')->fields('uo', $field_names);
    $query->condition('order_id', $order_ids, 'IN');

That's putting a LOT of order IDs into the query as parameters. I'm pretty sure mySQL has limits on the actual length of a query that's passed into it.

The goal here appears to be to get the single most recent order for each customer.

A better way to do that is with a self-join maximum query -- see https://stackoverflow.com/questions/15211479/groupwise-maximum

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

joachim created an issue. See original summary.

quietone’s picture

Status: Active » Needs review
FileSize
953 bytes

@joachim, thanks for catching this.

Status: Needs review » Needs work

The last submitted patch, 2: 2950119-2.patch, failed testing. View results

quietone’s picture

Status: Needs work » Needs review
FileSize
638 bytes
946 bytes

A bit of a typo. I'll sort out the full migration test later.

Status: Needs review » Needs work

The last submitted patch, 4: 2950119-4.patch, failed testing. View results

joachim’s picture

The join needs to be on the uid, not on the order id.
You also need a condition that the left table has nothing on the right, so that you get only the orders that have nothing newer.

Also, I think it's clearer if you give it an alias of whatever_2.

So the condition would be 'uo2.order_id IS NULL'.

quietone’s picture

Status: Needs work » Needs review
FileSize
3.55 KB
3.4 KB

Thanks joachim. This updates the source plugin test as well.

joachim’s picture

Status: Needs review » Needs work
+++ b/modules/ubercart/src/Plugin/migrate/source/d6/BillingProfile.php
@@ -72,15 +72,9 @@ class BillingProfile extends SqlBase {
+    $query->isNotNull('uo2.uid');

Should be isNull. And the field should probably be uo2.order_id, to be on the safe side, as the primary key can never be NULL.

+++ b/modules/ubercart/src/Plugin/migrate/source/d6/BillingProfile.php
@@ -72,15 +72,9 @@ class BillingProfile extends SqlBase {
+    $query->leftJoin('uc_orders', 'uo2', 'uo.uid = uo2.uid AND uo.modified > uo2.modified');

That should be a <.

What the self-join is doing is finding all orders that have no other order that is newer.

Here's a fiddle showing this with a demo orders table with 3 orders for the same user: http://sqlfiddle.com/#!9/f7c7aa/5

If you remove the IS NULL, you get lots of extra rows, where for each row, the left copy of uo has an order that's older than the order in the right copy. There's also one row with nothing on the right, because it's a LEFT JOIN. The IS NULL condition gets rid of everything except that row, which is the order that has no order newer than it. In other words, the newest order.

quietone’s picture

Status: Needs work » Needs review
FileSize
3.55 KB
832 bytes

@joachim, thank for setting up the sqlfiddle. It helps to see what is happening in another way but I still don't get it. (I thought I had a grasp of these joins - time to do more reading)

Anyway, here's the next iteration of the patch.

joachim’s picture

It is definitely a tough concept to get your head round. I first encountered this years ago and it took a long time for me to understand it. I still have to think about it really carefully so I don't get it wrong... the fiddle was partly to make sure what I was saying was right!

heddn’s picture

Status: Needs review » Needs work

Code looks fine here. Great detail

+++ b/modules/ubercart/src/Plugin/migrate/source/d6/BillingProfile.php
@@ -72,15 +72,9 @@ class BillingProfile extends SqlBase {
+    $query->leftJoin('uc_orders', 'uo2', 'uo.uid = uo2.uid AND uo.modified < uo2.modified');

Could we add a comment here stating what @joachim documented in the IS so nicely:

> The goal here appears to be to get the single most recent order for each customer.

joachim’s picture

+++ b/modules/ubercart/src/Plugin/migrate/source/d6/BillingProfile.php
@@ -72,15 +72,9 @@ class BillingProfile extends SqlBase {
   public function getOrderIds() {

Actually, I've only just noticed -- this is keeping the 2-query system where we query for order IDs and pass them ALL as parameters to the real query.

There should only be one query, and getOrderIds() should be removed.

The comment on getOrderIds() about the assumptions we're making here should be kept though -- maybe moved to the class docblock, as it's an assumption that concerns this entire source plugin.

quietone’s picture

Status: Needs work » Needs review
FileSize
3.86 KB
633 bytes

This changes the method summary line to use joachim's nice prose.

heddn’s picture

Status: Needs review » Needs work

NW for #12.

quietone’s picture

Status: Needs work » Needs review
FileSize
4.36 KB
1.68 KB

@joachim, somehow I missed your comment. Yes, of course, it should be one query.

heddn’s picture

Status: Needs review » Reviewed & tested by the community

All feedback addressed. Great.

  • heddn committed ba07619 on 8.x-2.x authored by quietone
    Issue #2950119 by quietone, joachim, heddn: Ubercart D6 billing profile...
heddn’s picture

Status: Reviewed & tested by the community » Fixed
joachim’s picture

Status: Fixed » Active

This is maybe a bug we accidentally fixed here rather than a new one, but on upgrading to alpha 7, my total count for my profile migration has gone from 6165 to 3753...

heddn’s picture

Status: Active » Fixed

Can we open a new issue for the findings from #19?

joachim’s picture

Possibly no need.

> - $query->groupBy('uo.order_id');

This might be the culprit.

Grouping by order ID looks wrong, when we are trying to get the most recent order for each user. Every row has a different order ID, so grouping by order ID won't do any grouping at all, will it?

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.