This is necessary for supporting paging and tablesort queries, and hopefully making it easier for Views to use the new SELECT builder. An extender is a class that gets added to a select query via composition from externally, thereby adding additional methods to it without using inheritance. That way we can add optional additional functionality to a SELECT query at any time.

If that description doesn't make much sense, I apologize. It makes sense in my head. :-) It should hopefully make more sense once I've written the code.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

moshe weitzman’s picture

Heh. Not much sense at all. Lets see what larry comes up with

dmitrig01’s picture

I think it'd either use __call or $select->Object->method() if I understand correctly

drewish’s picture

subscribe

chx’s picture

SQLite can fire a callback for every record of a certain resultset. Also, you can bind a PHP function to be an SQLite function. So you can run any complex UPDATE function as you wish. It will just work. PostgreSQL has pl/PHP so you can do pretty much the same. However, to call PHP functions from a MySQL UPDATE you need to retrieve, invoke the PHP function and UPDATE the record back.

Now suppose that my module in Drupal 7 has the need for this. Even if I am willing to properly support all three, I simply can not. There is no way the add a method to the already written classes from a module.

This is use case #4 for this one. We need some mechanism to extend an already written class. Yes, it's either __call or something else :)

Crell’s picture

Priority: Normal » Critical
Status: Active » Needs review
FileSize
14.35 KB

Moving this to critical as some queries can't be converted without this...

This patch does a few things, but it can't not do them so there is no kitten damage.

It introduces a concept of an "extender" for select queries. An extender is an alternate way of modifying a query selectively rather than using an alter hook. Right now there are two extenders, pager queries and tablesort. They replace the separate hacky mechanisms we use now.

I've converted 2 queries only; the default front page (which uses pager and node access) and the comment admin page (which uses pager and tablesort). Other queries get converted later.

Of note:

- I'm doing both pager and tablesort in the same patch because I needed two implementations to test out the API.

- The pager uses count queries, and count queries break if you try to sort by a field that isn't specified. I also kept getting an infinite loop on the count query. :-) So count queries remove all extenders.

- Because I'm trying to avoid a circular reference, the extender objects do not maintain a reference to their main query object. I am a little concerned that is too limiting, and it may well be. If so, we'll have to associate the query earlier and then figure out how to break the circular reference.

- The code right now is essentially lifted directly from the existing implementations. I haven't done much cleanup on it internally. I really really really want to exterminate the globals in the pager query, but I'm not certain how to do that yet, in part because they're so badly named I can barely understand what they do.

- PagerFull is so named because this opens up the potential to have multiple types of pagers in core, each a different extender class. That does require some theme integration that we'd have to figure out first.

- Because they're classes, once we remove the legacy code we can just lazy-load them as needed. Yay.

Remember to clear the registry after applying this patch!

Crell’s picture

From Moshe, various other paging schemes we could implement: http://www.mysqlperformanceblog.com/2008/09/24/four-ways-to-optimize-pag...

bjaspan’s picture

I read semi-quickly through this patch and in general it seems like a fine idea. This is basically a local instead of a global query_alter hook. If we had a Real Language (tm), one could even pass an anonymous class here... :-) I haven't thought about it carefully nor tried to run the code.

One question: It isn't obvious to me whether extenders should before or after query_alter. Why did you make that decision?

bjaspan’s picture

A couple more questions (these are coming after I fell asleep on the couch, woke up, and just before actually going to bed, so pardon me if they make no sense):

* Why is this better/different than a query tag (e.g. 'pager' or 'tablesort') triggering behavior in query_alter?

* You mentioned in email that you think the extender needs a reference to the query object to be sufficiently powerful. I thought, "well, just pass the query into the extender operation methods." Of course, now that I'm writing this, I see that is what you are doing. Do you still think the circular reference is necessary? If so, perhaps the answer to this question answers the previous one.

Crell’s picture

Potentially pager and tablesort could possibly be done with the alter hook, with the extra methods passed in as metadata. However, that strikes me as quite ugly. I also really really want to use this mechanism to allow Views to extend the query builder in core, and that can NOT be done with just query alter. It needs extra methods on the query object like ensureTable(), which would be a nightmare to do via query_alter. That's also a case where the circular reference will be necessary, since the extender would have to do operations to the query's structure before execute(), and passing the query object in to every method call is just silly.

Extenders before or after alter... Um. I may have had a reason for the way I did it. I don't remember what it is. I am open to arguments either way. :-)

Damien Tournoud’s picture

Status: Needs review » Needs work

- Regarding bjaspan#8: it makes more sense to do the query_alter before the extender: imagine a 'node_access' hook_query_alter() modifying the base query, but not the count query, because the pager extender has been executed before.

- It makes no sense to me to remove the extenders on count query (has it will make that query count different results than the base query). The pager extender should be smart enough to remove itself from the count query.

Damien Tournoud’s picture

One other thing: the order in which the extenders are executed is critical, but there is no such notion of order right now.

Crell’s picture

If I don't remove the extenders on count queries, I end up with an infinite loop. I couldn't figure out a way around that other than blasting all extenders. If you can figure one out, I agree I'd prefer to not do that.

The order of extenders is the order in which they're added. At least at this point I would much rather do that then try to work out a weighting system. That can maybe come later of we have to.

chx’s picture

Please note that this is an absolutely necessity -- I have ranted long and many times about the rigidity of OOP and this would allow us to break free of that for SELECTs at least. I do not see, however, how is the extender better than the alter. Both get $this as an argument. What's the big difference then?

Crell’s picture

The differences were explained in #9 above. They are solving different problems that both involve "doing extra stuff" in very different ways.

moshe weitzman’s picture

3 hunks failed ... i'll give this a good review once it is rerolled.

I don't order of extenders as a new issue. Form API handlers are also executed in the order that they are added.

Crell’s picture

I've been thinking about this a great deal recently, and I want to propose a different approach that is more traditional OO. The usual OO response here would be a decorator:

interface SelectQueryInterface { 
  public function join();
  // ...
}

class SelectQuery implements SelectQueryInterface { ... }

class SelectDecorator implements SelectQueryInterface {
  protected $query;
  function __construct(SelectQueryInterface $select) {
    $this->query = $select;
  }
  public function join() {
    $this->query->join();
  }
}

class PagerQuery extends SelectDecorator {
  public function pagerStuff() {
    // Stuff here.
  }
}

class TablesortQuery extends SelectDecorator {
  public function tablesortStuff() {
    // Stuff here.
  }

  public function join() {
    // Do new stuff, then:
    $this->query->join();
  }
}

$query = db_select();
$query = new PagerQuery($query);
$query->join();
$query->pagerStuff();

All works great, with one caveat. You can only nest decorators if you are not adding methods, because the outer decorator doesn't know to pass on methods to the inner decorator that it doesn't know about. PHP, however, gives us a nice out using __call().

class SelectDecorator implements SelectQueryInterface {
  // ...
  public function join() {
    $this->query->join();
  }
  public function __call($method, $args) {
    call_user_func_array(array($this->query, $method), $args);
  }
}

And that way we can nest PagerQuery, TablesortQuery, ViewsQuery, or whatever we want to nest later.

This should give us almost everything we need with a cleaner and more conventional approach than the extenders I was proposing above, but with one caveat: There is an extra method call for every wrapped method and an even bigger performance hit for __call(). Mind you we are talking about sub-microsecond differences, but it is there.

However, I am going to suggest that is not, in fact, a problem. Firstly, the majority of select queries will not be using a decorator anyway so it's a moot point. Secondly, in most cases the decorator will be added only after the bulk of the methods have been called, so there won't be a performance hit. Thirdly, __call() will only be triggered if nesting a query *and* calling methods on the inner decorator. Fourthly, we're talking about a micro-optimization of sub-microsecond differences that is not along a critical path.

We'd probably also have some sort of wrappers around the "new" calls, like we do for everything else.

Thoughts? Are people open to that approach? It's much more conventional OO and sidesteps a lot of the issues I was running into before, but doesn't have a built-in namespacing. (Really, I'm OK with that. I don't see that actually being a problem in practice.)

Crell’s picture

Status: Needs work » Needs review
FileSize
47.71 KB

OK, so I will take your silence as approval. :-) Actually I did get positive feedback on the idea in IRC, so here goes. New patch attached. There's a couple of things it does, but (almost) all are necessary to do together.

- Break SelectQuery into an interface and a concrete class. Most of the documentation moves to the interface, which takes up a huge part of the patch just moving existing docs around.

- Introduce a new "Extendable" interface, which SelectQueryInterface uses.

- Introduce a SelectQueryExtender concrete class that also implements SelectQueryInterface, but just delegates everything to an inner select query object (which itself could be an extender!). See description in #16.

- Add two new extenders, one for pagers and one for tablesort. As above, I named the pager one "PagerFull" in anticipation of us adding alternate paging mechanisms later.

- Converts two core queries to use the extenders mechanism: the comment admin page (pager and tablesort) and the front page. That shows how the API works.

It sounds weird if you're not from an OO background, but the long and short of it is that the resulting API is seriously dead simple. Take a select query, call ->extend('class name') on it, and you get back the "same" query but with extra methods on it, magically. The rest is all internal. Because it returns the extending object, it is fully fluent. There's even support for database-specific versions of extenders.

While I was at it, I did 3 other things that are not strictly required, but I saw no good reason not to as I was already touching the code in question:

- Fixed a bug where SelectQuery::groupBy() did not return $this, so it couldn't chain.

- Fixed code comments that were wrong in code I was already editing.

- Made the query tagging-related methods fluent. I already had to wrap them in the extender base class, so it didn't make sense to wait for another patch for just one line that was in code I was already tweaking.

There's only one small caveat at this point, which I'm not sure is a show-stopper: If you want to "stack" PagerFull and TableSort extenders, and you have a field that then has an alias, then PagerFull must be added before TableSort is. That's because PagerFull needs to use a count query, and TableSort adds to the count query, and countQuery() is no longer wiping extenders. If you do it the wrong way around, then TableSort will add an orderBy() to the count query after countQuery() has already removed the ordering, and it will be an orderBy on a field that countQuery() stripped out (since count queries shouldn't need fields or ordering). I'm not overly concerned about this oddball case at the moment, especially as the rest of it seems to be working really well. :-)

Reviews please!

Status: Needs review » Needs work

The last submitted patch failed testing.

Crell’s picture

Status: Needs work » Needs review

I think that was a test bot hiccup.

Status: Needs review » Needs work

The last submitted patch failed testing.

Crell’s picture

Status: Needs work » Needs review
FileSize
48.02 KB

OK, the bot was right. Some whitespace breakage. Bah.

Attached patch is a reroll, plus some doc cleanups per webchick in IRC. PagerFull is renamed PagerDefault. Nothing else really major changes.

moshe weitzman’s picture

From an API perspective, this is terrific. The change to node_page_default() is simple and lovely.

Crell’s picture

From webchick in IRC;

- Typo when defining what PagerDefault_mysql is. There's still mention of PagerFull.

- comment.admin.inc block, it looks like $select->addField('c', 'name', 'registered_name'); maybe should be on table u, not c. Probably a conversion typo on my part.

- Needs unit tests for pager and tablesort. Damned if I know how to do that properly since they interact with $_GET. :-) She suggested doing actual HTTP requests back to the server on a dummy page that uses tablesort and pager and regexing for classes, and that would be enough.

I'll try to take care of those soon.

Crell’s picture

Oh yeah, and she wants to move extend() to later in the call chain for performance, which ranks above having all of the "what this query is" stuff together.

Crell’s picture

Oh yeah, and she wants to move extend() to later in the call chain for performance, which ranks above having all of the "what this query is" stuff together.

Status: Needs review » Needs work

The last submitted patch failed testing.

chx’s picture

+  public function extend($extender_name) {
+    $override_class = $this->connection->driver();
+    if (class_exists($override_class)) {
+      $extender_name = $override_class;
+    }
+    return new $extender_name($this, $this->connection);
+  }

given that $override_class becomes mysql for that database, something is missing here. The comment in the interface says

PagerDefault_mysql is the MySQL-specific override for PagerFull.

and I say, huh? Some discussion with Larry lead to $override_class = __CLASS__ . $this->connection->driver(); which makes a lot more sense especially if the comment is fixed too...

mikey_p’s picture

subscribing

Crell’s picture

FileSize
52.96 KB

So I've gotten a bit distracted by other things, so I'm going to post what I've got. This should be everything we need minus unit tests, because this is a hard thing to unit test. If someone wants to finish it off, I would be most appreciative. :-)

I have a pager query unit test partially written. It works, when the result set has an odd number of elements in it. If it's an even number, as near as I can tell the query works fine but the unit test does not. (And several other tests break if we just add more data to the existing test tables.) Someone needs to fix my unit tests, please. Tablesort also still needs a test or two.

mikey_p’s picture

In reference to $limit in PagerDefault:

+   * The number of elements per page to allow.

This should probably say items, or something other than elements, since elements is used to refer to the number of pagers found on a given page, no the number of items within a pager.

Same thing at limit() in PagerDefault.

+    // A NULL limit is the "kill switch" for pager queries.
+    if (empty($this->limit)) {
+      return;
+    }

In execute of PagerDefault, I don't understand the "kill switch" check on $this->limit, what would this be used for?

+    global $pager_page_array, $pager_total, $pager_total_items;

Why are these globals? Is there another place that these items are used? Seems like maybe these should be static, and we should clean up the names at least (as they are all arrays).

+  public function setHeader(Array $header) {
+    $this->header = $header;
+    return $this;
+  }

This needs some docs. Specifically need to know what format we need $header in, looks like an array of of arrays with 'data' and 'field' as keys.

+  protected function init() {
+    $ts = $this->order();
+    $ts['sort'] = $this->getSort();
+    $ts['query_string'] = $this->getQueryString();
+    return $ts;
+  }

I don't see anywhere that we are using getQueryString right off, is this used somewhere else?

+  /**
+   * Determine the current sort direction.
+   *
+   * @param $headers
+   *   An array of column headers in the format described in theme_table().
+   * @return
+   *   The current sort direction ("asc" or "desc").
+   */
+  protected function getSort() {
+    if (isset($_GET['sort'])) {
+      return ($_GET['sort'] == 'desc') ? 'desc' : 'asc';
+    }
+    // User has not specified a sort. Use default if specified; otherwise use "asc".
+    else {
+      foreach ($this->header as $header) {
+        if (is_array($header) && array_key_exists('sort', $header)) {
+          return $header['sort'];
+        }
+      }
+    }
+    return 'asc';
+  }

It looks like this will only return the sort for the first column that has a sort specified. I'm not sure why this isn't part of the default returned from order(), which is where we get the field name from. It's hard to follow where the default values are coming from here and how the $headers array is processed. This could probably use some cleanup. I may just need to spend some time looking at how the current tablesort works, but it seems like we may be able to include some improvements here.

I only made it through these first two items. I'm not super into DBTNG so I'll have to look at query.inc and select.inc later. node.admin.inc looks good but I haven't fully reviewed it yet.

Also, comment.admin.inc has conflict markers in it.

Crell’s picture

The API and internal implementation of PagerDefault and TableSort are taken directly from the current procedural implementations and will not change in this patch. They both seriously suck, but trying to make them not suck as part of this patch would harm too many cute and adorable felines. Please just focus on getting the unit tests working, and we can clean up the internal logic flow in later patches.

I'll reroll without the conflict markers in comment.admin.inc shortly.

Crell’s picture

FileSize
51.21 KB

Reroll.

webchick’s picture

Status: Needs work » Needs review

Status: Needs review » Needs work

The last submitted patch failed testing.

dmitrig01’s picture

+ $query = db_select('test', 't');
there's a tab

mikey_p’s picture

I re-rolled for failing hunks in comment.admin.inc, cleaned up a bunch of tabs, and fixed the pagerquery tests that were failing.

Couple more comments before I start on some tests for the tablesort.

How sure are we of test data for the DB tests? If it is guaranteed to be consistent then this floor isn't really needed. If it is actually needed, then these tests will fail as the last page will not have the correct number of items on it (equal to $limit):

    $limit = 2;
    $count = db_query("SELECT COUNT(*) FROM {test}")->fetchField();

    $correct_number = $limit;
    $num_pages = floor($count / $limit);
    for ($page = 0; $page < $num_pages; ++$page) {
      $this->drupalGet('database_test/pager_query/' . $limit, array('query' => array('page' => $page)));
      $data = json_decode($this->drupalGetContent());

      if ($page == $num_pages) {
        $correct_number = $count - ($limit * $page);
      }

      $this->assertEqual(count($data->names), $correct_number, t('Correct number of records returned by pager.'));
    }
mikey_p’s picture

FileSize
49.21 KB

Tried to use the preview button to attach...trying again

Crell’s picture

The test data in the test tables is entirely under the control of the database_test module. Several of the other tests are dependent on the existing data and the number of records. Of course, that does run into a problem when the pager test would be better with an odd number of elements but the table has an even number in it, which is where it was when I last was working on this patch before getting distracted by handlers. ;-)

mikey_p’s picture

Yeah, I'll try to re-roll this with a test against another one of the test tables with an odd number of rows, and add something to check that the correct items are coming up on each page, not just the correct number of items.

Would you recommend a similar approach to the tablesort test? In terms of a path with parsing returned json for accuracy?

Crell’s picture

Actually I think the json wrapping is probably overkill. Just faking out $_GET is probably enough, as long as we set it back afterward at the end of the test to avoid confusing simpletest. (Yet another reason we need proper environment objects, but that's another issue for another time. :-) ) That goes for pager as well as tablesort, but whatever gets it working and passing is OK by me at this point. this is a critical blocker patch that has to get in by DCDC so that we can do core conversion work there. :-/

mikey_p’s picture

Status: Needs work » Needs review
FileSize
53.65 KB

I don't know if this is the best approach to the tablesort tests, but it is passing here, and while it seems brittle, I suppose that might make it a good test of the functionality in some ways.

Setting to needs review to see if the testbot will pick this up.

Crell’s picture

$num_pages * $limit < $count ? $num_pages : $num_pages--;

It took me a while to figure out what the heck this was doing. Clever, but let's go with a traditional if() statement.

Other than that, this looks great and the bot likes it. As soon as that's changed please go ahead and mark it RTBC to get webchick's attention. :-)

chx’s picture

Status: Needs review » Needs work

Mandating ->extend('Pagerdefault') exposes way too much from the implementation and there should be (much) simpler ways to use a pager.

Crell’s picture

Status: Needs work » Needs review

Suggest one that doesn't make DBTNG dependent on hooks. This has already been extensively worked through and approved. It can't get much simpler when you can swap in any possible class. (This mechanism supports any number of pager mechanisms, or any other type of extender. I've been talking to mbutcher about direct XML mapping extenders, too.)

chx’s picture

Status: Needs review » Needs work
  1. No matter how we invoke it, naming this PagerDefult just so that there can be other pagers seems overengineering to me. if it's simply Pager then you can still have, say, AlphaPager.
  2. About invoking, why not add 'pager' => TRUE to $options and then check whether there is a class with that name extending us then act accordingly.
mikey_p’s picture

FileSize
53.83 KB

@chx if we move it to a setting in options, where do we want to put limit settings?

This is a patch that at least fixes the question crell had and adds a comment there.

Crell’s picture

Status: Needs work » Reviewed & tested by the community

I am happy with the code in #46 then if the bot is.

chx: Stuffing more crap into the options array is no good either. It means you MUST know at creation time what extenders you're going to want. That means a query alter implementation can't extend a query (say, to turn it into a pager, or do something else with extenders that I can't even think of yet). It also means you have to construct the array first, which is ugly, compared to just chaining your way through and the object getting more powerful as you go.

As for the name of the pager class, it's more for the engine type. We could have a pager that implements one of the previously mentioned alternate algorithms, for instance. Or one that does a sliding pager. "Pager" vs "PagerDefault" is extraneous bikeshedding at that point. Please don't hold back a patch that is blocking a lot of core conversion work on a bikeshed.

chx’s picture

Let it be. We can always do more later.

Dries’s picture

From an OOP point of view this code does the right thing, but boy, it adds 800+ lines of code/docs. Ah, the wonders of the decorator pattern ... ain't it "beautiful"? ;-)

Dries’s picture

Status: Reviewed & tested by the community » Fixed

Committed to CVS HEAD. Thanks!

Crell’s picture

Thanks, Dries! And no, it just moves 800 lines of code/docs around. :-) 95% of the comments were already there, they're just moved out into an interface where they should have been to begin with.

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

swentel’s picture

Status: Closed (fixed) » Needs work

Paging with the DBTNG conversion seems broken now: if you have more than 50 comments or log entries you simply get no paging, but a very long list .. I'm reopening this issue as I can't find anything related to this for now, of course correct me if this isn't the fault of this patch :)

Crell’s picture

Status: Needs work » Closed (fixed)

This issue was for the extender mechanism itself, which we know works since the unit tests pass. :-) If there is a bug with the PagerDefault extender specifically, open a new issue for that. If the issue affects only certain pages, then it may be related to the ongoing work converting core over to the new API. Open an issue for the appropriate core component. Thanks.