Support from Acquia helps fund testing for Drupal Acquia logo

Comments

catch’s picture

Status: Needs work » Needs review
FileSize
13.19 KB

All tests pass, profile module implements the new hook_user_load(), profile_browse() uses multiple loading.

With 50 users on the site, profile module enabled but no fields added, the benchmarks show no change one way or the other on either /user/1 or /profile.

Marking to needs review in the hope of some independent benchmarks :) Will try to benchmark on a real database with some real profiles myself soonish.

catch’s picture

FileSize
13.19 KB

without debug

catch’s picture

FileSize
13.18 KB

Found a strange issue.

_menu_load_objects is sending along more than one argument to user_load(), which means it's being called with $reset = array('user', '1')- resulting in two full loads on user/1 instead of one. Not sure why that is.

Benchmarks are very inconclusive in any direction, however even with the above bug, devel shows a big drop on queries - user_load() hasn't got a static cache currently so there's a drop in queries there too.

Real profile and user tables from a d6 site - the site has 4176 users, 25300 rows in profile_value 29 rows in profile_field.

/profile after a couple of page refreshes, devel reports:

HEAD:
Executed 85 queries

Patch:
Executed 26 queries

On user/1
HEAD:
Executed 44 queries

Patch:
Executed 26 queries

Marking to needs work pending some kind of answer on that menu load issue. Attached patch does some small cleanup in profile.pages.inc to make use of the static better.

catch’s picture

Status: Needs review » Needs work
catch’s picture

FileSize
13.22 KB

Re-rolled to use drupal_strtolower() instead of strcasecamp() per Damien, when filtering the static cache - which means we're multi-byte safe now. Although this is yet another reason to add a lowercase column to {users} and {term_data} since it could cause issues on obscure collations.

moshe weitzman’s picture

Status: Needs work » Needs review

Looks good. The patch occassionally does user_load(array('uid' => $edit['uid'])). You don't need to build a whole conditions array for this. Just pass the integer to user_load().

catch’s picture

I left the existing parameters for user_load() apart from adding the reset argument - so where those calls are in the patch it's likely lines which are moved around rather than anything introduced. Whereas it made sense to enforce a single nid or vid on node_load, users have unique names, e-mail addresses and passwords, so there's no reason to remove that support here. I'd very much like to kill all these superfluous array('uid' => $uid) in core, but assuming we keep things roughly as they are, it doesn't need to be part of this patch - they're already superfluous.

Note that while I'd love to get some more reviews and/or benchmarks of this, there's still the outstanding menu bug to track down (which may or may not be a direct result of this patch), but leaving status as is for now.

pwolanin’s picture

subscribing

nielsvm’s picture

Subscribing.

R.Muilwijk’s picture

subscr

catch’s picture

Fixed a typo, added hook documentation (hook_user docs haven't been de-opped yet). So this is now blocked by the LOWER() issue and that menu weirdness with $map getting passed to user_load() as the second parameter.

Removing LOWER() entirely is over at #279851: Replace LOWER() with db_select() and LIKE() where possible - needs refreshing but still remains the cleanest way to deal with this without breaking either other dbms or static caching.

catch’s picture

FileSize
14.35 KB

Now with patch.

drewish’s picture

Status: Needs review » Needs work

On user_load_multiple():

  • The int the PHPDoc on user_load_multiple() is missing whitespace. I'd also suggest stealing the @param descriptions from those we wrote on the file_load_multiple() patch.
  • Seems like drupal_strtolower($conditions['name'] != drupal_strtolower($user_values['name'])) has some incorrect parens. The first drupal_strtolower() doesn't close. Also I'd like a comment explaining why we're not doing it in the database.
  • $query = db_select('users', 'u'); $user_fields = drupal_schema_fields_sql('users'); $query->fields('u', $user_fields); can become $query = db_select('users', 'u')->fields('u');
  • Screwy indenting on:
    +    // Invoke hook_user_load() on the users loaded from the database
    +    // and add them to the static cache.
  • I don't like that the re-rodering of the users at the end is done into $passed_uids we should do that into a new variable $ordered_uids or something like:
      // Put the array of users we're going to return into the same order as the
      // uids the caller specified. array_intersect_key() preserves the order and
      // removes any invalid uids.
      if ($passed_uids) {
        $ordered = array_intersect_key($passed_uids, $users);
        foreach (array_keys($ordered) as $uid) {
          $ordered[$uid] = $users[$uid];
        }
        $users = $ordered;
      }
    

On user_load()

  • I'd suggest getting rid of the conditions and only taking a uid.
  • Missing PHPDocs for the $reset parameter
drewish’s picture

Status: Needs work » Needs review
FileSize
32.07 KB

I went ahead and started working on this. I see why catch had left user_load() alone... the conditionals are used quite a bit. I decided to add a user_load_mail() and user_load_name() to get the two common conditionals that are also unique fields. I'm curious what people think.

drewish’s picture

Status: Needs review » Needs work

forgot to save comment.module before generating the patch. looks like one test in user is still failing but i don't have time to sort it out.

catch’s picture

There's already taxonomy_get_term_by_name() which will be a wrapper soon #343788: Taxonomy module doesn't use it's own APIs - and drupalGetNodeByTitle() in dwtc - which was again a wrapper because the tests (and almost nowhere else) used node_load(array('title' => 'foo)) so much. I think user_load_name() and user_load_mail() fit into this nicely enough. If I'm around later I'll try to get to that failing test.

drewish’s picture

maybe user_load_by_name() and user_load_by_mail() would be better function names then.

drewish’s picture

Status: Needs work » Needs review
FileSize
806 bytes

renamed the functions and got all the tests to pass.

drewish’s picture

FileSize
32.98 KB

whoops that wasn't a patch.

David Strauss’s picture

Status: Needs review » Needs work

Not to be a jerk, but there are arbitrary whitespace changes in the patch. Please clean up.

drewish’s picture

Status: Needs work » Needs review

i don't think removing trailing whitespace is a problem.

catch’s picture

FileSize
32.71 KB

drewish's changes look good, was still getting one fail in user admin tests - added a $reset=TRUE to the last user_load() and it's fine for me now. Also added $reset parameters to the _by_name and _by_mail functions. No other changes.

catch’s picture

FileSize
32 KB

Re-rolled after some changes to user module.

catch’s picture

FileSize
31.93 KB

Re-rolled for new db_placeholders replacement.

drewish’s picture

Status: Needs review » Needs work

Everything looks really good except for some left over debug code:

+$this->assertTrue(1, var_export($edit, 1));
+$this->assertTrue(1, var_export($user, 1));

and

+$this->assertTrue(1, var_export($url, 1));
bcn’s picture

Status: Needs work » Needs review
FileSize
30.05 KB

Re-roll of patch from #24, with drewish's comments from #25

mfer’s picture

Is there a reason half a line in the patch of #26 is commented out? The commented out part reads

//, 'pass' => $pass, 'status' => 1));

I realize this was in core before the patch. If there is no reason to keep it can we remove it here?

catch’s picture

Issue tags: +Performance
FileSize
29.95 KB

Re-rolled with two changes:

Killed that commented bit in dblog.test
Removed $conditions/static cache matching - so we can work on resolving the LOWER() issue elsewhere. This means if you use $conditions, then we don't get the advantage of the static cache - but user_load() doesn't have a static at all now, so we haven't lost anything.

Dries’s picture

Do we have any performance data? Number of queries is interesting but fewer queries don't necessarily translate to faster execution times. It would be best to test this on a large database (e.g. a copy of d.o).

catch’s picture

The menu system calls user_load(1, $map) somewhere on user/n. Since $map is treated as TRUE this calls the reset, and means we get two full loads on that page instead of one. Haven't tracked down where that's happening yet. This makes it hard to benchmark individual user pages until that's fixed.

I've run benchmarks on /profile with a Drupal 6 database, but they were all within margin of error - so if someone else could do some with a bigger db (and a copy of d.o would be ideal), then that'd be great.

Dries’s picture

Issue tags: +Favorite-of-Dries
Coornail’s picture

I've succesfully applied the latest patch and did some performance measure. After 50 user and 100 node: ab -c 20 -n 100 http://localhost/drupal7/

Without patch:

Requests per second:    10.38 [#/sec] (mean)
Time per request:       1926.538 [ms] (mean)
Time per request:       96.327 [ms] (mean, across all concurrent requests)
Transfer rate:          140.53 [Kbytes/sec] received

With patch:

Requests per second:    10.42 [#/sec] (mean)
Time per request:       1919.720 [ms] (mean)
Time per request:       95.986 [ms] (mean, across all concurrent requests)
Transfer rate:          141.03 [Kbytes/sec] received
catch’s picture

Coornail - thanks for the benchmarks, afaik the front page doesn't involve a call to user_load() though (or one at most) - would you mind rerunning the benchmarks on user/1 and /profile (with the profile module enabled)?

catch’s picture

So while I couldn't get much differential in benchmarks before, kcachegrind is my new favourite toy. I did two requests each patched and unpatched to account for basic margin of error checking. The results are consistent enough to be useful I think.

This is on /profile with some devel generated users, with zero profile fields configured or populated - so pretty much vanilla HEAD. The values for each row are the 1st/2nd page request. All times are microseconds and are the sum of all time spent in the function during the request.

HEAD:
Total time spent in user_load() (called 22 times):
Incl: 59,874 / 51596
Self: 4199 / 3788

Patch:
Total time spent in microseconds spent in user_load_multiple (called twice):
Incl: 4971 / 5847
Self: 1024 / 1112

Time spent in PDOStatement->execute(). Self and incl are pretty much the same here since we're in native PHP anyway - PDOStatement->execute() is the most expensive function during the page request both patched and unpatched:
HEAD:
96 calls
36,383 / 26,348

Patch:
36 calls
12,319 / 12,751

On a user/n page (this calls user_load() which in turn calls user_load_multiple() - so we just take the incl for user_load in both cases as that's the like for like comparison. I think the very small speedup here ought to be due to the new static cache, although that's not being used to it's full extent due to the user_uid_optional_to_arg() issue from #3.

HEAD:
8 calls
Incl. 21233 / 21829

Patch:
9 calls
Incl. 16229 / 14397

So with a nearly empty database, there's a few milliseconds gained per request. I think we'll see user_load() get heavier once CCK fields can be attached directly as well.

Status: Needs review » Needs work

The last submitted patch failed testing.

catch’s picture

Status: Needs work » Needs review

test bot meltdown.

catch’s picture

Status: Needs review » Needs work
FileSize
32.77 KB

No longer applied after node/8 went in. This re-rolls the obvious stuff, but I'm getting 25 failures in the user_cancel() tests which I can't track down yet.

catch’s picture

FileSize
33.19 KB

Now 15 and removed some notices found via manual testing (but strangely not picked up by simpletest - suggesting the environment and my install are behaving differently).

catch’s picture

FileSize
35.57 KB

Tracked it down to the check for $hashed_password = user_pass_rehash() in user_cancel_confirm() - attached patch with this check commented out has 100% pass. Don't yet know why this is returning differently with and without multiple load though.

drewish’s picture

this needs a re-roll now that #357403: Move user picture to managed files has landed. it'll probably take a little tinkering to refactor it so that we figure out all the file ids for the pictures and then do a call to file_load_multiple() and then associate the pictures with the users.

catch’s picture

So the issue holding this one up is that $account->login in the user cancellation tests is used to fake the URL for a cancellation e-mail - and $account->login is different between the tested environment and the user object available to the tests - this means the results of user_rehash_pass() in both places are different, and results in failures. I'm not sure why this appears with the multiple load yet, removing the static caching doesn't help, moving a test into its own class doesn't help either.

catch’s picture

Status: Needs work » Needs review
FileSize
36.2 KB

Got this passing.

+      $users += $queried_users;
+      $user_cache += $queried_users;

Turns out to be different to:

+      $users = $users + $queried_users;
+      $user_cache = $user_cache + $queried_users;

So remaining todos are some fresh performance testing, and that call to user_load where $map gets passed as the second parameter on user/n. Back to CNR though.

catch’s picture

Found the menu issue as well, user/%user/edit was being sent 'load arguments' with the full %map for no reason I can see. That's the only change in this patch. All tests should pass, and it makes benchmarking viable as well.

Benchmarks on user/1 (anonymous users given access to user profiles): ab -c1 -n1000 + devel query log summary.

HEAD:
Requests per second: 12.56 [#/sec] (mean)
Executed 55 queries in 19.46 milliseconds.

Patch:
Requests per second: 13.64 [#/sec] (mean)
Executed 26 queries in 15.74 milliseconds.

And /profile - no profile fields or user pictures on any of the users - obviously we'd see a bigger improvement when those were present.

HEAD:
Requests per second: 10.26 [#/sec] (mean)
Executed 90 queries in 33.77 milliseconds.

Patch:
Requests per second: 10.74 [#/sec] (mean)
Executed 29 queries in 15.81 milliseconds.

So user/n is just under 10% faster due to the introduction of the static - there's currently none - which means user_load and file_load issue queries multiple times on that page. And /profile with no fields or user pictures is 5% faster.

catch’s picture

FileSize
36.47 KB
catch’s picture

FileSize
36.51 KB

Renamed $user_pictures to $picture_fids following discussion with drewish in irc. Also only query for user pictures if the variable is set.

drewish’s picture

I'm happy with this but I'd love a second set of eyes.

catch’s picture

Status: Needs review » Needs work

somehow the commented out user_pass_rehash() in function user_cancel() crept back in while re-rolling, which is why the tests pass..

webchick’s picture

While I agree that...

$target_user = user_load_by_mail($target);

...is definitely more legible than...

$target_users = user_load_multiple(array(), array('mail' => $target));
$target_user = reset($target_users);

...I'm a little torn on these convenience functions (user_load_by_name() and user_load_by_mail()). It's creating multiple ways of loading users which nodes don't have. If there's one thing I hate, it's inconsistencies in the user and node APIs (and node and comment APIs and X and Y APIs, but that's a whole nother thing ;)).

catch and I went back and forth for awhile on this, and he pointed out that a function like node_load_by_title() doesn't really make sense in practical, every-day use, and none of the other keys in the node table are unique, unlike u.name and u.mail. Furthermore, Taxonomy module does offer a similar convenience function, so we are keeping consistency with at least one part of core. So I begrudgingly accept this. For now. ;) j/k

There seems to be an awful lot of this patch that uses the $reset param on user_load(). Could someone take a quick peek with Devel query log on a few user operations, and make sure we're not double-clearing the cache anywhere? Related:

+ * @param $reset
+ *   A boolean indicating that the internal cache should be reset.

Could you please recommend here when would be a good time to call this? I can't figure out why so many of the user_load() calls have it; it seems like it'd be pretty rare.

+      // Using LIKE() to get a case insensitive comparison because Crell and
+      // chx promise that dbtng will map it to ILIKE in postgres.

Let's wrap that in a TODO so that we know to check it before release. This is a shame, because we *just* got pgsql tests passing again. :(

-    $account = user_load(array('uid' => (int)$uid));
+    $account = user_load($uid);

Just an observation: we're no longer casting this and a few others to an int first. Is this okay?

-    if ($timestamp <= $current && $current - $timestamp < $timeout && $account->uid && $timestamp >= $account->login && $hashed_pass == user_pass_rehash($account->pass, $timestamp, $account->login)) {
+    if ($timestamp <= $current && $current - $timestamp < $timeout && $account->uid && $timestamp >= $account->login) {
+      //&& $hashed_pass == user_pass_rehash($account->pass, $timestamp, $account->login)) {

Um. No. :)

-    $this->assertTrue(user_load(array('uid' => $admin_user->uid, 'status' => 1)), t('Administrative user is found in the database and enabled.'));
+    $this->assertTrue(reset(user_load_multiple(array($admin_user->uid), array('status' => 1))), t('Administrative user is found in the database and enabled.'));

I find this way less legible than the original code, and that's saying something since the original code was horrid. ;) DamZ suggested something like $this->assertEqual($admin_user->status, 1) (incidentally, we should add constants for those in a separate patch)

catch’s picture

FileSize
36.6 KB

So, fixing those tests failures turns out to be a complete pain in the arse.

Here's a pastebin of what happens - basically the $account in simpletest itself, and a straight db_query() of the users table in the tested site are different. With the static cache, user->login is 0 inside simpletest and after a couple of page requests the hashed password changes. Without the static, just the hashed password changes inside simpletest. In both cases, 14 fails.

Here's a patch just to keep up with HEAD in case anyone fancies debugging this, since I've pretty much given up for now.

edit: link http://drupalbin.com/8026

catch’s picture

FileSize
36.64 KB

This one addresses the issues in #48. With the reset, it's only called where we're specifically recreating a fresh user object inside user_save() (which obviously happens in a single request), and in the test - so shouldn't be any more queries than necessary, none anywhere critical, and of course we do all those queries now already. Still 14 fails with this patch.

Berdir’s picture

I found two reason for the failing tests, but there are more..

FIrst, the following line in user.pages.inc, function user_cancel_form_submit, adds all properties of $account to $edit, including the password:

    $edit = get_object_vars($account);

This causes the user_save function to rehash that password. I don't know why the get_object_vars is even needed there, adding those two lines below fixes the issue

    // Remove the pass key, because that would re-hash it
    unset($edit['pass']);

The second issue is simple.
Some user_load calls after the user cancellation are missing the TRUE parameter to empty the static cache, for example in testUserBlock.

However, it seems as if there are more reasons for the failing tests as I still have fails but I don't have time to continue the debugging right now so I wanted to report what I've found.

catch’s picture

Berdir, thank you very, very much. I probably won't get a chance to look at this until at least later this evening, but my debugging had come to a complete dead end - looks like $edit = get_object_vars($account) was the bit I missed.

Berdir’s picture

Status: Needs work » Needs review
FileSize
38.22 KB

Got all tests working. I needed to add several $reset = TRUE parameters for two reasons:

- After the user has been created, I needed to load him with $reset = TRUE so that I had the correct $account->login as that is needed to create the password hash

- in testUserBlock and testUserblockUnpublish, $reset = TRUE was needed to get the updated $account object with $account->status = 0.

I also completely removed the get_object_vars($account) call, all tests pass fine without and it doesn't seem to do anything needed.

catch’s picture

Note that the extra $reset = TRUE are only added in the tests (and in user_save()) otherwise the static is pretty much self-contained.

There shouldn't be much more to do here - would be great if someone could give it a final look over.

Status: Needs review » Needs work

The last submitted patch failed testing.

catch’s picture

FileSize
36.01 KB

Re-rolled.

catch’s picture

Status: Needs work » Needs review

Status: Needs review » Needs work

The last submitted patch failed testing.

catch’s picture

Status: Needs work » Needs review
FileSize
38.04 KB

Berdir's patch from #53 is still good after the user/users revert. Re-uploaded.

Status: Needs review » Needs work

The last submitted patch failed testing.

catch’s picture

Status: Needs work » Needs review
FileSize
38.05 KB

hmm.

webchick’s picture

Status: Needs review » Needs work
Issue tags: +Needs documentation

I looked through this once more this evening with my nit-picky eye, and couldn't find anything. It's nice to have both nodes and users able to benefit from this performance enhancement, and user_load is actually cached now. Hooray!

Committed to HEAD. Another Favorite-of-Dries knocked off the list. :) Thanks for your hard work, everyone!

Marked CNW until this is documented in the upgrade page.

catch’s picture

Status: Needs work » Fixed
yched’s picture

Status: Fixed » Active

The patch switches the order of the field_attach_load() call and the hook_user_load() invocation. Is that intentional ?
node_load() still calls field_attach_load() first, I think it might be good to keep consistency, and state that field data have been populated by the time hook_[node|user]_load() is invoked.

moshe weitzman’s picture

Yeah, modules should get a hook after fields load. heed the yched.

catch’s picture

Status: Active » Needs review
FileSize
1 KB

Thanks yched, patched.

yched’s picture

Status: Needs review » Reviewed & tested by the community
FileSize
924 bytes

:-)

Same patch, with the TODO note removed - field_attach_load() had a $types param previously (from node_load_multiple), but does't anymore.
We still need to do a check pass on the TODOs, but this one can go while we're looking there.

webchick’s picture

Status: Reviewed & tested by the community » Fixed

Eep! Thanks for catching that! Committed follow-up patch to HEAD. :)

c960657’s picture

Status: Fixed » Needs review
FileSize
1006 bytes

The patch missed user_external_load() that calls user_load() with an array. This gives the following error when logging in using OpenID:

Warning: array_flip(): Can only flip STRING and INTEGER values! in user_load_multiple() (line 197 of /home/chsc/www/drupal7/modules/user/user.module).

webchick’s picture

Yikes! Thanks for catching this!!

However, I don't *think* (but could be wrong) that we need to resort to a db_select() here because there's nothing dynamic about this query. Just a straight-up placeholder conversion should work, no?

c960657’s picture

This patch uses db_query() (I am not too familiar with DBTNG).

catch’s picture

Re-roll just uses the new style placeholders.

Sorry for double breakage, thanks for writing those OpenID tests c960657!

webchick’s picture

Status: Needs review » Fixed

Awesome, thanks a lot. :) Committed to HEAD.

Once the Views FAPI improvements issue gets fixed, I'll make it a priority to get those OpenID tests committed so we don't do this again. Good to know they work though! ;)

Status: Fixed » Closed (fixed)
Issue tags: -Performance, -Favorite-of-Dries, -Needs documentation

Automatically closed -- issue fixed for 2 weeks with no activity.