While migrating lots of users (i.e. calling user_save() over and over), we noticed that performance and memory utilization were both poor. We've traced that a some seemingly useless call to field_attach_form() in user_save(). This patch removes it. That code says that it exists to help keep unwanted fields out of user.data but I fail to see how adding more stuff to $edit could achieve that. The specified goal of field_attach_form is "Add form elements for all fields for an entity to a form structure.". Thats not cleaning anything.

This patch also removes a useless user_load later on since we explicitly have a new user who's got nothing data to load anyway. Whats needed is a stub user object instead.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

chx’s picture

Title: Two performance problems with user_save() » user_save should not use field_attach_form just to figure out fields attached
FileSize
1.79 KB

This patch is two issues in one. Let's split them. I think this is the proper solution to the first problem.

moshe weitzman’s picture

Status: Needs review » Reviewed & tested by the community

Actually, that field_attach_form() did eventually do something useful. chx reworked it so it is fast. RTBC. thanks.

sun’s picture

I quickly scanned user.test, and it seems like we don't have any tests for {users}.data handling... :-/

Dries’s picture

Status: Reviewed & tested by the community » Needs work
Issue tags: +Needs tests

Committed to CVS HEAD. Thanks. We should have tests so marking this 'needs work'.

moshe weitzman’s picture

Priority: Critical » Normal
yched’s picture

+++ modules/user/user.module	2010-02-22 04:43:34 +0000
@@ -385,11 +378,19 @@ function user_save($account, $edit = arr
+      foreach (field_info_instances('user', $bundle) as $instance) {
+        $field = field_info_field_by_id($instance['field_id']);
+        $field_names[] = $field['field_name'];
+      }

Should be doable with a simpler

foreach (field_info_instances('user', $bundle) as $instance) {
  $field_names[] = $instance['field_name'];
}

?

Powered by Dreditor.

catch’s picture

I'm borderline spamming, but please add to the hate for user->data on #347988: Move $user->data into own table and not least #86299: Add "current password" field to "change password form", which also show the lack of tests.

catch’s picture

Title: user_save should not use field_attach_form just to figure out fields attached » Remove magical fairy saving of cruft from user_save()
Status: Needs work » Needs review
FileSize
2.73 KB

Given we were saving passwords in plain text for a couple of weeks there, I think it's time to remove the magic $data handling in user_save().

If you want to save stuff in there, you can add to $account->data in hook_user_presave(). Modules shouldn't have to remove their own form values just to avoid security holes or bloating of the serialized array. Patch does not remove the column, or the automatic loading into the user object, just the magical saving of cruft.

catch’s picture

FileSize
4.49 KB

Found more cruft. This is like a cancer.

catch@catch-laptop:~/www/7$ diffstat user_data.patch
user.module | 44 --------------------------------------------
user.pages.inc | 2 --
2 files changed, 46 deletions(-)

catch’s picture

FileSize
6.42 KB

Fixed api docs. Moshe reminded me that user_cancel and contact module make use of the user->data 'feature', however I'm pretty sure user_cancel doesn't use the magic form handling, contact module might though. Let's see what breaks.

catch@catch-laptop:~/www/7$ diffstat user_data.patch
user.api.php | 14 ++++----------
user.module | 44 --------------------------------------------
user.pages.inc | 2 --
3 files changed, 4 insertions(+), 56 deletions(-)

sun’s picture

How much would break if we didn't remove it, but _move_ it into $user->data for real? Thus, only what's in $user->data gets that handling, nothing else.

moshe weitzman’s picture

@sun - we already have a patch that does more than that. Its a good patch for all the reasons in #8. Also see its suggestion of a presave for modules that really want it. I don't see much sense in putting in more effort to handle POST of edit['data'] field

why is bot not testing latest patch?

catch’s picture

Yeah I stick by #8, except you need to add to $edit['data'] not $account->data I think. I don't think we need to offer any more than that, and it's way better than requiring everything else to manually remove their stuff.

Status: Needs review » Needs work

The last submitted patch, user_data.patch, failed testing.

catch’s picture

Status: Needs work » Needs review
Issue tags: -Needs tests +Performance
FileSize
7.8 KB

Fixed the tests.

What this patch does:

1. Removes the behaviour of saving whatever happens to be in user form submissions into $user->data unless you explicitly take it out yourself. http://drupal.org/node/86299#comment-2669278 would have been one of the very worst core security vulnerabilities ever, had it not been caught during alpha.

2. Does not remove the $user->data column from {users} or the automatic loading of it into user objects. The only API change is that you now have to explicitly set $edit['data']['mymodulekey'] in hook_user_presave() instead of removing stuff there, if you want it to be saved.

3. Removes all the associated cruft we had accumulated in user_save() and elsewhere to avoid saving crap into this serialized field.

4. Is a performance improvement, in addition to security hardening, because it saves a drupal_write_record() on every user save - this existed only due to having to account for the wonky behaviour . We have shockingly bad insert performance in core which I'm trying to fix at the moment - it currently takes around 13 queries to do one user_save() - and that's with no contrib modules inserting their data.

catch@catch-laptop:~/www/7$ diffstat diediedie.patch
 contact/contact.module |   14 ++++++-------
 user/user.api.php      |   14 +++----------
 user/user.module       |   51 ++++++-------------------------------------------
 user/user.pages.inc    |    2 -
 4 files changed, 18 insertions(+), 63 deletions(-)

Status: Needs review » Needs work
Issue tags: -Performance

The last submitted patch, diediedie.patch, failed testing.

sun’s picture

Status: Needs work » Needs review
Issue tags: +Performance

#15: diediedie.patch queued for re-testing.

sun’s picture

Priority: Normal » Critical
+++ modules/contact/contact.module	19 Mar 2010 06:19:47 -0000
@@ -241,6 +234,13 @@ function contact_form_user_profile_form_
+function contact_user_presave(&$edit, $account, $category) {
+  $edit['data']['contact'] = isset($edit['contact']) ? $edit['contact'] : variable_get('contact_default_status', 1);;
+}

Hm. Thinking out loud...

Doesn't this mean that $edit -- which may be $form_state['values'] when coming from a form submission -- must always have all properties assigned? I'm not 100% sure, but I was under the assumption that 'data' always contained the existing values already before...? This code looks like the 'contact' property's value is lost when $edit doesn't contain it?

Do we need to also account for $account->contact for programmatic updates?

Further, is it a good idea to store default configuration values when no user configuration data is given? I'd say that global defaults should be accounted for at run-time, no?

Lastly, very minor: duplicate semicolon here.

+++ modules/user/user.pages.inc	19 Mar 2010 06:20:10 -0000
@@ -287,8 +287,6 @@ function user_profile_form_validate($for
-  // Remove unneeded values.
-  form_state_values_clean($form_state);

We can still keep this though - just removes all button elements from the form values.

Powered by Dreditor.

Status: Needs review » Needs work

The last submitted patch, diediedie.patch, failed testing.

catch’s picture

Status: Needs work » Needs review
FileSize
8.5 KB

In drupal_write_record(), if $object->field isn't set, then it won't be updated. However I guess it's possible that custom user forms might set partial $edit['data'], so we now pre-build the array from any existing information in user_user_presave().

Programmatic updates should also be covered due to the same change.

"Further, is it a good idea to store default configuration values when no user configuration data is given? I'd say that global defaults should be accounted for at run-time, no?"
I don't think it's a good idea, but that's what contact module does currently. I am only interested here in not breaking contact module, not in fixing its obvious deficiencies. Please open an issue for that though, would keep the data array cleaner.

Added back value cleaning, also removed double semicolon.

sun’s picture

Thanks!

+++ modules/user/user.module	19 Mar 2010 08:02:37 -0000
@@ -1156,15 +1116,28 @@ function user_user_presave(&$edit, $acco
+  // Prepopulate $data with the current value of $account->data. Modules can add
+  // to or remove from this array in their own hook_user_presave()
+  // implementations.
+  if (!empty($account->data)) {
+    $data = unserialize($account->data);
+    foreach ($data as $key => $value) {
+      $edit['data'][$key] = $value;
+    }
+  }

This should happen in user_save(), I think (due to module weights & hook invocation order)

+++ modules/user/user.pages.inc	19 Mar 2010 08:02:42 -0000
@@ -277,6 +277,9 @@ function user_profile_form($form, &$form
+  // Remove unneeded values.
+  form_state_values_clean($form_state);

@@ -287,8 +290,6 @@ function user_profile_form_validate($for
-  // Remove unneeded values.
-  form_state_values_clean($form_state);

Just keep it where it was. It's irrelevant for this issue.

Powered by Dreditor.

catch’s picture

FileSize
7.52 KB

Works for me.

sun’s picture

Status: Needs review » Reviewed & tested by the community

Thank you!

Berdir’s picture

Further, is it a good idea to store default configuration values when no user configuration data is given? I'd say that global defaults should be accounted for at run-time, no?

This is tricky, because it would change the current behavior when the global default changes. Then all users that had the same value as the global one would change to the new default.

chx’s picture

Status: Reviewed & tested by the community » Needs work
    if (!empty($account->data)) {
      $data = unserialize($account->data);
      foreach ($data as $key => $value) {
        $edit['data'][$key] = $value;
      }
    }

either just $edit['data'] = $data or go with if (!isset($edit['data'])) $edit['data'] = array(); $edit['data'] += $data (and actually the data variable is not even necessary just move the unserialize in).The former destroys the data passed in via edit the latter keeps it. Oh and if you want to keep your behaviour then array_merge.

Dries’s picture

Status: Needs work » Fixed

Committed to CVS HEAD. Thanks.

sun’s picture

Status: Fixed » Needs work

d'oh! :)

sun’s picture

Status: Needs work » Needs review
FileSize
939 bytes

So how about this?

Status: Needs review » Needs work
Issue tags: -Performance

The last submitted patch, drupal.user-data.28.patch, failed testing.

aspilicious’s picture

Status: Needs work » Needs review
Issue tags: +Performance

#28: drupal.user-data.28.patch queued for re-testing.

catch’s picture

Priority: Critical » Normal

Hmm, I think I'd prefer chx's latter approach except with the array_merge() - there's no reason to destroy $edit['data'] here.

David_Rothstein’s picture

Status: Needs review » Needs work

This appears to have caused #748976: "Show/hide descriptions" links no longer work -- and in order to get that working again, we'll actually need to explicitly not destroy $edit['data'] here (at least as far as I can tell).

Also, we'll want to update the API docs for user_save(), since those currently say the following which is out of date:

 * @param $edit
 *   An array of fields and values to save. For example array('name'
 *   => 'My name'). Keys that do not belong to columns in the user-related
 *   tables are added to the a serialized array in the 'data' column
 *   and will be loaded in the $user->data array by user_load().
 *   Setting a field to NULL deletes it from the data column, if you are
 *   modifying an existing user account.

Also, are there still situations where a not-fully-loaded user object is ever passed in to user_save()? - i.e., what happens with the following code if $account->data hasn't been loaded yet?

    // Prepopulate $edit['data'] with the current value of $account->data.
    // Modules can add to or remove from this array in hook_user_presave().
    if (!empty($account->data)) {
      $data = unserialize($account->data);
      foreach ($data as $key => $value) {
        $edit['data'][$key] = $value;
      }
    }

Doesn't that mean we'll lose some of its saved values?

Finally, a quick grep suggests there is more stuff left over that could be cleaned up here:

In profile_save_profile():

    // Mark field as handled (prevents saving to user->data).
    $edit[$field->name] = NULL;

Probably isn't needed anymore?

Also in user_register_submit():

  // The unset below is needed to prevent these form values from being saved as
  // user data.
  form_state_values_clean($form_state);
  unset($form_state['values']['notify']);

The same, I assume.

sun’s picture

Priority: Normal » Critical

I assume this also caused some weird double-serialization - after a fresh install, I get the following in {users}.data for uid 1:

s:6:"a:0:{}";

Of course, along with the error message:

Warning: Invalid argument supplied for foreach() in drupal_unpack() (line 1175 of includes\bootstrap.inc).
jbrown’s picture

jbrown’s picture

Ripping off http://drupal.org/node/745668#comment-2747534 then http://drupal.org/node/721436#comment-2740728 fixes it, but you have to reinstall, otherwise saving uid 1 results in:

Fatal error: Cannot unset string offsets in /home/jonny/sites/drupal-7.x-dev/modules/user/user.module on line 400

eojthebrave’s picture

Also getting the errors reported in #33. Attached patch fixes the double serialization issue.

jbrown’s picture

Status: Needs work » Needs review

Works for me.

Nice work eojthebrave!

catch’s picture

Status: Needs review » Reviewed & tested by the community

Let's get that in to fix the critical bug, then we have extra cleanup to do once that's in, but I think 99% of that cleanup is cruft removal now so that shouldn't hold up the quick fix.

Dries’s picture

Status: Reviewed & tested by the community » Needs work

Committed the fix for the double initialization. Moving to 'needs work' for follow-up work.

catch’s picture

FileSize
2.52 KB

#754446: $user->data serialization unpack problem was duplicate - it looks like at the moment $account->data can be either an array or serialized when passed into user->save which is just lovely....

chx didn't like marcingy's patch on that issue, but didn't say why, and just switching it to always unserialize breaks tests.

So we have achoice of either checking for both a serialized or unserialized ->data in user_save(), trying to track down every possible place it could be one or the other, or the change to drupal_unpack(). The former is pretty bad (although a small price to pay for removing all the other dozens of lines we had working around it), if we do the middle then we're probably only delaying contrib modules running into it, the latter I'm not sure why it was so unpopular with chx but let's at least see what the test bot says. It's possible user_load() could also do it just for itself if we don't put that in drupal_unpatck(). So re-uploading that patch alongside some of the fixes from David's #32. If this doesn't work then I'll work on one of the alternative options.

catch’s picture

Status: Needs work » Needs review

Let's see what the bot says - this isn't RTBC so please no-one mark as such.

Dries’s picture

+++ modules/user/user.module	27 Mar 2010 04:27:55 -0000
@@ -3391,10 +3388,7 @@ function user_register_submit($form, &$f
-  // The unset below is needed to prevent these form values from being saved as
-  // user data.
   form_state_values_clean($form_state);
-  unset($form_state['values']['notify']);

Good lord, I'm already loving this patch. :)

catch’s picture

FileSize
3.03 KB

Added @return documentation to drupal_unpack() which it had previously lacked. Also confirmed that marcingy's steps to reproduce in #754446: $user->data serialization unpack problem no longer trigger the error. Will try to grab chx so he has a chance to complain (again) about this approach but seems cleanest to me (apart from ripping out $user->data altogether which it's a bit late for now.

rfay’s picture

I still get

Warning: Invalid argument supplied for foreach() in drupal_unpack() (line 1184 of /home/rfay/workspace/d7git/includes/bootstrap.inc).

on every page even with #43 applied. I think it's non-negotiable to solve the unpack warning. I don't have time to debug it right now, but thought it was important to apply the patch and check in.

catch’s picture

@rfay - did you reinstall? I didn't get any such error..

rfay’s picture

@catch: I did not reinstall, but did re-save the current user. I thought that would do it. I hate to reinstall and lose the state that is causing the problem.

Is it essential to reinstall? I'll be happy to if that's fundamental to the fix.

marcingy’s picture

@rfay yes a reinstall is essential as once the data in $user->data is corrupted the unpack will always fail. #754446: $user->data serialization unpack problem provides information on how to recreate the state with great ease on a code base without this patch applied.

rfay’s picture

OK (#44-47): I reinstalled and no longer have the error. I have a similar setup and am doing similar things that should result in similar results, but since I didn't look deeply into this, I just have to trust you (with good reason, of course) that this is a real fix.

Please disregard my interruption :-)

catch’s picture

FileSize
2.65 KB

Same patch except this time no change to drupal_unpack(), we can just do this in the user entity controller and keep it self contained there.

eojthebrave’s picture

+++ modules/user/user.module	28 Mar 2010 12:21:54 -0000
@@ -239,6 +239,10 @@ class UserController extends DrupalDefau
+      // unserialized array. This ensures we can always safely resialize it

resialize should be reserialize

Otherwise this looks good to me. Comment cleanup and removal of more unnecessary code.

132 critical left. Go review some!

catch’s picture

FileSize
2.66 KB

Fixed the typo.

andypost’s picture

mfb’s picture

It's just not ideal, performance-wise, to unserialize $user->data twice in a row, once by drupal_unpack() and once by unserialize(). On the other hand this is a critical bug fix so maybe it should be applied and cleaned up later?

catch’s picture

We run unserialize dozens of times each request for menu links etc, an extra one on users is fairly minor IMO, often there's only one user, rarely more than a few loaded per page. The security hardening and user_save() performance outweigh that IMO.

If we were to commit this as is, I think there's still a remaining issue with #748976: "Show/hide descriptions" links no longer work, haven't had time to look into that one yet, although I'm frankly a bit surprised that's being stored in user->data in the first place. If we were somehow able to remove the extra unserialize() it'd be nice though. One option would be to just keep all $user->data in $user->data and lose the drupal_unpack() - since we've change the API slightly already I'm not sure if that'd be acceptable, but it'd solve some of these remaining issues.

mfb’s picture

The solution for #748976: "Show/hide descriptions" links no longer work is that you also need to unserialize $user->data in session.inc _drupal_session_read(), i.e.

  // We found the client's session record and they are an authenticated,
  // active user.
  if ($user && $user->uid > 0 && $user->status == 1) {
    // This is done to unserialize the data member of $user.
    $user = drupal_unpack($user);
    $user->data = unserialize($user->data);

More double unserialization which again isn't ideal :)

catch’s picture

That convinces me we should cull the drupal_unpack() and have $user->data be the canonical place to find this stuff. That synchronises _save() and _load(), means we don't do double work etc. Given we made modules using this do a bit more work on _save() already, slightly changing the object structure doesn't seem too bad to fix the bugs. If someone wants to beat me to a patch that'd be great, otherwise I'll try to get to it.

David_Rothstein’s picture

FileSize
3.18 KB

Hm, what that really makes me wonder is why two totally different mechanisms are used to populate the global $user object on login vs. actually loading a user account.... however, I guess that's another can of worms.

We do need to fix the serialization, but as per #32, the main thing we need here to get #748976: "Show/hide descriptions" links no longer work working is to respect the $edit['data'] that is passed in to user_save(), rather than blowing it away. I've rerolled the patch and added that one-line fix. No other changes for now, although I definitely agree with the direction proposed in #56. The magical drupal_unpack() thing is really weird and messes up the namespace pretty badly. (I can only imagine what happens at the moment if someone tries to put something in $account->data['roles'], for example.)

mfb’s picture

Looks like drupal_unpack() is also used a couple times by comment module to "unpack" $user->data into the $comment object...

moshe weitzman’s picture

Status: Needs review » Reviewed & tested by the community

re: #57, i tried to standardize the building of global $user at #361471: Global $user object should be a complete entity. Read it and weep :)

Latest patch looks rtbc to me.

mfb’s picture

Status: Reviewed & tested by the community » Needs work

$user->data still needs to be unserialized in session.inc, otherwise you are array merging a string and array, when $user object was built by _drupal_session_read() as opposed to user_load()

catch’s picture

Status: Needs work » Needs review
FileSize
3.8 KB

Here's the same patch with the addition of an unserialize() in session.inc

I've opened #763048: Remove drupal_unpack($user) due to namespacing collisions to tackile the namespacing, left it as critical, although it's more like 'major' if we get that priority eventually. We should get this in first though.

Dries’s picture

Status: Needs review » Fixed

This looks good now. Committed to CVS HEAD. Thanks.

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

rfay’s picture

Assigned: moshe weitzman » rfay
Priority: Critical » Normal
Status: Closed (fixed) » Needs work
Issue tags: +API change, +Needs documentation updates

It looks to me like this important change didn't get documented in the module upgrade guide. Where did it get documented? With multiple commits and 63 comments it's hard to figure out what the net result was.

moshe weitzman’s picture

I think the upgrade should state: The user edit form no longer automatically saves unrecognized properties to the users.data column. Modules relying on this behavior are recommended to create own table for storage of user properties. The users.data column is deprecated and will be removed in Drupal 8.

Jackinloadup’s picture

Is there a task issue already in place to make sure that the users.data column is removed in Drupal 8?

rfay’s picture

Status: Needs work » Fixed

Status: Fixed » Closed (fixed)
Issue tags: -Performance, -API change, -Needs documentation updates

Automatically closed -- issue fixed for 2 weeks with no activity.