Password expirations and expiration warnings are handled by cron. For sites with many users (1000s or more), the cron code is problematically inefficient--it uses too much memory and takes too long to execute. The result is password expiration becomes unusable because memory limits are exceeded or execution timeouts occur. All current release branches are affected: 7.x-1.x, 7.x-2.x, and 8.x-3.x.

Need to make the expiration cron code capable of efficiently handling 1000s (if not 10000s and 100000s) of users.

Comments

aohrvetpv’s picture

Status: Active » Needs review

This looks like a design issue: A condition plug-in acts on a single user, and the condition callback expects a user object as an argument. So to evaluate a condition for many users, their user objects must all be loaded and then individually passed to the condition, which is very inefficient for large numbers of users.

Instead I think a condition should act as a filter on an array of candidate uids. The condition could take as an input the array of uids, retrieve only the data it needs to evaluate the condition, and return the uids that match the condition. For instance, for the role condition, you need only do some table joins to determine whether users have some roles, not user_load() each user. A condition could still user_load() if it really needed to.

I have little familiarity with the code, and have only cursorily looked at this problem, so I am unsure whether this suggested design change is sensible. Setting issue to Needs Review.

For this specific bug, it looks like a fix could alternatively be hacked by building faux user objects without user_load() that have only the information needed for the role condition to be evaluated.

deekayen’s picture

Status: Needs review » Active

there's no patch to review...

aohrvetpv’s picture

Version: 7.x-2.0-alpha1 » 7.x-2.x-dev
colan’s picture

elijah lynn’s picture

Priority: Normal » Major

Read that this was a blocker over at #2331599: 2.0 release roadmap?. Think this should be bumped to a major then if not critical.

greggles’s picture

I think one solution to this could be to modify the query so it only runs on users who have a role that has a policy. At least in 1.x it appears it runs for everyone.

A workaround would be to make it possible to run this via drush instead of via the normal cron interface. Then it can be run in a place where timeouts and RAM are less likely to be an issue.

I'm interested in working on improving both of those things for the 7.x-1.x branch which seems like it would mostly be be portable to 7.x-2.x. If it seems off-topic for this issue then I can move it elsewhere.

greggles’s picture

StatusFileSize
new1.84 KB

I started with an explain on the main query that seems to take a long time for me:

MySQL [cardcom]> explain select p.created, e.pid, e.unblocked, e.warning, u.uid, u.created from users u left join password_policy_history p on u.uid = p.uid left join password_policy_expiration e on u.uid = e.uid where u.uid > 0 and u.status = 1 order by p.created;
+----+-------------+-------+-------+---------------+---------+---------+---------------+--------+----------------------------------------------+
| id | select_type | table | type  | possible_keys | key     | key_len | ref           | rows   | Extra                                        |
+----+-------------+-------+-------+---------------+---------+---------+---------------+--------+----------------------------------------------+
|  1 | SIMPLE      | u     | range | PRIMARY       | PRIMARY | 4       | NULL          | 327768 | Using where; Using temporary; Using filesort |
|  1 | SIMPLE      | p     | ref   | uid           | uid     | 4       | cardcom.u.uid |      1 | Using where                                  |
|  1 | SIMPLE      | e     | ref   | uid           | uid     | 4       | cardcom.u.uid |      1 | Using where                                  |
+----+-------------+-------+-------+---------------+---------+---------+---------------+--------+----------------------------------------------+

I noticed that it's using an orderby and doesn't join to password_policy_role/users_roles to reduce the set of uids to query. So, I thought I'd try tweaking those things to see if they can help the plan.

Removing the orderby seems to dramatically improve the plan removing the "Using temporary" and "Using filesort":

MySQL [cardcom]> explain select p.created, e.pid, e.unblocked, e.warning, u.uid, u.created from users u left join password_policy_history p on u.uid = p.uid left join password_policy_expiration e on u.uid = e.uid where u.uid > 0 and u.status = 1;
+----+-------------+-------+-------+---------------+---------+---------+---------------+--------+-------------+
| id | select_type | table | type  | possible_keys | key     | key_len | ref           | rows   | Extra       |
+----+-------------+-------+-------+---------------+---------+---------+---------------+--------+-------------+
|  1 | SIMPLE      | u     | range | PRIMARY       | PRIMARY | 4       | NULL          | 327768 | Using where |
|  1 | SIMPLE      | p     | ref   | uid           | uid     | 4       | cardcom.u.uid |      1 | Using where |
|  1 | SIMPLE      | e     | ref   | uid           | uid     | 4       | cardcom.u.uid |      1 | Using where |
+----+-------------+-------+-------+---------------+---------+---------+---------------+--------+-------------+

And here's the plan without the orderby and with joining to users_roles and password_policy_role. My sense is that Using temporary/Using filesort is not great, but if you look at how many rows it is returning...they are *drastically* reduced.

MySQL [cardcom]> explain select p.created, e.pid, e.unblocked, e.warning, u.uid, u.created from users u inner join users_roles ur on u.uid = ur.uid inner join password_policy_role r on ur.rid = r.rid left join password_policy_history p on u.uid = p.uid left join password_policy_expiration e on u.uid = e.uid where u.uid > 0 and u.status = 1 order by p.created;
+----+-------------+-------+--------+---------------+---------+---------+----------------+------+----------------------------------------------+
| id | select_type | table | type   | possible_keys | key     | key_len | ref            | rows | Extra                                        |
+----+-------------+-------+--------+---------------+---------+---------+----------------+------+----------------------------------------------+
|  1 | SIMPLE      | r     | index  | PRIMARY       | PRIMARY | 8       | NULL           |    5 | Using index; Using temporary; Using filesort |
|  1 | SIMPLE      | ur    | ref    | PRIMARY,rid   | rid     | 4       | cardcom.r.rid  |    5 | Using where; Using index                     |
|  1 | SIMPLE      | u     | eq_ref | PRIMARY       | PRIMARY | 4       | cardcom.ur.uid |    1 | Using where                                  |
|  1 | SIMPLE      | p     | ref    | uid           | uid     | 4       | cardcom.ur.uid |    1 | Using where                                  |
|  1 | SIMPLE      | e     | ref    | uid           | uid     | 4       | cardcom.ur.uid |    1 | Using where                                  |
+----+-------------+-------+--------+---------------+---------+---------+----------------+------+----------------------------------------------+

In practice on my site, running that query returned 384 rows in set (0.01 sec). Whereas the previous query (without orderby) returns 921720 rows in set (7.67 sec). The query with orderby returns 921720 rows in set (8.27 sec)

We can even take it a little further and add that the policy has to be enabled and has to involve an expiration. This would work well on a site where the number of users affected by an enabled expiration policy is much lower than those affected by a general policy:

MySQL [cardcom]> explain select h.created, h.pid, e.pid, e.unblocked, e.warning, u.uid, u.created from users u inner join users_roles ur on u.uid = ur.uid inner join password_policy_role r on ur.rid = r.rid inner join password_policy p on r.pid = p.pid left join password_policy_history h on u.uid = h.uid left join password_policy_expiration e on u.uid = e.uid where u.uid > 0 and u.status = 1 and p.enabled = 1 and p.expiration > 0;
+----+-------------+-------+--------+---------------+---------+---------+----------------+------+-----------------------------------------------------------------+
| id | select_type | table | type   | possible_keys | key     | key_len | ref            | rows | Extra                                                           |
+----+-------------+-------+--------+---------------+---------+---------+----------------+------+-----------------------------------------------------------------+
|  1 | SIMPLE      | p     | ALL    | PRIMARY       | NULL    | NULL    | NULL           |    1 | Using where                                                     |
|  1 | SIMPLE      | r     | index  | PRIMARY       | PRIMARY | 8       | NULL           |    5 | Using where; Using index; Using join buffer (Block Nested Loop) |
|  1 | SIMPLE      | ur    | ref    | PRIMARY,rid   | rid     | 4       | cardcom.r.rid  |    5 | Using where; Using index                                        |
|  1 | SIMPLE      | u     | eq_ref | PRIMARY       | PRIMARY | 4       | cardcom.ur.uid |    1 | Using where                                                     |
|  1 | SIMPLE      | h     | ref    | uid           | uid     | 4       | cardcom.ur.uid |    1 | Using where                                                     |
|  1 | SIMPLE      | e     | ref    | uid           | uid     | 4       | cardcom.ur.uid |    1 | Using where                                                     |
+----+-------------+-------+--------+---------------+---------+---------+----------------+------+-----------------------------------------------------------------+

If we can reduce the number of rows returned, we'll reduce the size of the arrays that are built which will reduce the memory consumed. It also seems to be a much more efficient query for the database to run, so we're not shifting load to the database but in fact reducing load on the database.

This isn't a catch-all solution: if a site has a policy that affects basically every user on the site they are still stuck returning all the data and managing it in php. I think there are probably some ways to handle the data more efficiently as well, but I think the query improvements are worthwhile on their own.

greggles’s picture

Status: Active » Needs review

Untested patch because I have to stop working on this for the day - needs review and testing.

Status: Needs review » Needs work

The last submitted patch, 7: 2252541_policy_query_speed.patch, failed testing.

greggles’s picture

StatusFileSize
new916 bytes

Well, the patch doesn't apply to 7.x-1.x. Really the question is if this approach makes sense to people before I go further with it.

Another approach I think could be taken at the same time is just pushing this out of cron so that it only runs on a nightly basis.

Really, I think that doing both of these ideas is the right way to go. Any feedback on them? I use the 7.x-1.x branch so this is against that as well, but if people like it conceptually I can make a 7.x-2.x patch and then backport.

aohrvetpv’s picture

Filtering to roles to which expiration policies apply seems sensible to me (why not?), even if it will not help the problem where there are just very many users to which expiration policies apply.

Would it hurt performance much to always filter on roles that have an expiration policy? So instead of the following as in your patch:
Query 1: Does an expiration policy apply to the 'authenticated user' role?
Query 2: If not, filter to roles to which expiration policies apply.

Do:
Query 1: Filter to roles to which expiration policies apply (even if 'authenticated user' is one of the roles).

I ask because the expiration code in Password Policy 7.x-1.x is already overcomplicated and needing refactoring, IMHO. Wary of adding more complexity.

greggles’s picture

If it applies to all roles then I believe there is a performance decrease from adding 2 unnecessary joins to the process, even if the joins are on indexed columns.

I'd be fine doing your proposed way...people who have a password policy that applies to all users and who have a lot of users are probably already doing something custom (e.g. running in Drush) to make this work at all.

Any thoughts on the code in #10 to make this all basically optional?

greggles’s picture

Version: 7.x-2.x-dev » 7.x-1.x-dev
Status: Needs work » Needs review

OK, here's a 7.x-1.x patch that combines the two concepts and explains the use of the hidden variable.

Moving the version to 7.x-1.x for automated tests.

I'm happy to provide a similar patch for 7.x-2.x - just let me know.

greggles’s picture

StatusFileSize
new3.49 KB

The last submitted patch, 10: 2252541_policy_cron_optional.patch, failed testing.

Status: Needs review » Needs work

The last submitted patch, 14: 2252541_password_policy_performance_defer.patch, failed testing.

greggles’s picture

Status: Needs work » Needs review
StatusFileSize
new3.5 KB

[] vs. array() mistake.

greggles’s picture

To help benchmark the performance/scalability issues here I ran the following tests on a brand new site. The result is that PHP memory consumption goes from ~71MB to ~33MB if the password policy applies to a subset of users. The time to run the command goes from ~6.6 seconds to ~.5 seconds. In the case where the policy applies to basically authenticated users then performance/resource consumption is not affected (no surprise). In the case where password policy applies to a role that is present on a large number of users (~3,000) the query is still much faster than trying to load all the users.

Steps:
* created a new role, that does not have a password policy associated with it.
* created a password policy associated with admin account
* ran "drush genu --roles=3 100" and then "drush genu --roles=4 3000" and then "drush genu 30000"
* ran the command: /usr/bin/time -l drush -v ev "password_policy_cron();"

Current 7.x-1.x-dev:

6.66 real 3.46 user 0.54 sys
71823360 maximum resident set size

With this patch:

0.58 real 0.30 user 0.08 sys
33746944 maximum resident set size

With this patch, and apply password policy to all auth users:

6.83 real 3.82 user 0.51 sys
71901184 maximum resident set size

With this patch, and apply password policy to a large number of users with a role:

Then I did "drush genu --roles=4 10000" to test out the scenario of a "lot" of users with roles and it's still dramatically better.

3.08 real 1.65 user 0.26 sys
46624768 maximum resident set size

aohrvetpv’s picture

Looks good to me.

What throws an exception in password_policy_process_expirations()? The docblock says it throws \Exception.

greggles’s picture

Not sure, that was auto-generated by phpstorm. I think any db query *can* throw a PDOException, but if you'd like to remove that I can easily do so.

Please let me know if you want me to port this to 7.x-2.x. I can say that on a site I manage cron runs went from 21 minutes to 1 minute after we deployed this.

aohrvetpv’s picture

Made several minor changes to patch:
- Removed couple lines changing table aliases. Also removed "We don't" -> "Don't". Agree with the changes but think they are unrelated so would like to commit separately. Please correct if wrong.
- "with a command like" -> "with a Drush command like" (Administrators may not know about Drush and this may help a little to figure out it is not a native Drupal command.)
- DRUPAL_AUTHENTICATED_RID in place of 2
- Capitalized "Password Policy" in watchdog message.

I have not tested this functionality.

Status: Needs review » Needs work

The last submitted patch, 21: 2252541_password_policy_performance_defer_21.patch, failed testing.

aohrvetpv’s picture

Status: Needs work » Needs review
StatusFileSize
new3.5 KB

Let's see if the patch from #17 fails testing too. (Using the "Retest" link does not seem to work for patches submitted before the CI switch.)

aohrvetpv’s picture

Tests won't run because the automated tests are "Waiting for branch to pass".

The 7.x-1.x branch isn't passing because it is being tested on D6 for some reason, and I can't find a way to delete the D6 test configuration. Any ideas?

It looks like a CI bug. Will try to figure out how to report an issue.

aohrvetpv’s picture

Submitted DrupalCI support request:
https://www.drupal.org/node/2638452

greggles’s picture

Thanks for identifying the problem. I subscribed to the other issue. I took a look at the settings and agree with what you see, but don't see a way to fix it.

The last submitted patch, 21: 2252541_password_policy_performance_defer_21.patch, failed testing.

aohrvetpv’s picture

Status: Needs review » Needs work

Having trouble seeing the functional difference between greggles patch in #17 and my patch in #21.

aohrvetpv’s picture

I see it now. Re-using the 'p' alias for different tables. I wrongly thought changing the alias for password_policy_history to 'h' was a cosmetic change. Will fix.

aohrvetpv’s picture

Status: Needs work » Needs review
StatusFileSize
new3.48 KB

  • AohRveTPV committed 4a583a1 on 7.x-1.x authored by greggles
    Issue #2252541 by greggles, AohRveTPV: Cron job runs out of memory...
aohrvetpv’s picture

Version: 7.x-1.x-dev » 7.x-2.x-dev
Status: Needs review » Patch (to be ported)
aohrvetpv’s picture

greggles’s picture

Thanks, AohRveTPV. I'll try to work on the 7.x-2.x if you don't beat me to it.

  • AohRveTPV committed 396ca70 on 7.x-1.x
    Revert "Issue #2252541 by greggles, AohRveTPV: Cron job runs out of...
aohrvetpv’s picture

#31 breaks expiration.

aohrvetpv’s picture

The patch changes:

    ->fields('p', array('created'))

to:

    ->fields('h', array('created'))

However, this line remains:

$accounts[$row->uid] = empty($row->p_created) ? $row->created : $row->p_created;

The result is $row->p_created is always empty, and the user creation time ($row->created) is used instead of the most recent password change time for determining expiration.

The tests should have caught this, IMO, and need improvement.

aohrvetpv’s picture

Status: Patch (to be ported) » Needs work
aohrvetpv’s picture

Status: Needs work » Needs review
StatusFileSize
new3.97 KB
aohrvetpv’s picture

Version: 7.x-2.x-dev » 7.x-1.x-dev

Status: Needs review » Needs work

The last submitted patch, 40: 2252541_password_policy_performance_defer_40.patch, failed testing.

aohrvetpv’s picture

Status: Needs work » Needs review

Accidentally tested patch on 7.x-2.x instead of 7.x-1.x. Re-testing.

aohrvetpv’s picture

Made improvement to test method that would have caught bug in #31:
http://cgit.drupalcode.org/password_policy/commit/?id=ee096470a271ab3298...

After the user whose password is expired changes their password, again run cron, then attempt login, to check that the password is not immediately re-expired.

  • AohRveTPV committed e3cf1f8 on 7.x-1.x authored by greggles
    Issue #2252541 by AohRveTPV, greggles: Cron job runs out of memory...
aohrvetpv’s picture

Version: 7.x-1.x-dev » 7.x-2.x-dev
Status: Needs review » Patch (to be ported)

  • AohRveTPV committed d50f72a on 7.x-1.x
    Revert "Issue #2252541 by AohRveTPV, greggles: Cron job runs out of...
aohrvetpv’s picture

Version: 7.x-2.x-dev » 7.x-1.x-dev
Status: Patch (to be ported) » Needs review

Removing orderBy('p.created') might be problematic, because of this code:

foreach ($result as $row) {
  /* snip */

  // Use account creation timestamp if there is no entry in password history
  // table.
  $accounts[$row->uid] = empty($row->p_created) ? $row->created : $row->p_created;
  
  /* snip */
}

The purpose of the statement in the for loop is to get the time of the user's most recent password change.

But there can be multiple rows for each user, each with a different p_created. (The password_policy_history table stores the times of all password changes for a user, not just the most recent.) So the code is relying on the last iteration of the for loop for a given user to provide their most recent password change time in $row->p_created. I'm not sure we can depend on the last iteration being the most recent password change time unless the results are ordered ascending on p.created.

I've reverted the patch out of caution. Will recommit if convinced removing orderBy() is safe.

(The intent of the code I excerpted above is not very clear. The expiration code in general is complicated and hard to read, in my opinion. I am working on refactoring it.)

andypost’s picture

Issue tags: +needs port to Drupal 8
aohrvetpv’s picture

Status: Needs review » Needs work

orderBy('p.created') is needed, so #40 will not work. See #2833776: Multiple warning mails sent per day in certain condition for an example of a problem that may occur when rows are not ordered as expected.

sebastien m.’s picture

This performance issue seems to be fixed just by changing a flag in the 'user_load()' function in the 'PasswordPolicyExpire::cron' method.
Indeed, User entity controller use by default a static cache to store already loaded entities.
In a batch process, all users are different, so no need to use this kind of cache.
Instead, you can 'reset' at each call this cache and it won't be fill.

File 'plugins/item/expire.inc' at line '327' :

      foreach ($candidates as $candidate) {
        $account = user_load($candidate->uid, TRUE);
        if ($this->policy->match($account)) {

What do you think about it ?
Should I create a patch or a new issue ?

Many thanks

aohrvetpv’s picture

Version: 7.x-1.x-dev » 7.x-2.x-dev

Sebastien, thanks for the idea, I'll plan to review it and try it out soon! A patch would be helpful.

Your suggestion applies to 7.x-2.x, so updating issue version accordingly. Both 7.x-1.x and 7.x-2.x have this problem.

sebastien m.’s picture

Thanks for your feedback.
Here is the patch for the 7.x-2.x branch.

sebastien m.’s picture

Status: Needs work » Needs review
StatusFileSize
new1.09 KB

Here is the patch for the 7.x-1.x branch.

sebastien m.’s picture

In the 8.x-3.x branch, all users are loaded at once using this code:


      $users = \Drupal::entityTypeManager()
        ->getStorage('user')
        ->loadMultiple($valid_list);

Instead, users should be loaded one by one in the following 'foreach' loop:


      /** @var \Drupal\user\UserInterface $user */
      foreach ($users as $user) {
        $user->set('field_password_expiration', '1');
        $user->save();
      }

sebastien m.’s picture

Here is my patch for 8.x-3.x branch.

aohrvetpv’s picture

Thanks for the patches. Do you or anyone have a suggestion on how to reproduce the problem, so we can verify that the patch fixes it?

My current idea is to use Drush to create a lot of users--maybe 100,000. I can then see if the PHP memory limit is exceeded, or measure the memory usage using PHP functions. Then, I'll apply the patch and see either the memory limit is not exceeded, or the memory usage is reduced.

sebastien m.’s picture

Sure. I used Devel to create a batch of 1000 users.
I have added a call to `memory_get_usage` before and in the foreach loop.

aohrvetpv’s picture

Did some testing with 7.x-2.x. Unfortunately I think memory consumption is just one facet of this performance problem. The cron execution time is extremely long, probably problematically so for sites with 100,000s of users. This seems to be a design issue as I wrote in #1.

Here was my test:
- Fresh Drupal 7 site.
- Password Policy 7.x-2.x-dev, default policy plus password expiration enabled.
- 10,000 users created with Drush and Devel: drush user-generate 10000.
- 10,000 password_policy_history entries, 1 for each user, created using Bash/MySQL.

1. Ran drush cron.
2. Cron completed successfully, but only after maybe 15-30 minutes. There was no out of memory error.

If cron takes 15-30 minutes for 10000 users, it could take hours or days for 100000s of users. That is problematic. Passing $reset = TRUE to user_load() won't help the execution speed.

aohrvetpv’s picture

Version: 7.x-2.x-dev » 7.x-1.x-dev

Setting this back to 7.x-1.x-dev because it helps me with bug triage. (I filter on 7.x-1.x bug reports because I am more interested in fixing bugs in the stable branch than 7.x-2.x or 8.x-3.x.)

Please feel free to change the version as needed, though.

sebastien m.’s picture

This issue seems to concern a 'out of memory' issue.
That's why I passed 'TRUE' to avoid any static memory consuption.
Concerning the time of the cron, it's quite normal because it loads 100'000 users from database which is really huge.
I saw in your 8.x-3.x branch that's you use the 'range()' function to restrict database query.
May be it could be done on the 7.x branches.
Simple suggest.

aohrvetpv’s picture

This issue seems to concern a 'out of memory' issue.

You're right, the issue summary only mentions the memory usage problem. Execution time is also a problem, as discussed in the comments. If we fix memory usage, execution time will still be a problem. We will ultimately need a solution for to the general problem of processing 1000s of users efficiently.

I'll rewrite the issue summary.

That's why I passed 'TRUE' to avoid any static memory consuption.

We could go ahead and commit this change. There are possibly sites where memory usage is a problem but execution time is not a problem. So your change would be an improvement, even if not a fix for the general inefficiency problem.

Concerning the time of the cron, it's quite normal because it loads 100'000 users from database which is really huge.

Yes, we need to somehow avoid calling user_load() for 1000s of users in a single cron run.

I saw in your 8.x-3.x branch that's you use the 'range()' function to restrict database query.
May be it could be done on the 7.x branches.

I'll plan to look into this. We probably need to process all users in a single cron run. Many sites are configured to run cron only once per day. In 7.x, expiration notifications are only sent when cron runs on the day they are due to be sent: #2833764: Send expiration warnings even if cron not run daily. So any users not processed in a daily cron run may miss expiration notifications.

aohrvetpv’s picture

Title: Cron job runs out of memory trying to process 1000s of users » Expiration cron does not scale for large numbers of users
Status: Needs review » Needs work

(Removed comment that was supposed to be summary edit.)

aohrvetpv’s picture

Issue summary: View changes

aohrvetpv’s picture

Status: Needs work » Active

Thanks for the memory patches. They should help some sites. Leaving 8.x-3.x memory patch for nerdstein to review/commit. (I've been working primarily on the D7 branches.)

Only the exceeded memory limits are fixed by the previous commits. We will ultimately need to fix both expiration memory usage and execution time.

malcomio’s picture

Running 7.x-1.14 with the patch from #54 applied on a site with 462091 users and 512M memory, we're still getting out of memory issues.

aaron.ferris’s picture

I've been looking at a way of reducing the result set of users based on whether they have a password policy applicable role, patch attached.

This won't help sites that have a large user base who all have a password policy applicable role, but it helps in my implementation.

Im running 7.x-1.14.

Thanks
Aaron

aohrvetpv’s picture

I'm working on a 7.x-1.x patch to change the expiration processing in two ways:
- Processes users in chunks (e.g., 1000 at a time).
- Replaces per-user user_load() calls with single database queries that operate on the chunk of users.

I think it will solve the out of memory issues, but execution time may still be long depending on the number of users. Will have to do some testing to quantify. Will perhaps also use the Batch API.

Just wanted to mention in case there's something basically wrong with this idea, or something else that could be done.

joachim’s picture

> I think it will solve the out of memory issues, but execution time may still be long depending on the number of users. Will have to do some testing to quantify. Will perhaps also use the Batch API.

For something that runs in cron, you should use the batch queue API.

johnhanley’s picture

The following patch is a modified version of #69:

1) When building user role query, added inner join to table password_policy to retrieve roles for enabled password policy roles only
2) Exit password_policy_cron() if no roles enabled
3) Replaced potential multiple OR conditions with single IN condition
4) Restored missing 'name' field from select criteria

johnhanley’s picture

Same patch as above sans extraneous lines.

Applied successfully to 7.x-1.15 (and maybe dev, but we're not running dev version on production).

nerdstein’s picture

I'd like to recommend we split out two separate issues for ongoing D7 improvements and D8 work because there are two separate threads getting munged together and it's hard to follow.

I'm recommending we take https://www.drupal.org/project/password_policy/issues/2252541#comment-12... as a candidate patch for a new D8 issue.

andypost’s picture

Good idea to separate d8 issue!

johnhanley’s picture

I've actually rewritten the cron hook to use Queue API. This includes consolidating several queries.

We're in the midst of testing, but it looks good so far. I'll try to post the code at some point in the near future.

banviktor’s picture

The last few patches don't take into consideration that users_roles don't hold references to rid 2 (authenticated role) and as a result they filter out a bit too much.

I think that the variable password_policy_process_on_cron that was included in patches <= #40 is a useful addition. This patch resembles #40 the most and cleanly applies to 7.x-1.x (and 7.x-1.15)

aohrvetpv’s picture

Status: Active » Needs review
StatusFileSize
new37.92 KB

Patch refactors expiration code and removes calls to user_load().

The expiration code is currently overly complex and in serious need of refactoring, in my opinion. For instance, the expiration cron code is a single 140+ line function with many variables, deep nesting, and hard-to-understand logic. It makes it difficult to work on making improvements to the expiration code. So I think the top priority is to simplify that code, and this patch attempts to do that.

Code pertaining to expiration is moved from password_policy.module to password_policy.expiration.inc. This reduces the length of password_policy.module from 1835 lines (ridiculous) to 1473 lines (a little less ridiculous). The long cron handling code is broken into several functions.

After refactoring the code some, the user_load() calls were removed. I believe the user_load() are the costliest part of the expiration cron. I have not tested the changes for a large number of users, but I believe they should help.

A difficulty is that warning emails can be customized for the user: user tokens can be used in the email, and the user's language should be used for the email. Loading the user is normally required for both those customizations, but I came up with workarounds that do not require loading the user. Not sure they're entirely correct, but all expiration tests pass.

Perhaps removing the user_load() calls is problematic or not enough of a performance improvement, and queueing is a better solution (as suggested in #71, #76, etc.) Or maybe both could be done. Either way, I think it's important to refactor the expiration code as a first step. I plan to make a patch that is just the refactoring without any attempts at performance improvements.

(I hope this comment is not insulting to the authors of the existing expiration code. I appreciate them getting it working to the extent that it does! The code probably grew in its complexity and scope over time.)

aohrvetpv’s picture

Status: Needs review » Postponed (maintainer needs more info)

I have some time I could spend on this issue. I have some questions:
1. Does this issue still affect anyone using 7.x-1.x? (Or has everyone moved on to D8?)
2. If so, how many users does your site have?
3. If so, is the problem running out of memory, that cron takes too long to run, or both?

I wouldn't want to put effort into fixing this problem if no one really cares.

aohrvetpv’s picture

Status: Postponed (maintainer needs more info) » Closed (won't fix)

No one seems to care about fixing this in 7.x.

greggles’s picture

Status: Closed (won't fix) » Needs review

1. Yes.
2. over a million.
3. I'm not sure any more. I think it was chewing up RAM to the point that the drush process running cron would exhaust RAM on the server and get killed. The site is using the patch in #77 to stay under those limits. The time that it takes to run cron isn't as important when running it via drush.

IMO the ideas presented in #78 are good, but make a big departure from how things were done. Big departures in architecture are great to do in D8. The changes proposed in #77 are small and hopefully easy to review. Did you consider them? Moving this to needs review for #77 or #78.

jasonawant’s picture

I think this is still an issue with 8.x version, any one else?

I think it can occur here in password_policy_cron() when there are too many users that have been loaded and then attempted to save.

      foreach ($users as $user) {
        $user->set('field_password_expiration', '1');
        $user->save();
      }
greggles’s picture

For anyone that this affects in Drupal 7, does the fix from #77 help you?

aohrvetpv’s picture

greggles, I'll review #77. A few years back I felt like refactoring the expiration code was the best way forward, and I was working on that, but I guess I didn't get it done. So #77 may be the most realistic solution. Sorry for the delay.