It is impossible to do programmatic user deletion in D7. Its a real train wreck of FAPI mixed with API. The closest API function we have is _user_cancel() and that does watchdogs and cache_clear_all() for every single deleted user. Thats not at all feasible when deleting 1000 users like we do in migrate module.

For starters, attached small patch refactors the queries out of that function and into own function. I also removed the authmap query since core no longer writes into the authmap table. It is just there as a convenience for contrib modules. See #181578: Remove distributed auth from user_save().

Comments

catch’s picture

Any reason why this is a private function? And could we invoke hook_user_cancel() there too?

moshe weitzman’s picture

@catch - its private because deleting a user is supposed to be done via a batch and the batch creation and population is in user_cancel(). user_cancel is the public facing function. there is some good logic to how it works now, but it sucks for programmatic deletes. i don['t think we can fix it trivially. lets just move this patch along for now.

webchick’s picture

I'm slightly confused because when I look at http://api.drupal.org/api/function/node_delete/7 and http://api.drupal.org/api/function/comment_delete_multiple/7, they consist of just:

function node_delete($nid) {
  node_delete_multiple(array($nid));
}

It seems like the proper way to fix this to create a user_delete(), and user_delete_multiple(), and then call them from user_cancel().

moshe weitzman’s picture

I'd like that too. user_delete_multiple() implies we want to delete multiple objects but we don't. User cancellation means you have to delete one user AND unpublish or delete his nodes and his comments. And send notification email and destroy session. Thats just what core does. Contrib will need to do more so we started down the road of batching all this and we are still working on it.

webchick’s picture

Status: Needs review » Needs work

Right, I understand that cancelling users results in all of that. But deleting users should still be just deleting them. So just make the proper functions and call them from user_cancel() IMO. Or is there a reason we can't do that?

Additionally, let's ditch the authmap rollback hunks; I still see several references to that elsewhere in user.module, so I don't think we're ready to drop those yet.

moshe weitzman’s picture

Title: Refactor user_cancel_delete (a little) » user_cancel_delete method calls into a "standard" user_delete_multiple API
Status: Needs work » Needs review
StatusFileSize
new8.27 KB

OK, this is what webchick requested. We add user_delete() and user_delete_multiple(). These get called into by the cancel API when user_cancel_delete is the method. This gives us a nice API which doesn't do dsm() or watchdog(). It also matches how comment and node do deletion.

The patch mostly moves code from the hook_user_cancel to hook_user_delete. The code still gets called in all the same flows.

Batch API is really a separate issue. We want all this stuff to be batched, but we aren't there yet. This patch doesn't significantly help or hurt in that regard.

catch’s picture

This looks great to me, and will go a long way towards fleshing out the entity API to handle stuff other than load in D8. I couldn't find anything to mark CNW for, so looks RTBC to me once the test bot comes back clean.

Status: Needs review » Needs work

The last submitted patch, user_del.diff, failed testing.

moshe weitzman’s picture

Status: Needs work » Needs review
StatusFileSize
new10.03 KB

Fixed one problem. Having trouble debugging tests on my machine.

Status: Needs review » Needs work

The last submitted patch, user_del.diff, failed testing.

moshe weitzman’s picture

Status: Needs work » Needs review
StatusFileSize
new10.02 KB

One more fix. Too much paste.

sun’s picture

-1 This totally breaks the overall design of the account cancellation.

I thought we discussed a $verbose flag in IRC?

Let's do that instead, please.

moshe weitzman’s picture

Please elaborate on what exactly is objectionable. The proposed solution is careful to preserve the whole cancel API. I tried hard to not "break the design". We just ask modules to implement hook_user_delete instead of a method on cancel hook.

Status: Needs review » Needs work

The last submitted patch, user_del.diff, failed testing.

sun’s picture

+++ modules/user/user.module
@@ -2132,8 +2132,11 @@ function user_cancel($edit, $uid, $method) {
-  // Allow modules to add further sets to this batch.
-  module_invoke_all('user_cancel', $edit, $account, $method);
+  // Modules use hook_user_delete() to respond to deletion.
+  if ($method != 'user_cancel_delete') {
+    // Allow modules to add further sets to this batch.
+    module_invoke_all('user_cancel', $edit, $account, $method);
+  }

Most importantly this part.

There can be a "better_user_delete" or whatever account cancellation method.

The "verbose" problem also exists for all other methods, so I don't see why we should special-case a single one.

+++ modules/user/user.module
@@ -2209,6 +2204,45 @@ function _user_cancel($edit, $account, $method) {
+function user_delete_multiple($uids) {
+  if (!empty($uids)) {
+    $accounts = user_load_multiple($uids, array());
+
+    db_delete('users')
+      ->condition('uid', $uids, 'IN')
+      ->execute();
+    db_delete('users_roles')
+      ->condition('uid', $uids, 'IN')
+      ->execute();
+    db_delete('authmap')
+      ->condition('uid', $uids, 'IN')
+      ->execute();
+
+    foreach ($accounts as $uid => $account) {
+      module_invoke_all('user_delete', $account);
+      field_attach_delete('user', $account);
+    }
+
+    entity_get_controller('user')->resetCache();
+  }
+}

Furthermore, not using batch API for this is a major regression. It means that you cannot cancel the accounts of hundreds of users, where each one potentially has thousands of nodes, comments, private messages, profile data, and whatever else.

The problem is the verbosity, so we should fix the verbosity for programmatic executions.

moshe weitzman’s picture

Status: Needs work » Needs review

Thanks for the reply. You make two good points, but ultimately they don't hold water for me.

1. better_delete cancellation method is a poor straw man. We already have a way for modules to do better_delete and thats hook_user_delete. They can hook in as needed. If they are not happy with that for some reason, they can create a new cancellation method and that method can choose to do module_invoke_all('user_delete'). This patch is a refactot. It does not take away any functionality that is present today.

2. Yes, the deletes to the user tables themselves are no longer batched. You call that a regression. I call it an improvement. Nothing else during deletion is batched. Not poll, trigger, statistics, comments, nodes, ... Its a long list. If we are to use batching, lets do it for all hooks. I think it is incorrect to hold up a nice patch because we want to preserve our current, crappy inconsistent state.

catch’s picture

Agreed with Moshe. We need proper, encapsulated, API functions for CRUD, for all entities in core, and we currently don't have one for users. Splitting delete out makes a tonne of sense, and means that when we add batch support for comments, nodes etc. (there's a patch in progress), we'll be able to apply the same pattern to users and everything else too.

sun’s picture

We also need a proper user account cancellation API. Getting that sucker right required a horrible lot of time.

Removing batch API from one of core's most important account cancellation methods is a major regression.

Furthermore, introducing user_delete() next to user_cancel() will cause lots of confusion for any other developer out there.

user_cancel() was designed to be the central account cancellation entry point. We neither need the confusion, nor do we need to remove existing functionality.

All of this doesn't hold water. You want to remove verbosity, then remove the verbosity. No major API change required for that.

catch’s picture

node_delete(), taxonomy_term_delete(). comment_delete(), user_cancel(), spot the odd one out, and the point of confusion.

During development, you often just want to delete stuff, same as you often just want to generate stuff, we need consistent ways to do that. Not having those is a regression from D6. Three delete queries not being in a batch is neither a regression, nor a major issue in itself.

Also I currently see no use case for batch API on user delete. There's no batch here at all: http://api.drupal.org/api/function/node_user_cancel/7, nor here http://api.drupal.org/api/function]/comment_user_cancel/7 - nor should there be, it should be in node_delete_multiple() and comment_delete_multiple(), not buried in a hook implementation - like #89181: Use queue API for node and comment, user, node multiple deletes. Other modules attaching data to users can also implement their own _delete_multiple(), batched, following core's example. Or if they're storing information in a single table, then a DELETE FROM {table} WHERE uid = :uid doesn't need to be batched anyway.

sun’s picture

That's why chx insisted on me creating the following issues to get #8: Let users cancel their accounts done:

#355905: node_delete_multiple() has to use batch API
#355926: Add comment_mass_update()

Both critical bug reports.

Both stating the problem that *we need* to implement batch API support for those delete operations. Because the new account cancellation process simply requires it.

Arbitrary people marked those issues as duplicates of other issues. Sorry for, me, stupid, trusting that they'd know what they are doing, and not re-opening those.

You simply cannot load thousands of nodes via http://api.drupal.org/api/function/node_delete_multiple/7

and on top of those nodes, you cannot simply load thousands of comments via http://api.drupal.org/api/function/comment_delete_multiple/7

NOT within the same request. Period.

catch’s picture

Then in node_delete_multiple(), it should chunk the $nids array, and do the loading inside the batch too.

sun’s picture

As far as this issue is concerned, no, then we just need to add a $verbose flag to support arbitrary programmatic user account cancellations.

That's what we discussed. Running in circles.

sun’s picture

Title: user_cancel_delete method calls into a "standard" user_delete_multiple API » user_cancel_delete method too verbose and clearing cache too often
Issue tags: +API change

Better title, up to the point. Since adding a $verbose or $interactive flag in any possible way will lead to an API change, tagging accordingly.

Possible ways:

1) Re-using existing arguments, i.e. $account->verbose or $edit['interactive']

2) Adding a new argument, passed along to all implementations and operations

moshe weitzman’s picture

Title: user_cancel_delete method too verbose and clearing cache too often » user_cancel_delete method calls into a "standard" user_delete_multiple API
StatusFileSize
new10.95 KB

@sun - it is pretty clear that opinion differs from those stated by webchich, catch, and myself. please be gentle in your issue queue proceedings. i think your recent retitle here is a forceful "takeover" of this issue and i'm not amused.

i don't have anything to add beyond the compelling justification in #16, #17, #19.

this issue is another case where a core committer is going to have to decide how we proceed (once the tests are passing).

catch’s picture

Status: Needs review » Needs work

There's no need for

+  module_load_include('inc', 'comment', 'comment.admin');

Because that code is in comment.module, where it should be. Not invented here, but should be removed anyway. Otherwise looks RTBC.

webchick’s picture

In my mind, canceling and deleting users are two totally different workflows. It just so happens that the canceling workflow will call the delete workflow (under certain conditions).

It makes a lot of sense to me for core to only expose canceling operations in the UI and not deleting. But what we're talking about here are API functions for programmers -- your Migrate, Devel, etc. type modules. And to preserve DX I feel that it is critical that all of these functions behave the same.

Shorter version: I agree with Moshe and catch.

webchick’s picture

And on the note of the batching stuff, it's nice that user_cancel introduced some batching stuff, but the arguments about it being unlike anything else in core are right on. I would prefer to do this whole-sale throughout all of our delete/cancel mechanisms in D8 than have a weird out-lier. AFAIK that was always the plan, it just didn't actually get done this release.

moshe weitzman’s picture

Status: Needs work » Needs review
StatusFileSize
new10.89 KB

without that line

catch’s picture

Status: Needs review » Reviewed & tested by the community

Bot was happy with the last one, line removed was a no-op. RTBC.

sun’s picture

Status: Reviewed & tested by the community » Needs review

How many comments will you delete, at max, when deleting a single node?

How many nodes and/or comments will you delete, at max, when deleting a nodes?

Now multiply this with the scope of a user. Wanna delete my account? Go ahead. But be prepared to expect fatal error while deleting +15,000 comments.

Yes, I can see the "API consistency" perspective. But that's all.

This proposed change is a fundamental regression to the user account cancellation process we introduced for D7. It's not only killing the entire concept behind it. It is also a huge slap in my face.

Why? Because this means that

a) we did not completely read and understand node/8

b) the documentation is incomplete

c) instead of making it work as it should, we rather completely break it

And why? Because of some silly watchdog() and cache_clear_all() invocations.

When we discussed this in IRC, I already stated that this kind of change won't fly. At all.

If it will nevertheless, then please change CHANGELOG.txt accordingly. The user account cancellation makes no sense at all with this hack. It won't complete. Not for my account.

Try it.

catch’s picture

Status: Needs review » Reviewed & tested by the community

sun, please explain how the current code in HEAD stops 15,000 comments being deleted at once. Only then mark this back from RTBC.

sun’s picture

So what? This patch just exchanges the position where we will *need* to implement batch API support, nothing else.

A major API change for: nothing.

sun’s picture

Status: Reviewed & tested by the community » Needs work

And if you want a review, you get one.

+++ modules/poll/poll.module
@@ -923,13 +923,16 @@ function poll_user_cancel($edit, $account, $method) {
+  db_delete('poll_vote')
         ->condition('uid', $account->uid)
         ->execute();

Wrong indentation.

+++ modules/profile/profile.module
@@ -233,15 +233,22 @@ function profile_user_insert(&$edit, $account, $category) {
+  db_delete('profile_value')
+  ->condition('uid', $account->uid)
+  ->execute();

Ditto.

+++ modules/user/user.module
@@ -2209,6 +2204,45 @@ function _user_cancel($edit, $account, $method) {
+    foreach ($accounts as $uid => $account) {
+      module_invoke_all('user_delete', $account);
+      field_attach_delete('user', $account);
+    }

Why isn't this a hook implementation?

Powered by Dreditor.

catch’s picture

It's a fix to an API regression from D6, not an API change as such. It's also transparent to any modules which are calling user_cancel()

field_attach_*() are currently never called from hook implementations in field.module. You could ask the same for every other field_attach_*() (and I have). I would like entity API to either call them, or have field API implement entity level hooks, in D8, not for here though.

sun’s picture

It's also transparent to any modules which are calling user_cancel()

It's not transparent, because expensive operations are ought to be done in batch operations during account cancellation. That's the whole point for why those two issues I mentioned earlier had to be critical.

Please go ahead with this direction. But make sure you re-open node/8 to roll it back. Thanks.

Alternatively, try to kill my account with user_delete(). Fatal error. Whatever direction.

moshe weitzman’s picture

Status: Needs work » Needs review
StatusFileSize
new11.02 KB

Fixed indentation.

The field_attach_delete() call is not a hook just like whole field_attach API. see node_delete_multiple().

AFAICT, the current user cancel code does not delete any of the user's field information. This is a big bug fix as well.

moshe weitzman’s picture

StatusFileSize
new11.02 KB

this one calls hook_user_delete before deleting from user tables as per existing code and #218189: Invoke hook_user before deleting user from database

catch’s picture

Status: Needs review » Reviewed & tested by the community

Back to RTBC. Sun has not explained how this breaks the user cancel API at all, just stated repeatedly that it does, so there's nothing to respond to there until that happens, other reviews have been dealt with, batch has nothing to do with this issue, this also now fixes a critical bug.

sun’s picture

I did. But do whatever you want. This will break. Heavily.

webchick’s picture

Status: Reviewed & tested by the community » Needs review

Yeah, I don't know. To me this code just makes a heck of a lot more sense, and brings consistency back to our APIs that we had in D6.

I would also argue that 99% of the time, when you delete a user it's because they're a spamming douche, and you catch them within the first couple hours (and within the first post or two) of their nefariousness. At least this is the case for Drupal.org, which is way more popular than any of my sites. ;) The normal behaviour for a long-time user is to simply ban their account.

So this "delete user with thousands and thousands of posts/comments" case feels extremely edge-casey to me, and I believe that was the entire point of adding hook_user_cancel_methods_alter(). So that a contrib module could add something like a "Delete user's content in batch" cancellation method.

However, don't we need those op 'user_cancel_delete' hunks in foo_user_cancel()? I don't want to *remove* the ability to delete users' content from the cancellation API; that's incredibly handy. I just want to restore the sanity back to our user deletion DX.

sun’s picture

So this "delete user with thousands and thousands of posts/comments" case feels extremely edge-casey to me, and I believe that was the entire point of adding hook_user_cancel_methods_alter(). So that a contrib module could add something like a "Delete user's content in batch" cancellation method.

No.

Apparently my follow-ups are not read. If no one reads them, why am I commenting here.

moshe weitzman’s picture

Status: Needs review » Reviewed & tested by the community

@webchick - all the same cancel methods are still there. you can still cancel a user and delete all his content. thats what you could do in the case of a spammer. the only substantial change here is as the title describes: the cancel API calls into the programmatic user_delete() API when the method calls for deletion.

back to rtbc as i don't see any code changes needed.

webchick’s picture

Status: Reviewed & tested by the community » Needs review

*sigh* Apparently no one reads mine, either.

Once again: 99.9% of the time, you are deleting users who are < 2 hours old. They might have nodes, but no one's commented on them. Other than to say that they're spam. They might have comments, but no one's commented on them. Other than to say they are spam.

This experience comes from over 4 years of having administration access on one of the top 1000 sites on the Internet. I have never once brought Drupal.org down to its knees deleting nodes/comments, since well before we even thought of the very concept of a batch API.

Yes, I absolutely concede, if we delete the "sun" account, or the "webchick" account, we would absolutely get a fatal error. In the real world this does not happen. We set $user->status = 0. Done.

Now, for ultimate 100% uber COPPA compliance, you might indeed give your users, even the long-standing ones who hold within their content the very essence of your entire community, the ability to delete themselves and take their entire history of content with them. And if you want to do that, that's great. But you are an edge case. And therefore, you can use a contrib module to satisfy your edge case.

Now can you please tell me where my reading comprehension has failed me?

webchick’s picture

Status: Needs review » Reviewed & tested by the community
sun’s picture

Status: Reviewed & tested by the community » Needs work

The reasoning stated in #43 has nothing to do with the functionality that's proposed in this patch.

+++ modules/user/user.module
+function user_delete_multiple($uids) {

Type hinting?

+++ modules/user/user.module
@@ -3188,13 +3222,13 @@ function user_node_load($nodes, $types) {
   // Fetch name, picture, and data for these users.
-  $user_fields = db_query("SELECT uid, name, picture, data FROM {users} WHERE uid IN (:uids)", array(':uids' => $uids))->fetchAllAssoc('uid');
-
-  // Add these values back into the node objects.
-  foreach ($uids as $nid => $uid) {
-    $nodes[$nid]->name = $user_fields[$uid]->name;
-    $nodes[$nid]->picture = $user_fields[$uid]->picture;
-    $nodes[$nid]->data = $user_fields[$uid]->data;
+  if ($user_fields = db_query("SELECT uid, name, picture, data FROM {users} WHERE uid IN (:uids)", array(':uids' => $uids))->fetchAllAssoc('uid')) {
+    // Add these values back into the node objects.
+    foreach ($uids as $nid => $uid) {
+      $nodes[$nid]->name = $user_fields[$uid]->name;
+      $nodes[$nid]->picture = $user_fields[$uid]->picture;
+      $nodes[$nid]->data = $user_fields[$uid]->data;
+    }
   }

This needs to be reverted. This situation *must not* exist, since it would break each and everything else, regardless of silently failing when trying to attach to nodes.

...which is also one of the reasons why it doesn't make sense at all to introduce separate functionality that cannot work reliably.

140 critical left. Go review some!

moshe weitzman’s picture

Status: Needs work » Needs review
StatusFileSize
new10.1 KB

Attached patch adds type hinting for user_delete_multiple() and removes that hunk in user_page_view().

sun’s picture

Thanks! I'm still opposed to this patch, but who am I...

+++ modules/node/node.module
@@ -1716,27 +1716,30 @@ function node_user_cancel($edit, $account, $method) {
+function node_user_delete($account) {
...
+  // Clean history.
+  db_delete('history')
+    ->condition('uid', $account->uid)
+      ->execute();

Can we fix this indentation while moving it around?

RTBC afterwards.

Powered by Dreditor.

moshe weitzman’s picture

StatusFileSize
new10.09 KB

Fixed indentation

sun’s picture

Status: Needs review » Reviewed & tested by the community

I still have to wonder how you are going to implement batch API support for user_delete() and user_delete_multiple().

But yeah, we can add another critical issue to the list when committing this.

chx’s picture

I would like to see this in too. The patch now apparently has the support of sun, moshe, catch and myself . What else is needed to get it in?

mikeryan’s picture

+1

dries’s picture

Status: Reviewed & tested by the community » Fixed

Committed to CVS HEAD.

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

gábor hojtsy’s picture

Status: Closed (fixed) » Needs work

Update docs were not fixed. http://drupal.org/update/modules/6/7#user_cancel still lists user_cancel_delete as the way to go. This should be fixed in the update docs.

moshe weitzman’s picture

Status: Needs work » Fixed

I just updated it.

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

sun’s picture

Status: Closed (fixed) » Needs review
Issue tags: -API change +Needs documentation
StatusFileSize
new2.67 KB

Good lord. This patch really complicated things a lot. Had to study the actual code for an hour in order to figure out what's actually going on now.

Can someone confirm that the docs in attached patch are correct?

moshe weitzman’s picture

Status: Needs review » Reviewed & tested by the community

Looks good.

webchick’s picture

Status: Reviewed & tested by the community » Fixed
Issue tags: +API change

Committed to HEAD. Thanks!

Restoring API change tag for historical purposes, but Randy please ignore. The actual change got committed many moons ago, this was just docs.

Status: Fixed » Closed (fixed)
Issue tags: -Needs documentation, -API change

Automatically closed -- issue fixed for 2 weeks with no activity.