Support for Drupal 7 is ending on 5 January 2025—it’s time to migrate to Drupal 10! Learn about the many benefits of Drupal 10 and find migration tools in our resource center.
Hello,
What are the chances I get two generated users getting the same name, the same day on the same Drupal website ?
I first generated 1000 users, then, the second time, tried to generate 10000 users, on the same database, I got a MySQL duplicate entry error for users.name at around 3200 created users.
Should I go play to casino today, or is it bound to happen sometimes ?
I know that would impact performances, but my suggestion would be to check users names unicity, in the batch itself, and compared to existing users.
Example of code (needs review and improvement)
$existing_names = db_select('users')
->fields('users', array('name'))
->execute()->fetchAllKeyed(0, 0);
while (count($names) < $num) {
$names = array();
while (count($names) < $num) {
$names[] = devel_generate_word(mt_rand(6, 12));
}
$names = array_unique($names);
$names = array_diff($names, $existing_names);
}
Regards,
David
Comment | File | Size | Author |
---|---|---|---|
#18 | 0001-Issue-1447778-by-David-Stosik-marvil07-Fixed-Duplica.patch | 2.58 KB | marvil07 |
#9 | devel-unique_username-1447778-9.patch | 2.34 KB | David Stosik |
#7 | devel-unique_username-1447778-7.patch | 2.07 KB | David Stosik |
#3 | devel-unique_username-1447778-3.patch | 1.05 KB | David Stosik |
Comments
Comment #1
salvisI haven't really looked into this, but I would expect that the "MySQL duplicate entry error" is a DBTNG exception that we should be able to catch. Even developing and testing should be straight-forward: just pass a constant username and the error will pop up at user 3.
D8 first, please...
Comment #2
David Stosik CreditAttribution: David Stosik commentedSwitching to D8 then...
I did some test and in a 2000 generated users population, one or two are usually duplicates. The new generated users also generally contain name that are already used in my 3000 previously generated users...
I need to generate 20000 users so...
Catching DBTNG Exception sounds good, but what would you do then ? Generate another user name, and so on until you get a good one ?
Comment #3
David Stosik CreditAttribution: David Stosik commentedApplies to 7.x-1.x-dev and 8.x-1.x-dev.
Comment #4
salvisYes, catch the exception inside the loop and just continue looping without incrementing the count. Catch the specific type of the exception (hopefully a DuplicateKeyException or something like that), so that we don't go into an endless loop for some other unrecoverable error.
And hope that our string generator doesn't run out of distinct strings... Otherwise we'll have to fix that, too — no matter how we treat the collisions.
Comment #5
salvisSorry, crosspost...
Comment #6
salvisI've looked at the current code. It's unnecessarily complex. There's no point in pre-creating the $names array.
Replace
with
Then you can just
Also, move the module_load_include() out of the loop.
Please check what happens if we keep getting the same error over and over again. We may or may not need to check for that case.
Comment #7
David Stosik CreditAttribution: David Stosik commentedI was just fixing the error, aiming to as few code lines as possible. :)
I once read that a provided patch should not "fix" other things than the problem it's dealing with. ("Keeping things organized" in the general patch guidelines, in http://drupal.org/node/707484).
Anyway, I guess attached patch is better.
Regards,
David
Comment #8
salvisI generally agree. But if the structure is bad to begin with and forces you to add more quirky code and further obfuscate the meaning of the whole thing, then we have to give priority to the quality of the code over minimizing the patch.
If you like to take the extra step, I'll gladly accept two separate patches, one to refactor the code and a second one with the actual fix.
Some minor comments:
Please move this to the end of the while block so that the control flow is easier to grasp.
Comments must be wrapped at col 80.
There's no need for an else here — we want to re-throw unconditionally at the end of the catch block. Also, throw is not a function (it should not look like a function call), and there's no need for parentheses.
Comment #9
David Stosik CreditAttribution: David Stosik commentedTook your comments into account.
Comment #10
David Stosik CreditAttribution: David Stosik commentedComment #11
salvisThat looks good to me, thank you.
Now we need someone to test this... A PostgreSQL test would be nice, too...
From #6:
Have you tried to generate your 20000 users (#2)?
Comment #12
David Stosik CreditAttribution: David Stosik commented16000 users, no problem. :)
I created them with custom code implying Batch API (which means many small passes).
Comment #13
salvisGreat.
What about the endless loop problem? What happens if you set
? Do we time out? Do we need to count consecutive exceptions and abort if we have no intervening successes? I'm not saying we need to do that, just wondering how it behaves if we don't. Will Batch API detect the timeout?
Comment #14
David Stosik CreditAttribution: David Stosik commentedGood question, but the problem is deeper. How many different names (N vowel/consonant combinations) can the algorythm provide?
Should the function abort if the passed number is greater?
Comment #15
salvisYeah, that's what I meant in #4. If the number of remaining names gets low, it takes progressively longer to find one, and the algorithm will slow down to a crawl long before they are completely exhausted.
Running out of names is just a special case. I'm more concerned with the same exception being thrown due to some other cause (maybe in a third-party contrib) that doesn't go away just because we try a different name.
If we handle the general case (or determine that it behaves reasonably well as it is), we don't need to worry about the special case.
Comment #16
David Stosik CreditAttribution: David Stosik commentedI'm no SQL / PDO expert, so I can't say which cases may throw the same exception.
By the way there's an alternative, consisting in getting a list of all user names before starting, and testing if the user name is in the list before creating it.
Comment #17
salvisThat's not the question. We can't know. A number of hooks are processed when saving a user, and any number of core and contrib modules may implement these hooks and do all sorts of things which have the potential to fail in a myriad of ways. Our change is to neutralize the PDOException 23000 no matter where it comes from. We can simulate an imaginary contrib persistently throwing PDOException 23000 (the worst case for our change) by passing the same username over and over again (causing core to exhibit the nasty behavior within the try block) and checking what happens. That's what I'm asking you to try out.
Comment #18
marvil07 CreditAttribution: marvil07 commentedI made a couple of changes:
entity_create()
instead ofuser_save()
. (following core)Comment #19
DrupalChimp CreditAttribution: DrupalChimp commentedError seems to occur if generated users already exist. e.g. If you generate 2000 users, then generate another 2000 users. The first generation seems to run fine. Second generation gets a 23000 Integrity constraint violation.
Here is the generated error
Comment #20
marvil07 CreditAttribution: marvil07 commented@James Linnell: Can you please confirm the error happens *after* applying the patch on comment 18?
Comment #21
DrupalChimp CreditAttribution: DrupalChimp commentedYes, it happens after applying the patch.
Comment #22
pcambraDevel generation has changed to be pluggable in #2154141: DevelGenerate as a D8 plugin type so this needs a re roll.
Comment #23
pcambraComment #24
willzyx CreditAttribution: willzyx commentedClosing for lack of activity. Feel free to reopen if the issue still exists