Problem/Motivation

  • Misuse of the method DrupalTestBase::randomString() is frequently responsible for random, hard-to-debug test failures, because it generates a list of random characters that includes special characters like #, *, /, \, ?, etc.
  • Developers do not expect this from the name.
  • #1672764: Improve documentation of randomString() and randomName() has been committed to clarify when each of these methods should and should not be used, but the names are still not necessarily intiuitive and developers may use them without having read the documentation.

Proposed resolution

  • More explicit method names would prompt developers to read the documentation and/or consider the purpose of their randomized strings more carefully.
  • Rename randomName() to randomMachineName().
  • Potentially rename randomString() to randomUnsanitizedString() .

Remaining tasks

  • Need consensus on whether to rename one or both of these methods, and what the names should be,
  • @sun and @djdevin have pointed out that the name randomString() really is exactly accurate; "a string is unsanitized by design".
  • @sun also pointed out that the name randomUnsafeString() would dissuade developers from using this method when they should (whenever unsanitized input is possible).
  • Any renames will need a change notice.

Related issues

API changes

TBD

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

xjm’s picture

Instances of randomString():


core/modules/block/lib/Drupal/block/Tests/BlockLanguageTest.php:    $info_name = $this->randomString(10);
core/modules/block/lib/Drupal/block/Tests/BlockLanguageTest.php:    $info_name = $this->randomString(10);
core/modules/block/lib/Drupal/block/Tests/BlockLanguageTest.php:    $info_name = $this->randomString(10);
core/modules/comment/lib/Drupal/comment/Tests/CommentContentRebuildTest.php:    $comment_loaded->content['test_property'] = array('#value' => $this->randomString());
core/modules/contact/lib/Drupal/contact/Tests/ContactSitewideTest.php:    $this->submitContact($this->randomName(16), $email, $subject, 2, $this->randomString(128));
core/modules/contact/lib/Drupal/contact/Tests/ContactSitewideTest.php:    $this->submitContact($this->randomName(16), $email, $this->randomString(64), 3, $this->randomString(128));
core/modules/contact/lib/Drupal/contact/Tests/ContactSitewideTest.php:    $this->submitContact($this->randomName(16), $email, $this->randomString(64), 4, $this->randomString(128));
core/modules/entity/lib/Drupal/entity/Tests/EntityTranslationTest.php:        'name' => $this->randomString(),
core/modules/field/lib/Drupal/field/Tests/TranslationTest.php:        'name' => $this->randomString(),
core/modules/menu/lib/Drupal/menu/Tests/MenuNodeTest.php:      "body[$langcode][0][value]" => $this->randomString(),
core/modules/node/lib/Drupal/node/Tests/NodeBuildContentTest.php:    $node->content['test_content_property'] = array('#value' => $this->randomString());
core/modules/poll/lib/Drupal/poll/Tests/PollTestBase.php:    $edit['choice[new:0][chvotes]'] = $this->randomString(7);
core/modules/simpletest/lib/Drupal/simpletest/TestBase.php:  public static function randomString($length = 8) {
core/modules/simpletest/lib/Drupal/simpletest/Tests/MailCaptureTest.php:    $subject = $this->randomString(64);
core/modules/simpletest/lib/Drupal/simpletest/Tests/MailCaptureTest.php:    $body = $this->randomString(128);
core/modules/simpletest/lib/Drupal/simpletest/Tests/MailCaptureTest.php:        'subject' => $this->randomString(64),
core/modules/simpletest/lib/Drupal/simpletest/Tests/MailCaptureTest.php:        'body' => $this->randomString(512),
core/modules/simpletest/lib/Drupal/simpletest/WebTestBase.php:      $name = $this->randomString(8);
core/modules/system/lib/Drupal/system/Tests/Form/FormTest.php:      'textfield' => $this->randomString(),
core/modules/system/lib/Drupal/system/Tests/Pager/PagerTest.php:      watchdog('pager_test', $this->randomString(), NULL, WATCHDOG_DEBUG);
core/modules/user/lib/Drupal/user/Tests/UserCreateTest.php:        'pass[pass1]' => $pass = $this->randomString(),
core/modules/user/lib/Drupal/user/Tests/UserRolesAssignmentTest.php:      'pass[pass1]' => $pass = $this->randomString(),
donutdan4114’s picture

I would argue that randomString() should remain the same, based on the name, which clearly states random. I don't think there is confusion there.

However, a 'safe string' variant would be nice as well, so what about randomSafeString().

Also, you could make it a parameter public static function randomString($length = 8, $safe = FALSE) {, and then tweak the function to not include special characters if $safe == TRUE.

xjm’s picture

Status: Active » Needs review
FileSize
15.34 KB

First pass.

[lorentz:drupal | Tue 10:25:05] $ grep -r "randomString" *
[lorentz:drupal | Tue 10:25:12] $ 

Status: Needs review » Needs work

The last submitted patch, random-unsafe-string-1659682-2.patch, failed testing.

xjm’s picture

+++ b/core/modules/simpletest/lib/Drupal/simpletest/TestBase.phpundefined
@@ -805,6 +818,25 @@ abstract class TestBase {
+  public static function randomName($length = 8) {
+    return $this->randomSafeString($length);
+  }

Oh right, can't do this because it's static.

It might just be better to rename rather than providing the wrapper anyway.

xjm’s picture

Status: Needs work » Needs review
FileSize
324.24 KB

Replacing both randomString() and randomName(). This is a way bigger patch. ;)

xjm’s picture

@@ -763,16 +763,23 @@ abstract class TestBase {
   /**
    * Generates a random string of ASCII characters of codes 32 to 126.
    *
-   * The generated string includes alpha-numeric characters and common misc
-   * characters. Use this method when testing general input where the content
-   * is not restricted.
+   * The generated string includes alpha-numeric characters and common
+   * miscellaneous characters. Use this method when testing general input
+   * where the content is not restricted.
+   *
+   * Do not use this method when special characters are not allowed (e.g., in
+   * machine or file names); instead, use
+   * Drupal\simpletest\TestBase::randomSafeString().
    *
    * @param $length
    *   Length of random string to generate.
+   *
    * @return
    *   Randomly generated string.
+   *
+   * @see Drupal\simpletest\TestBase::randomSafeString()
    */
-  public static function randomString($length = 8) {
+  public static function randomUnsafeString($length = 8) {
     $str = '';
     for ($i = 0; $i < $length; $i++) {
       $str .= chr(mt_rand(32, 126));

@@ -789,12 +796,18 @@ abstract class TestBase {
    * require machine readable values (i.e. without spaces and non-standard
    * characters) this method is best.
    *
+   * Do not use this method when testing unvalidated user input. Instead, use
+   * Drupal\simpletest\TestBase::randomUnsafeString().
+   *
    * @param $length
    *   Length of random string to generate.
+   *
    * @return
    *   Randomly generated string.
+   *
+   * @see Drupal\simpletest\TestBase::randomUnsafeString()
    */
-  public static function randomName($length = 8) {
+  public static function randomSafeString($length = 8) {
     $values = array_merge(range(65, 90), range(97, 122), range(48, 57));
     $max = count($values) - 1;
     $str = chr(mt_rand(97, 122));

For reviewers, this is the actual API change. The rest is string replace.

sun’s picture

ugh, I couldn't disagree more.

The method names randomString() and randomName() are technically correct. The actual problem are:

1) People do not read their documentation.

2) People do not properly account for character escaping in their tests.

It is not randomString() that is causing test failures. It is poorly written test and assertion code that is causing the test failures.

Tests absolutely must take string escaping/sanitization into account. Renaming the methods in the way proposed will only lead to developers using the wrong "safe" function everywhere.

This won't fix for me.

andypost’s picture

The method names randomString() and randomName() are technically correct.

This correct only for some developers, so let's make it more saner for others

xjm’s picture

@sun is correct in #8 that the problem is carbon- and not silicon-based. However:

  1. Giving developers more obvious method names is good DX. In a perfect world devs would always consider the implications of using any random character vs. alphanumeric. However, people are only human, and experience shows us that patches with misuse of randomString() slip past reviewers all the time and break QA all the time. Adding the word "unsafe" improves DX by making it abundantly clear that they need to take something else into consideration.

  2. The current method documentation is sadly lacking, so it's not just people being stupid. I'd like to backport the docs improvements I have here.

I'm open to only renaming randomString(), but I think a docs fix by itself would not remove this headache.

djdevin’s picture

I agree with sun in #8, String is inherently unsafe in any language (outside of Drupal, outside of PHP), and that shouldn't have to be stated in the function name.

Also, I don't think it's possible to define "unsafe" or "safe" without context (markup, mysql...). Renaming random* to either of these means that there is a chance that the function will not do what it says it will do.

I would only agree with randomPossiblyUnsafeString() and randomProbablySafeString() :p

sun’s picture

Issue tags: +API change

Thinking about this some more, I personally could only accept a name change for randomName() [if that helps in any way].

The problem is that people should use randomString() wherever appropriate. And that means almost everywhere.

Each of those random test failures means that character escaping/sanitization was not considered at all in a test. But developers have to write tests that properly take security into account.

The only exception to randomString() are places in which a machine name input field or the like needs to be populated, which has a limited range of allowed characters. That is the only reason why randomName() exists.

I strongly object to the original presumption that is made here. Developers (and also reviewers) need to use randomString(), and everyone has to think about security. randomString() is what you use, always. randomName() is the exception, and needs a valid reason for being used.

We can:

  1. Certainly improve the documentation for both.
  2. Consider to change the name of randomName() into randomLimitedName() or similar, so at to clarify for everyone that this is the exception, and whenever you do not see randomString(), there needs to be a reason for why a test does not care for security.

Lastly, I do not see how this can be backported - these functions are heavily used in tests and a rename is an API change.

xjm’s picture

Sorry, what I meant only was that the documentation fixes included in my patch can and should be backported. Not the renaming. I'll factor that out into its own issue.

xjm’s picture

merlinofchaos’s picture

While randomString() technically correct, it is not complete, so I disagree with sun's assertion. I believe that clarity in what that method is going to return is valuable to the developer, and if you want people to be testing for security you should be telling them explicitly when they're using unsecure strings vs secure strings and not leaving it up to the chance of them reading the documentation to see what kind of random string the random string is.

merlinofchaos’s picture

randomSanitizedString() and randomUnsanitizedString() would be very clear.

sun’s picture

That doesn't really change the fact that randomName() has nothing to do with randomSanitizedString(), because it does not sanitize the string in any way. :) randomName() could/should be renamed to randomMachineName() to resolve that confusion, since the function's clearly documented purpose is to create a string that is compound of characters that are supposed to be compatible with most of the machine name cases in Drupal.

As @djdevin mentioned in #11, a string is unsafe/unsanitized by design. Whether you need to escape/sanitize it always depends on the context you're using the string in.

We could add a separate randomASCIIString() or randomLatinString() helper method, but then again, there are so many possible different character class cases in the first place that it's going to be hard to explain why one case is privileged over others.

xjm’s picture

Title: Rename randomString() to randomUnsafeString() » Rename randomName() to randomMachineName() and possibly randomString() to randomUnsanitizedString()

randomName() could/should be renamed to randomMachineName() to resolve that confusion

Duh. Let's do this. :)

I'm still of the opinion that we should also rename randomString() to randomUnsanitizedString() as well. I think that name is more neutral than randomUnsafeString() and reminds the developer exactly what they're supposed to keep in mind: whether or not the string has been sanitized. #1672764: Improve documentation of randomString() and randomName() is in as well, so hopefully the renames will prompt people to read those docs and think about sanitization when they write their tests.

xjm’s picture

Issue summary: View changes

Updated issue summary.

xjm’s picture

Summary updated.

xjm’s picture

Issue summary: View changes

Updated issue summary.

xjm’s picture

Issue summary: View changes

Updated issue summary.

xjm’s picture

Issue summary: View changes

Updated issue summary.

xjm’s picture

Issue summary: View changes

Updated issue summary.

andypost’s picture

I think randomUnsafeString() is shorter and more sane then randomUnsanitizedString()
There's no sanitization just generation

sun’s picture

I will also admit and agree that randomName() is currently being (ab)used by many tests, which "just need to implant some value for some other module."

I almost guess that most of the confusion stems from that usage actually.

Random example:

  protected function drupalCreateUser(array $permissions = array()) {
...
    $edit['name']   = $this->randomName();
    $edit['mail']   = $edit['name'] . '@example.com';
...
    $account = entity_create('user', $edit);

While this is a dedicated helper function, and dedicated helper functions could/should actually take the extra mile of doing things properly, it's really just one of many examples for code that does not want to care for the actual allowed and valid characters in usernames as well as e-mail addresses.

The essence is that all code like this is not testing the actual user entity, account, and registration functionality - instead, it just needs a new user account to work with. So the test code wrongly assumes it could simply get away with ignoring the possible string escaping/sanitization issues. (which especially in case of this generic helper function pretty much means that no module tests are taking possible special characters in usernames and e-mails into account...)

Now one could start to think about dedicated helper functions like randomAlphaString(), randomNumbers(), randomAlphaNumericString(), etc and lastly randomUnicodeString() [== randomString()].

In the end, however, use-case specific helper functions à la randomUserName() and randomEmail() would be the most appropriate thing to do.

Only such dedicated functions are able to generate random results that actively leverage built-in knowledge about the minimum and maximum bounds of possible values. For example, both of these are valid random examples:

a@à.ca
someone.who.dare.not.be.named+drupal@super.institute.united-kingdom.emea...

With our current, eight character limited alphanumeric machine name strings, no tests in core and contributed modules actually ensure that they are able to properly handle the real range of possible values - neither regarding data storage (schema), nor regarding secure string escaping/sanitization, nor in the user interface regarding possible layout breakage.

(All of this is closely related to #913086: Allow modules to provide default configuration for running tests, so thanks for bringing up this additional need.)

xjm’s picture

#21 is a much bigger scope than I'd originally thought but it sounds like a good direction, and we could add these helpers one at a time without difficulty. randomMachineName() is then actually the first specific helper. The bigger task would be converting existing tests so that core provides good examples (instead of using machine names all over the place to sidestep the question of prior validation by Drupal, the filesystem, or whatever).

Aside, our current randomString() is actually randomBasePrintableASCIIString(), I think. :)

xjm’s picture

Title: Rename randomName() to randomMachineName() and possibly randomString() to randomUnsanitizedString() » Consider renaming randomString() and/or adding additional, context-appropriate helpers for random string generation

I spun #1676910: Rename randomName() to randomMachineName() off into its own issue since it seems non-controversial and is easy to do by itself. Updating the summary.

xjm’s picture

Issue summary: View changes

Updated issue summary.

xjm’s picture

Also, it occurs to me that new helpers (and the conversion of existing tests to them) could be backportable.

xjm’s picture

Issue summary: View changes

Updated issue summary.

xjm’s picture