Problem/Motivation
In the SimpleTest core module, the WebTestBase method is generating binary-text files instead of text files, and it looks like text was the intention.
As it is, WebTestBase calls the function: simpletest_generate_file('text-' . $count++, 64, $line);
without a $type argument, so it creates the default file type, which is binary-text. Since the filename is prefixed by 'text-' it seems it should create text files.
Binary files are created earlier in the code by calling
simpletest_generate_file('binary-' . $count++, 64, $line, 'binary');
(Notice the $type argument of 'binary')
So it seems the text files should be created like this:
simpletest_generate_file('text-' . $count++, 64, $line, 'text');
(Notice the $type argument of 'text')
A patch is needed to alter line #484 of WebTestBase.php
So... We tried making this change, and found a bug in how text files are generated in simpletest_generate_file(). The function is documented to create N lines, each containing M characters. But for the type 'text', this was not working right -- it called the PHP wordwrap() function, so it was breaking on spaces instead of making uniform line lengths.
Proposed resolution
a) Fix the bug in generating 'text' files so it works as documented.
b) Make WebTestBase generate 'text' files and not 'binary-text' when it prefixes them with 'text-'.
Remaining tasks
Commit.
Note that if #2299361: description of drupalGetTestFiles() is vague and has a typo; simpletest_generate_file() needs help too goes in first, we should return here and fix the docs in this patch in accordance with the patch here (which changes the type of files to text instead of binary-text).
User interface changes
None.
API changes
The type and size of files generated by WebTestBase and simpletest.module will be updated to match the implications of their documentation.
Data model changes
None.
Beta phase evaluation
Issue category | Bug because these two functions are not working as documented. |
---|---|
Issue priority | Normal - these functions are only used in a few tests. |
Unfrozen changes | Unfrozen because it only changes test-related code. |
Prioritized changes | This is a bug fix, and bugs are always prioritized. |
Disruption | None, unless someone is relying on the buggy current behavior of these functions. |
Comment | File | Size | Author |
---|---|---|---|
#58 | interdiff.txt | 447 bytes | David_Rothstein |
#58 | simpletest-2512210-58.patch | 2 KB | David_Rothstein |
#57 | simpletest-2512210-57.patch | 2.02 KB | David_Rothstein |
#54 | simpletest-2512210-54.patch | 2.06 KB | naveenvalecha |
#54 | interdiff-2512210-52-54.txt | 1.45 KB | naveenvalecha |
Comments
Comment #1
trgreen17 CreditAttribution: trgreen17 at FFW commentedBeginning work on the patch now.
Comment #2
trgreen17 CreditAttribution: trgreen17 at FFW commentedComment #3
trgreen17 CreditAttribution: trgreen17 at FFW commentedFirst patch, adding $type argument 'text'
Comment #4
trgreen17 CreditAttribution: trgreen17 at FFW commentedNeeds review, test request sent, awaiting results.
Comment #6
trgreen17 CreditAttribution: trgreen17 at FFW commentedOops, I think one of the text files is too large now that it is ASCII and not binary. This line:
should probably have the '20480' removed to make a smaller file size. I'll try another patch.
NOTES:
1. If this fixes the issue I should get it approved since it changes the behavior of SimpleTest.
2. The documentation will need to be changed in the related ticket to reflect this modification.
Trying another patch.
Comment #7
trgreen17 CreditAttribution: trgreen17 at FFW commentedNew patch with one less text file generated, hoping it solves the failure above.
Comment #9
trgreen17 CreditAttribution: trgreen17 at FFW commentedHmm, failed with same errors, reverting code. I will now look at lines 511 to 513 in simpletest.module. Since 'text' was never passed the switch never ran these lines:
So I suspect there's something there causing the errors. I will modify this code and try to test locally.
Comment #10
trgreen17 CreditAttribution: trgreen17 at FFW commentedNope, that code above seems fine. Checking one more time only calling simpletest_generate_file() once:
Uploading new patch for testing.
Comment #11
trgreen17 CreditAttribution: trgreen17 at FFW commentedNeeds review.
Comment #13
trgreen17 CreditAttribution: trgreen17 at FFW commentedTrying to see if all tests pass without my patch. (Ignore this patch.)
Comment #14
jhodgdontgreen: interesting test failures. So that patch in #3 (which looks correct to me), is failing in:
FileFieldTestBase->getTestFile('text', 131072) in FileFieldTestBase.php 49
So it looks like without this patch, even though WebTestBase::drupalGetTestFiles() said it was generating "text" files of these sizes: array(16, 256, 1024, 2048, 20480) -- it was apparently generating one of size 131072. And with this patch that changes it from generating a sequence of "0"/"1" characters to generating random ASCII characters with codes in the range of 32 to 126, the file size is changing.
Ah... So the problem is that to break the file into lines, it calls wordwrap(). So if some character that wordwrap() recognizes as "a good place to break the line" occurs in those random characters, the file will have a variable number of breaks in it, whereas with all 0/1 characters, it just breaks every 64 characters.
So if you call the function with 'text', you get variable size files.
Hm...
It seems like our options are:
a) Leave the function behaving as it was (calling binary-text) -- I think on the other issue we at least fixed the documentation to say this was happening.
b) Use something other than wordwrap() to break the files into lines at exactly the passed-in number of characters.
I vote for (b), since the function is documented to have an exact number of chraracters per line.
So really what it should be doing I think is something like this:
and skip the call to wordwrap, which isn't really the right thing to do at all for 'text' files.
Thoughts?
I'm also going to suggest we fix the docs on the other issue because the file sizes are not as documented.
Comment #15
trgreen17 CreditAttribution: trgreen17 at FFW commentedHi jhodgdon, excellent sleuthing. Yes, I expected patch #3 to work as well, and was confused until I found the required size of 131072 in FileFieldValidateTest.php, but didn't find the wordwrap issue. I knew it was making files of various sizes (which is causing the failure) but didn't realize what was causing it. Thanks!
I vote for "b" as well. We should fix it while we're here! I'll work on that today. And agreed about fixing the other issue re documentation, but I'm thinking of waiting until we get this sorted to do that, so we only have to do it once, okay? Thanks again.
Comment #16
trgreen17 CreditAttribution: trgreen17 at FFW commentedTrying new patch with suggestions above.
Comment #18
trgreen17 CreditAttribution: trgreen17 at FFW commentedYikes, 46 fails. Guess I better take another look at this.
Comment #19
trgreen17 CreditAttribution: trgreen17 at FFW commentedTrying another one, replacing '\n' with "\n" in simpletest.module.
Comment #21
trgreen17 CreditAttribution: trgreen17 at FFW commentedUgh... 50 fails this time. There must be an issue with my logic, or some
dumbsimple mistake I'm making. I have plans to look this over with a colleague first thing in the morning. He's a more advanced programmer than I am and will likely catch something right away.Comment #22
jhodgdonLet's see.
The errors in the test log are things like this:
So it looks like your patch is making line 517 of simpletest.module run out of memory.
Ah... So I think you just need to get rid of the for(i = 0 to $size) loop. Looping over $lines and $width is enough. You are generating N-squared bytes instead of N bytes.
Comment #23
trgreen17 CreditAttribution: trgreen17 at FFW commentedGoos catch, I knew it was something in my logic. But don't we need to do it '$width * $lines - $lines' times? To account for the new line character taking up memory in the resulting file? If not, then do we need the '$size' calculation at all?
I will try what you said while awaiting your response... I'm sure you're right. :-)
Thanks!
Comment #24
trgreen17 CreditAttribution: trgreen17 at FFW commentedQuick test, I'll clean up the code in a subsequent patch if this works.
Comment #25
jhodgdonI do not think we need $size at all. See, the idea is that the outer loop on $lines will generate each line, and within that, the inner loop generates $width characters. That is what the function is actually documented to do. The previous behavior was to generate a bunch of characters ($size of them) in one big line and then use wordwrap() to insert \n characters.
So I guess that the inner loop should only generate $width - 1 characters for each line, to leave room for the \n character. That is what that size calculation previously did.
Right?
So if you leave in the $size loop, then you're generating each line and each character within each line $size times again, so you end up with N-squared or a very large number of characters, hence the memory problem.
Comment #26
trgreen17 CreditAttribution: trgreen17 at FFW commentedYep, you're exactly right! I was wondering about using '$width - 1;' to account for the new lines. Shoulda gone with my hunch!
I just hadn't realized that wordwrap() ran after the fact. PHP.net here I come. My current patch should not work then. I'll post another with '$width - 1;' in the inner loop.
Comment #27
trgreen17 CreditAttribution: trgreen17 at FFW commentedNew patch with '$width - 1;' in inner loop.
Cleaned up the code too. Fingers crossed.
Comment #29
trgreen17 CreditAttribution: trgreen17 at FFW commentedYes! Thanks a million jhodgdon! I should have looked more carefully for the out of memory errors. I will go back and seek them out. Now, is this correct? I'll run some tests locally and see if I can tell...
Comment #30
jhodgdonThe test bot seems to like it. :) The code looks right to me too.
My only minor gripe would be this comment:
I never thought that "symmetrical file" in the previous code that you got rid of made any sense, and I still don't.
What I think would be useful instead, if you want to add some actually useful comments to the code... keeping in mind that ideally, code comments tell "why?" not "how?":
- It seems to me that the doc block tells the reader what the function is supposed to do (or at least it will once we fix it on that other issue), so I don't think we need to repeat that in the in-code comments.
- One thing we might want to comment on is the $width - 1 piece, something like "Generate $width - 1 characters to leave space for the "\n" character".
Honestly, I think the code is pretty clear with just that one comment in it, because if I see (for i = 1 to $lines) that immediately says "I'm looping over all the lines", and (for j = 1 to $width) says "I'm looping over the width of each line". So we don't need a comment saying "We're looping over lines and over the characters in each one" because the local variables have good names and the code is pretty simple.
Right?
Comment #31
trgreen17 CreditAttribution: trgreen17 at FFW commented"Why, not how." -- Yes I see your point, and I agree. I'll work on this documentation as well as the related issue's documentation later tonight or first thing in the morning, and we can close these two issues at once! Thanks again for all your help, it's much appreciated. I have learned so much!!
Comment #32
pfrenssenA binary file doesn't really have the concept of "lines", so it's a bit weird to see the newlines in there, but as this was already the case before it's fine I guess.
The solution looks good, but this still needs a test.
Comment #33
trgreen17 CreditAttribution: trgreen17 at FFW commentedThanks pfrenssen, I will do a final patch right now with the above code in it but improved documentation per dhodgdon.
Comment #34
jhodgdonprfenssen: In this case, the file has always had "lines" for binary files, yes. Agreed it is a bit odd, but it would probably break even more to change that at this point.
And regarding a test, this could be useful, but the existing test coverage actually found the problem with using wordwrap() for generating 'text' files. We have a lot of tests that are using these generated files... I am not entirely convinced we need another one added? It seems like the coverage is adequate. For the moment, removing the "needs tests" tag.
Comment #35
trgreen17 CreditAttribution: trgreen17 at FFW commentedHopefully the final patch, using jhodgdon's advice for documentation. Assuming this passes I will go to the related issue and roll a patch with updated documentation.
Comment #36
jhodgdonLooks great!
Nitpick: this line of comment needs to end in a .
Comment #37
trgreen17 CreditAttribution: trgreen17 at FFW commentedNo worries... I need to form good consistent and compliant habits. :-) Will do that now.
Comment #38
trgreen17 CreditAttribution: trgreen17 at FFW commentedUpdating documentation, adding a period for consistency and compliance.
Comment #39
jhodgdonThere we go, thanks!
Comment #40
trgreen17 CreditAttribution: trgreen17 at FFW commentedMy pleasure, and again, thank you so much for all your help and patience. I've learned so much!
Comment #41
jhodgdonUpdating summary and adding beta eval.
Note that if #2299361: description of drupalGetTestFiles() is vague and has a typo; simpletest_generate_file() needs help too goes in first, we should return here and fix the docs in this patch in accordance with the patch here (which changes the type of files to text instead of binary-text).
Comment #42
jhodgdonComment #43
liberatrSo glad we figured out the file size question. Happy to be working on core too.
Comment #44
trgreen17 CreditAttribution: trgreen17 at FFW commentedThanks again liberatr for all the help last week on the logic! I learned a lot. Adding you to attribution.
Comment #45
trgreen17 CreditAttribution: trgreen17 at FFW commentedliberatr should be added to commit message please.
Comment #46
jhodgdonAdding liberatr to commit message (helped trgreen with programming in person even though did not add a patch directly).
Comment #47
alexpottI think that we could open a followup to add explicit test coverage - since it would be useful.
Committed 2219b31 and pushed to 8.0.x. Thanks!
Thanks for adding the beta evaluation to the issue summary.
Comment #49
trgreen17 CreditAttribution: trgreen17 at FFW commentedI was happy to work on this. It's very exciting to get a commit in D8 core and it was a great learning experience for my first contribution to Drupal! Special thanks to jhodgdon for her patience and excellent advice, and to liberatr for his code help.
Comment #50
naveenvalecha.
Comment #52
naveenvalechaFixed the exceptions.
Comment #54
naveenvalecha.
Comment #55
jhodgdonThanks! This looks like the same patch as 8.x, and I think we should commit it to 7.x, as it is a bug.
Comment #57
David_Rothstein CreditAttribution: David_Rothstein as a volunteer commentedThat patch messes up the code indentation, which also might make it a bit harder to review, so here's a version that fixes that.
I'm leaving this RTBC since I didn't change anything except the indentation.
Comment #58
David_Rothstein CreditAttribution: David_Rothstein as a volunteer commentedActually no reason to mess with blank lines either...
Comment #59
David_Rothstein CreditAttribution: David_Rothstein as a volunteer commentedAdded an issue to fix the indentation in Drupal 8 also.
Comment #60
David_Rothstein CreditAttribution: David_Rothstein as a volunteer commentedCommitted to 7.x - thanks!