Please credit also @klausi for this issue!
Problem/Motivation
Mink/BrowserKit uses PHP's DOMElement to get the value of an HTML tag. It will provide the value HTML entity decoded, so we need to do the same with the string passed to assertion to actually match.
Mink thinks every time that the test is a human using a real browser. So, now assertText($text)
converts the HTML entities in the character actually viewed by a human.
Let's say the the returned page contains this piece of markup: <p>Bad html <script>alert(123);</script></p>
In WebTestBase this test passes because it wrongly consider that the text means only stripping tags:
$this->assertText('Bad html <script>alert(123);</script>');
In BrowserTestBase will fail but next will pass because Mink is converting HTML entities as is displayed to a human:
$this->assertText('Bad html <script>alert(123);</script>');
The unit test proves the bug.
Proposed resolution
Fix it. Solution taken from #2763013: Convert web tests to browser tests for link module.
Remaining tasks
None.
User interface changes
None.
API changes
None.
Data model changes
None.
Comment | File | Size | Author |
---|---|---|---|
#47 | interdiff.txt | 1.86 KB | claudiu.cristea |
#47 | 2773733-47.patch | 6.23 KB | claudiu.cristea |
Comments
Comment #2
claudiu.cristeaAdding the related issues.
Comment #3
claudiu.cristeaAnd the fix. Basically is extracted from #2763013: Convert web tests to browser tests for link module. Credits to @klausi
Comment #4
claudiu.cristeaComment #6
claudiu.cristeaComment #7
dawehnerIt feels weird, this test seems to be mocking too much without testing the actual functionality, don't you think so?
Comment #8
claudiu.cristea@dawehner, I found no other way to mock the method in a unit test. If you have some ideas, you'e welcomed. I didn't wanted to write Kernel tests for this, trying to keep it in the AssertLegacyTraitTest because is a deprecated method. Any idea?
Comment #9
claudiu.cristeaComment #10
dawehnerWell, maybe mock just the actual HTTP client (guzzle) in this case and return a response object.
Comment #11
claudiu.cristeaHm. I don't understand where is guzzle here.
Comment #12
Eric_A CreditAttribution: Eric_A commentedThis bug exists in the stable branch.
Unfortunately the current patch will need a little forking because #2759879: Additional assertions for WebAssert and AssertLegacyTrait, part 2 wasn't pushed to 8.1.x.
Comment #13
claudiu.cristeaPostponing on 8.1.x commit of #2759879: Additional assertions for WebAssert and AssertLegacyTrait, part 2.
Comment #14
Eric_A CreditAttribution: Eric_A commentedComment #15
Eric_A CreditAttribution: Eric_A commented@dawehner, any hints to get the major bug fixing moving again?
@claudiu.cristea, still working on this?
Comment #16
klausiI think the approach of the patch is fine, just unit testing the trait methods. No need for any Guzzle responses testing, since the trait does not deal with Guzzle responses. Just one minor thing:
Instead of the willThrow() I think we should use ->shouldNotBeCalled(). So no need to set up the exception then.
same here, should have ->shouldNotBeCalled().
Comment #17
klausiAddressing my own feedback, removing some other not required mocking in the process.
Comment #18
dawehnerGreat explanation!
Comment #19
alexpottI like the mocked text - but given the importance of this behaviour I think we should have a real test to prove the behaviour of mink. All the test currently does is prove what we pass it.
Comment #20
claudiu.cristeaHm. Because those assertion methods are legacy methods, I didn't wanted to create new test files/classes. I tried to keep the test inside AssertLegacyTraitTest. #19 means that we have to create another test (BTB or Kernel) for legacy methods :(
Comment #21
klausiI think it is fine if we add a testLegacyAsserts() method to BrowserTestBaseTest. Same as I did in #2784537: Add legacy assertFieldByXPath()/assertNoFieldByXPath() method for browser tests.
Comment #22
claudiu.cristeaHere it is.
But I'm not convinced anymore about the solution in this patch. As you can see in the test, asserting plain HTML or encoded HTML is the same. And this is wrong! IMO we should NOT commit this but in tests conversions we should change the actual behaviour and just pass the string to assertText() unescaped. That would be correct. My resolution is "Works as designed" or "Won't fix"
Comment #23
klausi@claudiu.cristea: I think you make a good point here and are right.
In Mink there is an important distinction to make:
$this->assertSession()->pageTextContains() is for inspecting the end user facing page text. All HTML tags are stripped in the page text, so you can never use this to test for actual script tags being there. Everything is HTML entity decoded, to represent the actual characters that the user sees.
Unfortunately Simpletest's assertText() works completely different. It uses the getTextContent() method which strips HTML tags with Xss::filter() and removes the whole section. It does not perform the HTML entity decoding.
So I think it is best to completely emulate Simpletest's behavior in assertText() and do the same Xss::filter() on the raw page content that you get from $this->getSession()->getPage()->getContent().
Comment #24
claudiu.cristeaOK. This is 100% the old behaviour.
Comment #25
dawehnerCould we remove this not(!) here? IMHO it just makes it a little harder to read.
Maybe having some documentation why we are doing the head stripping + xss filtering would be nice.
Could we use
$this->assertEquals($not_exists, (strpos...))
instead?Comment #26
claudiu.cristeaThank you for review, @dawehner
Fixed #25.1 & 2.
Here we provide legacy assertions and we replicate the legacy behaviour. Simpletest assertText()/assertNoText() are returning a boolean but assertEquals() is returning a void. Our legacy method should return a boolean. That's why I computed the value in a variable and if I already have the variable why computing twice in assertEquals()?
Comment #27
dawehnerWow that's quite a level of detail ...
This should contain more why, rather than, what, IMHO.
Comment #28
claudiu.cristeaI don't understand. Simpletest methods contains no why, no what. Can you suggest by example?
Comment #29
dawehnerHere is an example.
Comment #30
claudiu.cristeaOK, added that comment even is not correct 100%, see #22 why.
EDIT: I mean the parsed text is not exactly the same as the text showed to a user in browser. At least the encoded HTML entities. But this is a problem in simpletest assertions and we want to replicate them exactly to make easy the conversions.
Comment #31
dawehner@claudiu.cristea
Yeah sorry I haven't put really a lot of time into phrasing the exact comment, I rather just wanted to give you an example of a WHY documentation in opposite to a WHAT documentation.
Comment #32
catchComment #33
dawehnerThis line is a big hard to read, but I cannot think of some really better solution, beside using
assertEquals
, which we probably don't want here?Comment #34
claudiu.cristeaI agree but see #26 for an explanation.
Comment #35
dawehnerNote: This needs a reroll against 8.3.x
Comment #36
claudiu.cristeaRerolled.
Comment #37
claudiu.cristeaBack to RTBC as per #33.
Comment #38
klausiLooks good, RTBC+1.
Comment #39
alexpottI'm not sure about the new approach taken in #22 / #23.
This is wrong. It should be the body - otherwise this will include CSS file names (for example).
Comment #40
claudiu.cristeaCan you explain? Please see the test from #22. Both cases (sanitized and unsanitized) are passing and this is wrong, We need to stick to the old ->assertText() behavior otherwise we'll broke a lot of test conversions. But after conversions, we can modernize the tests by using directly the Mink assertions. This patch works exactly as the Simpletest method works and this is what we want in order to minimize the conversion patches and make them reviewable.
Also, you're not right regarding
<body>
. In fact we strip out the<head>
block entirely keep the body. This was also just copied from the legacy assertText().Comment #41
claudiu.cristeaSetting back to RTBC to get @alexpott input on #40.
Comment #42
alexpottAh yes you are right about the body - I just had interesting memories about an old issue.
We need to update the @deprecated documentation to explain how to convert the assertions.
Comment #43
claudiu.cristeaOK. Here are some clarifications on the docs. Hope is more clear now. (note that the reference to $this->assertSession()->responseContains() was wrong and I removed it). Back to RTBC as it was only a doc improvement.
Comment #44
claudiu.cristeaIgnore the last patch. Interdiff is against #36.
Comment #45
alexpottI'm pretty sure that the responseContains() is here because in Simpletests we occasionally set the content and then use this methods. But maybe @klausi might remember why this is here.
Missing a 'a'.
Comment #46
claudiu.cristeaVoilà
Comment #47
claudiu.cristeaSorry, in #46 I messed up the patches. Interdiff against #44.
Comment #48
alexpottOf course one problem we have is that existing tests in contrib might have converted halfway... if still be using the assertText but have changed the text inline with the assumptions
pageTextContains()
makes. In which case committing this patch will break their tests. Tricky.Comment #49
claudiu.cristea@alexpott, Not sure I understand. (1) A contrib has used pageTextContains() (because legacy assertText() is not yet in) and has aligned the passed text to not encode entities. (2) We commit assertText(). How our commit will affect the contrib test?
Comment #50
alexpott@claudiu.cristea contrib is tested against the latest branch of D8 - it has assertText since #2735045: Convert StandardTest to BrowserTestBase by introducing a AssertLegacyTrait and new assertions on the WebAssert landed... ie. 7th July this year.
Comment #51
claudiu.cristea@alexpott, you're right. Forgot that assertText() was already in HEAD. However, if I would have been in the situation to convert such test, after noticed the different behavior, I would decided to use directly pageTextContains(). Why relying on a legacy method if I'm already forced to touch the code there. Course, this is only an assumption. But I don't see this as a BC break. The migration to BTB is still not stable. BTB should be an experimental feature at this stage, IMO. Nothing critical will break if we push this, just that some tests will fail and they will fix them. No outage on production. It's so bad that we lost the momentum with the conversions, now this transition period becomes to big
Comment #52
alexpott@claudiu.cristea I think I agree - having the legacy methods behave the same way as WebTestBase is more important. Committing to 8.2.x as well since this is quite an important bug fix in the testing system that will help contrib convert to BrowserTestBase with the smallest amount of difficulty.
Committed and pushed d70e577 to 8.3.x and 95741c0 to 8.2.x. Thanks!
Comment #55
dawehnerThank you alex!