Please credit also @klausi for this issue!

Problem/Motivation

Mink/BrowserKit uses PHP's DOMElement to get the value of an HTML tag. It will provide the value HTML entity decoded, so we need to do the same with the string passed to assertion to actually match.

Mink thinks every time that the test is a human using a real browser. So, now assertText($text) converts the HTML entities in the character actually viewed by a human.

Let's say the the returned page contains this piece of markup: <p>Bad html &lt;script&gt;alert(123);&lt;/script&gt;</p>

In WebTestBase this test passes because it wrongly consider that the text means only stripping tags:

$this->assertText('Bad html &lt;script&gt;alert(123);&lt;/script&gt;');

In BrowserTestBase will fail but next will pass because Mink is converting HTML entities as is displayed to a human:

$this->assertText('Bad html <script>alert(123);</script>');

The unit test proves the bug.

Proposed resolution

Fix it. Solution taken from #2763013: Convert web tests to browser tests for link module.

Remaining tasks

None.

User interface changes

None.

API changes

None.

Data model changes

None.

CommentFileSizeAuthor
#47 interdiff.txt1.86 KBclaudiu.cristea
#47 2773733-47.patch6.23 KBclaudiu.cristea
#46 2773733-46.patch6.23 KBclaudiu.cristea
#46 interdiff.txt1.86 KBclaudiu.cristea
#46 interdiff.txt743 bytesclaudiu.cristea
#46 2773733-46.patch5.94 KBclaudiu.cristea
#44 interdiff.txt1.4 KBclaudiu.cristea
#44 2773733-44.patch5.94 KBclaudiu.cristea
#43 interdiff.txt1.41 KBclaudiu.cristea
#43 2773733-43.patch5.95 KBclaudiu.cristea
#36 2773733-36.patch5.23 KBclaudiu.cristea
#30 interdiff.txt1.17 KBclaudiu.cristea
#30 2773733-30.patch5.29 KBclaudiu.cristea
#26 2773733-26.patch5.23 KBclaudiu.cristea
#26 interdiff.txt1.15 KBclaudiu.cristea
#24 interdiff.txt5.36 KBclaudiu.cristea
#24 2773733-24.patch5.06 KBclaudiu.cristea
#22 interdiff.txt2.19 KBclaudiu.cristea
#22 2773733-22.patch6.35 KBclaudiu.cristea
#17 interdiff.txt1.97 KBklausi
#17 assert-encoding-2773733-17.patch4.16 KBklausi
#4 2773733-4.patch4.6 KBclaudiu.cristea
assert_text-test-only.patch2.62 KBclaudiu.cristea
Support from Acquia helps fund testing for Drupal Acquia logo

Comments

claudiu.cristea created an issue. See original summary.

claudiu.cristea’s picture

claudiu.cristea’s picture

And the fix. Basically is extracted from #2763013: Convert web tests to browser tests for link module. Credits to @klausi

claudiu.cristea’s picture

The last submitted patch, assert_text-test-only.patch, failed testing.

claudiu.cristea’s picture

Issue summary: View changes
dawehner’s picture

+++ b/core/tests/Drupal/FunctionalTests/AssertLegacyTrait.php
@@ -64,6 +67,10 @@ protected function assertText($text) {
+      $text = html_entity_decode($text);

@@ -89,6 +99,10 @@ protected function assertNoText($text) {
+      $text = html_entity_decode($text);

+++ b/core/tests/Drupal/Tests/Core/Assert/AssertLegacyTraitTest.php
@@ -154,6 +157,58 @@ public function testAssertOptionSelectedFail() {
+   * @covers ::assertText
+   */
+  public function testAssertText() {
...
+  /**
+   * @covers ::assertNoText
+   */
+  public function testAssertNoText() {

It feels weird, this test seems to be mocking too much without testing the actual functionality, don't you think so?

claudiu.cristea’s picture

It feels weird, this test seems to be mocking too much without testing the actual functionality, don't you think so?

@dawehner, I found no other way to mock the method in a unit test. If you have some ideas, you'e welcomed. I didn't wanted to write Kernel tests for this, trying to keep it in the AssertLegacyTraitTest because is a deprecated method. Any idea?

claudiu.cristea’s picture

Issue summary: View changes
dawehner’s picture

Well, maybe mock just the actual HTTP client (guzzle) in this case and return a response object.

claudiu.cristea’s picture

Hm. I don't understand where is guzzle here.

Eric_A’s picture

Version: 8.2.x-dev » 8.1.x-dev
Status: Needs review » Needs work

This bug exists in the stable branch.
Unfortunately the current patch will need a little forking because #2759879: Additional assertions for WebAssert and AssertLegacyTrait, part 2 wasn't pushed to 8.1.x.

claudiu.cristea’s picture

Status: Needs work » Postponed
Eric_A’s picture

Status: Postponed » Needs review
Eric_A’s picture

Hm. I don't understand where is guzzle here.

@dawehner, any hints to get the major bug fixing moving again?
@claudiu.cristea, still working on this?

klausi’s picture

Status: Needs review » Needs work

I think the approach of the patch is fine, just unit testing the trait methods. No need for any Guzzle responses testing, since the trait does not deal with Guzzle responses. Just one minor thing:

  1. +++ b/core/tests/Drupal/Tests/Core/Assert/AssertLegacyTraitTest.php
    @@ -154,6 +157,58 @@ public function testAssertOptionSelectedFail() {
    +    $this->webAssert
    +      ->pageTextContains($sanitized)
    +      ->willThrow(new ResponseTextException('text not found', $this->getSession()));
    

    Instead of the willThrow() I think we should use ->shouldNotBeCalled(). So no need to set up the exception then.

  2. +++ b/core/tests/Drupal/Tests/Core/Assert/AssertLegacyTraitTest.php
    @@ -154,6 +157,58 @@ public function testAssertOptionSelectedFail() {
    +    $this->webAssert
    +      ->pageTextNotContains($sanitized)
    +      ->willThrow(new ResponseTextException('text found', $this->getSession()));
    

    same here, should have ->shouldNotBeCalled().

klausi’s picture

Addressing my own feedback, removing some other not required mocking in the process.

dawehner’s picture

Status: Needs review » Reviewed & tested by the community
+++ b/core/tests/Drupal/FunctionalTests/AssertLegacyTrait.php
@@ -89,6 +99,10 @@ protected function assertNoText($text) {
+      // Mink/BrowserKit uses PHP's DOMElement to get the value of an HTML tag.
+      // It will provide the value HTML entity decoded, so we need to do the
+      // same here for the text to actually match.
+      $text = html_entity_decode($text);

Great explanation!

alexpott’s picture

Status: Reviewed & tested by the community » Needs work

I like the mocked text - but given the importance of this behaviour I think we should have a real test to prove the behaviour of mink. All the test currently does is prove what we pass it.

claudiu.cristea’s picture

Hm. Because those assertion methods are legacy methods, I didn't wanted to create new test files/classes. I tried to keep the test inside AssertLegacyTraitTest. #19 means that we have to create another test (BTB or Kernel) for legacy methods :(

klausi’s picture

I think it is fine if we add a testLegacyAsserts() method to BrowserTestBaseTest. Same as I did in #2784537: Add legacy assertFieldByXPath()/assertNoFieldByXPath() method for browser tests.

claudiu.cristea’s picture

Status: Needs work » Needs review
FileSize
6.35 KB
2.19 KB

Here it is.

But I'm not convinced anymore about the solution in this patch. As you can see in the test, asserting plain HTML or encoded HTML is the same. And this is wrong! IMO we should NOT commit this but in tests conversions we should change the actual behaviour and just pass the string to assertText() unescaped. That would be correct. My resolution is "Works as designed" or "Won't fix"

klausi’s picture

Status: Needs review » Needs work

@claudiu.cristea: I think you make a good point here and are right.

In Mink there is an important distinction to make:

$this->assertSession()->pageTextContains() is for inspecting the end user facing page text. All HTML tags are stripped in the page text, so you can never use this to test for actual script tags being there. Everything is HTML entity decoded, to represent the actual characters that the user sees.

Unfortunately Simpletest's assertText() works completely different. It uses the getTextContent() method which strips HTML tags with Xss::filter() and removes the whole section. It does not perform the HTML entity decoding.

So I think it is best to completely emulate Simpletest's behavior in assertText() and do the same Xss::filter() on the raw page content that you get from $this->getSession()->getPage()->getContent().

claudiu.cristea’s picture

Status: Needs work » Needs review
Issue tags: -Drupalaton
FileSize
5.06 KB
5.36 KB

OK. This is 100% the old behaviour.

dawehner’s picture

  1. +++ b/core/tests/Drupal/FunctionalTests/AssertLegacyTrait.php
    @@ -89,11 +97,38 @@ protected function assertNoText($text) {
    +    $message = !$not_exists ? new FormattableMarkup('"@text" found', $args) : new FormattableMarkup('"@text" not found', $args);
    

    Could we remove this not(!) here? IMHO it just makes it a little harder to read.

  2. +++ b/core/tests/Drupal/FunctionalTests/AssertLegacyTrait.php
    @@ -89,11 +97,38 @@ protected function assertNoText($text) {
    +
    +    $raw_content = $this->getSession()->getPage()->getContent();
    +    // Strip everything between the HEAD tags.
    +    $raw_content = preg_replace('@<head>(.+?)</head>@si', '', $raw_content);
    +    $page_text = Xss::filter($raw_content, []);
    +
    

    Maybe having some documentation why we are doing the head stripping + xss filtering would be nice.

  3. +++ b/core/tests/Drupal/FunctionalTests/AssertLegacyTrait.php
    @@ -89,11 +97,38 @@ protected function assertNoText($text) {
    +    $actual = $not_exists == (strpos($page_text, (string) $text) === FALSE);
    +    $this->assertTrue($actual, $message);
    

    Could we use $this->assertEquals($not_exists, (strpos...)) instead?

claudiu.cristea’s picture

FileSize
1.15 KB
5.23 KB

Thank you for review, @dawehner

Fixed #25.1 & 2.

Could we use $this->assertEquals($not_exists, (strpos...)) instead?

Here we provide legacy assertions and we replicate the legacy behaviour. Simpletest assertText()/assertNoText() are returning a boolean but assertEquals() is returning a void. Our legacy method should return a boolean. That's why I computed the value in a variable and if I already have the variable why computing twice in assertEquals()?

dawehner’s picture

Here we provide legacy assertions and we replicate the legacy behaviour. Simpletest assertText()/assertNoText() are returning a boolean but assertEquals() is returning a void. Our legacy method should return a boolean. That's why I computed the value in a variable and if I already have the variable why computing twice in assertEquals()?

Wow that's quite a level of detail ...

+++ b/core/tests/Drupal/FunctionalTests/AssertLegacyTrait.php
@@ -115,10 +115,12 @@ protected function assertNoText($text) {
+    // Retrieves the plain-text content from the current raw content by striping
+    // everything between the HEAD tags, well-forming all entities, removing
+    // dangerous protocols and filtering out all HTML tags.

This should contain more why, rather than, what, IMHO.

claudiu.cristea’s picture

This should contain more why, rather than, what, IMHO.

I don't understand. Simpletest methods contains no why, no what. Can you suggest by example?

dawehner’s picture

Here is an example.

assertText() tries to simulate what the user sees, given that it removes all text inside the head tags, removes inline JS etc., as they are not visible in a normal browser.
claudiu.cristea’s picture

FileSize
5.29 KB
1.17 KB

OK, added that comment even is not correct 100%, see #22 why.

EDIT: I mean the parsed text is not exactly the same as the text showed to a user in browser. At least the encoded HTML entities. But this is a problem in simpletest assertions and we want to replicate them exactly to make easy the conversions.

dawehner’s picture

@claudiu.cristea
Yeah sorry I haven't put really a lot of time into phrasing the exact comment, I rather just wanted to give you an example of a WHY documentation in opposite to a WHAT documentation.

dawehner’s picture

Status: Needs review » Reviewed & tested by the community
+++ b/core/tests/Drupal/FunctionalTests/AssertLegacyTrait.php
@@ -89,11 +97,41 @@ protected function assertNoText($text) {
+    $actual = $not_exists == (strpos($page_text, (string) $text) === FALSE);
+    $this->assertTrue($actual, $message);

This line is a big hard to read, but I cannot think of some really better solution, beside using assertEquals, which we probably don't want here?

claudiu.cristea’s picture

This line is a big hard to read, but I cannot think of some really better solution, beside using assertEquals, which we probably don't want here?

I agree but see #26 for an explanation.

dawehner’s picture

Version: 8.1.x-dev » 8.3.x-dev
Status: Reviewed & tested by the community » Needs work
Issue tags: +Needs reroll, +Novice

Note: This needs a reroll against 8.3.x

claudiu.cristea’s picture

Status: Needs work » Needs review
Issue tags: -Needs reroll, -Novice
FileSize
5.23 KB

Rerolled.

claudiu.cristea’s picture

Status: Needs review » Reviewed & tested by the community

Back to RTBC as per #33.

klausi’s picture

Looks good, RTBC+1.

alexpott’s picture

Status: Reviewed & tested by the community » Needs work

I'm not sure about the new approach taken in #22 / #23.

+++ b/core/tests/Drupal/FunctionalTests/AssertLegacyTrait.php
@@ -89,11 +97,41 @@ protected function assertNoText($text) {
+    $raw_content = preg_replace('@<head>(.+?)</head>@si', '', $raw_content);

This is wrong. It should be the body - otherwise this will include CSS file names (for example).

claudiu.cristea’s picture

Status: Needs work » Needs review

I'm not sure about the new approach taken in #22 / #23.

Can you explain? Please see the test from #22. Both cases (sanitized and unsanitized) are passing and this is wrong, We need to stick to the old ->assertText() behavior otherwise we'll broke a lot of test conversions. But after conversions, we can modernize the tests by using directly the Mink assertions. This patch works exactly as the Simpletest method works and this is what we want in order to minimize the conversion patches and make them reviewable.

Also, you're not right regarding <body>. In fact we strip out the <head> block entirely keep the body. This was also just copied from the legacy assertText().

claudiu.cristea’s picture

Status: Needs review » Reviewed & tested by the community

Setting back to RTBC to get @alexpott input on #40.

alexpott’s picture

Status: Reviewed & tested by the community » Needs work

Ah yes you are right about the body - I just had interesting memories about an old issue.

+++ b/core/tests/Drupal/FunctionalTests/AssertLegacyTrait.php
@@ -57,6 +59,9 @@ protected function assertElementNotPresent($css_selector) {
    *   $this->assertSession()->responseContains() instead.
    */
   protected function assertText($text) {

@@ -82,6 +87,9 @@ protected function assertText($text) {
    *   $this->assertSession()->responseNotContains() instead.
    */
   protected function assertNoText($text) {

We need to update the @deprecated documentation to explain how to convert the assertions.

claudiu.cristea’s picture

Status: Needs work » Reviewed & tested by the community
FileSize
5.95 KB
1.41 KB

OK. Here are some clarifications on the docs. Hope is more clear now. (note that the reference to $this->assertSession()->responseContains() was wrong and I removed it). Back to RTBC as it was only a doc improvement.

claudiu.cristea’s picture

FileSize
5.94 KB
1.4 KB

Ignore the last patch. Interdiff is against #36.

alexpott’s picture

Status: Reviewed & tested by the community » Needs work
  1. +++ b/core/tests/Drupal/FunctionalTests/AssertLegacyTrait.php
    @@ -53,10 +55,14 @@ protected function assertElementNotPresent($css_selector) {
    -   *   $this->assertSession()->responseContains() instead.
    
    @@ -78,10 +84,14 @@ protected function assertText($text) {
    -   *   $this->assertSession()->responseNotContains() instead.
    

    I'm pretty sure that the responseContains() is here because in Simpletests we occasionally set the content and then use this methods. But maybe @klausi might remember why this is here.

  2. +++ b/core/tests/Drupal/FunctionalTests/AssertLegacyTrait.php
    @@ -78,10 +84,14 @@ protected function assertText($text) {
    +   *   exactly as human sees it in the browser.
    

    Missing a 'a'.

claudiu.cristea’s picture

Status: Needs work » Reviewed & tested by the community
FileSize
5.94 KB
743 bytes
1.86 KB
6.23 KB

Voilà

claudiu.cristea’s picture

FileSize
6.23 KB
1.86 KB

Sorry, in #46 I messed up the patches. Interdiff against #44.

alexpott’s picture

Of course one problem we have is that existing tests in contrib might have converted halfway... if still be using the assertText but have changed the text inline with the assumptions pageTextContains() makes. In which case committing this patch will break their tests. Tricky.

claudiu.cristea’s picture

@alexpott, Not sure I understand. (1) A contrib has used pageTextContains() (because legacy assertText() is not yet in) and has aligned the passed text to not encode entities. (2) We commit assertText(). How our commit will affect the contrib test?

alexpott’s picture

@claudiu.cristea contrib is tested against the latest branch of D8 - it has assertText since #2735045: Convert StandardTest to BrowserTestBase by introducing a AssertLegacyTrait and new assertions on the WebAssert landed... ie. 7th July this year.

claudiu.cristea’s picture

@alexpott, you're right. Forgot that assertText() was already in HEAD. However, if I would have been in the situation to convert such test, after noticed the different behavior, I would decided to use directly pageTextContains(). Why relying on a legacy method if I'm already forced to touch the code there. Course, this is only an assumption. But I don't see this as a BC break. The migration to BTB is still not stable. BTB should be an experimental feature at this stage, IMO. Nothing critical will break if we push this, just that some tests will fail and they will fix them. No outage on production. It's so bad that we lost the momentum with the conversions, now this transition period becomes to big

alexpott’s picture

Status: Reviewed & tested by the community » Fixed
Issue tags: +rc eligible

@claudiu.cristea I think I agree - having the legacy methods behave the same way as WebTestBase is more important. Committing to 8.2.x as well since this is quite an important bug fix in the testing system that will help contrib convert to BrowserTestBase with the smallest amount of difficulty.

Committed and pushed d70e577 to 8.3.x and 95741c0 to 8.2.x. Thanks!

  • alexpott committed d70e577 on 8.3.x
    Issue #2773733 by claudiu.cristea, klausi, dawehner: Fix...

  • alexpott committed 95741c0 on 8.2.x
    Issue #2773733 by claudiu.cristea, klausi, dawehner: Fix...
dawehner’s picture

Thank you alex!

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.