Thanks for this super-helpful module!
Problem/Motivation
There are a couple of situations where you can end up making unnecessary requests via JS for fresh site alerts.
- If the workaround is being used and a timeout hasn't been configured, then in site-alert.js there's a call to
loadAlert($('.site-alert', context));that automatically sets a timeout to call itself again usingdrupalSettings.siteAlert.timeout- but that's not set, so the next request is made immediately, and schedules another request to be made immediately and so on, like a DOS attack. - If
Drupal.attachBehaviors()is called on.site-alertmore than once it will cause a fresh update and a freshsetTimeout().
Proposed resolution
- Stop
loadAlert()from automatically recalling itself after a timeout. - Use jQuery once to only run the site alert behavior once.
I realise it's not ideal to fix multiple issues in one ticket, but there are some other minorish fixes:
- Attach behaviors to newly inserted markup.
- Add
drupalSettingsas an explicit dependency. - Use
varinstead ofletfor compatibility. I can't find any examples of core usingletorconst, though I appreciate it's well supported these days. - Use
Drupal.url().
Remaining tasks
- The test
SiteAlertCacheWorkaroundTestneeds updating with this patch. The test schedules an alert for 2 seconds in the future, gets the page before that (ie without the alert) and then waits for the message to show without issuing a page refresh. I wonder if it only passes currently because the bug means it's continually issuing fresh requests? If not then presumably this patch is introducing a bug that needs to be addressed.
| Comment | File | Size | Author |
|---|---|---|---|
| #23 | 3154138-22.patch | 3.49 KB | pfrenssen |
| #21 | 3154138-21-test-only.patch | 2.72 KB | pfrenssen |
Comments
Comment #2
andyf commentedThanks!
Comment #4
andyf commentedHmmmn, I'm not 100% sure what the test failure is down to. But it looks like despite the comment Refresh the page for a few seconds until the alert appears the test doesn't actually issue a refresh, it just waits. I wonder if when there were two AJAX requests, the second one was processed after 2 seconds which enabled the test to pass.
Testing the hypothesis by adding a delay and refresh.
Thanks!
Comment #5
andyf commentedOnly attach behaviors after the new content's inserted (: Also check for error.
Comment #6
jwilson3Patch in #5 is working for me in production. This is a critical error. Without the patch the site alert js ends up pinging continuously to the backend, basically causing a significant unusable slow down on the frontend, and what amounts to a denial of service on the backend.
For example, all resources being taken up to respond to the ajax request several times a second causes login actions of > 1 minute.
I'm marking this major because there is a workaround -- i.e., just don't use the scheduling aspect of the module OR set your block to ping the server ever 5 minutes. But unwitting site admins that are not aware of the problem could easily break their site.
Comment #7
pfrenssenThis change is not correct, reloading the front page invalidates this test.
The documentation is indeed confusing. What this does is that it will poll the DOM 10x per second for a period of 10 seconds to check if the alert appears. If this test fails and needs a refresh of the front page then this means that the patch is breaking the current functionality.
Can we get a test case also that proves that the JS code is loaded twice unnecessarily and that the fix solves it?
Comment #8
pfrenssenI fixed the unclear documentation: https://git.drupalcode.org/project/site_alert/commit/d6a7d60
Comment #9
andyf commentedThanks for the feedback @pfrenssen!
I guess I don't understand what the test is doing then, would you possibly be able to expand how it actually verifies anything? From my limited understanding it seems it's meant to:
But why would that ever work? The page load was before the scheduled time, I know there's a single AJAX update, but I wouldn't expect it to arrive a full 2 seconds later and so return the site alert. Sorry if I'm being slow, here. I wonder if actually it's just the bug that enables it to pass - it keeps issuing AJAX requests until it gets the update.
Thanks!
Comment #10
pfrenssenI am sorry but I cannot look into this further right now. This is not affecting my paid work so this review is on my spare time. I unfortunately have less than 30 minutes per week to spend on reviewing patches and there are 20+ modules that I maintain. I manage this by prioritizing the ones that are RTBC.
At the moment this is not ready because the patch contains unrelated changes and there is no failing test that showcases the bug and the fix. Normally small unrelated changes are fine for me but this is JS code which I am not familiar with. My Drupal time budget doesn't include studying JS best practices.
I would suggest the best way forward would be to make separate issues for the JS best practices and the change that is required in the test. You can postpone this issue on those if you want.
This is by design but I agree that we should turn off this behavior by default. The default option for new users should be to not do requests in the background. It is out of scope for this issue though. If you could make a feature request to default this behavior to off then we can investigate whether it can be done in a way that doesn't break B/C.
Comment #11
andyf commentedThanks once again @pfrenssen! Moving back to Needs work. I too don't have extra time to spend on this, so hopefully someone will be able to step up. Thanks!
Comment #12
andyf commentedIS update with my thoughts about the test, thanks!
Comment #13
jwilson3I don't think it's by design that a valid configuration of the site alert has an unintended consequence as significant as this. I too, and all of us here, have limited time on the issue queues.
I agree that the patch should be reworked to remove the unrelated changes so that A) its easier to review and B) doesn't accidentally trigger another error, and I should have said as much before RTBCing this. For that I apologize, I've been using this module for years, and was part of the team that originally wrote the module with Ccers and Nicolo. Since I thought they were still maintaining they would be find committing the changes here. Thank you for clarifying your position on this.
Comment #14
jwilson3From issue summary:
I think we all agree that it is critical that there be a test that identifies this bug. Whether it is SiteAlertCacheWorkaroundTest that is broken or whether we need another new test, I cannot entirely say. It looks like the hardest part of this issue is figuring out whether SiteAlertCacheWorkaround is "broken" (and therefore somehow hiding this bug in the wild) or not.
Comment #15
pfrenssenI think this is a very good idea. I just read the test in full and it seems to me that there is indeed a problem in it. We need to investigate this.
At the start of the test this sets the timeout to 0:
The comment explains that alerts need to be loaded using JS, but if I understand this correctly, this should happen only one single time. The timeout should not be set, and after the initial retrieval of the currently active alerts, no further requests should be made.
But this section that was changed in the patch for this issue seems to conflict with this:
This last assert seems wrong, it seems to me this should be "Wait for a few seconds and check that the new alert DOES NOT appear." For the coverage to be complete, it should then reload the page and check that the alert now appears correctly.
But to make this even more complicated, at the end of the
SiteAlertTimeoutTest::testSiteAlertTimeoutsDisabled()there is a check that the JS code is not included in case that the timeout is set to 0. This conflicts with the SiteAlertCacheWorkaroundTest which tests that the JS should be there. Does this mean that both these tests are broken?Comment #16
pfrenssenOK so the difference between the two tests is in whether there are alerts scheduled in the future. This lines up with the comment in #6.
Unfortunately I cannot work on this any more right now, but if anyone can confirm that the way to replicate this bug is as follows then this would be very much appreciated:
* Set the timeout in the site alert block to 0.
* Create a site alert which is scheduled to appear in the future.
* Clear caches and load the frontpage.
* Check in the browser developer toolbar that there are continuous requests happening.
Comment #17
jwilson3I can confirm that this ^ is indeed how you replicate the bug. In step 2, the site alert doesn't have to be scheduled to appear in the future -- the bug is also triggered if the alert is schedule with a start and end date and is currently active. I haven't tested whether past / historical / expired scheduled alerts also trigger this, but I assume they do.
I might recommend leaving the test that sets
'timeout' => 0but then add another test that doesn't set a timeout, which should default to 0, I think this might be where the bug lies.Comment #18
rpsuupdated patch from #5 against current HEAD (failed due to updated comment in
tests/src/FunctionalJavascript/SiteAlertCacheWorkaroundTest.phpComment #19
pfrenssenThanks for the confirmation. Raising this to critical since it can cause very serious problems which can be triggered unexpectedly during normal usage.
I am in favor of getting a quick fix in for this and get a new release out ASAP. We can handle the tests and proper fix in a followup.
Is there a way we can get this fixed in as little lines of code as possible? For example to do a check if the timeout is lower than 1000ms in JS, and setting it to 1000ms if this is the case.
I'm asking because the larger the patch is the lower the chance I will have availability in my spare time to review and commit.
Comment #20
pfrenssenI'm going to take a look.
Comment #21
pfrenssenProvided a test that shows that unexpected JS requests are taking place when the timeout is set to 0 and a scheduled alert is present. The patch fixes the problem correctly.
While the tests are running I will review the JS code to the best of my ability.
Comment #23
pfrenssenHere is a minimal fix, just adding a check that the timeout is not 0 is enough. jQuery once does not seem to be needed.
I looked into the JS changes but I read conflicting advice on whether to use "var" or "let" and I did not feel qualified to make a good judgement without doing more reading. I would propose to do the JS refactoring in a followup issue. Let's only fix the issue at hand here.
Comment #25
pfrenssenThanks all for the bug report, patches and advice. I will immediately issue a new release that includes this fix.
My apologies for the slow reaction, I did not initially recognize the severity of the issue.
@AndyF if you feel like opening a followup issue to modernize our JS code and apply best practices, this would be very much appreciated!
Comment #26
jwilson3👍thanks Pieter