Scenario:

  • Set up XML sitemap including node content of type Basic page
  • Create a basic page
  • Configure cron to run nightly
  • Leave the site for a 15 days without changing any pages
  • Edit a page

This is a perfectly normal scenario for a site that changes fairly rarely. However there are now errors that stay until the next nightly cron run:

  • On admin/config: One or more problems were detected with your Drupal installation. Check the status report for more information.
  • On /admin/reports/status: The XML cached files are out of date and need to be regenerated.

False alarms can waste time and should be avoided. It's particularly a problem for people (like me) who are responsible for a fairly large number of sites that all change fairly rarely. In such cases, it makes sense to have mechanisms to gather status reports from all the sites and flag up errors. With many sites, the false alarms can become a nuisance.

In terms of a fix, the simplest thing might be to change xmlsitemap_cron to set xmlsitemap_generated_last if xmlsitemap_regenerate_needed is FALSE. I.e. make the variable to mean "last time we successfully generated whatever needed doing". This seems very simple to code, and I am happy to provide a patch.

Or as an alternative, store a timestamp when xmlsitemap_regenerate_needed was set, and only flag an error based on time elapsed from that time.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

AdamPS’s picture

Status: Active » Needs review
FileSize
524 bytes

Simple patch as per suggestion in original post.

Dave Reid’s picture

Version: 7.x-2.2 » 7.x-2.x-dev
Status: Needs review » Needs work

Maybe it would be easier to change xmlsitemap_regenerate_needed to a timestamp for when that value was set, and compare the time difference between now and that instead of when it was last generated. I'd rather not be saving 'fake' data for when the files were actually generated.

AdamPS’s picture

@Dave Reid Thanks for the reply.

Yes I agree that approach would be better. Although presumably it would need care when reading the value that it might not actually be a timestamp, but a bool written by the old code. Or perhaps there could be a hook_update function to correct the bool to a timestamp? Or perhaps keep the bool, and add another variable that is the timestamp?

This improved fix seems more complex and requires more careful testing - and for me a bit more learning. Unfortunately I don't have time right now. I hope that someone else might come along and take up coding the improved fix. In the meantime, my patch does seem to work for anyone else who doesn't mind "fake" data.

Status: Needs work » Needs review
darrell_ulm’s picture

This could still be an issue. Updated the sitemap, still getting the message. Perhaps the patch was not applied or no good solution found yet? Thanks!

Dave Reid’s picture

FYI for Drupal 8 you can manually set the cron_threshold_error and cron_threshold_warning values in your xmlsitemap.settings config to make this go away for longer. For Drupal 7 you would edit the cron_threshold_error and cron_threshold_warning variables.

AdamPS’s picture

Thanks for the suggestion. However I'm not sure that it really solves the problem as people do still want to receive timely alarms if the sitemaps need regenerating and for several days running that fails.

AdamPS’s picture

Version: 7.x-2.x-dev » 8.x-1.x-dev
FileSize
2.74 KB

Here is patch for D8 that doesn't save 'fake' data

Status: Needs review » Needs work

The last submitted patch, 8: xmlsitemap.false-alarm-2461505-8.patch, failed testing. View results

AdamPS’s picture

Status: Needs work » Needs review
FileSize
2.99 KB

Reroll for dev. #8 applies to stable.

Status: Needs review » Needs work

The last submitted patch, 10: xmlsitemap.false-alarm-2461505-10.patch, failed testing. View results
- codesniffer_fixes.patch Interdiff of automated coding standards fixes only.

AdamPS’s picture

Status: Needs work » Needs review
FileSize
3.91 KB

Fix the failing test

Dave Reid’s picture

I'm still confused about why this needs to be changed, because you'd only see this as a warning or error, if you're also getting an error about cron not being run in a long time. For example, my last cron run was 10 hours ago on my local. I have edited a bunch of content, and the xmlsitemap_regenerate_needed flag is TRUE. But I don't see any warning message until I exceed the cron_threshold_warning value. Are you also using a minimum lifetime setting in the XML sitemap settings? Could you take a screenshot of your status report page showing that there is not a cron warning, but there is an XML sitemap warning?

AdamPS’s picture

@Dave Reid Thanks for taking your time to follow this issue.

The scenario is

  • Create a site with some content.
  • Leave the site for 15 days without changing anything and cron runs successfully every night.
  • Add a page.
  • For the rest of the day until the next nightly cron run there is a false error report because xmlsitemap_regenerate_needed is set, and xmlsitemap_generated_last is more than cron_threshold_error in the past.

Hopefully that makes it clear without the need for any screenshots.

Dave Reid’s picture

@AdamPS: Oh thank you, that helps explain it a bit more.

Dave Reid’s picture

I think this should resolve it without needing to set anything else. This also takes into account if cron regeneration is disabled, and re-uses the system settings as intended (using our own settings shouldn't have happened in the D8 porting process).

Status: Needs review » Needs work

The last submitted patch, 16: 2461505-fix-requirement-false-alarms.patch, failed testing. View results

Dave Reid’s picture

Status: Needs work » Needs review
FileSize
6.91 KB

Adding test coverage.

Dave Reid’s picture

Left in a test debugging line, try this one.

The last submitted patch, 18: 2461505-fix-requirement-false-alarms.patch, failed testing. View results

Status: Needs review » Needs work

The last submitted patch, 19: 2461505-fix-requirement-false-alarms.patch, failed testing. View results

Dave Reid’s picture

Status: Needs work » Needs review
FileSize
6.5 KB

This time using the correct setting in the test.

Dave Reid’s picture

Status: Needs review » Needs work

The last submitted patch, 22: 2461505-fix-requirement-false-alarms.patch, failed testing. View results

Dave Reid’s picture

Status: Needs work » Needs review
FileSize
6.51 KB

One more incorrect state variable, should correct the tests.

AdamPS’s picture

Thanks for making a new patch. I think your idea will solve my scenario and so I would be happy to see it committed.

However there may be some potential complications in certain cases, so I'm just going to run through some scenarios and ask questions in case it helps. Feel free to ignore me:-)

disable_cron_regeneration is FALSE

  1. If cron is not running it will generate warnings/errors. Arguably xmlsitemap doesn't really need to duplicate that but I guess it doesn't matter too much that it will.
  2. If cron is running (cron_last is set), but XML sitemap regeneration isn't working. I don't know if this this case can occur, but if so it is vital to generate warnings/errors and I think your patch won't do it.

disable_cron_regeneration is TRUE

I guess this option is a minority, so it less important, but I'll mention it anyway. xmlsitemap needs to generate warnings/errors as needed. However

  1. tying the warnings to cron settings seems odd in this case
  2. this issue applies here too - it's not really desirable to generate an instant error if a site has been inactive then a page is saved

Non back-compatible change?

The patch retires two existing settings. What about sites that have carefully set the existing settings for a particular reason?

Alternative approach

To cut back to simplicity, you had a good idea in #2 that we could track the time of xmlsitemap_generated_last, i.e. when a change was last made, and generate warnings/errors if it becomes too long ago.

Dave Reid’s picture

I think we should definitely continue following up with making the regenerate_needed flag a timestamp instead of a boolean, so we can accurately use that. It's going to require more changes, so I think this is a good fix for now and we can continue to iterate on it.

If cron is running (cron_last is set), but XML sitemap regeneration isn't working. I don't know if this this case can occur, but if so it is vital to generate warnings/errors and I think your patch won't do it.

Yeah this is an interesting case, the user would continue to see that the last generated time gets further and further away, and we should be logging errors or exceptions that happen with the regeneration process, so hopefully they are reviewing logs. Making the regenerate_needed a timestamp would also solve this.

The patch retires two existing settings. What about sites that have carefully set the existing settings for a particular reason?

Well frankly the module is still in alpha for this reason, I'm noticing a lot of changes like this which went the wrong direction with the module port. This was a regression from the Drupal 7.x-2.x version. If a site administrator has the ability to manually set those configuration values (because they are not exposed in the UI) then I would think that they would have the ability to update the system.cron settings if desired. For site administrators inconvenienced for this change, I offer my apologies and hope they can understand I don't want to be storing additional settings.

AdamPS’s picture

Status: Needs review » Reviewed & tested by the community

Thanks for the explanation

  • Dave Reid committed c952b91 on 8.x-1.x
    Issue #2461505 by Dave Reid, AdamPS: Fixed false alarms on status report...
Dave Reid’s picture

Version: 8.x-1.x-dev » 7.x-2.x-dev
Status: Reviewed & tested by the community » Patch (to be ported)

I've filed #3007153: Store xmlsitemap_regenerate_needed flag as a timestamp instead of a boolean as a follow-up. Committed #25 to 8.x-1.x. Moving this for consideration for 7.x-2.x now.

AdamPS’s picture

Great many thanks