Problem/Motivation

If you visit SITE_URL/sitemap.xml as an anonymous user, the sitemap is generated correctly. However, if you are logged in, the browser cannot parse the sitemap and shows this error:

error on line 6 at column 6: XML declaration allowed only at the start of the document

I have attached a screenshot showing the problem in Chrome.

Proposed resolution

Fix sitemap generation for logged-in users.

Attached file: chrome_sitemap_xml.png (30.36 KB, uploaded by rcodina)

Comments

rcodina created an issue. See original summary.

gbyte’s picture

Status: Active » Postponed (maintainer needs more info)

This looks to me like some sort of caching issue. I have never encountered it. The sitemaps should be exactly the same regardless of who is viewing them. Can you please test the dev version of the module on a clean Drupal instance?

gbyte’s picture

Status: Postponed (maintainer needs more info) » Closed (cannot reproduce)

Closing due to lack of activity.

rcodina’s picture

Status: Closed (cannot reproduce) » Needs work

I just installed the latest dev version of this module using simplytest.me and this is the result:

1) I created two articles
2) I enabled this module and enabled sitemap for articles
3) I visit /sitemap.xml as admin and 404 error shows up (FAIL)
4) I visit /sitemap.xml as anonymous and 404 error shows up (FAIL)
5) I run cron
6) I visit /sitemap.xml as admin and sitemap shows up (OK)
7) I visit /sitemap.xml as anonymous and 404 error shows up (FAIL)

I think that once you enable the sitemap for a content type, the module should generate the sitemap immediately instead of waiting for cron to run (or at least show a message telling the user to run cron to see the sitemap).

gbyte’s picture

Title: The sitemap is wrong if you access it while logged in » Rare caching problem for anon users after first sitemap generation
Version: 8.x-2.9 » 8.x-2.x-dev
Priority: Normal » Minor
Status: Needs work » Active

1), 2), 3), 4), 5), 6) are correct. When you add anything to the sitemap, you get a checkbox asking whether you would like to regenerate it, along with a note that if you do not regenerate, the sitemap will be rebuilt during a future cron run (provided the cron feature is turned on).

7) appears to be a caching problem and it happens only on initial generation: for some reason, when an anonymous user gets a 404 before the sitemap's first generation (correct behavior), they keep getting a 404 after the first generation (incorrect behavior). As soon as the caches are cleared, the sitemap is also visible to anonymous users. The caching problem is not present on subsequent sitemap rebuilds.

I am not sure why this happens at the moment. If anybody has an idea, I will gladly review patches.

emiliocfaria’s picture

I'm having the same problem as described in the original post. When I generate the first sitemap, I receive this error:

error on line 2 at column 6: XML declaration allowed only at the start of the document

The problem is that the XML is being generated with a blank line before the XML declaration line. If I delete it, the browser can read it without any problem.
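The browser error can be reproduced outside Drupal. The snippet below is only a minimal sketch using Python's standard XML parser, not code from the module: the sample sitemap content and the helper names `is_well_formed` and `strip_leading_whitespace` are made up for illustration. It shows that a document is rejected as soon as anything, even a single blank line, precedes the XML declaration, and that removing the leading whitespace makes it parse again.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample sitemap; the URL is a placeholder, not real module output.
SITEMAP = (
    b'<?xml version="1.0" encoding="UTF-8"?>\n'
    b'<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
    b'  <url><loc>https://example.com/node/1</loc></url>\n'
    b'</urlset>\n'
)


def is_well_formed(payload: bytes) -> bool:
    """Return True if the payload parses as XML, False otherwise."""
    try:
        ET.fromstring(payload)
    except ET.ParseError:
        return False
    return True


def strip_leading_whitespace(payload: bytes) -> bytes:
    """Drop blank lines and spaces that precede the XML declaration."""
    return payload.lstrip()


if __name__ == "__main__":
    broken = b"\n" + SITEMAP  # one blank line before the declaration
    print(is_well_formed(SITEMAP))
    print(is_well_formed(broken))
    print(is_well_formed(strip_leading_whitespace(broken)))
```

This only demonstrates the symptom. It does not say where the extra blank lines come from on the affected sites.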

rcodina’s picture

@emiliocfaria I tried to manually delete the first 5 blank lines and the browser reads the sitemap.

@gbyte.co Would it be possible to avoid the blank lines at the beginning of the sitemap?

rcodina’s picture

@emiliocfaria Do you use nginx?

gbyte’s picture

@rcodina @emiliocfaria I don't get the XML error. There should be no empty lines. Tested new instances with Firefox and Chromium. Have you guys tested with simplytest.me?

  • gbyte.co committed 1157cd7 on 8.x-2.x
    Issue #2883763 by rcodina: Rare caching problem for anon users after...
gbyte’s picture

Status: Active » Fixed

I altered the controller so that anonymous-facing sitemap routes are not cached when no expected output is produced (no sitemap has been generated yet and a 404 exception is thrown). This fixes the cache issue that occurred before the initial generation.
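The idea behind the change can be modelled with a toy page cache. This is a hedged illustration in Python, not the module's PHP controller or Drupal's cache API: `PageCache`, `make_sitemap_builder`, `simulate` and the `cache_not_found` flag are all hypothetical names. It assumes a cache that stores whatever the first anonymous request returned, and compares caching the 404 with skipping it.

```python
from typing import Callable, Dict, Tuple

Response = Tuple[int, str]  # (HTTP status code, body)


class PageCache:
    """Toy anonymous page cache keyed by path (hypothetical model)."""

    def __init__(self, cache_not_found: bool) -> None:
        self.cache_not_found = cache_not_found
        self._store: Dict[str, Response] = {}

    def get(self, path: str, build: Callable[[], Response]) -> Response:
        if path in self._store:
            return self._store[path]
        response = build()
        # Skip storing 404 responses unless the cache is told to keep them.
        if response[0] != 404 or self.cache_not_found:
            self._store[path] = response
        return response


def make_sitemap_builder(storage: Dict[str, str]) -> Callable[[], Response]:
    """Return a builder that serves the stored sitemap, or a 404 if none exists."""
    def build() -> Response:
        xml = storage.get("sitemap")
        if xml is None:
            return (404, "Not found")
        return (200, xml)
    return build


def simulate(cache_not_found: bool) -> int:
    """Anonymous visit, then first generation, then a second anonymous visit."""
    storage: Dict[str, str] = {}
    cache = PageCache(cache_not_found)
    build = make_sitemap_builder(storage)
    cache.get("/sitemap.xml", build)   # visit before the first generation
    storage["sitemap"] = "<urlset/>"   # first generation, e.g. during cron
    return cache.get("/sitemap.xml", build)[0]


if __name__ == "__main__":
    print(simulate(cache_not_found=True))   # 404 stays cached
    print(simulate(cache_not_found=False))  # sitemap is served
```

The model leaves out cache invalidation on later rebuilds, which the thread reports as already working.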

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.