Problem/Motivation
If you visit SITE_URL/sitemap.xml as an anonymous user, the sitemap is generated correctly. However, if you are logged in, the browser cannot parse the sitemap and shows this error:
error on line 6 at column 6: XML declaration allowed only at the start of the document
I enclose a screenshot showing the problem in Chrome.
Proposed resolution
Fix sitemap generation for logged-in users.
Comment | File | Size | Author |
---|---|---|---|
 | chrome_sitemap_xml.png | 30.36 KB | rcodina |
Comments
Comment #2
gbyte (as a volunteer) commented: This appears to me to be some sort of caching issue. I have never encountered it. The sitemaps should be exactly the same regardless of who is viewing them. Can you please test the dev version of the module on a clean Drupal instance?
Comment #3
gbyte (as a volunteer) commented: Closing due to lack of activity.
Comment #4
rcodina commented: I just installed the latest dev version of this module using simplytest.me, and this is the result:
1) I created two articles
2) I enabled this module and enabled sitemap for articles
3) I visit /sitemap.xml as admin and 404 error shows up (FAIL)
4) I visit /sitemap.xml as anonymous and 404 error shows up (FAIL)
5) I run cron
6) I visit /sitemap.xml as admin and sitemap shows up (OK)
7) I visit /sitemap.xml as anonymous and 404 error shows up (FAIL)
I think that once you enable the sitemap for a content type, the module should generate the sitemap immediately instead of waiting for cron to run (or at least show a message telling the user to run cron to see the sitemap).
Comment #5
gbyte (as a volunteer) commented: Steps 1) through 6) are correct behavior. When you add anything to the sitemap, you get a checkbox asking whether you would like to regenerate it, along with a note that if you do not regenerate, the sitemap will be rebuilt during a future cron run (provided the cron feature is turned on).
7) appears to be a caching problem, and it happens only on initial generation: for some reason, when an anonymous user gets a 404 prior to the sitemap's first generation (correct behavior), they keep getting a 404 after the first generation (incorrect behavior). As soon as the caches are cleared, the sitemap is also visible to anonymous users. The caching problem is not present on subsequent sitemap rebuilds.
I am not sure why this happens at the moment. If anybody has an idea, I will gladly review patches.
Comment #6
emiliocfaria commented: I'm having the same problem as described in the original post. When I generate the first sitemap, I receive this error:
The problem is that the XML is being generated with a blank line before the XML declaration line. If I delete it, the browser can read it without any problem.
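The failure described here can be reproduced outside the browser. The following is a minimal Python sketch, not the module's code: the sample sitemap, the example.com URL, and the helper names `parses` and `strip_leading_whitespace` are assumptions made for illustration. It shows that a strict XML parser rejects a document whose XML declaration is preceded by blank lines, and accepts the same document once the leading whitespace is removed.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample sitemap; the URL is a placeholder, not taken from the report.
SITEMAP = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
    '  <url><loc>https://example.com/node/1</loc></url>\n'
    '</urlset>\n'
)


def parses(document: str) -> bool:
    """Return True if the document is accepted by a strict XML parser."""
    try:
        ET.fromstring(document.encode("utf-8"))
        return True
    except ET.ParseError:
        return False


def strip_leading_whitespace(document: str) -> str:
    """Remove anything blank before the XML declaration (illustrative workaround)."""
    return document.lstrip()


# Five blank lines before the declaration, as described in the thread.
broken = "\n" * 5 + SITEMAP

print(parses(SITEMAP))                           # well-formed document
print(parses(broken))                            # declaration is not at the start
print(parses(strip_leading_whitespace(broken)))  # after removing the blank lines
```

Stripping the output only hides the symptom. The blank lines have to come from somewhere, so the more durable fix is to find whatever emits them before the sitemap is written.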
Comment #7
rcodina commented: @emiliocfaria I tried manually deleting the first 5 blank lines, and the browser then reads the sitemap.
@gbyte.co Would it be possible to avoid the blank lines at the beginning of the sitemap?
Comment #8
rcodina commented: @emiliocfaria Do you use nginx?
Comment #9
gbyte (as a volunteer) commented: @rcodina @emiliocfaria I don't get the XML error. There should be no empty lines. Tested new instances with Firefox and Chromium. Have you guys tested with simplytest.me?
Comment #11
gbyte (as a volunteer) commented: I altered the controller so that anonymous-facing sitemap routes are not cached when no expected output is produced (no sitemap has been generated yet and a 404 exception is thrown). This fixes the cache issue that occurred before the initial generation.
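The idea behind this fix can be sketched independently of Drupal. The Python below is an illustration only, not the module's controller: the `SitemapCache` class, its `get` method, and the `load_sitemap` callback are hypothetical names. It caches a response only when a sitemap actually exists, so a 404 served before the first generation cannot mask the sitemap once it has been built.

```python
from typing import Callable, Dict, Optional, Tuple

# (status code, body); a simplified stand-in for an HTTP response.
Response = Tuple[int, str]


class SitemapCache:
    """Hypothetical page cache that stores only successful sitemap responses."""

    def __init__(self, load_sitemap: Callable[[], Optional[str]]) -> None:
        self._load_sitemap = load_sitemap
        self._cache: Dict[str, Response] = {}

    def get(self, path: str) -> Response:
        # Serve from the cache when a successful response was stored earlier.
        if path in self._cache:
            return self._cache[path]

        body = self._load_sitemap()
        if body is None:
            # No sitemap generated yet: answer 404 but do NOT cache it,
            # so the next request checks the storage again.
            return (404, "Not Found")

        response = (200, body)
        self._cache[path] = response
        return response


# Simulated storage: empty until the first generation (e.g. a cron run).
storage: Dict[str, Optional[str]] = {"xml": None}
cache = SitemapCache(lambda: storage["xml"])

before = cache.get("/sitemap.xml")   # sitemap not generated yet
storage["xml"] = "<urlset/>"         # first generation happens
after = cache.get("/sitemap.xml")    # sitemap is now served
print(before[0], after[0])
```

If the 404 were cached as well, the second request would keep returning it until the caches were cleared, which matches the behavior reported for anonymous users in step 7).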