It appears to me that SEO best practice is to have the sitemap live at example.com/sitemap.xml. While this module does have a routing entry for that path, it sets the paths to the chunk files listed in the index file to /sitemaps/{chunk_id}.
- https://www.sitemaps.org/faq.html#faq_sitemap_location
- https://www.sitemaps.org/protocol.html#location
I'm not 100% sure that referenced sitemap chunks also need to live in the site root, but from some initial exploration it seems fairly trivial to alter this module to support it.
Additionally, I searched the issue queue for information on why it was this way, and tried to find a code comment that might explain it, but have struck out thus far.
I will attach a patch to this issue later today that will alter the behavior of this module to support the sitemap "chunks" living in the root of a site.
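To make the layout under discussion concrete, here is a minimal Python sketch (not the module's PHP code) that builds a sitemap index whose chunk links follow a path template. The function name `build_sitemap_index` and the query parameter name `page` in the root-level variant are assumptions for illustration only; the `/sitemaps/{chunk_id}` template is the one described above.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap_index(base_url, chunk_ids, chunk_path="/sitemaps/{chunk_id}"):
    """Return sitemap index XML with one <sitemap><loc> entry per chunk.

    chunk_path is a hypothetical template; the default mirrors the
    /sitemaps/{chunk_id} layout described in this issue.
    """
    ET.register_namespace("", SITEMAP_NS)
    root = ET.Element(f"{{{SITEMAP_NS}}}sitemapindex")
    for chunk_id in chunk_ids:
        sitemap = ET.SubElement(root, f"{{{SITEMAP_NS}}}sitemap")
        loc = ET.SubElement(sitemap, f"{{{SITEMAP_NS}}}loc")
        loc.text = base_url.rstrip("/") + chunk_path.format(chunk_id=chunk_id)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Current behavior as described: chunks under /sitemaps/.
    print(build_sitemap_index("https://example.com", [1, 2]))
    # Root-level alternative; the "page" parameter name is an assumption.
    print(build_sitemap_index("https://example.com", [1, 2],
                              "/sitemap.xml?page={chunk_id}"))
```

In both cases the index itself would be served from example.com/sitemap.xml; only the locations of the referenced chunks differ.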
| Comment | File | Size | Author |
|---|---|---|---|
| #5 | root-sitemap-2930027-5.patch | 3.28 KB | adamzimmermann |
| #4 | root-sitemap-2930027-4.patch | 3.01 KB | adamzimmermann |
| #2 | root-sitemap-2930027-2.patch | 3.04 KB | adamzimmermann |
Comments
Comment #2
adamzimmermann commented
I'm guessing we will need to alter tests to support this patch, if this is a direction others want to go.
Comment #4
adamzimmermann commented
I think I based my last patch on the wrong branch.
Comment #5
adamzimmermann commented
Added a cache context for `url.query_args`.
Comment #6
gbyte
Thanks for the patch.
There is nothing wrong with the changes you made but IMO there is also no need to implement them.
Best practice dictates (and so do the documents you linked) that the sitemap file needs to be at `root/sitemap.xml`. From there it doesn't matter where the chunks are located, as all links are available to the bot. The one exception is that the chunks have to be located on the same domain, which is the case.
So can you please tell me why this is important to you and how this would benefit the module? :)
Comment #7
adamzimmermann commented
If that is the case then this patch is not needed, and that is how I was hoping it worked. However, none of the docs I found came out and said that explicitly, but they didn't say otherwise either. The SEO QA team on my current project were the ones that raised this concern and requested the change. If we are pretty confident that the way the module currently works is fine, then this is just something we need to sort out internally.
Thanks for taking a look!
Comment #8
gbyte
No worries. I can assure you that Google, for one, does not mind this configuration: I regularly review any errors coming from Google webmaster tools, and I have a couple of projects with a sitemap index in place where all links get indexed. I am not sure why I decided to use /sitemaps/ instead of URL parameters; I guess at the time it seemed the simpler thing to implement for the 10 sites that used this module. ;) I will close this now, but feel free to reopen with new insights.
Comment #9
websmithc commented
As previously referenced by the author, https://www.sitemaps.org/protocol.html#location says:
"The location of a Sitemap file determines the set of URLs that can be included in that Sitemap. A Sitemap file located at http://example.com/catalog/sitemap.xml can include any URLs starting with http://example.com/catalog/ but can not include URLs starting with http://example.com/images/."
By that logic, sitemaps in a /sitemaps directory could only reference URLs under the /sitemaps directory structure. In practice, Google seems to "see" the sitemaps, which means it is not following the spec fully. Google, however, is not the only search engine on earth, so the spec should win.
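As a rough illustration of that reading, here is a minimal Python sketch of the prefix rule from the quoted passage. The helper names `sitemap_scope` and `in_scope` are made up for this example, and it applies only the literal "URLs starting with the sitemap's directory" rule; it says nothing about how any particular search engine treats files referenced from an index.

```python
from urllib.parse import urlsplit


def sitemap_scope(sitemap_url):
    """Return the URL prefix (scheme, host, directory) a sitemap file covers."""
    parts = urlsplit(sitemap_url)
    # Drop the file name and keep the directory with a trailing slash.
    directory = parts.path.rsplit("/", 1)[0] + "/"
    return f"{parts.scheme}://{parts.netloc}{directory}"


def in_scope(sitemap_url, page_url):
    """True if page_url starts with the directory prefix of sitemap_url."""
    return page_url.startswith(sitemap_scope(sitemap_url))


if __name__ == "__main__":
    # The example from the protocol page.
    catalog = "http://example.com/catalog/sitemap.xml"
    print(in_scope(catalog, "http://example.com/catalog/item"))
    print(in_scope(catalog, "http://example.com/images/logo.png"))
    # A chunk under /sitemaps/ versus a sitemap in the site root.
    print(in_scope("http://example.com/sitemaps/1", "http://example.com/node/1"))
    print(in_scope("http://example.com/sitemap.xml", "http://example.com/node/1"))
```

Under this literal reading, a chunk served from /sitemaps/ would not cover a page such as /node/1, while a sitemap in the site root would.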
I agree that sitemaps should be in the root. I would love to see a sitemap index in the root that points to sitemaps by locale that exist in the respective locale directory. The lack of configuration for the directory paths of multiple sitemaps is frustrating to an SEO.
Comment #10
websmithc commented
Comment #11
gbyte
That's an interesting take. My impression was that, since the index file is in the root directory and links to the other files, it would not matter where those other files live. In any case this will have to go into 3.x. 3.x has seen many new architectural improvements, including the addition of sitemap variants. I am looking forward to a patch.
Comment #12
gbyte
I am on it, just need a question answered first: https://drupal.stackexchange.com/questions/266191/create-route-with-variable-first-argument.
Maybe one of you can help?
Comment #14
gbyte
Implemented in 3.x and works for the default sitemap out of the box.
Sitemap variants also follow this standard: `{variant}/sitemap.xml`. See the examples in simple_sitemap.api.php about how to create variants.