I realized that using both xmlsitemap_menu and xmlsitemap_node can result in duplicate sitemap entries for nodes that also have a link in a menu. A quick search of the issue queue shows that this has bothered a few people, and that using other submodules can cause the same problem:
https://drupal.org/node/596008
https://drupal.org/node/631218
https://drupal.org/node/1999958
My proposed solution would be to add an option to always exclude duplicate entries from the sitemap. I'm waiting for some feedback/opinions before actually starting to work on a patch.
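To make the proposal concrete, here is a minimal sketch of "exclude duplicate entries", written in Python for illustration only. It is not xmlsitemap code: the function name `exclude_duplicates` and the dictionary layout (`type`, `loc`) are assumptions chosen to resemble sitemap links.

```python
def exclude_duplicates(links):
    """Keep the first link seen for each 'loc' and drop later repeats.

    `links` is an iterable of dicts with at least a 'loc' key
    (a hypothetical stand-in for sitemap link records).
    """
    seen = set()
    unique = []
    for link in links:
        loc = link["loc"]
        if loc in seen:
            # A link with this location was already kept; skip the repeat.
            continue
        seen.add(loc)
        unique.append(link)
    return unique


# Example: the same node is exposed by a menu link and by the node itself.
links = [
    {"type": "menu", "loc": "node/1"},
    {"type": "node", "loc": "node/1"},
    {"type": "node", "loc": "node/2"},
]
deduped = exclude_duplicates(links)
```

Note that a first-wins rule like this depends on the order in which the submodules contribute their links, so it does not by itself decide which of the two entries is the better one to keep.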
| Comment | File | Size | Author |
|---|---|---|---|
| #12 | xmlsitemap-2257191-12.patch | 1.04 KB | Diego_Mow |
Comments
Comment #1
spidersilk commented: I very much second this request! There is no valid reason I can think of for allowing duplicate entries in an XML sitemap. The module should check before adding any URL to the sitemap to make sure it isn't a duplicate of one that's already there.
Also, I noticed in trying to investigate the issue on a client's site that the entries added by XML sitemap menu don't have any last modified date - that doesn't seem good!
I was about to disable XML sitemap menu to resolve this on that site, when I realized that the front page is only indexed as a menu link, not a node! So it looks like if I disable that module, the front page will no longer be indexed by search engines, which is a big problem! The node that acts as the front page would still be indexed, but not as the front page (i.e. it would be indexed as www.example.com/this-is-the-node-title, not as www.example.com).
Conversely, if I disable XML sitemap node instead, then none of the site map entries will have dates on them, which I expect is likely to cause problems. And if I don't disable either of them, then everything will keep on being indexed twice, which is also likely to cause problems (since I expect the search engines will see that as spamming, and penalize the site, or remove it entirely from their listings!).
There's also a support issue, #1999958: XML Sitemap Custom allows duplicate entries, which includes a patch to stop XML sitemap custom from creating duplicate entries - maybe it could be adapted to stop any duplicate entries from being indexed, regardless of which submodule is to blame? Then these two support issues could be merged...
Comment #2
gapple commented: Here is my approach, which causes a sitemap entry created by xmlsitemap_node to take priority over one from xmlsitemap_menu.
Comment #3
alex.skrypnyk commented: @gapple
The proposed patch loads the node for each link, which is quite an expensive operation.
Instead, hook_xmlsitemap_link_alter() can be used. It runs only during sitemap generation.
Comment #4
mhmhartman commented: #3 did not work for me. Everything is still being duplicated.
#2 worked perfectly. @alex: 7000 links were generated in 1 minute, which doesn't seem that expensive in my opinion.
Comment #5
Yaron Tal (at One Shoe) commented: Both options rely on the xmlsitemap_node module running before the xmlsitemap_menu module. In my case they ran in the other order.
For me this seems to fix duplicate items:
Comment #6
Gomez_in_the_South commented: #5 causes an issue for me whereby it sets link['access'] to false for valid entries on my multilingual site. This leaves me with a sitemap that is missing translations. I'm using language_hierarchy, but you may want to check before using it on other multilingual sites as well.
Comment #7
Yaron Tal (at One Shoe) commented: There should be some kind of weight in there. At the moment it is impossible to choose which link should be visible to the user and which is a duplicate. We now use the following code to always use the link from the node, and only use others when there is no version available from the node module. This way you get all the data from the node module, and no duplicates.
The best solution would be to merge data from multiple sub-modules and make 'loc' the unique key, but I'm guessing that won't make it into the 7.x version of xmlsitemap.
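The idea in this comment (prefer the node's link, fall back to others, and ideally merge data with 'loc' as the unique key) can be sketched as follows. This is an illustrative Python sketch, not the code the commenter used and not xmlsitemap's API; the names `merge_links`, `TYPE_RANK`, and the field layout (`type`, `loc`, `lastmod`) are assumptions.

```python
# Hypothetical ranking: a lower number wins. Only "node" is preferred here;
# every other link type shares the default rank.
TYPE_RANK = {"node": 0}
DEFAULT_RANK = 1


def merge_links(links):
    """Merge links that share a 'loc', keeping the best-ranked type.

    Fields that the winning link lacks (value None or absent) are filled
    in from the losing link, so no data is thrown away.
    """
    merged = {}
    for link in links:
        loc = link["loc"]
        current = merged.get(loc)
        if current is None:
            merged[loc] = dict(link)
            continue
        new_rank = TYPE_RANK.get(link["type"], DEFAULT_RANK)
        cur_rank = TYPE_RANK.get(current["type"], DEFAULT_RANK)
        if new_rank < cur_rank:
            winner, loser = link, current
        else:
            winner, loser = current, link
        combined = dict(winner)
        for key, value in loser.items():
            if combined.get(key) is None:
                combined[key] = value
        merged[loc] = combined
    return list(merged.values())


# The menu link arrives first here, but the node link still wins.
menu_first = merge_links([
    {"type": "menu", "loc": "node/1", "lastmod": None},
    {"type": "node", "loc": "node/1", "lastmod": 1400000000},
    {"type": "menu", "loc": "contact", "lastmod": None},
])
# Same links with the node link first: the result for node/1 is identical.
node_first = merge_links([
    {"type": "node", "loc": "node/1", "lastmod": 1400000000},
    {"type": "menu", "loc": "node/1", "lastmod": None},
])
```

Because the winner is chosen by rank rather than by arrival order, the outcome does not depend on which submodule runs first, which is the problem described in comment #5.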
Comment #8
JAINV18 (as a volunteer) commented: #2 works for me. Thanks!
Comment #9
odrzutowiec commented: Hello, the solution proposed by gapple (the first one) worked perfectly for me, thanks!
Comment #10
giupenni commented: #2 works for me.
Comment #11
darrenwh (as a volunteer and at Investis Digital) commented: Very minor DCS thing: a comment is missing its end period (.).
Comment #12
Diego_Mow (at CI&T) commented: Hi.
I'm uploading patch #12.
I fixed the coding standards issue in the patch from #2.
Please review and check.
Comment #13
contentsuit commented: Patch #12 worked for me.
RTBC.
Comment #14
renatog commented: Hi, guys.
I applied patch #12 and it works well for me too.
Thank you very much for the contribution, @Diego_Mow.
Good work.
Regards.
Comment #16
renatog commented: Fixed.
Committed to the dev branch.
Thank you all for your contributions.
Regards.
Comment #17
JAINV18 (as a volunteer) commented: Hi, guys,
I applied patch #12 and it works well for me too.
Thank you very much for the contribution, @Diego_Mow.
Good work.