Problem/Motivation
I set a node as the home page at `/admin/config/system/site-information`, but the generated sitemap.xml contains two links to my home page:
my-site.com/node/1 and my-site.com/
Same issue as: https://www.drupal.org/project/simple_sitemap/issues/2894762
Proposed resolution
Instead of the solution from that issue (#2894762), add a checkbox to the configuration form. When it is enabled, sitemap generation automatically ignores the link defined as the front page in the site configuration: /admin/config/system/site-information
This solution doesn't force the user to duplicate the home page link, which could otherwise be modified in one place and not in the other.
Original report
https://www.drupal.org/project/simple_sitemap/issues/2894762
Do you think this could be a possible solution?
Comments
Comment #2
gbyte
I don't understand this sentence.
Having a node as the home page is not very common and fixing this for the sitemap is very easy:
Doing the above is about as much work as it would be to check your hypothetical checkbox. Why is this important to you?
Comment #3
rmpereira commented
I agree that the original solution can work in my case, but it forces me to set the home page URL in the module settings for each new site.
With the checkbox, I can enable it automatically at site installation and I don't need to configure the module any further.
There is no risk of forgetting to exclude the url from the homepage.
I want to automate as much as possible to limit manual actions and the risks of forgetting something.
Comment #4
gbyte
Also there is always the possibility of just checking 'remove duplicates' which will make sure the path is included just once.
In any case I'm afraid this is too much of an edge case and solvable in too many ways for it to be addressed in a new feature.
Comment #5
daggerhart commented
Disagree.
But for people who show up here looking for the answer, here is a simple example of `hook_simple_sitemap_links_alter()`.
This example is not generic enough for every site. Another pass at this could use `\Drupal::config('system.site')->get('page.front')` to determine what the front page node is, then detect that node during the loop by looking at `$link['meta']['entity_info']`.
Comment #6
gbyte
You are missing the foreach loop in your example code.
But why not just enable duplicate removal in the sitemap settings instead?
Comment #7
daggerhart commented
Thanks for catching the loop issue. Fixed.
I have "Exclude duplicate links" under Advanced settings enabled but it doesn't seem to work for this. Haven't dug in to see why, but in my sitemap I have both "/" and "/home" showing up. Maybe this is a bug?
Also to expand on my way-too-terse "disagree", with Layout Builder becoming more popular I'd expect having nodes for homepages to increase in popularity as well.
Comment #8
jimmb commented
I'm also having this problem, having built a Drupal 10 site with Layout Builder and consequently having a node for the homepage (/homepage) that's also loading at the main domain URL.
As such, in the XML sitemap, I'm seeing the same page twice: domain.com and domain.com/homepage
My question is about this from comment #2 above:
This would be a perfectly fine solution in the UI, but I'm not seeing how to do that. At /admin/config/search/simplesitemap/custom, I'm just seeing 'Add custom internal drupal paths to specific sitemaps.' and the 'Default' field below that (neither of which seem to apply here). And under that is 'Include images', which also isn't applicable.
So, where is this field where I can plug in "/homepage" as a relative path and have it excluded from the XML sitemap generation?
Comment #9
jimmb commented
Comment #10
gbyte
@jimmb If /homepage is a node, why not exclude that single node from index? (Edit the node and exclude it under the sitemap settings.)
And to answer your question (but in your case the above solution would be better I think): The default homepage is being indexed in the custom link area - removing the '/' from there means excluding the home page (or its duplicate) from index.
Comment #11
jimmb commented
Thanks so much for the reply! I'd considered doing what you suggested, but thought it would also remove the homepage on the main domain from being indexed....
But I just went into 'Edit' mode for /homepage, and selected "Do not index this Landing Page entity in sitemap". I then went to /admin/config/search/simplesitemap and clicked "Rebuild queue & generate".
And afterward, viewing my domain.com/sitemap.xml, I am seeing the main domain (i.e. the de facto homepage) as being indexed but not the redundant /homepage page. So this looks perfect now!
Thanks again for the help, and I will close this issue.
Comment #12
jimmb commented
Comment #13
gbyte
For 4.x see #3264573: Respect front page configuration
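For readers who still want the hook-based workaround discussed in comment #5, here is a rough sketch of the more generic variant. It is not the code originally posted in this thread. The module name `mymodule` and the helper `mymodule_remove_front_page_duplicate()` are hypothetical. The sketch assumes that `page.front` holds an internal path such as `/node/1` and that each link's `$link['meta']['path']` holds the internal path without a leading slash, so it compares paths rather than inspecting `$link['meta']['entity_info']`. Check both assumptions against your module version before relying on it.

```php
<?php

/**
 * Hypothetical helper: drops links whose internal path equals the front page.
 *
 * Assumes $link['meta']['path'] holds an internal path such as 'node/1'.
 */
function mymodule_remove_front_page_duplicate(array $links, string $front_path): array {
  // Normalize so '/node/1' and 'node/1' compare as equal.
  $front = trim($front_path, '/');
  if ($front === '') {
    // No usable front page path configured: leave the links untouched.
    return $links;
  }
  foreach ($links as $key => $link) {
    $path = trim((string) ($link['meta']['path'] ?? ''), '/');
    if ($path === $front) {
      // This is the node URL that duplicates the '/' entry.
      unset($links[$key]);
    }
  }
  return $links;
}

/**
 * Implements hook_simple_sitemap_links_alter().
 *
 * The second argument identifies the sitemap being generated; unused here.
 */
function mymodule_simple_sitemap_links_alter(array &$links, $sitemap_variant) {
  $front_path = (string) \Drupal::config('system.site')->get('page.front');
  $links = mymodule_remove_front_page_duplicate($links, $front_path);
}
```

Keeping the filtering in a plain function means it can be exercised without bootstrapping Drupal. The hook itself only reads the configured front page and delegates. The sitemap has to be regenerated afterward for the change to show up.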