Problem/Motivation

I set a node as the home page in `/admin/config/system/site-information`, but the generated sitemap.xml contains two links to my home page:
my-site.com/node/1 and my-site.com/
Same issue as: https://www.drupal.org/project/simple_sitemap/issues/2894762

Proposed resolution

Instead of the solution from that issue (#2894762), add a checkbox to the configuration form. When it is checked, sitemap generation automatically ignores the front page path defined in the site configuration at /admin/config/system/site-information

This solution doesn't force the user to duplicate the link of the homepage that can be modified in one place and not the other.

Original report

https://www.drupal.org/project/simple_sitemap/issues/2894762

Do you think this could be a possible solution?

Comments

rmpereira created an issue. See original summary.

gbyte’s picture

This solution doesn't force the user to duplicate the link of the homepage that can be modified in one place and not the other.

I don't understand this sentence.

Having a node as the home page is not very common and fixing this for the sitemap is very easy:

You should either exclude the node/1 path from the sitemap on the node's edit page, or go to admin/config/search/simplesitemap/custom and exclude the path /.

Doing the above is about as much work as it would be to check your hypothetical checkbox. Why is this important to you?

rmpereira’s picture

I agree that the original solution can work in my case, but it forces me to set the homepage URL in the module settings on each new site.
With the checkbox, I could enable it automatically at site installation and would not need to configure the module any further.
There would be no risk of forgetting to exclude the homepage URL.

I want to automate as much as possible to limit manual actions and the risks of forgetting something.

gbyte’s picture

Status: Active » Closed (works as designed)

Also there is always the possibility of just checking 'remove duplicates' which will make sure the path is included just once.

In any case I'm afraid this is too much of an edge case and solvable in too many ways for it to be addressed in a new feature.

daggerhart’s picture

Having a node as the home page is not very common

Disagree.

But for people who show up here looking for the answer, here is a simple example of hook_simple_sitemap_links_alter().

/**
 * Implements hook_simple_sitemap_links_alter().
 */
function MYMODULE_simple_sitemap_links_alter(array &$links, $sitemap_variant) {
  foreach ($links as $key => $link) {
    // Remove the duplicate home page path.
    // Link with "/" should remain.
    if (substr($link['url'], -5) == '/home') {
      unset($links[$key]);
    }
  }
}

This example is not generic enough for every site. Another pass at this could use \Drupal::config('system.site')->get('page.front') to determine what the front page node is, then detect that node during the loop by looking at $link['meta']['entity_info'].
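For readers who want the more generic pass described above, here is a minimal sketch. It takes a slightly different route than the one suggested: instead of inspecting $link['meta']['entity_info'], it compares $link['meta']['path'] with the configured front page path. This rests on two assumptions worth verifying on your own site: that page.front holds the internal path with a leading slash (for example "/node/1"), and that each link's meta path holds the internal path without one (for example "node/1"). The helper name MYMODULE_remove_front_page_links() is hypothetical and not part of the module.

```php
<?php

/**
 * Hypothetical helper: drops links whose internal path is the front page path.
 *
 * @param array $links
 *   Links as passed to hook_simple_sitemap_links_alter().
 * @param string $front_path
 *   The system.site page.front value, assumed to look like "/node/1".
 *
 * @return array
 *   The links without the duplicate front page entry.
 */
function MYMODULE_remove_front_page_links(array $links, string $front_path): array {
  // Assumption: meta path has no leading slash, so normalize before comparing.
  $front_path = trim($front_path, '/');
  if ($front_path === '') {
    // No front page configured; leave the links untouched.
    return $links;
  }
  foreach ($links as $key => $link) {
    // The link for "/" is expected to have a different (empty) meta path,
    // so it should remain.
    if (($link['meta']['path'] ?? NULL) === $front_path) {
      unset($links[$key]);
    }
  }
  return $links;
}

/**
 * Implements hook_simple_sitemap_links_alter().
 */
function MYMODULE_simple_sitemap_links_alter(array &$links, $sitemap_variant) {
  $front_path = (string) \Drupal::config('system.site')->get('page.front');
  $links = MYMODULE_remove_front_page_links($links, $front_path);
}
```

Keeping the filtering in a separate function that receives the front path as an argument makes it possible to check the logic without a running Drupal site. After enabling this, rebuild the sitemap and confirm that "/" is still listed.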

gbyte’s picture

You are missing the foreach loop in your example code.

But why not just enable duplicate removal in the sitemap settings instead?

daggerhart’s picture

Thanks for the loop issue catch. Fixed.

I have "Exclude duplicate links" under Advanced settings enabled but it doesn't seem to work for this. Haven't dug in to see why, but in my sitemap I have both "/" and "/home" showing up. Maybe this is a bug?

Also to expand on my way-too-terse "disagree", with Layout Builder becoming more popular I'd expect having nodes for homepages to increase in popularity as well.

jimmb’s picture

I'm also having this problem, having built a Drupal 10 site with Layout Builder and consequently having a node for the homepage (/homepage) that's also loading at the main domain URL.

As such, in the XML sitemap, I'm seeing the same page twice: domain.com and domain.com/homepage

My question is about this from comment #2 above:

you should ... go to admin/config/search/simplesitemap/custom and exclude the path /.

This would be a perfectly fine solution in the UI, but I'm not seeing how to do that. At /admin/config/search/simplesitemap/custom, I'm just seeing 'Add custom internal drupal paths to specific sitemaps.' and the 'Default' field below that (neither of which seem to apply here). And under that is 'Include images', which also isn't applicable.

So, where is this field where I can plug in "/homepage" as a relative path and have it excluded from the XML sitemap generation?

jimmb’s picture

Status: Closed (works as designed) » Active

gbyte’s picture

@jimmb If /homepage is a node, why not exclude that single node from index? (Edit the node and exclude it under the sitemap settings.)

And to answer your question (but in your case the above solution would be better I think): The default homepage is being indexed in the custom link area - removing the '/' from there means excluding the home page (or its duplicate) from index.

jimmb’s picture

Thanks so much for the reply! I'd considered doing what you suggested, but thought it would also remove the homepage on the main domain from being indexed....

But I just went into 'Edit' mode for /homepage, and selected "Do not index this Landing Page entity in sitemap". I then went to /admin/config/search/simplesitemap and clicked "Rebuild queue & generate".

And afterward, viewing my domain.com/sitemap.xml, I am seeing the main domain (i.e. the de facto homepage) as being indexed but not the redundant /homepage page. So this looks perfect now!

Thanks again for the help, and I will close this issue.

jimmb’s picture

Status: Active » Closed (works as designed)
gbyte’s picture