Motivation

I noticed several problems with the XML sitemap generation process. The steps below reproduce them.

Initial configuration

  • drush dl i18n
  • drush en i18n
  • drush en i18n_node
  • drush en i18n_menu
  • drush dl pathauto
  • drush en pathauto
  • drush dl xmlsitemap
  • drush en xmlsitemap_i18n
  • drush en xmlsitemap_user (for rebuild links)
  • Add French language
  • Enable language detection from URL (admin/config/regional/language/configure)
  • Set path prefix for English to en (admin/config/regional/language/edit/en)

Steps to reproduce

  • Create new content type News
  • Edit content type News: under Publishing options, set Multilingual support to "Enabled, with translation"
  • Add a custom menu and link the News content type to it
  • Add two nodes: one English (node/1) titled NewsEN and one French (node/2) titled NewsFR. Link them as translations of each other and add both to the custom menu created above
  • Set the Pathauto pattern for all English News paths to news/[node:title]
  • Set the Pathauto pattern for all French News paths to nouvelles/[node:title]
  • Add two XML sitemaps, English and French

Problem

From this point, there are various cases:

  1. Include nodes in XML sitemap (1 & 2), include menu in XML sitemap

    rebuild from en/admin/config/search/xmlsitemap =>
    * EN sitemap: /en/news/newsen (this comes from the menu; the node is most probably skipped by the generation process: if ($link_url == $last_url) { continue; })
    * FR sitemap: /fr/nouvelles/newsfr [lastmod]...[/lastmod][changefreq]...[/changefreq] (this comes from French node)

    rebuild from fr/admin/config/search/xmlsitemap =>
    * EN sitemap: en/news/newsen (this comes from menu)
    en/node/2 (this comes from French node)
    en/news/newsen [lastmod]...[/lastmod][changefreq]...[/changefreq] (this comes from English node)
    * FR sitemap:
    /fr/node/1 (this comes from English node)
    /fr/nouvelles/newsfr (this comes from the French menu link; the node is skipped again by the generation process)

  2. Exclude nodes from XML sitemap (1 & 2), include menu in XML sitemap

    rebuild from en/admin/config/search/xmlsitemap =>
    * EN sitemap: en/news/newsen (which is expected behavior)
    * FR sitemap: fr/node/1 (the English node again, which shouldn't be here; there is no French entry at all because the menu link is missing)

    rebuild from fr/admin/config/search/xmlsitemap =>
    * EN sitemap: en/news/newsen
    en/node/4 (which is extra)
    * FR sitemap: fr/node/1 (still extra)
    fr/nouvelles/newsfr (now appears correctly; see i18n_menu_translated_menu_link_alter() and _i18n_menu_link_is_visible())
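
The difference between the two rebuilds in case 1 is consistent with a check that only compares each URL with the one emitted immediately before it. The following is a simplified standalone model of such a loop, not the xmlsitemap code itself; the function name emit_links() and the 'source' key are hypothetical, and the link order is taken from the output listed above rather than from the module's actual query.

```php
<?php
// Simplified model of a "skip if same as previous URL" loop.
// emit_links() and the array keys are illustrative names only.
function emit_links(array $links) {
  $last_url = '';
  $output = array();
  foreach ($links as $link) {
    // Only the immediately preceding URL is remembered.
    if ($link['url'] == $last_url) {
      continue;
    }
    $last_url = $link['url'];
    $output[] = $link['url'] . ' (' . $link['source'] . ')';
  }
  return $output;
}

// Rebuild from /en: the duplicates are adjacent, so the node entry is dropped.
$from_en = array(
  array('url' => '/en/news/newsen', 'source' => 'menu'),
  array('url' => '/en/news/newsen', 'source' => 'node'),
);

// Rebuild from /fr: en/node/2 sits between the duplicates, so both survive.
$from_fr = array(
  array('url' => '/en/news/newsen', 'source' => 'menu'),
  array('url' => '/en/node/2', 'source' => 'node'),
  array('url' => '/en/news/newsen', 'source' => 'node'),
);

print_r(emit_links($from_en));
print_r(emit_links($from_fr));
```

In this model the first list yields one entry (the menu link, without lastmod or changefreq) and the second yields all three, which matches the EN sitemap in both rebuilds of case 1.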

Comments

miro_dietiker

One of my ideas was to switch into the target language of each sitemap during the xmlsitemap rebuild process, whenever a sitemap has a language assigned. The build process would then have the proper global $language object, and all contrib modules that rely on the global language context would get the proper context.
Dunno though if this will work with the architecture.
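
A minimal sketch of that idea, written as standalone PHP rather than as a patch to xmlsitemap. The helper name with_language() is hypothetical, and the language objects are plain stand-ins; in Drupal 7 the real objects would presumably come from language_list(). Anything cached statically under the original language would not be refreshed by this swap alone, so this only illustrates the save, switch, restore pattern.

```php
<?php
/**
 * Hypothetical helper: runs $callback with the global language replaced,
 * then restores the original value even if the callback throws.
 */
function with_language($language_object, $callback) {
  $original = isset($GLOBALS['language']) ? $GLOBALS['language'] : NULL;
  $GLOBALS['language'] = $language_object;
  try {
    return call_user_func($callback);
  }
  finally {
    $GLOBALS['language'] = $original;
  }
}

// Stand-ins for Drupal language objects (assumption: only ->language is needed).
$GLOBALS['language'] = (object) array('language' => 'en');
$french = (object) array('language' => 'fr');

// Code that relies on the global language sees 'fr' inside the callback.
$seen = with_language($french, function () {
  return $GLOBALS['language']->language;
});

print $seen . "\n";
print $GLOBALS['language']->language . "\n";
```

Here the callback stands in for the generation of one sitemap; the first line printed is the language seen during the callback and the second is the language after it has been restored.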

Dave Reid

It should be passing the language object (if a language is set in the sitemap context) to every call to url() when outputting links in the sitemap. So I'm not sure what we're missing; I don't quite understand the problem yet.

Dave Reid

FYI the xmlsitemap_i18n module really only handles logic for nodes. If it also needs to handle menus, that is something that needs to be addressed. Unfortunately, I'm not actively using i18n, so any change would need to be thoroughly tested.

vladan.me

Regarding the missing menu links, these are the functions I mentioned:

function i18n_menu_translated_menu_link_alter(&$item) {
  // Only process links to be displayed not processed before by i18n_menu.
  if (_i18n_menu_link_process($item)) {
    if (!_i18n_menu_link_is_visible($item)) {
      $item['hidden'] = TRUE;
    }
    elseif (_i18n_menu_link_is_localizable($item)) {
      // Item has undefined language, it is a candidate for localization.
      _i18n_menu_link_localize($item);
    }
  }
}

which then calls

function _i18n_menu_link_is_visible($link, $langcode = NULL) {
  if (arg(0) == 'node' && arg(2) == 'edit') {
    $query = db_select('node', 'n');
    $query->addField('n', 'language');
    $query->condition('n.nid', arg(1));
    $langcode = $query->execute()->fetchField();
  }
  $langcode = $langcode ? $langcode : i18n_language_interface()->language;
  return $link['language'] == LANGUAGE_NONE || $link['language'] == $langcode;
}

I believe this code sets $item to hidden because $langcode does not match $link['language'].
At the start of the function $langcode is not set, so it falls back to the current interface language. When the rebuild is started from /en/admin/config/search/xmlsitemap, it therefore gets the value 'en'.
As a result, all French menu items are treated as not visible, because their language doesn't match.
It looks like XML sitemap does not cover this case, but I am not completely sure how it should be resolved.
Thanks again for checking on it.
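
To make the fallback concrete, here is a simplified standalone model of the final two lines of _i18n_menu_link_is_visible(). It is not the i18n module code: link_is_visible() and visible_titles() are hypothetical names, the interface language is passed in as a parameter instead of being read from i18n_language_interface(), and the node edit branch is left out.

```php
<?php
// Same value Drupal 7 uses for language-neutral content.
const LANGUAGE_NONE = 'und';

// Model of the visibility test: fall back to the interface language
// when no explicit $langcode is given.
function link_is_visible(array $link, $interface_langcode, $langcode = NULL) {
  $langcode = $langcode ? $langcode : $interface_langcode;
  return $link['language'] == LANGUAGE_NONE || $link['language'] == $langcode;
}

// Returns the titles of the links that pass the visibility test.
function visible_titles(array $links, $interface_langcode, $langcode = NULL) {
  $titles = array();
  foreach ($links as $link) {
    if (link_is_visible($link, $interface_langcode, $langcode)) {
      $titles[] = $link['title'];
    }
  }
  return $titles;
}

$links = array(
  array('title' => 'NewsEN', 'language' => 'en'),
  array('title' => 'NewsFR', 'language' => 'fr'),
);

// Rebuild started from /en/...: no langcode passed, interface language is 'en'.
print implode(', ', visible_titles($links, 'en')) . "\n"; // NewsEN

// Same request, but with the sitemap's language passed explicitly.
print implode(', ', visible_titles($links, 'en', 'fr')) . "\n"; // NewsFR
```

In this model the French link only becomes visible when the French langcode is supplied explicitly or when the interface language itself is French, which is the behaviour seen when rebuilding from /fr/admin/config/search/xmlsitemap.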

Dave Reid

It might be worth trying #1966512: xmlsitemap_menu failing to properly create sitemaps for other languages. It sounds very similar, at least for menu links.

Kristen Pol

Status: Active » Closed (outdated)

Closing this because it's so old.