Google seems to suggest NOT creating multiple sitemaps by language context, but rather using rel="alternate"
http://support.google.com/webmasters/bin/answer.py?hl=en&answer=2620865

I had to manually create a sitemap index to explose these anyway.
#1663384: Sitemap index submodule

Is this submodule doing it right?

CommentFileSizeAuthor
#111 interdiff-1670086-108-111.txt13.57 KBjames.williams
#111 xmlsitemap-1670086-111-rel_alternate.patch14.14 KBjames.williams
#108 xmlsitemap-1670086-108-rel_alternate.patch9.98 KBjames.williams
#108 interdiff-1670086-107-108.txt1.68 KBjames.williams
#107 xmlsitemap-1670086-107-rel_alternate.patch9.91 KBjames.williams
#103 xmlsitemap-1670086-104-rel_alternate.patch9.88 KBbwaindwain
#99 interdiff.txt10.98 KBjyraya
#99 xmlsitemap-multilingual_rel_alternate-1670086-99.patch13.64 KBjyraya
#98 xmlsitemap-1670086-97-rel_alternate.patch9.44 KBcrasx
#96 xmlsitemap-1670086-96-rel_alternate.patch10.05 KBpapagrande
#91 xmlsitemap-1670086-91-rel_alternate.patch9.89 KBbwaindwain
#90 xmlsitemap-1670086-90-rel_alternate.patch9.89 KBapugacescu
#89 xmlsitemap-1670086-89-rel_alternate.patch9.9 KBjames.williams
#89 interdiff-1670086-88-89.txt493 bytesjames.williams
#88 interdiff-1670086-86-88.txt630 bytesles lim
#88 xmlsitemap-1670086-88-rel_alternate.patch10.38 KBles lim
#86 xmlsitemap-using_rel_alternate-1670086-86.patch10.35 KBpifagor
#80 xmlsitemap-using_rel_alternate-1670086-80.patch10.2 KBbwaindwain
#75 xmlsitemap-using_rel_alternate-1670086-71.patch10.1 KBartematem
#69 xmlsitemap-using_rel_alternate-1670086-69.patch9.84 KBartematem
#68 xmlsitemap-using_rel_alternate-1670086-68.patch9.73 KBtoschl
#66 xmlsitemap-using_rel_alternate-1670086-66.patch9.2 KBbwaindwain
#65 xmlsitemap-using_rel_alternate-1670086-65.patch9.19 KBbwaindwain
#61 xmlsitemap-using_rel_alternate-1670086-61.patch9.74 KBpobster
#60 xmlsitemap-using_rel_alternate-1670086-60.patch9.71 KBpobster
#50 xmlsitemap-using_rel_alternate-1670086-50.patch8.89 KBidebr
#50 interdiff-48-50.txt652 bytesidebr
#48 xmlsitemap-using_rel_alternate-1670086-48.patch8.87 KBsteven jones
#29 interdiff.txt1006 bytesjoelpittet
#29 using_rel_alternate-1670086-29.patch8.93 KBjoelpittet
#24 using_rel_alternate-1670086-24.patch8.84 KBjoelpittet
#17 xmlsitemap-hreflang-1670086-17.patch10.49 KBscreon
#15 using_rel_alternate-1670086-15.patch1.08 KBscreon
#12 Screen Shot 2014-09-17 at 3.55.18 AM.png327.14 KBmgifford
#10 xmlsitemap-hreflang-1670086-10.patch10.5 KBnuez
#9 xmlsitemap-hreflang-1670086-9.patch10.48 KBnuez
#7 xmlsitemap-multilingual_sitemap-1670086-7.patch7.7 KBkala4ek
#5 xmlsitemap-multilingual_sitemap-1670086-5.patch6.88 KBkala4ek

Comments

Anonymous’s picture

Version: 7.x-2.0-rc1 » 7.x-2.x-dev
Category: task » feature
Status: Active » Postponed (maintainer needs more info)

It isn't part of the standard sitemaps protocol. Dave Reid, what do you think about adding the extra namespace? This would have to be optional since the sitemap.xml chunks could get lengthy with this and there is a file size limit as well as a number of items per file limit. Frankly, if I were using i18n I would prefer separate files per language.

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
  xmlns:xhtml="http://www.w3.org/1999/xhtml">
  <url>
    <loc>http://www.example.com/english/</loc>
    <xhtml:link 
                 rel="alternate"
                 hreflang="de"
                 href="http://www.example.com/deutsch/"
                 />
    <xhtml:link 
                 rel="alternate"
                 hreflang="de-ch"
                 href="http://www.example.com/schweiz-deutsch/"
                 />
   ...
mgifford’s picture

Issue tags: +i18n

I'm very interested in seeing how to best do this as well.

i18n is part of Core and has been since D6. I think it would be a best practice to see that xmlsitemaps supports multiple languages out of the box.

tagging.

doublejosh’s picture

Agreed. However i18n translation support is only new to D7.

Anonymous’s picture

Assigned: Unassigned » dave reid

Dave there is a question to you on #1.

Anonymous’s picture

Issue summary: View changes

link

kala4ek’s picture

Component: xmlsitemap_i18n.module » xmlsitemap.module
Assigned: dave reid » Unassigned
Issue summary: View changes
Status: Postponed (maintainer needs more info) » Needs review
Issue tags: +multilingual
StatusFileSize
new6.88 KB

Implemented multilingual sitemap solution during Summer Drupal Code Sprint Novosibirsk 2014 #1.

This simple solution works well with all entities through enitiy_translation module or with nodes via i18n_node.
For start using you should run database update and select checkbox "Multilingual sitemap".

Best regards, kala4ek and Kovalevich.Artem.

mrP’s picture

Since this patch adds XHTML, than #2183765: Add XHTML namespace option is redundant. Feel free to close that issue if anyone agrees.

+++ b/xmlsitemap.admin.inc
@@ -280,7 +280,12 @@ function xmlsitemap_settings_form($form, &$form_state) {
+  $form['xmlsitemap_multilingual'] = array(
+    '#type' => 'checkbox',
+    '#title' => t('Multilingual sitemap'),
+    '#description' => t('When enabled, this add multilingual sitemap like !link.', array('!link' => l(t('here'), 'https://support.google.com/webmasters/answer/2620865?hl=en'))),
+    '#default_value' => variable_get('xmlsitemap_multilingual', 0),
+  );

I would think this should be under advanced options -- $form['advanced']['xmlsitemap_xhtml'] or $form['advanced']['xmlsitemap_multilingual']. If so, than we need to add it in function xmlsitemap_variables()

kala4ek’s picture

Updated patch. Added multilingual option into $form['advanced'] and to xmlsitemap_variables().

nuez’s picture

This patch is good for multilingual content (nodes) but what about adding the rel="alternate" tag to custom pages (generated by views) and the frontpage?
We need some UI to add the tags manually...I will make another patch for that if that's ok.

Then there is this other doubt I have: on this page https://support.google.com/webmasters/answer/189077?hl=en, it is mentioned that you should use the rel="alternate" if only your template is translated but the actual content is not.

For example: on my website I have mostly multilingual contents but also a language neutral blog (which is in fact in english). You can access the blogposts through 3 different language prefixes. In that case you should add the rel="alternate" tag here too, because the content is duplicated but the interface is translated.

nuez’s picture

StatusFileSize
new10.48 KB

Thanks kala4ek,

Updated patch a little bit further.

  • The rel="alternate" xhtml links added by the patch had the wrong attribute: it should be hreflang instead of lang.
  • Added the option to generate the rel="alternate" hreflang="x" link to all pages on the website, not only translated nodes. As explained here, you can use the link on pages that only have parts of the website translated, e.g. only the navigation. It's also important to have the links for the homepage so google will display the search result in the right language.
  • Added the possibility to add the x-default tag, telling google which page should be displayed when the visitor's language isn't any of the site's languages.See
nuez’s picture

StatusFileSize
new10.5 KB

There was an error in the patch the x-default being repeated. fixed in this updated patch

dr.osd’s picture

I have applied patch #10 for current sitemap module release, but have no effect. Cache cleared, sitemap rebuild, checkbox in settings is set.

mgifford’s picture

StatusFileSize
new327.14 KB

So did you test it on simplytest.me or on an existing install?

I assume you installed the XML sitemap internationalization sub-module. I set it up here http://s2a21e30abe4eca4.s3.simplytest.me/

I didn't see it here:
XML Sitemap Page

I assume if I've got:
http://s2a21e30abe4eca4.s3.simplytest.me/ab/sitemap.xml

The rel="alternate" should be in the header of that page, right?

@nuez - any suggestions what we might have missed?

screon’s picture

After some trial and error the patch from #10 worked for me.

I use entity translation for my multilingual site, enabled the node_translation_sitemap module, enabled xmlsitemap_i18n, updated database and re-added my sitemap with the correct options checked.

So thank you guys!

The only question I have is, you are using rel="alternative" while Google uses rel="alternate" in its examples. Does it matter?

mgifford’s picture

Status: Needs review » Needs work

It would matter. Needs to be rel="alternate" hreflang="x"

Think this needs to be changed here as you suggested:

+      $xhtml_links[] = array(
+        'rel' => 'alternative',
+        'href' => url($link['original'], array('language' => $lang, 'absolute' => TRUE)),
+        'hreflang' => $lang->language,
+      );
+      $x_default = xmlsitemap_var('multilingual_x_default');
+    }
+    if ($x_default != 'none') {
+      $x_default_lang = $x_default == "default" ? language_default()->language : $x_default;
+      $xhtml_links[] = array(
+        'rel' => 'alternative',
+        'href' => url($link['original'], array('language' => $x_default_lang, 'absolute' => TRUE)),
+        'hreflang' => 'x-default',
+      );
+    }
screon’s picture

StatusFileSize
new1.08 KB

Here's a patch that changes rel="alternative" in rel="alternate"

nuez’s picture

I think it should indeed be rel="alternate", thanks for fixing that

screon’s picture

Status: Needs work » Needs review
StatusFileSize
new10.49 KB

nuez, I edited your patch from #10 to incorporate my changes. That way there aren't 2 different patches to apply.

mgifford’s picture

Is anyone using this patch in a live environment?

Can we verify it is working with Google in a multi-lingual environment before marking it RTBC?

joelpittet’s picture

Status: Needs review » Needs work
+++ b/xmlsitemap.admin.inc
@@ -330,6 +330,32 @@ function xmlsitemap_settings_form($form, &$form_state) {
+  $form['advanced']['xmlsitemap_multilingual'] = array(
+    '#type' => 'checkbox',
+    '#title' => t('Multilingual sitemap'),
+    '#description' => t('When enabled, <em>rel="alternate" hreflang="x"</em> links are added to each item. More information !link.', array('!link' => l(t('here'), 'https://support.google.com/webmasters/answer/2620865?hl=en'))),
+    '#default_value' => xmlsitemap_var('multilingual'),
+  );
+  $form['advanced']['xmlsitemap_multilingual_only_nodes'] = array(
+    '#type' => 'checkbox',
+    '#title' => t('Only add <em>rel="alternate" hreflang="x"</em> links to translated nodes.'),
+    '#description' => t('When enabled the <em>rel="alternate" hreflang="x"</em>  links only apply to translated nodes. Multilingual sitemap has to be checked. More information !link.', array('!link' => l(t('here'), 'https://support.google.com/webmasters/answer/189077?hl=en'))),
+    '#default_value' => xmlsitemap_var('multilingual_only_nodes'),
+  );
+  $x_default_options = array(
+    'none' => t("Don't use X-Default link"),
+    'default' => t("Use default language"),
+  );
+  foreach(language_list() as $lang_code => $language){
+    $x_default_options[$lang_code] = $language->name;
+  }
+  $form['advanced']['xmlsitemap_multilingual_x_default'] = array(
+    '#type' => 'select',
+    '#title' => t('Add a <em>hreflang="x-default"</em> link'),
+    '#description' => t("The <em>hreflang='x-default'</em> link is used to set the default page when the visitor's language is not available as a site language. Multilingual sitemap has to be checked. More information !link.", array('!link' => l(t('here'), 'http://googlewebmastercentral.blogspot.nl/2013/04/x-default-hreflang-for-international-pages.html'))),
+    '#default_value' => xmlsitemap_var('multilingual_x_default'),
+    '#options' => $x_default_options,
+  );

These checkboxes look useful but they are not used within the module.

joelpittet’s picture

Status: Needs work » Needs review

Sorry, I think I'm wrong, xmlsitemap_var() is weird to me as usually it would use variable_get()

The last submitted patch, 15: using_rel_alternate-1670086-15.patch, failed testing.

joelpittet’s picture

I am using this and I am not sure yet if it's successful... we shall see:)

joelpittet’s picture

Status: Needs review » Needs work
Issue tags: +Needs reroll

This needs a re-roll since 2.1 was released.

joelpittet’s picture

Status: Needs work » Needs review
Issue tags: -Needs reroll
StatusFileSize
new8.84 KB

Here's a re-roll. Hope I didn't miss anything.

joelpittet’s picture

Status: Needs review » Needs work
+++ b/xmlsitemap.install
@@ -544,6 +544,15 @@ function xmlsitemap_update_7203() {
+/**
+ * Implement hook_update_N().
+ * Change primary keys to id, type, language of xmlsitemap table.
+ */
+function xmlsitemap_update_7204() {
+  db_drop_primary_key('xmlsitemap');
+  db_add_primary_key('xmlsitemap', array('id', 'type', 'language'));
+}
+

I'm not sure if this is still needed? And if so, does it need a schema change as well to accompany it?

joelpittet’s picture

+++ b/xmlsitemap.xmlsitemap.inc
@@ -83,6 +83,7 @@ class XMLSitemapWriter extends XMLWriter {
+    $attributes['xmlns:xhtml'] = 'http://www.w3.org/1999/xhtml';

This change may be out of scope and it seems to be throwing errors when I try to validate:

Error 1845: Element '{http://www.w3.org/1999/xhtml}link': No matching global element declaration available, but demanded by the strict wildcard. in https://example.com/us/sitemap.xml on line 4 column 0

Using this service:
http://www.xmlcheck.com/checkurl.php

joelpittet’s picture

nm #26 I read the errors wrong

joelpittet’s picture

+++ b/xmlsitemap.xmlsitemap.inc
@@ -133,23 +134,38 @@ class XMLSitemapWriter extends XMLWriter {
+  public function writeElement($name, $data = NULL) {

I made a mistake in the re-roll. Needs to be $data = array() I think...

Anyways this patch doesn't seem to be doing what it used to in #17

joelpittet’s picture

Status: Needs work » Needs review
StatusFileSize
new8.93 KB
new1006 bytes

Ok this patch is a closer match, though I had to revert format_xml_elements() change to make it work as it did in #17.

johnpicozzi’s picture

@joelpittet - I tried it with 7.x-2.1 and it failed on xmlsitemap.xmlsitemap.inc

joelpittet’s picture

It should only apply to 2.x-dev which is where the latest changes are.

johnpicozzi’s picture

@joepittet - That makes sense.. The patch works, however got the following error when I tried a rebuild.

"An AJAX HTTP error occurred. HTTP Result Code: 500 Debugging information follows. Path: /batch?id=189&op=do StatusText: error ResponseText:"

joelpittet’s picture

Huh... maybe there is a typo, though I didn't get that. Can you clear cache and rebuild the registry and try that again?

500 error is either a fatal or apache just gave up I find...

johnpicozzi’s picture

No luck... Same error

joelpittet’s picture

Check your PHP logs to see what error message is behind that 500.

johnpicozzi’s picture

I did... Nothing in their. I'm working locally right now... might push it out to a dev server and try it there.

johnpicozzi’s picture

Ok, Tried it on our dev server and got the following error when trying to rebuild the sitemap with drush

PHP Fatal error: Call to undefined function language_negotiation_get() in /sites/all/modules/contrib/xmlsitemap/xmlsitemap.generate.inc

joelpittet’s picture

+++ b/xmlsitemap.generate.inc
@@ -521,3 +527,52 @@ function xmlsitemap_get_rebuildable_link_types() {
+  if (language_negotiation_get('language', 'locale-url')) {

Ah, well then I guess there needs to be a check if you don't have a multilingual site. The language.inc file isn't loaded unless there is more than one language enabled.

johnpicozzi’s picture

@joelpittet - Is it possible to have the hreflang in the Sitemap with the sites default language?

joelpittet’s picture

@johnpicozzi I don't believe there is a good reason for that, it will just add bulk to it but no real value. IMO

johnpicozzi’s picture

@joelpittet Here is a use case... a multi-site with sites that are geared towards different countries but are still in english.

Where the hreflang would be something like: en-gb, en-au, es-mx, and es-es.

This post illustrates what I'm getting at: http://searchengineland.com/how-to-implement-the-hreflang-element-using-...

joelpittet’s picture

That sounds like a similar but different feature than this... Though I'm doing just that with the multilingal site and not multisite setup. en-CA and en-US in my case.

dermyrddin’s picture

@johnpicozzi The 500 error is caused by low script execution time in php.ini, increasing this value helps me.

@joelpittet On big amount of nodes (33k * 2 languages) it's take a lot of time to write xml files after generating data, so the 500 error occurs. Maybe it's a good idea to do some optimisations on writing. Also automatic calculation of links amount per file results to generate files bigger than 10 MB, one need to decrease it to 25k or even 10k.

Also I think there is a bug with disabled languages in the patch. Translations for languages disabled in site configuration are also added to the resulting xml, althought they are not accessible.

joelpittet’s picture

@Merlineus I've not had that problem yet but I have seen when doing some performance testing the slow hooks on this module for node_insert/update/delete.

I think it would be a good idea to open a performance issue for that. It may take a few things to get that right as there isn't going to be a panacea for performance knowing the structure that needs to be generated is verbose.

For the bug any chance you can post a patch trying to fix what you see?

Jurgen8en’s picture

Hi,

The entitiy_translation was not working for me, changed

function xmlsitemap_generate_multilingual($link) {

    if (module_exists('entity_translation')) {
      $type = $link['type']; 
      
      //node_translation_de instead of node
      preg_match("/(.*)_translation_(.*)/", $type, $output_array);
      if(!empty($output_array)){
        $type = $output_array[1];
        $langcode = $output_array[2];
      }
      
      $translation = db_select('entity_translation', 'et')->fields('et', array('language'))->condition('et.entity_type', $type)->condition('et.entity_id', $link['id'])->execute()->fetchCol();
    }
knalstaaf’s picture

It seems like this patch (#29) is breaking my actual sitemap. Whereas it looked pretty structured with the Multilingual sitemap option unchecked:

<url>
    <loc>http://domain.com/en/product/sdf800</loc>
    <lastmod>2014-07-02T16:47Z</lastmod>
    <changefreq>yearly</changefreq>
</url>
<url>
    <loc>http://domain.com/en/product/sdf0880</loc>
    <lastmod>2014-07-02T16:47Z</lastmod>
    <changefreq>yearly</changefreq>
</url>
<url>
    <loc>http://domain.com/en/product/sd6700</loc>
    <lastmod>2014-07-02T16:47Z</lastmod>
    <changefreq>yearly</changefreq>
</url>
(...)

With the Multilingual sitemap option checked all the links and data are one big long string:

http://domain.com/endaily1.0 http://domain.com/en/product/sdf501002014-07-02T16:47Zyearly http://domain.com/en/product/sdf751002014-07-02T16:47Zyearly http://domain.com/en/product/sdf1001002014-07-02T16:47Zyearly (...)

I'm also getting the following error when rebuilding the XML-sitemap after checking the Multilingual sitemap option:

Notice: Trying to get property of non-object in locale_language_url_rewrite_url() (line 429 of /includes/locale.inc).

It's not that the XML sitemap internationalization submodule has to be enabled?

sumeet.pareek’s picture

When working on the same problem, but not in the context of entity_translations, here is what I had to do eventually

- #2514288: Having rel="alternate" URLs in sitemap under custom circumstances

diff --git a/xmlsitemap.generate.inc b/xmlsitemap.generate.inc
index 57c3bc1..7d99094 100644
--- a/xmlsitemap.generate.inc
+++ b/xmlsitemap.generate.inc
@@ -207,11 +207,48 @@ function xmlsitemap_generate_chunk(stdClass $sitemap, XMLSitemapWriter $writer,
       $link_count++;
     }

-    $element = array();
-    $element['loc'] = $link_url;
+
+    // Existing functionality already writes the sitemap chunk as,
+    //     <url> something-123 something-456 </url>
+    // What we want is to write stuff in between as,
+    //     <url> something-123 something-NEW-alpha something-NEW-beta somehting-456 </url>
+
+    // XMLWriter's default way is to write chunks is forward only.
+    // So when using that method, there cannot be any 'inbetween' having multiple same xml elements
+    // in single chunk would require - (same array key + multiple values pairs). Not possible.
+    //   Refer - http://php.net/manual/en/intro.xmlwriter.php
+    // So we need to breakdown the chunk writing functionality as below.
+
+    $writer->startElement("url");
+    $writer->writeElement("loc", $link_url);
+
+    // For a geo enabled site, add alternate URLs in the sitemap
+    if (_todo_fund_do_enchanced_xmlsitemap()) { // @todo have real function.
+      global $base_url;
+      // Get list of countries for the current link
+      $link_array = parse_url($link_url);
+      $original_path = $link_array['path'];
+      $alternate_url_prefixes = _todo_func_get_prefixes($original_path); // @todo Have real function.
+      // Include alternate content url to sitemap
+      foreach ($alternate_url_prefixes as $alternate_url_prefix) {
+        $href = url("$alternate_url_prefix".$original_path, array(
+                  'absolute' => TRUE,
+                  'base_url' => $base_url,
+                ));
+        $writer->startElement('xhtml:link');
+        $writer->writeAttribute('rel', 'alternate');
+        $writer->writeAttribute('href', $href);
+        $writer->writeAttribute('hreflang', $alternate_url_prefix);
+        // @todo Assuming prefix = language code, above. This could be another function if not.
+
+        $writer->endElement(); //end element xhtml:link
+      }
+    }
+
+
     if ($link['lastmod']) {
       if (!empty($output_elements['lastmod'])) {
-        $element['lastmod'] = gmdate($lastmod_format, $link['lastmod']);
+        $writer->writeElement("lastmod", gmdate($lastmod_format, $link['lastmod']));
       }
       // If the link has a lastmod value, update the changefreq so that links
       // with a short changefreq but updated two years ago show decay.
@@ -220,19 +257,24 @@ function xmlsitemap_generate_chunk(stdClass $sitemap, XMLSitemapWriter $writer,
       $link['changefreq'] = (abs(REQUEST_TIME - $link['lastmod']) + $link['changefreq']) / 2;
     }
     if (!empty($output_elements['changefreq']) && $link['changefreq']) {
-      $element['changefreq'] = xmlsitemap_get_changefreq($link['changefreq']);
+      $writer->writeElement("changefreq", xmlsitemap_get_changefreq($link['changefreq']));
     }
     if (!empty($output_elements['priority']) && isset($link['priority']) && $link['priority'] != 0.5) {
       // Don't output the priority value for links that have 0.5 priority. This
       // is the default 'assumed' value if priority is not included as per the
       // sitemaps.org specification.
-      $element['priority'] = number_format($link['priority'], 1);
+      $writer->writeElement("priority", number_format($link['priority'], 1));
     }

     // @todo Should this be moved to XMLSitemapWritier::writeSitemapElement()?
     drupal_alter('xmlsitemap_element', $element, $link, $sitemap);

-    $writer->writeSitemapElement('url', $element);
+    $writer->endElement(); //end element url
+
+    // Sitemaps can be huge, after every 500 links, flush the XMLWriter buffer
+    if (($link_count % 500) == 0) {
+      $writer->flush();
+    }
   }

   return $link_count;
@@ -549,3 +591,30 @@ function xmlsitemap_rebuild_clear(array $types, $save_custom) {

   return $query->execute();
 }
+
+/**
+ * The function here should return alternate URL prefixes for given content.
+ * It could be based on entity translations, or country domain regions, etc.
+ *
+ * @param string $content_path
+ *
+ * @return array
+ *   alternate url prefixes for the content.
+ */
+function _todo_func_get_prefixes($content_path) {
+  //@todo Have a real function here.
+  return array ('ar', 'mx', 'uy'); // dummy return value
+}
+
+/**
+ * Tells when to have enhanced alternate URLs in xmlsitemap, when not to have them.
+ *
+ * @param string $content_path
+ *
+ * @return bool
+ *   whether or not to have enhanced alternate URLs in xmlsitemap
+ */
+function _todo_fund_do_enchanced_xmlsitemap() {
+  // @todo Have a real function here.
+  return true; // dummy return value
+}
steven jones’s picture

I needed a slight tweak on the patch in #29 because if you don't have the pre-fetching of aliases enabled, then the $link['original'] never gets set. I've moved this outside of the if statement.

Status: Needs review » Needs work

The last submitted patch, 48: xmlsitemap-using_rel_alternate-1670086-48.patch, failed testing.

idebr’s picture

Reroll of #48 plus a small change to fix a notice. The comments in #29 to #47 have not been addressed.

sokru’s picture

I've both entity_translation and i18n node translation enabled. With patch #50, I'll get only content translated by entity_translation right. I assume missing hreflang in sitemap is one of the reasons why Google's search console shows "Your site has no hreflang tags.".

https://www.drupal.org/project/hreflang module creates these correctly as a head links.

dave reid’s picture

+++ b/xmlsitemap.xmlsitemap.inc
@@ -133,22 +134,36 @@ class XMLSitemapWriter extends XMLWriter {
-  public function writeElement($name, $content = NULL) {
-    if (is_array($content)) {
-      $this->startElement($name);
-      $xml_content = format_xml_elements($content);
-      // Remove additional spaces from the output.
-      $xml_content = str_replace(array(" <", ">\n"), array("<", ">"), $xml_content);
-      $this->writeRaw($xml_content);
-      $this->endElement();
+  public function writeElement($name, $data = array()) {

Why did this parameter need to be renamed? Might it be better to key using something like '_attributes' instead?

dave reid’s picture

Could we generate this information on hook_node_insert/update at all, instead of the generation process? I'm wondering if we should add some kind of serialized 'data' column in the links table where modules could insert data that could be rendered into the element.

dave reid’s picture

Also, I'm pretty sure this means we would *get rid* of xmlsitemap_i18n right? Why would we have a need to have multiple sitemaps per language if we have this patch?

dave reid’s picture

Plus, if we generate this data on hook_node_insert()/update(), then we could have entity_translation implement hook_xmlsitemap_link_alter() to inject its data instead.

dave reid’s picture

When used with core translation, this seems to be completely broken:

  • So say you're using core translation (not entity translation) for nodes
  • And you've got English nodes by default, and everything else is a translation of those nodes
  • Any nodes that have a n.tnid value <> 0 to be excluded​ from xmlsitemap generation by default, because we want those nodes to be included "inside" the English node's record in the sitemap
  • When generating the source (English) node's xmlsitemap record, that we should be searching for any nodes that have n.tnid = $link['id'], and outputting those nodes as the additional alternate URLs
  • I think the query should be: $translation = db_query("SELECT n.language FROM {node} n WHERE n.tnid = :nid", array(':nid' => $link['id']))->fetchCol();
  • The code also does not check that $link['type'] == 'node' and could incorrectly return values when $link does not correspond to a node (for example, if $link['type'] == 'user'] and $link['id'] == 1, then this would attempt to load information about node 1, when it should do nothing).
  • I don't also see anything that ​*excludes*​ all the translated nodes from being also included in the generationl>
  • further more, I would assume that if using core translation, we should be linking to the individual translated nodes (which may have completely different URLs), not just linking to the original node and changing the language argument
dave reid’s picture

The one thing not really answered here, is what we're supposed to do when the site has configured different *domains* per language, instead of just changing the path. Are we still supposed to only have one sitemap even though there are several different domains?

Also, if using entity_translation, why would we want to support this? Since the content is available at the same URL, why would we want to add all the alternate links?

dave reid’s picture

Status: Needs review » Needs work

Also the patch seems to be missing that I actually added attribute support to writeElement() in #2387557: Support more advanced XML element output, and those changes to that method are not actually necessary.

dave reid’s picture

For example:

function hook_xmlsitemap_element_alter(&$element, $link) {
  $element[] = array(
    'key' => 'xhtml:link',
    'attributes' => array(
      'rel' => 'alternate',
      'hreflang' => 'en',
      'href' => 'https://www.example.com/en/'
    ),
  );
}

Accomplishes the same desired output without any changes to writeElement() at all.

pobster’s picture

I realise the approach is still very much "under review" but I needed to make some alterations to the patch at #50 due to it publishing unpublished nodes, and also using the base url of the site which created it rather than the given base url set in the modules admin page.

It's literally the same patch just with conditions for "status" on the queries, and passing the base url to the url() functions (in the same way the module does it).

pobster’s picture

StatusFileSize
new9.74 KB

Not my bug (!) but I spotted an issue in my logs with this patch, it passes a string rather than an object into url() for the language parameter.

mozh92’s picture

<loc>
http://site.com/ru/about</loc>
<xhtml:link rel="alternate" href="http://site.com/ar/node/10572" hreflang="ar"/>
<xhtml:link rel="alternate" href="http://site.com/node/10572" hreflang="en"/>
<xhtml:link rel="alternate" href="http://site.com/ru/about" hreflang="ru"/>
<xhtml:link rel="alternate" href="http://site.com/node/10572" hreflang="x-default"/>
<lastmod>2017-01-26T13:56Z</lastmod
<changefreq>monthly</changefreq>

I have problem with nid. I use i18n for translate nodes
/ru/about = nid 10572
but I see 10572 for ar and en language?

UPD
after patching I did next changes

function function xmlsitemap_generate_multilingual($link)
added

$tnid = false;

line 593 - 597

$xhtml_links[] = array(
        'rel' => 'alternate',
        'href' => url($link['original'], array('language' => $lang, 'absolute' => TRUE, 'base_url' => variable_get('xmlsitemap_base_url', $GLOBALS['base_url']))),
        'hreflang' => $lang->language,
      );

change on

$nid_node = db_select('node', 'n')->fields('n', array('nid'))->condition('n.tnid', $tnid)->condition('n.language', $lang->language)->condition('n.status', 1)->execute()->fetchCol();
      if(!empty($tnid) && $link['type'] == 'node' && !empty($nid_node[0])){
        $l_node = 'node/'.$nid_node[0];
        $xhtml_links[] = array(
          'rel' => 'alternate',
          'href' => url($l_node, array(
            'language' => $lang,
            'absolute' => TRUE,
            'base_url' => variable_get('xmlsitemap_base_url', $GLOBALS['base_url'])
          )),
          'hreflang' => $lang->language,
        );
      }else {
        $xhtml_links[] = array(
          'rel' => 'alternate',
          'href' => url($link['original'], array(
            'language' => $lang,
            'absolute' => TRUE,
            'base_url' => variable_get('xmlsitemap_base_url', $GLOBALS['base_url'])
          )),
          'hreflang' => $lang->language,
        );
      }

and
line 604

'href' => url($link['original'], array('language' => $x_default_lang, 'absolute' => TRUE, 'base_url' => variable_get('xmlsitemap_base_url', $GLOBALS['base_url']))),

change to
'href' => url('node/'.$tnid, array('language' => $x_default_lang, 'absolute' => TRUE, 'base_url' => variable_get('xmlsitemap_base_url', $GLOBALS['base_url']))),

and I got good links for translated nodes

<loc>
http://site.com/ru/about</loc>
<xhtml:link rel="alternate" href="http://site.com/ar/about" hreflang="ar"/>
<xhtml:link rel="alternate" href="http://site.com/about" hreflang="en"/>
<xhtml:link rel="alternate" href="http://site.com/ru/about" hreflang="ru"/>
<xhtml:link rel="alternate" href="http://site.com/about" hreflang="x-default"/>
<lastmod>2017-01-26T13:56Z</lastmod
<changefreq>monthly</changefreq>
giupenni’s picture

Any progress for this?
This is an essential features for a best SEO practice.

pobster’s picture

We use the patch at #61 quite successfully, and have done for quite a few months now.

bwaindwain’s picture

This patch checks that the translation is available before adding the x-default. In our case, it is possible that a translation of a page in the default language is unpublished or not available at all.

Thanks to all who have contributed so far.

bwaindwain’s picture

updated to fix illegal offset

giupenni’s picture

#66 works for me

toschl’s picture

I've been missing the alternate links for the frontpage. Which has its own type and is not a node. So I added those. Please can someone review?

artematem’s picture

Updated to show node aliases for translations links instead of node/[nid] (the issue from #62).

artematem’s picture

Status: Needs work » Needs review

The last submitted patch, 61: xmlsitemap-using_rel_alternate-1670086-61.patch, failed testing. View results

The last submitted patch, 66: xmlsitemap-using_rel_alternate-1670086-66.patch, failed testing. View results

The last submitted patch, 65: xmlsitemap-using_rel_alternate-1670086-65.patch, failed testing. View results

The last submitted patch, 68: xmlsitemap-using_rel_alternate-1670086-68.patch, failed testing. View results

artematem’s picture

Additional to patch #69. Fix for not entity links.

nchase’s picture

what about entity translations?

pobster’s picture

They're clearly handled in the xmlsitemap_generate_multilingual() function.

bwaindwain’s picture

I've noticed that, with patch #66, if a page is added or a path is changed, future cron runs do not update sitemap.xml. Rebuilding links is the only way to update it. Is anyone else seeing this?

1. verify sitemap.xml is up-to-date
2. make sure "Minimum sitemap lifetime" is set to "no minimum" so that cron will do an update
2. create a new page
3. run cron manually /admin/config/system/cron
4. check sitemap.xml to see if new page is added

After studying this for a bit it seems that xmlsitemap_node module has a hook_node_insert() which calls xmlsitemap_link_save() which calls _xmlsitemap_check_changed_link() which returns FALSE instead of TRUE.

bwaindwain’s picture

Re #78, we're using some modules with node_access_grants and it turns out that the problem is solved with https://www.drupal.org/project/xmlsitemap/issues/690120.

bwaindwain’s picture

Here's patch #71/75 updated to work without translation module. Thanks all.

giupenni’s picture

#80 works for me

kala4ek’s picture

+1 for #80
Seems it almost ready to be committed.

WidgetsBurritos’s picture

Status: Needs review » Reviewed & tested by the community

+1 RTBC

WidgetsBurritos’s picture

Note: I've started the effort to porting this to D8. To avoid confusion on this issue as it hasn't been officially merged in D7 yet, I've posted it as a separate issue: #2941164: using rel="alternate" rather than multiple sitemaps by language context (Drupal 8)

If the maintainers would prefer I post it in here, I can. I just figure it's best to keep this issue focused on D7 until it has been merged in.

giupenni’s picture

I'm trying #80 in a fresh installation and seem workrs as I expected.

pifagor’s picture

I corrected the coding standards in the patch.
The logic of the patch does not change.

giupenni’s picture

I've applied #86 in two different sites and works as I expected.

les lim’s picture

Status: Reviewed & tested by the community » Needs review
StatusFileSize
new10.38 KB
new630 bytes

Got a fatal error from #86 because of a missing translation_path_get_translations() function. The patch presumes that if Entity Translation isn't enabled, then core Content Translation must be. In our case, we have locale enabled but no translations.

New patch and interdiff attached.

james.williams’s picture

StatusFileSize
new493 bytes
new9.9 KB

Here's a patch which avoids changing the primary key, i.e. resolving comment 25, hopefully.

This issue doesn't need to change the primary key, as I don't believe it needs to add rows to the xmlsitemap table anyway, it's just adding hreflang attributes at sitemap file generation time, rather than at the time of storing those records.

I think the primary key change was probably mixed up as part of #1481798: Add support for Entity Translation & Title modules, which does need to do it (it's in the latest patch there).

apugacescu’s picture

StatusFileSize
new9.89 KB

Latest patch does not apply on latest dev version of the module so I re-rolled it.

bwaindwain’s picture

StatusFileSize
new9.89 KB

here's patch #90 for 7.x-2.4

Status: Needs review » Needs work

The last submitted patch, 91: xmlsitemap-1670086-91-rel_alternate.patch, failed testing. View results

The last submitted patch, 89: xmlsitemap-1670086-89-rel_alternate.patch, failed testing. View results

Status: Needs review » Needs work

The last submitted patch, 91: xmlsitemap-1670086-91-rel_alternate.patch, failed testing. View results

vdsh’s picture

In function xmlsitemap_generate_multilingual($link), there is a line generating errors:

if ($x_default != 'none') && in_array($x_default_lang->language, $translation)) {

This is because $translation is never defined for me. It's only defined if the option xmlsitemap_var('multilingual_only_nodes') is set. Can you let me know the purpose of this in_array check? I guess it's to have a x-default only for a multilingual node if we have selected this option. If that is the case, we should check again if the user has checked multilingual_only_nodes, else the error will always be there

papagrande’s picture

Status: Needs work » Needs review
StatusFileSize
new10.05 KB

I can't speak to the issues in #95, but here is a reroll of the patch in #91 against 7.x-2.5. Let's see if it passes tests.

Status: Needs review » Needs work

The last submitted patch, 96: xmlsitemap-1670086-96-rel_alternate.patch, failed testing. View results

crasx’s picture

StatusFileSize
new9.44 KB

Reroll of 91

jyraya’s picture

Status: Needs work » Needs review
StatusFileSize
new13.64 KB
new10.98 KB

Hello,

I have a case where we have a custom language negotiation method based on urls. So, limiting the multilingual attributes to the case where "locale-url" is limited is too strict.

Furthermore, the patch take the hypothesis that taxonomy terms are translated via Entity translation while there are other ways to translate them like one based on the same logic of "content translation" when "Taxonomy translation" module (i18n_taxonomy) is enabled: "Translate. Different terms will be allowed for each language and they can be translated."

I adapted the #98 patch in order to take into account of these elements:

  1. I added a _xmlsitemap_is_url_based_language_negotiation() function allowing to control is an URL based language negotiation method is enabled.
    I tried to make automatic but I realised that affects quite a lot the process performances. So I based the control on a list of the URL based language negotiation methods. It invoke a hook allowing to extend this list: hook_xmlsitemap_url_based_negotiation_alter().
  2. I added also the process to cover the taxonomy terms that would use the "Translate. Different terms will be allowed for each language and they can be translated." translation way.

I also propose a last process, if an element is not based on the "Entity translation" and Taxonomy/Content translation" mechanisms. This ones based on the mechanism of the "Menu translation" module.

I attach the patch and an inter diff based on #98 patch

james.williams’s picture

Good idea on _xmlsitemap_is_url_based_language_negotiation() and the alter hook. I've made custom language negotiation methods before, so that could be very handy. And, yes, the taxonomy translation support ought to be extended, though that's not just specific to this issue around whether to use multiple sitemaps or not, but ought to be in any of the xmlsitemap project's general taxonomy handling. (Yes, the patches for this issue do explicitly touch term translations, but hopefully the point still stands.)

I wonder if both of those are things better covered in a follow-up issue? I'm not sure they need covering in the core of this issue really, and this issue has already been around for long enough.

Don't get me wrong though - great contribution! This issue has just been around for too long already, and this is quite a lot of change to add here. I'd still certainly love to see the ideas/work incorporated into xmlsitemap one way or another though!

jyraya’s picture

I do not think being out of the issue scope.

The #97 treats the node using entity translation or content translation when "multilingual_only_nodes" is checked. If this option is unchecked, it treats suddenly only content translation.

I only ensure that the process is the same in all situations for the nodes ("multilingual_only_nodes" checked or not)

After, when the "multilingual_only_nodes" is unchecked, then other entities than nodes should be treated in this function too; that means the taxonomy terms too.
Like the nodes, the taxonomy terms can use different translation mechanism. If the function treats the 2 node translation mechanisms, then it is logic to apply the same logic on taxonomy terms, no?

Concerning the hook, thank you for the nice words about it and my contribution.

Here, also, I consider that I am not outside the issue scope. The patch #97 introduces the control on the enable language negotiation method but limits it to one.
The hook is there to give the possibility to add other method.
Since 4 years, my predecessors on my project did not stop to apply a patch to the current patch to extend the feature to our specific language negotiation method.
So, to avoid spending again years patching xmlsitemap for extending this control, it would be great to include this in the current issue. And As I said, this control is introduced by the issue's patches. So, it is logic for me to have a way to fix this in the current issue.

The only code part I hesitated to include in the patch was the one based on mechanism of the "Menu translation" module at the end of the function. (In fact, I am wrong saying it is based "Menu translation" because the function I used is in i18n. Sorry for this mistake.)
I thought that could be useless but on the other hand, it will be called on edge cases and then should have an limited impact from the point of view of the issue scope.
That's why I kept it but I understand that it is too much and we could remove it.

For the rest, I reorganised a little bit the code to exit the function as soon as it is possible, nothing more; In other words, the behaviour of the preceding patches is kept.

james.williams’s picture

I'm happy enough with your justification, it's all good change after all (as far as I can see from just reading your code & explanation). :-) But I'm in no place myself to actually test or review further, sorry (and that was the case before your patch too, I should add).

bwaindwain’s picture

StatusFileSize
new9.88 KB

The line $link['original'] = $link['loc']; was dropped in #97/#98. Here's an updated patch with the missing line added for 7.x-2.6.

Sorry, I can't help with #95 or #99.

jyraya’s picture

Hello bwaindwain,

The #99 patch addressed the #95 issue + another for sites that have another language negotiation mechanism based on urls.

Could you tell me what is wrong with it that obliged you to remove everything from it in the #103?

Regards.

bwaindwain’s picture

FYI: I ran patch #99 through our own automated testing and it passes fine. There's nothing wrong with #99. I just feel I don't have time to give or enough knowledge about the issues of #95 and #99 to contribute to the discussion.

pieterdc’s picture

Thanks all for your contributions.

The patch from comment #103 generates alternate links for my website's frontpage, whereas the patch from comment #99 doesn't.

#103 is still missing an hreflang="x-default" link for the frontpage though.
#99 has the same admin settings, and produces the same non-homepage links (in my case), but seems a bit more modularly (chunks split off to functions) coded.

james.williams’s picture

StatusFileSize
new9.91 KB

Here is a re-rolled version of #103. Sorry that misses out what was done in #99 again; that work may still be worthwhile but has added change that goes beyond what has already been reviewed & tested by anyone else here.

james.williams’s picture

StatusFileSize
new1.68 KB
new9.98 KB

I was getting the following notice with patch #107 (and patch #103) when sitemaps were generated:

Notice: Undefined index: none in xmlsitemap_generate_multilingual() (line 679 of /<...>/modules/contrib/xmlsitemap/xmlsitemap.generate.inc).

This fixes that.

jyraya’s picture

Status: Needs review » Needs work

Hello,

I am sorry to say that the fix introduce a limitation on the language negotiation support that does not exist in the current stable version.

Therefore, the final fix must take into account of that in a way or another, or the module description should be modified in order to mention that it only supports the Drupal default language negotiation mechanism.

Otherwise, the fix introduces a kind of regression.

james.williams’s picture

That's a fair argument. Note that the current stable version of the module doesn't add the alternate links at all though, so I don't think it's a regression, just a missing feature for anyone using a custom URL-based negotiation method?

Unless you can demonstrate something that currently works in XML sitemap 2.6 but does not with patch 108?

I don't mean to sideline what you've done at all; I (and no-one else so far) have not made time to evaluate it yet, no doubt because most of us do use the core URL-based negotiation.

james.williams’s picture

Status: Needs work » Needs review
StatusFileSize
new14.14 KB
new13.57 KB

I've spent some time going through patch 99 and decided to incorporate it for my client anyway, so here's a patch that adds what it did, on top of patch 108.

The 'xmlsitemap_multilingual_only_nodes' setting was unnecessarily restricted to nodes, and wasn't fully respected in patch 99 anyway (when disabled, that patch still didn't output alternate links to languages that did not have translations of a node). So I've taken the opportunity to better implement it for entity translation and node/taxonomy translation alongside the patch 99 changes and renamed the setting 'xmlsitemap_multilingual_translated_only' to better reflect its intention, i.e.:

* when disabled, hreflang links are included for all enabled languages, regardless of whether the translation has been made for that language
* when enabled, hreflang links are only included for enabled languages that have had translations created

Links to the frontpage will be included for all enabled languages regardless of that setting.

I haven't included an interdiff between this patch and patch 99 to demonstrate the full extent of the tweaks to it, because it's not trivial to filter out all the other work between patches 98 and 108! @jyraya sorry about that - hopefully you can spot the differences between your original interdiff and this one? They're not perfectly identical because of the change to that setting explained above.

velrat’s picture

Can anyone tell me why this patch can't be installed?

Hunk #1 succeeded at 378 (offset -12 lines).
patching file xmlsitemap.api.php
patching file xmlsitemap.generate.inc
Hunk #1 succeeded at 206 with fuzz 2.
Hunk #2 FAILED at 246.
Hunk #3 succeeded at 591 (offset -11 lines).
1 out of 3 hunks FAILED -- saving rejects to file xmlsitemap.generate.inc.rej
patching file xmlsitemap.module
Hunk #1 succeeded at 306 (offset -13 lines).
patching file xmlsitemap.xmlsitemap.inc
james.williams’s picture

@velrat what version of the module are you trying to apply the patch to? It still applies cleanly to the latest code in the 7.x-2.x (dev) branch. It doesn't apply to a specific release, because patches are made to be applied on top of the latest work in development.

You could try using the latest dev branch with the patch, or manually create your own version of the patch that does apply, taking the parts in the generated '.rej' files that were left behind and carefully applying them yourself. Or find another way...

velrat’s picture

Thank you for your answer.
Indeed, after installing the latest version of the module, I was able to apply the patch without errors.

But, unfortunately, it does not work for me. No rel = alternate links added. Instead, sitemap formatting is completely gone. After applying the patch, I see the following

james.williams’s picture

@velrat Use your browser's option to view the page source to see the actual XML.

velrat’s picture

Please tell me which option. I use Google Chrome. Before applying the patch, the appearance is as follows:

james.williams’s picture

Right-click the page, click 'View Page Source'.

velrat’s picture

Well.
# 111 is installed but does not work correctly. Only rel = alternate hreflang = en is included in the map. Although English is generally excluded for the site and should not be included in the map. As for the two languages used on the site, they are not included in the map.

# 108 includes in the map two language versions of the site that are used. But it also adds English, which is turned off. Is it possible to somehow get rid of the disabled language?

x-default does not work in # 108 and # 111

james.williams’s picture

@velrat Please can you provide a screenshot of what settings you have for the various bits of multilingual configuration that this patch uses? (In the 'Advanced settings' section on /admin/config/search/xmlsitemap/settings ) Some of the things you've mentioned are quite possibly just because your settings are configured that way. (And the defaults don't necessarily match your expectation - which might be fair, but there's probably also a need to remain backwards-compatible.)

velrat’s picture

-