Hi,
My Drupal setup is as follows:
www.domain.com --> For English
www.domain.de --> For German
XMLSiteMap base URL is configured as http://www.domain.com
English and German languages are enabled in XMLSiteMap settings page.
I am using the xmlsitemap_menu module to generate the sitemap.
The problems I have run into are:
1. http://www.domain.com/sitemap.xml contains all English and German menus. (It should only contain English menus.)
2. http://www.domain.de/sitemap.xml cannot be accessed by an anonymous user, who receives the following error:
The XML page cannot be displayed
Cannot view XML input using XSL style sheet. Please correct the error and then click the Refresh button, or try again later.
--------------------------------------------------------------------------------
Access is denied.
3. http://www.domain.de/sitemap.xml can be accessed by the root user, but the file format is not correct.
4. http://www.domain.de/sitemap.xml can be accessed by the root user, but again it contains all English and German menus. (It should only contain German menus.)
Comments
Comment #1
Dave Reid
Hmm... I don't think I have any kind of multilingual restriction on the menus yet. What module are you using to use certain menus for each language?
Comment #2
vendeka
I am using the Drupal core Menu module with the Internationalization module.
By the way, I also tried the xmlsitemap_node module, but it has the same issues.
Comment #3
Dave Reid
That's odd. When I was trying out multilingual stuff, my Drupal install didn't hide any content that wasn't in my 'current' language. Maybe it happens when domain negotiation is enabled, which I couldn't test. Marking back as needs more info.
Comment #4
vendeka
Here is more detailed information about my setup:
* admin/settings/language/i18n
Content selection mode: Only current language.
* admin/settings/language/configure
Language negotiation: Domain name only
When creating a new node, I specify the language of content. (I do not set language setting of node as "Language Neutral")
Comment #5
Dave Reid
Aha... I didn't have the contrib Internationalization module enabled. That would be the important point. :) I'll take a look into this again.
Comment #6
apaderno
Comment #7
hass
Subscribe. I also reported this for 6.x-1.x, but the cases were closed without a fix by Kiam. There are also a few more i18n issues with path-based detection, if I remember correctly. The main source is url(), which cannot be used; xmlsitemap needs to implement its own function that behaves differently.
Comment #8
Dave Reid
Please expand on why url() cannot be used.
Comment #9
hass
It returns a URL for a language even though you do not have content in that language. This happens if node/7 (http://example.com/foo) has an English version but is not yet translated to German, and you request a URL on http://example.de/foo. Then url() returns the German path, although the content is English, and we do not want English content on a German site in the search engines. If you build the XML sitemap, you will add the German URL to the sitemap, but there is only English content (wrong). This is only one reason; there are 3-4 variants I cannot remember now. I need to search the issue queue :-( and try it out again.
Comment #10
hass
I may have found what I wrote a very long time ago:
http://drupal.org/node/157533#comment-797857
http://drupal.org/node/157533#comment-800788 ***
With path-based detection, the default sitemap is also added as /de/sitemap.xml (if the default site language is DE), if I remember correctly. But it needs to be /sitemap.xml and contain all nodes in all languages of the site, not only German. I haven't tested this for a long time, so this may have been solved.
Comment #11
Dave Reid
Ok, so I had been planning on including the language of the node in the {xmlsitemap}.language column. When a chunk of a sitemap is generated for a specific language (let's say German), it would include in the SQL:
WHERE language IN ('', 'de')
which selects only language-neutral or German nodes. Anything wrong with that approach?
Comment #12
hass
I cannot say for sure, but it sounds good. There are so many variants, and I haven't taken a look at the 2.x code. Plus, it is too long ago that I spent days on this functionality and how it should work, sorry. The only thing I can remember is that url() or drupal_lookup_path() often returns something that is OK for a user surfing a site, but incorrect for the sitemap.
I hope to find some time to do a detailed test again, but it would make more sense if you know that such issues cannot occur and I'm doing a review afterwards to figure out if the behaviour is correct with all language detection modes...
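
To make the filter proposed in comment #11 concrete, here is a minimal sketch in Python with an in-memory SQLite database. The two-column table is a simplified stand-in for the real {xmlsitemap} schema, and the function name and sample rows are hypothetical; it only illustrates how `language IN ('', 'de')` selects language-neutral and German links.

```python
import sqlite3


def links_for_language(conn, langcode):
    """Return links that are language-neutral ('') or match langcode."""
    rows = conn.execute(
        "SELECT loc FROM xmlsitemap WHERE language IN ('', ?) ORDER BY loc",
        (langcode,),
    )
    return [loc for (loc,) in rows]


# Simplified stand-in for the {xmlsitemap} table (assumed columns).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE xmlsitemap (loc TEXT, language TEXT)")
conn.executemany(
    "INSERT INTO xmlsitemap (loc, language) VALUES (?, ?)",
    [("node/1", "en"), ("node/2", "de"), ("node/3", "")],
)

german_links = links_for_language(conn, "de")
english_links = links_for_language(conn, "en")
```

With these sample rows, the German sitemap gets node/2 and node/3, and the English sitemap gets node/1 and node/3.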
Comment #13
vendeka
It seems the best approach to me, but it shouldn't be limited to nodes only. Menu items should get the same treatment.
Comment #14
apaderno
Drupal core code assigns a language to nodes, but not to menus; maybe there is a third-party module that assigns a language to the menu being shown, but Drupal core code doesn't do that.
Comment #15
Anonymous (not verified)
Menus use t() internally unless a 'title callback' is given. See http://api.drupal.org/api/function/hook_menu/6. The description always uses t().
Comment #16
Anonymous (not verified)
Are we using menu_link_load?
Comment #17
apaderno
Using t() for a string doesn't mean that a language is associated with menus. The table used to save the menu data has no language field, and when you create a menu, you are not asked for a language; if there were such a possibility, one could have a menu that appears only when the current language is a specific one.
Comment #18
Dave Reid
Yes, we are using menu_link_load in 6.x-2.x.
Comment #19
vendeka
I am not talking about menus but menu items. Menu items have a language option when you create them. (I think this is also an i18n feature, but I am not sure.)
menu_links.options field contains information about language for that menu-item:
'a:2:{s:10:"attributes";a:1:{s:5:"title";s:16:"Website Feedback";}s:8:"langcode";s:2:"en";}'
For instance, I have a menu with 6 menu-items. 3 of these items are specified as English and 3 of them are specified as German. They are only shown when the specific language called (in my case, it means specific domain)
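
As an illustration only, the langcode could be pulled out of a serialized options value like the one quoted above. This is a Python sketch with a hypothetical helper name; inside Drupal the value would normally be decoded with PHP's unserialize() rather than matched with a regular expression, and treating a missing langcode as language-neutral is an assumption.

```python
import re

# Matches the PHP-serialized pair: s:8:"langcode";s:<len>:"<value>";
LANGCODE_RE = re.compile(r's:8:"langcode";s:\d+:"([^"]*)";')


def menu_item_langcode(serialized_options):
    """Return the langcode stored in a serialized options string.

    Returns '' when no langcode is present (assumed to mean
    language-neutral in this sketch).
    """
    match = LANGCODE_RE.search(serialized_options)
    return match.group(1) if match else ""


options = (
    'a:2:{s:10:"attributes";a:1:{s:5:"title";s:16:"Website Feedback";}'
    's:8:"langcode";s:2:"en";}'
)
langcode = menu_item_langcode(options)
```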
Comment #20
Anonymous (not verified)
Yikes, that eliminates the need for t(). If the translation is done at the menu editing level, then there is no need to use t() at all. I suspect that the i18n module is altering the menu links with a hook. This would mean that we have to handle i18n-enabled sites differently.
Comment #21
apaderno
That is not a feature that is present in a plain Drupal core installation; the feature you are talking about must be implemented by i18n.
Still, in the Drupal core table, there isn't a field for the language associated with a menu item.
Comment #22
vendeka
I am aware that this feature is not a part of Drupal core.
You mean multilingual menu support for xmlsitemap should be implemented by i18n? I didn't get exactly what you mean.
I think earnie clarified that it is a hook used by i18n module.
Comment #23
Dave Reid
Tagging as alpha blocker.
Comment #24
eMPee584
I just hit this issue as well. I wanted to 'quickly' implement this, but after pondering it I quickly found out that a) the current implementation's (2.x) db schema and API have to be thoughtfully modified and b) I have different priorities than getting my sitemap localized...
I think the best way to handle this is to add a $language parameter to hook_xmlsitemap_links and a corresponding column to the xmlsitemap table. For example, the xmlsitemap_menu module would then query for items in the enabled menus, check the language of each item and, in case it's not the wanted one, check for a translated version of the node. If that's not available but the language is not set either: keep it, else throw it out.
By the way, there's what I believe is a bug: in xmlsitemap_menu_xmlsitemap_links(), the $menus array's *values* are used to fetch the relevant entries from the menu_links table, but the *key names* are actually the machine-readable names and the values are the titles.
Comment #25
Dave Reid
Adding the db schema and a couple of lines to each hook_xmlsitemap_links() is not a big issue. It just hasn't been as high a priority as other things.
I'm not sure what you mean by the $menus values vs keys. xmlsitemap_menu_xmlsitemap_links() uses xmlsitemap_menu_get_menus(), which runs
$menus = array_keys(menu_get_menus());
so only the machine-readable menu names are used.
Comment #26
eMPee584
Well, you're right Dave, I'm a fool *g
Line 103 contains
$menus = menu_get_menus();
and that confused me, sorry. (Of course the module wouldn't even work the way it does without this being correct.)
Comment #27
Dave Reid
I just added *basic* support for multilingual node sitemap selection by adding a new hook_xmlsitemap_query_alter() and implementing i18n_xmlsitemap_query_alter() inside xmlsitemap.module on behalf of i18n.module.
Comment #28
Dave Reid
I'll keep slowly working on multilingual menu items and taxonomy terms, but I think this has moved from a bug report to a feature request now that we've solved the node problem. Also moving this back to beta blocker for the remaining items in this issue.
Comment #29
Dave Reid
I finished the implementation of i18n_xmlsitemap_query_alter(), and I'm pretty sure I got the support for multilingual menu items and taxonomy terms working as well.
http://drupal.org/cvs?commit=253976
http://drupal.org/cvs?commit=253978
http://drupal.org/cvs?commit=253980
I'm going to consider this fixed for now. Just will need some testing from all you i18n users out there.
Comment #30
Dave Reid
FYI, I've decided to move the i18n.module integration into a separate sub-module, xmlsitemap_i18n, so the base xmlsitemap module can stay as trim as possible.
Comment #31
vendeka
I just did a clean install of the latest dev build with the xmlsitemap and xmlsitemap_node modules, but unfortunately the generated sitemaps (http://www.domain.com/sitemap.xml and http://www.domain.de/sitemap.xml) contain nodes in all languages. I checked the xmlsitemap language field for nodes in the database and saw that the node languages are OK, but the generated sitemap.xml files still contain all languages.
P.S. Haven't changed the status maybe latest dev build is not up2date.
Comment #32
Dave Reid
Yes, I just made the changes, and the development build only regenerates automatically every 12 hours.
Comment #33
manfer
I tested the latest code in CVS after this change on a test site with i18n enabled, some nodes translated into both languages, and XML sitemap internationalization enabled:
In both cases I've been able to access any language sitemap as authenticated and as anonymous.
But I don't know exactly what objective you want to reach and how this affects all cases.
I suppose for a multilingual site managed by domain name it is great to have a different sitemap for each language with only its corresponding nodes, menus, taxonomies, and so on, and then to submit each sitemap to search engines as the sitemap for that language's domain.
But is it the same situation for multilingual sites managed by prefix (http://www.example.com/, http://www.example.com/es, http://www.example.com/de), where the domain is the same for all languages (http://www.example.com), or for a multilingual site using only the user's language preference? How does that affect submission to search engines? I can't verify the site http://www.example.com/es on search engines. Will the different sitemaps for every language be submitted to search engines for the domain http://www.example.com?
Comment #34
Dave Reid
Heh, so there were some major bugs, mainly the new query alter hook never being called. I'm tagging an unstable3 that actually works with multilingual support. I even wrote tests to make sure, and that's how I found this wasn't actually working. :)
Comment #35
manfer
Depending on the option chosen as the content selection mode for the multilingual system, every language sitemap shows only the nodes in that language, only the nodes in that language plus language-neutral nodes, and so on. Now it works fine.
Tested with language negotiation by prefix and by domain.
How is submission to search engines managed? This is something I can't test, and I would like to know how it would be done for a multilingual site with language negotiation by prefix.
Comment #36
Dave Reid
@manfer Yay! Thanks very much for testing it!
To answer your question, the xmlsitemap_engines.module will ping the search engines with all the selected-language sitemaps on admin/settings/xmlsitemap. So if you have English (default lang) and French sitemaps enabled and you have just the Google engine selected, when the sitemap is updated, your site will ping Google with:
http://www.google.com/webmasters/tools/ping?sitemap=http://example.com/sitemap.xml
http://www.google.com/webmasters/tools/ping?sitemap=http://example.com/fr/sitemap.xml
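
The two ping URLs above can be reproduced with a small sketch, written in Python for illustration rather than taken from xmlsitemap_engines.module. The function names are hypothetical, and it assumes path-prefix negotiation where the default language has no prefix. The sitemap URL is appended without percent-encoding so that the output matches the URLs as written in this thread.

```python
GOOGLE_PING = "http://www.google.com/webmasters/tools/ping?sitemap="


def sitemap_url(base_url, langcode, default_langcode):
    """Build a per-language sitemap URL (the default language has no prefix)."""
    if langcode == default_langcode:
        return f"{base_url}/sitemap.xml"
    return f"{base_url}/{langcode}/sitemap.xml"


def ping_urls(base_url, langcodes, default_langcode):
    """Return one Google ping URL per enabled language sitemap."""
    return [
        GOOGLE_PING + sitemap_url(base_url, langcode, default_langcode)
        for langcode in langcodes
    ]


urls = ping_urls("http://example.com", ["en", "fr"], "en")
```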
Comment #37
hass
Dave, this sounds wrong. The module should only notify Google about http://www.google.com/webmasters/tools/ping?sitemap=http://example.com/s... and no other file. The file http://www.google.com/webmasters/tools/ping?sitemap=http://example.com/f... should never be accessible. The nodes with /fr/* should be included in the main file http://example.com/sitemap.xml
Comment #38
Dave Reid
@hass: If people are using the i18n.module with *content selection settings*, then nodes with /fr/* would not and should not be included in the English sitemap. That matches what happens on the actual Drupal site. If visiting the site in /fr/* mode, you don't see English content.
There is also no harm in submitting all the language sitemaps. How would Google know about the French sitemap if we only submitted the English sitemap and we don't have robotstxt.module enabled?
Comment #39
manfer
With language negotiation by domain, it looks fine to have different sitemaps and to submit the corresponding sitemap to the specific domain for each language.
But I have still many doubts on the correct way for a drupal multilingual site with language negotiation by prefix.
My knowledge is not enough to discuss this. It would be nice if people having the knowledge can clarify.
By now there are two totally opposite opinions. :(
One of my doubts is:
Would really google accept:
http://www.google.com/webmasters/tools/ping?sitemap=http://example.com/fr/sitemap.xml
and ones like that?
Comment #40
manfer
Another issue (in case submitting that kind of sitemap with a language prefix is accepted) would be with the content selection mode option for the multilingual site. If it is not set to "Only current language", you'll end up with a lot of duplicated nodes in the submitted sitemaps. I'll explain better with examples:
Language neutral nodes will appear on sitemaps for every language and you finish submitting those nodes a lot of times to search engines.
Again you finish with submitting neutral language nodes a lot of times and nodes with no translations would be submitted a lot of times too.
I'm not sure but I think with language by prefix only a sitemap is needed with all nodes on it and just that sitemap submitted to search engines. Search engines would have no problem to identify language for each node.
English nodes:
Spanish nodes:
French nodes:
and so on.
Comment #41
Dave Reid
What this integration is all about is mirroring exactly what content is on the Drupal site when you view it in each language, because each language is basically its own individual Drupal site. If you view http://example.com/fr and it includes language-neutral content, it makes sense that a sitemap at http://example.com/fr/sitemap.xml includes links to the same content as well. This is still a basic but complete step of integration and could probably be improved. But right now the sitemap content matches exactly what is controlled by i18n.module.
Comment #42
Dave Reid
If you want all the links in one sitemap, only enable the default language sitemap and don't enable the xmlsitemap_i18n.module. Easy as that.
What I could do is not allow people to select multiple language sitemaps unless the xmlsitemap_i18n.module is enabled. That seems to make the most sense.
Comment #43
Dave Reid
Ok, I've moved all multilingual XML sitemap features to xmlsitemap_i18n.module. So by default (without this sub-module enabled) you can only have one sitemap, and all your content is in that one sitemap.
EDIT:
If the user has the xmlsitemap_i18n.module enabled, they will see a message on the "Generate sitemaps for the following languages" option: "Each language's sitemap will respect the multilingual content selection mode."
I think this is a much better approach now. Thanks for your help in reaching this manfer. :)
FYI you could have a sitemap in any 'location' of your site as per the sitemaps.org protocol. It could be at:
http://example.com/sitemap.xml
http://example.com/funky_folder/sitemap.xml
http://example.com/funky_folder/strange_folder/what_is_going_on/sitemap.xml
The only restriction is that all the links inside a sitemap should/need to reside in the same location:
If people are using different domains for their Drupal language sites, they should be using i18n (and xmlsitemap_i18n) to help control what is in each site and avoid being dinged by Google for having duplicate content.
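
The location rule from the sitemaps.org protocol mentioned above can be sketched as a simple check, shown here in Python with a hypothetical function name. It assumes the usual reading of the protocol: a sitemap may only list URLs on the same scheme and host, at or below the directory that contains the sitemap file.

```python
import posixpath
from urllib.parse import urlsplit


def in_sitemap_scope(sitemap_url, link_url):
    """Return True if link_url may be listed in the sitemap at sitemap_url."""
    sitemap = urlsplit(sitemap_url)
    link = urlsplit(link_url)
    # Scheme and host must match exactly.
    if (sitemap.scheme, sitemap.netloc) != (link.scheme, link.netloc):
        return False
    # Directory that contains the sitemap file, with a trailing slash.
    directory = posixpath.dirname(sitemap.path)
    if not directory.endswith("/"):
        directory += "/"
    # Treat a bare host such as http://example.com as the root path.
    link_path = link.path or "/"
    return link_path.startswith(directory)
```

Under this check, a sitemap at http://example.com/funky_folder/sitemap.xml could list http://example.com/funky_folder/page but not http://example.com/node/1, and a sitemap on example.com could not list links on example.de.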
Comment #44
manfer
Yes, I tested just now and, as you say, with xmlsitemap_i18n.module disabled you get all nodes in the sitemap. And yes, it then does not make much sense to select different languages on the xmlsitemap settings form, something you have just addressed while I wrote this. :)
It is good to know if for some reason this is the case I need but still don't know.
It is not just that I want everything in one sitemap. If everything is correct with more than one sitemap, that is fine for me. But I have those doubts about submitting sitemaps with the prefix and about possibly submitting the same content more than once. If both things are fine for search engines, I would probably prefer the solution with more than one sitemap, as that could be useful for other use cases and not only search engine submissions.
Comment #45
vendeka
Hi again,
First of all, you made a great progress Dave and I want to thank you for your efforts.
I decided to test initially xmlsitemap_menu and xmlsitemap_i18n module with my multilingual setup. Below, you can see the issues I faced:
Info: Someone may ask why I add the same menu item to different menus. The answer is quite simple: to increase the accessibility of the "Contact Us" link :)
Info: I set the homepage for each language at admin/settings/site-information by setting the "Default front page" variable. However, this "Default front page" variable is not a multilingual variable by default, so I added the setting below to my settings.php file:
There was a patch for it, but I think it is still not included in a release; it should be in the near future.
Comment #46
Dave Reid
@vendeka:
- Issue for duplicate links: #454442: Disable duplicate links during regeneration. It's not a huge priority since the sitemap still works.
- The front page uses url() with the proper $language object parameter to generate the link of the frontpage. So I'm not sure what else we can do there.
Comment #47
vendeka
For the front page issue, I checked with xmlsitemap-6.x-2.0-unstable2 and it works as expected.
Comment #49
vendeka
Hi,
I checked the problem with the front page url() and found the exact cause. The front page URL is taken from the URL that cron was run on and is used for all language sitemaps. To illustrate:
If you run cron through www.domain.com (domain name used for English content only), all generated sitemaps include frontpage URL as www.domain.com (domain.de sitemap should contain front page URL as domain.de)
If you run cron through www.domain.de (domain name used for German content only), all generated sitemaps include frontpage URL as www.domain.de (domain.com sitemap should contain front page URL as domain.com)
To solve this issue, xmlsitemap i18n module should gather front page url from Language domain setting under admin/settings/language/edit/.
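
A rough sketch of that suggestion, in Python for illustration: pick the front page host from a per-language domain mapping instead of from the host cron happened to run on. The mapping and function name are hypothetical stand-ins for the language domain settings and are not the module's code.

```python
# Hypothetical stand-in for the per-language "Language domain" settings.
LANGUAGE_DOMAINS = {"en": "www.domain.com", "de": "www.domain.de"}


def front_page_url(langcode, request_host, language_domains):
    """Return the front page URL for a language sitemap.

    Prefers the domain configured for the language and falls back to the
    host of the current request (for example, the host cron ran on).
    """
    host = language_domains.get(langcode) or request_host
    return f"http://{host}/"


# Cron runs on the English domain, but the German sitemap still gets domain.de.
german_front = front_page_url("de", "www.domain.com", LANGUAGE_DOMAINS)
```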
Comment #50
Dave Reid
@vendeka: I can confirm the same problem. I hadn't tried out the domain negotiation before. I don't understand why it isn't working, however.
Comment #51
Dave Reid
Ok, I think the latest commits should be good:
http://drupal.org/cvs?commit=261478
http://drupal.org/cvs?commit=261508
Comment #52
vendeka
I just updated xmlsitemap and I confirm that the latest commits fix the issue.