There appears to be a need for an index of all sitemap variants. This is different to a sitemap index that indexes all pages of a single variant (currently implemented). The sitemap protocol supports this idea.
For 3.x, please collaborate in the simple_sitemap_index queue on this functionality.
Let's focus on getting this feature cleanly into 4.x.
Issue fork simple_sitemap-3109090
Comments
Comment #2
michele.lucchina (volunteer) commented:
I created this patch to facilitate the creation of the submodule as proposed in this previous conversation.
Comment #3
gbyte (volunteer, at gbyte) commented:
This module already creates an index for each sitemap instance (variant), dependent on the module settings. Also, we don't adhere to the sitemaps protocol; instead we adhere to Google's hreflang sitemap standard.
I believe what you are trying to achieve is to create an index of all the sitemap variants. I took the liberty of updating the issue description and title. Will take a look at your patch soon.
Does this borrow code from https://gist.github.com/bdlangton/aea9673cc640e2dfc58466f985a3284c ?
Comment #4
gbyte (volunteer, at gbyte) commented:
Looks quite good already. I think we should include it as plugins in the main module (no submodule) and we should change the naming from sitemapindex to variant_index like so:
As soon as this is in, I will take a closer look at the code and documentation and add some final touches.
Thank you for looking into it!
Comment #5
michele.lucchina (volunteer) commented:
Here is the modification of the main module with your naming.
I have integrated a small modification to the forms to avoid inserting content in the variant index since this variant can only contain other sitemaps.
PS: yes I started my work from the bdlangton code
I hope my work is appreciated ;)
Comment #6
gbyte (volunteer, at gbyte) commented:
@michele.lucchina Sorry for not making progress on this; I will make sure to test it thoroughly once I have more time on my hands.
Comment #7
cgmonroe (volunteer) commented:
The current patch will not apply to the latest dev / release. Here is a re-rolled version that will.
Comment #8
cgmonroe (volunteer) commented:
This is an update to the current patch that includes a route for sitemap_index.xml.
The changes to the code are:
Adds a /sitemap_index.xml route.
Modifies the controller to include the config.factory service via injection and a getSitemapIndex(Request) function. The function looks for the variant_index config info. If found it calls the existing getSitemap(Request, variant) code with that key. If no config is found it returns a 404.
This allows the default sitemap setting to point to the main sitemap and the index to be retrieved with the suggested sitemap_index.xml filename.
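The lookup-then-delegate behaviour described above can be illustrated with a small language-neutral sketch. This is not the patch's PHP code: the `config` mapping, the `get_sitemap` callable and the `(status, body)` return shape are hypothetical stand-ins for the config.factory service, the existing getSitemap(Request, variant) method and the HTTP response.

```python
from typing import Callable, Mapping, Optional, Tuple


def get_sitemap_index(
    config: Mapping[str, str],
    get_sitemap: Callable[[str], Optional[str]],
) -> Tuple[int, str]:
    """Return (status, body) for a request to /sitemap_index.xml.

    `config` stands in for the module configuration and `get_sitemap`
    for the existing per-variant sitemap lookup (both hypothetical).
    """
    # Look for the configured index variant; without it there is nothing to serve.
    variant = config.get("variant_index")
    if not variant:
        return 404, ""
    # Delegate to the existing per-variant code path with that key.
    body = get_sitemap(variant)
    if body is None:
        return 404, ""
    return 200, body
```

With this shape, the default sitemap setting can keep pointing at the main sitemap while the index is served from its own route.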
Example, with the Variant Setup:
Available URLs:
/sitemap_index.xml (includes /sitemap.xml and /blog/sitemap.xml )
/sitemap.xml (default variant)
/blog/sitemap.xml (blog variant)
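For illustration, an index covering the two variant URLs above could be generated roughly as follows. This is a minimal Python sketch, not the module's generator: the function name and the `https://example.com` base URL are assumptions, and only the `sitemapindex`, `sitemap` and `loc` elements from the sitemaps.org protocol are used.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap_index(base_url, variant_paths):
    """Build a sitemap index document listing one <sitemap> per variant path."""
    # Register the protocol namespace as the default so tags are unprefixed.
    ET.register_namespace("", SITEMAP_NS)
    root = ET.Element(f"{{{SITEMAP_NS}}}sitemapindex")
    for path in variant_paths:
        sitemap = ET.SubElement(root, f"{{{SITEMAP_NS}}}sitemap")
        loc = ET.SubElement(sitemap, f"{{{SITEMAP_NS}}}loc")
        loc.text = base_url.rstrip("/") + path
    return ET.tostring(root, encoding="unicode")


# Hypothetical site with a default variant and a blog variant.
index_xml = build_sitemap_index(
    "https://example.com", ["/sitemap.xml", "/blog/sitemap.xml"]
)
print(index_xml)
```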
Comment #9
a.milkovsky commented:
#7 works for me. #9 did not generate the sitemap somehow.
Question: how to proceed with multilingual websites in this case?
What if I have example.de and example.ch websites. Should both domains be presented in the "index sitemap"?
Should we use hreflang?
Worth mentioning:
It looks like Google does not require a sitemap index file:
Comment #10
cgmonroe (volunteer) commented:
By #9 you mean the patch in #8? FWIW, I have noticed that both patches sometimes require the sitemap to be initially built twice for it to work, mostly after adding variants or making changes. I think it has to do with the variant not being built before the sitemap index is built. Once the 'pump is primed', I have not seen any problems.
Yes, the sitemap_index is 'optional'... but try explaining that to your SEO consultant / department... and then winning the fight against "but it's best for SEO...". :)
Anyway, module support for creating separate language sitemaps should be a different issue. And doing it at the module level might be a bit tricky due to variants not being totally integrated.
That said, here's what I did to support this:
First, add some code similar to this in a custom module / theme file. Note the static variant to language lookup array in the filter code.
Then set up your language variants like:
[lang code] | default_hreflang | [lang code]
e.g. de | default_hreflang | de
The downside is that for each language you will have to process all your URLs again, so build time is a bit longer.
You might do it a bit faster by building a custom variant for each language, overriding the common SimpleSitemap (a bit complex but doable), and then filtering by language when the entities are loaded to create the links list.
I chose just to have a slightly slower build time using the simple method.
Comment #11
a.milkovsky commented:
Hi @cgmonroe,
yes, sorry, I meant the patch in #8. Thank you for posting your solution.
I am also currently looking for a solution for the index sitemap for multiple languages.
As I understand it, it is not necessary to split the sitemaps into separate languages, and this module tries to avoid this separation.
Have a look at #3033283: Generate per language sitemap:
From Google:
Comment #12
gbyte (volunteer, at gbyte) commented:
Regarding the last comment: I really hope it is clear by now that this module does everything a modern multilingual sitemap needs to be doing. This here issue is academic for people that need this to be working the old multilingual way 'because'. But seriously, if you are one of these people, you will be better off with xmlsitemap which does this instead of hreflang.
Edit: Sorry for the confusion, I am speaking of using variants/sitemap index to mimic the old multilingual way where you had one sitemap per language. This is academic and should be avoided. Having a sitemap index as proposed here is something different and probably a good feature to include.
Comment #13
a.milkovsky commented:
Hey @gbyte, thank you for the answer and your contributions!
Regarding
> This here issue is academic for people that need this to be working the old way 'because'.
I do not completely agree with this statement. In my use case I have separate sitemaps. They are not separated by language, but rather by content type. As a result, it may make sense to have an index sitemap that collects all of the distributed sitemaps. It is not an outdated concept. From the Google documentation:
In my current project I have 15 content types (this is a large media portal with a lot of content) and 2 languages.
This module allows separating sitemaps by content type, which is why I decided to use it instead of the Xmlsitemap module. In addition, this module provides hreflang integration, which also works perfectly.
I have currently generated 15 sitemaps, and I am looking for an option to generate an index sitemap (ideally with hreflangs).
I hope my use case is clear. Looking forward to your feedback!
Comment #14
cgmonroe (volunteer) commented:
Hey @gbyte,
Totally agree, this module is SimpleSitemap for a reason. It does the core job easily with minimal setup.
It is also flexible enough to meet local needs with some fairly simple site-specific coding. This is very important, as SEO people are paid to "improve" site SEO, which means they will always find things to change. I have been through SEO consultant changes where the new consultant suddenly says, "why are we doing that?" ... "because the old consultant said to" ... "no, no, rip it out"...
Love the new variant plugin setup btw. Using it to keep nodes marked with noindex tags from getting into the sitemap. Bottom line is that we have met every SEO change challenge over the last 4+ years with this module. Great track record.
That said, I hope this hasn't distracted from the benefits of adding sitemap_index.xml support to the module.
TIA.
Comment #15
gbyte (volunteer, at gbyte) commented:
Sorry for the confusion, I am speaking of using variants/sitemap index to mimic the old multilingual way where you had one sitemap per language. This is academic and should be avoided. Having a sitemap index as proposed here is something different and probably a good feature to include.
Comment #16
s_leu commented:
Re-rolled the patch against current 8.x-3.x.
Comment #17
donaldinou (volunteer) commented:
Hi,
Thanks everyone for the great job.
I really need this functionality fast, so I've made a module:
https://www.drupal.org/project/simple_sitemap_index
Feel free to contribute or include it as a submodule to simple_sitemap.
Comment #18
gbyte (volunteer, at gbyte) commented:
@donaldinou I installed and tested it, seems to be working fine. I'm sure people will appreciate it until this issue is fixed.
Regarding this issue, can we convert the newest patch to an issue fork so we can collaborate more effectively? This is one of the features I'd like to merge before starting the work on 4.x. Thanks to everyone who contributed!
Comment #21
daniel.bosen commented:
@gbyte I created a fork and MR from the latest patch. This all looks good to me. What is left to be done to get it in?
Comment #22
Oscaner (at CI&T) commented:
Comment #23
fago commented:
How does that work in terms of URL handling?
We have created a custom module that added a sitemap index as variant as well - it should have been posted and communicated here earlier - :-/ https://github.com/drunomics/simple-sitemap-extensions
Anyway, it does not cope with the sitemap requirements of ensuring that sub-sitemaps live within the same "folder" of the sitemap index:
https://developers.google.com/search/docs/advanced/sitemaps/large-sitemaps
So that's something which should be considered when designing a proper solution I suppose.
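To make the "same folder" constraint concrete, here is a minimal Python sketch of a location check, assuming the rule is read as: every referenced sitemap must be on the same host as the index and sit in the index's directory or below it. The function names and example URLs are hypothetical, and this is not code from either module.

```python
import posixpath
from urllib.parse import urlsplit


def index_folder(index_url):
    """Return the directory of the index URL's path, with a trailing slash."""
    folder = posixpath.dirname(urlsplit(index_url).path)
    return folder if folder.endswith("/") else folder + "/"


def is_within_index_folder(index_url, sitemap_url):
    """Check that a sub-sitemap lives on the same host, at or below the index's folder."""
    index, sitemap = urlsplit(index_url), urlsplit(sitemap_url)
    if (index.scheme, index.netloc) != (sitemap.scheme, sitemap.netloc):
        return False
    return sitemap.path.startswith(index_folder(index_url))
```

Under this reading, an index served from the site root covers every variant path on that host, while an index served from a subfolder only covers sitemaps beneath that subfolder.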
Comment #24
gbyte (volunteer, at gbyte) commented:
@fago If the sitemap variant index is set as default sitemap, its URL becomes /sitemap.xml which is in the site root, hence all sitemaps are located in subfolders, fulfilling that requirement.
Things have changed since my last update:
@fago @donaldinou
Can you guys collaborate together on the drupal.org simple_sitemap_index module for 3.x?
Thank you all for your input.
Comment #25
gbyte (volunteer, at gbyte) commented:
Comment #26
gbyte (volunteer, at gbyte) commented:
Comment #27
fago commented:
> @fago If the sitemap variant index is set as default sitemap, its URL becomes /sitemap.xml which is in the site root, hence all sitemaps are located in subfolders fulfilling that requirement.
Yes, that works if you add only one sitemap index, but not if you add multiples. We need a solution that is capable of that and solved it now by adding some more URL processing at https://github.com/drunomics/simple-sitemap-extensions
> Can you guys collaborate together on the drupal.org simple_sitemap_index module for 3.x?
It's too late, we already have two alternative solutions here and it does not make sense for us to re-build our working solution now.
I think we should aim at 4.x now and collaborate on a good solution to get into 4.x
Comment #28
donaldinou (volunteer) commented:
Such a shame.
@gbyte
I am willing to help the community as much as I can and would be happy to collaborate with anyone who wants to improve this module.
Comment #29
Li Qing commented:
Remove the "priority" tag, as it is not valid in a sitemap index.
See: https://www.sitemaps.org/protocol.html#index
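As a rough illustration of that fix: the sitemaps.org index format only defines `loc` and `lastmod` inside each `<sitemap>` entry, so anything else can be filtered out. The Python sketch below does that on an already generated document. The function name is hypothetical, and the actual patch changes the generator rather than post-processing its output.

```python
import xml.etree.ElementTree as ET

NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
# Child elements the protocol defines for a <sitemap> entry in an index.
ALLOWED = {f"{{{NS}}}loc", f"{{{NS}}}lastmod"}


def strip_invalid_index_tags(index_xml):
    """Remove children such as <priority> that are not defined for index entries."""
    ET.register_namespace("", NS)
    root = ET.fromstring(index_xml)
    for sitemap in root.findall(f"{{{NS}}}sitemap"):
        # Iterate over a copy, since elements are removed while looping.
        for child in list(sitemap):
            if child.tag not in ALLOWED:
                sitemap.remove(child)
    return ET.tostring(root, encoding="unicode")
```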
Comment #30
gbyte (volunteer, at gbyte) commented:
Comment #33
gbyte (volunteer, at gbyte) commented:
I have started implementing this in 4.x.
Questions to all you SEO gurus:
/sitemap.xml?
/index/sitemap.xml?
Obviously the user would be able to delete the index entity or set it as default to have it available under /sitemap.xml.
Please speak up now - that includes my Thunder comrades who apparently are still on 3.x. :)
Comment #34
gbyte (volunteer, at gbyte) commented:
That's your queue guys
Comment #35
marcoka commented:
Not sure if I understand it correctly, but I used it this way.
I have the domain https://www.kopfhoerer-berater.de/sitemap.xml
And on that page I have the separate sitemaps for content types.
On a small site that may not be necessary by default.
Comment #36
gbyte (volunteer, at gbyte) commented:
These questions become moot once I implement #3269333: Add ability to disable sitemap variants.
Comment #37
chr.fritsch commented:
@gbyte Thanks for working on this. Awesome. Let me know if you need a review or any help.
Comment #38
gbyte (volunteer, at gbyte) commented:
@chr.fritsch Thanks and yes, I wouldn't mind one of you guys adding XSL to the sitemap index generator.
Use \Drupal\simple_sitemap\Plugin\simple_sitemap\SitemapGenerator\SitemapIndexGenerator::getXslContent analogous to \Drupal\simple_sitemap\Plugin\simple_sitemap\SitemapGenerator\DefaultSitemapGenerator::getXslContent.
Alternatively, remove the method \Drupal\simple_sitemap\Plugin\simple_sitemap\SitemapGenerator\SitemapIndexGenerator::getXslContent and alter xsl/simple_sitemap.xsl to accommodate the sitemap index.
Feel free to introduce any other changes concerning the index functionality - will happily review. Thanks in advance.
Comment #39
gbyte (volunteer, at gbyte) commented:
Comment #40
chr.fritsch commented:
I am not an expert on XML/XSL stuff. What should the XSL look like?
Can I do something similar to WordPress? https://developer.wordpress.org/reference/classes/wp_sitemaps_stylesheet...
Comment #41
gbyte (volunteer, at gbyte) commented:
I can't prioritize learning the structure ATM, hence I'm asking for support. Not sure about your question though; XSL is already integrated for the default sitemaps, and right now it's only about adjusting it to fit the index of sitemaps. See my comment.
Anyone feel free to grab this.
Comment #42
chr.fritsch commented:
I fixed the sitemap index XML. And now the XSL is applied correctly as far as I can see.
Comment #43
gbyte (volunteer, at gbyte) commented:
Is the index of unrelated sitemaps on the site (new functionality) the same as splitting up one sitemap into chunks in terms of sitemap structure (old functionality)? If so, how did we miss this?
In this case, SitemapIndexGenerator::getChunkContent should use SitemapGeneratorBase::$indexAttributes instead of calling DefaultSitemapGenerator::addSitemapAttributes() I believe, similar to what SitemapGeneratorBase::getIndexContent does.
Comment #44
marcoka commented:
Installed the latest dev version today for testing. What I would expect is that http://dev9.test.de/sitemap.xml would list an index that lists all the subindexes I created. In my case:
-One for Contenttype Article
-One for Contenttype Site
-One for Contenttype Product
What I get is that http://dev9.test.de/sitemap.xml lists only the first entry, in my case the content of "http://dev9.test.de/artikel/sitemap.xml".
Comment #45
gbyte (volunteer, at gbyte) commented:
@marcoka
/sitemap.xml lists your default sitemap, as set in your settings - this is expected behavior.
If you need an index of all sitemaps, be patient or use the above branch/patch. It's more or less finished, just needs some love.
Comment #47
gbyte (volunteer, at gbyte) commented:
Thanks for your input; that's in dev now and will (hopefully) be released mid June alongside D10 support and a few other niceties.