One of the things I ran into while trying to make a 'News sitemap' sub-module using the new context system was that there will be a conflict as several different modules try to provide different sitemap types that cannot be combined. We'll need an elegant way to handle a global 'type' context (for XML sitemap types: normal, news, video, mobile, etc.).
I'm also running into the problem that the XML delivery callback needs to accept a contexts argument as well, so that a new menu item like 'sitemap-news.xml' would automatically have the 'news' type context set to TRUE.
Comments
Comment #1
Dave Reid
This is a blocker for #451234: Support for extended sitemap formats (e.g. video, news, image, mobile)
Comment #2
Dave Reid
Comment #3
Dave Reid
This context should probably be provided by the base module itself. Other modules would then only need to hook into the sitemap edit form alter to provide the additional type values.
Comment #4
Dave Reid
Another thing to consider is being able to provide a partial context array to the sitemap page delivery callback. So a xmlsitemap_news.module could add a new menu router item:
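The partial-context idea can be pictured roughly as follows. This is only a sketch in Python, not Drupal code: the `ROUTES` table, `resolve_context`, `deliver_sitemap`, and the context keys are all hypothetical names chosen for illustration, and the merge order (route values override detected ones) is an assumption.

```python
# Hypothetical sketch: a route carries a partial context that is merged
# over the contexts detected for the request before a sitemap is looked up.

# Each path maps to the partial context it forces (assumed names).
ROUTES = {
    "sitemap.xml": {},
    "sitemap-news.xml": {"type": "news"},
}


def resolve_context(path, detected):
    """Merge the route's partial context over the detected contexts."""
    context = {"type": "normal"}   # default sitemap type
    context.update(detected)       # e.g. language or domain from the request
    context.update(ROUTES[path])   # route-supplied values win (assumption)
    return context


def deliver_sitemap(path, detected, sitemaps):
    """Return the stored sitemap whose context matches exactly, else None."""
    wanted = resolve_context(path, detected)
    for sitemap in sitemaps:
        if sitemap["context"] == wanted:
            return sitemap
    return None
```

With this shape, 'sitemap-news.xml' only has to declare `{"type": "news"}`; the remaining context values still come from the request.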
Comment #5
Dave Reid
Comment #6
willmoy
Sub but will try to be helpful in Oct if still hanging around
Comment #7
willmoy
I hope this is helpful...
There are seven types of sitemap:
The point being that they have to integrate in very different ways.
Contexts, as implemented in i18n and domain_access, are simply extra constraints on the SELECT query on xmlsitemap. It is not obvious how that can be extended to handle these cases.
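To make "extra constraints on the SELECT query" concrete, here is a minimal sketch in Python with SQLite rather than Drupal's database layer. The column names (`language`, `domain`) and the helper names are assumptions for illustration, not the module's actual schema or API.

```python
import sqlite3


def build_links_query(context):
    """Build a SELECT on the links table, adding one condition per context key."""
    sql = "SELECT loc FROM xmlsitemap WHERE status = 1"
    params = []
    for column in ("language", "domain"):  # hypothetical context columns
        if column in context:
            sql += f" AND {column} = ?"
            params.append(context[column])
    sql += " ORDER BY loc"
    return sql, params


def demo_context_query():
    """Run the query for an English-language context against sample rows."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE xmlsitemap (loc TEXT, status INTEGER, language TEXT, domain TEXT)"
    )
    db.executemany(
        "INSERT INTO xmlsitemap VALUES (?, ?, ?, ?)",
        [
            ("node/1", 1, "en", "example.com"),
            ("node/2", 1, "fr", "example.com"),
            ("node/3", 0, "en", "example.com"),
        ],
    )
    sql, params = build_links_query({"language": "en"})
    return [row[0] for row in db.execute(sql, params)]
```

A context here can only narrow which rows are selected; it has no way to attach per-link metadata, which is the limitation in question.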
The blunt selection of entities in the module at the moment will do for a sitemap that we presume ought to have just about everything in it, but it isn't really flexible enough for news, images etc.
So I think the thing to do is:
Imagine a site with news content containing significant images. You want perhaps three sitemaps: a general one, a news one, and a mobile one.
The general one happens more or less as now. When articles are saved, a rule checks if they contain relevant image/video content and if so adds the appropriate sitemap status and metadata. When the general sitemap is being generated, xmlsitemap_specialist notices (by testing $sitemap in query_alter) and JOINs and extends with hook_sitemap_element_alter where applicable.
The news one is its own sitemap, and its own file. Its cron kicks off on every run and it uses a fairly complex set of criteria to decide what goes in: news articles, plus blog posts with certain tags, which count as news for this site.
The mobile one is like the general one. It has blanket rules, so it includes all content, and a blanket setting that all mobile content is in, for argument's sake, WAP. It does a straight query off xmlsitemap and spits it out, no JOIN.
As I say, massively overkill for news but will work for the others too.
I should also say there may be better ways but it looked like an effort was in order, so here's mine.
Comment #8
digi24
Wouldn't it be better to have a separate table for each sitemap type? Each table could store its relevant metadata, which keeps performance up, and joining on numerical ids is not a significant overhead for the db.
Comment #9
willmoy
Two reasons I didn't think so:
(1) I think this would be easier to integrate with the existing codebase, because it would just need a small change to the function that does the query on the _links table
(2) Several sitemap formats don't need any per-link metadata (mobile and news come to mind)
Comment #10
digi24
News does need additional data (title, keywords, type access, for example), and so do images in sitemaps. If you do not store this data in a separate table, you will have to retrieve it for each sitemap generation. But then you could just use views instead of the xmlsitemap module to generate them on the fly. Please correct me if I am wrong.
Comment #11
willmoy
Sorry, haven't been able to look at this before now. I think I reckoned that for many use cases, the extra properties of news sitemaps would be the same for every article, so you could save the overhead by just specifying them once in the sitemap config, but that doesn't deal with the 'strongly recommended' title property.
http://www.google.com/support/news_pub/bin/answer.py?answer=116037
And you're definitely right about image.
http://www.google.com/support/webmasters/bin/answer.py?hl=en&answer=178636
(Perhaps the smoothest way to handle image sitemaps would be to parse nodes for img tags, and pick up src, title and alt attributes for loc, title and caption respectively?)
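A rough sketch of that parsing idea, in Python using the standard library's `html.parser` rather than anything Drupal-specific. The class and function names are hypothetical; the mapping of src, title and alt to loc, title and caption follows the suggestion above.

```python
from html.parser import HTMLParser


class ImageCollector(HTMLParser):
    """Collect sitemap-style image entries from img tags in rendered HTML."""

    def __init__(self):
        super().__init__()
        self.images = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        src = attributes.get("src")
        if not src:
            return  # an image without a src has no location to list
        self.images.append(
            {
                "loc": src,
                "title": attributes.get("title"),
                "caption": attributes.get("alt"),
            }
        )


def extract_images(html):
    """Return one dict per img tag that has a src attribute."""
    collector = ImageCollector()
    collector.feed(html)
    collector.close()
    return collector.images
```

Relative src values would still need to be turned into absolute URLs before they go into a sitemap; that step is left out here.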
The only one that would work is mobile, I think, and even then not for some use cases.
So yeah, separate tables joined on ID and type.
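As a sketch of what "separate tables joined on ID and type" might look like, again in Python with SQLite and not the module's real schema: the `xmlsitemap_news` table, its columns, and the function names are assumptions for illustration.

```python
import sqlite3


def news_sitemap_rows(db):
    """Join the shared links table to a per-format metadata table on (type, id)."""
    return db.execute(
        """
        SELECT l.loc, n.title, n.keywords
        FROM xmlsitemap AS l
        JOIN xmlsitemap_news AS n ON n.type = l.type AND n.id = l.id
        WHERE l.status = 1
        ORDER BY l.id
        """
    ).fetchall()


def demo_news_join():
    """Build sample tables and return the rows a news sitemap would contain."""
    db = sqlite3.connect(":memory:")
    db.executescript(
        """
        CREATE TABLE xmlsitemap (
            type TEXT, id INTEGER, loc TEXT, status INTEGER,
            PRIMARY KEY (type, id)
        );
        -- Hypothetical per-format table holding only news metadata.
        CREATE TABLE xmlsitemap_news (
            type TEXT, id INTEGER, title TEXT, keywords TEXT,
            PRIMARY KEY (type, id)
        );
        INSERT INTO xmlsitemap VALUES
            ('node', 1, 'node/1', 1),
            ('node', 2, 'node/2', 1),
            ('user', 1, 'user/1', 1);
        INSERT INTO xmlsitemap_news VALUES
            ('node', 1, 'Budget vote', 'politics, budget');
        """
    )
    return news_sitemap_rows(db)
```

Only links that have a row in the news table end up in the news sitemap, and formats that need no per-link metadata can skip the join entirely.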
Comment #12
Rhino
Subscribing.
Comment #13
BenK
Subscribing
Comment #14
70111m
Subscribing
Comment #15
bensnyder
Sub
Comment #16
anavarre
Subscribe
Comment #17
lemming
I think I have been posting:
http://drupal.org/node/451234#comment-6083094
to the wrong thread maybe? It might be more relevant here?
Comment #18
DamienMcKenna