Problem/Motivation

On large sites, cron runs into memory and performance problems.

xmlsitemap_get_path_alias() is problematic because it loads the whole URL alias table, which does not scale on large sites.

This patch refactors xmlsitemap_generate_chunk(), the function that calls xmlsitemap_get_path_alias(). It overrides the query that collects aliases.
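
As a rough illustration of the idea (not the code from the patch), aliases could be resolved once per chunk of links instead of by loading the full table. The names example_apply_aliases() and example_lookup() are hypothetical, and the lookup is a plain-array stand-in; in Drupal 7 it could presumably wrap a db_select() on the url_alias table with an IN condition on the source column. Language handling is left out for brevity.

```php
<?php
/**
 * Hypothetical helper: replace the 'loc' of each link in one chunk with its
 * alias, using a single lookup for all source paths of that chunk.
 *
 * $lookup is any callable that takes an array of source paths and returns
 * array(source => alias) for the paths that have an alias.
 */
function example_apply_aliases(array $links, $lookup) {
  // Collect the distinct source paths of this chunk.
  $sources = array();
  foreach ($links as $link) {
    $sources[$link['loc']] = $link['loc'];
  }

  // One lookup per chunk; skipped entirely when the chunk is empty.
  $aliases = $sources ? call_user_func($lookup, array_values($sources)) : array();

  // Swap in the alias where one exists; keep the system path otherwise.
  foreach ($links as $key => $link) {
    if (isset($aliases[$link['loc']])) {
      $links[$key]['loc'] = $aliases[$link['loc']];
    }
  }
  return $links;
}

/**
 * Stand-in for the database query (assumption: real code would query
 * url_alias restricted to the given sources).
 */
function example_lookup(array $sources) {
  $table = array('node/1' => 'about', 'node/3' => 'contact');
  return array_intersect_key($table, array_flip($sources));
}

$links = array(array('loc' => 'node/1'), array('loc' => 'node/2'));
$result = example_apply_aliases($links, 'example_lookup');
```

Memory use then grows with the chunk size rather than with the size of the alias table. The sketch uses array() throughout so that it also parses on older PHP versions.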

Comments

jbodony created an issue. See original summary.

poker10 commented:

Status: Needs review » Needs work
Issue tags: -xmlsitemap

Looking at the test results, this patch introduces short array syntax ([]), which does not work on the older PHP versions (before 5.4) still supported by this module. The classic array() syntax needs to be used here.

@@ -177,6 +165,9 @@ function xmlsitemap_generate_chunk(stdClass $sitemap, XMLSitemapWriter $writer,
 
   $last_url = '';
   $link_count = 0;
+  $sources['language'] = [];
+  $sources['loc'] = [];
+  $linkcollection = [];
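
For reference, a minimal sketch of the three added lines written with the long array syntax; the variable names are taken from the hunk above, and nothing else about the patch is assumed:

```php
<?php
// Same initializations as in the hunk, using array() instead of [],
// so the file also parses on PHP versions older than 5.4.
$sources['language'] = array();
$sources['loc'] = array();
$linkcollection = array();
```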

I have not reviewed the whole patch yet.