Problem/Motivation
If an access control mechanism or a hook implementation leaves a sitemap page empty, none of the pages after it are generated.
Steps to reproduce
Sitemaps are generated using batch jobs. Suppose an access control mechanism or hook implementation produces the following:
- page 1: Has 50 links so the page is generated.
- page 2: Has 2 links because the other 48 links were removed by some access control mechanism or hook implementation.
- page 3: Has 0 links for the same reasons as page 2.
- page 4: Has 50 links, so the page should be generated. However, because page 3 was empty, the batch was already marked as finished.
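The failure mode can be modeled in a few lines of Python. This is a simplified sketch, not the module's actual code: the function name, the is_visible callback, and the integer IDs are hypothetical stand-ins for the access checks and the rows in the xmlsitemap table.

```python
CHUNK_SIZE = 50  # links per sitemap page, matching the example above


def is_visible(link_id):
    """Hypothetical access check: IDs 52-149 are hidden from the sitemap."""
    return link_id < 50 or link_id in (50, 51) or link_id >= 150


def generate_pages_buggy(candidates, chunk_size=CHUNK_SIZE):
    """Build pages chunk by chunk and stop at the first empty page."""
    pages = []
    for start in range(0, len(candidates), chunk_size):
        chunk = candidates[start:start + chunk_size]
        links = [c for c in chunk if is_visible(c)]
        if not links:
            # An empty page is mistaken for "no more data".
            break
        pages.append(links)
    return pages


candidates = list(range(200))  # four chunks of 50 candidate links
pages = generate_pages_buggy(candidates)
print([len(p) for p in pages])  # [50, 2]
```

The fourth chunk holds 50 visible links, but it is never reached because the third chunk produced an empty page.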
Proposed resolution
Consider using a two-step process:
- Step 1: For each entry in the xmlsitemap table, attempt to prepare a sitemap link for each sitemap. This intermediate data could perhaps be stored in the database.
- Step 2: Generate sitemap pages using the records created in Step 1.
This way, every XML file except possibly the last will have the same number of records, and empty pages won't end sitemap generation prematurely.
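A minimal sketch of the two-step idea, under the same simplifying assumptions as the example in the steps to reproduce: integer IDs stand in for xmlsitemap rows, is_visible is a hypothetical access check, and an in-memory list stands in for the intermediate storage, which in practice might be a database table.

```python
CHUNK_SIZE = 50  # links per sitemap page


def is_visible(link_id):
    """Hypothetical access check: IDs 52-149 are hidden from the sitemap."""
    return link_id < 50 or link_id in (50, 51) or link_id >= 150


def prepare_links(candidates):
    """Step 1: keep only the links that survive the access checks."""
    return [c for c in candidates if is_visible(c)]


def generate_pages(prepared, chunk_size=CHUNK_SIZE):
    """Step 2: paginate the prepared links, so no page can be empty."""
    return [
        prepared[start:start + chunk_size]
        for start in range(0, len(prepared), chunk_size)
    ]


candidates = list(range(200))
prepared = prepare_links(candidates)
pages = generate_pages(prepared)
print([len(p) for p in pages])  # [50, 50, 2]
```

Because filtering happens before pagination, all 102 visible links are written, and only the last page is short.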
Comments