Please review
#1435264-23: Xml stops working Uncaught exception 'XMLSitemapGenerationException' error
#1435264-34: Xml stops working Uncaught exception 'XMLSitemapGenerationException' error

Original Issue
XMLSitemap was working fine with 6.x-2.0-rc1 generating two pages (links are more than 10.000)
Some day i noticed there was an error preventing cron to finish.

Unknown error occurred while writing to file sites/default/files/xmlsitemap/NXhscRe0440PFpI5dSznEVgmauL25KojD7u4e9aZwOM/1.xml. στο αρχείο /var/www/vhosts/moriasnow.gr/httpdocs/sites/all/modules/xmlsitemap/xmlsitemap.generate.inc στη γραμμή 158.

also i found this in apache's error log:

2012-02-10 09:42:17.497 [NOTICE] [194.219.230.130:57503-0#APVH_moriasnow.gr] [STDERR] PHP Fatal error: Uncaught exception 'XMLSitemapGenerationException' with message 'Unknown error occurred while writing to file sites/default/files/xmlsitemap/NXhscRe0440PFpI5dSznEVgmauL25KojD7u4e9aZwOM/1.xml.' in /var/www/vhosts/moriasnow.gr/httpdocs/sites/all/modules/xmlsitemap/xmlsitemap.xmlsitemap.inc:162
Stack trace:
#0 /var/www/vhosts/moriasnow.gr/httpdocs/sites/all/modules/xmlsitemap/xmlsitemap.generate.inc(155): XMLSitemapWriter->endDocument()
#1 /var/www/vhosts/moriasnow.gr/httpdocs/sites/all/modules/xmlsitemap/xmlsitemap.generate.inc(338): xmlsitemap_generate_page()
#2 /var/www/vhosts/moriasnow.gr/httpdocs/includes/batch.inc(189): xmlsitemap_regenerate_batch_generate(Object(stdClass), 1)
#3 /var/www/vhosts/moriasnow.gr/httpdocs/includes/form.inc(2550): _batch_process()
#4 /var/www/vhosts/moriasnow.gr/httpdocs/sites/all/modules/xmlsitemap/xmlsitemap.module(1436): batch_process()
#5 /var/www/vhosts/moriasnow.gr/httpdocs/sites/all/modules/xmlsitemap/xmlsitemap.module(222): xmlsitemap_run_unprogressive_batch('NXhscRe0440 in /var/www/vhosts/moriasnow.gr/httpdocs/sites/all/modules/xmlsitemap/xmlsitemap.xmlsitemap.inc on line 162 

I downloaded the 6.x-2.x.-dev version with the same errors.

Disabling XMLSitemap module (and uninstalling) seems to work fine again but when the xml file gets bigger with cron jobs, the problem comes again. (for rc1 and dev version also)

Trying to rebuild sitmapxml file, the batch proccess never ends successfully.

Any ideas ?

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

Anonymous’s picture

Priority: Major » Normal

I've no idea. The parent class XMLWriter set a status that caused xmlsitemap to throw a stackdump while executing the endDocument method. Well, that's not exactly true. The this::status is set during the flush operation. The documentation for XMLWriter::flush states the return value as

Return Values

If you opened the writer in memory, this function returns the generated XML buffer, Else, if using URI, this function will write the buffer and return the number of written bytes.

Dave's use of this value is wrong, IMO. It is possible for the number of bytes written with a flush to be zero and thus cause the status to be set to bool FALSE.

If you change the xmlsitemap.xmlsitemap.inc file line 161 from

if (!$this->getStatus()) {

to

if (!$return) {

do you get a valid sitemap.xml files?

I am going to add here:
FINALLY, someone with enough logging and know how to give us a proper report. I believe this issue is the crux of a bunch of other issues plaguing this module.

Alex Andrascu’s picture

I get this in 7.2.x-dev too while trying to regenerate the sitemap via drush.
Worth mentioning this happened after a huge taxonomy vocabulary inclusion (about 4 million entries) and that it was working before with 20K links split in 4 pages (sitemap.xml?page=1 / sitemap.xml?page=2 / sitemap.xml?page=3 ...)

xmlsitemap-regenerate -v
Initialized Drupal 7.12 root directory at /var/www/html/xxx/d7                                                                                                         [notice]
Initialized Drupal site default at sites/default                                                                                                                          [notice]
WD xmlsitemap: XMLSitemapGenerationException: Unknown error occurred while writing to file                                                                             [error]
public://sites/default/files/xmlsitemap/g3XYqcXbSKPVBDODwnT6pq7oqhCFkPryj4vVqrl_Kfc/24.xml. in XMLSitemapWriter->endDocument() (line 157 of
/var/www/html/xxx/d7/sites/all/modules/xmlsitemap/xmlsitemap.xmlsitemap.inc).
WD php: XMLSitemapGenerationException: Unknown error occurred while writing to file                                                                                    [error]
public://sites/default/files/xmlsitemap/g3XYqcXbSKPVBDODwnT6pq7oqhCFkPryj4vVqrl_Kfc/24.xml. in XMLSitemapWriter->endDocument() (line 157 of
/var/www/html/xxx/d7/sites/all/modules/xmlsitemap/xmlsitemap.xmlsitemap.inc).
XMLSitemapGenerationException: Unknown error occurred while writing to file public://sites/default/files/xmlsitemap/g3XYqcXbSKPVBDODwnT6pq7oqhCFkPryj4vVqrl_Kfc/24.xml. in XMLSitemapWriter->endDocument() (line 157 of /var/www/html/xxx/d7/sites/all/modules/xmlsitemap/xmlsitemap.xmlsitemap.inc).
Drush command terminated abnormally due to an unrecoverable error.

Number of links in each sitemap page 5000.
Maximum number of sitemap links to process at once: 50 (not sure if this affects drush )
I'd say earnie is pretty close to answer it.

Please let me know

Anonymous’s picture

Status: Active » Postponed (maintainer needs more info)

Re: #2

Did you try my suggested fix from #1?

Alex Andrascu’s picture

Yep...applied that...still running but got way far than previous runs. I'll let you know how it ends up :)

Alex Andrascu’s picture

Well it went past that error it seems but it eventually died throwing this:

Fatal error: Maximum execution time of 240 seconds exceeded in /var/www/html/blottr/d7/sites/all/modules/xmlsitemap/xmlsitemap.xmlsitemap.inc on line 128
Drush command terminated abnormally due to an unrecoverable error.                                                                                                     [error]
Error: Maximum execution time of 240 seconds exceeded in /var/www/html/blottr/d7/sites/all/modules/xmlsitemap/xmlsitemap.xmlsitemap.inc, line 128

wich i think it's unrelated.

I'm gonna keep trying.

EDIT:
changed drupal_set_time_limit(240); to drupal_set_time_limit(0);
fingers crossed

drush> xmlsitemap-regenerate -v
Anonymous’s picture

Look at your DB logs. Make sure there isn't a DB lock. If one batch set doesn't finish in 4 minutes then you've other issues to deal with. Setting the time limit to unlimited will only cause you to wait indefinitely for the process to end and if there is a DB lock then well...

Also, rebuilding the xmlsitemap should happen during site maintenance. Probably should be a feature to refuse to do the rebuild if the site is active.

FanisTsiros’s picture

Applying #1 i got a valid 2 pages xml file with 12.000 links!
Not sure what #1 does though. Is this safe to keep ?
(sorry i have no idea about xmlsitemap module...)

Now i'll wait for cron jobs to run to see if the problem is still there. I have just rebuilt links (not in maintenance mode, but with cron stopped as long as the batch was working).

earnie, thanks for your time dealing with this.

Alex Andrascu’s picture

Apparently it went fine generating 5004 xml chunks but then thrown this:

Initialized Drupal 7.12 root directory at /var/www/html/xxx/d7                                                                                                         [notice]
Initialized Drupal site default at sites/default                                                                                                                          [notice]
WD xmlsitemap: PDOException: SQLSTATE[HY000]: General error: 3 Error writing file '/dev/shm/MYn0q1Nn' (Errcode: 28): SELECT x.loc AS loc, x.lastmod AS lastmod,        [error]
x.changefreq AS changefreq, x.changecount AS changecount, x.priority AS priority, x.language AS language, x.access AS access, x.status AS status
FROM 
{xmlsitemap} x
WHERE  (x.access = :db_condition_placeholder_0) AND (x.status = :db_condition_placeholder_1) AND (language IN  (:db_condition_placeholder_2,
:db_condition_placeholder_3)) 
ORDER BY x.language DESC, x.loc ASC
LIMIT 5000 OFFSET 2520000; Array
(
    [:db_condition_placeholder_0] => 1
    [:db_condition_placeholder_1] => 1
    [:db_condition_placeholder_2] => en
    [:db_condition_placeholder_3] => und
)
 in xmlsitemap_generate_chunk() (line 164 of /var/www/html/xxx/d7/sites/all/modules/xmlsitemap/xmlsitemap.generate.inc).
WD php: PDOException: SQLSTATE[HY000]: General error: 3 Error writing file '/dev/shm/MYn0q1Nn' (Errcode: 28): SELECT x.loc AS loc, x.lastmod AS lastmod, x.changefreq  [error]
AS changefreq, x.changecount AS changecount, x.priority AS priority, x.language AS language, x.access AS access, x.status AS status
FROM 
{xmlsitemap} x
WHERE  (x.access = :db_condition_placeholder_0) AND (x.status = :db_condition_placeholder_1) AND (language IN  (:db_condition_placeholder_2,
:db_condition_placeholder_3)) 
ORDER BY x.language DESC, x.loc ASC
LIMIT 5000 OFFSET 2520000; Array
(
    [:db_condition_placeholder_0] => 1
    [:db_condition_placeholder_1] => 1
    [:db_condition_placeholder_2] => en
    [:db_condition_placeholder_3] => und
)
 in xmlsitemap_generate_chunk() (line 164 of /var/www/html/xxx/d7/sites/all/modules/xmlsitemap/xmlsitemap.generate.inc).
PDOException: SQLSTATE[HY000]: General error: 3 Error writing file '/dev/shm/MYn0q1Nn' (Errcode: 28): SELECT x.loc AS loc, x.lastmod AS lastmod, x.changefreq AS changefreq, x.changecount AS changecount, x.priority AS priority, x.language AS language, x.access AS access, x.status AS status
FROM 
{xmlsitemap} x
WHERE  (x.access = :db_condition_placeholder_0) AND (x.status = :db_condition_placeholder_1) AND (language IN  (:db_condition_placeholder_2, :db_condition_placeholder_3)) 
ORDER BY x.language DESC, x.loc ASC
LIMIT 5000 OFFSET 2520000; Array
(
    [:db_condition_placeholder_0] => 1
    [:db_condition_placeholder_1] => 1
    [:db_condition_placeholder_2] => en
    [:db_condition_placeholder_3] => und
)
 in xmlsitemap_generate_chunk() (line 164 of /var/www/html/xxx/d7/sites/all/modules/xmlsitemap/xmlsitemap.generate.inc).

Looks like it had a hard time writing to tmpfs wich looks like:

tmpfs                 15799720   2747728  13051992  18% /dev/shm

Question for earnie:
If i restart xmlsitemap-regenerate now what happens ? Is it smart enough to pick up where it left or it'll start all over from the beggining. I don't particulary want to read trough all the module's code but i'll gladly help sorting this module out as currently is kinda useless for big data / high volume websites.
Here's more info about my server just to clarify any memory hog questions:

top - 09:55:11 up 43 days, 13:16,  5 users,  load average: 3.06, 3.26, 2.65
Tasks:  79 total,  10 running,  67 sleeping,   2 stopped,   0 zombie
Cpu(s): 51.7%us,  5.5%sy,  1.5%ni, 37.7%id,  3.6%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:  31549440k total, 26549560k used,  4999880k free,        0k buffers
Swap:        0k total,        0k used,        0k free, 16378360k cached

EDIT:
I got my answer...it seems like it's starting over. Maybe this can be a nice optimization ? Add an index of all sane xml chunks and only regenerate from there on. Or maybe add an option to partialy regenerate.

Anonymous’s picture

Priority: Normal » Major
Status: Postponed (maintainer needs more info) » Active

Re #8, You're MySQL server is running out of memory. You need to look into re-configuring the memory allocations for it.

Re #7, yes it is safe to keep. The original code is making an incorrect assumption about the return value of the flush. I'll try to find some time for a proper patch but Dave can feel free to go ahead with a patch.

Anonymous’s picture

Issue tags: +Release blocker
gordon’s picture

This fixed my problem that I was having. here is a patch.

gordon’s picture

Status: Active » Needs review
FileSize
911 bytes

Fixed up the paths so it will apply

Anonymous’s picture

Version: 6.x-2.x-dev » 7.x-2.x-dev
Status: Needs review » Needs work

While the patch is a work around we really need to modify or possibly remove the getStatus() function and we need to apply it to D7 first then role it back to D6.

troyer’s picture

Patch from #12 fixed my issue in 6.x-2.0-rc1. Sitemap generation is now way faster too.

rooby’s picture

Here is a version of the workaround patch in #12 for D7.

It fixed the issue for me.

Anonymous’s picture

See my comment in #13, it still applies. Rolling the same "needs work" patch doesn't resolve the issue that caused it to need work.

rooby’s picture

I never said it did, and you might notice I didn't set it back to needs review for that very reason.

It is just there as a drupal 7 version of the workaround that was previously provided for drupal 6 for those who either don't know how to fix it better or don't have time to dig into it.

mgriego’s picture

Status: Needs work » Needs review
FileSize
1.04 KB

Patch against 7.x that removes the getStatus() method and associated code.

mgriego’s picture

Same patch against 6.x.

Status: Needs review » Needs work

The last submitted patch, xmlsitemap-1435264-6.x-remove-getstatus.patch, failed testing.

mgriego’s picture

Resubmit corrected 7.x patch.

mgriego’s picture

Status: Needs work » Needs review
FileSize
1.04 KB

*sigh*

mgriego’s picture

It's apparently been a while since I did this... apologies.

mgriego’s picture

And, once again, here's the 6.x patch in the proper format.

Status: Needs review » Needs work

The last submitted patch, xmlsitemap-1435264-6.x-remove-getstatus.patch, failed testing.

mgriego’s picture

Status: Needs work » Needs review

Fixing the status back.

Anonymous’s picture

Status: Needs review » Needs work

The patch presented in #24 is opposite of the patch in #23. However, we don't want to get rid of the flush function.

mgriego’s picture

Sorry, was obviously in a hurry while I was doing this, so I got my file order backwards when rolling the 6.x patch. It's not otherwise identical to #23, though, as the offsets are a few lines off in the 6.x patch due to other differences in that file between 6.x and 7.x.

To your other point, though, I'm not sure why you want to keep the flush() method around... That method's *sole* purpose is to set the status property based on the return value from the actual XMLWriter::flush() method. Since the only place that the status property is used is in getStatus, and we've already determined that the return from XMLWriter::flush() isn't useful to what we're doing, I'm not sure why it needs to be kept. If the getStatus() method is removed, then flush() is setting a property that will never get read.

Anonymous’s picture

The flush() itself has a purpose. We base status on the return value which is either FALSE on failure or the number of bytes written. The issue is that 0 is a valid return value which equates to FALSE. So the change should be something like the following.

- $this->status &= (bool) $return
+ $this->status = $return !== FALSE;
mgriego’s picture

I'm definitely missing something, then. The PHP docs for XMLWriter::flush() don't state anywhere that the flush() method *might* return FALSE... It just says that it simply returns either the buffer or the number of written bytes, never FALSE...?

Anonymous’s picture

Well, sorry for the noise in #29. We still need the flush() it is used elsewhere but setting status with the return value is bogus. So the endDocument function should check the return value from the parent, the getStatus function should go away, the setting of the status variable should go away. I don't think anything else is extending XMLSitemapWriter so we should be good with that.

mgriego’s picture

Right, OK... Checking the $return from the parent::endDocument() is exactly what my patch does, so we're good there.

As to the flush() method, here is what it was before:

public function flush($empty = TRUE) {
  $return = parent::flush($empty);
  $this->status &= (bool) $return;
  return $return;
}

Notice that the *only* thing it does, other than calling parent::flush() is set the value of the status property, which we are no longer using. So, again, I see no reason to override it. Removing the method from the XMLSitemapWriter *sub*class won't remove the method from the class... it will just inherit the method from its parent, as all subclasses do. And, since it would no longer be doing anything that the inherited method wasn't, there's no reason to override it. Hence, I removed the method override in my patch... no longer necessary.

And, for the record, I checked the rest of the code, and the *only* place I see the flush() method being called directly is from the writeSitemapElement(), and it's calling it in the object context, so it will use the inherited method if there is no overriding method, just as it should.

So, based on all that, I still don't see what's wrong with the patch I supplied.

Anonymous’s picture

Your points are accepted as valid and I have no further objection. I'm really not far from a OO beginner in my logic thinking (I'm a strong procedural programmer) and didn't think about the fact that removing the overridden function doesn't remove the function from the object in total. But I agree with everything you've said. We just need to correct the D6 version of the patch. Add -D6 to the file name on the upload so that the testbot leaves it alone.

mgriego’s picture

Status: Needs work » Needs review
FileSize
1.04 KB

Sure thing, here ya go.

Anonymous’s picture

Should we add a parent::flush() after the end of the if {} in the endDocument() function since we've written data with the endDocument()?

mgriego’s picture

I wouldn't think so... I'm making an assumption here, but I would think that calling endDocument would actually force a flush of the buffer since you're closing out the XML document. Though, if you do decide to add a flush call, you don't need to call parent::flush(), rather $this->flush().

Anonymous’s picture

Status: Needs review » Needs work

It doesn't say so in the documentation so therefore let's not assume that it does and add it.

pianomansam’s picture

Just reporting in that patch from #34 seems to be working for me.

Taxoman’s picture

So, is it time to push this one to 7.x-dev?

Anonymous’s picture

So, is it time to push this one to 7.x-dev?

As soon as the patch is corrected for #35 - #37. I haven't accepted the fact that we do not need parent::flush().

Carsten Müller’s picture

Status: Needs work » Needs review
FileSize
675 bytes

Hi,

i had the same problem. It appears if the return value in the function flush is 0.
then (bool) $return will become to FALSE and will not change back again.

From the header in parent::flush

@return mixed If you opened the writer in memory, this function returns the generated XML buffer,
* Else, if using URI, this function will write the buffer and return the number of
* written bytes.

so the number of written bytes seems to be 0.

I just changed the function flush to the following lines and it works

/**
   * Override of XMLWriter::flush() to track file writing status.
   */
  public function flush($empty = TRUE) {
    $return = parent::flush($empty);
    if (isset($return)) {
      $this->status = TRUE;
    }
    else {
      $this->status = FALSE;
    }
    return $return;
  }

Status: Needs review » Needs work

The last submitted patch, xmlsitemap-1435264-7.x-change-flush-status.patch, failed testing.

Anonymous’s picture

Status: Needs work » Needs review

Re: Carsten Müller

We already have a patches for this. Your patch provides a result that is no different than the already bogus result and will not be accepted.

See the following comments for the patches needing review:
#1435264-23: Xml stops working Uncaught exception 'XMLSitemapGenerationException' error
#1435264-34: Xml stops working Uncaught exception 'XMLSitemapGenerationException' error

asb’s picture

I'm experiencing the 'XMLSitemapGenerationException' error from tome to time with D6. The patch from #34 seems to fix the problem, including Cron runs through Drush. Thanks @mgriego!

Since there also are other confirmations that this patch is fine (#38), I'd like to set the status to RTBC, but I can't comment on #35-#37. I can just say that before applying the patch, cron dies and sitemap generation fails, and after applying the patch, cron finishes and sitemap generation succeeds (tested on a site with split sitemaps with 132,209 links).

FanisTsiros’s picture

#1435264-23: Xml stops working Uncaught exception 'XMLSitemapGenerationException' error Works for me ...finally.
I had the same issue for my new website built on drupal 7. This is a new News Aggregator site collecting news with feeds from more than a thousand sites. At the time, we have about 600.000 nodes. Xmlsitemap is 13 pages in a month.

Without the patch sitemap xml generation stopped sometime when we had about 10 pages. Rebulding links was not working throwing unknown error. With the patch #23, i rebuild links successfully and everything goes smoothly !

klausi’s picture

Status: Needs review » Reviewed & tested by the community

Patch from #23 seems to solve the problem.

klausi’s picture

Issue summary: View changes

Add the comments containing the patches for review.

lmakarov’s picture

Just another confirmation that #34 fixes this on 6.x-2.0-rc2. Thanks!

edrokov’s picture

#34 fixes this on 6.x-2.0-rc2. Thanks!

bessone’s picture

#23 fixed my problem with 7.x-2.0-rc2

Dave Reid’s picture

Status: Reviewed & tested by the community » Fixed
lostboys’s picture

xmlsitemap-7.x-2.0 and xmlsitemap-7.x-2.x-dev both give me:
XMLSitemapGenerationException: Generated public://xmlsitemap/NXhscRe0440PFpI5dSznEVgmauL25KojD7u4e9aZwOM/1.xml resulted in an empty file. in XMLSitemapWriter->endDocument() (line 156 of /var/www/www.******.com/public/sites/all/modules/xmlsitemap/xmlsitemap.xmlsitemap.inc).

Running the cron gives me a connection reset.

Any idea how to get it working?

thnx

markusa’s picture

response to comment #51, check out this issue, looks like a different problem (with solution) : https://drupal.org/node/2220707

bryanhidalgo’s picture

Solution on #52 fixed my problem

iantresman’s picture

#52 pointing to "Error on sitemap generation: "An AJAX HTTP request terminated abnormally." #6 resolved this for me, except that in XML Sitemap 6.x-2.0, the offending line is on line 161 in the file xmlsitemap.xmlsitemap.inc.

I was getting blank white pages on "Rebuild links", and running the cron. I am running Drupal 6.30.

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.