As discussed on #361171: How to debug cron stoppage problems with bad PHP content, sometimes people include bad PHP in a node.

During search indexing, cron will abort, run into an infinite loop, etc.

This feature request is for the Search module to handle the situation better, so that cron will not have trouble just because someone put something bad into their node content. Some sort of exception handling, timeout, etc.

Comments

jhodgdon’s picture

Damien Tournoud’s picture

I already described the only possible way to handle that (and the related redirect issue):

http://drupal.org/node/286263#comment-979102

jhodgdon’s picture

I tried to get this issue and 286263 to be dups of each other, but was overruled over on 361171...

jruberto’s picture

there are a bunch of dupes of this issue. i'm of the opinion that it's a node.module issue and not a search.module issue -- could be handled with an error trap in node_update_index() in node.module, see 744276 for more in-depth description.

also of the opinion that the resultant "Cron run exceeded the time limit and was aborted" error message is a bug worthy of fixing before 8.x.

#361171: How to debug cron stoppage problems with bad PHP content
#744276: Crashy PHP node causes search index to fail, misreports condition as "Cron run exceeded the time limit and was aborted"
#286263: Make search indexing smart to handle nodes wth php redirects
#308808: Drupal_cron_cleanup paints the wrong picture
#752184: Keep running cron when bad PHP content is encountered
#643474: Invalid PHP code in node, causes whole cron task to crash

jhodgdon’s picture

Thanks for finding all of those...
361171 is now a support request for debugging support.
744276 I have just marked as a duplicate of this issue.
286263 is a related issue (with probably the same fix) but has been requested to be kept open.
752184 is this issue.
643474 is already marked as a duplicate of this issue.

That leaves 308808, which is definitely related, but it is asking for a better error message in the logs, rather than a fix for the underlying problem, so it's not really a duplicate IMO.

As If’s picture

The difference between 286263 and all these other ones (where the phrase "bad code" keeps getting used over and over) is that using drupal_goto() is NOT "bad code". So you are correct here when you say that there are two issues. 286263 is a BUG because part of Drupal doesn't understand how to work with the Drupal API. The others are support or feature requests because sometimes people write bad code.

jhodgdon’s picture

I somewhat disagree with that statement. drupal_goto() in node content is not standard, nor is it a good practice. If you are doing something that needs a drupal_goto(), it should probably be done in a module that uses hook_menu() to define a URL, and does whatever it wants to define the content that should be shown in that URL. Putting things like this into the body of a node is a hack.

As If’s picture

Interesting response. So at risk of going OT, do you consider using any and all Drupal API commands within PHP content to be a hack? And is this the general consensus, a documented best practice, or a personal opinion?

jhodgdon’s picture

Generally, I would say it's dangerous and hackish to do any programming in node bodies, and I would personally never ever enable the PHP input format module. I would put all programming either in a module or a theme.

But please solicit other opinions. I think most core Drupal developers would agree with me, but I haven't done a survey.

John Pitcairn’s picture

Interesting indeed (subscribe).

Damien Tournoud’s picture

Running any function with side-effect inside the node body is indeed bad code.

As If’s picture

Fair enough, I do see your point from a security perspective. But when I see the phrase "bad code" I think of things like unbalanced brackets or unclosed quotes, arrays confused with objects, syntax errors, one-off errors, eternal loops, etc.

In other words, if ALL PHP code inside a node is "bad code" then it's really a rather moot point to differentiate bad from good, no?

jhodgdon’s picture

Truly -- you have grasped it.

The reported issue here is to keep running cron when search indexing encounters a node that it cannot index at all, due to a PHP error or some other logic problem (like a drupal_goto() -- well, that's being discussed on another issue) in a node body. Our feeling, as core Search module maintainers, is that this is due to a programming error by a site builder, so we shouldn't work too hard to make cron keep running when a site builder has made an error. Just as your site will crash if you install a custom module with a PHP error in it, and the node page in question will probably have a white screen of death if you try to view it -- why should search_cron() do anything special to get around this type of problem?

That is why this issue has been pushed off to Drupal 8 as a feature request (meaning it may or may not ever have an accepted resolution) rather than being jumped on as a serious Drupal 7 bug (meaning it has higher priority for a fix).

Does that make sense?

John Pitcairn’s picture

Sounds fair enough to me too. My interest was that under certain dynamic conditions, I issue a drupal_goto() in the hook_nodeapi view operation. hook_menu won't work for this. The drupal_goto() is wrapped in a cron semaphore check so cron operations won't abort.

inforeto’s picture

How do you do a cron semaphore check to manually wrap code that calls drupal_goto in drupal 6?

John Pitcairn’s picture

Basically:

if (!variable_get('cron_semaphore', '')) {
  drupal_goto(...);
}

But note this also means that regular users will not get the drupal_goto() while cron is running.

filijonka’s picture

hi

I would actually like to add this issue #302739: Watchdog interferes with cron semaphore deletion in drupal_cron_cleanup() as a part of this and say that this might not only be considered as a problem with the search module. And also that this isn't a feature request but a bug.

Dave Cohen’s picture

I've seen the comments indicating this is not considered a problem with the search module. I'm not so sure about that, since PHP code is probably doing something dynamically, showing different content to different users. The search index is therefore likely to contain something that the current user will not see when they visit the node with PHP code.

More immediately, I am looking for some way to determine, in my PHP code, whether the search index is currently being built. That is, I'd like put something like this in a text field with PHP code filter...

if (search_index_is_currently_being_built()) {
  // Do nothing.
} else {
  // Do something.
}

My problem is I can't figure any way to test for that. The fact that index is being rebuild seems to be known as the view mode of the node. But the filter, or code being evaluated has no way to learn that, as far as I can tell. I'm trying to do this in D7, by the way.

Any help is appreciated. Thanks.

UPDATE... I'm using the following, for the time being. Using the fact that cron module sets a lock, and search index is rebuilt via cron (even when cron is running in page footer, which is how I first noticed my problem). But I'd like a cleaner test, so feel free to suggest improvements. Thanks.

  // Avoid any calls to drupal_goto when Drupal search index being built, which is done via cron.
  if (!lock_may_be_available('cron')) {
    return;
  }

  // The code I'd like to run when the node is viewed goes here.
ianthomas_uk’s picture

Version: 8.x-dev » 7.x-dev
Issue summary: View changes

The php filter module has been removed from Drupal 8 for exactly this sort of reason - it's too powerful and can't be locked down.

I'll set this back to D7 in case anyone wants to work on a 'fix', but my recommendation to anyone encountering this problem is to disable the php filter module and find alternative ways to do what you're trying to do (e.g. write a custom module).

jhodgdon’s picture

Status: Active » Closed (won't fix)

We don't do feature requests in D7. So this either needs to be deemed a bug or left in D8 or marked "won't fix".

Given the previous comments, I think the "won't fix" option is the most realistic. No one is working on the core Search module in D7, and as we do not have the PHP filter in D8, we probably won't fix it there.