I had a problem with the cron.php, which was not working properly, instead redirecting to a page on my system.

After some research I found out that in one of my nodes I used php code which stated that if the user wasn't supposed to be on that page he was redirected to the previous page. This caused the cron to fail, because it was redirected. (And without proper error messages.)

The problem seems to be related to the search module and the drupal core, so I decided to post it on the Drupal project instead of the search module, I hope i made the right decision.

The issue is seemingly deep inside the core, because the cron.php is using a function cron.php::module_invoke_all('cron') which in time is invoking search.module::search_cron() which eventually executes search.module::module_invoke('node', 'update_index'), which eventually leads to common.inc::drupal_eval($code)

And if $code has

  if (user_did_something_wrong()) {
    // return the user back to where he did the thing wrong
    header('location: node/144');
    exit;
  }

I know this php code is a bit unusual in drupal and shouldn't be used.. but i think its a bug if it's not working at all.

CommentFileSizeAuthor
#5 common.inc don't redirect on cron run.patch392 bytesBevan
Support from Acquia helps fund testing for Drupal Acquia logo

Comments

pwolanin’s picture

how about using drupal_goto() instead?

killes@www.drop.org’s picture

Status: Active » Closed (won't fix)

this is maybe a documentation issue, but it is quite clear that php code will get executed if you tell Drupal to.

pwolanin’s picture

@rbaarsma

from your note to me:

I've tried to use drupal_goto() but it breaks the cron just as much as header('location: ') does.
Do you possibly have an idea how to work around this?

Ok, well the usual alternative is to use an access control module. In this case, the user will get an "access denied" page, rather than being redirected. The search indexing automatically bypasses access control, so that everything is indexed even if it is not accessible to all users.

You might also try testing for the variable "cron_busy", but this might mean that sometimes users don't get redirected. Or, lastly, just run cron manually (visit in your browser example.com/cron.php) when logged in as a user who is not redirected. Once you've indexed that page once, you should be ok.

http://cvs.drupal.org/viewcvs/drupal/drupal/cron.php?rev=1.34&only_with_...

rbaarsma’s picture

Thanks for your response.

I think I will go try use access controll for this. It would've probably been better to have made a new module instead of php code in the first place. But I still think it shouldn't be this hard in drupal to redirect.

Anyways,
thanks a lot

Bevan’s picture

Title: cron.php seems to execute php in nodes?! » cron breaks on drupal_goto
Version: 4.7.4 » 6.x-dev
Status: Closed (won't fix) » Postponed (maintainer needs more info)
FileSize
392 bytes

Hmmm. We have a similar situation, but rather than being an access issue, the issue is that a certain node type has no meaning for the user as a full page view. It's a multisite config, and the node type is for a data structure which is referenced from this site. So I have used drupal_goto in the site's module's hook_nodeapi implementation. I've tried out the attached patch, and it works fine. I wonder if a better solution would be to allow search module to be configured to search only certain node types...

Bevan’s picture

There's a better solution to this general problem over here: http://drupal.org/node/111744

For now, I'm using this method (can't be bothered porting that patch to 5.x), but I've left it in the custom module, around the call to drupal_go_to():

          if ($_SERVER['SCRIPT_NAME'] != '/cron.php') {
              drupal_goto('node/'. $node->parent['nid']); // redirect the web browser to the other node
          }
Gábor Hojtsy’s picture

Status: Postponed (maintainer needs more info) » Closed (works as designed)

As I said over at http://drupal.org/node/224875 we cannot do a blacklist of functions to not run. You cannot intercept running exit; for example, but that would halt the cron process just as well. Users using the PHP input format should just assume some responsibility for what they do, and that using PHP requires some understanding of the language as far as I can see. Bevan' code in #6 should be in codes using drupal_goto() from PHP inputs.

cvining’s picture

Version: 6.x-dev » 6.9

Bevan,

Perfect. I ran into precisely this issue when I upgraded from v5.12 to v6.9. This snippet saved me!

-- Cronin

agerson’s picture

This didn't work for me. I believe it is because my Drupal is installed in a sub-directory. This was my work around:

<?php
  $script_name= str_replace(base_path(), '', $_SERVER['SCRIPT_NAME']);
  if ($script_name != 'cron.php') {
    drupal_goto('node/'. $node->parent['nid']); // redirect the web browser to the other node
  }
?>
shanesj’s picture

@agerson

Thank you very much for that snippet. It is exactly what we needed and it saved my sanity!

jaypabs’s picture

@rbaarsma

Thanks man. It was the search module that causing this problem. After disabling the search module the cron runs fine.

Thank you very much

rsvelko’s picture

thanks guys - great that I find this thread. In my case it was a rules.module Rule that spoiled my day. I had 3 options - patch drupal_goto, patch rules.module AND remember to wrap every drupal_goto we use in an if...

Personally I chose to patch core. Due to the way we update core (with a pre-diff and after-patch) this works for us.

mike503’s picture

I am looking into implementing something like this:

if ($node->build_mode == NODE_BUILD_SEARCH_INDEX) {
   $node->body = preg_replace('/drupal_goto/', '#drupal_goto', $node->body);
}

The other solution is just removing anything inside of PHP blocks but the PHP logic -might- be generating useful content.

However, the drupal_goto function is messing with things.

Either drupal_goto should be overloaded to return false or something during cron, or something like above. This is really a killer and I'm sad to see 4 years later it's still something that isn't addressed easily.

mike503’s picture

My workaround:

1) Install SuperCron module
2) Make a custom module (below)
3) Make sure this custom module runs BEFORE the search/apachesolr/etc. jobs, so it can define the hook_nodeapi() ahead of time

This seems to fix the issue by "commenting" out drupal_goto. I was thinking of using /*drupal_goto*/ but nested comments could be an issue. I also thought of preg_replace'ing the php code out completely, but some people may use it for generating content.

That's why this is a preg_replace() and not just an str_replace() - I started out with planning on doing something more fancy, and got lazy. Also it does not take into account only processing this if it's in a PHP block. So if the page -talks- about drupal_goto, it's going to have a hash in it in the search results. Oh well :)

I had to define a global of _cron_running because I could not detect that this was being triggered under cron, and I could not check cron_semaphore because that doesn't take into account if node op of 'view' is occuring from the frontend or not. I would have thought Drupal would have an internal state of "I'm running cron" defined somewhere but couldn't find one. Only a semaphore check.

I also would think after 4 years of this being open that something could be done to detect cron or being indexed ($node->build_mode = NODE_BUILD_SEARCH_INDEX?) and fix little things like this drupal_goto issue.

function searchfix_cron() {
        $GLOBALS['_running_cron'] = true;
}

# break drupal_goto inside of "PHP Filter" style content
# otherwise, search indexing breaks because of this.
# please, don't allow PHP Filter based content. seriously.

/**
 * Implementation of hook_nodeapi().
 */
function searchfix_nodeapi(&$node, $op, $a3 = NULL, $a4 = NULL) {
        if($op == 'view') {
                if(isset($GLOBALS['_running_cron']) && $GLOBALS['_running_cron'] == true) {
                        $node->body = preg_replace('/drupal_goto/', '#drupal_goto', $node->body);
                }
        }
}
puddyglum’s picture

Thanks @agerson, and in order to run cron manually, for instance, from Admin Menu, and to keep it from running when you edit it, I made some modifications:

<?php

//normally script name is index.php, but is cron.php if it's loaded via cron
//do not execute the drupal_goto if script name is cron.php,
// or if the node is loaded for anything other than being viewed

$script_name= str_replace(base_path(), '', $_SERVER['SCRIPT_NAME']);

$arg0 = arg(0);
$arg2 = arg(2);

if($script_name != 'cron.php'
   && $_SERVER['REQUEST_URI'] != '/admin/reports/status/run-cron'
   && !($arg0 == 'node' && !empty($arg2)))
{
  drupal_goto('your path here');
}

?>
adaddinsane’s picture

Solution #14 certainly the most elegant but commenting out drupal_goto is potentially dangerous as it might cause all the code to fail - yes, it ought to be on a line by itself but it may not be.

Instead replace it with searchfix_noop and define a function:

function searchfix_noop() {}

(In fact it might be even better to replace the entire call, plus parameters, with return; so that execution isn't compromised.)

In addition all the solutions here assume that only $node->body contains the PHP, there could be CCK fields as well (I had them).

My twopennyworth.

frdesign’s picture

#15 seems to be working for me, thank you!

tarzadon’s picture

Thanks. #15 worked. I'm not sure if I'm going to need $arg(0) and $arg(2) yet, so I didn't include that part.

HansKuiters’s picture

For people finding this page, my solution. The breaking cron problem in my case was caused by the CCK Redirection module. I solved it by changing this modules weight in the system table to a higher value than the Apache Solr module.

mtem27’s picture

#15 worked for me.

Thanks!

_vid’s picture

I realize this issue is closed but as it remains to be a problem for many I thought I'd contribute the best solution I've encountered thus far; a function that checks for 3 different flavors of cron: is_cron( ).

<?php
if(!function_exists('is_cron')){ //don't re-declare functions
  /**
   * function is_cron()
   * Tests for cron using best known practices
   */
  function is_cron() {
    if (php_sapi_name() == 'cli') {
      // This may mean cron via Drush
      return TRUE;
    }
    if(preg_match('/\/cron.php$/', $_SERVER['SCRIPT_NAME'])) {
      //check for cron.php in the url. *preg_match accommodates sites in sub-directories
      return TRUE;
    }
    if (arg(0) == 'admin' && arg(1) == 'reports' && arg(2) == 'status' && arg(3) == 'run-cron') {
      return TRUE;
    }
    return FALSE;
  }//end function is_cron()
}//end if(!function_exists('is_cron')

//Don't redirect if cron
if(!is_cron()){
  drupal_goto($preferredLink);
}//end if(!is_cron())

Note: I've wrapped it in a '!function_exists()' if statement to be sure the function is only declared once.

Once that function is in place you can test it by running cron every way you know how and look for errors in watchdog. I've tested this thoroughly for my applications; updating nodes in offending content types, rebuilding my search index and running cron from the reports menu, from the cron.php url and via drush cron. This function seems to solve the problem in both D6 and D7.

Previously I had used: if ($_SERVER['SCRIPT_NAME'] != '/cron.php') {} but that didn't detect drush cron. Thanks to Aron Novak for posting his function here: http://drupal.org/node/788574#comment-4182470

Alternatively, I thought this hook_node_view solution was an interesting option too: http://drupal.stackexchange.com/questions/4169/is-it-okay-to-use-drupal-...

adaddinsane’s picture

Nice one. If I could +1 it, I would. :-)

vb_swapnil’s picture

After disabling core search module my cron run seccessfully. But i need this module for search indexing . Is there any way to fix search module probs. When I enebled search module and run cron it will redirect to subscription form page.
Please suggest .

puddyglum’s picture

@vb_swapnil, I'm assuming you know that an implementation of drupal_goto() is likely causing this problem. The next step is to find which node or page is calling drupal_goto() when it gets loaded, and then use the suggested solutions above to prevent drupal_goto() from executing if cron is running.

PROMES’s picture

@ _vid #21: thanks for your solution. It solved a real headache.
If you are using Elysia-cron, you can add before "return FALSE":

  // start search with Elysia-cron
  if (module_exists('elysia_cron')) {
    if (arg(0) == 'admin' && arg(1) == 'config' && arg(2) == 'system' && arg(3) == 'cron' && arg(4) == 'execute' && arg(5) == 'search_cron') {
      return TRUE;
    }
  }
HansKuiters’s picture

@promes #25: Why not use $_GET['q'] in this case?

if (module_exists('elysia_cron')) {
  if ($_GET['q'] == "admin/config/system/cron/execute/search_cron") {
    return TRUE;
  }
}
PROMES’s picture

@capono I used the same syntax as in #21. But your lines are more simple.
I did test it and updated my code in favor of your idea.
Thanks

HansKuiters’s picture

There is nothing wrong with the syntax, but this is much easier to read. And the arg() function also uses $_GET['q'].

leisurman’s picture

Issue summary: View changes

How can I search every node for php content. I have multiple cck text fields.
I can use the query but its just for the revision body table.
SELECT * from node_revisions WHERE node_revisions.body LIKE '%drupal_goto(%';