Splitting off from #331611: Add a poormanscron-like feature to core

otherwise if (anon) users are viewing cached pages and no one is logged in then even if cron is overdue it will not be triggered until either someone logs in or an anon user hits an uncached page.

I don't see why this is such a problem. This is a failsafe, if you specific 3 hours, it'll run some time after 3 hours has elapsed. If that ends up being 5 hours, tough luck.

There's two reasons to have poormanscron in core. First to allow people to run sites without any external cron at all - so that search engine indexing and other things continue to work. If it doesn't run on the dot that's fine, it's a poor mans cron - for a proper cron job you can go set up an external cron job and have punctual cron runs anyway.

The other reason is to have a failsafe in case cron stops running for any reason - typo in crontab -e, move server and forget to set up a new job, things like that which can bring any remotely busy site to a halt as the watchdog table grows to hundreds of thousands of entries and cache tables fill up with stale data - in this case, it's also fine if it takes an extra few minutes (or even hours) to run - since that's better than not at all.

I don't think there's either situation where having an extra uncached bootstrap for every request is going to be acceptable though - it was put in an image to not affect visible performance for users - doubling the load on every site enabling this feature is going to do that anyway one way or another.

CommentFileSizeAuthor
#189 cron-breakage-566494-189.patch963 bytesDavid_Rothstein
#177 cron_breakage.patch956 bytescatch
#173 register-shutdown-cwd-706608.patch799 bytesDavid_Rothstein
#154 enough_of_this_really.patch9.25 KBchx
#148 566494-fix-page-cache-tests.patch989 bytesDamien Tournoud
#140 cron-patch-merge.patch10.59 KBDavid_Rothstein
#132 cron-patch-merge-v2.patch12.61 KBGábor Hojtsy
#131 cron-patch-merge.patch10.59 KBGábor Hojtsy
#122 rollback-cron-566494-122.patch13.59 KBDavid_Rothstein
#120 rollback_7.patch13.11 KBGábor Hojtsy
#118 rollback.patch9.05 KBcatch
#108 566494-cron-clear-page-cache-D7.patch754 bytesDave Reid
#104 566494-cron-clear-page-cache-D7.patch817 bytesDave Reid
#103 run-cron.patch517 bytesmoshe weitzman
#95 566494-95-cron-causes-two-bootstraps-per-page.patch14.15 KBAnonymous (not verified)
#93 566494-93-cron-causes-two-bootstraps-per-page.patch14.17 KBAnonymous (not verified)
#90 566494-90-cron-causes-two-bootstraps-per-page.patch13.45 KBAnonymous (not verified)
#87 566494-cron2.patch13.01 KBAnonymous (not verified)
#79 566494-cron-best-of-both-worlds-D7.patch14.26 KBDave Reid
#77 566494-cron-best-of-both-worlds-D7.patch14.23 KBDave Reid
#76 566494-autocron-D7.patch12.05 KBDave Reid
#73 screenshot_003.png106.73 KBcatch
#70 cache_expire.patch1.73 KBchx
#68 x_drupal_cron.patch3.52 KBchx
#67 x_drupal_cron.patch3.51 KBchx
#66 x_drupal_cron.patch2.12 KBchx
#60 cron.testing.patch3.43 KBDavid_Rothstein
#55 cron.diff1.4 KBmoshe weitzman
#45 566494-autocron-D7_2.patch10.92 KBTheRec
#40 566494-autocron-D7.patch12.05 KBDave Reid
#38 566494-autocron-D7.patch12.02 KBDave Reid
#37 589438-poormanscron-TNG-D6.patch12.83 KBDave Reid
#36 566494-autocron-D7.patch12.3 KBDave Reid
#31 566494.patch5.53 KBRobLoach
#24 566494-cron-single-request-demo.patch1.62 KBDavid_Rothstein
Support from Acquia helps fund testing for Drupal Acquia logo

Comments

alexanderpas’s picture

what if we remove the image from the DOM (if JS is enabled) on $(document).ready() -- what would be the effect of that?

TheRec’s picture

Just to summarize, the image is requested with XMLHttpRequest when Javascript is available and through a simple HTTP request created with the <img> tag when Javascript is not available.With page caching it is not reliable to put those triggers for automatic cron in the page only when it is really needed (a.k.a when cron threshold is passed) because we could have most page cached without the triggers in them and thus mostly disable the feature. Right now the triggers are added on every request (only one of them is used, depending on the Javascript availability), as soon as the functionality is enabled.

alexanderpas> Removing it with Javascript doesn't make much sense, as it would be even easier not to put these triggers in the page in the first place, but then we end up having the page caching issue I've described above.

So in this issue we should investigate if there is a more efficient and at least as much reliable solution to include the triggers to run cron automatically in the page. And we must keep in mind that this feature is not planed to be used by heavy traffic sites, it is for unexperimented users who either don't care or don't know how to setup a scheduled task to run cron on a regular basis.

gpk suggested these solutions :

A couple of options come to mind to get round this:
1. make page caching incompatible with the built-in cron feature. Then Drupal can decide on each page request whether to send the image. Although this would halve the number of page requests made to Drupal by real users with browsers (as distinct from bots), you would lose the benefit of a potentially faster page load for anon users
2. make the decision on whether to output the image in the DRUPAL_BOOTSTRAP_PAGE_CACHE phase in http://api.drupal.org/api/function/_drupal_bootstrap/7 ... if cron is due to run then we would need to prevent a cached response by adding a condition to the check here:

      // If there is no session cookie and cache is enabled (or forced), try
      // to serve a cached page.
      if (!isset($_COOKIE[session_name()]) && $cache_mode == CACHE_NORMAL) {

Would also need to make sure that when the image is included in the page, drupal_page_is_cacheable(FALSE) is invoked so that pages which include a reference to the image never get stored in the page cache.

I agree with catch, disabling page cache completely when this functionality is enabled is not really an option because this would certainly slow the whole system more than the current implementation.

Using a conditionnal inclusion of the triggers during the drupal_bootstrap() could be a solution, but what do you suggest ? The bootstrap will ignore cache completely for the current request to display a page including the triggers (I wouldn't know how to achieve this without clearing cache and it would mean clearing cache every time the cron threshold was passed, there is surely other solutions but I don't know them) ? Is that what drupal_page_is_cacheable(FALSE) would do ? Reading at its documentation on a.d.o does not sound like it :S

alexanderpas’s picture

how about reducing the max cache lifetime to 3 hours when using built-in cron?

TheRec’s picture

The goal is not to force the use of minimum cache life time, which is disabled by default and which should be used on heavy traffic sites only I think... I would rather look in the direction of the second suggestion of gpk, i.e. during bootstrap, when the feature is enabled, check if the current request is supposed to trigger the cron, and if it does do not use cache and generate the page with the included triggers.

moshe weitzman’s picture

Priority: Normal » Critical

Can't ship with the double bootstrap problem. I think we can add a JS setting about the time that cron is next scheduled to run and then only add the image if this is past. Not sure if our run-cron callback clears page cache but it should.

Not too worried about the noscript case.

TheRec’s picture

Not so great idea... someone could mess with the time check (since it would be client-side) and run cron every second... that would surely improve the site load at least we'll have spared another bootsrap LOL. More seriously, if anyone is experimented with drupal_bootstrap() they should investigate in the second solution suggested by gpk (see the quote in #2), this will not be solved on client side.

I am not sure this is really critical... if you're not happy with the feature, just disable it (it's a variable...so you could even disable it in a profile AFAIK). It's intended for a specific class of users who do not want or cannot setup cron themselves.

moshe weitzman’s picture

TheRec - I don't understand your comment. A client can just script a call to run-cron callback every second no matter what we put in our page_bottom region and no matter how often it gets put in there. For that matter, a client can almost DOS your site by requesting complex taxonomy pages and so on.

Anyway, the run-cron callback needs to just refuse to run cron if it is not overdue.

We are simply not shipping drupal with a default configuration that does two bootstraps for every page view. No way, not over my dead body. Lots of people are going to use our out of the box config, not just a few newbies.

TheRec’s picture

moshe weitzman> You are right, my comment doesn't make sense. Checking the cron_threshold value in Javascript before making the XHR call could be a solution, or at least half a solution since, as you said, it would leave the <noscript> tigger and every user without Javascript would still cause 2 bootstraps on every request they do (that might not be a lot of people, but if it's not acceptable for users with Javascript, it isn't either for users without it).

Anyway, the run-cron callback needs to just refuse to run cron if it is not overdue.

The triggers do not execute cron, unless the cron threshold is passed, but we include them in every page to avoid problems with cache, as I already pointed to you #331611-61: Add a poormanscron-like feature to core, Damien Tournoud said :

[...] we want to always display the image (if the threshold > 0), but only call drupal_run_cron() if the cron needs to run (and it is not already running). This is because the image tag could be inserted in a cached page, so we always need to return at least a correct image. [...]

The issue is that the triggers are included in every page right now and this is the cause of and additional bootstrap for every request, we need to find a solution that will work with cached pages (as disabling the cache completely when the feature is enabled is not a better solution) and that does not to include the triggers when they are not required (i.e. when cron threshold is not passed). The solution would be to NOT to use the page in cache (get) for a request that happens after the cron threshold is passed (I suppose this would be at early stage, during bootstrap... as gpk suggested). The cron triggers would be included only at this moment, we would also need to ensure the page does not get in cache (set) when the triggers are included.
I'd love to post a patch for this, but I do not know enough about caching in Drupal to do it and I think that at that early stage of bootstrap, the system.module is not loaded yet and checking variables is also more difficult.

catch’s picture

Looks to me like in system_run_cron_image_access() needs to add an additional check for whether the threshold is greater than the last time cron was called.

Then, system_page_alter() would need to set $GLOBALS['cache'] = FALSE; to disable page caching for that request - it won't stop a page from being served from cache, but it'll stop the page with the image being cached. It should probably also check if there's an existing cron run in progress, so we don't send uncached pages for the full duration of a cron run, or that even belongs in _access().

Then the only side effect there, is that you have cached pages sitting around which won't trigger cron - however that's a side effect of minimum cache lifetime and caching in general, which isn't restricted to cron runs - so we simply say that if the threshold is 3 hours, cron won't run more often than 3 hours, but it might run a lot less often than that. That means you can use it as a stop gap / failsafe without the extra bootstrap, but it really lives up to it's name as poormanscron since there's no guarantee of regularity.

TheRec’s picture

The beginning of your solution (the modification of the access callback) is right, and to be honest it is how I implemented it at first and Damien Tournoud warned me about caching issues, which are true if we do not prevent caching of the pages which includes the triggers.

About preventing the page to get in the cache when it includes the trigger, shouldn't we be using drupal_page_is_cacheable(FALSE) indstead ? What would be the benfits of using $GLOBALS['cache'] ? It is already done with drupal_page_is_cacheable(FALSE) for the image itself :

function system_run_cron_image() {
  drupal_page_is_cacheable(FALSE);

So we'd use the same solution in system_page_build() (which alters the page to add the triggers), to prevent caching of the pages that will include the triggers when they are required. But we should really prevent pages to be fetched from cache in that case, I do not find reliable to "hope" that at some point an user reaches a page when cache expired... this could become problematic too. There should be a reasonable solution to prevent fetching the page in the cache when cron_threshold is passed, no ?

catch’s picture

drupal_page_is_cacheable(FALSE) we should definitely used, I'd completely forgotten that was in core now.

I really think it's OK to rely on either 1. non-cached pages 2. cached pages which have expired 3. any logged in traffic to trigger cron. The problem if we try to do things early in the bootstrap, is it means that logic has to be handled on every cached page request - and you can't do that if using boost or varnish or other page caching not in Drupal itself. Or at least, it's no an ideal situation but you can just set up a cron job (or a contrib module could do it in hook_boot()) and it's better than two bootstraps.

TheRec’s picture

Well hook_boot() sounds like a good idea, now should we really care about other caching methods... if you need to setup something like this to be able to run your site, you're not really supposed to rely on automatic cron feature :S

alexanderpas’s picture

I wonder if this is possible, and weither it would be enough:

function system_boot() {
  $cron_last = variable_get('cron_last');
  $semaphore = variable_get('cron_semaphore', FALSE);
  if (REQUEST_TIME - $cron_last > DRUPAL_CRON_DEFAULT_THRESHOLD) {
    // Prevent duplicate cron calls and allow stuck cron to reset.
    if (!$semaphore || REQUEST_TIME - $semaphore > 3600) {
      drupal_page_is_cacheable(FALSE);
    }
  }
}

(I am not on my usual location)

catch’s picture

Maybe, but I'd rather this stayed in poormanscron.module in D7 - otherwise every site, whether they use this feature or not, as to load the entirety of system.module for every cached page requests. Not as bad as a double bootstrap, but still a mess. Compared to some sites sometimes having a slightly delayed cron run.

alexanderpas’s picture

Would that be a reason to move that hook to a different file?
Someting like a module.bootstap.php file?

catch’s picture

You can't do module.boot.inc because we removed the registry, otherwise that'd get around it yes. However it might be possible again if #557542: Cache module_implements() gets in.

I'm just not sure why delayed cron runs, with what we already know is a highly degraded cron implementation, are such a problem that we need to impose a penalty on all the requests where cron doesn't need to be run.

Damien Tournoud’s picture

I suggest we simply add proper HTTP headers to the image response so that most clients will not perform a request. Adding following in system_run_cron_image() should do the trick:

  drupal_set_header('Cache-Control', 'public, max-age=' . variable_get('cron_safe_threshold', DRUPAL_CRON_DEFAULT_THRESHOLD));

I don't believe we should prevent caching of page containing the Only local images are allowed. and associated js.

moshe weitzman’s picture

Thats a great idea since it helps non js clients also.

I was thinking js code like

if (NOW > NEXT_CRON_RUN_TIME) {
// our current image request javascript goes here.
}

alexanderpas’s picture

@Damien Tournoud
With 1000 different (anon) users / 3 hours we still get 999 unneeded double requests, while not caching that page when determined cron needs to be started only get's us a couple (of seconds) of uncached (anon) requests (between timeout passed an semaphore set again)

David_Rothstein’s picture

I went and read the original issue, and it wasn't entirely clear to me why a double HTTP request was needed at all? There was some talk about problems with register_shutdown_function() and the possibility of the end user noticing a huge delay before the page loaded/etc (in the case where you tried to run cron in the same request), but I'm not sure that was explored very thoroughly.

For example, you can try a very simple test: Edit the bottom of drupal_page_footer() and put a single call to drupal_cron_run() in it; nothing fancy, just that.

The combination of the fact that the output buffer is flushed before this call happens, and the fact that within drupal_cron_run() there is already a call to ignore_user_abort(), means that the end user will see the fully loaded page before the cron run ever starts, and nothing they do in their browser can interrupt the cron run from finishing.

To confirm this for yourself, you can put code like the following in drupal_cron_run(), just to keep it going a while:

for ($i = 1; $i <= 20; $i++) {
  sleep(1);
  syslog(LOG_NOTICE, "Running cron for $i seconds.");
}

So my question is: Why isn't that solution good enough?

The only issue I found while trying this was that even though the fully loaded page appears in the user's browser window right away, they'll still get a "busy" icon (i.e., their browser looks like it's still trying to load something) for the duration of the cron run. However, given that this only happens for one page request every 3 hours or so (and most small sites who would be using this feature are not likely to have very expensive cron runs in the first place), is it really that big of a deal?

Also, reading up on http://www.php.net/manual/en/features.connection-handling.php and http://www.php.net/manual/en/function.flush.php it sounds like there are some possible solutions for that problem too, by explicitly closing the browser connection but still being able to do processing after that... In my quick trial, they didn't work, but perhaps it's possible to make them work.

TheRec’s picture

The only issue I found while trying this was that even though the fully loaded page appears in the user's browser window right away, they'll still get a "busy" icon (i.e., their browser looks like it's still trying to load something) for the duration of the cron run.

That is the reason why we do not use register_shutdown_function(). It is not considered acceptable to have this impact on random users, thus we create another request with XHR and, if Javascript is disabled, with an <img> (it is wrapped in <noscript>, this was discussed long enough in my opinon. Unless you bring a solution that is transparent for every visitor I doubt it will be accepted, because this was the main concern for a long time in #331611: Add a poormanscron-like feature to core.

To get back to the current issue, the two current propositions are :

  1. Find a solution for a for the second request to be executed only when threshold is passed, but it should work on cached pages (it means that if cron threshold is passed, the following requests will not use cache, until the cron was run successfully). This solution might seem
  2. Trust the user-agent and set a cache lifetime for the path to the image so it expires after the cron threshold is passed. My main concern was that Cache-Control max-age and Expires headers could not work when using XHR, but I could test it in various browsers, they are working in many user agents (FF, IE6, IE7, IE8) but also fails in popupar browsers like Opera and Safari... so we must decide if we want to rely on the client for this...it is not as safe as a server side control.

catch said in #1 :

The other reason is to have a failsafe in case cron stops running for any reason - typo in crontab -e, move server and forget to set up a new job, things like that which can bring any remotely busy site to a halt as the watchdog table grows to hundreds of thousands of entries and cache tables fill up with stale data - in this case, it's also fine if it takes an extra few minutes (or even hours) to run - since that's better than not at all.

This is not implemented yet... it will only run cron whenever the feature is enabled, not if cron did not run for a long time, as far as I know. But it could be implemented of course, but in another issue.

catch said in #14 :

Maybe, but I'd rather this stayed in poormanscron.module in D7 - otherwise every site, whether they use this feature or not, as to load the entirety of system.module for every cached page requests. Not as bad as a double bootstrap, but still a mess. Compared to some sites sometimes having a slightly delayed cron run.

This is about solution 1 and I do not know why we should load "system" module for every cached query according to you. Couldn't we load it only if the cron_threshold variable (using variable_get which is available at this time I think) ? In that case we load the system module, serve an uncached version of the page (skip cache, only for this request... or the following requests if this one could not run cron) which will contain the two triggers and try to run cron this way... I feel this is the safest way to handle it, but it reaches areas of Drupal I do not know enough to produce a decent patch.

alexanderpas’s picture

@TheRec

why we should load "system" module for every cached query

because system_boot() would reside in the system.module

TheRec’s picture

Right, I was thinking about doing this directly in drupal_bootstrap(), forgot we switched the talk to hook_boot() at that time... my bad.

Anyways, if we do that there we'd also need to load system.module, since we need the DRUPAL_CRON_DEFAULT_THRESHOLD constant to check the value of the theshold when it was not set manually, but we could rather move it so it is in the scope when we are at this early stage I guess...

David_Rothstein’s picture

@TheRec: As far as I can tell from reading the other issue, Damien's comment at http://drupal.org/node/331611#comment-1105066 was never discussed at all...

But in any case, I think we should explore these solutions because in theory they are much, much simpler - and if we need to solve the "busy icon" problem, let's at least try to solve it (as mentioned above, there are documented methods for doing that).

So, I did a bit more digging and came up with the attached patch. It is a very simple "demonstration-only" patch, but it works beautifully for me (no busy icon or any visible indication that cron is running). The basic technique is discussed in many places, but the original source for it actually seems to be here: http://us3.php.net/manual/en/function.register-shutdown-function.php#40815

I have only tested this on the combination of Firefox 3 and a vanilla installation of Apache - we'd probably need to think carefully about whether there any configurations where this could cause issues. Also, the patch currently only works with non-cached pages. In theory it's not hard to extend to cached pages also (but we'd need to pay special attention to the page compression, I think). Or, as per @catch's original point, we could just make it so that the automatic cron only runs on non-cached page views, and that would likely be fine too....

TheRec’s picture

You're right, it has not been addressed after... it has been addressed beforehand #331611-24: Add a poormanscron-like feature to core (and #25) ;) ... Damien Tournoud was just stating that for him it wasn't a big deal to have few requests with slowdowns, but there was enough concerns about slowdowns for Damien Tournoud to comeback with another solution which was in the end committed after few modifications.

Anyways, if the solution you're suggesting works, it would indeed be better, providing we find a way to solve the problem of cached pages (which is basically the same we are trying to solve here) and ensure that it works in every server environment supported. I've took a few minutes to check the php.net comments and I've read this one which ends with It does not work on any machine. (but without giving more details... and using another way to execute it), so we should really investigate/tests under which conditions this could work (count me out on this task, I don't have the time to play with every PHP5 version around).

Just for the sake of referencing it, the current solution is also inspired from a php.net comment, which is referring to the solution you're suggesting like this :

The workaround using output buffering and header('Connection: close'); seems to work, but the solution I prefer is to put the background processing in a separate script which gets included as a transparent gif.

I do not say the that new solution doesn't work, I just cannot find any thorough testing of it (various Apache and PHP versions) and the current solution only relies on the undeniable fact that a new HTTP request will not alter the original one.

RobLoach’s picture

Subscribing, this is pretty horrid.

chx’s picture

too much effort too little gain remove the image who has JS disabled anyways? A few , yeah, but less in less as we go ahead and we do not need to deal with singular users just a mass of users. Some of em will have JS.

Dave Reid’s picture

http://www.w3schools.com/browsers/browsers_stats.asp says 95% JS-enabled (in Jan 2008) but it's got to be closer to between 90/10 and 85/15, so that's a pretty good chance a JS-only poormanscron will hit.

EDIT: I am pro-JS-only poormanscron.

TheRec’s picture

It wasn't the main purpose of the issue, but yes the XHR call could be sufficient in itself... but seriously, what's how's an image in a <noscript> such a problem ? ... as Dave Reid just showed, only 5% will request it.

For me you can remove it, when I first added the XHR call I think it was discussed what was the use of the <img> now, I've answered this and I guess it wasn't such a big deal since it ended up being committed (no, that does not make it less horrid ;D).

TheRec’s picture

Dave Reid> I know I know you're "pro-JS-only poormanscron"... just makes me laugh when we use "global" usage statistics... this all depends of the audience of your website... offering this not-hurting alternative wasn't such a big deal to me.

Anyways, if it's such a problem the image can be themed through theme_system_run_cron_image()... just overide it and make it return nothing... the XHR call will still be added.

RobLoach’s picture

Status: Active » Needs review
FileSize
5.53 KB

This patch regulates the call to the cron check through JavaScript and returns a JSON object stating whether the cron was run successfully. It also removes the ugly noscript tag, and additional inline JavaScript that showed up on every page load. No more 1x1 pixel either ;-) .

Status: Needs review » Needs work

The last submitted patch failed testing.

TheRec’s picture

Why do you use REQUEST_TIME to compute cronCheck? Let's say I disable cache completely, cronCheck will get updated on every query (as REQUEST_TIME changes). You should use the cron_last variable as it is done on server-side before drupal_cron_run() is called. In fact you should also check wether cron has run once before (as cron_last can be NULL ... it's done on server-side, so why not do it on client-side now too ?).

Aside from that, the Javascript setting will also get cached. Let's take the example of a page getting in the cache just few seconds before the cron threshold will be passed, then cron runs as expected when the threshold is passed, but the old value of cronCheck is still in the cached page (cache is not cleared on every cron runs as far as I know)... and then you're back with two full bootstrap on every request of that page (now only if they have JS enabled...) until the cache is cleared or expires.

Also you ought to update tests... which wouldn't rely on getting the images of the page (which was a way to check if the trigger was present too), now that it would be only JS.

The purpose of the issue is to avoid two bootstrap on every page, removing a trigger because you don't like how it looks will not solve this problem because it isn't the source of the problem... in fact, the <noscript> will be used only when the XHR call could not be used, it's a graceful degradation of the functionality. If you want to remove this I don't see a problem, except that if the audience of a website is a majority of users without JS (for any reason : security, disabilites, etc.) this functionality will be of no use and you should warn the user that cron might not run in such environement.

The real source of the problem is that cached pages will prevent triggering the cron run in an efficient way... my suggestion still is that cached pages should not be returned when cron_threshold is passed. If this cannot be done because it goes too deep in Drupal's entrails, then we should investigate David_Rothstein solution wish could be achieved server-side only.

moshe weitzman’s picture

TheRec makes some good points.

The real source of the problem is that cached pages will prevent triggering the cron run in an efficient way... my suggestion still is that cached pages should not be returned when cron_threshold is passed. If this cannot be done because it goes too deep in Drupal's entrails, then we should investigate David_Rothstein solution wish could be achieved server-side only.

We could shut off delivery of cached pages after cron throeshhold, but we are not goign to do that. That would unload a huge load onto the server just so that one person can get cron run. The risks outweigh the benefits at this point. And please don't mention that folks can disable this. We are now discussing how default drupal works and I think thats vitally important. It is irrelevant that people can disable it. Many won't.

I actually think this poormanscron feature should be rolled back and re-committed if a performant solution is found. The current double bootstrap situation is awful, and has no easy solution AFAICT.

catch’s picture

I think we should just alow one person every three hours or whenever it is to have a longer request time, all the other solutions are worse than that original problem. As well as that, we should doubly make sure that the poormanscron only runs if cron hasn't run already by some other method before it's due. That way if you have hourly cron jobs on your server, and the threshold is set to 3 hours, you'll never actually have a poormanscron run unless something goes horribly wrong with your cron job.

Dave Reid’s picture

Assigned: Unassigned » Dave Reid
Status: Needs work » Needs review
FileSize
12.3 KB

Revised patch with tests.

1. Uses variable_get('cron_last', 0) + variable_get('cron_safe_threshold', DRUPAL_CRON_DEFAULT_THRESHOLD) to generate the cronCheck JS variable. That way each page gets the same timestamp of when the next autocron should run.
2. When the JS executes the cron AJAX request, system.module clears page caches in system_cron, so it should update with the new cron check variable.
3. Updated tests.

Dave Reid’s picture

Fixed unrelated changes in patch and documentation to match what was removed.

Dave Reid’s picture

FileSize
12.02 KB

Arg...hate when I'm working on too many patches.

TheRec’s picture

Status: Needs review » Needs work

Ok, for the moment I only have time for a code review... I'll try to test it tomorrow.

+++ modules/system/system.module	27 Sep 2009 21:58:36 -0000
@@ -2790,43 +2787,38 @@ function system_retrieve_file($url, $des
+ * (if the client's browser supports JavaScript)

Missing a dot at the end of the sentence.

+++ modules/system/system.module	27 Sep 2009 21:58:36 -0000
@@ -2790,43 +2787,38 @@ function system_retrieve_file($url, $des
  * @see system_page_alter()
@@ -2865,25 +2858,9 @@ function system_run_cron_image() {
  * @see system_page_alter()

Well, this could be updated too, we're using system_page_build() now.

Powered by dreditor

Also, I talked about it with Dave Reid on IRC, I suggest that, if we're about to rename the functions and the callback path, couldn't we rename them to something reflecting the fact that they are related to automatic cron run ? Something like "system/automatic-cron-run" for the path, and "system_automatic_cron_run" for the page callback and "system_automatic_cron_run_access" for the access callback. It's not that important, but it would improve readability of the code in my opinon... with the new names suggested by Dave Reid there is nothing that makes a relation with the "automatic cron" feature... before the "_image" part was somewhat giving an hint.

Finally, I feel that it would be a good thing to mention that functionality-wise we're regressing by not supporting users without Javascript and except the "look" of the code it will not improve anything to remove the <noscript>. If the majority agrees with this, no problem with me.

Dave Reid’s picture

Status: Needs work » Needs review
FileSize
12.05 KB

Fixed the two points in #39. Marking back as needs review to discuss the menu paths.

Dave Reid’s picture

Component: system.module » cron system

Moving to new cron system component.

gpk’s picture

@36, would having a cache lifetime > 0 interfere with your point 2. i.e.

2. When the JS executes the cron AJAX request, system.module clears page caches in system_cron, so it should update with the new cron check variable.

If this turns out to be a problem then it might still be possible to do something along the lines of #2, i.e. in the DRUPAL_BOOTSTRAP_PAGE_CACHE phase in http://api.drupal.org/api/function/_drupal_bootstrap/7. Moshe expresses the concern in #34 that

We could shut off delivery of cached pages after cron throeshhold, but we are not goign to do that. That would unload a huge load onto the server just so that one person can get cron run.

However we don't need to make it quite so drastic - after the threshold has passed, we could just turn the page cache off for as long as is needed for someone to trigger a cron run. For any site on which server load is a consideration this wouldn't be very long (a second or two at the most I'd have thought), and because it would be for a short period only I don't see that this would increase the server load anything like as much as a general wipe of the page cache, as performed by system_cron(). I imagine that on many sites only one page request would have the cache disabled in this way.

@39

it will not improve anything to remove the <noscript>

One possible tiny improvement: crawlers won't make requests for the cron img ;) .

TheRec’s picture

gpk> So whenever crawlers will consider Javascript you'll drop the XHR call too ? ;) I'm kidding... I just do not find it really a justification, we should have solved the issue upstream, so even if crawlers would visit the link a full bootsrap would only happen IF necessary (note, it would allow crawlers to trigger the cron... which could be helpful for sites with a small audience... ;D). But for the moment this is not the road we are taking... let's see where we end up with the JS only solution if the majority agrees.

Status: Needs review » Needs work

The last submitted patch failed testing.

TheRec’s picture

Status: Needs work » Needs review
FileSize
10.92 KB

Let's re-roll this one (while I was at it I saved the life of a poor kitty...) :)

I've taken the time to test it and all seems fine. With disabled cache for anonymous users, with Normal cache being enabled (for anonymous and authenticated users). The only issue I had was with Minimum Cache Lifetime, as the pages cached with this option do not get cleared when cron is run, but this was also the previous behavior and it has been accepted that usersed using Minimum Cache Lifetime are not supposed to use automatic cron (maybe it should be mentioned under the Minimum Cache Lifetime dropdown that it is not compatible with automatic cron).

Maybe we could test the cache issues with SimpleTest... When testing manually, I did my best to test them by tweaking the cron threshold to a lower value and see if Javascript was cached after cron run... it would be possible to request the page after cron has run from the XHR call, and see if the cronCheck value was updated... if it is not we'd have the unwanted behavior that will run cron on every request to this page until its cache entry is cleared and that would make the test fail, does that sound right to you ?

andypost’s picture

Glad to have commented line in default.settings.php

# $conf['cron_safe_threshold'] = 0; 

suppose it could be useful for advanced users to disable this feature!

+++ modules/system/system.module	30 Sep 2009 20:05:51 -0000
@@ -2789,44 +2786,39 @@
   // Cron threshold semaphore is used to avoid errors every time the image
   // callback is displayed when a previous cron is still running.

Comment still refer to image

Dave Reid’s picture

@andypost: It can be disabled easily through the 'Site information' settings page so we don't need to comment the variable in default.settings.php.

gpk’s picture

Presumably the method we are using here assumes that the browser's local time is reasonably in-sync with the server? Currently I am 10s adrift (ahead) of server time, so I'll be causing an additional full bootstrap on every page request for 10s before cron is due. On a busier site could this cause a crescendo of additional bootstraps (from approaching 50% of users) leading up to the time when cron will run?

Or maybe I've missed something!

alexanderpas’s picture

We're not solving the problem here, we're trying to find ways around it!

#557542: Cache module_implements() went in, and i'm starting to wonder if we can investigate the approach stated in #13-16 again!

(which would be the ultimate answer if it worked)

Dave Reid’s picture

@alexanderpas: Drupal 7 is already slower than Drupal 6 in many aspects. We can add caching patches until the cows come home (and it's good that we're getting them in) but it still doesn't change the fact that this triggers two server requests for every single request by the user.

catch’s picture

@alexandarpas - the other part of that issue, allowing for a separate $module.boot.inc file, hasn't gone in yet, and may well not for D7.

As before, I think we should just let one user every three hours have a longer request time, and accept that only auth and uncached requests will trigger a cron run - if that's not good enough for a site, they can set up a cron job on the server, or use one of the free cron running services.

alexanderpas’s picture

I agree with you catch, as all the other approaches seem to be evading the problem, this approach at least acknowledges the problem (and AFAIK any POST request is enough to let cron run.)

andypost’s picture

I think that POST request is only reason to trigger cron-run.
Except scheduled actions like mail-send or scheduled publishing, for this cases should be warning at the module settings which task for contrib.

David_Rothstein’s picture

Is there a particular reason we don't seem to be further pursuing the solution demoed in #24? It completely bypasses every single one of the problems brought up in the recent comments...

If we're going to reject it, let's do so for a specific, technical reason. So far, I'm not aware of any :)

moshe weitzman’s picture

FileSize
1.4 KB

@David - I'm so sorry that I ignored that demo in #24. I figured you were in for a world of hurt since I tried this technique without success not that long ago. But yours actually works! I confirmed Firefox3, Safari, Opera. I think we need to confirm IIS and IE and then we are good. This is really clean - I can imagine many sites simply using this technique and not bothering with crontab.

Attached is a reroll of David's patch. Please test with IE and IIS if possible. To test,

1. open your browser and your system log viewer.
2. Request any page from your site.
3. Confirm that the page finished loading quickly in the browser as usual.
4. Confirm that log messages are still being written. This proves that cron is running in the background.

moshe weitzman’s picture

I can confirm that David's technique works on IE7. Just need IIS now. Since we are explicitly sending the Close header, I do not expect any problem.

@David - what do you think is missing form this patch now?

Status: Needs review » Needs work

The last submitted patch failed testing.

David_Rothstein’s picture

Thanks for the testing, Moshe! If I recall correctly, the original comment on the page at php.net also reported having tested this technique on IE6 and that it worked there as well (I think IE6 was actually "new" when the comment was written). So it sounds like we're fine on the browser front.

The main thing that is still missing is integration with cached pages and page compression. We either need to make that work, or understand why/when it doesn't.... Drupal's code around cached pages seems pretty complex, so I don't immediately understand how to do it, but it seems like this might be relevant: http://us.php.net/manual/en/function.ob-get-length.php#59294

In terms of different web servers, I would agree that it seems pretty likely that this is OK. As long as we are sending the correct Content-Length header from PHP, one would assume that the server will handle it correctly. I wonder a bit about really cheap web hosts that might try to inject javascript into the page to display ads and such (e.g., I remember reading this comment from Gábor a while ago that suggests Drupal has had issues with that before: http://drupal.org/node/245990#comment-1200102). However, even then, I'd assume that if something gets injected onto the page, it's the webserver's job to make sure it adjusts the Content-Length accordingly?... but I really have no idea :)

The thing we definitely need to avoid (and which would be a showstopper for this approach if it ever were to occur) is a situation where the returned Content-Length was too small, because in that case, the browser would cut off displaying the bottom of the page.

gpk’s picture

>The main thing that is still missing is integration with cached pages and page compression. We either need to make that work, or understand why/when it doesn't.... Drupal's code around cached pages seems pretty complex

My understanding may be defective, but from what I can make out, http://api.drupal.org/api/function/drupal_serve_page_from_cache/7 (and also for that matter its predecessors, e.g. http://api.drupal.org/api/function/drupal_page_cache_header/6) does the compression negotiation itself and sets the Content-Encoding header where necessary. By the end of this function, $cache->data contains the content that will be returned to the browser, so strlen($cache->data) should be the correct Content-Length (in bytes, not characters) .. or is this too simplistic???

>In terms of different web servers, I would agree that it seems pretty likely that this is OK
Hmm, not sure I like the sound of this extract from http://uk.php.net/flush:

Several servers, especially on Win32, will still buffer the output from your script until it terminates before transmitting the results to the browser.

Hopefully I'm missing something here. Perhaps if you set the Content-Length then it gets round this.

David_Rothstein’s picture

FileSize
3.43 KB

Huh, that is a good point about $cache->data. The attached patch uses that, and makes it possible to try this out now in a few cases (both cached and uncached pages).

Hmm, not sure I like the sound of this extract from http://uk.php.net/flush...

Yeah, it's a bit difficult to tell what they mean by that. If it really waits until the end to send anything at all, then we could be in trouble.

I wonder if the reason that the technique uses both ob_end_flush() and flush() is to get around that? The idea is that if we can somehow flush the right amount of content to the browser, the browser will know to close the connection, regardless of whether the server thinks it should be continuing to buffer output or not. I guess the only way to know for sure would be to test this on Windows.

TheRec’s picture

I've tested this last patch with a server running Apache 2.2.8 on Windows, it seems to work with FF3.5, F3.0, IE6/IE7/IE8, Chrome 3, Safari 4 and Opera 10.

I'm fairly convinced that if we have to resort to stuff like this, it won't look "less hackish" ;) I doubt the users who posted these ways to "fix" flush() did it for the fun of it, this fuction must fail depending on the environment...
The flush() solution does not seem as "stable" as a XHR call as it is a simple HTTP request. So in my opinion we should just find a way to handle it for cached pages (which is the original purpose of the issue and what #45 tries to do).

@gpk in #48, "busier sites" should not use this feature so I don't see the problem with it.

andypost’s picture

As #56 we need test on IIS (windows)

Another issue with compressed pages in cache #43462: cache_set and cache_get base_url brokenosity

David_Rothstein’s picture

Hm, not so good: I just tried this patch with mod_deflate enabled, and it didn't work correctly. (It didn't fail spectacularly or anything, but it did put us right back where we were before: the page contents displayed but the browser still showed the "busy" icon until cron was finished running.)

That's a problem, because my understanding was that the Apache configuration shouldn't be able to break this... but it sounds like with mod_deflate enabled, it's either getting hung up in the flush() call or not setting the Content-Length properly. If we have to start coding for different server configurations, then this approach definitely becomes a lot less attractive :(

moshe weitzman’s picture

Well, I think we are then where catch said we should go - one request every x hours will be slow. I think it is OK, and we proceed with flush().

TheRec’s picture

Well what's the problem with the XHR call (#45)... I've answered gpk's concerns, this solution does not slow a particular request (contrary to what you think we must do now) and is not dependant on the environment (just to remind you that the first issue, to include poormanscron lasted about one year... to end up using what you refer to an "ugly" solution). If you're running a "busy sites", the XHR solution will only affect (with 2 bootstraps) at max 50% of the requests, since we all agreed this should not be used on "busy sites" I do not see the problem with it.

chx’s picture

Status: Needs work » Needs review
FileSize
2.12 KB
chx’s picture

FileSize
3.51 KB

Well, I needed a way to pass something to the cache and this is what I came up with. This allows you to skip the cache now and then. You know, this turned out to be a VERY nice little API trick for any advanced structure because you can peek at the cache expiry and begin reconstructing before your cache expires (and setting a lock so no other webserver does the same).

chx’s picture

FileSize
3.52 KB

Forgot to reroll sorry, the only change is that now I pass in $cache to the check callback -- that it can be used this way only dawned on me when writing the comment.

David Strauss’s picture

Status: Needs review » Needs work

I don't see how $cache->check could be ever defined when getting an item from cache_get(). It's pretty clear what member variables should be set on the object returned from cache_get(); "check" does not appear to be among them: http://api.drupal.org/api/drupal/includes--cache.inc/7/source

chx’s picture

Status: Needs work » Needs review
FileSize
1.73 KB

Simpler approach.

TheRec’s picture

But we fall in what's explained in #21 in the point 2. XHR does not always respect user-agent cache directives. So this would work with an image only trigger (as it doesn't use XHR and in that case cache directives are respected by most of the browsers) but not with a XHR only trigger (which is what was requested as for some it looks less "hacky"... it's not my opinion, I only report what's been said in here).

chx’s picture

Then we need to wait for someone to come who does not have this cached yet. The main point is in setting the cache expire time based on the Expires header.

catch’s picture

FileSize
106.73 KB

Devel is picking up the drupal_initialize_path() call from the cron image bootstrap when query logging, not sure how that happens yet, but here's proof:

pwolanin’s picture

Let's just remove this or go back to a very simple version as catch suggests in #51. This is a fallback, ideally it should never happen at all and it's totally unacceptable that it has a performance impact on a properly set-up site.

Dave Reid’s picture

I don't see why we can't just do a JS-only approach as well and get rid of the image. It's what we're doing in contrib Poormanscron 2.0 and it's been working great so far.

Dave Reid’s picture

FileSize
12.05 KB

Revamped patch from #45.

1. Adds a cronCheck JS variable to every page with the timestamp of when the next cron run should happen.
2. Uses a JavaScript only approach by putting its AJAX request in system.cron.js so we don't have to load system.js (which does not have any non-admin-related script) on every page.
3. The code in system.cron.js will check the current timestamp compared to the cronCheck setting and only if it's equal or past, it will make the cron AJAX request.
4. Removes the image/noscript running functionality as it will cause a bootstrap on every hit. And now that we have overlays, it performs another request on overlays as well, so that's two unnecessary bootstraps.
5. Takes chx's idea of adding a 'Expires' header to the AJAX request and serving the page from cache until the expires timestamp.
6. Revamps CronRunTestCase-> for more thorough tests.

Dave Reid’s picture

Ugh, wrong patch.

Status: Needs review » Needs work

The last submitted patch failed testing.

Dave Reid’s picture

Status: Needs work » Needs review
FileSize
14.26 KB

Revised patch that actually caches the system/run-cron-check page into the page cache.

moshe weitzman’s picture

Status: Needs review » Needs work

Looks like a solid solution to me. We need not care about a few clients who have badly screwed up clocks or have javascript disabled.

I assume our cron/queue system still protects us from a stampede of users trying to execute cron at same time.

chx’s picture

Status: Needs work » Needs review

Moshe surely crossposted.

Edit: Correct - I did.

Status: Needs review » Needs work

The last submitted patch failed testing.

RobLoach’s picture

For the jQuery.get, it might be a good idea to use:

jQuery.ajax({
  url: settings.basePath + 'system/run-cron-check',
  cache: false
});

jQuery.get is a wrapper around $.ajax, with a default cache parameter of true, if I recall correctly.

Another thing we might want to consider is using jQuery.queue so that the HTTP call is made after all other queued events are passed. I understand that the rest of the Drupal JavaScript system doesn't use jQuery.queue, but it might be worth considering. Minor nit-pickings though :-) .

... Something like? Do we really need behaviors for this? I don't see a reason why we'd want to run this multiple times on a page.

Drupal.behaviors.cronCheck = {
  attach: function(context, settings) {
    if (settings.cronCheck || false) {
      $('body').once('cron-check').queue('cron-check', function() {
        // Only execute the cron check if its the right time.
        if (Math.round(new Date().getTime() / 1000.0) > settings.cronCheck) {
          $.ajax({
            url: settings.basePath + 'system/run-cron-check',
            cache: false
          });
        }
      });
    }
  }
};
Dave Reid’s picture

Ok I could use some help debugging this. For some reason this has exposed exceptions in our content block fallback tests because the global $theme is not set and causes errors with system_region_list().

pwolanin’s picture

this looks a bit odd in the JS part of the patch:

if (settings.cronCheck || false) {
RobLoach’s picture

this looks a bit odd in the JS part of the patch:
if (settings.cronCheck || false) {

When writing the original JS patch, I was worried that the JavaScript would execute without settings.cronCheck being initialized. It's just my habit of writing strict code. We can remove that.

Anonymous’s picture

FileSize
13.01 KB

This is simply a reroll of #79. As such, I suspect it will die in the same was that #79 did. I'll try to see if I can't figure out how to fix it in the morning.

Anonymous’s picture

Status: Needs work » Needs review

Status: Needs review » Needs work

The last submitted patch failed testing.

Anonymous’s picture

Status: Needs work » Needs review
FileSize
13.45 KB

This #79 and tries to go a little further with it. It won't get everything, but hopefully this fill fix a few problems.

Dave Reid’s picture

Yeah I'm not sure what's going on here. Somehow with this patch causes the global $theme to be unset and causes the exceptions in the SystemMainContentFallback test. I have no idea why this freaking patch causes it.

Status: Needs review » Needs work

The last submitted patch failed testing.

Anonymous’s picture

Status: Needs work » Needs review
FileSize
14.17 KB

Well, I completely missed the fact that the above patch added a new file. Let's see if this fixes it.

Dave Reid’s picture

Woo-hoo!

+++ modules/system/system.admin.inc	(working copy)
@@ -1495,7 +1495,7 @@ function system_site_information_settings_submit($
-  if (($cron_threshold > 0 && $form_state['input']['cron_safe_threshold'] == 0) || ($cron_threshold == 0 && $form_state['input']['cron_safe_threshold'] > 0)) {
+  if ((bool) $cron_threshold != (bool) $form_state['input']['cron_safe_threshold']) {

Let's make one minor change here and just compare if the values are different (not cast to booleans), since that would cause a different cronCheck value on cached pages.

This review is powered by Dreditor.

Anonymous’s picture

Dave Reid’s picture

Status: Needs review » Reviewed & tested by the community
Issue tags: +Performance

Looks good to me.

TheRec’s picture

Did the problem I re-raised in #71 about the "Expires" header being ignored by browsers get an answer ? Sorry if it did, please point it to me... otherwise he's a summary of the problem :

Most browsers ignore Cache-Control max-age and Expires headers on XHR calls (on purpose ... most likely to get fresh content as in the context of AJAX applications it makes some sense... unfortunately, in our case it doesn't, but we won't fix browsers in here so we will have to deal with it). So knowing this, the solution you suggest will work only when the browser doesn't ignore the header we set... which will not be many times. This is not and surely cannot be tested in by DrupalWebTestCase as this behavior depends on the browser implementation of XHR... you are testing that we send the right headers, but unfortunately most browser will ignore them and still generate the useless bootstrap on every request (as far as I can see).

Using "cache" parameter if we use "jQuery.ajax" (instead of the current $.get) wouldn't solve it either in my opinion as the problem does not lie in jQuery (unless it implements a local cache, which I'm not aware of) but in the XMLHttpRequest component implemented by the browsers.

I know I am not bringing any solution to the table, only questions... but I am not sure this is the right way to fix it (sure it is better than the current solution, but still not fixing the issue completely), so I feel the need to ask this question.

About removing the image callback, I do not have any problem if it's the consenus (as it seems), it was only a fallback and if it creates more problems than it solves it's better to remove it. Good move ;)

RobLoach’s picture

From TheRec at #97:
Using "cache" parameter if we use "jQuery.ajax" (instead of the current $.get) wouldn't solve it either in my opinion as the problem does not lie in jQuery (unless it implements a local cache, which I'm not aware of) but in the XMLHttpRequest component implemented by the browsers.

Ah, thanks for the explanation. The good thing about this patch is that it provides the best of both worlds. It has the expires headers, as well as the JavaScript fallback on the client-side. RTBC++.

TheRec’s picture

Ok, I see that. But when a page will get in Drupal cache, with a certain cronCheck value (as the JS settings variables are set inline in the page) and this next cron timestamp is passed (which means the cron should run on the next request), the conter-measure you implemented with Expires header (sent by the server) to avoid running the cron each time will not work as XMLHttpRequest will just ignore it and issue the request to the server anyways... until this page gets out of the cache (and thus cronCheck gets "updated") every request to this page will generate two bootstrap as far as I can understand.

moshe weitzman’s picture

Status: Reviewed & tested by the community » Active

Most recent patch was committed. I'm leaving this open until we resolve the question of the double bootstrap on cached pages.

Dries' commit message is worth sharing here.

- Patch #566494 by Dave Reid, chx, JoshuaRogers, David_Rothstein, Rob Loach, TheRec, moshe weitzman: cron image does a full bootstrap on every page request so changing to a Javascript-based solution instead. Important performance fix -- what were we smoking? ;-)

Dries’s picture

Status: Active » Reviewed & tested by the community

This is an important performance fix. Committed to CVS HEAD.

Dries’s picture

Status: Reviewed & tested by the community » Active
moshe weitzman’s picture

Status: Active » Needs review
FileSize
517 bytes

Attached patch flushes page cache after the run-cron-check request is complete. I think that sufficiently minimizes the 'double bootstrap on cached pages' problem.

Dave Reid’s picture

I think that will only run when people do the manual cron run on the reports page, not the AJAX callback. We should do something more like this in system_cron() itself so that it happens whenever drupal_cron_run() is called.

David_Rothstein’s picture

Doesn't system_cron() already clear the page cache, in code that runs right above this?

moshe weitzman’s picture

@David - that code clears *expired* entries from the page cache. It is just pruning of the table, not a full clear.

I'd prefer not to dirty system_cron() with poormanscron specific logic.

David_Rothstein’s picture

Right, but we don't need to clear expired entries if we're going to clear the whole thing immediately afterwards... a small point, I guess.

Dave Reid’s picture

Ok then we can put it into the actual AJAX callback that runs cron.

moshe weitzman’s picture

Status: Needs review » Reviewed & tested by the community

That works for me. Hopefully someone will chime in if they see a problem with this approach.

catch’s picture

Status: Reviewed & tested by the community » Needs work

Both HEAD and the proposed patch are signlficantly worse than one, single, user triggering cron during the same request as viewing a page, which afaik is what poormanscron does in D6 and earlier.

We should just do that, or roll it back entirely, instead of degrading performance for everyone who gets the subsequent cache misses, and possibly overloading servers, just to avoid spinny wheels in firefox for one person.

Dave Reid’s picture

Yes, and poormanscron 6.x-1.0 had lots of bad issues related to performance as well. We're using a backport of this in 6.x-2.0 and have had no problems from any users so far, so I think it's a good thing.

Damien Tournoud’s picture

@catch: I remember my first patch for this with emotion (#331611-8: Add a poormanscron-like feature to core).

catch’s picture

@Dave Reid, there have been several fixes to cron recently, both the queue API and better distribution of max_execution_time, it'd help if you could explain what the performance issues are rather than just alluding to them, especially if there's anything Damien's original patch wouldn't cover. That always seemed fine to me and I never understood why it got taken in a far more complex direction.

On a site using varnish or boost, causing an extra bootstrap on pages which would otherwise not even be served from Apache, as I think the current patch code in HEAD would, is going to put a lot of additional load on the server (and while we can say that those sites shouldn't use this feature, it's the sort of setting which can be left on by accident until there's an emergency, and you can run boost on any old shared hosting).

For pretty much any site, unless they're very low traffic or have only a handful of pages rebuilding the page cache is a really, really expensive operation - I know of at least a couple of sites which nearly go down if there's a full page + block cache clear. There's really no way that giving one user a long request time to run cron is going to compete with either of those scenarios in terms of effect.

moshe weitzman’s picture

Yes, please share more, Dave.

I agree that this has become too complicated. I would prefer Damien's original patch as well.

Gábor Hojtsy’s picture

Status: Needs work » Reviewed & tested by the community

Looks like changing tracks now would be a bit late (although I totally agree that the current way of running poormanscron is way overcomplicated). So let's commit at least Dave's patch, which is in production on the 6.x-2.x branch in the contrib module and was agreed to by Moshe before.

We can debate on reworking it later then, but let's get this complete for the sake of Alpha 1, shall we?

Status: Reviewed & tested by the community » Needs review
Issue tags: -Performance

Re-test of 566494-cron-clear-page-cache-D7.patch from comment #104 was requested by aspilicious.

Issue tags: +Performance

Re-test of 566494-cron-clear-page-cache-D7.patch from comment #108 was requested by aspilicious.

catch’s picture

FileSize
9.05 KB

The contrib module is for people who absolutely can't set up cron themselves for some reason, the default profile is likely to be used by the vast majority of people using a Drupal site who'll have to go and switch this off to not get random cache clears all over the place.

I think we should just roll this back altogether, then put Damien's original back in later if that's agreed. Attached a patch for that.

Status: Needs review » Needs work

The last submitted patch, rollback.patch, failed testing.

Gábor Hojtsy’s picture

Status: Needs work » Needs review
FileSize
13.11 KB

@catch: you did not roll back related tests and related JS either. I've updated to include removal of the JS and the tests.

Status: Needs review » Needs work

The last submitted patch, rollback_7.patch, failed testing.

David_Rothstein’s picture

Status: Needs work » Needs review
FileSize
13.59 KB

The menu item needs to be removed too - hopefully that's what was causing most (all?) of the failing tests...

I am in favor of going back to Damien's original approach also. Remember, the only problem it had was that an occasional visitor (i.e., once every few hours) would potentially experience a slowdown. However, the kinds of sites that would be expected to use this feature aren't likely to have very expensive cron runs anyway.

Plus, we started some work above (see comments/patches at #24, #55, #60) that would potentially even fix that problem, and which seemed to work fine for at least the most common server configurations (although not all)...

catch’s picture

Status: Needs review » Reviewed & tested by the community

All green, RTBC.

webchick’s picture

Status: Reviewed & tested by the community » Needs work

Wait. What? No, we are not rolling this back. Let's go with Damien's approach.

Gábor Hojtsy’s picture

@webchick: Damien's approach is to roll this back *and* then add it in a different way. Do you prefer to have a bigger patch with both included?

catch’s picture

Yes, I wanted to do a rollback, then we could start again with a new patch to add it back in using Damien's original patch as a basis. I don't have time to re-roll Damien's patch and keep a big monster patch up to date, so if that's needed hopefully someone else will oblige.

webchick’s picture

I'm not keen to roll back this feature without an RTBC patch to replace it with. There's no telling when that would ever get fixed. If there is such a patch, though, just point me to it.

catch’s picture

No such patch, and it's likely to conflict with the rollback, so will have to be one big one.

Gábor Hojtsy’s picture

I'm on merging the patches.

Gábor Hojtsy’s picture

Ok, here is a patch merging the rollback and Damien's approach. I've abstracted the logic to decide when to run the automated cron to a system.module function. Since Damien's patch only runs on full bootstraps (not on cached pages), putting this in system.module was fine. We could just as well move it to common.inc or wherever else. I've "reworked" the tests (removed the not applicable items and left the basics in there), and cron tests seem to pass nicely for me.

I'm looking at another option, that is integrating David's patch with a rollback component included, so we can even have two options to talk about, and the effort put into these are not lost.

Gábor Hojtsy’s picture

Status: Needs work » Needs review
FileSize
10.59 KB

Let's not forget the patch :)

Gábor Hojtsy’s picture

FileSize
12.61 KB

Here is the version which merges the rollback with David's latest patch. Unfortunately I needed to uncomment one Content-length specification line, because I found that made the tests not pass (which is flattering).

So hopefully, we have two passing patches with slightly different options. Both roll back the current implementation and put in place a simpler solution which also avoids the performance problems we experience with the current code.

Let's figure out which way to go forward!

pwolanin’s picture

I like the simplicity of #131, but either of these look pretty good. I guess the advantage of #132 is that the cron will run even if only cached pages are served (from the Drupal cache). I can imagine some other ways to do that.

I'd support immediate commit for #131 since it would seem like the approach in #132 can be considered afterward.

David_Rothstein’s picture

I agree with focusing on #131 for an initial commit.

As per discussion above, the other advantage of the approach in #132 is that it avoids (at least for some standard webserver configurations) the situation where the user who happens to trigger the cron runs sees an abnormally long page load time - since it does the cron call only after the browser connection is closed. However, we haven't fully tested the effects of that, and given that this feature is intended to be primarily used by smaller sites (who are not likely to have extremely expensive cron runs to begin with), I think it's worth saving that for a potential followup rather than holding this issue up in order to perfect it.

catch’s picture

Status: Needs review » Reviewed & tested by the community

Same here, #131 fixes the critical bug, #132 would be a great followup patch. Marking #131 RTBC then, I didn't run all tests locally though or anything.

webchick’s picture

Status: Reviewed & tested by the community » Needs review

Hm. Can someone re-check those test hunks? It seems like we're asserting things happened when the things that caused those things to happen were removed.

Gábor Hojtsy’s picture

@webchick: the cron tests run fine for me with both patches. (I did not run the whole test suite due to time constraints.) Since the new simplified implementation does not emit anything in the page output that can be tested for if the cron run (given it silently runs past the page generation), the parts where testing of these page elements were done are obviously removed. Also, cron does not require an additional HTTP request anymore, so the relevant requests were also removed from the tests. This simplifies the tests, but I don't think they test it anyway worse that the automated cron run operates. It just has less indicators we can look for.

catch’s picture

Status: Needs review » Reviewed & tested by the community

Yeah it just looked like irrelevant stuff was stripped out to me, cron threshold etc. is the same either way. Back to RTBC.

Dave Reid’s picture

Assigned: Dave Reid » Unassigned
Status: Reviewed & tested by the community » Needs review

Has this been tested when cron runs on a cached page? Everything looks like it's going to be a problem since there is not a full bootstrap in that case. We at least need to change our documentation to warn anyone that implements hook_cron() with is a major DX :(.

David_Rothstein’s picture

Status: Needs review » Reviewed & tested by the community
FileSize
10.59 KB

#131 only runs cron when there is a full bootstrap, so it's not a problem there - although that's definitely a very good point if we try to do #132 as a followup!

I'm reuploading #131 here so there is no confusion over which one is marked RTBC :)

Dave Reid’s picture

@David Thanks for the clarification.

Dries’s picture

Status: Reviewed & tested by the community » Fixed

Much better - nice and simple. Committed to CVS HEAD.

Gábor Hojtsy’s picture

Greto, thanks!

Berdir’s picture

Status: Fixed » Needs work

I don't know yet what exactly is the reason, but I was able to confirm by using git bisect that this patch broke the update.module tests and maybe page cache too. It is probably also the reason for #706608: system_update_files_database() called from update_get_projects() breaks system table on cache clear.

Working on it...

Berdir’s picture

Ok, still no solution but some more things I've found out..

- Both the module tests and the page cache tests work without this patch. But it's not the same bug...

- The Page cache tests *only* work when "register_shutdown_function('system_run_automated_cron');" is commented out. It is *not* enough to comment out drupal_cron_run() in system_run_automated_cron(). So It has something to do with the fact that there is a shutdown function.

- However, the update tests work when you add a "return;" at the first line of update_cron() so that the actual cron implementation doesn't run. I'm not yet sure what's exactly going on but it seems that the {system} data is borked if update_cron is accessed at that point.

Enough for now, it's 3:38am, I'm going to bed now :)

webchick’s picture

Title: Cron image does a full bootstrap on every page request (including cached ones) » [HEAD BROKEN] Cron image does a full bootstrap on every page request (including cached ones)

Making this a bit more "jumpy" in the issue queue to maybe get some more eyes on it...

Gábor Hojtsy’s picture

I'm also looking at this.

Damien Tournoud’s picture

Status: Needs work » Needs review
FileSize
989 bytes

This fixes the page cache test.

Edit: and the Update tests (core + contrib) too.

chx’s picture

Status: Needs review » Reviewed & tested by the community

Nice job! I have seen a cache_clear_all yesterday without a backtrace but did not realize it was from a shutdown function, although I should have.

Dries’s picture

Status: Reviewed & tested by the community » Fixed

Committed to CVS HEAD.

webchick’s picture

Title: [HEAD BROKEN] Cron image does a full bootstrap on every page request (including cached ones) » Cron image does a full bootstrap on every page request (including cached ones)
pwolanin’s picture

Status: Fixed » Active

This seems to cause problems sporadically now - exceptions may be thrown while running shutdown functions that cannot be caught or logged.

Why not put it in drupal_page_footer() instead? At least someplace exceptions can be caught? see also http://drupal.org/node/673050

chx’s picture

Title: Cron image does a full bootstrap on every page request (including cached ones) » cron in shutdown is undebuggable
Status: Active » Reviewed & tested by the community

Roll back the original patch. Now. If you have a DBTNG or a Field API or anything else of a problem then because cron now runs from shutdown you get a Exception thrown without a stack frame in Unknown on line 0 and good luck finding what goes on. If we can't fix this (and I doubt we can) then rip poormanscron. It's broken. We tried. We failed. Drupal 8 (or never). Much more limited patches like #504012: Index content when created can help. There must be limits how low we bend just to please the poor schmuck who can't pay for even remotely decent host.

chx’s picture

FileSize
9.25 KB

I warned -- I know, too late -- in #673050: Database bugs became fatals that throwing exceptions will fire back. I was right. Now pick what goes: poormanscron or exceptions. Both can't stay.

chx’s picture

Status: Reviewed & tested by the community » Fixed

Berdir saves the day by catching these damned exceptions. Stay tuned.

Berdir’s picture

The solution is pretty easy after all.

All we need to do is wrap the whole shutdown function with the following code.

  try {
    // function content goes here.
  }
  catch (Exception $e) {
      _drupal_exception_handler($e);
  }

The cleanest solution to do that is a drupal_register_shutdown_function(), which registers a single shutdown function and calls all other functions inside the above code. I'm going to write a patch tomorrow, I was just trying to this out. Don't want to stay up until 4am like yesterday :)

jbrown’s picture

Berdir: This does not fix the problem, it only hides it.

The fact that it is necessary to use a try{} shows that there are bugs elsewhere.

Rule 1: If a function is capable of throwing an exception and that is not documented, that is a bug in the function.

Rule 2: If function1 calls function2 which is documented as throwing exceptions, then function1 must either catch the documented exceptions or document that it may throw them. If it does not, then that is a bug.

It is correct that there is an exception handler in Drupal, but if it is ever called then that means there is a bug somewhere further up the call stack. If rules 1 and 2 are obeyed then the exception handler will never be thrown.

I'm not familiar enough with Drupal testing, but there should be a test that the exception handler is never called.

jbrown’s picture

jbrown’s picture

catch’s picture

So we're requiring all hook_cron() implementations to handle exceptions now? That's a massive API change and wtf which will throw a lot of people off.

If Berdir's fix is considered a hack, then we should revert poormanscron, or exceptions. Fully agreed with chx, can't have it both ways.

Also every shared hosting I've ever had (quite a lot over the years) has had a cron runner interface, this is for people who don't want to configure it. Not being able to run cron at all is mainly for people trying to do things on their local boxes behind a firewall.

jbrown’s picture

No - its not just hook_cron() implementations - its all of Drupal and contrib modules. If you call a function that can throw an exception, you must catch it or document it.

This isn't an API change. This is how to program with exceptions.

Berdirs fix isn't a hack - it is the correct solution. But the fact that it is necessary shows that there are bugs in Drupal. The exception handler is for detecting buggy code. If it is being triggered then there are bugs.

jbrown’s picture

The only time you need to worry about exceptions is if you call a function that is documented to throw them.

pwolanin’s picture

Status: Fixed » Active

again seems like moving this into the bottom of drupal_page_footer() instead of a shutdown function solves the problem?

I don't want exceptions just to be caught I want them to be logged too!

jdleonard’s picture

subscribing

Frank Ralf’s picture

#706608: system_update_files_database() called from update_get_projects() breaks system table on cache clear had been marked as duplicate of this thread but I think both issues aren't related. Can someone more knowledgeable than I please have a look?

tia
Frank

EDIT
The automatic cron run in connection with clearing the cache seems to set the whole status column in the system table to 0. Please see the above mentioned thread for details.

chx’s picture

We can't remove exceptions from core this late (I wish we could). Could we better document them? Maybe. Until someone writes that documentation patch, however, I can't care less. This issue is about what do we do when a bug hits and right now the 'run screaming' is the best we can do. The issue title says so. jbrown, when you commented the exceptions issue you clearly did not read the original post and this time you missed reading the title. Please be more careful with your posts. I unpublished a few really off topic comments from here, we have 166, let's not litter this poor issue with more.

Berdir’s picture

@jbrown
I absolutely agree with you. (no idea what happened with your comments though :)) Yes, in a perfect world, every function should catch the exceptions that could be thrown or declare that they can be thrown. Languages like Java even require that but PHP does not and there is no way to to the latter except in api docs (there is no throws statement).

Reality is that every db_* function and every field_* function can throw exceptions. So pretty much *every* function can throw exceptions but almost none handle them. That's the same for Drupal 6, they just don't throw exceptions there but PHP warnings and return FALSE. but that is in most cases not checked. This is currently pretty much by design (and while being discussed) and we have to live with that for now.

My proposed change does not fix anything at all. But it makes it *possible* to fix bugs. All it does is change

Fatal error: Exception thrown without a stack frame in Unknown on line 0

(which is hell to debug)
to

FieldException thrown within the exception handler. Message: blabla on line 1090 in file .../includes/common.inc

and the same for register_shutdown_function() functions.

And while cron is the biggest shutdown function that does call hooks and what-not, there are many other shutdown functions too and we've seen the above fatal error all over the place.

Berdir’s picture

Ok, I opened #710142: Handle exceptions in shutdown functions. Can we continue there?

I don't want exceptions just to be caught I want them to be logged too!

That's easy possible, we can log them when we catch them. The current patch doesn't that yet, but my example code above would. It just needs some more code and the patch in the linked issue is just a proof of concept for now.

David Strauss’s picture

I'm not sure if this has been suggested yet (I'm at least 50 comments behind here), but couldn't we implement our own pseudo-shutdown system that runs at the end of index.php?

pwolanin’s picture

@David Strauss - yes - this is essentially what I was suggesting by putting this call at the bottom of drupal_page_footer()

Can someone explain why this needs to be in a shutdown function at all?

Gábor Hojtsy’s picture

The plan was to let Drupal do all processing (and tell the client it did), so the browser can close the connection and PHP can run still. A shutdown function seemed most logical for that. Is it ensured that no other output is presented after drupal_page_footer()?

pwolanin’s picture

@Gabor - in that function we collect all the content that might go into the page cache and save it - otherwise we flush the output buffer. So I think it's pretty certain that we expect no other content to go to the browser.

David_Rothstein’s picture

Title: cron in shutdown is undebuggable » Running cron in a shutdown function breaks the site
FileSize
799 bytes

This issue appears to be the cause of #706608: system_update_files_database() called from update_get_projects() breaks system table on cache clear - i.e., the first time cron is run via this shutdown function method, it totally breaks your site.

As per my comment on that issue (http://drupal.org/node/706608#comment-2603874), I think this is due to http://php.net/manual/en/function.register-shutdown-function.php which explains the effect of shutdown functions on the working directory of a script, in particular:

Note: Working directory of the script can change inside the shutdown function under some web servers, e.g. Apache.

I'm finding that this is the case. Any time system_rebuild_module_data() gets called in the shutdown function - which happens as a result of update_cron(), for example - the filesystem scan doesn't find any modules because it is looking in the wrong directory. The attached patch is a workaround for the problem, but obviously not a real solution.

It seems to fix this, we need to either fix Drupal to always use DRUPAL_ROOT correctly, or move this out of the shutdown function (as is already being discussed above).

jurgenhaas’s picture

subscribing

catch’s picture

Status: Active » Needs work

Regardless of ugliness, there's a patch here.

Frank Ralf’s picture

Issue tags: -Performance +alpha blocker

Changing the tag to "alpha blocker" as this is by no means only a performance issue.

catch’s picture

Status: Needs work » Needs review
FileSize
956 bytes

Here's the drupal_page_footer() approach, untested.

Berdir’s picture

Just FYI, we can probably move cron to that function but we can't call _drupal_shutdown_function() from there (see other issue), I've already tried that. Doesn't work since we need shutdown functions also during install and probably other places as I was unable to install and it broke randomly on a installed site too with class not found fatal errors.

Frank Ralf’s picture

@Berdir
#708832: Fatal error: Class 'DrupalQueue' not found might be related and/or a duplicate of this issue.

Damien Tournoud’s picture

The correct fix is to use DRUPAL_ROOT consistently. The introduction of that variable was done precisely to workaround this issue.

rfay’s picture

Subscribing. I've had a pile of complete destructions on the D7 db in recent weeks. Never can find a smoking gun.

Frank Ralf’s picture

Getting worse:

The most current HEAD version not only sets most of the status column of the system table to 0 when running cron as before but actually deletes all those entries and leaves only one "profiles/standard/standard.profile". Clearing the cache kills the installation as before.

tstoeckler’s picture

I can confirm that currently HEAD is not usable when running cron. Which is pretty bad, because we have poormanscron enabled by default in core. And also because, well, not running cron is kind of bad.

Berdir’s picture

A pretty simple workaround is to disable update.module for now, you don't that that anyway at this time :)

Frank Ralf’s picture

Thanks for the hint. Perhaps then the update module shouldn't be activated by default in the standard installation profile for the time being ;-)

webchick’s picture

Damien, could you provide more details on your comments at #180?

And/or could someone who's having this issue test the patch at #177 instead of repeating how urgent it is? :P

catch’s picture

@webchick: see http://drupal.org/node/566494#comment-2603896

I'm not sure we can rely on every module which does anything in cache clear using DRUPAL_ROOT consistently just in case this happens. This took so many hours of debugging to track down, across 3-4 separate issues, so it's not really in the "don't babysit broken code" category, it's in the "obscure PHP bugs that bite us in the arse" category. Especially when the only reason we're doing this is f&^"%£$£ poormanscron

Or at least, I think we could go with the current patch, then look at putting the shutdown back. Or like I suggested before, rollback auto-cron completely until we can do it without breaking everything.

Berdir’s picture

Since the other patch was commited, could we now do something like

chdir(DRUPAL_ROOT);

In _drupal_shutdown_function() before calling the actual shutdown functions?

David_Rothstein’s picture

FileSize
963 bytes

The patch in #177 no longer applied - here is a reroll that does. I've tested that it fixes the bug.

While it's probably true that fixing the DRUPAL_ROOT issues is a good idea in its own right, that doesn't sound like a critical bug, whereas this one is. What is the rationale for leaving the cron run in a shutdown function at all? As I understand it, shutdown functions are useful when you want to guarantee that some particular code runs on a particular page request. In this case, though, we don't really care which page request cron runs on -- if someone happens to hit the stop button while their page is still loading, then that page request won't trigger cron, but the next page request will... so who cares?

webchick’s picture

Status: Needs review » Fixed

Ok, committed #189. Are we back to double-page loads on every page request again, now? Or can this silly issue be marked fixed once and for all? ;)

catch’s picture

No this should be harmless unless you're the unlucky user who gets to run cron every three hours.

I think we need a new issue to fix DRUPAL_ROOT/shutdown craziness though.

And possibly yet another new issue to put this back in a shutdown with that ob_flush() trick if someone really wants to take that on.

Status: Fixed » Closed (fixed)
Issue tags: -webchick's D7 alpha hit list

Automatically closed -- issue fixed for 2 weeks with no activity.