Title says it all. I'd like to see this module in D7 core.

CommentFileSizeAuthor
#127 drupal-cron.127.patch12.88 KBsun
#125 drupal-cron.125.patch10.49 KBsun
#122 drupal-cron.122.patch10.21 KBsun
#121 drupal-cron.119.patch10.26 KBsun
#119 331611-drupal-poorsman-cron_22.patch9.01 KBjoshmiller
#117 331611-drupal-poorsman-cron_21.patch8.82 KBTheRec
#116 331611-drupal-poorsman-cron_20.patch8.1 KBTheRec
#110 331611-drupal-poorsman-cron_19.patch7.75 KBTheRec
#108 331611-drupal-poorsman-cron_18.patch7.97 KBjoshmiller
#107 331611-drupal-poorsman-cron_17.patch7.71 KBTheRec
#103 331611-drupal-poorsman-cron_16.patch6.43 KBTheRec
#100 331611-drupal-poorsman-cron_15.patch6.39 KBTheRec
#98 331611-drupal-poorsman-cron_14.patch6.52 KBTheRec
#97 331611-drupal-poorsman-cron_13.patch5.02 KBTheRec
#88 331611-drupal-poorsman-cron_12.patch5.01 KBTheRec
#78 331611-drupal-poorsman-cron_11.patch4.92 KBTheRec
#76 331611-drupal-poorsman-cron_10.patch4.92 KBTheRec
#74 331611-drupal-poorsman-cron_9.patch4.92 KBTheRec
#72 331611-drupal-poorsman-cron_8.patch4.5 KBTheRec
#62 331611-drupal-poorsman-cron_7.patch4.04 KBTheRec
#59 331611-drupal-poorsman-cron_6.patch3.83 KBTheRec
#58 331611-drupal-poorsman-cron_5.patch3.82 KBTheRec
#55 331611-drupal-poorsman-cron_4.patch4.02 KBTheRec
#48 331611-drupal-poorsman-cron_3.patch3.23 KBTheRec
#46 331611-drupal-poorsman-cron_2.patch3.18 KBTheRec
#45 331611-drupal-poorsman-cron.patch3.16 KBDamien Tournoud
#23 drupal.cron_.patch5.05 KBsun
#23 cron.png13.74 KBsun
#23 install.png11.16 KBsun
#23 maintenance.png13.87 KBsun
#20 drupal.cron_.patch5.26 KBsun
#14 drupal.cron_.patch5.52 KBsun
#12 drupal.cron_.patch5.79 KBsun
#11 331611_yeah.patch2.94 KBRobLoach
#10 drupal.cron_.patch2.75 KBsun
#8 331611-poormanscron-into-core.patch2.02 KBDamien Tournoud
Support from Acquia helps fund testing for Drupal Acquia logo

Comments

sun’s picture

Title: Move into core as Cron module » Move Poormanscron into core as Cron module
Project: Poormanscron » Drupal core
Version: 7.x-2.x-dev » 7.x-dev
Component: Code » other
lilou’s picture

Category: task » feature
RobLoach’s picture

Seeing how Poormanscron is the 11th most popular module and it helps make maintaining Drupal on shared servers easier, it makes sense to bring it into core. Personally, I'd rather not have it as an entirely new module since it's so small. I'd rather have it as a part of system module as a small checkbox/combobox in the Performance settings.... But, that's just me and my experience with Ping module talking....

sun’s picture

Title: Move Poormanscron into core as Cron module » Move Poormanscron into core

After further discussion in IRC, it would make sense to move it into system.module, along with configuration settings on admin/settings/site-maintenance

nevets’s picture

Only if it's off by default and the "checkbox" makes it clear what selecting it means.

webchick’s picture

I don't personally see the harm in enabling it by default, and letting advanced users who know how to setup cron jobs disable it and say "don't monkey with mah cron."

Searching for "type:forum cron" comes up with 297 x 10 posts. I think it's safe to say that cron befuddles users, and having Drupal take care of this itself would be an important usability improvement, and reduce the support burden on our community for all the various platforms, hosting companies, etc.

That said, I haven't looked at the code of poormanscron to know if it's remotely core-worthy. :P

webchick’s picture

Further, I see this as analogous to core shipping with .htaccess files.

In a high-performance, production site you would want to turn off Apache's scanning for .htaccess files and move this functionality into httpd.conf for both performance and security. However, when you're on a dippy shared hosting account, you don't really care, and furthermore probably don't even have the ability if you wanted to.

This is a "nicety" of having core ship with the stuff needed to make it work out of the box, but still leaving in the ability for advanced users to perform tweaks for their setup.

Damien Tournoud’s picture

Title: Move Poormanscron into core » Add a poormanscron-like feature to core
Status: Active » Needs review
FileSize
2.02 KB

Here is a starter patch. It doesn't need to be any more complicated than that in my opinion:

- we have a simple checkbox in admin/settings/site-maintenance
- cron is registered as a register_shutdown_function, *after* session has been saved, so no need to do funky stuff to avoid error messages to be shown to random users
- a variable cron_safe_threshold (defaulting to 3 hours) is available for contrib to build onto (so poormanscron will be able to add configuration settings)

catch’s picture

For advanced users who don't want cron switched on when they install, we can add a variable set to expert.profile to uncheck the checkbox. I wasn't keen on the idea of this, but the way it's being done, and the fact it's 5 lines of code with a variable check makes me reverse that position.

sun’s picture

FileSize
2.75 KB

- Reworded variables and configuration setting.
- Added catch's suggestion to expert profile.

The only thing that bugs me with this patch is that cron is not invoked for anonymous users if caching is enabled (which is most probably a common case for a majority of sites).

RobLoach’s picture

FileSize
2.94 KB

Making it a select box doesn't cruft up the interface that much. Also allows us to get rid of the cron_safe_run variable in place of the actual threshold.

http://imagebin.org/30865

sun’s picture

FileSize
5.79 KB

- Added advanced configuration instructions.
- Added fieldsets on the maintenance page.

Does not yet include RobLoach's simplification in #11.

Damien Tournoud’s picture

Status: Needs review » Needs work

Oh my, no!!!!

What happened to the "one checkbox only" concept. I agree with Rob Loach that we can have a dropdown select. But the configuration instructions clearly don't have anything to do here.

sun’s picture

Status: Needs work » Needs review
FileSize
5.52 KB

Now simplified.

The remaining question is whether we want to support Windows in the advanced cron setup help text.
And of course, the anonymous users issue mentioned in #10.

catch’s picture

Status: Needs review » Needs work

The configuration instructions should be in hook_help, would just a link to http://drupal.org/cron do?

Dave Reid’s picture

Why not just an additional sentence to the #description of the option: "To run cron from outside the site, see the online handbook entry for configuring cron jobs."?

sun’s picture

well, the idea was to provide the cron job command that works for the current site - instead of

cd (your-document-root-here); ./sites/drupal.sh http://(your-conf-site-name-here)/cron.php?cron_key=(your cron key here)"

and making this also accessible to (advanced) site admins who disabled Help module. Just remembering this command is hard, inserting the proper values for the Drupal site in question makes it even harder.

Bojhan’s picture

Status: Needs review » Needs work

It sounds like a good idea, but technical implications?

Can you show me a screenshot of this on the admin page?

swentel’s picture

Mmm, I'm with nevets on this, please don't enable this by default. Why not make this also available on the install page ? I mean, when I develop, I always turn of the update module, I *really* don't need this when developing a site and I *really* don't want to remember each time on a fresh install to turn of this cron or use the expert profile, because I'm going to forget it a lot. My two cents.

sun’s picture

Status: Needs work » Needs review
FileSize
5.26 KB

Removed that advanced help text (separate issue, if someone takes it up; I'll certainly +1 it).

Now also runs for anonymous users. If I'm not mistaken, this would not be possible without our new shiny registry. :)

Damien Tournoud’s picture

Status: Needs review » Needs work

Cron implementations need a fully bootstrapped drupal. So registering the function in LATE_PAGE_CACHE stage will probably break those.

Damien Tournoud’s picture

Oh, and *big* -1 for the fieldset. That's just ugly.

sun’s picture

Status: Needs work » Needs review
FileSize
13.87 KB
11.16 KB
13.74 KB
5.05 KB

Now performing full bootstrap for anonymous users if we need to run cron.

I'm unsure whether we want to duplicate that code in bootstrap.inc and common.inc - or whether we rather want to do variable_set('cron_run_needed', TRUE) in bootstrap.inc along with a variable_get(...) + variable_del() in common.inc.

Also removed the fieldsets.

Screenshots of install.php and the settings page attached.

Note that this will be my last patch for this issue.

EDIT: cron.png shows how the advanced cron configuration hint looked before in #14. maintenance.png is how it looks with this patch. Sorry.

chx’s picture

No. Big, flaming no. cronjobs are slow and you will see "mysterious" big slowdowns in your site ... occassionally. Just because poormanscron is popular it does not mean it's a good idea. Are we about to add FCKeditor and CAPTCHA and then horrible flawed idea of Content Templates (Contemplate) just because they are popular? Do not be ridciulous.

catch’s picture

Status: Needs work » Needs review

chx: while I also dislike poormanscron (sorry Rob), if this is in a register shutdown function, then don't we escape the random slow requests issue?

When users don't set up cron themselves (and I'll admit to forgetting to do so for a couple of days when I've migrated my own sites between servers too, I'm sure I'm not the only one), they'll get much worse effects like search index not updating, watchdog and sessions tables never, ever truncated, update status never updated - this at least provides a failsafe for this garbage collection which can otherwise bring down sites and lead to insecurity if it's never done.

The difference with wysiwig, captcha and content templates is they all require a lot of code which needs to be maintained, whereas this is a 5k patch, half of which is forms and code comments. It also removes the only warning we present to users when they install, making a very basic first/test Drupal install more self-contained.

sun’s picture

Obviously, I agree with catch and webchick. However, please note that fresh installed Drupal (7?) sites won't see this cron warning anymore, since we added drupal_cron_run() to install.php to ensure that Update module fetches release information directly after installation. So the cron warning upon installation issue is not an argument for this patch anymore.

catch’s picture

@sun, that's true, was my patch too ;) - but they'll still get prompted after a couple of days. Now that's a good thing, but if we're going to have sites set up without even a mysql database then it reduces yet another basic requirement to get a working install running. btw, I agree with having it along with update status in the install profile options.

smk-ka’s picture

Status: Needs review » Needs work

@catch
No, the use of register_shutdown_function() doesn't close the connection, ie. the page will be "loading" until cron has finished its work. The PHP manual contains an alternative solution using a 1-pixel image to avoid this kind of issue, but the proposed solution won't work.

RobLoach’s picture

What if we had a singleton's deconstructor check if the cron should run?

Damien Tournoud’s picture

Come on guys, we are talking about *one request* every *three hours*. There is no point in building a complex solution for such a small impact. What we can do is flush() the buffer to the user, so that the page will get displayed in the browser as soon as possible, even if the request is not fully closed yet. It should remain unnoticed by most users.

@Robloach: that would not work: we need objects to be available inside cron runs (ie. the database layer, etc.).

kika’s picture

I'm a bit puzzled on cron configuration settings: If I do not know what cron is and/or I am not able to set up a "proper" cron, how am I able decide on what frequency cron should run? Would it make sense to have just a checkbox "[ ] run cron in every 3 hours" (if this happens to be the sweet spot default) and let the fine-tuning be done using proper setup?

catch’s picture

That was in the original patch, and it makes sense that people who can't configure cron probably don't know how often to run it either - so would be good to go back to the checkbox (but leave it variable).

gpk’s picture

Dave Reid’s picture

Issue tags: +Usability, +cron
karschsp’s picture

subscribe

tgeller’s picture

Add my voice +1 to having cron turned on by default. The arguments against it are for the benefit of experienced developers and admins, who (a) are a tiny minority of the Drupal-using population, and (b) are more likely than the average bear to be able to find the checkbox and turn it off.

For the vast majority of Drupal users -- and more importantly, potential Drupal users -- the lack of ready-and-running cron out of the box has enormous effects, as have been covered by other commenters. Turning it on by default will reduce one more barrier to Drupal's growth.

anotherdave’s picture

+1 for leaving Cron on by Default. Possibly having the checkbox as:

  • [x] Run Cron every 3 hours (Recommended)

I'd say it's extermely helpful to people starting out, especially when you're talking about making D7 more user-friendly. (Spoken as someone whose knowledge of Cron jobs started with finding out that they hadn't run on my new Drupal site & working backwards from there…) & you also have the shared-hosting packages that let you nowhere near configuring Cron on the server.

arhak’s picture

+1 for an out of the box cron (as default, which advanced user might configure)

nevertheless be aware, that it might lead to even more issues due to "blank pages" (AKA "the white screen of death") as soon as the user enable a module with a hell of cron hook (as it seems to be happening a lot with search_cron getting timeout)

I suggest to test this feature with a module having an infinite loop into it's cron hook (also keeping an eye on #154043: Cron should not remain monolithic (Implement a scheduling hook for cron))

EvanDonovan’s picture

Out of the box cron might be good if there were something like a "shared host" install profile. However, I've had enough problems with interminable crons to sympathize with chx's response in #24. If this really will be "invisible to most users", as Damien Tournoud says in #30, then it would be a good feature to have, but we won't really know whether it would be invisible until we see use cases with people who have dozens of modules enabled that implement hook_cron().

moshe weitzman’s picture

I agree with chx - this does not belong in core. It is a bad practice (random slowdowns) that is easily remedied. When we add this to core, we endorse this approach. Instead, lets add a link to http://www.drupalcron.org/ or to poormanscron right from our dics and UI ... FYI, I am the author of poormanscron and I detest it.

EvanDonovan’s picture

I never heard of drupalcron.org before - that sounds like a much better idea! Would that site work with the cron API key in 7.x?

Before that site gets mentioned in the docs though, its homepage should probably be proofread :)

UPDATE: I read the admintools project page (http://drupal.org/project/admintools). Looks like it does support the cron_key requirement.

Damien Tournoud’s picture

Issue tags: +d7uxsprint

Marking as one issue to tackle for the d7uxsprint.

webchick’s picture

Given that even the author of poormanscron thinks this is a horrible idea, and given that cron() is the place where we perform expensive operations for a reason, I'm inclined to won't fix this.

The *primary* justification for putting this in core is because search module appears obviously broken unless cron is running, but catch has an interesting patch at #504012: Index content when created which will help address that.

There's also the point of watchdog/sessions blowing up to astronomical proportions if cron isn't running, but the status log does warn you after a week or something if cron hasn't executed, which should help with that.

Also, -1 to recommending a resource in core that's not *.drupal.org.

sun’s picture

Status: Needs work » Closed (won't fix)

You forgot to change the status.

Damien Tournoud’s picture

Status: Closed (won't fix) » Needs review
FileSize
3.16 KB

Reopening, on a different technical ground: we now output a transparent 1x1 GIF image to the browser, inside the page closure.

TheRec’s picture

Updated Damien Tournoud's patch after reviewing it on IRC with him and sun.

  • Fixed a bug in IE6 with empty GIF when token was not valid or with anonymous users (see next point).
  • Skip the cron token check for anonymous users.

sun also brought to attention that doing this will reveal the cron token and thus void its use... might this be a necessary evil.

So those facts make me think this feature should most likely not be enabled by default and should be offered and explained in the "Status report", when cron never ran or did not run recently.

Damien Tournoud also mentioned that the cache (browser and/or Drupal?) should be "skipped". I did not experience any trouble related to cache when testing the patch.

Status: Needs review » Needs work

The last submitted patch failed testing.

TheRec’s picture

Status: Needs work » Needs review
FileSize
3.23 KB

Oops.. seems like my CVS client did not want to use the path... this one should be better, at least I can apply it locally.

chx’s picture

Status: Needs review » Needs work

Execute the cron if the token is valid or if user is anonymous

we just removed the ability for random users to run cron. dont add it back.

TheRec’s picture

OK, then if this cannot be considered as a necessary evil (as mentioned in #46), I do not see a solution to this, because running the cron only when logged users visit the site does not make more sense.
We are in a specific situation where compromises has to be made, because the cron cannot be run externally (by a crontab for example). When the new setting brought by this patch is set to "Never", the cron will work as expected, only run with a valid token... so yeah maybe this should be clear for the user that anonymous users will be able to run the cron if they use this feature... or maybe this feature is just incompatible with current philosophy about cron ;)

chx’s picture

you want an access callback which is not TRUE but checks for cron_safe_threshold being non zero. Currently regardless whether it's enabled or not, cron can be fired by anonymous.

gpk’s picture

function system_footer() {
  if ($cron_treshold = variable_get('cron_safe_threshold', 10800)) {
    $cron_last = variable_get('cron_last', NULL);
    if (!isset($cron_last) || (REQUEST_TIME - $cron_last > $cron_treshold)) {

2 occurrences of $cron_treshold should probably be $cron_threshold.

TheRec’s picture

chx: Oh you meant the access to the menu item, then allright ;)
gpk: Indeed, and since we're talking about names "system_execute_cron" is really close to system_run_cron. I think that something like "system_image_run_cron" would make more sense.
Also, if this feature has to be enabled by default, as the consensus seems to be, it should be set at install... this was the case in some previous patches, is it still wanted ?

Damien Tournoud’s picture

The access control check must be inside the callback itself, not in the menu definition: we want to output a GIF image no matter what.

TheRec’s picture

Well I do not really agree about outputting the GIF no matter what, we should output is when the feature is enabled only.. and we should prevent direct access to the image as chx mentioned.
Here's a patch doing this... I added also a constant "DRUPAL_DEFAULT_CRON_THRESHOLD", which will make it easier to change the value if needed. And I corrected the variable name as suggested by gpk.

I'd like to have your opinions about how this feature should be treated at install, right now it will be enabled by default and user will not be aware of it. As I said previously, in former patches, during the install, there was a checkbox under the checkbox to enable/disable Updates check... this is good but there were not that many explanation of what cron is used for (maybe we could add a link to documentation) and the options available for the user (run cron himself, change the interval, etc.). I do not think that enabling this feature without warning the user of its effect is not really a good idea... What do you think about this ?

TheRec’s picture

Status: Needs work » Needs review
Damien Tournoud’s picture

I believe we need to remove the token, and basically allow running the cron for anyone if the threshold is crossed.

TheRec’s picture

It is right... since we provide a way for anonymous users to run the cron it does not make sense to check the token for identified users. So we do no need to pass the token in the image URL and just return an image that will trigger drupal_cron_run when the threshold is crossed, but only when this feature is enabled. So I made a more generic function to check if the cron has to run automatically or not and I use it to know if we should display the image and also as menu item access callback.

TheRec’s picture

Wrapping the <img> in a <div> because it would be a direct children of <body> which is not allowed in most DOCTYPE's and thus would produce a XHTML syntax error.

cweagans’s picture

What if we made the cron key an -optional- security measure. This way, the poormanscron feature can be activated only if the cron key is disabled. It's a little outside the scope of this patch, I think, but it's a thought.

Damien Tournoud’s picture

Regarding #59: we want to always display the image (if the threshold > 0), but only call drupal_run_cron() if the cron needs to run (and it is not already running). This is because the image tag could be inserted in a cached page, so we always need to return at least a correct image.

Also, we need to call drupal_page_is_cacheable(TRUE) in system_run_cron_image() to prevent Drupal from caching this page (which would defeat its purpose).

TheRec’s picture

OK, after discussing about the cache issue with Damien Tournoud on IRC, I'm left with few questions. In my understanding, there are few issues that the solution Damien suggests will not resolve, pages which are cached without the image will not run the cron until they expire from the cache. Also, when a page will have cached the image tag (when the feature is enabled) and if the feature is then disabled, this page will produce an access denied error... there are at least two solutions I see to solve this, either strongly suggest to the users who enable/disable this feature to clear the cache after they changed the state or always output this image (even when the feature is disabled)...
I guess the latter will be rejected because the users who prefer to use cron.php do not want an image inserted in their layout by a feature they will have disabled... so we are left with the former (recommendation to the user to clear the cache). Here is an amended patch, I updated the description of the setting accordingly.
I also modified the drupal_set_header since the API changed between D6 and D7 ;)

if the cron needs to run (and it is not already running)

I did not implement what's suggested in the parenthesis, because drupal_cron_run does this check (semaphore) and if the cron is stuck for some reason I really think we should leave it report the error and not prevent the call to this function... as it would do with cron.php.

kika’s picture

Can somebody give a technical explanation why 1x1 GIF method is preferred over other possible implementations?

hass’s picture

Subscribe.

If their wouldn't be an 1x1 image the page load would become blocked for 120 seconds or more if you have job_queue and/or linkchecker and xmlsitemap and other modules are installed.

hass’s picture

UX wise - why are the caches not automatically cleared on form submit? The strong recommendation could be removed...

kika’s picture

So it is essentially a hack to create sort of a "background process" in the browser?

hass’s picture

Yes :-)

TheRec’s picture

kika: Our main concern is that the user who happens to be the one to run the cron should not have longer wait time to access the content and as stated before a shutdown function will not be fired after the content is flushed from the buffer, so to avoid this we put this image which will be a totally different HTTP request that will be issued by the client's browser when it will try to load the image, which happens AFTER the content of the original request was sent to it.

hass: Sure it is technically possible, but I'd hate to clear the cache on a heavy loaded site because an user missed the line that would warn the user that enabling/disabling this feature would clear the cache... Imagine you're running a site with thousands of pages, you're already using aggressive cache and a long expiration time because you have limited ressources, if Drupal resets your cache, you'd sure get a little bit mad... Anyways if there is a consensus that this should be done automatically then, yes it could be done.

hass’s picture

We always clear the caches if required for functionality in core... if you run update.php, theme page, modules page and others... I'm pretty sure that nobody who runs a heavy loaded and well administered site use poormans cron - *really* not :-)

sun’s picture

I am not sure whether a cron-executing-GIF image will prevent the DOM from getting into .ready() state and thereby prevent/defer JS behaviors from running. That, we should test. And that said, we might alternatively consider whether this could work as an asynchronous JavaScript request, completely in the background.

axyjo’s picture

TheRec’s picture

hass: All right, thanks for your input on caching, I am not really comfortable with Cache API (I'll have to read the documentation when I find time).

So we call clear_cache_all() upon enabling/disabling this feature. Cache is not cleared when the feature is already enabled and the user only increases or decreases (not disable) the threshold value. This leave one problem, when there is a minimum lifetime specified in the Performance settings, then clear_cache_all() will not make the cached pages expire (according to my test... as I said I do not master Cache API)... so if anyone has a solution to this, it is welcome ;)

sun: Well I tried to run some Javascript on DOM.ready(), in "scope" header and "footer" (type "inline"), the events were fired when normally (when expected) when the cron was running (I implemented a hook_cron with a sleep() call to make the process wait few seconds) and also when it was only an image (threshold not passed but the feature is enabled). This is because DOM.ready() (on modern browsers at least) is fired when the complete DOM has be downloaded and is ready to be manipulated, not when all the supplemental HTTP requests are done (in our case images).
But there is an underlying problem you will encounter, if a script uses "onLoad" event (on window or body), for example to manipulate the images on the page once they are ALL loaded (that's how onLoad works), then it will be fired only when the cron has finished to run because the GIF generated will be done downloading only when cron has finished running. I tested this too and it is a problem we won't be able to correct in my opinion.

arhak’s picture

what would be the problem for an asynchronous requests?

TheRec’s picture

arhak: We discussed it few days ago with chx on IRC and neither of us reported this here unfortunately. But it is a good idea, thank you.

After this discussion, we concluded that the solution would be to mix those two solutions (image callback and XHR). I did not want to implement only a XHR solution because Javascript will be disabled in many cases and it would be a pity to delay all the cron run because of this, especially if we have another solution (which also has its drawbacks...). So here's a simple implementation. The Only local images are allowed. solution also suffers a problem that was not mentioned before, it will not be used by users who browse the site with images disabled unfortunately, and hopefully this won't prevent us to integrate this feature.

hass’s picture

Have you tested that scope => header works?

I remember from D6 that I wondered why the hook_footer is no more able to add JS to the header and I was told it's to late in code processing... afterwards I changed Google Analytics to use hook_init for the scope header and hook_footer only for code that goes into footer.

+    drupal_add_js('(function($){ $.get("'.url('system/run-cron-image').'"); })(jQuery);', array('type' => 'inline', 'scope' => 'header'));

EDIT: I think I found it: #212560: drupal_add_js in hook_footer does not add JS files

TheRec’s picture

I tested it and it works as adverstised, the script is added in <head>. At first I wanted to put in in the "footer" scope, but since it's executed on DOM.ready() it sits better in header, and should not disrupt the loading of the page. I did not test with other DOM.ready() events, but that's why we're supposed to have more than one people testing this :P

Maybe the "bug" you mention was solved with the rewrite of drupal_add_js for D7 (and the new $options parameter) ?

I forgot to remove a value I put for the tests in the dropdown list... so I post a corrected version. Nothing else changed.

joshmiller’s picture

Status: Needs review » Needs work

This is a code syntax review for the patch at #76. Can I just say that the idea of running an asynchronis image cron by default is awesome and exactly that much cooler than the current default: nothing.

  1. +  // Clear cache when enabling or disabling the cron threshold
    

    Missing period at the end of the comment.

  2. +  if(($cron_threshold > 0 && $form_state['input']['cron_safe_threshold'] == 0) || ($cron_threshold == 0 && $form_state['input']['cron_safe_threshold'] > 0)) {
    

    Control statements should have one space between the control keyword and opening parenthesis, to distinguish them from function calls. http://drupal.org/coding-standards#controlstruct

  3.  /**
    + * Default interval between automatically ran cron task in seconds.
    + */
    +define('DRUPAL_DEFAULT_CRON_THRESHOLD', 10800);
    

    Comment is confusing. Perhaps, "Number of seconds between running automatic cron tasks."

  4. +/**
    + * Implement hook_footer().
    

    Supposed to be "Implements"

  5. +/**
    + * Menu page callback; executes the cron via an image callback.
    + */
    

    This sentence doesn't make sense. Splitting it up into two sentences helps: "Menu page callback." And on a new line, "Executes the cron via an image callback."

That's all I could find after reading through the entire patch and checking spelling and code syntax.

Josh

TheRec’s picture

Status: Needs work » Needs review
FileSize
4.92 KB

Thank you for the coding style review.

Supposed to be "Implements"

I've discussed with catch about it on IRC and he suggested to leave it as "Implement" and I agree since there is no occurrence of "Implements " anywhere in HEAD yet (when the patch that is supposed to change this will be committed, I will correct it of course... unless this one gets committed before... but seeing the number of testers of the functionality of this feature, we're not there yet ;)).

"Menu page callback;" becomes "Menu callback;" as it is done in most of the HEAD... I know they are not complete phrases but I'm trying to follow the current coding styles across core modules. If they get corrected it will be way easier to do it all at once than trying to find out what changed according to new (or WIP) "standards" and what did not.

"Default interval between automatically ran cron task in seconds." becomes "Default interval between automatic cron executions in seconds." (which includes the "default" notion that is missing in the suggested change.

I added few more dots at the end of inline comments that were missing.

arhak’s picture

@TheRec #74:
so, I browse the web with a console browser (say lynx or links)
would you expect Drupal to support my console browsers as well?

there have to be a point where you say "enough"

which kind of site would be that one just getting visited by browsers with disabled images and no javascript?
that would be a reasonable give up, I think

downgrade gracefully can't always preserve full functionality, it's also valid to silently lost features (in runtime, which might be documented in settings page descriptions, in help pages, in documentation, etc)

TheRec’s picture

arhak: So according to you, we should drop the image callback and only use XHR ? I don't see how this helps, when JS is available it will be used, and otherwise the image callback will be used if images are enabled... and yes, you are right, we can afford to wait until someone with either of those enabled visits... but what's the problem with having a fallback to Javascript ?

By the way, did anybody actually test the patch ? Seemed to me that many people were interested in this ;)
Testing should not be too difficult, just add a relatively short threshold time in the list (like "10" in this example) :
'#options' => array(0 => t('Never')) + drupal_map_assoc(array(10,3600, 10800, 21600, 43200, 86400, 604800), 'format_interval'),
Then set it in the administration (admin/settings/site-information).
And then refresh your page according to this threshold, you should notice in the reports (admin/reports/dblog) when cron has run (threshold passed) and when it has not (if you refresh before threshold was passed). You should also test if this works when Javascript OR Images are disabled (If both are disabled it will not work... well you can test that too if you want ;)).
Last test, would be playing with the cache options and see if the image/Javascript is added to cached pages after you enable this feature, and see if it is removed from cached pages when you disable it. If it does not remove it after you disabled the feature, you will get an "Access denied" error in the log... this happened to me only when I tried using "Minimum cache lifetime", because with this option, the way used to clear the cache when enabling/disabling cron threshold does clear those pages from the cache (that must be an intended behavior of Cache API). Anyways, I guess that people with the need of a "Minimum cache lifetime" should not be using this feature so we should not worry about this case in my opinon.

joshmiller’s picture

Status: Needs review » Needs work

ok... installed D7 HEAD, applied the patch, modified the system admin page so that I could set the refresh for every 10 seconds. Below are my results.

  • I like where the form is, very user-friendly
  • I went about trying to test with and without javascript and images.
    • Couldn't get but one logged cron refresh for either admin or anon
    • Since I couldn't get cron to automatically refresh, there was no way to test the patch functionality
    • This could be related to how the patch determines to load the image
  • the patch applied, but was offset on every hunk

Hope this feedback is helpful. Anyone else able to get this to work?

TheRec’s picture

It seems that you have to clear the cache through "admin/settings/performance" before it works now... before it was not behaving like this as far as I remember... any idea why ?

joshmiller’s picture

I can now confirm that clearing your cache makes the magical javascript / image appear in your footer. I have now tested with javascript disabled, and images disabled. The cache will run with either disabled, but not both. I believe this is by design.

After extensive refreshing, I couldn't get the patch to create one error... though running cron every 10 seconds is little much.

I can also confirm that cron doesn't get run except on the schedule you set. Thus, we avoid some of the problems of opening up a non-authenticated way of running cache. It will only poorman-run cache if you need it and the other cron would run EVERY time it was invoked.

I'm leaving it marked "needs work" -- but really, I have also reviewed the coding standards and all looks well. Just a few more to test it out and we could mark it RTBC.

Josh

TheRec’s picture

Status: Needs work » Needs review

With Images and Javascript disabled, it is indeed not going to work, by design.

I tested it again and if the patch is applied before you install Drupal, the image/javascript are added (the hook_footer implementations must be cached either during install or the first time frontpage is displayed). So it should not be possible to reproduce this bug when/if this patch would be in core.

Thank you joshmiller for your tests :)

CorniI’s picture

for faster loading of the image when the cron is run, you could move all the cron work in a register_shutdown_function() func and just output the gif image during the normal run.
So, instead of

+  if (!isset($cron_last) || (REQUEST_TIME - $cron_last > $cron_threshold)) {
+    drupal_cron_run();
+  }

you would write

if (!isset($cron_last) || (REQUEST_TIME - $cron_last > $cron_threshold)) {
register_shutdown_function('drupal_cron_run()');
}
This would imho remove the js, too, as it's not needed anymore.
tic2000’s picture

Status: Needs review » Needs work

After #519782: Change $closure to become a hidden region like page_top got in

+/**
+ * Implement hook_footer().
+ * 
+ * Adds triggers to run the cron via an image callback when the feature to run
+ * cron automatically is enabled. The image will always be loaded even when the
+ * threshold is not yet crossed to prevent troubles which could occur
+ * if the page is cached.
+ */
+function system_footer() {
+  if (_system_run_cron_image_access()) {
+    // GET the image callback with XMLHttpRequest, it will work only
+    // when Javascript is available.
+    drupal_add_js('(function($){ $.get("'.url('system/run-cron-image').'"); })(jQuery);', array('type' => 'inline', 'scope' => 'header'));
+    // Image callback as HTML code for a graceful degradation when Javascript
+    // is not available.
+    return '<noscript><div><img src="' . url('system/run-cron-image').'" alt="" /></div></noscript>'; 
+  }
+}

will no longer work.

Instead you should do

/**
 * Implement hook_page_alter().
 */
function system_page_alter(&$page) {
  if (_system_run_cron_image_access()) {
   $page['page_bottom']['cron'] = array(
      // GET the image callback with XMLHttpRequest, it will work only
      // when Javascript is available.
      '#attached_js' => array(
        'data' => '(function($){ $.get("'.url('system/run-cron-image').'"); })(jQuery);',
        'type' => 'inline',
        'scope' => 'header',
        'preprocess' => FALSE,
      ),
      // Image callback as HTML code for a graceful degradation when Javascript
      // is not available.
      '#markup' => '<noscript><div><img src="' . url('system/run-cron-image').'" alt="" /></div></noscript>',
    ); 
  }
}
CorniI’s picture

after my previous post, i actually had another idea:
why'd you not just call register_shutdown_function('drupal_run_cron()'); when the cron is overdue?
This doesn't expose any cron key or so to the enduser nor lets them run the cron whenever they want. The problem here are the cached sites which don't hit php at all, but without sending the nocache headers for the image, that solution has that flaw, as well, so has anyone an opinion on this?

TheRec’s picture

CorniI: I don't see how the shutdown function would help, we're in another HTTP request in both case (XHR and image). And as stated a "long" time ago, shutdown function won't output the buffer before they are done, so it won't make the image display faster... if it would then why would we be using an image callback or XHR to create another HTTP request ? It also means there is no reason to remove the Javascript of course.

tic2000: Thank you, here's an updated patch. I'd just like to mention that I had to tweak the code your wrote... it seem there is an error in the current documentation about #attached_js... which made me scratch my head few minutes ;)

TheRec’s picture

Status: Needs work » Needs review
tic2000’s picture

Yes you are right. I wrote the code without looking at the documentation, and as it shows, I'm not that familiar with it yet :)

TheRec’s picture

Well... since the documentation itself is not updated and the conversion guide contains the same error, I guess you have a good excuse ;) Thanks for the help with the rest of the code :)

tic2000’s picture

You're welcome. I only reviewed the code, did not test it so I can't RTBC.

moshe weitzman’s picture

Thats a clever idea but I think PHP does not always behave as you expect when you are in a shutdown function. Error handling is different, for one. See http://us3.php.net/register_shutdown_function. I'd be happy to be proven wrong though.

TheRec’s picture

Again, we are not using a shutdown function.

chx’s picture

folks, you should get on with times. register shutdown functions are part of the request since 4.1.0. Not to mention that this patch does not use them -- exactly because they can't be used for what you want...

catch’s picture

Status: Needs review » Needs work
+ * Implement hook_footer().

This needs updating to match the code.

TheRec’s picture

Status: Needs work » Needs review
FileSize
5.02 KB

OK, corrected.

TheRec’s picture

Added tests to check if triggers are added when the functionality is enabled and not added when it is disabled.

catch’s picture

Status: Needs review » Needs work
+/**
+ * Adds triggers to run the cron via an image callback.
+ * 
+ * The triggers are added only when the feature to run cron automatically
+ * is enabled. The image will always be loaded even when the threshold is
+ * not yet crossed to prevent troubles which could occur if the page
+ * is cached.
+ */
+function system_page_alter(&$page) {

This needs "Implement hook_page_alter().
Possibly some of the phpdoc could move to inline comments too - if system_page_alter() starts to do more stuff, then that documentation header is going to get very crowded.

TheRec’s picture

Status: Needs work » Needs review
FileSize
6.39 KB

Indeed. Rewrote that part, and since it's a hook implementation, let's leave all those comments inline :)

cweagans’s picture

+/**
+ * Menu callback; executes the cron via an image callback.
+ */

I think this should read 'executes cron via an image callback'

+  // Run the cron automatically if it has never ran or threshold was crossed.

How about 'Run cron automatically if it has never been run or the threshold was crossed.'

+   *  Ensure that triggers to run cron automatically are added only when this
+   *  functionnality is enabled.

'Ensure that the triggers to run cron automatically are added only when this functionality is enabled' (spelling in 'functionality', added a 'the' before triggers)

Patch applied with some offset. Dunno if it's absolutely needed, but reroll for cleanliness' sake?

Other than the semi-minor nitpicks above, I'd say RTBC.

Damien Tournoud’s picture

Status: Needs review » Needs work

We are getting there! A few places are still a bit off:

+  $items['system/run-cron-image'] = array(
+    'title' => 'Execute cron',
+    'page callback' => 'system_run_cron_image',
+    'access callback' => '_system_run_cron_image_access',
+    'type' => MENU_CALLBACK,
+  );

There is no reason to have a _ prefix here.

+      // Image callback as HTML code for a graceful degradation when Javascript
+      // is not available.
+      '#markup' => '<noscript><div><img src="' . url('system/run-cron-image').'" alt="" /></div></noscript>',

This badly needs to be a theme function (and at the very minimum, we need to add a class to this div.

+  $cron_last = variable_get('cron_last', NULL);
+  $cron_threshold = variable_get('cron_safe_threshold', DRUPAL_DEFAULT_CRON_THRESHOLD);
+  if (!isset($cron_last) || (REQUEST_TIME - $cron_last > $cron_threshold)) {
+    drupal_cron_run();
+  }

You also need to check if variable_get('cron_semaphore', FALSE) is not TRUE. If you don't, all the requests to the cron image will result in a Attempting to re-run cron while it is already running. message being output to the watchdog.

Nearly there, keep up the good work!

TheRec’s picture

Status: Needs work » Needs review
FileSize
6.43 KB

Corrected (also few other occurrences of "run the cron").

cweagans’s picture

Status: Needs review » Needs work

Nope, cross posted. You missed the stuff in #102.

TheRec’s picture

I agree with your first two remarks Damien and I have fixed them... but before I post another patch, maybe you could clear up the third one. Why should we check cron_semaphore ? It would void the use of this variable in drupal_cron_run(), it is used report when cron execution was stuck or if two cron calls were too close.
And since the call to this function happens only when cron_threshold was crossed (or if cron have never ran), it won't generate the error you're talking about every time we display the image, only every time drupal_cron_run() is called, as it would do with a normal cron execution (which is good, when this error occurs the administrator should see it in the log... even if he is using the automatic cron feature).

Damien Tournoud’s picture

@TheRec: as soon as the cron_threshold is crossed *and* the cron is running (and it will probably take a long time since it hasn't run for bit), each hit to the image url will result in Attempting to re-run cron while it is already running being printed to the watchdog.

TheRec’s picture

Status: Needs work » Needs review
FileSize
7.71 KB

Ok, I get what you mean. So as we discussed on IRC, I implemented a variable named "cron_threshold_semaphore", if it is set then we won't run cron for the next hour (3600 seconds), this is to wait for the cron to finish or if it is still "stuck" after that delay, add a warning in the log, and give it another chance (release the semaphore). This will avoid filling the log on every image request as Damien suggested and keeps a periodic error (every hour) so administrators are still able to spot that there was an error with cron.

joshmiller’s picture

Status: Needs review » Reviewed & tested by the community
FileSize
7.97 KB

Applied with fuzz, so re-rolled and all is happy...

Same code as #107 -- which I have also reviewed once again and worked with in the Drupal interface and have found no problems.

I'm going to be bold and mark it RTBC. Let the flame wars begin ;-)

hass’s picture

I think there is one code style issue.

'(function($){ $.get("'.url('system/run-cron-image').'"); })(jQuery);' => array('type' => 'inline', 'scope' => 'header'),

I think this need at least to be

'(function($){ $.get("' . url('system/run-cron-image') . '"); })(jQuery);' => array('type' => 'inline', 'scope' => 'header'),

Or maybe more better

'(function($){ $.get(' . drupal_to_js(url('system/run-cron-image')) . '); })(jQuery);' => array('type' => 'inline', 'scope' => 'header'),
TheRec’s picture

Good find hass, thank you... I did not know about drupal_to_js :)

joshmiller’s picture

Status: Needs work » Reviewed & tested by the community
joshmiller’s picture

Status: Reviewed & tested by the community » Needs work

Can anyone confirm that all of #102 and #106 has been addressed? (no time on my end, just wanted to point it out...)

joshmiller’s picture

Just took 5 minutes and found that all 112 comments have been addressed in one form or another. In fact, it's interesting that webchick and catch both liked sun's implementation.

The current implementation builds on top of the approved direction by allowing anonymous users to trigger cron as a separate process from the one that is loading the webpage by javascripting an image. In the case that javascript is turned off, we have a <noscript> tag that will load the proper html.

Additionally, we have a cool $threshold_semaphore conditional that gives any current cron process an hour to finish before throwing a "cron is already running" error.

This is wonderfully simple and should stand the test of time. Back to RTBC.

Webchick?

Josh

Dries’s picture

- I think we should better document the 1x1 pixel solution in code comments.

- Instead of using variable_get('cron_threshold_semaphore'), shouldn't we be using our new locking framework?

Other than these two things, this looks RTBC.

webchick’s picture

Status: Reviewed & tested by the community » Needs work

Marking needs work per Dries's #114.

TheRec’s picture

Status: Needs work » Needs review
FileSize
8.1 KB

@Dries: I am not sure about the locking system (maybe because I've never used it yet), looking at the followups it seems that there will already be a lock implemented for each cron implementation according to #553600: Apply locking framework to cron runs. Wouldn't it be a bit too much to have a "master" lock on those locks ?

Waiting to clear up this, I've updated the comments of system_run_cron_image() to describe the use of the image callback.

TheRec’s picture

Damien Tournoud reviewed the tests of this patch and we worked on them together to make them more useful... they now test the correct executions of cron using cron_last variable and the fact that when cron ran it gets updated with the request time. Checking wether the content includes the exact trigger string is not a reliable solution, the only thing it ensured was that the code did not change, so those tests are gone and now we test the functionality correctly.
To trigger the cron execution via the image callback, we get all the images of the page and follow their path with $this->drupalGet, it would be possible to query only the desired image with xpath, but Damien advised me to do it this way :)

The rest of the patch is unchanged... the function header of system_run_cron_image should still be reviewed as clarifiction was requested by Dries.

Damien Tournoud’s picture

This looks good to me. It makes more sense for the rafactoring to use the new locking framework to happen in the other issue.

joshmiller’s picture

I changed TheRec's original system_run_cron_image () comment from this:

+/**
+ * Menu callback; executes cron via an image callback.
+ *
+ * We render a transparent 1x1 image whose path will allow creating a valid
+ * HTTP request that will not prevent the original request to finish correctly,
+ * so we do not disturb the user who made the original request. During this new
+ * request, cron will run if it has not run for an amount of time defined in
+ * the administrative settings.
+ */

To this:

+/**
+ * Menu callback; executes cron via an image callback.
+ *
+ * A transparent 1x1 image is rendered to activate a special path that will
+ * run cron. An image is used to avoid disturbing the user's primary HTTP
+ * request. During this new request, cron will run if it has not run for an
+ * amount of time defined in the administrative settings.
+ */

I believe we have addressed Dries' comments (also see Damien's comment regarding locking). I'll wait for the bot to return and then mark it back RTBC.

Josh

TheRec’s picture

It's maybe a nitpick, but "activate a special path" does not sound pretty much the same as "will allow creating a valid HTTP request"... the function we are describing here is not "activating" anything on itself... it has to be activated from another place (i.e. system_page_alter()) all we do is provide a way to activate. I don't know what was wrong in that comment header I added in the patch in #116, it was made to address Dries request for more clartiy about the transparent image :)

sun’s picture

FileSize
10.26 KB

Re-rolled with

- Slightly renamed define DRUPAL_DEFAULT_CRON_THRESHOLD => DRUPAL_CRON_DEFAULT_THRESHOLD.

- Proper flushing of cache + additional clarification in a new system_site_information_settings_submit(); form validation was the wrong trigger.

- Added/clarified a few comments.

- Cleaned coding-style.

sun’s picture

FileSize
10.21 KB

oopsie, cross-post with joshmiller. Merged that new PHPDoc + improved it a bit.

Dries’s picture

Should we remove the semaphore stuff from this patch and go with a high-level lock at #553600: Apply locking framework to cron runs?

Dries’s picture

"The image is used to avoid disturbing the user's primary HTTP request." is not really explanatory. I think it needs some additional explanation to drive home the reason why we use a 1x1 image.

This looks very close to being perfect though.

sun’s picture

FileSize
10.49 KB

Yes, we should apply the new locking framework here, but that should be done in the scope of #553600: Apply locking framework to cron runs.

I've rewritten the PHPDoc once again and I think it's good to go now.

Dries’s picture

One more thing: a quick CHANGELOG.txt entry, bitte!

sun’s picture

FileSize
12.88 KB

Added changelog entry and information for INSTALL.txt.

Dries’s picture

Status: Needs review » Fixed

Looks great now! Committed to CVS HEAD. Thanks.

DocMartin’s picture

Just to add as user who's on shared hosting, using poormanscron after 2 or 3 variants of code to activate cron didn't work - great to see this being added to core!

moshe weitzman’s picture

Priority: Normal » Critical
Status: Fixed » Active

I can't imagine why, but this patch adds an image request for every single page. That means two bootstraps of drupal for a single page view. Can't we just add the image when we know that cron is overdue?

TheRec’s picture

Status: Active » Fixed

See #331611-61: Add a poormanscron-like feature to core.

Mabye you have a better idea for the implementation, then speak up ;), but before I really suggest you to read all the above comments (many solutions were evaluated before coming to this compromise. If you run a site which cannot support that additional bootstrap, it surely means you should be using an external cron trigger (crontab, scheduled task, etc.) and also that you're supposed to know what you are doing (so it won't be too much trouble to set it up).

gpk’s picture

@130, 131: And also, I think, we have a full bootstrap (for the image) even on a cached page view.

If I've followed the reasoning above correctly, this is to do with the fact that the image needs to be present in any cached page response, otherwise if (anon) users are viewing cached pages and no one is logged in then even if cron is overdue it will not be triggered until either someone logs in or an anon user hits an uncached page.

A couple of options come to mind to get round this:
1. make page caching incompatible with the built-in cron feature. Then Drupal can decide on each page request whether to send the image. Although this would halve the number of page requests made to Drupal by real users with browsers (as distinct from bots), you would lose the benefit of a potentially faster page load for anon users
2. make the decision on whether to output the image in the DRUPAL_BOOTSTRAP_PAGE_CACHE phase in http://api.drupal.org/api/function/_drupal_bootstrap/7 ... if cron is due to run then we would need to prevent a cached response by adding a condition to the check here:

      // If there is no session cookie and cache is enabled (or forced), try
      // to serve a cached page.
      if (!isset($_COOKIE[session_name()]) && $cache_mode == CACHE_NORMAL) {

Would also need to make sure that when the image is included in the page, drupal_page_is_cacheable(FALSE) is invoked so that pages which include a reference to the image never get stored in the page cache.

TheRec’s picture

This automatic cron feature is made to be used by inexperimented users (that is why it is enabled by default). If you have a clue about what cron does and how it can be executed more efficiently (crontab, etc.), it can be disabled.

Now if the feature is buggy in your opinion (or not performant enough... or maybe you want that to be explained at install), I think you should open another issue with detailed problems (and solutions... and even better, if you can/want, a patch). This implementation has been reviewed by few experienced Drupal coders and has been commited, it does not mean that it's bulletproof, but most of the issues that have been brought up today have been debated before. The additionnal bootstrap required by this feature which obviously was accepted as a necessary evil (for people who do not want care to learn how to run cron)... maybe it has been overlooked by all the people involved here, but I find that very unlikely. Anyways this would be better in a new issue, now that this is in core in my opinon.

catch’s picture

Status: Fixed » Active

otherwise if (anon) users are viewing cached pages and no one is logged in then even if cron is overdue it will not be triggered until either someone logs in or an anon user hits an uncached page.

I don't see why this is such a problem. This is a failsafe, if you specific 3 hours, it'll run some time after 3 hours has elapsed. If that ends up being 5 hours, tough luck.

There's two reasons to have poormanscron in core. First to allow people to run sites without any external cron at all - so that search engine indexing and other things continue to work. If it doesn't run on the dot that's fine, it's a poor mans cron - for a proper cron job you can go set up an external cron job and have punctual cron runs anyway.

The other reason is to have a failsafe in case cron stops running for any reason - typo in crontab -e, move server and forget to set up a new job, things like that which can bring any remotely busy site to a halt as the watchdog table grows to hundreds of thousands of entries and cache tables fill up with stale data - in this case, it's also fine if it takes an extra few minutes (or even hours) to run - since that's better than not at all.

I don't think there's either situation where having an extra uncached bootstrap for every request is going to be acceptable though - it was put in an image to not affect visible performance for users - doubling the load on every site enabling this feature is going to do that anyway one way or another.

catch’s picture

Status: Active » Fixed

Status: Fixed » Closed (fixed)
Issue tags: -Usability, -cron, -d7uxsprint

Automatically closed -- issue fixed for 2 weeks with no activity.

Status: Closed (fixed) » Needs review

lojose85 queued 127: drupal-cron.127.patch for re-testing.

The last submitted patch, 11: 331611_yeah.patch, failed testing.

Status: Needs review » Needs work

The last submitted patch, 127: drupal-cron.127.patch, failed testing.

David_Rothstein’s picture

Issue summary: View changes
Status: Needs work » Closed (fixed)
Neirad82’s picture

Version: 7.x-dev » 8.5.5
Balu Ertl’s picture

As this is issue is being referred on the documentation page of Trusted Host settings feature, the date of this comment thread has an important meaning: it all happened in the D7 era.

Therefore I reverted Neirad82's uncommented version change to avoid further misunderstanding.