The Issue

The scheduler module allows users to remotely trigger the lightweight cron process for publishing/unpublishing nodes. Typically this is done via curl or wget like so:

curl http://yoursite.com/scheduler/cron

What's missing from this is authentication; anybody that knows you have scheduler enabled can arbitrarily hit this URL on your site to repeatedly publish/unpublish your content. For many people in a situation where there are content embargos (e.g. you're legally obligated to NOT publish content before a certain date/time) you can run into some serious problems.

The Solution

Add a authentication mechanism that is used by Drupal core's cron system. If you visit your status page, you are given the option to remotely call cron by passing an additional cron key.

With this patch backwards compatibility is preserved for those that wish to run an insecure lightweight cron. As soon as you configure a lightweight cron key, you must pass that key in the form of:

curl http://yoursite.com/scheduler/cron/<your long random password here>

For example:

curl http://yoursite.com/scheduler/cron/dRQDtLEayfXjpcfhrF46tJbm6MDCyx3RB6nTcgWUZFnMP5Ddyxe3RMZctW5uqqQe

The reason for not using the Drupal core cron key is because that changes on each site. If you manage many sites having a consistent key reduces maintenance and dev overhead for your calling script.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

blazindrop’s picture

jonathan1055’s picture

Status: Active » Needs review
Related issues: +#1463378: Lightweight cron functionality exposed to anon users?

Hi blazindrop

Thank you for taking the time to post this. It is an interesting issue. I think there has been some discussion on this before and the reason we have left the lightweight cron open is that we did not consider additional cron runs to be particularly dangerous. No content could get published or unpublished that was not already scheduled for such an operation. You say that content could be repeatedly published and unpublised but that would not happen because once the content is published it will not be scheduled again unless an authorised user edits the node. You also mention content embargos where the user is not allowed to publish content between certain times. If that is a requirement then I would not expect the site admins to rely on simply not having cron runs during that period, they are more likely to have a custom module which does additional validation on the scheduler date fields.

However, I am open to reconsidering the situation. I'd like to hear what the other maintainers think about it.
I have a couple of points regarding the actual patch, but they can wait until we decide if this is going to be progressed.

Jonathan

blazindrop’s picture

Hi jonathan1055, very good point and provided there are no bugs or security issues down the road in _scheduler_run_cron() then there won't be any issues. The perspective I'm taking is similar to buying homeowners insurance. You know there's a highly unlikely chance that you will get burglarized so you may opt to buy the lower coverage. Should you get burglarized then you will be regretting not opting for the higher coverage amount. The higher coverage amount in this case is securing _scheduler_run_cron().

What motivates this patch is the fact that _scheduler_run_cron() modifies content state on the server side and I feel that should be secured.

Thanks for the feedback.

jonathan1055’s picture

Status: Needs review » Needs work

I think you have convinced me that it is worth doing this. After posting the scenario above I thought of the case where even if the admin had put in place extra validation for restricting dates, if they have the 'allow scheduler dates in the past' option set to publish on next cron run, the user could set a past date then run the cron themselves and get the content published when it should not. We did not have the 'allow past dates' functionality when we discussed this issue before. I think that is a strong enough case to lock down the /scheduler/cron url. Plus as you say, there could be other as yet unknown problems which could be averted by this. It is a simple change so I am for it.

Regarding the patch, the name scheduler_access() looks like it is a hook function. I have just checked and hook_access() does not actually exist but this could still cause confusion. I think _scheduler_cron_access() is a better name, and adheres to our standard of starting internal functions with an underscore.

Secondly, would it be good to add a button in the form to generate a random string for the key? If not, then we need to explain in the help that a random string needs to be invented.

Jonathan

blazindrop’s picture

I think you have convinced me that it is worth doing this.

Awesome! :)

Regarding the patch, the name scheduler_access() looks like it is a hook function. I have just checked and hook_access() does not actually exist but this could still cause confusion. I think _scheduler_cron_access() is a better name, and adheres to our standard of starting internal functions with an underscore.

Secondly, would it be good to add a button in the form to generate a random string for the key? If not, then we need to explain in the help that a random string needs to be invented.

Very good ideas and I can make those changes. The easiest option for generating a password is placing some documentation on the settings with links to a few good sites for that. I'll re-roll the patch but it may take me a couple of days. Thanks for the positive feedback.

blazindrop’s picture

Assigned: Unassigned » blazindrop
jonathan1055’s picture

Status: Needs work » Needs review
FileSize
264.47 KB
4.86 KB

Here is an updated patch with the following changes

  1. Changed the name of the callback to _scheduler_cron_access()
  2. Moved this function to beneath the lightweight form
  3. Added the current key into the crontab examples at the top of the page, via scheduler_help()
  4. Added a 'generate new key' button and a corresponding submit handler _scheduler_generate_key()
  5. Fixed error in div placement on another field in the lightweight form, to save vertical space

Here is a screenshot

The layout could be improved if the 'generate new key' button could be floated to the right of the form adjacent to the text input and description. There must be some #attributes or class that can do that.

Jonathan

jonathan1055’s picture

Assigned: blazindrop » jonathan1055
FileSize
5.09 KB

Found a bug in my code. The newly generated key was still being used to set the variable even if after generating you edited and changed the value. The line

$form['scheduler_cron_settings']['scheduler_lightweight_access_key']['#value'] = $new_access_key;

needed to be

$form_state['input']['scheduler_lightweight_access_key'] = $new_access_key;

I also added a warning message saying that a new key has been generated but is not stored until you save the configuration. New patch attached, rolled against the latest code in git (which is not the dev release of 18th Nov on the project page).

Jonathan

pfrenssen’s picture

FileSize
5.13 KB
2.78 KB

This is a worthwhile addition. The actual danger of running unauthorized scheduling jobs is quite small, but it might be a DDOS attack vector on very large sites.

I provided an access argument for the menu router entry so we can get rid of the arg() call, and removed an unneeded switch to insert a slash in case there is a key.

I tested the functionality manually and it works well. I like how the "Generate new random key" functionality is implemented, this is very user friendly and provides perfect backwards compatibility.

jonathan1055’s picture

Thanks for the review, pleased you like the 'generate' part. It took a while to get right, as I first tried generating the key in the submit handler but that was not working. Then I realised I only needed to pass back the request to generate and actually do the processing in _scheduler_lightweight_cron().

I like your change to pass the second url argument into the access function.

I did my testing both via a crontab job and manually going to the /scheduler/cron url, with the following combinations:
1. correct key in url - runs ok
2. incorrect key in url - denied
3. missing key when one is required - denied
4. no key when one not is required - runs ok
5. giving a key when no key is required - runs, ie not denied.

Do you think for this 5th scenario the access should return false? It may give the appearance that a key is being used when none is, which give a false impression of security.

Also the reason I conditionally added the '/' to the front of $access_key in the help was so that it only showed a '/' when an access key was being used. Otherwise it shows /scheduler/cron/ with no key, and I'm not sure if this syntax is accepted as a valid path in all crontab implementations. Hence it might confuse the admin who is using our help as an example.

pfrenssen’s picture

5. Is due to the way that the menu module handles routes. I wouldn't worry too much about it. The URL with the trailing slash is actually a parameter to wget, and according to the URL spec the slashes are used as a delimiter between the different components that make up a path and are ignored. For example, this is also legal and works fine: https://drupal.org////////////////node////////2054777//////////////////////

Legal or not, I agree that it does look a bit worse than before though. I'll roll a quick update to remove the trailing slash.

pfrenssen’s picture

FileSize
5.15 KB
1.56 KB

Here is a new patch which reinstates the previous behaviour where there is no trailing slash if there is no cron key.

jonathan1055’s picture

FileSize
5.11 KB
676 bytes

Thanks for fixing the help example, that's a better way to do it, using $cron_url

Regarding the scenario 5, it's a separate issue. I was thinking we should be more strict in checking the access, if we are going to secure cron we might as well do it properly. On a large site there might be more than one admin, or a someone in charge of security (who looks at the crontab jobs and thinks it is secured) when the scheduler admin has deleted the cron key. The lightweight cron continues to run so the fault is never noticed. The check in _scheduler_cron_access() should be changed from

  return empty($valid_cron_key) || ($valid_cron_key == $cron_key);

to

  return ($valid_cron_key == $cron_key);

I've tested this, and it matches the behaviour in cases 1-4, but now denies access in case 5.

pfrenssen’s picture

Assigned: jonathan1055 » Unassigned
Status: Needs review » Fixed

Looking perfect now! Committed to 7.x-1.x: commit 7f2701e7. Thanks everybody!

jonathan1055’s picture

Excellent, thank you.

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.

mjdriver’s picture

Hello i'm a student , and i'm using drupal 7

as I can see, i have a different schedule module.

I cant get on the page /scheduler/cron. I want to setup my cron every time i published something.
Plzz help me! i need this working.

I already spent houres searching for a solution but its all so complicated.

plzz help me !!