When cron is run with e.g.

/usr/bin/php /var/www/aegir/drush/drush.php --php=/usr/bin/php --u='1' @youralias

and that's a remote site with a remote Solr (Solr is on localhost for the remote), you get an error:

WD Apache Solr: "0" Status: Request failed: Connection refused in apachesolr_cron                                                                            [error]
WD Apache Solr: No Solr instance available during indexing.                                                                                                  [error]
WD Apache Solr: No Solr instance available during indexing.    

My guess is that, since the process is basically executed on the Aegir master, the php process tries to connect to the "localhost" Solr port on the Aegir master instead of on the remote. I rather suspect this is a general drush remote issue, but nevertheless it makes cron runs via drush completely useless for Solr sites.
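A quick way to see the mismatch (hypothetical hostnames, ssh user, default Solr port and single-core ping path are all assumptions here): the same "localhost" address points at two different machines depending on where the bootstrapped PHP process actually runs:

  # On the remote web server, Solr answers on localhost:
  ssh aegir@web1.example.com 'curl -s http://localhost:8983/solr/admin/ping'
  # The same request from the Aegir master is refused, which is exactly what
  # the drush-driven cron run hits:
  curl -s http://localhost:8983/solr/admin/ping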

I would suggest making drush a dependency on the remote, and invoking it the drush_exec way on the remote (more or less "ssh server drush @alias cron"). Anything else will simply not work robustly in every case, or am I missing something?

We could of course write a Drupal bootstrap of our own... but why write that code? Having drush on the server would also make it easier to verify the site and run other tasks there. Since we need at least control of Apache and the external MySQL port plus grants and all that, shared-hosting remotes are completely out of scope anyway. And since PHP is already part of the dependencies, there should be no reason not to use drush on the remote (besides, drush on the remote is _handy_ and you will install it manually anyway?!).
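As a rough sketch of what I mean (hypothetical host and alias names):

  # Run cron in the site's own environment, on the server that actually hosts it:
  ssh aegir@web1.example.com "drush @yoursite.example.com cron"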

Comments

anarcat’s picture

Title: cron not working properly » run cron on remote servers instead of locally
Project: Hosting » Provision
Version: 6.x-0.4-alpha3 » 6.x-1.0-rc2
Category: bug » feature
Issue tags: +aegir-2.0

Changing the issue title.

You are probably right here, and I agree the cron should be run local to the server.

As a workaround, you should be able to use the wget-style cron, but look out for #1090678: wget cron method broken for d7 sites.
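For reference, the wget-style cron just fetches the site's cron URL over HTTP, so the work always happens on whichever box serves the site (hypothetical hostname; note that D7 sites additionally require the per-site cron_key in the URL):

  wget -O - -q -t 1 http://yoursite.example.com/cron.php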

bwood’s picture

This is also needed to support modules like backup_migrate which operate on the site's files/tmp directory. Consider the following code, which is called via backup_migrate_cron():

backup_migrate/includes/files.inc:

  // Delete temp files abandoned for 6 or more hours.
  $dir = file_directory_temp();
  // Altered for testing!!!!
  //  $expire = time() - variable_get('backup_migrate_cleanup_time', 21600);
  $expire = time() - variable_get('backup_migrate_cleanup_time', 1);
  if (file_exists($dir) && is_dir($dir) && is_readable($dir) && $handle = opendir($dir)) {
    while (FALSE !== ($file = @readdir($handle))) {
      // Delete 'backup_migrate_' files in the temp directory that are older than the expire time.
      // We should only attempt to delete writable files to prevent errors in shared environments.
      // This could still cause issues in shared environments with poorly configured file permissions.
      if (strpos($file, 'backup_migrate_') === 0 && is_writable("$dir/$file") && @filectime("$dir/$file") < $expire) {
        unlink("$dir/$file");
      }
    }
    closedir($handle);
  }

If cron is called via drush, this code will attempt to clean up the *hostmaster* copy of the site instead of the live site on the remote webserver. Clients with a lot of data (consider the backup_migrate_files sister module) could fill up their file system quickly if this isn't running on the correct server.
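A quick way to see the mechanism (hypothetical alias name): file_directory_temp() is resolved by whichever PHP process bootstraps the site, so when drush dispatches cron from the master, the returned path is on the master's filesystem:

  drush @yoursite.example.com php-eval 'echo file_directory_temp();'
  # prints something like /tmp or sites/yoursite.example.com/files/tmp,
  # but on the Aegir master rather than on the remote web server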

Switching to wget as mentioned above remedies the problem.

joestewart’s picture

Is this a drush problem with remote aliases? Shouldn't it run on the remote if the alias says the server is remote?

I have a concern or two about running cron via wget.

1. Some hosting setups behind firewalls are NATed and do not resolve the domain to an internal IP to access the site.
2. Not specific here, but it is kinda nice to be able to segregate cron jobs to run on a separate machine from the web front ends.

mrfelton’s picture

I've run into a similar problem with a site that's using Feeds Imagegrabber to grab images out of a feed, store them locally and attach them to a filefield as part of a cron job. The problem is that when cron is run via Drush from the Aegir master, the images get downloaded to the Aegir master, and not to the remote server where they should be.

As mentioned above, switching to the web based cron gets around the issue, but it's not my preferred method of running cron.

Really, Drush should be run directly on the remote server, not on the Aegir master.

Steven Jones’s picture

Version: 6.x-1.0-rc2 » 7.x-2.x-dev
Priority: Normal » Critical

This isn't going to get fixed in the 1.x branch; the solution there is to use wget cron. We'll get this done in 2.x, however, so I'm marking it as critical for that branch, as the current way of doing it is really stupid (although much simpler!).

mrfelton’s picture

This also seems to affect clearing the Boost cache - my assumption is that it's trying to remove the files locally, which is why the cached files get left behind on the remote server.

EugenMayer’s picture

Well, as suggested, I already solved this by setting up the cron runs manually. I did not take the approach of triggering the cron runs remotely via drush_exec, but rather use a general cron job on the host which basically does something like

"for all installed sites run cron using their site alias"

I can do this because I also deploy site aliases on the remote server (which is likewise missing in 1.x). That way cron runs execute in their proper environment.
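As a hypothetical sketch of that "general" run on the remote web server (assuming the per-site aliases are already deployed there; the alias-listing output differs slightly between Drush versions):

  # /etc/cron.d entry that runs cron for every deployed site alias,
  # each in its own site environment:
  0 * * * * aegir for a in $(drush site-alias); do drush "@${a#@}" cron -y; done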

Removing that "general" cron run by simply invoking drush_exec "drush cron @sitelaias" from aegir would be very simply eventhough. the key is the alias on the remote ( in a lot of cases in the current 1.x implementation ).

But why do I even bother explaining... :)

Steven Jones’s picture

Version: 7.x-2.x-dev » 6.x-2.x-dev
ergonlogic’s picture

Version: 6.x-2.x-dev » 7.x-2.x-dev
Priority: Critical » Major
Status: Active » Postponed

The plan for remote servers, as I understand it, is to make them 'smarter'. This would move us from our current hub/spoke model to a mesh model. That is, have them run Provision, manage their own queue, etc. I can't seem to find the issue where this was being discussed though... Anyway, I'm pretty sure that'll solve this issue, as the remotes will have aliases for the sites running on them.

This work will be undertaken in 7.x-3.x. Also, 'critical' and 'feature request' are usually mutually exclusive.

anarcat’s picture

Issue tags: -aegir-2.0

I wish we had that 3.x version already - can't we create a dev release or something? :)

EugenMayer’s picture

@anarcat: having this architecture in place will be a huge gain.

Since most of the Aegir professionals don't use Aegir remotes, though, I guess they never come across all the shortcomings of having no drush on the remote (and all the other issues you are aware of).

When we implemented a mesh setup for Aegir, it solved a lot of issues right away (or at least made the solutions very easy).

EugenMayer’s picture

Double post - whatever happened.

anarcat’s picture

Version: 7.x-2.x-dev » 7.x-3.x-dev
dmsmidt’s picture

Due to the error message described in this thread, I wasn't able to run a Migrate task in Aegir.
Setting the "On failure" option at /admin/config/search/apachesolr/settings to "Show no results" at least got me further.

Jon Pugh’s picture

Status: Postponed » Active

Question: Is the actual unix crontab utilized at all right now? Is the queue runner the thing that triggers cron on hosted sites?

Since the unix cron daemon is a service and we can restart it, why not start using it, much like we do with Apache?

On site verify, we could add a crontab entry for the site on whatever server it is on, and restart it.

If we do that, not only would we be able to set up cron on remote servers, we could also make the cron timing configurable per site, which would be really useful.
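A hypothetical example of what a Verify task could write for a site on its own server (alias name, interval, and cron user are assumptions here):

  # /etc/cron.d/yoursite.example.com, regenerated on every Verify
  */15 * * * * aegir drush @yoursite.example.com cron -y > /dev/null 2>&1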

I think adding more sudo restart privileges for different services is a good thing if it allows us to automate more things.