Support from Acquia helps fund testing for Drupal Acquia logo

Comments

killes@www.drop.org’s picture

FileSize
2.59 KB

Update: now implements hook_comment and hook_nodeapi to set the session variable. The maximal lag time is a variable without UI.

killes@www.drop.org’s picture

A similar patch runs since almost a year on drupal.org. The only functional difference is that we do only ignore the slave for the very next page request.

killes@www.drop.org’s picture

One could of course argue that this could also be in a contributed module. That would be ok with me.

Crell’s picture

Status: Needs review » Needs work

The changes to hook_nodeapi() broke this. I approve of the approach, however, and I'd rather see this in core.

Hm, I wonder if hook_comment() should be broken out as well... :-)

killes@www.drop.org’s picture

Status: Needs work » Needs review
FileSize
4.2 KB

As requested, I've updated the patch. This patch nicely exposes two problems with the change to the nodeapi hook:

1) Code duplication: the fact that insert and update do the same are not that uncommon. For longer parts of common code you'll need to introduce yet another function.

2) Other hooks (here: hook_comment) haven't been converted.

Dries’s picture

Status: Needs review » Needs work

This will need extensive comments before it can be considered.

Crell’s picture

Status: Needs work » Needs review
FileSize
3.19 KB

Reroll with more descriptive comments in system_init(). Dries, if you want something else or comments elsewhere instead, you'll need to say what and where. Between that block and the docblock for Database::ignoreTarget() I think we're probably OK.

David Strauss’s picture

Subscribing.

moshe weitzman’s picture

Status: Needs review » Reviewed & tested by the community

Looks simple and desireable to me. Comments are clear.

keith.smith’s picture

One very small note -- per the patches in #349504: Conform sentence spacing in code comments., code comments in core should only be separated by single spaces.

Dries’s picture

To me, it feels more logical, if, from an API point of view, we'd pass that information to the SQL query instead of implementing various hooks. That is, when doing an insert or updated in the comment table, why don't we pass a parameter to db_query()? I'd like to see us discuss some more.

David Strauss’s picture

Status: Reviewed & tested by the community » Needs review

The industry-standard method of ensuring replication is to post a magic value to the database (INSERT or UPDATE) after performing the changes and simultaneously send the magic value to the client using a cookie or session. On the next page, check for your magic value against your request's assigned slave server. If you find the magic value, you know the slave is at least as up to date as it needs to be. If you do not find the value, read from the master.

My preferred way to handle the issue is both more complicated and more scalable: try to store the data necessary to effect the user's change in the session or (ideally) cookie. Then, when they load the next page, use PHP or (ideally) Javascript to integrate the user's change if it isn't part of the data read from the slave server. This gives users quick, satisfying feedback on their activity by exploiting how the only that client needs to immediately see the results of its interaction. Other site users can wait for the replication delay.

Crell’s picture

I have to disagree with both of you.

Dries, what exactly do you mean? We're already passing information into db_query(), specifically, "try to use a slave server if available" (the target option key). What else would we pass in? "try to use a slave server if available and it's been less than X seconds since the last write by this process"? That puts a lot more work on the query author to figure out, and I don't even know what the implementation would look like internally.

David, your first suggestion involves an extra query for every slave query we run to see if it's safe to use, as well as an extra query for every write. That is a non-trivial number of new queries, as well as a non-trivial amount of work for module authors. I see lots of bugs cropping up from that if anyone tries to use a slave server.

As for saving the updated data to the session or a cookie only to bypass the database entirely, that's even worse from a module developer's point of view. You want to cache every node that gets saved in the database in order to check it on the next half-dozen page views? That still does nothing for Views or other aggregate queries, too. Those will need to pull from the master database either way, but only for that user.

And relying on Javascript for that logic is a non-starter, for obvious reasons.

This patch minimizes the work for module authors while providing a global kill switch (that we can handle mostly in core) to temporarily disable the slave for that user. There are, as a result, no extra queries run, and that user gets all up to date data for however long the site admin sets the timeout to (based, presumably, on empirical data of the replication delay on his server farm). I still haven't seen another implementation that provides this little hassle for module developers.

David Strauss’s picture

I said it was the industry standard. I didn't say it was right. ;-)

Crell’s picture

So we've discussed more, and found no better approach than what's listed here. Dries, unless you have a specific objection to this approach I'd ask that we move forward with it, as replication is really not useful without a disable mechanism (or acknowledgement that data could be very stale if you run replication).

moshe weitzman’s picture

Status: Needs review » Needs work

Needs to use drupal_set_session() now. Should be a straightforward search/replace for $_SESSION ... We should actively unset() this if possible. For unset, we can access $_SESSION directly.

wonder95’s picture

Status: Needs work » Needs review
FileSize
3.16 KB

Made the mods suggested by Moshe to use drupal_set_session().

moshe weitzman’s picture

Status: Needs review » Reviewed & tested by the community

Thanks for the change. Back to RTBC. Probably should wait for testbot's final thumbs up before commit.

Status: Reviewed & tested by the community » Needs work

The last submitted patch failed testing.

wonder95’s picture

FileSize
3.65 KB

Per discussion with Crell and dmitrig01, modified call to hook_comment() into separate functions since $op no longer exists for hook_comment and created third function for setting session variable.

wonder95’s picture

Status: Needs work » Needs review

Changing status to code needs review for testing.

Status: Needs review » Needs work

The last submitted patch failed testing.

drewish’s picture

+/*
+ * Implementation of hook_comment_update().
+ */
+function comment_comment_update($comment) {
+  drupal_set_ignore_slave();
+}
+
+/*
+ * Helper function to get duration lag from variable
+ * if one isn't passed, and then set the session variable
+ * that contains the lag.
+ *
+ * @param $duration
+ *   The amount of time to ignore the slave to give it
+ *   time to refresh after an insert or update.
+ */
+function drupal_set_ignore_slave($duration = NULL) {

PHPDocs should start with /** rather than /*.

+function node_nodeapi_insert(&$node) {
+  drupal_set_session('ignore_slave_server',REQUEST_TIME + variable_get('maximal_replication_lag', 10));
+}
+
+/**
+ * Implementation of hook_nodeapi_update().
+ */
+function node_nodeapi_update(&$node) {
+  drupal_set_session('ignore_slave_server',REQUEST_TIME + variable_get('maximal_replication_lag', 10));

need to put a space after the commas before REQUEST_TIME.

Why does comment do drupal_set_ignore_slave(); and node do drupal_set_session('ignore_slave_server',REQUEST_TIME + variable_get('maximal_replication_lag', 10));
? it seems like they should both call the same function.

mikey_p’s picture

What about moving drupal_set_ignore_slave into database.inc and calling it db_ignore_target and adding a parameter for the target/key to be ignored (could default to 'default' and 'slave') ?

My only other thought was maybe session.inc but it seems better fit for database.inc.

wonder95’s picture

Status: Needs work » Needs review
FileSize
4.12 KB

My thoughts, too. Here is a re-rolled patch that adds db_set_ignore_slave() to database.inc and modifies the insert and update hooks in comment and mode modules to use it. I did it on Windows and ran it through dos2unix, so there shouldn't be any syntax errors this time (hopefully).

drewish’s picture

seems like it could use a test or two...

Crell’s picture

Status: Needs review » Needs work

A test would be good if possible, but I'm not entirely sure how we'd write it. Probably as a pair of JSON callbacks. Hm.

The comment block in system.module should be updated to mention the new utility function instead of talking about $_SESSION directly. Otherwise this looks OK. (I am tempted to push something into the Database class and make db_set_ignore_slave() a dumb wrapper, but I think that would put too much non-DB API code into the DB API proper. So let's leave it as is for now.)

wonder95’s picture

Status: Needs work » Needs review
FileSize
4.16 KB

Patch has been re-rolled to add reference to db_set_ignore_slave() in comments in system.module.

Dries’s picture

The last patch looks good.

Three comments:

1. I'd improve the documentation a bit. Most people are not familiar with replication lag. I'd explain it in 1-2 sentences -- really crisp but enough to give them a clue and to google for details.

2. The variable_get() makes me a bit nervous. If I do replication to another data center, I'll have a different lag than when I do replication to a local server. I might have different lag times for different slave servers?

3. Is there a function to temporary disable a slave that seems to be down? For example, I have 3 slaves and slave #2 is down. If Drupal discovers this (i.e. it gets a time-out), can it take slave #2 out of rotation for a while using this function? I understand that this is not part of this patch, but it might keeping this in mind when doing the API design.

Crell’s picture

If you're doing live replication to a local server and a server in another state, I don't understand why you'd be using both of them as hot slaves. I admit I've never setup a replicated environment, but I don't understand why you'd want to do that in the first place. :-) If you did, you'd just set the timeout to a higher value, enough to catch all of your active slave servers. (If you're just doing backups via replication, then you wouldn't tell Drupal about your remote slave server.) So I think we're covered here.

Right now there's no handling for temporary DB outage. killes had opened another issue about that at one point but not much has happened there. Definitely a separate issue from this one; not sure how or if that can be addressed. Probably something we'd want to bring in some "big iron" people to give suggestions on. Someone go find IBM. :-)

So I think this is ready as soon as the docs beef up a bit, then. If this doesn't get in by DCDC, I'm making it a priority for the code sprint.

David Strauss’s picture

I think slave failure is best handled using clustering systems like Heartbeat 2 that transparently move service IPs to other functional systems within the cluster.

Drupal might be able to handle failing slave servers, but where would we store the persistent data about slave state, in the database? That would be inefficient. If a memory-based cache is installed, that might do the trick.

wonder95’s picture

FileSize
4.43 KB

Updated patch with added comments in system.module to explain replication lag.

Crell’s picture

Status: Needs review » Reviewed & tested by the community

Comment looks good to me.

mmcdougall’s picture

in system.module patch above , isn't the following if condition exactly backwards?
i.e. should change

    if ($_SESSION['ignore_slave_server'] < REQUEST_TIME) {
      Database::ignoreTarget('default', 'slave');
    }
    else {
      unset($_SESSION['ignore_slave_server']);
    }

to

    if ( REQUEST_TIME < $_SESSION['ignore_slave_server'] ) { ...

I've been testing this for consumersearch.com (d5!) and ran into this.

Crell’s picture

Assigned: killes@www.drop.org » Unassigned
Status: Reviewed & tested by the community » Needs work

Hm. It is possible that you are correct. The value in the session is the timestamp until which we want to skip the slave server. So we ignore the slave server if the request time is less than the value in the session. Yeah, I think you are right. :-)

I understand that this patch has been backported to run on Drupal.org. Can any of the d.o admins weigh in on how well it's working? I really do want to get this into HEAD soon so that we can figure out what queries to flag.

David Strauss’s picture

Here's the patch we're running on d.o:

http://vcs.fourkitchens.com/drupal/6-patched/revision/27

wonder95’s picture

Status: Needs work » Needs review
FileSize
4.43 KB

Made change above to match backported D5 patch:

if ($_SESSION['ignore_slave_server'] >= REQUEST_TIME) {...

Status: Needs review » Needs work

The last submitted patch failed testing.

wonder95’s picture

FileSize
4.37 KB

Re-rolled to account for change in core of hook names from hook_nodeapi_insert() and hook_nodeapi_update() to hook_node_insert() and hook_node_update().

wonder95’s picture

Status: Needs work » Needs review
Damien Tournoud’s picture

Status: Needs review » Needs work

The node module implementing hook_node_(update|insert)() and the comment module implementing hook_comment_(update|insert)()? Sorry, but that looks *very* ugly. There *has* to be a better way to do that.

David Strauss’s picture

Status: Needs work » Needs review

Reposting what I said in IRC:
"[An equivalent of this patch is] running on d.o, and it isn't causing any new problems. Unfortunately, the people who reported the problems we created [the patch] to solve haven't clearly indicated whether [the original problem is] solved. I believe the problem should be theoretically solved."

Damien Tournoud’s picture

Status: Needs review » Needs work

By definition, the node module control the hook_node_*() hooks, and the comment module control the hook_comment_*() hooks. Why not placing those calls just after the call to these hooks, instead of as a hook implementation itself?

Plus, a comment before the call would be great. It's not that clear what an isolated call to db_set_ignore_slave() does.

David Strauss’s picture

I personally prefer for modules to factor even their own functionality into their own hooks when possible.

moshe weitzman’s picture

It is somewhat common in Drupal for the a module to implement its own hooks. I don't find it inelegant at all. It keeps the original function smaller. See user_user(), user_user_operations(), ...

Damien Tournoud’s picture

Here is the bottom of node_save(). Why not putting the call to db_set_ignore_slave() there, where it will be both elegant and straightforward to understand?

  // Call the node specific callback (if any). This can be
  // node_invoke($node, 'insert') or
  // node_invoke($node, 'update').
  node_invoke($node, $op);

  // Save fields.
  $function = "field_attach_$op";
  $function('node', $node);

  node_invoke_node($node, $op);

  // Update the node access table for this node.
  node_access_acquire_grants($node);

  // Clear the page and block caches.
  cache_clear_all();
}
Crell’s picture

Damien, can you roll an updated patch then?

wonder95’s picture

I chatted with Damien in #drupal today, and he isn't planning on rolling a patch for this (he was just making a suggestion). Should we go ahead with this as is, or implement his suggestion and re-roll the patch?

drewish’s picture

i agree that generally modules should use their own hooks but in this case the function being called seems to me like it should be part of node/comment_save() rather than done via hook.

David Strauss’s picture

@drewish You may have a point. This is sort of an over-arching concern of node and comment saving rather than a hook-like concentration on the module's own data.

wonder95’s picture

Status: Needs work » Needs review
FileSize
4.24 KB

Here is a re-rolled patch that adds db_set_ignore_slave() to the end of node_save() and comment_save() and removes the insert and update hooks for nodes and comments.

Status: Needs review » Needs work

The last submitted patch failed testing.

wonder95’s picture

Status: Needs work » Needs review
FileSize
4.24 KB

Another re-roll to add periods to the ends of the comments for db_set_ignore_slave() in node_save() and comment_save() (per Crell's request).

Status: Needs review » Needs work

The last submitted patch failed testing.

andypost’s picture

Status: Needs work » Needs review

some bots were broken

Dries’s picture

This looks good. I'd rename db_set_ignore_slave to db_ignore_slave().

I don't see why/when we'd want to pass the duration using a parameter. Developers don't know the underlying system or architecture, so they can't decide to change the 10 to 12, IMO. That is always up to the people running the site because they are the only ones that really know. Because setting the $duration as a parameter overwrites the variable_get(), this is a dangerous parameter -- it takes away control from the system administrator.

Dries’s picture

Issue tags: +Favorite-of-Dries

Tagging and tracking.

David Strauss’s picture

Status: Needs review » Needs work

I agree with Dries. There's no way for a developer to usefully override the site-wide configuration.

Setting to "needs work" because the default should be more than 10 seconds. I'd set it to 300. A small/medium site doesn't need the slave so much that the extra 290-second delay has a real impact on performance. A large site will be either need the longer delay or will be explicitly tuning the setting. Five minutes is long enough to allow the slave to break and resume interrupted replication without causing problems on the Drupal site from the old data.

wonder95’s picture

Status: Needs work » Needs review
FileSize
4.22 KB

Updated patch with three changes:

  1. Changed function name from db_set_ignore_slave() to db_ignore_slave()
  2. Changed call to drupal_set_session() in db_ignore_slave() to set $_SESSION['ignore_slave_server'] directly since drupal_set_session() doesn't exist any more.
  3. Increased default delay to 300 seconds

In regards to the fact that there is no way to set the delay, what about adding a field to the admin UI somewhere to allow the admin to set the $duration value?

Status: Needs review » Needs work

The last submitted patch failed testing.

David Strauss’s picture

There's no reason to offer an admin interface for this option. Configuring slave servers already has to happen in settings.php.

David Strauss’s picture

The duration argument still needs to go for db_ignore_slave().

wonder95’s picture

Title: Add option to ignore slave server for next requests » Add option to ignore slave server for next requests
Status: Needs work » Needs review
FileSize
3.94 KB

Re-rolled with default value removed from db_ignore_slave().

I used TortoiseCVS to create, so here's hoping it works...

Status: Needs review » Needs work

The last submitted patch failed testing.

David Strauss’s picture

Looks more like a failure in the test bot.

killes@www.drop.org’s picture

Status: Needs work » Needs review

trying again

Status: Needs review » Needs work

The last submitted patch failed testing.

killes@www.drop.org’s picture

well, apparently, there is really a problem.

wonder95’s picture

Found the problem. In db_ignore_slave(), I had removed the $duration parameter, so it didn't like the fact that we were checking for something that wasn't there:

/**
 * Helper function to get duration lag from variable
 * if one isn't passed, and then set the session variable
 * that contains the lag.
 *
 * @param $duration
 *   The amount of time to ignore the slave to give it
 *   time to refresh after an insert or update.
 */
function db_ignore_slave() {
  if (!$duration) {
    $duration = variable_get('maximum_replication_lag', 300);
  }
  $_SESSION['ignore_slave_server'] = REQUEST_TIME + $duration;
}

Changed it in three places:

  1. Removed check for $duration.
  2. Added comment for line on setting session variable.
  3. Cleaned up commets for function to remove reference to parameter being passed.

to this:

/**
 * Helper function to get duration lag from variable
 * and set the session variable that contains the lag.
 */
function db_ignore_slave() {
  $duration = variable_get('maximum_replication_lag', 300);
  //set session variable with amount of time to delay before using slave 
  $_SESSION['ignore_slave_server'] = REQUEST_TIME + $duration;
}
wonder95’s picture

Status: Needs work » Needs review

Found the problem. In db_ignore_slave(), I had removed the $duration parameter, so it didn't like the fact that we were checking for something that wasn't there:

/**
 * Helper function to get duration lag from variable
 * if one isn't passed, and then set the session variable
 * that contains the lag.
 *
 * @param $duration
 *   The amount of time to ignore the slave to give it
 *   time to refresh after an insert or update.
 */
function db_ignore_slave() {
  if (!$duration) {
    $duration = variable_get('maximum_replication_lag', 300);
  }
  $_SESSION['ignore_slave_server'] = REQUEST_TIME + $duration;
}

Changed it in three places:

  1. Removed check for $duration.
  2. Added comment for line on setting session variable.
  3. Cleaned up commets for function to remove reference to parameter being passed.

to this:

/**
 * Helper function to get duration lag from variable
 * and set the session variable that contains the lag.
 */
function db_ignore_slave() {
  $duration = variable_get('maximum_replication_lag', 300);
  //set session variable with amount of time to delay before using slave 
  $_SESSION['ignore_slave_server'] = REQUEST_TIME + $duration;
}
wonder95’s picture

FileSize
3.83 KB

Hmmm, looks like something was wrong with the last attachment. Attaching again.

Dries’s picture

- + //set session variable with amount of time to delay before using slave has a code style issue.

- Tiny request. Let's add a small comment to document our choice of 300. I think we can borrow from David's comment in #58. Could be helpful to guilde operations people.

One last re-roll? :)

wonder95’s picture

FileSize
4 KB

Re-rolled with Dries' requested changes.

Dries’s picture

Committed to CVS HEAD. Yay.

Damien Tournoud’s picture

Status: Needs review » Active

A minor but sensible nitpick: there is no point in cluttering the session if there is no slave server to start with. Let's check for that in db_ignore_slave().

Dries’s picture

The cookie should only be set when a comment or node is updated/inserted -- it shouldn't have dramatic performance impact. But yes, let's get that fixed -- it could give a small performance win to clean that up.

wonder95’s picture

Status: Active » Needs review
FileSize
4.18 KB

Added step to db_ignore_slave() to check for more than one db being used before setting session variable.

Crell’s picture

Status: Needs review » Needs work

It looks like this patch doesn't account for the fact that the rest of the patch already went in. :-) We just need the tweak to db_ignore_slave() itself. Also, I think we can probably static-cache count($connection_info). Although you can add servers at runtime, that's a crazy edge case that has yet to be used outside of unit tests anyway. This is a micro-performance patch, so let's mind the micro-performance. :-)

mikey_p’s picture

Just a note, this is currently in the bottom of database.inc in the defgroup for legacy stuff. I'd recommend moving it to between _db_error_page and db_fetch_object.

wonder95’s picture

Here are a couple static cache options (thanks to mikey_p for the help):

function db_ignore_slave() {
  static $connection_info;
  if (!isset($connection_info)) {
    $connection_info = Database::getConnectionInfo();
  }
  // Only set ignore_slave_server if there are slave servers
  // being used.
  if (count($connection_info) > 1) {
    // Five minutes is long enough to allow the slave to break and resume
    // interrupted replication without causing problems on the Drupal site
    // from the old data.
    $duration = variable_get('maximum_replication_lag', 300);
    // Set session variable with amount of time to delay  before using slave.
    $_SESSION['ignore_slave_server'] = REQUEST_TIME + $duration;
  }
}

and

function db_ignore_slave() {
  static $ignore_slave;
  if (!isset($ignore_slave)) {
    $connection_info = Database::getConnectionInfo();
    // Only set ignore_slave_server if there are slave servers
    // being used.
    if (count($connection_info) > 1) {
      // Five minutes is long enough to allow the slave to break and resume
      // interrupted replication without causing problems on the Drupal site
      // from the old data.
      $duration = variable_get('maximum_replication_lag', 300);
      // Set session variable with amount of time to delay  before using slave.
      $_SESSION['ignore_slave_server'] = REQUEST_TIME + $duration;
      $ignore_slave = TRUE;
    }
  }
}

Which one is preferable?

Crell’s picture

Well the second one doesn't look like it would actually work. :-) You should also use the new drupal_static() function. That said, the fastest approach is probably to cache $use_slave and then just if ($use_slave) { ... }

wonder95’s picture

Status: Needs work » Needs review
FileSize
2.26 KB

Updated patch with the following changes:

  1. Moved db_ignore_slave() out of legacy section in database.inc:
  2. Added use of drupal_static()
  3. Added $use_slave variable to check for slave and set session variable one time
Damien Tournoud’s picture

Frankly, who cares about the micro-performance of db_ignore_slave()? The difference between using drupal_static() or just doing a count(Database::getConnectionInfo()) should not be measurable anyway. I vote for removing the static cache.

Status: Needs review » Needs work

The last submitted patch failed testing.

wonder95’s picture

Status: Needs work » Needs review

OK, but does it hurt to leave the static cache in? It would seem better to err on the side of speed, no matter how tiny it is. Of course, the trade-off is it also means a few more lines of code. I'll let people more knowledgeable than me decide, though.

Just my $.02.

wonder95’s picture

FileSize
2.24 KB

Re-submitting with correct path to database.inc in the patch (TortoiseCVS is so inconsistent with that).

Dries’s picture

I'm with DamZ, the static caching is unnecessary.

Dries’s picture

I'm with DamZ, the static caching is unnecessary.

mikey_p’s picture

FileSize
1.82 KB

Move out of @ingroup database-legacy.

I guess this is more of our future legacy?

Dries’s picture

We lost the slave checking in #89.

wonder95’s picture

FileSize
2.15 KB

Slave checking added back in, and function moved out of legacy group.

Status: Needs review » Needs work

The last submitted patch failed testing.

wonder95’s picture

Status: Needs work » Needs review
FileSize
2.15 KB

Ok, let's try that again, this time will all necessary closing braces.

Crell’s picture

Status: Needs review » Needs work

Everything except the $connection_info line is indented one level too deep. I don't know why. Otherwise it looks fine to me. Let's fix the spacing and RTBC.

wonder95’s picture

FileSize
2.06 KB

@Crell, I was testing you to make sure you were on your toes. :-)

Here is is with proper indentation.

wonder95’s picture

Status: Needs work » Needs review

Oops, forgot to set to "needs review".

Crell’s picture

Status: Needs review » Reviewed & tested by the community

Now we're in business.

Dries’s picture

Status: Reviewed & tested by the community » Fixed

Committed to CVS HEAD. Thanks.

Status: Fixed » Closed (fixed)
Issue tags: -Favorite-of-Dries

Automatically closed -- issue fixed for 2 weeks with no activity.