The following happened once on a high-traffic site:

The user edited a node, then saved it. At almost the same time (within milliseconds), the same node was loaded, and the site crashed. As we figured out, the slave wasn't in sync, so the old data was loaded from the slave and cached.
I need your advice: how can I check whether db_ignore_slave() works correctly? I'm not very familiar with this deep database layer, especially with master-slave architecture.

Comments

gielfeldt’s picture

db_ignore_slave() does not work with AutoSlave, because AutoSlave takes care of this on a per-table basis.

To control the "assumed maximum replication lag", use the autoslave options in settings.php. It sounds like you may also want to enable the "global replication lag" option, which will force the use of the master db for concurrent users for the affected tables.

Example:

<?php
$databases['default']['default'] = array(
  'driver' => 'autoslave',
  'master' => 'mymasterdb',
  'slave' => 'myslavedb',
  'replication lag' => 30, // Defaults to 2 seconds if not set. Standard Drupal's db_ignore_slave() defaults to 300 seconds (5 minutes).
  'global replication lag' => TRUE, // Defaults to FALSE.
);
?>
gielfeldt’s picture

Elaboration of "db_ignore_slave() does not work with AutoSlave": It has no effect :-)

gielfeldt’s picture

Status: Active » Postponed (maintainer needs more info)

Hi

Did this answer your question?

szantog’s picture

We are working on this; I'll keep you posted.

gielfeldt’s picture

Ok. Let me know if you have other questions.

szantog’s picture

Status: Postponed (maintainer needs more info) » Fixed

We added registry, system, and registry_file to the 'Always use "master" for tables' array, and it seems this solves our issue.
Thanks for your help!
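For reference, a sketch of how this might look in settings.php. Note that the 'always master' key name below is my assumption based on the wording above, not confirmed AutoSlave API; check the AutoSlave project documentation for the exact setting:

```php
<?php
// Hypothetical sketch: the 'always master' key is an assumed name.
// It illustrates forcing all reads of the registry, system and
// registry_file tables to the master database, bypassing any slave.
$databases['default']['default'] = array(
  'driver' => 'autoslave',
  'master' => 'mymasterdb',
  'slave' => 'myslavedb',
  'always master' => array('registry', 'system', 'registry_file'),
);
?>
```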

gielfeldt’s picture

Hi

No problem. Do you have any insight into why these tables need to be "always master" on your setup? Then I'll document it on the project page, and perhaps even add them to the default settings.

Thanks

gielfeldt’s picture

Status: Fixed » Postponed (maintainer needs more info)

Hi szantog

Not sure how notification works when status is fixed, so I'm changing the status just in case :-)

szantog’s picture

Hmm, it's a bit hard to say. Everything is theoretical, because we have limited debugging options on the live environment.

The point should be the memcache vs. database inconsistency. If the normal database cache backend is used, then e.g. when enabling a module, the cache and registry tables are flushed and rebuilt. Since this all happens in the database, if some lag exists, the wrong system and cache tables are loaded: wrong, but at least consistent. If we use memcache, then due to the lag, wrong memcache data is built and runs against the correct database:

1. Enable a module: the master database is correct, memcache is empty.
2. The next page request loads the data from the slave, which is not in sync yet, and this stale data goes into memcache.

But there is a flaw in this theory: what about the other cache bins? Obviously, wrong data in the cache_bootstrap bin would cause a site crash.
----------------
I can't believe it, but while I was writing this, we got another site crash.
When a node was updated, the parent entity was loaded immediately after saving, the old data got cached, this caused EntityMalformedExceptions, and within a few minutes the site was killed.
We need further investigation.

gielfeldt’s picture

The next page request loads the data from the slave, which is not in sync yet, and this stale data goes into memcache.

The replication lag mitigation feature should prevent exactly this.

How does your $databases['default']['default'] look? Did you try enabling the 'global replication lag'?

I have a vague suspicion that this is an isolation level issue, assuming you're using MySQL InnoDB. I'll investigate this further.

gielfeldt’s picture

Oh, by the way, are you also using the lock.inc or memcache-lock.inc bundled with AutoSlave?

gielfeldt’s picture

Hi szantog

I actually found something that could be the culprit also when using "global replication lag".

In the latest dev version, I've tried to address these issues, which resulted in these fixes:
* AutoSlave now uses a non-transactional connection for "global replication lag", including the possibility of a better isolation level.
* "Global replication lag" is now enabled by default.

Try installing the dev version and adding the new "init_commands" to the autoslave driver declaration:

<?php
$databases['default']['default'] = array(
  'driver' => 'autoslave',
  'master' => 'mymasterdb',
  'slave' => 'myslavedb',
  'replication lag' => 30, // defaults to 2 seconds if not set. Standard Drupal's db_ignore_slave() defaults to 300 seconds (5 mins.)
  'init_commands' => array('autoslave' => "SET SESSION tx_isolation='READ-COMMITTED'")
);
?>

Let me know how/if it works.

szantog’s picture

Thanks for your hard work; we will try those on Wednesday.

gielfeldt’s picture

Hi szantog

I think I finally figured out what's wrong, and I can't believe I didn't realize it before. Every query runs through AutoSlave, making AutoSlave able to detect which tables are being queried... except when using join() on a db_select(). Unfortunately, these are VERY common :-)

I'm currently thinking about how to hook into this. My initial thoughts are overriding SelectQuery altogether, or perhaps adding a tag or an extender to the query, thereby being able to alter it.

The latter seems easier to implement, though I'm not sure I can easily change the connection on an already instantiated SelectQuery object.

This problem seems to be what is interfering with the EntityCache ... and god knows what else.
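As an illustration of the tag/alter idea above (not the actual AutoSlave implementation), Drupal 7 already lets any module inspect the joined tables of a db_select() in hook_query_alter(); the module name and the commented-out marking step are hypothetical:

```php
<?php
/**
 * Sketch of the "tag + alter" approach using standard Drupal 7 hooks.
 *
 * Implements hook_query_alter().
 */
function mymodule_query_alter(QueryAlterableInterface $query) {
  if ($query instanceof SelectQueryInterface) {
    // getTables() exposes the base table and every join()ed table,
    // which is the information a driver needs to decide whether a
    // "dirty" table should force the query onto the master.
    foreach ($query->getTables() as $table) {
      if (is_string($table['table'])) {
        // e.g. mark $table['table'] as requiring the master connection.
      }
    }
  }
}
?>
```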

szantog’s picture

EntityCache is irrelevant for now, my mistake: entitycache is now turned off on one of our sites, and I missed it.

gielfeldt’s picture

There's a new dev version ready, where I've tried to address the issue of dirty tables not being recognized properly. Could you try it out?

Note, you'll have to copy the autoslave folder to /includes/database/ again, as it contains a new file.

Regarding EntityCache, I've discovered that this will only work properly when using the database as a cache backend. The reason is that e.g. node_save() clears the cache inside a transaction.

gielfeldt’s picture

I've been looking some more into this. Regarding the error with entity_extract_ids(), I think I've traced it to the inherent problem with transactions and non-database cache backends.

I've created a cache wrapper in autoslave, which should make cache queries transaction safe.

<?php
$conf['cache_backends'] = array(
  'sites/all/modules/memcache/memcache.inc',
  'sites/all/modules/autoslave/autoslave.cache.inc',
);

$conf['cache_default_class'] = 'AutoslaveCache';
$conf['autoslave_cache_default_class'] = 'MemCacheDrupal';
?>

Another way to solve it could be to just use the database as a cache backend.

This is partly theoretical, as I have been unable to reproduce your exact problem (not knowing precisely which modules you're using and how they are configured). However, it does fit the case, since modules like file_entity perform cache operations during node_save(), which is wrapped in a transaction.

With the cache wrapper, the core-patch I mentioned earlier in our chat should be unnecessary.
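For the database-backend alternative mentioned above, the default cache class can be set explicitly in settings.php; DrupalDatabaseCache is Drupal 7 core's standard backend:

```php
<?php
// Use core's database cache backend for all bins. Cache writes then
// participate in the surrounding transaction (e.g. during node_save()),
// avoiding the memcache-vs-database inconsistency described earlier.
$conf['cache_default_class'] = 'DrupalDatabaseCache';
?>
```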

gielfeldt’s picture

I've generalized the consistent cache wrapper: http://drupal.org/sandbox/gielfeldt/1946668

szantog’s picture

Now that we've started working together across all of our sites, can we close this issue, or should we just rename it to 'various fixes and improvements based on high-traffic live testing'? :)

gielfeldt’s picture

Status: Postponed (maintainer needs more info) » Fixed

Let's just close it. I've set it to fixed, since the original issue with the site actually crashing has been solved.

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.