Our site has about 250,000 rows in linkchecker_node, and we were having daily problems with failed queries because of locked tables. The function _linkchecker_cleanup_links() has a delete query that takes a very long time, and uses a subselect, which I believe keeps other tables locked as well. As I understand it, this query runs every day with cron. If a user tries to publish a node while this query is going on, the locked tables cause transactions to fail, which causes a whole lot of data issues.

I also found this info on drupal.org:

Note that on some databases, particularly MySQL, subselects in a conditional clause are not particularly fast. If possible, use joins, subselects in the FROM clause, or multiple flat conditional fragments instead of a subselect.

(from https://www.drupal.org/node/310086)

So I believe the fix is to not use a subselect. I'll attach a patch which seems to fix the problem locally. Since this is critical to our site, we'll also be testing this on production.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

brockfanning created an issue. See original summary.

brockfanning’s picture

Status: Active » Needs review
FileSize
766 bytes
brockfanning’s picture

FileSize
1.47 KB

That didn't work so well, the giant array caused memory issues. Here is another approach. This should perform much better, since it INcludes the unreferenced links (small number), rather than EXcluding the referenced links (large number).

hass’s picture

What MySQL version are you running? Do you have any analysis data that prove the locking and show where it locks? Do you know if this may has been improved in later MySQL versions?

I'd really like the subselect, but was not aware that it locks tables :-(

brockfanning’s picture

We're hosted on Acquia, and it looks like they are using "5.5.24-55-log". I'm not too database-savvy, so I'm not sure about future versions of MySQL, and as for analysis all I have is log records of queries failing while the table is locked. Here are some examples of log entries during one of these cron runs:

Oct 23 12:08:53 https://example.com|1445602133|node|https://example.com/civil/awards-date-10232015|https://example.com/civil/awards-date-10232015|2631||PDOException: SQLSTATE[HY000]: General error: 1205 Lock wait timeout exceeded; try restarting transaction: INSERT INTO {linkchecker_node} (nid, lid) VALUES (:db_insert_placeholder_0, :db_insert_placeholder_1); Array
Oct 23 12:10:25 https://example.com|1445602225|node|https://example.com/node/136856/edit?content_lock_token=snipped|https://example.com/node/136856/edit?content_lock_token=|PDOException: SQLSTATE[HY000]: General error: 1205 Lock wait timeout exceeded; try restarting transaction: DELETE FROM {linkchecker_node}
Oct 23 12:17:07 https://example.com|1445602627|node|https://example.com/node/641141/edit?content_lock_token=snipped|https://example.com/node/641141/edit?content_lock_token=snipped|PDOException: SQLSTATE[HY000]: General error: 1205 Lock wait timeout exceeded; try restarting transaction: DELETE FROM {linkchecker_node}
Oct 23 12:20:21 https://example.com|1445602821|node|https://example.com/node/641006/edit?content_lock_token=snipped|https://example.com/node/641006/edit?content_lock_token=snipped|PDOException: SQLSTATE[HY000]: General error: 1205 Lock wait timeout exceeded; try restarting transaction: INSERT INTO {linkchecker_node} (nid, lid) VALUES (:db_insert_placeholder_0, :db_insert_placeholder_1); Array
Oct 23 12:20:36 https://example.com|1445602836|node|https://example.com/ncfs/work-products-adopted-commission|https://example.com/ncfs/work-products-adopted-commission|1187||PDOException: SQLSTATE[HY000]: General error: 1205 Lock wait timeout exceeded; try restarting transaction: DELETE FROM {linkchecker_node}
Oct 23 12:22:24 https://example.com|1445602944|node|https://example.com/node/641141/edit?content_lock_token=snipped|https://example.com/node/641141/edit?content_lock_token=snipped|PDOException: SQLSTATE[HY000]: General error: 1205 Lock wait timeout exceeded; try restarting transaction: DELETE FROM {linkchecker_node}
Oct 23 12:27:56 https://example.com|1445603276|node|https://example.com/node/641006/edit?content_lock_token=snipped|https://example.com/node/641006/edit?content_lock_token=snipped|PDOException: SQLSTATE[HY000]: General error: 1205 Lock wait timeout exceeded; try restarting transaction: INSERT INTO {linkchecker_node} (nid, lid) VALUES (:db_insert_placeholder_0, :db_insert_placeholder_1); Array
Oct 23 12:29:43 http://example.com|1445603383|cron|127.0.0.1|http://example.com/index.php||0||PDOException: SQLSTATE[HY000]: General error: 1205 Lock wait timeout exceeded; try restarting transaction: DELETE FROM {linkchecker_link}
Oct 23 12:29:43 http://example.com|1445603383|cron|127.0.0.1|http://example.com/index.php||0||PDOException: SQLSTATE[HY000]: General error: 1205 Lock wait timeout exceeded; try restarting transaction: DELETE FROM {linkchecker_link}
Oct 23 12:29:44 http://example.com|1445603384|cron|127.0.0.1|http://example.com/index.php||0||PDOException: SQLSTATE[HY000]: General error: 1205 Lock wait timeout exceeded; try restarting transaction: DELETE FROM {linkchecker_link}
Oct 23 12:36:34 https://example.com|1445603794|node|https://example.com/node/641141/edit?content_lock_token=snipped|https://example.com/node/641141/edit?content_lock_token=snipped|1187||PDOException: SQLSTATE[HY000]: General error: 1205 Lock wait timeout exceeded; try restarting transaction: DELETE FROM {linkchecker_node}

I believe the INSERT queries are when users are trying to publish a node of a type that should be scanned by linkchecker. The effect, from the user's perspective, is that the browser spins for a long time, then finally fails, and not all of the stuff that should happen on node-publish has happened.

brockfanning’s picture

Just a heads up, this module has been added to Acquia's "do not use" list: https://docs.acquia.com/cloud/develop/drupal/module-incompatibilities

I don't know for sure that the reason is what I fixed in this patch, but from the description it sounds like the same thing. It may be worth a closer look at the patch and maybe contacting Acquia?

hass’s picture

I have a solution for Acquia customers. Rent a real server for 1/10 EUR per month and not a reselled overpriced Amazon AWS virtual machine without power. Just an useful idea...

No idea what this page should tell me about recaptcha and linkchecker. Useless issue comments. It is not pointing towards this issue here. Better they report how to solve the issue than only complaining and spread fear.

NancyDru’s picture

It is no longer on that list.