So, I have a multilingual site. The English version works just fine (ex. home page 229 queries in 28.03 ms). The same page in another language has 931 queries in 6225 ms.
The query log show that the following queries run for every translated word on the page (about 150 times).

40.521 i18nstrings_get_string SELECT s.*, t.translation FROM locales_source s INNER JOIN locales_target t ON s.lid = t.lid WHERE s.textgroup = 'nodetype' AND s.location = 'type:page:name' AND t.language = 'el'

1.071 i18nstrings_add_string INSERT INTO locales_source (location, source, textgroup, version) VALUES ('type:page:name', 'Page', 'nodetype', '1')

0.08145 db_last_insert_id SELECT LAST_INSERT_ID()

0.2915 cache_clear_all DELETE FROM cache WHERE cid LIKE 'locale:nodetype:%'

1.161 locale SELECT s.lid, t.translation, s.version FROM locales_source s LEFT JOIN locales_target t ON s.lid = t.lid AND t.language = 'el' WHERE s.source = 'If you want to add a static page, like a contact page or an about page, use a page.' AND s.textgroup = 'default'

Even with boost module enabled, the site is extremely slow and causes load issues to the server.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

jeka_fl’s picture

I have exact same problem a lot of i18n, inserts a row INSERT INTO locales_source (location, source, textgroup, version). base reaches 2GB, I remove the lines from bzy, the site is neither as no effect! Why inserts wonder had to clean the base permanently. if you know the solution .. tell

asb’s picture

Category: support » bug
Priority: Normal » Critical

Similar issues on several sites with i18n enabled. I consider this issue as "critical" as it not only kills the i18n-enabled site, but as well all other sites on the same server as well, including other services depending on MySQL. If you're on VPS or shared hosting, your db access might get blocked due to excessively high load, if you're running your own server... well, you might have to figure out how to access a irresponsive server.

The problem here is that we get massive slow queries minutes after starting the server up. If we let these queries run for a while, the server load continously rises; after 20-20 minutes we reach a load of 40 on a quad core server which means that the whole server is basically dead. After restaring mysqld, the drama repeats, until we take the i18n site(s) down, or completely uninstall i18n. According to MySQLs query log, most of the slow queries are somehow related to i18n.

However, what makes this issue a real kicker that we were not able to reproduce it clearly. Usually it suffices to just create a node on an i18n site; slow queries will start to spawn and take the server down. Sometimes this simply does not happen, even when creating a node of the exact same content type.

Example queries from slow query log:

# Time: 110716 18:40:54
# Query_time: 201  Lock_time: 39  Rows_sent: 1  Rows_examined: 0
SELECT COUNT(*) FROM (SELECT i.type, i.sid, 5 * (19.9103141434 * SUM(i.score * t.count)) + 5 * POW(2, (GREATEST(MAX(n.created), MAX(n.changed), COALESCE(MAX(c.last_comment_timestamp), 0)) - 1276439761) * 6.43e-8) + 5 * (2.0 - 2.0 / (1.0 + MAX(c.comment_count) * 0.25)) + 5 * (2.0 - 2.0 / (1.0 + MAX(nc.totalcount) * 0.00197628458498)) AS score FROM search_index i INNER JOIN search_total t ON i.word = t.word INNER JOIN node n ON n.nid = i.sid LEFT JOIN node i18n ON n.tnid > 0 AND n.tnid = i18n.tnid AND i18n.language = 'de'  LEFT JOIN node_comment_statistics c ON c.nid = i.sid LEFT JOIN node_counter nc ON nc.nid = i.sid WHERE n.status = 1 AND (n.language ='de' OR n.language ='' OR n.language IS NULL OR n.language = 'en' AND i18n.nid IS NULL) AND (i.word = 'bild' OR i.word = 'life' OR i.word = 'photography' OR i.word = 'world' OR i.word = 'feather' OR i.word = 'sky' OR i.word = '3') AND i.type = 'node' GROUP BY i.type, i.sid HAVING COUNT(*) >= 7) n1;

# Time: 110718 11:26:19
# Query_time: 15  Lock_time: 0  Rows_sent: 1  Rows_examined: 7815652
SELECT COUNT(DISTINCT(n.nid)) FROM node n INNER JOIN term_node tn ON n.vid = tn.vid  LEFT JOIN node i18n ON n.tnid > 0 AND n.tnid = i18n.tnid AND i18n.language = 'de' WHERE (n.language ='de' OR n.language ='' OR n.language IS NULL OR n.language = 'en' AND i18n.nid IS NULL) AND (  tn.tid IN (1148,1056,647,1102) AND n.status = 1);

# Time: 110719  1:24:15
# Query_time: 195  Lock_time: 0  Rows_sent: 1  Rows_examined: 79432376
SELECT i.type, i.sid, 5 * (13.6873279401 * SUM(i.score * t.count)) + 5 * POW(2, (GREATEST(MAX(n.created), MAX(n.changed), COALESCE(MAX(c.last_comment_timestamp), 0)) - 1311015615) * 6.43e-8) + 5 * (2.0 - 2.0 / (1.0 + MAX(c.comment_count) * 0.25)) + 5 * (2.0 - 2.0 / (1.0 + MAX(nc.totalcount) * 0.00194174757282)) AS score FROM search_index i INNER JOIN search_total t ON i.word = t.word INNER JOIN node n ON n.nid = i.sid LEFT JOIN node i18n ON n.tnid > 0 AND n.tnid = i18n.tnid AND i18n.language = 'de'  LEFT JOIN node_comment_statistics c ON c.nid = i.sid LEFT JOIN node_counter nc ON nc.nid = i.sid WHERE n.status = 1 AND (n.language ='de' OR n.language ='' OR n.language IS NULL OR n.language = 'en' AND i18n.nid IS NULL) AND (i.word = 'bild' OR i.word = 'beautiful' OR i.word = 'image' OR i.word = 'home' OR i.word = 'kitchen' OR i.word = '4') AND i.type = 'node' GROUP BY i.type, i.sid HAVING COUNT(*) >= 6 ORDER BY score DESC LIMIT 0, 10;

# Time: 110720 15:43:40
# Query_time: 189  Lock_time: 16  Rows_sent: 50  Rows_examined: 233711270
(SELECT n.nid, n.vid, n.vid AS cvid, n.uid, u.name, r.title, '' AS log, n.created, r.timestamp, n.type, 0 AS cid, '' AS subject, '' AS comment, r.uid AS auid, a.name AS aname FROM node n LEFT JOIN node_revisions r ON n.vid = r.vid LEFT JOIN users u ON n.uid = u.uid LEFT JOIN users a ON r.uid = a.uid  LEFT JOIN node i18n ON n.tnid > 0 AND n.tnid = i18n.tnid AND i18n.language = 'de' WHERE (n.language ='de' OR n.language ='' OR n.language IS NULL OR n.language = 'en' AND i18n.nid IS NULL) AND (  FALSE AND n.status >= 1)) UNION ALL (SELECT r.nid, r.vid, n.vid AS cvid, n.uid, u.name, r.title, r.log, n.created, r.timestamp, n.type, 0 AS cid, '' AS subject, '' AS comment, r.uid AS auid, a.name AS aname FROM node_revisions r LEFT JOIN node n ON r.nid = n.nid LEFT JOIN users u ON n.uid = u.uid LEFT JOIN users a ON r.uid = a.uid  LEFT JOIN node i18n ON n.tnid > 0 AND n.tnid = i18n.tnid AND i18n.language = 'de' WHERE (n.language ='de' OR n.language ='' OR n.language IS NULL OR n.language = 'en' AND i18n.nid IS NULL) AND (  n.type IN ('comment', 'anbieter', 'story', 'gallery_image', 'book', 'faq', 'film', 'gallery', 'glossary', 'illustration', 'produkt', 'publication', 'ressource', 'page', 'poll', 'website') AND n.status >= 1)) UNION ALL (SELECT c.nid, n.vid, n.vid AS cvid, n.uid, u.name, n.title, '' AS log, c.timestamp AS created, c.timestamp, n.type, c.cid, c.subject, c.comment, c.uid AS auid, c.name AS aname FROM comments c LEFT JOIN node n ON c.nid = n.nid LEFT JOIN users u ON n.uid = u.uid  LEFT JOIN node i18n ON n.tnid > 0 AND n.tnid = i18n.tnid AND i18n.language = 'de' WHERE (n.language ='de' OR n.language ='' OR n.language IS NULL OR n.language = 'en' AND i18n.nid IS NULL) AND (  n.type IN ('comment', 'anbieter', 'story', 'gallery_image', 'book', 'faq', 'film', 'gallery', 'glossary', 'illustration', 'produkt', 'publication', 'ressource', 'page', 'poll', 'website') AND n.status >= 1)) ORDER BY timestamp DESC LIMIT 0, 50;

Sorry, I don't know enough about SQL to figure out why this results in "Query_time: 189" and "Lock_time: 16".

alanpeart’s picture

Component: Code » Blocks

In case you are still looking for the solution: http://drupal.org/node/586736

Adding an index to the "language" field of the node table brought MySQL execution time for switching between languages in "mixed mode" on my i18n site from 54s to .347s.

asb’s picture

j0rd’s picture

Version: 6.x-1.3 » 6.x-1.10
Component: Blocks » Menus

@#3 => alanpeart, you're link has nothing to do with the original problem.

While this problem happens in blocks, I think it's more noticable with large menus (think admin menu), so I've changed the "Component" for others to more easily debug.

On my site for each page, it probably gets called 500 times per page request. This needs to be properly cached and query once if needed. Currently as mentioned above I believe this query is getting called for any instance of t(), which is a bunch on any single page load.

Here's the query in question. Attached is a screenshot with a lot of them. Also additionally, you'll see the yellow dashes in my scrollbar in the screen shot, each of those is an instance of this query.

SELECT s.*, t.translation FROM locales_source s INNER JOIN locales_target t ON s.lid = t.lid WHERE s.textgroup = 'nodetype' AND s.location = 'type:page:name' AND t.language = 'el'
j0rd’s picture

FileSize
196.19 KB

Forgot screenshot mentioned above.

j0rd’s picture

Status: Needs review » Active
FileSize
2.01 KB

Here's my patch for i18nstrings.module

Fix happens in i18nstrings_get_source();

Previously the code did this:
Check to see if the string is in the cache, if it is, use it, else query the database and store it in cache. You'll only get a cache hit, if multiple of the same word are translated on the same page.

What it does now:
Checks to see if the all the strings for the language are cached, if they're not cache them. Then see if the string is in the cache, and return what ever was in the cache. This function now works very similar to the locale() function in locale.module. You'll get a cache hit, if the next string you are translating is the same language as the first one you attempted to translate. Much better!

This simple fix reduced the amount of DB queries on my page from 800 to 300, reduction of 500 queries caused by i18nstrings_get_source().

Potential problems:
I need someone familiar with i18n to take a look. I'm excluding the 'default' textgroup from my cache, because I don't think I need it. Should strings also be needed from the 'default' textgroup in i18nstrings_get_string(), you should use the locale() function, since it already stores the 'default' textgroup information.

Another potential problem would be if you have tonnes of these strings. This could cause memory issues. For my case this is an acceptable. I have roughly 400 entries in my table (approximately the amount of queries which later don't need to get called) for each particular language and they're all quite small.

You can see roughly how much will be stored in memory by this patch by running this query:
SELECT s.lid, t.translation FROM locales_source s INNER JOIN locales_target t ON s.lid = t.lid WHERE t.language='fr' AND s.textgroup<>'default'

Where 'fr' is the langcode you want to test for.

j0rd’s picture

Status: Active » Needs review

Changing issue for automated testing.

Status: Active » Needs work

The last submitted patch, i18nstrings-dbtrash-877016-7.patch, failed testing.

j0rd’s picture

Status: Needs work » Needs review
FileSize
1.89 KB

One more time.

Status: Needs review » Needs work

The last submitted patch, i18nstrings-dbtrash-877016-10.patch, failed testing.

j0rd’s picture

Status: Needs work » Needs review
FileSize
528 bytes

Testing blank patch to see if testing works.

Status: Needs review » Needs work

The last submitted patch, i18nstrings-dbtrash-877016-12.patch, failed testing.

j0rd’s picture

Version: 6.x-1.10 » 6.x-1.x-dev
Status: Needs work » Needs review
FileSize
1.89 KB

Alright, so it appears patch #10, fails no worse than a blank patch #12.

Re-uploading patch #10 and testing it against -dev to see if that makes a difference.

Status: Needs review » Needs work

The last submitted patch, i18nstrings-dbtrash-877016-10.patch, failed testing.

Jose Reyero’s picture

Status: Needs work » Closed (won't fix)

About tests, I thought they were disabled for this branch (though I enabled them just for curiosity, now I cannot disable it), see from i18n README.txt (So do not worry about them anyway.)

SimpleTest:
-----------------
Tests for this module will run on SimpleTest 6.x-2.8 (old version).
About this see http://drupal.org/node/584596

About the patch, caching all strings for a language is not an option here as that just won't scale for too many strings and if you have now speed issues, the other way you may end up having memory issues.

More ellaborated caching schemes (like checking which strings appear on every page) should be an option, only we don't take any more features for this version. So better consider using other caches (blocks, pages, etc..)

So, about this patch this will be 'won't fix', though you can make it available from some link in the module handbook or if you want to have the feature in a module, you can create another module that just updates i18n_string_cache() so you don't need to patch this one.

j0rd’s picture

I would disagree. This caching is similar to how locale currently works (locale limits pulling in strings larger than 75 chars though). Adding 300-500 queries per page load is fairly unacceptable and destroyed load times of a site I'm working on due to DB being on another server and increasing latency. With this patch on my site it actually reduced memory, I assume because much less code gets hit. Also since most of string.translations are for items which show up on most pages anyways (menus, blocks...) thus 80% of the strings end up in cache (once query is done and string pulled.from db its cached) I don't think most users will notice.

With that said this could be easily an option in the admin. But I think some profiling on sites in existance should be done to see what the memory burden actually is. I think the site I'm working on is on the larger scale with the use of i18n (9 languages, ~30000 nodes, 40000 entries in translation tables (not text group default)) and if this patch makes that site use less memory and drops the amount of queries per page load by 30-50% I don't think this should be written off quite yet with out more feedback and profiling.

What doesn't scale is 1 query per string and then caching. How many times will that cache get hit as well? How often do you use same translated string on the same page? Maybe 2-3 times for maybe 2-3 strings. More common use case is that if you're pulling 1 translated string from DB for a language, you're going to need more for that language.

j0rd’s picture

For those who are interested in profiling patch #14 on an existing site I'd recommend these modules. Please be sure to profile your site before and after the patch and post results with number of queries, time spent doing queries, memory, and execution time. Also let us know if your DB is on the same machine as web server and what DB it is (MySQL/pgsql)

Devel
http://drupal.org/project/devel

Performance
http://drupal.org/project/performance

Then turn off page cache in sites configuration -> performance

Then in devel settings turn on query logging, display queries, display memory, display execution time.

In user permissions you should turn on access devel for anonymous users.

Then test your site on common and uncommon pages with and with out the patch then report back the results. The patch will provide better page loads when the DB is not hosted on the same server as web and also if MySQL query cache is turned off. Which for me is the difference between my production and development environment and probably why I didn't notice on my development box, but had big problems when moving to pre production environment.

Jose Reyero’s picture

Category: bug » feature
Priority: Critical » Major

First of all, recategorizing, as I don't think this is a critical bug.

And a reminder 6.x branch is closed for new features. Whatever you want to implement, 7.x is the place to do it. Again, this can be implemented as an add-on module if you need it for 6.x.

About these performance issues, please keep in mind they are too dependant on specific site builds. Of course if you want to display a few hundred taxonomy terms (or whatever) translated on a page you'll see a lot of queries, but you're also loading a lot of data there. And possibly you'd be better off caching the full page or block content.

So this may work for your site, sure, but caching everything just won't scale, think of a site with many thousands of terms to be translated. And about adding new options, no thanks, we have enough of them.

So please if you really want this to be available consider starting a new module.

asb’s picture

@Jose Reyero: Sorry, but I have to disagree. If a feature from Drupal core is able to kill a database server and even render a dedicated server unresponsive for 15+ minutes, this is a critical bug. It affects not only the Drupal site itself, but the underlaying software stack as well; by definition, a bug is critical if it "either render[s] a system unusable", e.g. "not being able to create content". And that is exactly what this bug (or whatever you want to call it), does: It makes the Drupal site inaccessible, kills the database server, and even affects the stability of a dedicated Linux server - not every server takes a load of 40 on a quad core server forever.

If the bug is "dependant on specific site builds", it'd be crucial to get those factors properly documented so we can try to avoid them. I have seen quite a number of multilingual sites, and none of them with a non-trivial size did not show a behaviour similar to that described in #2. Other posts in this and other issues confirm this. So either this core feature does not scale at all, or it has a severe bug that needs to be resolved. You just can't tell a client: "Yes, you could do a multilingual site with Drupal, but you must not write more than ten pages", can't you? (especially if you just had to explain that taxonomy doesn't scale, either).

I know that trying to fix bugs in Drupal core is quite the opposite of fun, so it's understandable to won't fix it, but please don't cover up a harmful bug as a "feature request"; many of us are using the issue queue's statuses and priorities as a criterion to figure out if a module is in a reasonable good condition. Thanks!

Jose Reyero’s picture

Category: feature » bug
Priority: Major » Critical

... but please don't cover up a harmful bug as a "feature request"...

What?

Ok, then this is a "critical bug report" if that helps "your stats", which I couldn't care less as long as it is closed and it doesn't pop up on my list of open issues again....

(But really I think you should look for help somewhere else about your deditacted server taking 15+ mins to render a page)

asb’s picture

@Jose: Thanks for resetting status and category.

If you read this issue, you might notice that 'Internationalization' causes the high server load, and the server load falls back to normal, as soon as you uninstall 'Internationalization'. Actually that's the whole point of this issue. You might also notice that I didn't write about a time to render a page, but a timeframe of 15 minutes and more where the whole server becomes unresponsive. With a load of 40, Drupal does not even start to render pages within a reasonable response time.

j0rd’s picture

@asb. Any change you'll be able to test my patch on the offending site? Or any other sites you having running i18n

asb’s picture

@j0rd: If it'd be one year ago (#2), I'd say "yes" and "hooray". After that, we dropped support for multilingual Dupal sites, and our servers like this decision. Sorry.

Status: Closed (won't fix) » Needs review
j0rd’s picture

Issue summary: View changes

@naim, no longer doing Drupal development, so if you want this patch to work, you'll need to fix it yourself. Code works though, probably just needs to get updated to latest version.

joseph.olstad’s picture

Version: 6.x-1.x-dev » 7.x-1.x-dev
Issue tags: +Performance

wondering if this is still relevant.

If so:

wondering if this can be refactored for 7.x i18n / 8.x core

candelas’s picture

I am interested too :)

joseph.olstad’s picture

Version: 7.x-1.x-dev » 6.x-1.x-dev
Status: Needs review » Closed (outdated)

while it is possible (highly likely) that this issue offers insight on performance opportunities for current versions of i18n I am closing this as outdated, feel free to open a new issue if you are working on performance opportunities / improvements for 7.x i18n or 8.x core

I do not doubt the observations above also apply to D7.x and possibly 8.x however let's let someone start a new issue for their new patches