The function i18n_string_object:load_translation is not using cache, and it can generate a huge amount of queries to load the translated strings. This small commits leverales Drupal's cache to reduce to load to the database.
Page loaded in the default language (no queries to fetch translated strings):
Executed 108 qTo the first ueries in 37.16 ms. Queries exceeding 5 ms are highlighted. Page execution time was 968.05 ms. XHProf output. Memory used at: devel_boot()=2.6 MB, devel_shutdown()=39.47 MB, PHP peak=43.75 MB.
Page loaded in a translated language (before patch):
Executed 452 queries in 72.65 ms. Queries exceeding 5 ms are highlighted. Page execution time was 1988.87 ms. XHProf output. Memory used at: devel_boot()=2.6 MTo the first B, devel_shutdown()=42.62 MB, PHP peak=47.5 MB.
Page loaded in a translated language (after patch):
Executed 189 queries in 47.04 ms. Queries exceeding 5 ms are highlighted. Page execution time was 1540.21 ms. XHProf output. Memory used at: devel_boot()=2.6 MB, devel_shutdown()=41.69 MB, PHP peak=46.75 MB.
This drupal installation in particular is a heavily-bloated installation, that's why the page execution times are high, but i think the example shows the significant gains. This scenario was tested for a non-logged user
We still have to handle cache expiration when the strings are updated, and probably add a new cache bin for this, but i first want to know if the patch is going on the right track, i'm not very familiarized with this module's code.
Comment | File | Size | Author |
---|---|---|---|
#85 | REVERT_i18n-add_cache_i18_string-2885006-85.patch | 4.46 KB | joseph.olstad |
#44 | i18n-add_cache_i18_string-2885006-43.patch | 4.47 KB | m.lebedev |
#42 | i18n-add_cache_i18_string-2885006-42.patch | 4.51 KB | m.lebedev |
#21 | add_cache-i18_string-v3-fixed.patch | 4.18 KB | agusluc88 |
#15 | add_cache-i18_string-v3.patch | 4.22 KB | agusluc88 |
Comments
Comment #2
agusluc88 CreditAttribution: agusluc88 commentedComment #3
agusluc88 CreditAttribution: agusluc88 commentedComment #4
agusluc88 CreditAttribution: agusluc88 commentedNew patch:
Digging a bit more in the module's code, i found out that there was already a function in place to get the CID for a given string. Using that function allow me to take advantage of the already present cache_clear_all invocations in the code. I only had to add the langcode to the CID because the query in the load_translation function is only retrieving the translation for one specific langcode and not for all.
Seems like the cache is now being expired properly when the translations are updated
Comment #5
agusluc88 CreditAttribution: agusluc88 commentedComment #6
agusluc88 CreditAttribution: agusluc88 commentedComment #7
joseph.olstadexcellent work ,
do you have any performance stats (before and after) from a performance profiling tool such as xhprof?sorry , yes I see that, thanks!just so that we can review the performance gains?Comment #8
joseph.olstadI've tested this performance patch and on my test environments I cannot find any problems/issues with it.
Great work!
Comment #9
agusluc88 CreditAttribution: agusluc88 commentedGreat!
There is also another possible optimization. Currently, load_translation is only fetching the translated string in the current $langcode, so with this change we will be generating N cache entries for each string, where N is the amount of languages in which that string is translated. We could modify the query to bring all the translations for the given string and put that into the cache, so we reduce the amount of new cache entries. The performance impact on the query should be minimal
Comment #10
joseph.olstadOk, sounds great.
Lets hope the simpletests will pass
Comment #11
joseph.olstadHi @agusluc88 , so are you going to do some further refinements to the patch? as discussed in #9?
great work so far. We'll see what the tests say so far.
Comment #12
joseph.olstad@agusluc88 , feel free to investigate this performance issue as well:
#2879230: Overhead with many vocabularies
Comment #15
agusluc88 CreditAttribution: agusluc88 commentedHere is the patch for the changes suggested in #9. Since load_translations wasn't going to need the $langcode param anymore, i removed it (updated all the invocations to that function) and changed the name to load_translations because now is loading all the translations for a given string. This patch should generate way less cache entries.
Comment #16
agusluc88 CreditAttribution: agusluc88 commentedAbout #2879230, it doesn't seem to be related to this change, but i will try to take a look at it.
Comment #17
joseph.olstadRun the testbot
Comment #18
joseph.olstadWake up bot
Comment #19
joseph.olstadWake bot
Comment #21
agusluc88 CreditAttribution: agusluc88 commentedMessed up the patch file, updated.
Comment #22
joseph.olstadtrigger testbot
Comment #24
agusluc88 CreditAttribution: agusluc88 commentedThe failing test is probably related to i18n_string_object:get_cid() function using the namespace "i18n:string:obj:" as base for all i18n_strings and the function i18nBlocksTestCase:testBlockTranslation using a different namespace "strings[blocks:block:" for the blocks. When the string of the block gets updated, most likely the i18n:string:obj:XXX entry created in the load_translations function is not being expired. Seems an inconsistent behaviour in the i18n_blocks submodule, don't know why blocks have a complete different namespace separated from all the other i18n_strings.
Is there any instructions available of how to run the test suite locally?
Comment #25
joseph.olstadIt's super easy to run tests locally.
Install drupal core, download and install i18n and it's dependancies.
Then enable the simpletest module.
Then admin development tests, the tests menu link will be somewhere in there. I ran it the other day locally but I forget exactly the location of the tests link or path, search around the admin configuration and sub menus of that and you will find it.
Make sure your server php memory limits are high enough, 512mb is plenty. Also the max allowed packet setting in MySql, if you have problems increase it to 64mb or so.
There was one or two php.ini memory limit settings I had to increase otherwise simpletests all crashed.
It's much better to run tests locally, and also faster, you can insert debug code and throw debug or dump to file as needed for debugging.
Thanks for your hard work
Comment #26
joseph.olstadHere's a simpletest tutorial, probably better explained than my explanation.
https://www.drupal.org/docs/7/testing/simpletest-testing-tutorial-drupal-7
Comment #27
agusluc88 CreditAttribution: agusluc88 commentedgreat, i will take a look as soon as i get some free time.
Comment #28
joseph.olstadI'd love to see some better performance from i18n, perhaps look at the Fabianx related issue
Fabianx created a related issue related to this.
Wondering if you could have a look, maybe it would make sense to work on the Fabianx issue and see if that is easier.
#2891270: i18n_string_textgroup_cached always caches strings at the end of the request
Comment #29
agusluc88 CreditAttribution: agusluc88 commentedI couldn't make the test suite work, even with a clean install of Drupal 7 core and the i18n module i'm getting tons of fails. Seems like every test that requires login is failing for some reason.
Comment #30
joseph.olstad**edit**
Comment #31
joseph.olstadagusluc88 , it would be really nice to get this in. I ran simpletests on this, it's the block translations that are failing as you mentioned.
would there be any way to just cache normally all the strings except block strings? and block strings get the pre-patch original treatment?
Comment #32
m.lebedev CreditAttribution: m.lebedev commentedThe last patch is good. Thanks!
Comment #33
joseph.olstad@agusluc88 , I get the same test results with i18nBlocksTestCase
not sure why but the block translations are not passing tests.
would be nice to figure out why.
Comment #34
joseph.olstadFYI: Stevel provided a fix to the test code, it might be related so I've re-triggered the tests for the above patch.
I'm curious to see what the test results will look like.
It would be great to have a nice performance boost like this.
Comment #35
gwynnebaer CreditAttribution: gwynnebaer commented@joseph.olstad, @agusluc88, I have had a suspicion about cache performance in i18n for some time, and this issue is very timely. I have both time and systems that I can use to help with this issue.
We have always had persistently poor caching performance on a very complex site, and had narrowed it down to i18n caching.
Let me know how the tests turn out and I can apply the patch and see if it improves our issue.
Comment #36
joseph.olstadThe test shows same results, strings for blocks not passing tests. If you can figure this out you will be a hero.
Comment #37
gwynnebaer CreditAttribution: gwynnebaer commentedI ran simpletest on the patch on my local D7.56 install, received two block related errors. I am looking at them now.
Comment #38
joseph.olstadgwynnbaer , any luck with the block strings?
Comment #39
joseph.olstadIs there an easy way we could cache only strings that are NOT block strings? this way we wouldn't get these errors in theory?
anyone want to make this happen?
Comment #40
joseph.olstadBounty for fixing this.
I will donate 100$ US funds to the charity of your choice to the first person that successfully fixes the block strings cache fail.
Comment #41
joseph.olstadas critical as performance issues can get.
Comment #42
m.lebedev CreditAttribution: m.lebedev commentedHello.
I performed the test blocks and found that caches not cleaning when strings update.
Comment #44
m.lebedev CreditAttribution: m.lebedev commentedFix paths in the patch
Comment #45
joseph.olstadWow great stuff @m.lebedev
I will need some time to review and make an interdiff. Run performance profile tests to confirm positive results .
Comment #46
m.lebedev CreditAttribution: m.lebedev commentedI checked a high load page:
Before:
Executed 426 queries in 258.28 ms. Page execution time was 575.99 ms.
After:
Executed 341 queries in 208.91 ms. Page execution time was 484.81 ms.
My site is bad as example. But I want to help with problem. Any help for a site performance is well.
Comment #47
joseph.olstadExcellent, yes that is a significant improvement. I will try to run my own tests. How did you get the performance results? xhprof? or your own code?
Comment #48
m.lebedev CreditAttribution: m.lebedev commentedI used the module Devel to get the performance results. It's a simple way.
Comment #49
joseph.olstadI confirm significant performance improvements.
On a slow virtual server with a slow conventional hard disk using php 5.5 and an older version of mariadb
before patch:
6700 milliseconds
after patch
5200 milliseconds
this is on a very heavy page logged in as administrator with devel module enabled and query stats enabled.
Need to review some more.
Initial preliminary review of patch looks great.
Comment #50
joseph.olstadThe latest patch still needs more reviewing, I did notice that the test failure code was removed, this might imply that the pass is a fake pass. Or else there was a problem with the test however we should really make sure we're not bringing in string translation regressions here.
It would be so nice to get this patch into a release asap, because it is a huge performance improvement. It might need some adjusting yet.
I am still going to honour the bounty on this, actually I will increase the bounty to 200$ U.S to the person that writes the patch that will be committed into the next release 7.x-1.23 (regressionless patch), this might be split up between two people. so , 200$ U.S bounty for this, mark my word, I will pay, but it has to be a patch that will give a significant performance improvement without a regression. I am still not 100% confident that the latest patch does not bring in a regression because the tests were modified and I have not yet confirmed if this is appropriate to do. I would guess that the original simpletest was there for a reason however I have not yet confirmed.
Comment #51
joseph.olstadBounty is increased to 200$ U.S
Comment #52
joseph.olstadComment #53
joseph.olstadComment #54
joseph.olstadComment #55
m.lebedev CreditAttribution: m.lebedev commentedThe test is successful because the translation cache is updated. What is the problem?
What needs to be improved?
Comment #56
joseph.olstadHi @m.lebedev , ok, I had another look at the patch and the change to the test code, it actually looks good. Makes sense , the new function load_translations was added and does not need a langcode.
Would you be ok to split the bounty with agusluc88 $ so 100$ U.S for you also 100$ U.S to @agusluc88 for his original work on this patch?
If you are ok with this, I will pay out this bounty 1 week after the release of i18n 7.x-1.23 , provided no one complains of regressions.
I will release 7.x-1.23 with this improvement in the next day or two. So you should get paid next week if all goes well with the release.
Thanks
Comment #57
joseph.olstadNice work, really hope this release rolls out smoothly. This is a very important improvement.
Comment #58
joseph.olstadHmm, maybe RTBC too quickly,
is there any other calls to load_translation that should be changed to load_translations?
wondering if somewhere in contrib this might be problematic, so in both i18n and in contrib, if we have a look at webform_localizations (the module code) wondering if it is using load_translation() and might need update to load_translations()
this is a pretty important change, maybe need a few more days of review.
Comment #59
m.lebedev CreditAttribution: m.lebedev commentedThanks. But I helped you for free.
I think that Drupal should be developed for free.
Comment #60
joseph.olstad@m.lebedev thanks very much for your generosity. I think we should commit your patch into the dev branch right away, it looks good, I tested it and did not notice any issues, however maybe we leave it in the dev branch to simmer a bit before releasing it.
Comment #61
joseph.olstadm.lebedev , have you been using this patch in a production environment? have you noticed any issues? anyone else using this in a production environment?
Comment #63
joseph.olstadfixed in dev branch 7.x-1.x
Thanks very much, this is a very important improvement.
Comment #64
m.lebedev CreditAttribution: m.lebedev commentedToday I applied a patch to one of the projects and found that names of fields did not translate in an add/edit form entity. I think it happened when I added new field to entity.
I will try found the cause of the non-working translate in the coming days.
Comment #65
joseph.olstadso you used patch 44 ?
prior to patch 44 this test succeeds?
so, try revert patch, cache clear all
repeat scenario
then try apply patch, cache clear all
repeat scenario.
different result?
Comment #66
m.lebedev CreditAttribution: m.lebedev commentedI understood why the error occurs. I will try to explain.
The I18n module includes several classes for working with translations.
For example:
- i18n_string_textgroup_cached
- i18n_string_textgroup_default
etc.
Fields use the i18n_string_textgroup_cached class for translation, because it is specified in the i18n_field module.
https://cgit.drupalcode.org/i18n/tree/i18n_field/i18n_field.i18n.inc#n69
The i18n_string_textgroup_cached class has a destructor in which data is cached.
https://cgit.drupalcode.org/i18n/tree/i18n_string/i18n_string.inc#n1406
If you apply the patch, the data is saved into the cache immediately after a each query and the second time after the class is destroyed (if the object is i18n_string_textgroup_cached class).
the cache of fields will be overwritten if a new field will be added to the entity. The cache not contains data for translations. Names of fields will not be translated.
And that is not all.
The i18n module translates other properties of the entity.
These properties use the i18n_string_textgroup_default class. The class does not have default caching.
The patch adds forcing data caching for these properties.
I use memcache. Cache set/get will be use memcache.
https://cgit.drupalcode.org/i18n/tree/i18n_string/i18n_string.inc#n669
This creates false performance.
i18n_string_textgroup_default class is default and should not cache data.
I think that the patch contains incorrect changes for the module.
In order to cache data that does not fall into the cache, you need to use the module's API, as it was done in the i18n_field module.
The load_translations function should only return results.
Caching must be performed in the i18n_string_textgroup_cached class destructor.
Comment #67
joseph.olstadOK I will revert this commit soon
Comment #68
joseph.olstadBut when cache clear is called the translation shows up?
Comment #69
m.lebedev CreditAttribution: m.lebedev commentedYes, it is. Translations will be updated, if the cache is empty.
Comment #70
joseph.olstadhmm, ok, so could be an annoyance, but not critical fail.
I wonder if this only affects those using memcache.
Comment #71
joseph.olstadm.lebedev, do you want me to revert your patch? Do you plan on implementing your suggestions in #66 in a new patch?
Comment #72
m.lebedev CreditAttribution: m.lebedev commentedAs I said earlier, I think the patch adds incorrect changes to the module.
Sites may have problems after applying this patch.
I'm not sure what my suggestions will be useful. And perhaps it will take a lot of time to implement.
The current task requires the collection of information. Maybe someone else can help with this task.
Comment #73
m.lebedev CreditAttribution: m.lebedev commentedQuick test:
This class caches translations.
The page has ~100 elements for translation. Cache warmed.
Before:
Executed 164 queries in 87.21 ms. Page execution time was 370.76 ms.
After:
Executed 54 queries in 33.39 ms. Page execution time was 294.99 ms.
I did not check for problems after applying the changes. I tried to change the translations of elements and they changed correctly.
Maybe it will useful... (if you use the memcache module)
Comment #74
joseph.olstadm.lebedev , when I tested the patch I did not add any other code and I measured a significant improvement.
Is this alter hook a new approach of some kind for additional improvement?
Comment #75
m.lebedev CreditAttribution: m.lebedev commentedMy approach is an alternative solution and it is "drupal way". The patch and this approach are equal each other.
I suggest adding additional settings for the i18n_string module (admin/config/regional/i18n/strings).
These settings will allow you to enable caching for a specific translation group (see example). Users will be able to select the desired settings.
Comment #76
joseph.olstadok, I like this suggestion.
However, the functionality as-is works quite well..... I think we should publish the release and create a new issue for your suggested cache settings per translation group.
Do you agree with releasing this? we can add your suggested cache settings per translation group post-release.
Release 1.23?
Comment #77
joseph.olstad@m.lebedev your patch looks rtbc to me, suggest we bump suggestion into new issue after a release.
Sound good to you?
it is normal that people would need to do a cache clear for newly updated strings. we need this caching.
Comment #78
joseph.olstadfor any related concerns, please followup in this issue
#2958586: enable caching for a specific translation group
or if appropriate, create another new issue and link it to this
Comment #79
m.lebedev CreditAttribution: m.lebedev commentedDo as you see fit.
I already told you about the problems after applying the patch. In my project, the patch causes problems and it can not be applied in production.
The decision is yours.
After the release, I will have to do a reverse patch for the project. Perhaps I could not explain to you why the patch is bad. Sorry...
Comment #80
joseph.olstadok, perhaps I did not fully understand the problem you raised. A clear cache is something that people would expect to do if the cache is not updated. I'll have to reread your comments.
So m.lebedev, should we hold off on:
#2958586: enable caching for a specific translation group
?
Comment #81
m.lebedev CreditAttribution: m.lebedev commentedI'm going to create a patch which add new settings to enable caching for a specific translation group.
I plan that this patch (from this issue) will be revert and new settings will be added to the module. Then the issues will be solved.
Thanks.
Comment #82
joseph.olstadThanks in advance, looking forward to this! From my tests, the performance improvement is noticeable without even profiling it. That means very good improvement!
Comment #83
joseph.olstadrelated issue is fixed. It is in dev, should there be new issues discovered with this caching technique we can optionally roll back (revert) or patch any regression fixes on top.
Let this sit in dev for a bit , maybe 2 or 3 weeks and then publish a release.
Special thanks to m.lebedev and agusluc88 as they both did a lot of heavy lifting here! great work
Comment #84
m.lebedev CreditAttribution: m.lebedev commentedThe new settings will not work until this commit will not revert.
I don't have time to prove my words. I have written above why you need to do it.
I repeat the last time that this commit creates problems in the work of sites. My job will no longer allow time for discussion of this issue.
Thank you for your time.
Comment #85
joseph.olstadSorry m.lebedev, I am confused. My offer still stands for a bounty if you want it.
the cache to load_translations is a significant improvement.
Here is the patch to revert the cache to load_translations.
Comment #86
joseph.olstadpatch 85 is patch 43 in reverse.
Comment #88
joseph.olstadreverted the commit
so for more performance enhancements, please open a new issue
If I am understanding things correctly the performance enhancements were done using a different approach in:
#2958586: enable caching for a specific translation group
I will try to run some performance benchmarks on 2958586 later next week to see if it improves performance. I have a dev system that uses opcache.
Comment #89
joseph.olstad#2958586: enable caching for a specific translation group