Problem/Motivation
Part of:
#1577902: [META] Remove all usages of drupal_static() & drupal_static_reset()
#3566536: [meta] eliminate core .module files
#3215707: [META] Modernize Locale module
Proposed resolution
Deprecate and move to CurrentImportStorage service
- locale_translation_get_file_history no direct replacement
- locale_translation_update_file_history
- locale_translation_file
New method: get that returns and caches the current import information,
New CurrentImport value object that stores import info. These functions, especially update, used to return and work with a mix of different untyped classes, now get and save only
This results in major performance improvements for sites with many languages. On a project with 4 languages, we identified around 500 database records, projects with 100 languages will have thousands of records, which were fully loaded every time this information was used and often written back, which invalidated all that again. Now we load records one-by-one and on updates, invalidate them one-by-one.
Move cron method to it's own class and move the db calls to CurrentImportStorage.
Remaining tasks
User interface changes
N/A
API changes
- New service
CurrentImportStorage. - Deprecated procedural functions:
locale_translation_get_file_history()locale_translation_update_file_history()locale_translation_file_history_delete()
- New value object CurrentImportState
- A number of minor changes around the behavior of these functions and how they are used. update() no longer returns a database insert/update constant, we no longer store this information as a fake-file on the source, we do fewer unnecessary merge queries, we removed the type from the return value as it was unused that was just derived information from the timestamp. More details on this in comments on the MR
Note: All these functions have zero usage in contrib. It is a mostly internal thing of the source/translation status API (which we will tackle next) which persists some of it's information while the source/status objects themself are only a keyvalue store that is at times completely cleared. We explicitly made this @internal now as we expect some minor changes in later issues, such as adding a type for the source object.
Data model changes
The new value object, but it mostly uses the same properties as the untyped object before·
Part of the confusion here was that the functions are called file history, the database table is called locale_file and has filename/uri columns. But these are/were never written. This isn't about files, it's about project/langcode combinations, the translation status/source as it's called. Now that we fully isolated access to the database table, we could rename and remove unused columns from it in a follow-up issue.
Release notes snippet
N/A
| Comment | File | Size | Author |
|---|---|---|---|
| #59 | paths_sourceBuild.png | 5.11 MB | nicxvan |
| #56 | Screenshot_20260516_104239.png | 111.62 KB | berdir |
| #56 | Screenshot_20260516_102835.png | 80.67 KB | berdir |
| #25 | Web capture_14-1-2022_135333_om-d8.ddev_.site_.jpeg | 490.62 KB | heddn |
Issue fork drupal-3037156
Show commands
Start within a Git clone of the project using the version control instructions.
Or, if you do not have SSH keys set up on git.drupalcode.org:
Comments
Comment #2
claudiu.cristeaComment #3
claudiu.cristeaComment #4
claudiu.cristeaPatch.
Comment #5
andypostmaybe use [] instead of null?
looks usage of constant makes this service non-unit testable, maybe just add this to query to return required value?
looks reset used only in tests, so does it need to be public api?
Comment #6
claudiu.cristea@andypost,
#5.1: I intentionally used NULL as "no cache". Because we want to cache also the situation when there's nothing in the table. So
[](cached but nothing in the table) is not the same asNULL(not cached).#5.2: I'm assuming that those constants will be deprecated too as all other global namespace constants. I would not fix that here. For sure there will be an issue for that.
#5.3" Yeah, I was thinking on that. Anyway, I guess there should be a public method to clear the cache even not used for now. Or you suggest to keep it public but not on interface?
Comment #8
andypostIt makes sense to keep scope creep
otoh While we factoring service makes sense to file follow-up and discus proper implementation
Over time this history needs clean-up for outdated translations and same time mostly all entities in core are revisionable
Also static usage in history service stuck on storage for any core entity
The
LOCALE_TRANSLATION_CURRENTsounds like duplicate ofFIELD_LOAD_CURRENTso Makes sense to move this constant to new history serviceComment #11
hardik_patel_12 commentedRe-rolling against 9.1.x-dev.
Comment #13
snehalgaikwad commentedComment #14
snehalgaikwad commentedComment #15
snehalgaikwad commentedFixed failed test case.
Comment #18
daffie commentedThis is now done with the use of the method $this->expectDeprecation().
All tests in this class are legacy. Therefor the tag @group legacy can be moved to the docblock of the class and the ones in the docblocks of the tests can be removed.
Every test needs an assertion. My recommendation would be to test the return value of each function call with an assertion.
Can we remove this method from the interface. See the comment #5.3 from @andypost.
Can we add to the docblock that this method is @internal.
I think it is ok to remove return value. The function is not used in contrib. See: http://grep.xnddx.ru/search?text=locale_translation_update_file_history%... Only it is a API change and it needs to be documented in the IS and the CR.
Can we change the text to: "The import history as an associative array that is keyed twice. First by the project ID (e.g. 'drupal', 'pathauto', etc). The associative array is keyed then by the language code. The value is the file as an \stdClass object with the following keys:"
Comment #19
andypostComment #20
dhirendra.mishra commentedRe-rolled the patch manually for 9.3.x as it was not applied automatically
Comment #21
gábor hojtsyThere were various code style fails and a spelling check fail (
LocaleImportHistory.php:30:12 - Unknown word (databse)) , see https://www.drupal.org/pift-ci-job/2123269Comment #22
yedhukrishnan.p@valuebound.com commentedFixed the fails. Please check.
Comment #23
daffie commentedThe points from comment #18 still needs to be addressed.
Comment #25
heddnThe last few patches have dropped the interface and implementation of LocaleImportHistory. NW to resolve this and the feedback in #18.
I'm hoping the conversion here will help with performance. We're currently using an install profile that adds multiple contrib modules and languages. We see calls to these functions around 96% of the execution time and taking almost 2 hours to run with 8 or 9 languages.
Comment #26
heddnAfter reviewing the code, my suggestion is to add an optional argument to get/update/delete to include the project name. It would significantly reduce the thrashing on install. If that is out of scope, maybe it should be part of scope. Since we are already changing the API a fair bit here. And deprecate calls to these services that don't provide them. If they do provide them, limit the query scope. If they don't, trigger a deprecation as of drupal 10 or 11 that they will be required.
Comment #31
claudiu.cristeaComment #32
nicxvan commentedComment #35
nicxvan commentedThank you for working on this!
I have a bunch of comments on the MR, I'm happy to answer any questions you may have.
Please don't get overwhelmed, there are just a bunch of conventions we have worked out in the .module initiative we need to apply here.
Normally I would provide MR suggestions you could apply, but I'm on my phone and some would be cross file and multi line.
Comment #36
quadrexdevThanks for providing some feedback. I'll check this out and work on it today
Comment #37
nicxvan commentedGreat! #2907780: Add a field purger service has examples for most of the comments I made.
Comment #38
quadrexdevAddressed the feedback. Please review.
I noticed that there's an error in PHPUnit Functional Javascript 2/3 job but it looks like it's unrelated to the current issue changes.
Would be happy to finalize this issue if something is needed.
Comment #39
nicxvan commentedThe Functional Javascript is a known random failure, you can rerun the job manually.
There are a couple more minor notes.
Unfortunately there is one larger issue that I think means we need to postpone this on #3577671: Modernize locale file handling
Edit: quick note that the referenced issue is RTBC and is pretty close, so this should not be postponed too long.
Comment #40
quadrexdevAddressed remaining minor notes.
Comment #41
andypostConcerns in #25 and #26 remains to verify as performance metrics of install is viable
Comment #42
nicxvan commentedI don't think that is in scope, and it won't really help since get gets the full history.
We could maybe create a new get for a single project that would help, but that would be a follow up for sure.
Update already only updates a single file so no change would be needed there.
Delete history let's you define projects already.
Comment #43
nicxvan commentedThe other issue is in, this can be updated to use LocaleFile.
Comment #44
nicxvan commentedUnfortunately I think we need to postpone this again, the compare issue is touching a bunch of things, I think the plan now is to do that issue, then the batch functions then either this or the constant ones.
I'm only going to postpone this on one for now since this may be better to address earlier once I start the batch one.
#3037031: Convert locale.compare.inc to a service
Once this is ready we need to use the LocaleFile and a memoryCache instead of a static cache.
Comment #45
nicxvan commentedComment #46
berdirBoth #3037031: Convert locale.compare.inc to a service and #3581303: Convert locale batch callbacks is in, I think this is the best next step.
Comment #47
nicxvan commentedComment #48
nicxvan commentedI'm working on this.
Comment #50
nicxvan commentedOk I recreated the MR since there were so many conflicts, I pulled the information from existing MR.
I'm going to work on the LocaleFile bit then look a the memory cache for individual calls.
Comment #51
nicxvan commentedOk I've converted the database call to a value object.
If we want to convert updateHistory to use a value object we would need to do #1842380: Convert $source object to a LocaleTranslationSource class here.
It would be nice to have one normally scoped issue though for locale, so I'm not sure we do this here.
I took a quick look at splitting up the memory cache, I think the only way to really do this is to add a single get call, if there is a miss we then load all to populate, and we have no way to get all projects. Does that seem right?
Comment #54
nicxvan commentedOk I added a v3 version that replaces the get all with an individual project and language, I'm curious if we should add a get all as well I think this will be way more performant and I'm not sure we need a get all at all.
There are no usages of
locale_translation_get_file_historyin contrib.Not waiting on tests at this hour so I'm not sure they are passing on v3.
v2 is failing too, I'll track that down later, but the approach would be good to verify.
Comment #55
nicxvan commentedSame test is failing in both so I just gotta fix it and we can decide on a direction.
Comment #56
berdirI've been trying to a bit of profiling with a bunch of languages on our distribution with a ton of projects. One thing I've confirmed is that if there's no file history info yet on 11.x, then we get exactly into the loop of load all -> update -> invalidate -> load all again. invalidating just the changed one should fix that. That only happened for a language where I aborted the initial batch operation that fetched the translations. not just the updates, but also the get history calls:
You can see that there are only 24 calls to buildSource(), each then updates that info.
A different scenario is when locale_translation_clear_status() happens, because that clears the locale.translation_status key value store. This doesn't happen often, but it can happen. it used to happen when saving the file settings form, it still happens when saving the locale settings form and changing the translation source. What I expected to happen then is a lot of calls to buildSource() and to get file history, but none/few to update. I did get 520 calls, but strangely, I also get 520 update calls. Something about that timestamp logic doesn't seem to add up, I'm not sure what needs to happen for that to be set to trigger that case.
=> There is a possible scenario where a getAll() in theory might be beneficial, but it's really not possible to know when upfront, so I don't think there is a place where we could say, I'm going to load all because I know I'll need them. So I think it's fine to only support single get and update, that will in most cases be a significant improvement. What I think *is* problematic is calls to locale_translation_clear_status(), because on a site with 100 languages and 50 projects, that will result in 5000 sourceBuild() calls after a cache clear. It would probably be better to trigger a rebuild as batch then afterwards and not do it on one request on the reports page. But the current refactoring will not make things worse, it will more likely make several things significantly better, such as not loading all translation status data + file history data on every request when it's not needed.
Comment #59
nicxvan commentedLocale is the gift that keeps on giving.
Had another long discussion with @berdir on slack about this and came to the conclusion yet again that this is not LocaleFile.
To that end we created another LocaleSourceInfo value object for tracking the info.
I created a map of all of the ways to get to sourceBuild since that is where things get muddy again. It doesn't necessarily show the logic flow entirely but it shows where you can start and end up in sourceBuild.
I created a v4 to show this approach, it is green again so I think it's ready for review.
I also changed update to just update the memory cache directly now that we have the info we need.
I'm going to hide v3 and update the issue summary and change record.
Comment #60
nicxvan commentedComment #62
berdirReviewed. I think v4 makes sense as a concept. translation status is still a mess, that we need yet another value object, Need to think about about the class and method names. And possibly the final question to me is if we want to move the direct query in the cron hook in this issue or a follow-up.
Comment #63
nicxvan commentedOk I renamed this, I also moved locale cron to it's own class so we could move the storage interactions to the storage class.
Comment #64
nicxvan commentedComment #65
berdirReviewed, I think we're getting close.
Comment #68
nicxvan commentedI think it's ready for review again!
Comment #69
berdirOk, I feel comfortable RTBC'ing this now. I also made a lot of updates to the issue summary now.
Note that we basically pair-programmed this. I didn't write any code (just made a few suggestions on comments) but we both spent hours mapping and understanding the functions calls and data flows around this and then more hours discussing naming, how to isolate and separate this data structure from the others and so on.
Considering the starting point, I'm quite happy with the result. We went from functions that, despite only having 1/3 calls, were so confusing that it took me an entire Saturday morning to have a basic understanding of what they actually do to a storage service that fully encapsulates this with a typed and documented value object, separating it from the various other data objects that are floating around. See the comments on the v3 merge request on my discovery journey: https://git.drupalcode.org/project/drupal/-/merge_requests/15774
We did our best to improve and clarify (previously barely existing) documentation. I'm sure it's not perfect, but I think it's a clear improvement. Keep in mind that absolutely nobody in contrib and presumably custom code directly uses any of this, it's entirely internal and now documented as such.
Comment #71
catchThis looks like a huge improvement, as mentioned there is more to do, but feels like we're getting to a much more organised point very quickly.
Committed/pushed to main - will need a backport MR for 11.x
Comment #74
nicxvan commentedThose should be ready.
The conflict was the use statements in locale.module.
I also fixed autowire.
I asked berdir to review if we would have issues with readonly properties and I think will be OK.
Comment #76
catchCommitted/pushed to 11.x, thanks!