Problem/Motivation
EDIT: Now that #3498154: Use LRU Cache for static entity cache is in, there is no longer a need to create an entity subscriber, only remove the memory management code from MigrateExecutable.
MigrateExecutable has a lot of code devote to managing and reclaiming memory, since memory usage is quite relevant to migration. There's nothing wrong with this code per se, but we'd like to remove MigrateExecutable entirely. To that end, let's move all the memory management code into a new internal class (Drupal\migrate\MemoryManager) and have MigrateExecutable's constructor instantiate it.
The new MemoryManager class will be explicitly internal because it is not an API. Eventually we may want to add an interface to it (and possibly plugin-ize it), at which point we can remove the internal designation, but that's all way out of scope for this issue. For now, we just want to remove as much code from MigrateExecutable as we can.
Cool enough I implemented it and replaced my wonky hook_row_alter :-D
I don't like the setMessage but our message "service" logic is just all over and I wasn't sure how to handle it.
Remaining tasks
Change record - Explain why the threshold is 0.9 and the memory limit is 0.85Inject the newEnvironmentMemoryclass from #3000050: Replace \Drupal\Component\Utility\Environment::checkMemoryLimit with a PHP memory checking helper class and service into the MemoryManager class and remove redundant methods likegetUsageInBytes().Instead of setting integer properties in the dispatched event, pass theEnvironmentMemoryobject. Update the event subscriber.Update the MemoryManager test to use a mocked version ofEnvironmentMemory.Implement the suggestion in #79.2. Or argue against it.Respond to the suggestion in #80.UsereturnCallback()as suggested in #82 or open a follow-up issue to do this. See also the last part of #83.Remove Memory Manager and event subscriber- Add tests
- Rewrite the change record. Or maybe delete it.
| Comment | File | Size | Author |
|---|
Issue fork drupal-3006750
Show commands
Start within a Git clone of the project using the version control instructions.
Or, if you do not have SSH keys set up on git.drupalcode.org:
- 3006750-migrate-memory-management
changes, plain diff MR !5156
Comments
Comment #2
phenaproximaA first attempt, to see how many tests crap out.
Comment #3
heddnThis too. A constant on the memory manager.
Can we move this magic number into a constant?
Comment #4
phenaproximaDone.
Comment #7
heddnThis moves more things into the memory manager.
Comment #8
phenaproximaI do not think we should be moving the reporting stuff into MemoryManager. It is a big violation of the single-responsibility principle, and makes the class harder to maintain.
One possible compromise I'd be willing to make here is to create a service that extends MemoryManager. The service could bring in the logging and string translation services, and do the logging in its override of isLimitExceeded(). That service could be marked internal as well, and then MigrateExecutable could use it.
But ideally, we'd just keep the reporting code in MigrateExecutable for now.
Comment #9
phenaproximaSending back to "Needs work" until we hammer out a solution here.
Comment #10
quietone commentedphenaproxima asked me to look at fixing the failing test in #4 and I didn't get it done and heddn beat me to it anyway. Nice to see the tests fixed but I think that adding all the messaging clutters the MemoryManager and makes it harder to concentrate on what it is doing. I've always been a fan of doing one thing well and would prefer to see that done here.
How about a follow up to deal with the messaging?
Comment #11
neclimdulThere's nothing wrong with this code per se, but we'd like to remove MigrateExecutable entirely before Drupal 9.Really? How do we expect migrate to work without it? I can barely get some migrations to run _with_ it and generally have to write weird hacks into hooks to get drupal's long process memory usage under control. I don't think this is realistic.
I _do_ think moving it to a class has a lot of benefits though. There are actually problems as I hinted at with the code like there's no way to tune it, effectively use migrations it if memory_limit was set to '-1', or if you have other logic you need to implement.
Moving it to a class would let it be pushed on the container, inject tuneables like limits and thresholds, and be replaced.
Comment #12
phenaproximaI mean, obviously we would factor stuff out of it so that it would be a shell of a class. The point is that there is nothing in MigrateExecutable that needs to be in its own class. All of it can and should be merged into the main Migration class.
Comment #13
neclimdulReading your response and the additional clarity from slack I think I understand. Marking the memory management internal and discussions of killing the Executable I miss-interpreted that as expecting to kill the memory management in 9.
I think making it internal and not making an interface misses a big opportunity. For the past 3 years I've struggled with Drupal's caches and migrate's memory management. Its _almost_ bullet proof now but the arbitrary thresholds and assumptions around things like memory_limit=-1 can force some really weird site builder compromises... Like the hook_row_alter I had to to do futze with clearly caches when Migrate's didn't work for me.
Making this a service that can be replaced or tune would be a really cool developer win. So much so I implemented it and am applying it and using it instead of my hook_row_alter hackery! :-D
Comment #14
neclimdulOn the messaging discussion, that's actually the documented logic from checkStatus(). Since there are lots of messages on reclaim successes and they're all related to internal values of the class I don't really see a way to do it other than in the memory managment service.
Comment #15
heddnSomething I've seen done with several places in core and contrib is to fire an event when certain things happen. What if we fired an event when we decide we need to reclaim memory? Then subscribers can decide to actually reclaim memory, log errors, dial home and/or do some fancy thing to drop memory even /more/ than what comes with core?
Comment #16
phenaproximaI like that idea a lot. It lets us pull code out of MigrateExecutable, while still keeping the messaging in place and leveraging existing structures in core...all without having to introduce new APIs. And it makes things flexible for contrib and custom code. That's a lot better than my original idea. Let's do it!
Comment #17
phenaproximaRe-titling to reflect the new direction.
Comment #18
heddnI didn't update tests, so this will fail. But this at least gives us an idea of what moving things into an event subscriber could look like.
Comment #20
heddnTriaging issue queue.
Comment #23
quietone commentedReroll. Fixed a missing quote in a/core/modules/migrate/migrate.services.yml which will reduce the test failures.
Comment #25
quietone commentedCoding standard fixes.
Comment #27
quietone commentedChanged the unit test to use a dataprovider and allow for a reclaim of memory.
Comment #28
quietone commentedUploading the patch is always a good idea.
Comment #29
andypostQuick review
Would be great to have a comment why this settings default, probably in CR
why it can be int?
better to inject string_translation service
this strings could live in manager interface
It needs deprecation fallback in constructor, the new method used only once
Comment #30
quietone commented#1. Added an item in the IS so this doesn't get forgotten.
#2. I think this is copy/paste and since int doesn't make much sense for a ratio, all occurrences are now float.
#3. Done
#4. Added strings to the interface. Maybe better names are needed but nothing coming to mind today.
#5. Not sure what is expected here. What is to deprecated and is there a problem with a new method being used only once?
The patch includes code standard fixes as well.
Comment #31
mondrakeI wonder if anybody would be interested in getting #3000050: Replace \Drupal\Component\Utility\Environment::checkMemoryLimit with a PHP memory checking helper class and service in, in order to have a common memory checking component. I have been trying to improve memory managemnt in the context of GD image toolkit since #2583041: GD toolkit & operations should catch \Throwable to fail gracefully in case of errors, but without success so far.
Comment #32
andypostIt looks mostly ready! Thanks a lot moving forward 👍
Meantime
@TODO: explore resetting the container.needs follow-up link to passAlso "memory checker" should be memory manager as it more common, and could care about GC used right after the todo
Comment #33
heddnNW for #32.
Comment #34
quietone commented32. Made a new issue for the long standing @TODO, #3128793: Explore resetting the container when reclaiming memory
What is 'memory checker' referring to? A `grep -ri 'memory checker' core` returned nothing, so no change made to this patch.
Comment #36
mikelutz#25.5 is okay. Nothing is deprecated and the argument is optional, the class uses a getMemoryManager() method which will get the service if it isn't passed in. It's all a little wonky, but MigrateExecutable isn't a service, so we can't force the dependency injection easily. We could deprecate NOT passing the memory manager service to the executable, but that would require changing the order of constructor arguments, and if the goal is to deprecate the executable entirely in D9 (and I still think it should be) then there isn't really a point.
I think the patch is good. I'd RTBC it, but I'm setting to NW for the missing CR in the IS.
Comment #37
heddnCR added
Comment #38
quietone commentedThe CR look fine to me. There is just one thing, the IS has a task for the CR, "Explain why the threshold is 0.9 and the memory limit is 0.85". I would update the CR but I don't know why those values where chosen, only that they are the same as the existing values.
Comment #39
quietone commentedI was wondering if there was more information in the original commit of MigrateExecutable about the threshold values. It was commited in #2125717: Migrate in core: patch #1 and I didn't find any history of those values there.
And, the request for adding more information about the threshold is from #29.1
Comment #40
heddnCR updated and minor tweak to use constants rather then magic strings.
Comment #41
andypostmaybe better to use
self::CONSTANTas class implements the interface by itselfComment #42
heddnComment #43
quietone commentedReplaced string with the constant REDUCED_ENOUGH_TO_CONTINUE.
I think the CR needs another tweak "But if after reclamation at least 5% or (or less then 85%) is reached," is a bit confusing but I'm not sure about the 5%.
Comment #44
heddnCR updated.
Comment #45
catchComment #46
benjifisherI have started to look at this issue. I may have more requests as I continue.
If the parameter
$memory_limitis an integer, then we do not need to applyBytes::toInt()to it. If we want to allow strings, like128M, then change the@paramcomment.The remaining usage of
Bytes::toInt()should be cast toint, with a note to remove the cast once #3142934: Replace \Drupal\Component\Utility\Bytes::toInt() with \Drupal\Component\Utility\Bytes::toNumber() due to return type goes in. Alternatively, we could just make it a float (here and in the rest of the patch).What do we want to do if
$memory_limitis supplied and equal to-1? Depending on the answer to that, and to (1) above, this might be simpler:In three places, we create and dispatch an event, and the only difference is the "phase" constant. I think that is enough to justify adding a helper function.
If the function is rewritten to exit early, then it avoids indenting most of the function body at the expense of a second
return TRUEline.This function does more than just test. It also dispatches up to two events. Can we add a sentence or two to this doc block to say what it really does?
Make it
@param float.Do we need this wrapper function? Is the idea that child classes may override it?
Maybe this answers the previous question:
Is there any reason to make the test method
public?Add "in bytes" in a few places:
Can we say that the phase is one of the constants defined in MemoryManagerInterface? I think that should go here, but maybe on the getter function or both.
Make it "an action", not "and action".
DRY. Define the parameters once, outside the
switchstatement. Just use100insideround(): there is no need for the$ratio_multipliervariable.Do we need this wrapper function?
Comment #47
quietone commentedFixes for #46;
1. Update the @param
2,3. Seems reasonable to all ow -1 so used the simpler option in #3.
4. Helper added as well as return early.
5. Added some comments. Doesn't seem like enough, more like a start.
6. Fixed
9. 'in bytes' liberally applied.
11. Fixed
12. Fixed
13. Doesn't look like it is needed. Removed.
TODO
#46. 7, 8, 10
Comment #49
quietone commentedThe failure in #47 appears to be random, not related to this issue.
7,8. I don't know if it should be protected or public.
10. Not sure here either. Decided to ass a comment to the declaration of $phase, not the getter.
Comment #50
benjifisherI still have a few suggestions based on the patch from #43. I will look at the latest patch on my next review, and I still have to review the tests.
I already mentioned "and action", but I missed that "event lister" should be "event listener".
The getter returns whatever was passed to the constructor, so we should use the interface for the
@returncomment.I do not see any discussion of why we are replacing
memoryExceeded()(returning TRUE if we are out of memory) withensureMemory()(returning FALSE if we are out of memory). Maybe #46.5 is the reason for changing the name: the function does more than just test.If we are going to negate the meaning of the return value, then I would rather swap the two
returnstatements than add an explicit negation (!). Better yet:Comment #51
quietone commentedFixes for #51, 1 , 2 and 3.
It looks like ensureMemory() was added in #13.
Comment #52
benjifisherThe patch in #49 still says "event lister" (#50.1). Maybe you uploaded the wrong patch or had unsaved/uncommitted changes?
Comment #53
quietone commentedSorry about that. This fixes the spelling of 'listener'.
Comment #54
benjifisherThanks for the updates. I still owe you a review of the change to the tests, but here are some comments on the patch in #53:
This is a good start:
But I think we should rewrite the first sentence. In fact, it is worse than that: the first sentence is backwards! Compare #50.3. How about something like this before the
@returncommentIf memory usage is too high, then dispatch a PRE_RECLAIMED MigrateEvent so that event listeners can reclaim memory. Then dispatch a STILL_EXCEEDED MigrateEvent or a REDUCED_ENOUGH_TO_CONTINUE MigrateEvent, depending on whether enough memory was reclaimed.
and then the
@returncomment can be shorter, likeTRUE if memory usage is low enough, or if enough memory can be reclaimed. FALSE if memory usage is too high and not enough can be reclaimed.
I still think that this should be
protected(#46.8):I see the problem. I quoted the wrong lines of code in #46.7. There are two methods named
getUsageInBytes(): one on the event class, which is public, and one declared in MemoryManagerInterface, implemented in MemoryManager, and overridden in TestMemoryManager. The second one is declared and implemented asprotected, but overridden aspublic.In #46.12, I meant to suggest defining the parameters as a single array (before the
switchstatement), not three separate variables. For the record, in reviewing earlier comments I noticed that #3 explicitly requests replacing100with a constant. I disagree: this does not count as a "magic number" in the context of formatting a per cent. Maybe #3.1, like my #46.7, made the mistake of quoting the wrong lines. That is, maybe the intention was to point out the0.85.The
@returncomment forMigrateExecutable::getMemoryManager()still saysMemoryManagerinstead ofMemoryManagerInterface(#50.2).Comment #55
quietone commented1. Fixed
2. Not following this yet. getUsageInBytes() in not declared on MemoryManagerInterface it is only in MemoryManager. Similarly, IsLimitExceeded is protected in MemoryManager and public in TestMemoryManager. Whats the difference?
3. Yep, I planned to do that and completely forgot. Fixed now.
4. Argh! Fixed, again.
Comment #56
benjifisherI seem to be jinxed on
getUsageInBytes()! Of course you are right: if it were declared in the interface, then it would be public.The same question applies to both
getUsageInBytes()andisLimitExceeded(): if they are protected in MemoryManager, then why make them public in TestMemoryManager?I see that
isLimitExceeded()is called by the main test class, so it has to be public: that answers my question. I do not see an answer forgetUsageInBytes().I will not look at the new patch yet. Instead, I will finally review the changes to the tests,
Comment #57
benjifisherHere is a review of the changes to the tests.
Switching the order of the parameters
$memory_exceededand$memory_limitmakes the patch harder to review. Is there a good reason for this switch?Just a guess: earlier versions of the patch made
$memory_limitrequired but kept other options optional, maybe even made$memory_exceededoptional. That would be a good reason to rearrange the parameters. In the current version, all parameters are required, so this reason no longer applies.Maybe I am confused, but the MigrateExecutableMemoryExceededTest class seems to be all wrong.
(More context than the patch supplies.) See #50.3: this patch replaces
memoryExceeded()(on the executable) withensureMemory()(on the memory manager). Yet I still seememoryExceeded()here and in many other comments.Minor point: is there a good reason to move these lines from the
setUp()method to the test method? The move makes the patch a little harder to review.Are we actually testing the memory manager?
It looks as though we are actually testing the
reclaim()method, which is defined in the test memory manager, not the real one. Maybe we are also testingisLimitExceeded(), which is made public in the test class so that it can be accessed here.I think this test class needs to be completely rewritten. If it is actually a unit test for the memory manager, then it should test the memory manager directly instead of working through the executable. It should certainly test
ensureMemory(), the public method on the class. And it should probably be renamed.Fix the
@varcomment.This makes sense: the
import()method callscheckStatus(), which callsensureMemory(), and for this test we want that to return TRUE. I am not sure why the existing test does not have something similar.As long as we are looking at these lines, add a blank line before the assertion, for consistency with the other test cases.
The TestMigrateExecutable class is used twice. In MigrateExecutableTest, the constructor is passed the optional
$memory_managerargument, so we do not need to override the constructor. If we turn MigrateExecutableMemoryExceededTest into a unit test for the memory manager, as I suggested in (2), then we can get rid of this overridden constructor, along with severalusestatements.If we still need this, then I think it would be simpler to do something like this:
You might even combine that into one statement. Another option would be to override
getMemoryManager()instead of overriding the constructor.Comment #58
quietone commentedJust the simpler ones from #57. Fixes for #57.1, 3 and 4.
Comment #60
benjifisherAlso fix the
@paramcomments: I mentioned in #57.1 that none of them are optional (from the function signature) but some of the comments still say "(optional)".The testbot says there are some coding standards problems.
Comment #61
quietone commented@benjifisher, thanks for getting me to redo the tests.
This patch adds testing for the new MemoryManager in MemoryMangerTest and removes MigrateExecutableMemoryExceededTest, since that test is now in the new MemoryManagerTest. Be aware that the test cases in MemoryManagerTest are not necessarily thorough, they were enough to get the tests setup.
Comment #62
benjifisherThis has been hard, but I think we are getting to the cleanup stage.
The testbot reports two coding standards problems (really one problem triggering two notices).
The coding standards are not explicit on how to break long function calls, but the usual practice is to put the closing parentheses on a new line. Here, and in the other two cases, I prefer each argument to
t()(literal string and$results) on a separate line, then one more line starting with the closing parenthesis.The last part of #50.3 is still undone.
I am still curious about #57.4. Do you know why the previous version of the test worked without anything equivalent to
$this->memoryManager->method('ensureMemory')->willReturn(TRUE);?IIUC, we are moving away from MockObject and using Prophecy in new tests, so maybe it is not worth worrying about this. But isn’t the idea that when you create a mock object, it implements the given interface? If so, should we just use
MemoryManagerInterfacefor the@varcomment?Out of scope:
I am already tired, and I do not want to think about one more test if I do not have to! If this missing test coverage bothers you, then open a new issue for it.
Removing the
@throwcomment is also out of scope, but I will allow it if you are sure that it is correct. I guess you trust PHPStorm? The only line that looks at all suspect is the call togetIdMap().We got rid of this function in the MigrateExecutable class, so I think we can get rid of it here, too:
This is not used anywhere:
We do not need this unless I am wrong about (6) above:
I have a few points so make about this:
The function signature shows all parameters as required, but several of the
@paramcomments mention defaults. Several other doc blocks have the same problem.Instead of calling the
reclaim()method, which is only defined on TestMemoryManager, I would like to see this happen when theensureMemory()method dispatches thePRE_RECLAIMEDevent. The usual way to do this would be through the mocked event dispatcher, but it would be simpler to do it in the overriddendispatch()method.Can we add a test case where
ensureMemory()returnsFALSE?Several of these methods are just
publicwrappers for their parents. I guess the point is that we want to add tests for them, so they have to be accessible to the test class. ButdispatchEvent()is not tested, so we only need to override it if we want to use it when testingensureMemory(), as I suggested above. Even then, we do not need to make itpublic.Isn’t the usual practice to test only the
publicmethods of the class, and let them call theprotectedmethods?The
testGetUsageInBytes()test method tests the overriddengetUsageInBytes()in TestMemoryManager. I do not think think that is useful.This name is odd, but at least it is used consistently:
Comment #63
benjifisherMaybe I spoke too soon.
What is the right way to test the new MemoryManager class? It relies on the function
memory_usage(), not to mention the event dispatcher. Currently, we wrapmemory_usage()in a class method so that we can override it in TestMemoryManager.I think a better testing strategy would be to add a SystemMemory service class, and inject it into the MemoryManager class. This service class could contain the wrapper for
memory_usage(), so we can pass a mocked version of this new service class in the test. our goal should be to get rid of the TestMemoryManager class, injecting whatever mocked objects we need when testing MemoryManager.The service class might be part of the System module, or it might even be a Core class, since it is not really tied to migration. It might also contain a wrapper for
gc_collect_cycles(), so that we can use it in the MemoryLimitExceeded event subscriber as well as in the MemoryManager class. Or it could be passed to the event subscriber through the MigrateMemoryLimitEvent object.Comment #64
quietone commented#62
1 Yes, I have changed my workflow locally and some things are still wrong plus I hadn't had time do a self review.
2 No change. There is not standard, my search show many examples similar to this.
3 Oops, it was there and I must hvae reverted it. Fixed now.
4. I am not sure either but guess that is why there is a separate NigrateExecutableMemoryExceededTest.
5 I too prefer prophecy. But there is not yet a policy that I know of to use it. In this instance it would be introducing a second test framework just for one class. I think that is unnecessary and makes the code harder to read.
6 Ok, Make this a followup.
7 I trust PhpStorm less and less these days. It was not my intention to remove the @throws. It has been restored.
8, 9, 10. Fixed
11 - 16 No changes, mostly because of #63.
#63. No opinion right now if that is a good idea or not. However, I would like another opinion on the preferred method for writing the test.
Comment #66
phenaproximaI'm not sure I agree with this approach; although it seems like a good idea in general, I believe it will significantly increase the scope and effort of this issue.
For the sake of getting this refactoring done and keeping this issue's scope as small as possible, I think it's totally fine to have a TestMemoryManager class, with a
@todopointing to a follow-up issue where we can work on adding a memory usage service. I'm tagging this issue for that follow-up now, but feel free to untag it if you don't agree with what I propose.If we'd rather not add a whole new named class for this purpose, another option is to use an anonymous class that extends MemoryManager. This is surely allowed in tests, at the very least, and it still keeps this issue's scope tight while bypassing the need for mocking injected dependencies.
Comment #67
benjifisherI do not like this idea of improving the test coverage in a follow-up issue.
Maybe my (3) is bogus because we instantiate MemoryManager as a service. Maybe we will decide not to pass the SystemMemory object with the event, in which case (4) is moot. But let's make that decision now.
Comment #68
heddnWe have a solution for testing though, don't we? And constructors are not part of the BC contract. Could we keep this issue tightly scoped and make improvements iteratively?
Comment #69
benjifisherI guess I am outvoted. If we keep the current testing structure, then we should still address #62.11-16, so back to NW. See the last few lines of #64.
Comment #70
quietone commentedLet's get this moving again.
Some fixes for #62.
11. Fixed. Removed all occurrences of 'Default to NULL' in the doc blocs.
14. Fixed. Can't comment what is usual practice but I see wrappers around protected methods in other migrate tests, such as \Drupal\Tests\migrate\Unit\SqlBaseTest.
15. Removed unhelpful test.
16. Fixed
And removed methods not needed from \Drupal\Tests\migrate\Unit\TestMemoryManager.
That leaves #62, 12 and 13.
Comment #71
quietone commentedI think this is as far as I can go with this. I don't know how to implement changes for #62, 12 and 13
Comment #72
quietone commentedUpdate patch because #3156879: \Drupal\Component\Utility\Bytes::toInt() - ensure $size is a number type was committed.
Comment #73
quietone commentedForgot to remove the @todo, ignore previous patch.
Comment #74
benjifisherThe patch in #73 does not apply because #3156879: \Drupal\Component\Utility\Bytes::toInt() - ensure $size is a number type changed
Bytes::toInt()withBytes::toNumber(). I am adding the "Needs reroll" tag and setting the status to NW.Let me explain #62.13 first, since that is simpler: "Can we add a test case where
ensureMemory()returnsFALSE?"In MemoryManagerTest, the data provider returns three ... things. I call them test cases, but maybe that is not the right term. Each of the three has
'expected_ensure_memory' => TRUE,. Can we add one with'expected_ensure_memory' => FALSE,?What did I mean by #62.12? The test method in MemoryManagerTest has these lines:
The
ensureMemory()method starts with these lines:I think the test would be more like "real life" if, instead of calling
reclaim()directly from the test method, it calledreclaim()indirectly, either through the mocked event dispatcher or from a test module that listens for the event. (For a unit test, I think the second option is not possible.)Comment #75
quietone commentedReroll the patch and change references to deprecated Bytes:toInt to Bytes:toNumber.
I tried and was unable to do this, please show me how.
Comment #76
mikelutzI can't call reclaim from the mocked dispatcher easily, but I can call it from the test managers dispatch event. How's this?
Comment #78
mikelutzComment #79
benjifisherThanks for improving the test, and I apologize for letting this sit so long.
Calling
reclaim()this way is pretty close (good enough!) to what I asked for in #74:The
reclaim()method is no longer used outside TestMemoryManager, so it certainly does not need to bepublic. It is only used here, so let’s inline it and save one level of indirection.If you do not like the idea of inlining it, can we test
if($this->clearedMemoryUsage)insidereclaim()instead of indispatchEvent()?It took me a while to realize that this is adding an assertion (+1 for that!) and is independent of the previous point:
First of all, let’s restore the "blank" line (just
*) before the@dataProvidercomment.We do not need the new variable
$will_reclaim. It is just the return value ofisLimitExceeded(). I think we should rename the variable to match: either$limit_exceededor$is_limit_exceeded. Then let’s calculate it in the test function. That will make it easier to add new test cases. Finally, lets assert thatdispatch()is never called when this variable isFALSE. Something like this:Comment #80
benjifisherLooking at this again, I think it would be simpler to combine the second call to
dispatchEvent()and thereturnstatement:Comment #81
benjifisherAnother option is to rewrite
MigrateMemoryLimitEventso that it has aMemoryManagerproperty (and getter) instead of three numeric properties (and getters). Then event subscribers would have to replace$event->getUsageInBytes()with$event->getMemoryManager()->getUsageInBytes().Then, in the
TestMemoryManagerclass, keepreclaim()as a public method (ignoring my suggestion in #79.1). Do not overridedispatchEvent().Finally, in the mock event dispatcher, do not just assert how many times
dispatch()will be called. Have the call todispatch()check$event->getPhase()and conditionally call$event->getMemoryManager()->reclaim().That seems like a lot of work just to make the test more like "real life". I have not decided yet whether this is a case of making the "real" code more complex just so that it can be tested or whether this is a case of better code structure suggested by the testability mandate. What do you think?
Comment #82
benjifisherThat is part of my suggestion in #81. Usually, when we mock a method, all we do is specify
Is it possible to have the mock method do something? I am not sure, but I think it is possible using
returnCallback(). The callback receives the arguments(s) of the mock method, and the callback can be an anonymous function, so it can do ... anything you want.Comment #83
benjifisherIn #81, I wrote,
After thinking about it a bit, I like this idea. Let's pass the memory manager itself, not the various numbers, in the event object.
If I look too long at TestMemoryManager, I start to think that
getUsageInBytes()is just a static getter. In fact, it is a wrapper for the very non-staticmemory_get_usage(). Currently, we call it once, save the result in an event object, then dispatch that event. The one event listener in this patch does not even check the value! If there are any other event listeners, they will get the memory usage saved in the event when it was created.I think it would be better to pass the MemoryManager object in the event. Then each event listener can either ignore the memory usage or ask the memory manager for the current value.
If you make this change, then I think we can postpone the rest of my suggestion in #81 and #82 (the part dealing with using
returnCallback()) to a follow-up issue. For this issue, I would still like to see #79.2 and #80 implemented.Since we want the methods
getUsageInBytes(),getLimit(),getUsageRatio()to be available to the event listeners, make them public and add them to MemoryManagerInterface.I searched for
returnCallback()and found several examples in core tests. For example, MakeUniqueEntityFieldTest in the Migrate module (thanks to @chx in #2147815: Migrate: add an entity deduplication process plugin) and FormErrorHandlerTest in the Inline Form Errors module.Comment #84
benjifisherWe discussed this issue at #3175894: [meeting] Migrate Meeting 2020-10-15. Based on the point I made in #83, @heddn changed his mind, so I am no longer outvoted. @heddn also pointed out #3000050: Replace \Drupal\Component\Utility\Environment::checkMemoryLimit with a PHP memory checking helper class and service, which is already linked as a related issue. I am marking this issue as postponed on that one.
Follow-up to #83: the problem with passing the MemoryManager object in the dispatched event is that we do not want subscribers to have access to the
ensureMemory()method. So we really should have thegetUsageInBytes()and similar methods in a separate class. That is pretty much what I suggested in #63 and #67. It is also pretty much what @mondrake suggested in #31 a few months earlier.Comment #86
benjifisher#3000050: Replace \Drupal\Component\Utility\Environment::checkMemoryLimit with a PHP memory checking helper class and service was closed as "won't fix". @alexpott says that we should be testing low-memory scenarios with
ini_set('memory_limit', ...)and by actually consuming memory:and
Comment #87
quietone commentedThis is blocking, #2688297: File migration slows down and eats more and more memory, eventually stops
Comment #89
heddnI'm working on a very large migration of users on a site. The memory regularly goes up to 3 or 4GB before finally dying under its own weight. Instead of letting it grow to infinite size, I added an artificial limit of 1.25GB and let this cache clear logic execute. That has helped immensely.
We should let this memory limit be an optional injected parameter. I could see other's wanting the same control over things.
Comment #90
wim leersThat sounds like a good idea!
Comment #91
quietone commented#89 is a good idea. I think that should be done in a separate issue.
Just rerolling the patch.
It isn't clear to me what the next step is. #86 is too vague for me.
Comment #95
xurizaemonRe-roll for 10.1.x, as I'm interested in testing this out for a migration which is having memory issues.
Comment #96
xurizaemonRe-roll of 91 to account for CR-3159012: Symfony Event class deprecated, EventDispatcher::dispatch() argument order changed.
Comment #97
xurizaemonAnd one more for coding standards.
Comment #98
xurizaemon97 was an empty file, not a patch. Some days ... apologies all.
On next update I think we can also remove the "postponed" comment in ID.
Comment #99
andypostAll new properties need types and should use constructor promotion
needs to be typed and could be part of constructor
int|string|null
Comment #100
andypostAdded type-hints to fix #99
Comment #101
andypostIt's often less arguments passed, so no reason in deprecation and event dispatcher also could be created in constructor to minimize amount of useless extra function calls (will file follow-up)
Here's a fix for CS and fix test -
memory_usagealways int, never nullComment #102
andypostTo run tests the method is required, also fix one more place (test still fails)
Comment #103
andypostFix last broken test
Comment #104
andypostFinal polishing
Comment #107
andypostRe-rolled last patch as MR
Comment #108
smustgrave commentedThere are some bullets in the remaining tasks that can't tell if they still apply. Could they be marked out if no longer applicable?
Comment #109
andyposttests pass
Comment #110
heddnI've gone through and responded to most of the feedback mentioned in the IS. The requests in #82 & #83 are not super clear what is being asked given that the EnvironmentMemory object didn't materialize. Can we cross them off as well?
Comment #111
heddnComment #112
moshe weitzman commentedThe LRU cache finally landed in Drupal core (see related issues). It is likely that migrate can remove its memory handling as a result.
Comment #113
catchYes we might be able to delete this code altogether after #3498154: Use LRU Cache for static entity cache.
Comment #114
smustgrave commentedAppears to need a rebase.
Wonder if the follow up tag is still needed?
If you are another contributor eager to jump in, please allow the previous poster at least 48 hours to respond to feedback first, so they have the opportunity to finish what they started!
Comment #115
andypostAs the issue #3498154: Use LRU Cache for static entity cache is postponed it makes sense to file follow-up to clean-up and add TODO there, for #113
and needs rebase/reroll too
Comment #117
jofitzRe-rolled.
I'm unsure what more needs to be done based on #113 and #115
Comment #118
catchLeft a review on the MR, I don't think this is really going to do anything any more now that the LRU cache issue is in, and we should consider removing this code instead.
Comment #119
jofitzI'm gonna have a go at repurposing this issue; starting by editing the IS then making the changes
Comment #120
jofitzComment #121
jofitzRemoved Memory Manager and event subscriber
Comment #122
godotislateLooks like the phpstan baseline might need to be regenerated: https://git.drupalcode.org/issue/drupal-3006750/-/jobs/5474912
How to do this: https://www.drupal.org/node/3426891
Comment #123
godotislateUpdated the title to match the new direction of this issue to remove memory management code in migrate. Also removed the "Needs follow up" tag, since it referred to messaging in migrate that won't be relevant anymore.
Comment #124
godotislateRebased and updated phpstan baseline.
Comment #125
godotislateI was about to update the CR https://www.drupal.org/node/3139212 to reflect that memory management has been removed from MigrateExecutable, but I just read CR https://www.drupal.org/node/3512407 associated with #3498154: Use LRU Cache for static entity cache:
There's a chance then on hosting plans with low memory that migrations will OOM and fail now. I wonder if something should remain in MigrateExecutable to inform that memory is getting low and to reduce
entity.memory_cache.slots.Comment #126
godotislateOne idea to address #125 is introducing a memory limit/memory % threshold to the LRU cache instead.
Comment #127
benjifisherI reviewed the changes on the MR, and I looked at what was left after all the deletions. I made one little suggestion on the MR: back to NW for that. Other than that, I like what I see.
I thought about how to test this issue. I looked at the related issue #2688297: File migration slows down and eats more and more memory, eventually stops. Reading through all the comments, I see that several people reported this problem, but no one was able to provide steps to reproduce it. I decided to Postpone that issue (maintainer needs more information) with a note not to close the issue.
Then I remembered my own experience on my first D7-D8 migration. We had about 100K users on the D7 site, and the
d7_usermigration started out fast but then slowed to a crawl. (Eventually, I discovered thebatch_sizeoption for any source plugin extendingSqlBaseand that was enough for that project.)I created a new D7 site and used the
develmodule (drush generate-users) to create 100K users.Then I configured
memory_limitto96Mfor my Drupal 10/11 site. I confirmed withdrush eval "echo ini_get('memory_limit') . PHP_EOL".I decided to test on 10.6.x, which does not include the commit for #1199866: Add an in-memory LRU cache, on 11.x, and on the branch from MR 5156.
Testing steps
drushand PHP dependencies:composer require drush/drush.drush si.migrate_drupalmodule:drush en migrate_drupal.drush mim d7_user_role,d7_user.drush mr d7_user.drush mim d7_user.Something has improved since my first D7-D8 migration: on all three branches, the initial migration (Step 5) finished cleanly, without the slow-down that I remembered. That is why I added Steps 6 and 7.
Test results
Testing on 10.6.x
Conclusion: even with the memory management in
MigrateExecutable, this test ended with a PHP fatal error.Testing on 11.x
Conclusion: after #1199866: Add an in-memory LRU cache (and perhaps other changes) the memory management in
MigrateExecutableis never triggered.Testing on 3006750-migrate-memory-management (after merging with 11.x)
Conclusion: Removing the memory management in
MigrateExecutabledoes not make any difference. This is the expected result, given the results of testing on 11.x.Comment #128
benjifisherThe "blocker" issue tag was added in Comment #87, since this issue blocks #2688297: File migration slows down and eats more and more memory, eventually stops. Since I Postponed that issue (maintainer needs more information), I am removing that tag.
This issue already has the "Needs change record updates" issue tag, but I am also adding an item to the Remaining tasks in the issue summary. The change record needs a complete rewrite, not just an update. (Or maybe this issue no longer needs a change record.)
Comment #129
andypost@benjifisher thank you a lot for testing in #127
applied suggested wording and only CR and summary needs check
Comment #130
benjifisher@andypost:
Thanks for updating the MR.
As I said in #127, that update was the only change I wanted to see, so the issue is now RTBC.
Comment #131
godotislateSince @benjifisher successfully tested with memory @ 96M in #127, seems like OOM concerns in #125 could be an edge case, if anything. (Though technically recommended Drupal PHP memory requirement are as low as 64MB: https://www.drupal.org/docs/getting-started/system-requirements/php-requ...).
The CR https://www.drupal.org/node/3139212 does need an update, though, but otherwise RTBC +1.
Comment #132
catchUpdated the CR at https://www.drupal.org/node/3139212
Committed/pushed to 11.x and cherry-picked to 11.3.x, thanks!
Comment #137
catch