Problem/Motivation
Source::next() was copied from D7 and the logic is very convoluted and very hard understand. Probably as a consequence, it also has a logic mistake where a ! snuck in but since the whole class is untested this was not caught. Probably it was untested because it was so poorly understood by all. Other parts of Source also more look like contrib code than D8 core.
Proposed resolution
Simplify Source::next() (really big succeeded big time) and then move the simplified code into the SourcePluginBase class and remove the Source class. Also, add tests so that we do not break this again.
Remaining tasks
User interface changes
API changes
There are some but migrate is not frozen. Note that the main entry point, MigrateExecutable::import did not change. Using Source directly was never a good idea; it was poorly documented; poorly understood and I can't imagine anyone using it (especially since it was quite broken).
Comment | File | Size | Author |
---|---|---|---|
#68 | interdiff.txt | 4.26 KB | benjy |
#68 | 2427335-68.patch | 54.19 KB | benjy |
#66 | 2427335-66.patch | 52.79 KB | benjy |
#66 | interdiff.txt | 2.97 KB | benjy |
#61 | interdiff.txt | 2.54 KB | benjy |
Comments
Comment #1
chx CreditAttribution: chx commentedSmall tweak.
Comment #2
chx CreditAttribution: chx commentedSlightly reworded comments.
Comment #3
dawehnerNote: This is "just" work on the docs.
Comment #4
chx CreditAttribution: chx commentedSorry no interdiff there's very little that is unchanged.
Comment #5
chx CreditAttribution: chx commentedComment #6
chx CreditAttribution: chx commented$this->currentRow = NULL; is set already in the beginning.
Comment #7
benjy CreditAttribution: benjy commentedI actually got highwater to work without setting the field property and just bypassing this block entirely. The block that checks for the 'name' property on highwater seems to work fine.
+++ b/core/modules/migrate/src/Source.php
@@ -297,152 +311,110 @@ public function rewind() {
- if ($this->prepareRow($row) !== FALSE) {
- if ($row->getSourceProperty($this->highWaterProperty['name']) > $this->originalHighWater) {
This has removed a feature, eg. prepareRow now no longer runs before the access to the highwater property on row, this broke my tests because I was calculating the highwater in prepareRow.
I think from here we need to start fleshing out some unit tests because there seem to be a lot of outcomes in Source that currently exist without any coverage.
Comment #8
benjy CreditAttribution: benjy commentedThis interdiff fixed that issue for me. Basically lets prepareRow run earlier than anything else?
Comment #9
chx CreditAttribution: chx commentedComment #10
chx CreditAttribution: chx commentedScratch that. Well, what, even the id can be changed in prepare so I do not see a good way to sometimes skip it so don't skip it. This makes the double-call check on prepareRow unnecessary.
Comment #11
benjy CreditAttribution: benjy commentedSome basic unit tests. Also note the change to Source, I don't think there is any need to check originalHighwater at all to see if we're using highwater. Lets just check for the existence of the property.
Comment #12
benjy CreditAttribution: benjy commentedRemoved the useHighwater() function.
Note, we need to add highwaterProperty to the schema as well. Not sure if we add camel cased properties to schema?
Comment #14
benjy CreditAttribution: benjy commentedFix namespace.
Comment #16
benjy CreditAttribution: benjy commentedMore refinements.
Comment #17
benjy CreditAttribution: benjy commentedWe no longer allow trackChanges and highwater together. We also now initialise them in the constructor.
Comment #18
benjy CreditAttribution: benjy commentedCorrect interdiff for 17
Comment #19
benjy CreditAttribution: benjy commentedWe now no longer need to check highwater in rowChanged. Will write some more unit tests for the new exception.
Comment #22
benjy CreditAttribution: benjy commented1. Some clean-up in the constructor, also moved it to the top of the file.
2. Additional tests for the highwater/trackchanges exception.
3. Moved t() into a method on Source and then override that with a TestSource for the tests.
4. Moved prepareRow() after the idList check.
Comment #23
benjy CreditAttribution: benjy commentedThanks to @chx for pointing out no translation is required for exception messages.
Comment #24
benjy CreditAttribution: benjy commentedMore fixes to count() and more unit tests to improve the overall coverage of the class. Screenshot shows the coverage for next().
Comment #25
chx CreditAttribution: chx commentedComment #26
chx CreditAttribution: chx commentedComment #27
dawehnerGreat improvements!
Sadly we lost the explanation of what highwater actually is. The one patch from me had some documentation around that area.
do we really want to === FALSE or simply use if !? From a readability POV the later one seems much better.
It would be really nice to add a bit of "why" to the documentation pointing out "what".
really nice method name!
Can we make clear that it changed in the destination?
It would be cool if the Test would be named SourceTest, as this allows you to use https://www.jetbrains.com/phpstorm/help/navigating-between-test-and-test...
Comment #28
benjy CreditAttribution: benjy commentedI've taken a different direction. Drupal\migrate\Source seems to be left over from Drupal 7. It's a class that we can iterate. That made sense in Drupal 7 because the source data was gathered up in your migration class, this is no longer the case. So, attached patch merges Source into SourcePluginBase.
Source only brought us rewind(), next(), count() and prepareRow(). prepareRow() did very little and then called SourcePluginBase::prepareRow(), count() is coming to the SourcePluginBase anyway, #2429097: Optionally cache migration source counts and that just leaves next() and rewind(). Moving them across gets rid of Source.php altogether.
Plus, nearly everything in the Source constructor was configuration passed on from the SourcePlugin.
TODO, The source plugin currently needs the executable because of the way Source was saving some map messages. I've left a couple of TODO's to fix that but want to get a test run.
Comment #29
benjy CreditAttribution: benjy commentedComment #31
benjy CreditAttribution: benjy commentedJust fixed the unit tests for now.
Comment #32
benjy CreditAttribution: benjy commentedI added a @see, was holding off for the tests to finish on the last patch but all the bots seem to be down.
Comment #33
chx CreditAttribution: chx commentedComment #35
benjy CreditAttribution: benjy commentedFix for the last unit test.
Comment #40
benjy CreditAttribution: benjy commentedJust uploading the same patch again, not sure what the failures are meant to be.
Comment #42
benjy CreditAttribution: benjy commentedRegenerated the same patch after a re-roll, not sure what is going on with the bot just terminating.
Comment #50
benjy CreditAttribution: benjy commentedI tracked this down to a nasty little issue with the iterators. Now we've combined Source which was previously an iterator into SourcePluginBase the iterator wasn't been cached for EmptySource which in turn led to an infinite loop for migrate tests using the EmptySource. Ouch!
New patch changes EmptySource to implement runQuery instead which makes much more sense. I've also added runQuery to the interface to ensure that all sources implement this method. SqlBase provides a base implementation anyway.
SourcePluginBase currently provides a default implementation of getIterator() but i've opened [#] to remove this function altogether anyway, we should be able to cache these results in either runQuery or rewind().
Comment #52
benjy CreditAttribution: benjy commentedRemoved runQuery from the interface for now.
Comment #53
webchickFrom talking to benjy today, it sounds like this is a blocker for the finalization of the Migrate API, which is in turn a blocker to the D7 -> D8 migration path. Marking as "major."
Comment #54
webchickOh, and adding the blocker tag.
Comment #55
benjy CreditAttribution: benjy commentedThe solution for runQuery was to rename to initializeIterator(). It didn't make much sense outside of Sql sources anyway, and this is in SourcePluginBase.
We don't want it on the public interface but we do need to enforce that it exists so I added an abstract method to SourcePluginBase.
Comment #61
benjy CreditAttribution: benjy commentedSome testing changes.
Comment #62
chx CreditAttribution: chx commentedI like where this is now but I can't RTBC it because it contains quite a bit of my own code.
Comment #63
dawehnerI think this is a great improvement! Let's unblock stuff.
Comment #64
alexpottMight we not have some of the same issues solved in #2431379: LazyPluginCollection should not implement \Iterator - should we be using \IteratorAggregate instead of \Iterator
Comment #65
chx CreditAttribution: chx commentedI think the confusion arises from
getIterator
being public. No need for that. Perhaps we could rename it to something like getRawIterator or getUnfilteredIterator -- after all , the iterator returns all rows and the rest of the class -- next() mostly -- filters them.Comment #66
benjy CreditAttribution: benjy commented@alexpott, could question. I think we have a different use case but just confused things by having a public getIterator() method. We can't return a new iterator every time because that would send us into an infinite loop as shown in #50.
I discussed with @chx today and we decided to remove getIterator() from the public API and renamed to getRawIterator() to try make it clearer that we are not implementing a PHP API such as IteratorAggregate.
Comment #68
benjy CreditAttribution: benjy commentedThis is a WIP, just a suggestion. The problem we have is that the LoadEntity was using the raw iterator but it only needs the bundles. I've added a getBundles() method to SourcePluginBase but i'm thinking maybe a public method on either SourceEntityInterface or a new interface for sources that are "bundle migrations" for an entity type.
Comment #70
benjy CreditAttribution: benjy commentedLoadTermNode also uses the raw iterator directly, can't see an easy way around that other than making it public again?
Comment #71
chx CreditAttribution: chx commentedLet's rename and postpone the protected part to another issue.
Comment #72
benjy CreditAttribution: benjy commentedThat would make #61 RTBC again then. Created #2452217: Rename SourcePluginBase::getIterator() and try make protected as a follow-up.
Comment #73
chx CreditAttribution: chx commentedMaybe. Or we could do the renaming here but that's not a big deal , really.
Comment #74
alexpottMigrate is not frozen in beta. Committed 260f936 and pushed to 8.0.x. Thanks!