Closed (fixed)
Project:
Drupal core
Version:
8.1.x-dev
Component:
migration system
Priority:
Critical
Category:
Bug report
Assigned:
Unassigned
Issue tags:
Reporter:
Created:
13 May 2016 at 16:31 UTC
Updated:
15 Jun 2016 at 17:14 UTC
Comments
Comment #2
alexpott
I've run this 20 times locally and no fails :(
Comment #3
mikeryan
Are there any stats suggesting how often it fails (1% of the time, 10%, etc.)? That would suggest how many times one might need to run the tests to expect to see at least one failure...
The failing test is for the message displayed at the end of the process. Locally, I've got the page content it's testing preserved in files like Drupal_migrate_drupal_ui_Tests_d6_MigrateUpgrade6Test-10-20.html. Is that captured on the testbot (or can we capture it) so we can see what the actual page content is?
Here's a (very) recent failure: https://www.drupal.org/pift-ci-job/285501
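The link between failure rate and number of runs asked about above can be sketched as a short calculation. This is an illustration only, not testbot data: it assumes every run fails independently with the same probability, and the 1% and 10% rates are just the examples from the question. The function names are made up for this sketch.

```python
import math


def chance_of_failure(p_fail, runs):
    """Probability of seeing at least one failure in `runs` independent runs."""
    return 1 - (1 - p_fail) ** runs


def runs_needed(p_fail, confidence=0.95):
    """Smallest number of runs giving at least `confidence` chance of one failure."""
    if not 0 < p_fail < 1:
        raise ValueError("p_fail must be strictly between 0 and 1")
    return math.ceil(math.log(1 - confidence) / math.log(1 - p_fail))


for rate in (0.01, 0.10):
    print(rate, runs_needed(rate), round(chance_of_failure(rate, 20), 3))
```

Under these assumptions, 20 clean local runs are unsurprising at a 1% failure rate (about 82% of such batches would be all green) and would still happen about 12% of the time at a 10% failure rate.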
Comment #4
penyaskito
Happened again at https://www.drupal.org/pift-ci-job/286121
Comment #5
penyaskito
Also #2225271: Migrate content type language settings from Drupal 6 & 7: https://www.drupal.org/pift-ci-job/285489
Comment #6
lokapujya
Happens here consistently. #2719171 Never fails locally.
Comment #7
alexpott
Discussed this with @dawehner - he noted that one thing that might be introducing randomness into the test is that what runs in each batch step varies depending on time and memory usage.
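A minimal sketch of that effect, assuming a batch runner that keeps executing queued operations until a time budget is spent. The names `run_batch_step`, `FakeClock` and `make_op` are hypothetical and this is not Drupal's batch code; only the time dimension is modelled, not memory usage.

```python
class FakeClock:
    """Deterministic stand-in for a wall clock, advanced by the operations."""

    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now


def make_op(clock, cost):
    """Build an operation that 'takes' `cost` seconds on the fake clock."""
    def op():
        clock.now += cost
    return op


def run_batch_step(queue, budget_seconds, clock):
    """Run operations from `queue` until the time budget is used up.

    At least one operation always runs; the step ends as soon as the
    elapsed time reaches the budget. Returns how many operations ran.
    """
    start = clock()
    completed = 0
    while queue:
        queue.pop(0)()
        completed += 1
        if clock() - start >= budget_seconds:
            break
    return completed


def ops_in_first_step(cost, budget_seconds=1.0, total_ops=5):
    clock = FakeClock()
    queue = [make_op(clock, cost) for _ in range(total_ops)]
    return run_batch_step(queue, budget_seconds, clock)
```

With the same queue and the same budget, `ops_in_first_step(0.4)` and `ops_in_first_step(0.6)` end the first step at different points, so where one request stops and the next begins depends on how fast the machine happens to be.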
Comment #8
alexpott
Let's see if we can get it to always fail in 20 tests...
Comment #9
alexpott
Comment #11
alexpott
Maybe just 40 is enough and we don't need to do the other test.
Comment #12
heddn
#2682585: Rename MigrationCreationTrait as it no longer creates migrations - it configures them seems to fail consistently.
Comment #13
lokapujya
Do we have a way to do what @mikeryan requested?
I think we basically need the debug output that we are able to get when running locally.
Oh, I forgot: The problem is we can't consistently reproduce it, NM.
Comment #15
lokapujya
So, we CAN reproduce it on 8.1 easily. Now we need the debug output so that we can see the HTML.
Comment #16
mpdonadio
Moving this back to 8.1.x since that branch is showing random fails: https://www.drupal.org/pift-ci-job/289977
Defer to others whether that makes this a critical. It's easy now to ignore fails and work on other issues.
Comment #17
alexpott
Given the frequency and disruptive nature this is most certainly a critical.
Comment #18
claudiu.cristea
This test failed twice with this error https://www.drupal.org/pift-ci-job/295763. It doesn't seem very random.
Comment #19
tetranz
In case this helps, my test that @claudiu.cristea linked to above works on 8.2.x
I don't know if it's random but it worked on 8.1.x until I changed core/modules/field/tests/src/Kernel/Migrate/d7/MigrateFieldFormatterSettingsTest.php which added 'target_entity_type_id' => 'node' when creating comment types.
https://www.drupal.org/files/issues/interdiff-2717673-16-20_0.txt
Comment #20
lokapujya
Try to log the HTML.
Comment #21
lokapujya
Try another way.
Comment #22
alexpott
So on my test issue, #2729713: Investigation into random fails in \Drupal\migrate_drupal_ui\Tests\d7\MigrateUpgrade7Test, I've managed to view the HTML and prove that the migration that is failing is the 'Comment field form display configuration' migration, i.e. d7_comment_entity_form_display. I'm guessing we have a missing dependency somewhere, and the migration order when it fails is different from when it passes.
Comment #23
alexpott
So the error that is being thrown is "Migration d7_comment_entity_form_display is busy with another operation: Importing" which is logged in MigrateExecutable::import(), line 172 or so...
Comment #24
alexpott
According to @mikeryan:
Comment #25
alexpott
Here's the watchdog table - it looks like the migration has indeed been fired twice?!?!? Entry 58 looks like a successful migration and 59, 60, 61 are the second attempt going wrong.
Comment #26
mikeryan
So, we're seeing d7_comment_entity_form_display_subject run twice (doing nothing the second time, because it's already done its work). Following that, d7_comment_entity_form_display fails because somehow its status is IMPORTING, implying it previously started and had some catastrophic failure (or, there's some bug in resetting the status). Since the reporting is only done upon return from $executable->import(), we seem to be missing a failed attempt to start that import - I've added logging before the import() call to #2729713: Investigation into random fails in \Drupal\migrate_drupal_ui\Tests\d7\MigrateUpgrade7Test, let's see what that tells us about when each migration is kicked off.
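The failure mode described above can be modelled with a simplified status guard. This is a sketch under stated assumptions, not the code in MigrateExecutable::import(): the `Migration` class, `run_import` function and the "Idle" label are hypothetical, and only the "busy" message text is taken from the error quoted in this thread.

```python
IDLE = "Idle"
IMPORTING = "Importing"


class Migration:
    """Hypothetical stand-in holding just an id and a status."""

    def __init__(self, migration_id):
        self.id = migration_id
        self.status = IDLE


def run_import(migration, work):
    """Refuse to start unless the migration is idle, then run `work`.

    The status is reset in a `finally` block so an exception in `work`
    cannot leave the migration stuck in the importing state.
    """
    if migration.status != IDLE:
        return (f"Migration {migration.id} is busy with another "
                f"operation: {migration.status}")
    migration.status = IMPORTING
    try:
        work()
    finally:
        migration.status = IDLE
    return "completed"


# Simulate an earlier attempt that died without resetting the status.
stuck = Migration("d7_comment_entity_form_display")
stuck.status = IMPORTING
print(run_import(stuck, lambda: None))
```

In this model the "busy" message can only appear if something set the status to importing and never reset it, or if the stored status being read is stale, which matches the two possibilities raised in the comment.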
Comment #27
xjm
Comment #28
alexpott
This patch fixes the issue and brings MigrateUpgradeRunBatch into line with _batch_process() when dealing with progressive batches.
Comment #29
alexpott
I don't know why #28 fixes the issue but I think the change is the right change to make anyway because:
_batch_process() behaves the way it does.
Comment #31
alexpott
The one fail so far in #28 is unrelated: Views.Drupal\Tests\views\Kernel\Handler\FilterStringTest
Comment #32
mikeryan
Some IRC chatter on the performance implications of reducing the batch length to 1 second:
We should probably do at least some informal performance testing to see how much slower it is in practice, and if stalling occurs with moderately-sized sites (perhaps we could at least go to 5 seconds?). A D8 copy (basically, omitting the last two sections) of the drush/UI performance page should be added to the core migrate docs. Linking it from the UI should probably wait until https://github.com/drush-ops/drush/issues/2140 is in, though.
Comment #33
alexpott
So it looks like this is caused by #2694391: Separate cache bin for migrations - we should revert it. The patch attached does that.
Comment #34
mikeryan
FWIW, I did some manual performance testing with the 1-second batch patch using my D6 personal blog (304 users, 86 nodes, 51 files, 1109 aliases for some reason...). Three trials each: without the patch the average time was 3:39 (for comparison, with drush migrate-upgrade the average was 2:48). With the patch, the average of successful completions was 6:10, and changing the batch length to 5 seconds the average of successful completions was 4:19. I say "successful completions" because each of those cases failed once with a 500 error - going to the "error page" (home page) showed the normal success messages ("Congratulations..." etc.), and dblog had no errors, just the last migration completed (vocabularies in one case, vocabulary display configuration in the other).
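To make those averages easier to compare, here is a small calculation of each one relative to the unpatched UI run. The timings are the ones reported above; the labels and helper name are made up for this sketch, and with only three trials per case the ratios are rough indications at best.

```python
def to_seconds(mmss):
    """Convert a 'minutes:seconds' string such as '3:39' to seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


# Average times as reported in the comment above.
timings = {
    "UI, unpatched": "3:39",
    "UI, 1-second batches": "6:10",
    "UI, 5-second batches": "4:19",
    "drush migrate-upgrade": "2:48",
}

baseline = to_seconds(timings["UI, unpatched"])
relative = {label: to_seconds(t) / baseline for label, t in timings.items()}

for label, ratio in relative.items():
    print(f"{label}: {to_seconds(timings[label])} s, {ratio:.2f}x the unpatched UI run")
```

On these numbers the 1-second batch length is roughly 69% slower than the unpatched UI, the 5-second batch length roughly 18% slower, and drush about 23% faster. The patched averages cover successful completions only.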
Comment #35
heddn
I can confirm that the cache backend does cause issues. See #2736789: Default of 'migration' database key is ONLY thing that works
Comment #36
alexpott
Reverted the issue that caused this: #2694391: Separate cache bin for migrations