Problem/Motivation

Was getting errors from a translation migration (didn't keep the details) because the language was NULL. When I looked at the source database there were entries in the relevant i18n table but not a corresponding entry in the locales_target table, which is the one with the language. I recall seeing this before and at least one of the source plugins has a condition to only retrieve rows where language is not NULL. This condition should be applied to all the translation source plugins that involve the i18n tables.

The alternative is to modify all the migrations to skip_on_empty on the language field. It just adds extra processing.

The source plugins with an ->isNotNull('language') condition are d6/ProfileFieldTranslation.php and d6/FieldLabelDescriptionTranslation.php. There is actually a mix of conditions in the source plugin in an attempt to make sure there is data. Maybe they should all use the same 'safest' choice?

core/modules/field/src/Plugin/migrate/source/d6/FieldLabelDescriptionTranslation.php:      ->isNotNull('language')
core/modules/field/src/Plugin/migrate/source/d6/FieldLabelDescriptionTranslation.php:      ->isNotNull('translation');
core/modules/field/src/Plugin/migrate/source/d6/FieldOptionTranslation.php:      ->isNotNull('translation');
core/modules/user/src/Plugin/migrate/source/d6/ProfileFieldOptionTranslation.php:      ->isNotNull('translation');
core/modules/block_content/src/Plugin/migrate/source/d7/BlockCustomTranslation.php:      ->isNotNull('lt.lid');
core/modules/taxonomy/src/Plugin/migrate/source/d6/TermLocalizedTranslation.php:    $query->isNotNull('i18n.lid');
core/modules/menu_link_content/src/Plugin/migrate/source/d6/MenuLinkTranslation.php:    $query->isNotNull('i18n.lid');
core/modules/config_translation/src/Plugin/migrate/source/d6/ProfileFieldTranslation.php:      ->isNotNull('language');

Proposed resolution

Use an innerjoin instead of isNotNull to prevent rows where the language is Null which would have to be skipped in the process pipeline.

Modify the source plugin tests to have a row in an 'extra' row in the i18n_table that does NOT have a corresponding row in the locales_table.

There are the relevant source plugins.

$ grep -r locales_target core/modules/*/src/Plugin/migrate/source/ | awk '{print $1}' | sort -u
core/modules/block_content/src/Plugin/migrate/source/d7/BlockCustomTranslation.php:
core/modules/block/src/Plugin/migrate/source/d7/BlockTranslation.php:
core/modules/config_translation/src/Plugin/migrate/source/d6/ProfileFieldTranslation.php:
core/modules/content_translation/src/Plugin/migrate/source/I18nQueryTrait.php:
core/modules/field/src/Plugin/migrate/source/d6/FieldLabelDescriptionTranslation.php:
core/modules/field/src/Plugin/migrate/source/d6/FieldOptionTranslation.php:
core/modules/field/src/Plugin/migrate/source/d7/FieldLabelDescriptionTranslation.php:
core/modules/field/src/Plugin/migrate/source/d7/FieldOptionTranslation.php:
core/modules/menu_link_content/src/Plugin/migrate/source/d6/MenuLinkTranslation.php:
core/modules/taxonomy/src/Plugin/migrate/source/d6/TermLocalizedTranslation.php:
core/modules/taxonomy/src/Plugin/migrate/source/d6/VocabularyTranslation.php:
core/modules/taxonomy/src/Plugin/migrate/source/d7/TermLocalizedTranslation.php:
core/modules/user/src/Plugin/migrate/source/d6/ProfileFieldOptionTranslation.php:

Remaining tasks

Review
Commit

User interface changes

N/A

API changes

N/A

Data model changes

N/A

Release notes snippet

N/A

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

quietone created an issue. See original summary.

quietone’s picture

quietone’s picture

Status: Active » Needs review
Parent issue: » #2208401: [META] Remaining multilingual migration paths
FileSize
5.57 KB

Something like this. I'm heading out and haven't had time to run all the tests but wanted to get something posted here.

quietone’s picture

Issue summary: View changes
quietone’s picture

This adds an entry to the locales_target table in the source plugin tests where the language is NULL.

heddn’s picture

Status: Needs review » Needs work

from Gabor in slack:

in D6/7 if a language is null that did not mean it did not have language it meant the language should be understood as the site default language
D8 has explicit language on things so we don’t need to make those assumptions (and thus break everything if you decide to switch teh site default language let’s say)

Version: 8.7.x-dev » 8.8.x-dev

Drupal 8.7.0-alpha1 will be released the week of March 11, 2019, which means new developments and disruptive changes should now be targeted against the 8.8.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

quietone’s picture

Status: Needs work » Needs review

I think we are talking about two different things. I'm specifically addressing what is stored in the various i18n tables and locales_target and having a query constructed that returns NULL for the language. In the i18n tables in the fixture and locales_target language is a key and should not be NULL. The one exception is Drupal 6 i18n_block where the language is not a key but it is still set to be not NULL. That says to me that a language column will not be NULL in any of the tables and if a query returns NULL then it is corrupted data. If that happens the row has no language and can't save the translation (assuming one is found). If that is true, then that case can be avoided in the query.

Of course, this can be handled in the pipeline with a skip row, which has the advantage of recording the skip.

Now, having said that it seems that this isn't a problem in that no one is reporting problems.

Gábor Hojtsy’s picture

Hm, is a "no language specified" block/term, etc. in Drupal 6/7 storing null in the DB or are they storing empty string or something else? If this patch will result in data skipped for items that had no language specified (which is still valid data) then that is a problem. If null is not stored for items where no language was specified then this should be fine.

mikelutz’s picture

Status: Needs review » Needs work

I assumes it stored 'und', as we didn't separate out undefined and unspecified back then, but I'm not certain. I'm moving to NW until there is an answer.

quietone’s picture

Status: Needs work » Needs review

Trying to answer #9 and #10 here. I looked at the case of taxonomy terms in D6. When the language is not specified the language field is an empty string. That value is not changed in the source plugin or in d6_taxonomy_term_translation. But, in this particular case the migration does not use the i18n tables anyway.

Then took a look at the d7 custom block migration. That does use the i18n tables but the source query is already limited by isNotNull('locales_target.lid').

And just for myself I went back and ran this query on the d6 and d7 test fixtures.

select i18n.lid, lt.language from i18n_string i18n left join locales_target lt on lt.lid=i18n.lid where isNULL(lt.language)
For D6 zero rows are returned but for D7 there are 109. So, there are quite a few rows in the D7 i18n table that do not, in fact, have corresponding translation data. And that is what started this whole issue for me. The field/term/block or whatever that has an entry in the i18n string table has already been migrated by the non translation version of that migration. The entry in the 18n string tables seems to be a placeholder (though I haven't read the i18n source code).

quietone’s picture

While I think this is safe to do I don't want to spend more time on it. If you think this is too risky, set to Won't Fix.

Gábor Hojtsy’s picture

I don't know what could we do with data in the locales_target table that has null as langcode. I don't know how to interpret the language of that. It would be good to know if i18n module is using this for some kind of special data tracking in some case of term/block/etc translation. That the D7 fixture has strange data could also just mean that it is how it got built together from hundreds of commits throughout the years. That does not mean an actual Drupal 7 site would have those?

quietone’s picture

In the d7 fixture, the data is the locales_target always has a language. It just because the query uses a left join that 109 rows with no language are return. I guess that could just change to an inner join.

Gábor Hojtsy’s picture

+++ b/core/modules/block/src/Plugin/migrate/source/d7/BlockTranslation.php
@@ -50,6 +50,7 @@ public function query() {
     $query->leftjoin($this->blockTable, 'b', ('b.delta = i18n.objectid'));
     $query->leftjoin('locales_target', 'lt', 'lt.lid = i18n.lid');
+    $query->isNotNull('language');

Ok so we are talking about this producing null values from locales_target for missing translations in locales_target basically. If we only care about the translations in all of the "problematic" migrations, then doing an inner join should be fine as if there was no translation then no translations should be saved in Drupal 8 either.

If we do other things with the data, then an inner join would loose us some original items so then a left join is better. But if we only care for translations, then we can inner join them.

quietone’s picture

Yes, that is the question. 'do we only care about translations' for these i18n migrations. I think the answer is yes.

On principle, migration source plugins get as much data as possible and leave it up to the process pipeline to decide what to do thus allowing custom migrations maximum flexibility. In this case a custom migration would be able to identify every item that is translatable but does not have a translation. Using a migration to get that data seems odd to me. And as we see the D7 test fixture has 109 cases like that, a real site will have many more. How many is unknown but having many skipped rows may cause folks to spend time investigating the cause. Time they could spend elsewhere. Yes, documentation will help. And we have to remember that some of the i18n migration source plugins already do this and some don't. Whatever is decided we should make them consistent which means either change from left join to inner join or keep the left join and add a skip row if language is empty.

heddn’s picture

I prefer inner join. Let's get the data needed. skip rows are messy.

Gábor Hojtsy’s picture

Agree with inner join.

quietone’s picture

Status: Needs review » Needs work

It will be good to review all these queries.

However, I realized that some of these queries start with $query = parent::query() which means I don't know how to change it to use an inner join between the i18n_string and locales_target table. In those cases the notNull('language') is simpler.

quietone’s picture

Status: Needs work » Needs review
FileSize
13.13 KB

OK, made a few tweaks.

quietone’s picture

Working on this found that #2979964: Migrate translations for D7 i18n taxonomy 'localized' terms doesn't have a test of the source plugin. Made a follow up for that oversight, Follow up added #3073442: Add test of d7 term localized source plugin.

Now here is a patch for all the source plugins that use the locales_target table. It turns out that in locales_target that the language cannot be NULL so many of the source plugin tests are changed to remove rows where the language was set to NULL. Instead a row in i18n_string is added that does not have a corresponding row in locales_target. This was all a bit tedious. Let's see if the changes to the source plugins cause any failures.

Status: Needs review » Needs work

The last submitted patch, 21: 3030937-21.patch, failed testing. View results

quietone’s picture

Status: Needs work » Needs review
FileSize
19.98 KB

Darn bad patch. Had a version of d7/TermLocalizedTranslationTest.php that is being fixed in #3073442: Add test of d7 term localized source plugin. So, just removing that error.

quietone’s picture

Issue summary: View changes
Gábor Hojtsy’s picture

Status: Needs review » Reviewed & tested by the community

Changes look good. Thanks for adding the additional test coverage with no translations :)

Gábor Hojtsy’s picture

  • Gábor Hojtsy committed 1be0372 on 8.8.x
    Issue #3030937 by quietone, Gábor Hojtsy, heddn, mikelutz: Ensure...
Gábor Hojtsy’s picture

Status: Reviewed & tested by the community » Fixed

Thanks all, committed this given no concerns raised for 2 weeks :)

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.