REPRODUCTION STEPS:

1. Create and publish a content entity.
2. Add a translation for the node and publish it also
3. Using adminer or mysql CLI examing the node_field_data for the node.

select * from node_field_data where nid=17621

For example,

nid	vid	type	langcode	title	uid	status	created	changed	promote	sticky	revision_translation_affected	default_langcode	moderation_state	content_translation_source	content_translation_outdated	rh_action	rh_redirect	rh_redirect_response	published_at	publish_on	unpublish_on
17621	1177975	article	de	Cloud, AI and integration for today’s digital economy	1	1	1514324112	1514324319	0	0	1	0	published	en-us	0	bundle_default	NULL	301	1492179338	NULL	NULL
17621	1177975	article	en-us	Cloud, AI and integration for today’s digital economy	0	1	1492179338	1514324128	0	0	NULL	1	published	und	0	bundle_default	NULL	301	1492179338	NULL	NULL

This looks good and is what is expected: A published version of the source language and a published version of the translation.

4. Add a new optional text field to the content type and mark it as "users can translate this field"
5. Edit the published translation, fill in a value for the field and publish the translation

The expectation is that now there is a published source language document without a value for the new field and a published translated language document with a value for the new field. Examining the tables manually, we can confirm this is the case.

SELECT * FROM `node__field_article_new` LIMIT 50

 Modify	bundle	deleted	entity_id	revision_id	langcode	delta	field_article_new_value
 edit	article	0	17621	1177976	de	0	GERMAN NEW

So far so good.

Now, we want to write an entity query that will return to us all entities (including translations) that do *NOT* have a value for field_article_new. So we try something like this.

use Drupal\Core\Database\Database;

$query = \Drupal::entityQuery('node')
  ->condition('type', 'article')
  ->notExists('field_article_new')
  ->execute();

echo "REVID     ENTITYID\n";
foreach ($query as $revid => $nid) {
  echo "$revid      $nid\n";
}

echo count($query) . " results found\n";

EXPECTED BEHAVIOR:

We would expect that the published source language doc will match since it does not have a value yet for this new field and that the published translated language doc will not match since it now has a value for this field.

ACTUAL RESULT:

The published source language doc is not returned in the query results.

PRELIMINARY FINDINGS:

Examining the query that entityQuery produces behind the scenes for us shows the following for the above entityQuery:

SELECT base_table.vid AS vid, base_table.nid AS nid
FROM
{node} base_table
INNER JOIN {node_field_data} node_field_data ON node_field_data.nid = base_table.nid
LEFT JOIN {node__field_article_new} node__field_article_new ON node__field_article_new.entity_id = base_table.nid
WHERE (node_field_data.type = :db_condition_placeholder_0) AND (node__field_article_new.field_article_new_value IS NULL)

With respect to the entity under test, simplifying the query for informational/demonstration purposes (select * instead of just nid/vid and hardcode our article bundle), and pop it in adminer, we notice that our node 17621 is not in the results at all.

SELECT *
FROM
node base_table
INNER JOIN node_field_data node_field_data ON node_field_data.nid = base_table.nid
LEFT JOIN node__field_article_new node__field_article_new ON node__field_article_new.entity_id = base_table.nid
WHERE (node_field_data.type = 'article') AND (node__field_article_new.field_article_new_value IS NULL)

Now if we take off the IS NULL condition and instead add in node_field_data.nid = 17621 so that we are just looking at our node of interest, we see the following.

nid	vid	type	uuid	langcode	nid	vid	type	langcode	title	uid	status	created	changed	promote	sticky	revision_translation_affected	default_langcode	moderation_state	content_translation_source	content_translation_outdated	rh_action	rh_redirect	rh_redirect_response	published_at	publish_on	unpublish_on	bundle	deleted	entity_id	revision_id	langcode	delta	field_article_new_value
17621	1177976	article	5972bfd5-4210-4d5a-b527-13c78017fcea	en-us	17621	1177976	article	de	Cloud, AI and integration for today’s digital economy	1	1	1514324112	1514324319	0	0	1	0	published	en-us	0	bundle_default	NULL	301	1492179338	NULL	NULL	article	0	17621	1177976	de	0	GERMAN NEW
17621	1177976	article	5972bfd5-4210-4d5a-b527-13c78017fcea	en-us	17621	1177976	article	en-us	Cloud, AI and integration for today’s digital economy	0	1	1492179338	1514324128	0	0	NULL	1	published	und	0	bundle_default	NULL	301	1492179338	NULL	NULL	article	0	17621	1177976	de	0	GERMAN NEW

Now the problem is evident. Because of the LEFT JOIN condition in the query, it *looks* like the published EN article has a value for field_article_new_value and so it is not returned.

POSSIBLE RESOLUTION:

IMO, the problem here is the join condition does not also join on langcode. If we modify the entityQuery sql by adding the langcode in the LEFT JOIN, it fixes the problem:

SELECT *
FROM
node base_table
INNER JOIN node_field_data node_field_data ON node_field_data.nid = base_table.nid
LEFT JOIN node__field_article_new node__field_article_new ON node__field_article_new.entity_id = base_table.nid AND node__field_article_new.langcode = node_field_data.langcode
WHERE (node_field_data.type = 'article') AND (node__field_article_new.field_article_new_value IS NULL)

Now, the published EN row correctly shows NULL as its value for field_article_new_value. If we then go back to the original query entityQuery() uses and just add the join condition, our node 17621 shows up in the results.

Of course even if entityQuery did this and the entity/revision was returned, we still would not know which translation(s) for that revision matched and which did not since langcode is not reflected in the result has. Not sure how that would be indicated. Maybe by making the results keyed by vid and then langcode and then nid - although this would be a breaking change.

Perhaps entityQuery is not the right tool to use for this job? The goal being "Give me all published 'foo' entities (including translations) that do not yet have a value for the new field 'foo_new'. Is there a different way to go about this? We try not to write sql queries directly against the tables, favoring the API instead, but perhaps in this case it is our only route?

Comments

sdewitt created an issue. See original summary.

sdewitt’s picture

Issue summary: View changes
sdewitt’s picture

Issue summary: View changes
sdewitt’s picture

UPDATE:

I was able to workaround the join condition issue by using a hook_query_alter like the following.

function utilities_query_alter($query) {
  if ($query->hasTag('utilities.batch_field')) {
    $tables =& $query->getTables();
    foreach ($tables as $name => $table) {
      if (preg_match('/^node__field/', $name) &&
          !empty($table['join type'])) {
        $alias = $table['alias'];
        $tables[$name]['condition']
          .= ' AND node_field_data.langcode = ' . $alias . '.langcode';
      }
    }
  }
}

This did work. However, it is still not usable due to the following line in Drupal\Core\Entity\Query\Sql\Query#result():

    return $this->sqlQuery->execute()->fetchAllKeyed();

This causes only the last matching nid to be stored per vid since the vid values are not arrays.

sdewitt’s picture

UPDATE 2:

I further figured out a way to return entities and their translations. It is dirty and kludgy but it works. I added the following to the query alter:

    $fields =& $query->getFields();
    unset($fields['vid']);
    unset($fields['nid']);
    $query->addExpression('CONCAT(node_field_data.vid, \'.\', node_field_data.langcode)', 'vid_lang');
    $query->addExpression('node_field_data.nid', 'nid');

Now I end up with an array keyed on "vid_lang" so that each vid + langcode shows up uniquely in the array.... I did say it was ugly.

I understand that this does change the API contract on the return value of this method. However, unless there is another way to run these entity queries against their entities and all translations properly (that solves the problem introduced by the join without the langcode), I think this will have to do. Also, I only make these query modifications if I see the custom query tag so it would only affect queries I tag to do so.

sdewitt’s picture

Version: 8.5.x-dev » 8.4.x-dev

Changed version to 8.4.x-dev

Wim Leers’s picture

Priority: Normal » Major

This sounds like a major bug to me.

hchonov’s picture

Are you aware that \Drupal\Core\Config\Entity\Query\Condition::notExists() has a second parameter $langcode to limit the query to only specific entity translations?

eugis’s picture

Just faced the same issue but with the ->condition() method. Thanks to @hchonov - worked with langcode parameter as well!

plach’s picture

Version: 8.4.x-dev » 8.7.x-dev

The $langcode parameter does not work if you need a condition on the default entity translation, regardless of its language.

amateescu’s picture

A solution here could be to support a special ***ANY*** value for the $langcode parameter that would not restrict the join to a certain language but make it a join condition on the base table table instead, similar to what @sdewitt did in #4.

In addition to that, we would also need to ensure that the base table of the query is the data/revision data table instead of the base/revision one.

plach’s picture

Yep, this is the same conclusions I got to when I was debugging the entity query with @tedbow in the issue above.

What about adding a LanguageInterface::LANGCODE_ANY constant (value x-any)? We could also support the LanguageInterface::LANGCODE_DEFAULT language code as as shortcut for ANY + default_langcode = 1 condition.

hchonov’s picture

What about adding a LanguageInterface::LANGCODE_ANY constant (value x-any)?

+1.

It would be a serious BC break, but shouldn't this be the default? If this is not doable in 8.x then maybe in 9.x?

We could also support the LanguageInterface::LANGCODE_DEFAULT language code as as shortcut for ANY + default_langcode = 1 condition.

I don't understand this. Could you please elaborate on that?

Version: 8.7.x-dev » 8.8.x-dev

Drupal 8.7.9 was released on November 6 and is the final full bugfix release for the Drupal 8.7.x series. Drupal 8.7.x will not receive any further development aside from security fixes. Sites should prepare to update to 8.8.0 on December 4, 2019. (Drupal 8.8.0-beta1 is available for testing.)

Bug reports should be targeted against the 8.8.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.9.x-dev branch. For more information see the Drupal 8 and 9 minor version schedule and the Allowed changes during the Drupal 8 and 9 release cycles.

Version: 8.8.x-dev » 8.9.x-dev

Drupal 8.8.7 was released on June 3, 2020 and is the final full bugfix release for the Drupal 8.8.x series. Drupal 8.8.x will not receive any further development aside from security fixes. Sites should prepare to update to Drupal 8.9.0 or Drupal 9.0.0 for ongoing support.

Bug reports should be targeted against the 8.9.x-dev branch from now on, and new development or disruptive changes should be targeted against the 9.1.x-dev branch. For more information see the Drupal 8 and 9 minor version schedule and the Allowed changes during the Drupal 8 and 9 release cycles.

Version: 8.9.x-dev » 9.2.x-dev

Drupal 8 is end-of-life as of November 17, 2021. There will not be further changes made to Drupal 8. Bugfixes are now made to the 9.3.x and higher branches only. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Eric_A’s picture

Version: 9.2.x-dev » 9.1.x-dev
Eric_A’s picture

Version: 9.1.x-dev » 9.2.x-dev

Version: 9.2.x-dev » 9.3.x-dev

Version: 9.3.x-dev » 9.4.x-dev

Drupal 9.3.15 was released on June 1st, 2022 and is the final full bugfix release for the Drupal 9.3.x series. Drupal 9.3.x will not receive any further development aside from security fixes. Drupal 9 bug reports should be targeted for the 9.4.x-dev branch from now on, and new development or disruptive changes should be targeted for the 9.5.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Version: 9.4.x-dev » 9.5.x-dev

Drupal 9.4.9 was released on December 7, 2022 and is the final full bugfix release for the Drupal 9.4.x series. Drupal 9.4.x will not receive any further development aside from security fixes. Drupal 9 bug reports should be targeted for the 9.5.x-dev branch from now on, and new development or disruptive changes should be targeted for the 10.1.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Eduardo Morales Alberti’s picture

Any news about it?

Version: 9.5.x-dev » 11.x-dev

Drupal core is moving towards using a “main” branch. As an interim step, a new 11.x branch has been opened, as Drupal.org infrastructure cannot currently fully support a branch named main. New developments and disruptive changes should now be targeted for the 11.x branch. For more information, see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.