When creating a view with a relationship to an entity reference field (to join the referred-to entity in the query), duplicates will be produced if the entity reference field is multi-valued (i.e. cardinality > 1). Users assume that checking the box at Advanced -> Query settings -> Distinct will solve this problem, but it doesn't.

This is because from a database perspective, the records actually aren't duplicates. Each different entity reference in the multi-valued field will produce a different row as the target IDs are different, even though this isn't necessarily reflected in the view itself if not all fields are included. Said another way, the view results may look the same on the front-end, even though they're not on the back-end.

It should be noted that this affects all entity types, whether it be taxonomy terms, nodes, paragraphs, custom entities, etc. There are several duplicates of this issue in various other queues that should be marked as such, with all efforts directed here.

Proposed solution

If the Distinct box is checked in the view configuration, after the query has executed, remove results whose entity IDs are already in the list.

Remaining tasks

  1. Produce a patch to solve the basic problem.
  2. Add support for pagers (as per #12).
  3. Add support for aggregation (as per the D7 version of Views Distinct).
  4. Add tests.

Original report

If you have a content type with a multivalue taxonomy term field and add a relationship based on that field in a view (in my case I want to use the term name as an argument) the resulting views result has duplicates if one or more of the nodes has multiple terms in that field.

The DISTINCT keyword does not help, since the tid is added as a field in the SELECT part of the query.

I have the following query:

SELECT DISTINCT node_field_data.sticky AS node_field_data_sticky, node_field_data.created AS node_field_data_created, node_field_data.nid AS nid, taxonomy_term_field_data_node_field_data.tid AS taxonomy_term_field_data_node_field_data_tid
FROM 
{node_field_data} node_field_data
LEFT JOIN {taxonomy_index} taxonomy_index ON node_field_data.nid = taxonomy_index.nid
LEFT JOIN {taxonomy_term_field_data} taxonomy_term_field_data_node_field_data ON taxonomy_index.tid = taxonomy_term_field_data_node_field_data.tid
WHERE (node_field_data.promote = '1') AND (node_field_data.status = '1')
ORDER BY node_field_data_sticky DESC, node_field_data_created DESC

.. which produces duplicates, even though the DISTINCT keyword is present, because the taxonomy_term_field_data_node_field_data.tid makes the duplicate nodes in the results 'distinct'.

Ideally the query should just exclude the field from the SELECT part like this:

SELECT DISTINCT node_field_data.sticky AS node_field_data_sticky, node_field_data.created AS node_field_data_created, node_field_data.nid AS nid
FROM 
{node_field_data} node_field_data
LEFT JOIN {taxonomy_index} taxonomy_index ON node_field_data.nid = taxonomy_index.nid
LEFT JOIN {taxonomy_term_field_data} taxonomy_term_field_data_node_field_data ON taxonomy_index.tid = taxonomy_term_field_data_node_field_data.tid
WHERE (node_field_data.promote = '1') AND (node_field_data.status = '1')
ORDER BY node_field_data_sticky DESC, node_field_data_created DESC

which works an nodes with multiple terms in the field are now only appearing once.

I have couldn't find a way to actually omit the field in the SELECT part of the query, and I'm also not sure, if it could create other issues if removed. But the tid in the select the DISTINCT keyword is useless..

Steps to reproduce.
1. add multivalue taxonomy term field to node type, and add a node with multiple terms selected
2. add a view which uses that field to add a relationship to the term (needed to use the term name as argument)
3. make the view results DISTINCT
4. optionally add the argument (contextual filter) that uses the relationship to filter the views result on the term name.
5. if no argument is given, the node with multiple terms in the field will appear multiple times

Issue fork drupal-2993688

Command icon Show commands

Start within a Git clone of the project using the version control instructions.

Or, if you do not have SSH keys set up on git.drupalcode.org:

Comments

prinds created an issue. See original summary.

Version: 8.5.x-dev » 8.6.x-dev

Drupal 8.5.6 was released on August 1, 2018 and is the final bugfix release for the Drupal 8.5.x series. Drupal 8.5.x will not receive any further development aside from security fixes. Sites should prepare to update to 8.6.0 on September 5, 2018. (Drupal 8.6.0-rc1 is available for testing.)

Bug reports should be targeted against the 8.6.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.7.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Jorge Navarro’s picture

I was able to fix it writing a custom module:

use Drupal\views\ViewExecutable;
use Drupal\views\Plugin\views\query\QueryPluginBase;

/**
 * Implements hook_views_query_alter().
 */

function custommodule_views_query_alter(ViewExecutable $view, QueryPluginBase $query) {
 if ($view->id() == 'VIEW_ID' && $view->getDisplay()->display['id'] == 'DISPLAY_ID') {
    $query->addField('', 'nid', '', ['function' => 'groupby']);
    $query->addGroupBy('nid');
    }
}

I hope this helps.

rgpublic’s picture

#3 works for me. Thanks a lot. A lifesaver! Terrible that we're now on version 8.6 and basic stuff like that still doesn't work properly and reliably out-of-the-box. It's a node, a view and a taxonomy term. All of them are core modules. You use them in a very basic fashion. And it breaks. Stuff like that just shouldn't happen IMHO.

ñull’s picture

The work around is not suitable when sorting comes into play. Any suggestion?

david.qdoscc’s picture

Same issue as #5, however if using a Table format and set the sort functions on the table settings, this does not interfere with the groupby.

it-cru’s picture

Perhaps we should mark this issue with Drupal 8.8.x-dev and raise priority to major?

playful’s picture

I'm using Views 8.7.8 and having this problem. A patch would be very much appreciated!

mykola dolynskyi’s picture

8.7.10 - same problem as described in #0

colan’s picture

Title: Views relationship for taxonomy term field invalidates DISTINCT keyword » Views relationships with multi-valued entity reference fields invalidate Distinct query option
Version: 8.6.x-dev » 8.9.x-dev
Status: Active » Needs work
Related issues: +#2609938: Port Views Distinct module to Drupal 8
StatusFileSize
new1.44 KB

I made the title more general as this actually affects any entity references (with multiple values), not just Taxonomy.

Contrib has been handling this with the Views Distinct module, which (after the query is executed) removes duplicate results. This seems like the simplest solution, and Advanced -> Query settings -> Distinct should certainly be doing it. So I'm keeping it as a Bug, and not changing it to a Feature Request.

Here's a D8-core version of what somebody posted in #2609938-8: Port Views Distinct module to Drupal 8. It's still missing support for aggregation as per the D7 module, but seems to work well if not using that in your views.

colan’s picture

Issue summary: View changes
Issue tags: +Needs tests

Updated the IS.

sense-design’s picture

The patch from #10 breaks my pager but it filters all the duplicates 👍

colan’s picture

Issue summary: View changes

#12: You can't have everything. ;) Thanks for the review. I just added pager support to the IS.

Folks: Feel free to add other items there directly, and keep moving this forward. It now works well enough for my use case so I won't be doing any more work on it.

bkosborne’s picture

Why does this need to be solved by adding a group by? Seems to me the problem is that views is adding a field to the SELECT for the ID of the referenced entity for no apparent reason. That's what makes each result unique and prevents a normal DISTINCT from working.

bkosborne’s picture

The problem is this bit of code in Drupal\views\Plugin\views\query\Sql in "query" method:


    // Make sure each entity table has the base field added so that the
    // entities can be loaded.
    $entity_information = $this->getEntityTableInfo();
    if ($entity_information) {
      $params = [];
      if ($groupby) {
        // Handle grouping, by retrieving the minimum entity_id.
        $params = [
          'function' => 'min',
        ];
      }

      foreach ($entity_information as $entity_type_id => $info) {
        $entity_type = \Drupal::entityManager()->getDefinition($info['entity_type']);
        $base_field = !$info['revision'] ? $entity_type->getKey('id') : $entity_type->getKey('revision');
        $this->addField($info['alias'], $base_field, '', $params);
      }
    }
bkosborne’s picture

Here's my ugly solution that removes the ID of the entity reference field from the list of fields in the SELECT.

Note that this has to be done in views_post_build and not views_query_alter, because the ID field is added after views_query_alter is invoked :(.

/**
 * Implements hook_views_post_build().
 */
function HOOK_views_post_build(ViewExecutable $view) {
  if ($view->id() === '...' && $view->getDisplay()->display['id'] === '...') {
    // Populate with the name of the field alias that views used when adding the field.
    // I found this for my view with had a relationship to a paragraph using xdebug.
    $entityReferenceIdFieldAlias = '';

    /** @var \Drupal\Core\Database\Driver\mysql\Select $mainQuery */
    $mainQuery = $view->build_info['query'];
    $mainQueryFields = &$mainQuery->getFields();
    unset($mainQueryFields[$entityReferenceIdFieldAlias]);

    /** @var \Drupal\Core\Database\Driver\mysql\Select $countQuery */
    $countQuery = $view->build_info['count_query'];
    $countQueryFields = &$countQuery->getFields();
    unset($countQueryFields[$entityReferenceIdFieldAlias]);
  }
}

Nothing in the view seems to break from this, and it allows distinct to work correctly.

vidorado’s picture

My solution has been adding a groupby by nid. People reports that this doesn't work when sorting is in place. In my case was because there were problems with the ANSI SQL mode of GROUPBY, not allowing to select fields that weren't in the groupby.

For this kind of groupby can be done, we have to disable ANSI and ONLY_FULL_GROUP_BY modes in Connection class. Is risky, but it can be handy if you know what you are doing...

core/lib/Drupal/Core/Database/Driver/mysql/Connection.php:

/**
   * {@inheritdoc}
   */
  public static function open(array &$connection_options = []) {

   ...

  $sql_mode = 'STRICT_TRANS_TABLES,STRICT_ALL_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO';
   ...  

  }

Custom module:

/**
 * Implements hook_views_pre_execute().
 */
function mymodule_views_pre_execute(ViewExecutable $view) {
  if (in_array($view->id(), ['...', '...'])) {
    $view->build_info['query']->groupBy('nid');
  }
}
yfma’s picture

In our case, we have translated taxonomy terms and desecration to be displayed in the view, even no multiple terms selected on the content, it was still duplicated in the view. The patch from #10 works well, filtered out the duplicate from either original and translated language.

liquidcms’s picture

Hey Colan, great to see you (and others) taking this on... another awesome feature lost in D8 :(

I tried patch in #10 on my modified version of the Commerce Orders view. The paging seems to be fine but 1 very odd issue i see is that the VBO checkbox on the last item of the first page is missing: https://www.screencast.com/t/4ddRPL0Z. I only have 2 pages at the moment; and 2nd page isn't full; so my guess is this will be missing on all full pages.

I may get a chance to look into this myself; but likely not until next week or so.

robertoperuzzo’s picture

I tested #2993688-10: Views relationships with multi-valued entity reference fields invalidate Distinct query option in my scenario and it works fine, pager too. My view has a relation with taxonomy multi-values field and I'm not using VBO.

tahiche’s picture

After trying out just about everything it seems like @bkosborne at #16 is actually working!
The patch at #10 works but is incompatible with a pager and results summary (since it filters after the actual execution).
I couldn´t get hook_views_query_alter to work. In my case it´s a Catgory (taxonomy) as a relationship.
Now, if i may ask... Why is this fix ugly?. Looks beatiful to me ;)

colan’s picture

playful’s picture

I tried #10 and it worked to remove the duplicates but didn't work well with the pager. Any chance to get #16 in a patch for testing?

amaisano’s picture

The "proposed" solution here is awful, if it's suggesting we post process the results of the view. This results in inaccurate paging, counts, etc. Using the GROUPBY adjustment is a better, more pure approach that doesn't require tweaking results, pages, etc.

colan’s picture

I just needed to get something working, but you're right. That's exactly why I set the patch to Needs Work. Better patches are obviously welcome.

bkosborne’s picture

GROUPBY or post alter - they are both bad solutions to the real problem, which is that field should not be added when it's not used.

vidorado’s picture

We can work in adding the fields to the groupby so we don't have to modify the SQL mode like I did in #17

I don't see other "clean" solution apart from a groupby @bkosborne, ¿do you have any other proposal?

Version: 8.9.x-dev » 9.1.x-dev

Drupal 8.9.0-beta1 was released on March 20, 2020. 8.9.x is the final, long-term support (LTS) minor release of Drupal 8, which means new developments and disruptive changes should now be targeted against the 9.1.x-dev branch. For more information see the Drupal 8 and 9 minor version schedule and the Allowed changes during the Drupal 8 and 9 release cycles.

michèle’s picture

I had the same problem. Despite distinct, nodes with multiple taxonomy terms were displayed multiple times. I tested patch #10 and thought at first that it would work in my case because the nodes with multiple terms were not displayed multiple times afterwards. But then I realized that the view now contains too few lines.
Example: I have 10 nodes that should be shown in the view. 3 nodes have 2 terms, 7 nodes have 1 term.
Now the view shows only 7 rows instead of 10 rows (3 nodes with 2 terms and 4 nodes with 1 term - so in the end I have 10 displayed terms but only 7 displayed nodes).

So I removed the patch and tried the solution from bkosborne in #16. In my case it works! Thank you @bkosborne!
A little tip: I found out the alias very easily with kpr ($ mainQueryFields) (requires Devel).

yalasta’s picture

#3 works for me. Thanks a lot @Jorge Navarro!

sense-design’s picture

#10 is not compatible to Drupal 8.9

kclarkson’s picture

Thank you so much for #10 patch. I am currently on Drupal 8.8.5.

I would love to see an updated 8.9 patch.

pameeela’s picture

Category: Bug report » Feature request

I am updating this to a feature request based on consultation with @Lendude, @jibran and @catch in Slack - comments below. There is probably not a single, catch-all solution for this but we are leaving it open to encourage discussion and patch proposals.

Lendude says:

Well there are a number of options on the issues that you could try that could work for specific views, but have to agree with @jibran that an actual solution that only shows the rows you want and is database agnostic is unlikely to come up

jibran says:

I don't think it is fixable.

There is a an inherit problem with relationships. For example if you say, I'm going to invite my father's parents to the party( which is an example of the single value ER relationship) then it make sense but if you say I'm going to invite my father's parents, my paternal uncle's parents and my paternal aunt's parent ( which is an example of the multi value ER relationship) then you are inviting three copies of same grandparents which doesn't make sense.

The only way to fix this is to fix it per row. As in, my father's parents, my mother parents and my cousin parents

tednacip’s picture

Hi, did you find a solution for D8.9 ?
Thank you !

adsyy’s picture

Hi, path #10 work for me on D8.9.1 but not totally because view->total_row is not updating correctly (obviously because it's post treatment).

edit : patch dont work on D8.9, duplicate node are deleted twice.

pameeela’s picture

Issue tags: +Bug Smash Initiative
coufu’s picture

#3 works for me (so far) thank you.

acbramley’s picture

This can actually surface without a relationship. I have a grouped filter on an entity reference field with a single option on "Not empty". This results in an INNER JOIN on the field's table and any entity that has multiple values in that field will have a duplicate row produced. Similarly to the issue with relationships, this can't be solved by using a Distinct.

dqd’s picture

This issue is not only affecting multi-valued reference fields but also multiple back references from other bundles referring to the listed items when the user tries to add them via relationship.

I rarely feel the need to reverse a status decision to prevent a back and forth edit war, but here I somewhat disagree with the decision turning this issue into a feature request. We have an underyling problem to solve in relation to an existing and promoted feature which is not working as expected. No matter if it is more or less complicated or more or less involving other changes or misunderstandings of users why something is not working easely as expected. Let's break it down to the UX, which is always our last instance of "if something works" or not: The user wants to list items and wants to add fields with references, or parts of other items which are referring to the prior items listed. The average user assumes that this is exactly what should easely be done with Views and the more advanced user expects that DISTINCT is exactly there for to prevent more than one row to be shown if there are multiple references or back references. And it is not working that way easely. If this is a "feature request" then we can not say on the product page that Views is able to list items and references from or to it side by side easely.

Look here and here how users expect that to work out. Otherwise why should we limit the entitiy reference module to only be able to add/refer in one direction only? Modules like CER try to overcome this limitations but run with the risk to break things. This is a "not to solve" issue from the users perspective from one end or the other at the moment. If this is a feature request then we have to state clearly what entity reference is not able to do in combination with Views at the moment.

Little side kick info: We already discussed this in times of Node reference vs Relations module priority in D7 cycles and the question which attempt should be preferred for core.

pameeela’s picture

@diqidoq feel free to switch it back to a bug report.

I changed it on the advice of Lendude and jibran but solely based on the lack of a viable solution that would work for all use cases. We aren't saying it's simply 'not a bug' or isn't something that should be supported.

If you think there is a good solution for it, patches welcome!

rgpublic’s picture

What I don't understand is the absolute premise about the solution needing to be database agnostic. Could someone elaborate? If I understand correctly we need to switch off ONLY_FULL_GROUP_BY. I might be wrong, but this seems to be available with PostgreSQL and MySQL. Anyone actually using other databases? And it's "risky" only if we remove this permanently on DB open. We could also switch this flag off just for the query if it's set and reset everything back to its original state after the query.

Currently, the corresponding Views option says somewhat cryptically sth. along the lines of "Try(!) to remove duplicate rows. This doesn't always work". If such a weak promise is made, all bets are off anyway. Why not just check the type of database and if it's supported and the option is set, really try everything possible to remove those rows? It would work for >99% of the user base and get rid of a (IMHO) quite terrible problem.

dqd’s picture

@pameeela Sure. I absolutely see your reasons and it wasn't any offense. And I didn't turned it back to a bug report yet because a) I agree with parts of what you and Lendude and jibran discussed but just put my 2 cents to it. And b) I would like to collect more thougths and voices here on it. Also on my concerns. Feel free to add something to my inserts. And of course are patches appreciated. We support a lot of projects on D.O. but I think there is still more to put on the table before starting with patches in different directions.

@ rgpublic: This! +1 Good point. Maybe ... If you are not mistaken and the simple solution would work, I would rather recommend to make this an option setting under query settings exactly where the user also can set DISTINCT. I would not let it go active automatically. Why: Because 1.) It is a more advanced setting the user should be aware of and should know what she/he is doing and 2.) This has a better performance because the user can be warned in the checkbox description to only activate it in case of running databse type A or B ... etc and we need to checks what the project is build on.

jasonluttrell’s picture

Thanks to all of those on here who are attempting to address this issue. This is definitely a bug (not a feature request) and it needs a resolution as it does not work as per user expectation. None of the proposed solutions thus far fully address the problem. I was able to implement a workaround by adding an additional custom text field at the bottom of my fields in my view and it then grouped the results. It took a lot of playing around to get it to work, but I managed. It should not be that difficult to make this work though. Hopefully, the group/distinct issue can be resolved soon. Thanks again.

Also regarding @rgpublic's point about the "solution needing to be database agnostic"--100% agreed. A solution that works 95% of the time is better than one that currently does not work.

baluertl’s picture

Patch #10 applied on 8.9.x branch for me does the job of filtering only distinct rows from views output.

ravi.shankar’s picture

StatusFileSize
new897 bytes

Added reroll of patch #44.

Version: 9.1.x-dev » 9.2.x-dev

Drupal 9.1.0-alpha1 will be released the week of October 19, 2020, which means new developments and disruptive changes should now be targeted for the 9.2.x-dev branch. For more information see the Drupal 9 minor version schedule and the Allowed changes during the Drupal 9 release cycle.

brightbold’s picture

Per #19, shouldn't we add VBO support to the "Remaining tasks" list? I was revisiting this issue since there's been some movement on it, but remembered that the inability to use VBO on the final item in the list was the dealbreaker for me — my client needs this in a Commerce view where they use VBO to mark orders in bulk as fulfilled. Obviously it's not core's responsibility to ensure it's compatible with every contrib module, but the fact that there's no checkbox on the last item in the list presumably means there's something non-standard about the way the patch works which may have other ramifications.

I ultimately tried the solution in #16 and that seemed to do the trick! Thanks @bkosborne. If anyone else is implementing that, you can also get the entityreference field alias through kint($view) by drilling down to query > fields > [name of field where the relationship is causing the duplication] > [whatever's in the alias column for that field].

anruether’s picture

> Obviously it's not core's responsibility to ensure it's compatible with every contrib module, but the fact that there's no checkbox on the last item in the list presumably means there's something non-standard about the way the patch works which may have other ramifications.

I'm not sure if [#19] is using VBO module or cores bulk operations. I'm seeing that in a view with bulk operations for a row that used two rows without this patch is the last in the list, the vbo checkbox is displayed. But if an item is last which was already distinct without this patch, the checkbox is not displayed (see screenshot).

Also, it seems we can't display all results of the reference field when using this patch like the "Display all values in the same row" option for multi value fields?

blainelang’s picture

StatusFileSize
new718.9 KB

Using 9.0.7 and applied #45 via composer and found the test for the view query instance of was returning FALSE
even though, when I evaluate $view->getQuery the object is a Sql instance - see image below.
if (!$view->getQuery() instanceof Sql)

getQuery instance

v.hilkov’s picture

StatusFileSize
new1.15 KB

I see, that in #45 is missing use Sql dependency, that's why it wasn't working properly.

liquidcms’s picture

Just an update to my comment above suggesting this patch breaks vbo; that is not entirely correct. It seems that this patch with vbo and distinct set only break vbo when i add a field that uses an entity relationship.

liquidcms’s picture

A test case: Paragraphs on a Node; multivalue Term field on paragraph (currently using this patch)

- i have 2 Paragraphs on my node.

- i originally had a content (node) view which i then made a relationship to the paragraph to list paragraph fields. This caused duplicates (4 rows). When i select distinct, i now get 1 row (the correct answer is 2) because i suspect this patch groups on NID, so i only have 1 nid, but 2 paragraphs on it

- re-did the view to be a paragraphs view and now with distinct off i get 4 items; but with it on, i get 2 (correct as now i suspect distinct is grouping by paragraph id, not nid). So far, so good. :)

But.. on the paragraph i have a multivalued taxonomy term; this term has a field i want to display so i add a relationship to that term. With distinct set i only see one of my terms (so 4 results with only 1 term showing) with distinct off; i now get 12 results (1 paragraph has 1 term, the other has 2 terms - so 12 rows makes sense).

I haven't gone through this patch at all; but guessing the distinct/groupby has to be done on a relationship by relationship basis; not on just the base id of the row.

I am pretty sure all of this worked correctly in D7 (but i'll need to test).

liquidcms’s picture

Component: taxonomy.module » views.module
Category: Feature request » Bug report
Priority: Normal » Major

Not sure how this isn't at least Major.

thirstysix’s picture

D8.9.x - Patch #10 work well, but not in the pager. I tried with all patches, but still having the duplicates in pager.

camslice’s picture

Same here, still getting duplicates in the pager. Drupal 9.1.5. Thanks for everyone's hard work on this, I'm not at a level where I can contribute but I'm cheering from the sidelines :)

matthiaso’s picture

Had the same pager issue. Fix based on #50

matthiaso’s picture

For the pager issue, I added patch 56 based on #50.

mxwright’s picture

Status: Needs work » Needs review
mxwright’s picture

camslice’s picture

Unfortunately #56 doesn't work for me on Drupal 9.1.5. Interestingly the issue appears as soon as I add a relationship to the view. I've found that I don't need to use the relationship in a filter for the bug to appear. The mere existence of a relationship on the view causes duplicates to surface in the pager.

shriaas’s picture

This problem is not just with referenced field but any field which has its own table even fields like plain text.
One more thing to take care of while coming up with a solution is how does not contain/is not one of operator is handled.
For example a multi-valued text field of n1 contains foor, bar and baz and filter has value bar the containsoperator should return TRUE and the does not contains operator should return FALSE.
See this issue for details: https://www.drupal.org/project/drupal/issues/2980394.

colan’s picture

seanb’s picture

As mentioned multiple times, removing results in views_views_post_execute() will break pagers and can potentially show even empty pages.

After digging around a bit a traced the issue to Drupal\views\Plugin\views\query\Sql::query() and specifically:

    // Make sure each entity table has the base field added so that the
    // entities can be loaded.
    $entity_information = $this->getEntityTableInfo();
    if ($entity_information) {
      $params = [];
      if ($groupby) {
        // Handle grouping, by retrieving the minimum entity_id.
        $params = [
          'function' => 'min',
        ];
      }

      foreach ($entity_information as $entity_type_id => $info) {
        $entity_type = \Drupal::entityTypeManager()->getDefinition($info['entity_type']);
        $base_field = !$info['revision'] ? $entity_type->getKey('id') : $entity_type->getKey('revision');
        $this->addField($info['alias'], $base_field, '', $params);
      }
    }

I think we should not necessarily add base fields for all relations. Sometimes the relationship is only used for filtering, not to show fields. We could have a number of ways to fix this. Whatever we do, we should probably add an option to allow the current behavior to be changed. This will prevent existing views that rely on this to remain the same.

  • We add an option like 'Disable automatic base field for relationships.'
  • When this option is set, we:
    • A) Only add a base field for the base table of the view. This is the easiest option, but if a user configured fields using the relation this would break. So I guess this would be a little hacky.
    • B) Try to extract which relationship needs to have its base field added by checking the fields configured for the current display.
seanb’s picture

StatusFileSize
new2.29 KB

This patch implements the quick workaround for #63.A since that seems to fix it for me at the moment. Be careful using this patch. Hopefully it does help others for the time being and provide a possible direction for a proper fix.

colan’s picture

Status: Needs review » Needs work
+++ b/core/modules/views/src/Plugin/views/query/Sql.php
@@ -300,6 +303,12 @@ public function buildOptionsForm(&$form, FormStateInterface $form_state) {
+      '#description' => $this->t('By default views adds field to the query for each relationship in the view. This can potentiolly break the "distinct" option.'),

Do you mean "adds a field" or "adds fields"? The grammar is incorrect.

Also, "potentiolly" is spelled wrong.

+++ b/core/modules/views/src/Plugin/views/query/Sql.php
@@ -1351,10 +1360,17 @@ public function query($get_count = FALSE) {
+        // after that, so if we want to skip base fields for relation we can

"for relations" or "for a relation"?

manish-31’s picture

Assigned: Unassigned » manish-31
manish-31’s picture

Assigned: manish-31 » Unassigned
Status: Needs work » Needs review
StatusFileSize
new2.29 KB
new1.34 KB

Attaching a patch, I have fixed the typos and grammatical issues. Needs review.

The last submitted patch, 67: 2993688-66.patch, failed testing. View results

seanb’s picture

StatusFileSize
new581 bytes
new2.86 KB

My grammar is awful so thanks for the review! I forgot to add the schema for the new setting as well, so here is another version with the schema so the setting will be added to config on export.

thomas.dutch’s picture

We tested patch 2993688-69.patch hoping it might also prevent duplicate items when content contains multiple dates inside a paragraph in combination with a date filter. In this scenario it did prevent the duplicates but only when we remove the date sort (which, just like the date filter, contains a relation to the paragraph with the date field).

Version: 9.2.x-dev » 9.3.x-dev

Drupal 9.2.0-alpha1 will be released the week of May 3, 2021, which means new developments and disruptive changes should now be targeted for the 9.3.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

tim-diels’s picture

Good work everyone! I had this issue also and patch #69 works to solve the issue.

ginovski’s picture

This is very needed, +1

camslice’s picture

Patch #69 along with ticking the new checkbox "Disable automatic base field for relationships" in Query settings solves it for me. Nice work folks :)

glynster’s picture

Even with this patch applied we are not getting any success. Our setup is a view with a specific content type, entity reference for a taxonomy with multi value. Patching and then applying the checkbox gives us the same results. For Drupal 7 sites we used views_distinct module a lot to solve this issue. @ThirstySix created a working version of it for D8/9 https://www.drupal.org/files/issues/2021-02-23/views_distinct.zip and it solved the problem right away. I wanted to link this here to see what could be used from the module to truly resolve the core issue.

tobiberlin’s picture

I applied the patch from #69 with the expected result for most cases. But I have one case where the patch avoids any relationships to be added to the query. I have a view of entities with two displays: one page display and one data export display created with Views data Export module. This export is added to the page display so the user has a button add the end of the page where he can download the result as CSV.

When I export the data all columns which should show data/fields from entities which are included by relationships become empty whenever I check the "Disable automatic base field for relationships" checkbox in view configuration for this export view. When I uncheck this the data of the related entities come back but the old problem of duplicated results come back.

I created an issue for Views Data Export: https://www.drupal.org/project/views_data_export/issues/3226848

seanb’s picture

@tobiberlin Yeah the patch in its current form is not a real solution, just a workaround that works for some (mostly simple) views. Ideally we should figure out which automatic base fields are actually needed, and only discard fields we don't need. That is precisely why this bug exists though, it not that easy to figure out which base fields are needed in the select (specially since queries could be altered).

imclean’s picture

The patch in #69 works well when there are no fields from the referenced entity in the view. We're using it to filter by the bundle of the referenced entity.

AndyLicht’s picture

I´m working with Drupal 8.9.18

At the moment no solution is working for me. You can see my configuration at the attached image.
I´ve created a relationship between nodes and media object over a vocabulary.

The media object has the field field_tags and the node has the field field_mt_post_tags both are referenced to the same vocabulary.

Now i´ve got the problem of duplicates and that are all images shown if only one tag is used together, but i want to have every image once and i want that the media is referenced to all tags of the referenced tags of the node.

If i´m using the code of #3:

use Drupal\views\ViewExecutable;
 use Drupal\views\Plugin\views\query\QueryPluginBase;
  
 /*
 * Implements hook_views_query_alter().
 */
function custom_view_query_views_query_alter(ViewExecutable $view, QueryPluginBase $query) {
  $id_view = $view->storage->id();
  drupal_set_message($id_view);
  switch($view->storage->id()) {
    case 'duplicate_of_photo_gallery_test_';
	
	/*
    $query->addField('entity_type', 'id', '', ['function' => 'groupby']);
    $query->addGroupBy('entity_type.id');
	*/
    break;
  }
}

I got the following error, so i have added the "media: ID", than i get the error in the view-display (second image).
Has someone an idea to solve that problem?

eswiderski’s picture

#75 worked perfect for me. Thanks @glynster

weekbeforenext’s picture

Patch #69 worked, even in a case where the related term fields are being used in a Combined fields filter. Thanks!

brooke_heaton’s picture

Patch #69 is working well for me.

brooke_heaton’s picture

Status: Needs review » Reviewed & tested by the community
danflanagan8’s picture

Status: Reviewed & tested by the community » Needs work

Setting back to NW. This still needs tests.

And the comment in #77 from the patch's poster certainly gives me pause.

danflanagan8’s picture

Status: Needs work » Needs review
StatusFileSize
new9.9 KB

Here's a test-only patch that re-creates the scenario described in the original IS. It adds config for a test view. If you do a standard install locally, you should be able to import that view through the config ui (i.e. copy/paste into the import single item form). That might help manual testing.

I'll leave the Needs Tests tag on for now, but I'm hoping someone can review the test. I think this is what we need, but another set of eyes would be great. FWIW, the patch in #69 does not fix any of these failing cases.

Status: Needs review » Needs work

The last submitted patch, 85: 2993688-85-FAIL.patch, failed testing. View results

welly made their first commit to this issue’s fork.

Version: 9.3.x-dev » 9.4.x-dev

Drupal 9.3.0-rc1 was released on November 26, 2021, which means new developments and disruptive changes should now be targeted for the 9.4.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

liquidcms’s picture

I tried the idea of adding a groupby in a query_alter; but this adds groupbys for all my fields (as enabling aggregation does). When i take the resulting views sql and run it directly in the db i see my duplicates. When i remove all the groupbys except the one i add with code, i get the correct result set.

before i tried to track down how to programmatically remove the unwanted groupbys, i tried the approach in #16 - and sure enough, it seems to work.

wasn't entirely sure which field to remove as i had the following:
0 = "paragraphs_item_field_data_langcode"
1 = "field_sessions_paragraphs_item_field_data_langcode"
2 = "paragraph__field_session_date_field_session_date_value"
3 = "id"
4 = "field_sessions_paragraphs_item_field_data_nid"

but removing the first 1 did the trick.

btw, for this fix: aggregation is off, distinct is enabled.

havent yet tried this with vbo; but pager works correctly.

liquidcms’s picture

Although the solution from #16 worked for one of my paragraph views it didn't work for another.

I was able to get the 2nd of my views to remove duplicates by using aggregation and some combination of groupbys added through the UI. But, with this i noticed, somehow this breaks my date range fields from showing their end dates.. wth??

Trying solution from #16, even though i have numerous fields in my view there are only 2 that show in that code and removing either of them only removes columns of required data from my view.

I also noticed that with this view; even though i have distinct set, it does not show in views sql. And if i take the sql that views has created and use it directly in db as an sql query; i show the duplicates - until i add the distinct and then this fixes it. More views bugs.. :(

edmund.dunn’s picture

#69 worked well for me.

isinadinos’s picture

#69 not working in drupal version 9.3.3

azinck’s picture

I'm just here to strongly endorse the solution outlined by SeanB in 63B. Everything else (post-processing, GROUP BY) is a hack. We've got to get the columns out of the select statement unless they're explicitly required.

anybody’s picture

Agree with #93. #63B seems to be a good idea, @seanB would you rate 63B as a real fix or workaround? With your experience, which way would you suggest? And should other core maintainers perhaps have a look to make a final plan?

seanb’s picture

As mentioned in #63, in theory the best solution is to only add base fields when they are actually required.

That being said, the views module has a lot of ways to alter a view, and the current behavior has been in the module forever. There are probably lots of sites out there relying on the current behavior. That makes it very difficult to determine which base fields are actually required.

Since it is very hard to determine if any existing code needs a base field to be in the query, I guess making this configurable is the only option. The next questions are:

  1. What should be the default for new sites? Do we keep the current behavior and add an option to turn this off, or do we turn this off by default and write an update hook so existing sites keep the current behavior?
  2. Which base fields do we include? I think by default we only need the base fields that are actually configured to be displayed, but I'm not 100% sure about this. This could also lead to problems in code that alters the view/query, but I guess anyone altering a view is also responsible for making sure the proper fields are added.

My vote would probably be to change the default to only include base fields that are actually displayed, and write an update hook to make sure existing sites keep the current behavior.

anybody’s picture

@seanB: 100% agree with all your points in #95 and +1 your vote. I'd see it exactly the same way. As this was a problem in the past and users new to Drupal should not have to learn why things are (wrong) like they are, it should have the improved functionality in the future by default.

So how to proceed here? I guess it would make sense if a core / views subsystem maintainer could have a look? First we should have the plan and green light, before the implementation is started? I'm not very experienced with large core issue workflows. But I think this is indeed major for (even a bit) more complex views.

Version: 9.4.x-dev » 9.5.x-dev

Drupal 9.4.0-alpha1 was released on May 6, 2022, which means new developments and disruptive changes should now be targeted for the 9.5.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

alfattal’s picture

Patch in #69 failed to apply on version 9.4.

ravi.shankar’s picture

StatusFileSize
new2.84 KB
new1.3 KB

Added reroll of patch #69 on Drupal 9.5.x.

rsnyd’s picture

#99 seems to work for me on 9.4.3

bbytyqi’s picture

#99 works on 9.4.5

Version: 9.5.x-dev » 10.1.x-dev

Drupal 9.5.0-beta2 and Drupal 10.0.0-beta2 were released on September 29, 2022, which means new developments and disruptive changes should now be targeted for the 10.1.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

aharown07’s picture

Status: Needs work » Reviewed & tested by the community

#99 works in my tests - in 9.4.8

aharown07’s picture

Status: Reviewed & tested by the community » Needs review

Column not found error after adding sort
Details:
* Node view
* Entity reference field to users
* Several nodes with multiple users in the field
* Distinct enabled, 'Disable automatic base table' enabled

This successfully eliminated duplicate rows.

Next...
* Add updated/last comment date to sort

SQLSTATE[42S22]: Column not found: 1054 Unknown column 'node.changed' in 'field list': SELECT DISTINCT "history"."timestamp" AS "history_timestamp", "node_field_data"."created" AS "node_field_data_created", "node_field_data"."changed" AS "node_field_data_changed", "comment_entity_statistics"."comment_count" AS "comment_entity_statistics_comment_count", "node_field_data"."nid" AS "nid", GREATEST(node_field_data.changed, comment_entity_statistics.last_comment_timestamp) AS "comment_entity_statistics_last_updated", GREATEST(node.changed, comment_entity_statistics.last_comment_timestamp) AS "comment_entity_statistics_last_updated_1" FROM "node_field_data" "node_field_data" LEFT JOIN "node__field_select_users" "node__field_select_users" ON node_field_data.nid = node__field_select_users.entity_id AND node__field_select_users.deleted = :views_join_condition_0 LEFT JOIN "users_field_data" "users_field_data_node__field_select_users" ON node__field_select_users.field_select_users_target_id = users_field_data_node__field_select_users.uid LEFT JOIN "comment_entity_statistics" "comment_entity_statistics" ON node_field_data.nid = comment_entity_statistics.entity_id AND comment_entity_statistics.entity_type = :views_join_condition_1 LEFT JOIN "history" "history" ON node_field_data.nid = history.nid AND history.uid = :views_join_condition_2 INNER JOIN "node" "node" ON node_field_data.nid = node.nid WHERE ("node_field_data"."status" = :db_condition_placeholder_3) AND ("node_field_data"."type" IN (:db_condition_placeholder_4)) ORDER BY "comment_entity_statistics_last_updated_1" DESC LIMIT 5 OFFSET 0; Array ( [:db_condition_placeholder_3] => 1 [:db_condition_placeholder_4] => pm [:views_join_condition_0] => 0 [:views_join_condition_1] => node [:views_join_condition_2] => 1 )

Edit: I'm also getting the error with Distinct and Disable base turned off, though. So it may not be related.

Ankit.Gupta’s picture

I have checked patch #99 in my local system it is applied successfully in drupal 10.1.x

smustgrave’s picture

Status: Needs review » Needs work

Error was reported in #104

Also there are a number of remaining tasks it appears.

anybody’s picture

@aharown07 but you don't get the same error without the whole patch? That's for sure?

aharown07’s picture

@anybody. Will test some more. Results shortly.

Test result: I am seeing the error without the patch also on another build without the patch.
So #104 is not due to the patch
Apologies for the hasty post on that.

acbramley’s picture

StatusFileSize
new13.17 KB

You can workaround this with some filter magic in views. Create a new filter OR group and add a filter on the delta = 0 along with another EMPTY filter (so rows without a field value aren't filtered out). E.g

Filter config

Where Personalisation criteria is my entity type and Suburbs is my field.

dpineda’s picture

Nice job! It works great

aharown07’s picture

@acbramley
Tested: fantastic! I don't understand why that works, but it does.
I was going a little crazy repatching core over and over.

acbramley’s picture

@aharown07 essentially you're reducing the duplicates by filtering on the field delta.

komlenic’s picture

+1 for #99. This resolved the issue in my use case on Drupal 9.5.2.

aharown07’s picture

Status: Needs work » Reviewed & tested by the community

I ended up going back to #99 because I had a couple of views where I couldn't use the target ID method in #109 (or maybe more accurately, had a couple of views where I couldn't figure out how to use it). In any case there are tradeoffs: with #109, you have to set this up for each view. With #99, you solve the problem with all of them.
It turned out I have a pretty large number of views with entity reference relationships causing duplicate rows.
I also no longer see any errors with that patch in place.

Seems like this qualifies for RTBC at this point?

danflanagan8’s picture

Status: Reviewed & tested by the community » Needs work

This is definitely not RTBC. It still has the "Needs Tests" tag. I added tests back in #85. I don't believe the tests were ever reviewed and they definitely are not included in the patch in #99.

I think it would be wise to see if the patch in #99 makes the fail test in #85 pass.

Back to NW.

cruze72’s picture

Has anyone got this to work with Drupal 10? I have duplicates when using relationships on a view for commerce orders:
Order Item
User
(User Profile)

Here is the sql query:

SELECT DISTINCT "commerce_order"."order_number" AS "commerce_order_order_number", "commerce_order_item_commerce_order__order_items"."title" AS "commerce_order_item_commerce_order__order_items_title", "profile_users_field_data__profile__address"."address_given_name" AS "profile_users_field_data__profile__address_address_given_nam", "commerce_order"."order_id" AS "order_id", MIN(commerce_order.order_id) AS "order_id_1", MIN(commerce_order_item_commerce_order__order_items.order_item_id) AS "commerce_order_item_commerce_order__order_items_order_item_i", MIN(users_field_data_commerce_order.uid) AS "users_field_data_commerce_order_uid", MIN(profile_users_field_data.profile_id) AS "profile_users_field_data_profile_id"
FROM
{commerce_order} "commerce_order"
LEFT JOIN {commerce_order__order_items} "commerce_order__order_items" ON commerce_order.order_id = commerce_order__order_items.entity_id AND commerce_order__order_items.deleted = '0'
LEFT JOIN {commerce_order_item} "commerce_order_item_commerce_order__order_items" ON commerce_order__order_items.order_items_target_id = commerce_order_item_commerce_order__order_items.order_item_id
LEFT JOIN {users_field_data} "users_field_data_commerce_order" ON commerce_order.uid = users_field_data_commerce_order.uid
LEFT JOIN {profile} "profile_users_field_data" ON users_field_data_commerce_order.uid = profile_users_field_data.uid
LEFT JOIN {profile__address} "profile_users_field_data__profile__address" ON profile_users_field_data.profile_id = profile_users_field_data__profile__address.entity_id AND profile_users_field_data__profile__address.deleted = '0'
GROUP BY "commerce_order"."order_id", "commerce_order_order_number", "commerce_order_item_commerce_order__order_items_title", "profile_users_field_data__profile__address_address_given_nam"
LIMIT 100 OFFSET 0

The output prints every created profile from all orders on the site against every single order. ie 20 orders, 20 profiles thus 400 entries.
The duplicates are removed as soon as the (User)Profile relationship is removed.

prem suthar’s picture

StatusFileSize
new2.83 KB

Re-roll the #99 Patch For 10.1

bburg’s picture

Using patch in #99 with the Distinct and new option enabled and I am still seeing duplicates. I have an "Event" node with a multi-value set of "Session" paragraphs, which have Smart date range fields. I only want to display a single version of a node with the next upcoming session. I was able to engineer a query alter that inserts a subquery to handle the filter. However, I still need the relationship to handle the sorting for me, which brought me to this issue.

kushagra.goyal’s picture

#3 working perfectly for me instead of applying patch, thanks...

Version: 10.1.x-dev » 11.x-dev

Drupal core is moving towards using a “main” branch. As an interim step, a new 11.x branch has been opened, as Drupal.org infrastructure cannot currently fully support a branch named main. New developments and disruptive changes should now be targeted for the 11.x branch, which currently accepts only minor-version allowed changes. For more information, see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

j_s’s picture

Also just noting that using #3 worked well for my specific use case. But the patch making automatic base fields optional seems promising.

vasike’s picture

it seems there are some issues with DISTINCT ...

for example, it's not taken if there are other elements ... like sorting or some specific fields

the part !empty($this->fields) of this condition if (empty($this->noDistinct) && $this->distinct && !empty($this->fields)) { ...it "stops" the DISTINCT

So 2993688-85-FAIL.patch from #85 ... i don't think it will work, even we "fix the SQL".

About the latest patch

i would say it could help, but not sure it's the "universal panacea" for multiple value fields relationship.

However i would change some things like naming and "loop break"

for example:
Name: exclude_relationship_base_fields instead of disable_automatic_base_fields
and
Code:

      foreach ($entity_information as $info) {
        if (!$exclude_relationship_base_fields || $info['relationship_id'] === 'none') {
          $entity_type = \Drupal::entityTypeManager()->getDefinition($info['entity_type']);
          $base_field = !$info['revision'] ? $entity_type->getKey('id') : $entity_type->getKey('revision');
          $this->addField($info['alias'], $base_field, '', $params);
        }
      }

or something ... to be more clear what we're trying to do ...

But what if we want to add fields with the relationships ... how this could affect the results ...

And so on ...

flyke’s picture

I have a view of commerce_product entity types with a relation on commerce_product_variation.
The view shows rendered commerce_product entities as a field. The results are unique until I sort the view by a commerce product variation field. Then I get duplicate results.

When I add and apply the patch from #99 and check 'Distinct' and 'Disable automatic base field for relationships', the view no longer works and I get this error:

The website encountered an unexpected error. Please try again later.
TypeError: Drupal\views\Plugin\views\field\RenderedEntity::getEntityTranslation(): Argument #1 ($entity) must be of type Drupal\Core\Entity\EntityInterface, null given, called in /var/www/html/web/core/modules/views/src/Plugin/views/field/RenderedEntity.php on line 132 in Drupal\views\Plugin\views\field\RenderedEntity->getEntityTranslation() (line 69 of core/modules/views/src/Entity/Render/EntityTranslationRenderTrait.php).

That error is fixable by patch from https://www.drupal.org/project/drupal/issues/3007424
But now my view looks empty, the rendered entities dont render anymore. I still do have results and a pager, its just not rendering anything inside per result.
If I change my fields to just the commerce product ID instead of rendered entity, I see that I am still getting duplicate results. When I remove my commerce product attribute sorting field, then I dont have duplicate results. So in my specific case the problem is unfortunatly not resolved.

My views query that has no duplicates without the sorting added:

SELECT DISTINCT "commerce_product__field_price"."field_price_number" AS "commerce_product__field_price_field_price_number", "commerce_product_field_data"."product_id" AS "product_id"
FROM
{commerce_product_field_data} "commerce_product_field_data"
LEFT JOIN {commerce_product__variations} "commerce_product__variations" ON commerce_product_field_data.product_id = commerce_product__variations.entity_id AND commerce_product__variations.deleted = '0'
INNER JOIN {commerce_product_variation_field_data} "commerce_product_variation_field_data_commerce_product__variations" ON commerce_product__variations.variations_target_id = commerce_product_variation_field_data_commerce_product__variations.variation_id
LEFT JOIN {commerce_product__field_price} "commerce_product__field_price" ON commerce_product_field_data.product_id = commerce_product__field_price.entity_id AND commerce_product__field_price.deleted = '0'
WHERE "commerce_product_field_data"."status" = '1'
ORDER BY "commerce_product__field_price_field_price_number" ASC
LIMIT 9 OFFSET 0

My views query when I add the commerce product variation sorting:

SELECT DISTINCT "commerce_product__field_price"."field_price_number" AS "commerce_product__field_price_field_price_number", "commerce_product_variation_field_data_commerce_product__variations__commerce_product_variation__field_primary"."field_primary_value" AS "commerce_product_variation_field_data_commerce_product__vari", "commerce_product_field_data"."product_id" AS "product_id"
FROM
{commerce_product_field_data} "commerce_product_field_data"
LEFT JOIN {commerce_product__variations} "commerce_product__variations" ON commerce_product_field_data.product_id = commerce_product__variations.entity_id AND commerce_product__variations.deleted = '0'
INNER JOIN {commerce_product_variation_field_data} "commerce_product_variation_field_data_commerce_product__variations" ON commerce_product__variations.variations_target_id = commerce_product_variation_field_data_commerce_product__variations.variation_id
LEFT JOIN {commerce_product__field_price} "commerce_product__field_price" ON commerce_product_field_data.product_id = commerce_product__field_price.entity_id AND commerce_product__field_price.deleted = '0'
LEFT JOIN {commerce_product_variation__field_primary} "commerce_product_variation_field_data_commerce_product__variations__commerce_product_variation__field_primary" ON commerce_product_variation_field_data_commerce_product__variations.variation_id = commerce_product_variation_field_data_commerce_product__variations__commerce_product_variation__field_primary.entity_id AND commerce_product_variation_field_data_commerce_product__variations__commerce_product_variation__field_primary.deleted = '0'
WHERE "commerce_product_field_data"."status" = '1'
ORDER BY "commerce_product__field_price_field_price_number" ASC, "commerce_product_variation_field_data_commerce_product__vari" DESC
LIMIT 9 OFFSET 0

So I can see that despite having 'Disable automatic base field for relationships' checked, the field_primary from my commerce product variation is still added in the select cause.

coaston’s picture

Any progress?

pappis’s picture

Same problem here. Any proposed solution or patch that will remain when drupal gets updated?

hannakras’s picture

The patch (2993688-117) with "Distinct" and "Disable automatic base field for relationships" both checked works well for me to deduplicate items, but it breaks the view's autocomplete filters.

vasike’s picture

i'm still doubtful about this issue
i believe whenever relationships are involved ... "AGGREGATIONS" should be used - DISTINCT is not a way for Views generic SQL solution - imho.

related with my previous comment i noticed something that could improve things about DISTINCT .. in general ... so linked the issue, maybe someone is interested and could help there.

solideogloria’s picture

Couldn't there be a "client-side distinct" (performed in PHP) that only affects the results shown in the table? That would be different than DISTINCT in the SQL if the query itself contains extra fields behind the scenes.

azinck’s picture

@solideogloria Then you get into a situation where you have an inconsistent # of results per page (because some are culled by your PHP code) and other complexities that come out of that.

You also break pagination completely since you can't really assess the total possible # of results matched by your query (not just the current results, but the entire set matched by the COUNT query) without having PHP evaluate every single match which is both time-and-memory-prohibitive for queries returning large #'s of matches.

solideogloria’s picture

Then it could use a query like

SELECT DISTINCT col1, col2
FROM (
  SELECT col1, col2, col3, col4
  FROM table
  ...
)
c-logemann’s picture

I just successfully tested patch #117 with core 10.2 and it fixed a view with entity references when using a filter on multiple value field.

aharown07’s picture

#117 working for me on 10.3.2.

danny englander’s picture

I ran into the same issue with a reference of multiple taxonomy terms. #117 working for me with 10.2.5. Pager works fine as well as exposed filters. Thank you!

vincent_jo’s picture

Hi, it doesn‘t work for me on 10.1. (I can‘t update core because of an domain access issue).

My views-page has contextual filter: "Has tax term ID with depth" and "Has tax term ID with depth modifier".
Then I add a relationship "Taxonomy term on node". Just then I have the field "Tax-Image" available in the fields section.
Adding this getting duplicates!
Query-settings don‘t work. Actually hiding the image from display if any of the two options are activated.

I follow this thread since my first attempt with D8. Now I‘m still experiencing this issue on D10. It‘s crucial to get this fixed on my taxonomy driven job-board site migrated from D7.

regards

flyke’s picture

Typically when I try out a workaround for this issue, I create a custom 'Remove duplicate results' filter like this:

mymodule/mymodule.module:

function mymodule_views_data() {
  $data['views']['remove_duplicates'] = [
    'title' => t('Remove duplicate results'),
    'filter' => [
      'title' => t('Remove duplicate results'),
      'field' => 'id',
      'id' => 'remove_duplicates',
      'click sortable' => FALSE,
    ],
  ];

  return $data;
}

mymodule/src/Plugin/views/filter/RemoveDuplicatesFilter.php:

<?php

namespace Drupal\mymodule\Plugin\views\filter;

use Drupal\Core\Form\FormStateInterface;
use Drupal\views\Plugin\views\filter\StringFilter;

/**
 * Remove duplicates by adding id field and group by for that field.
 *
 * @ingroup views_filter_handlers
 *
 * @ViewsFilter("remove_duplicates")
 */
class RemoveDuplicatesFilter extends StringFilter {

  /**
   * {@inheritdoc}
   */
  protected function defineOptions() {
    $options = parent::defineOptions();

    return $options;
  }

  /**
   * {@inheritdoc}
   */
  public function buildOptionsForm(&$form, FormStateInterface $form_state) {
    parent::buildOptionsForm($form, $form_state);
  }

  /**
   * {@inheritdoc}
   */
  public function query() {

    // Set distinct to TRUE;
    $this->query->distinct = TRUE;
    $this->query->options['distinct'] = TRUE;

    /**
     * @var string contains the entity type the view uses.
     */
    $entity_type = $this->view->getBaseEntityType()->id();
    switch ($entity_type) {
      case 'node':
        $entity_id_label = 'nid';
        break;

      case 'media':
        $entity_id_label = 'id';
        break;

      case 'commerce_product':
        $entity_id_label = 'product_id';
        break;

      default:
        break;
    }

    // Add identifier field and group by it.
    if (!empty($entity_type) && !empty($entity_id_label)) {
      $this->query->addField($entity_type, $entity_id_label, '', ['function' => 'groupby']);
      $this->query->addGroupBy($entity_id_label);
    }
  }
}

You need to clear cache first after adding this code, and then add this filter to your view.
As you can see in the code, this only works for views for node, media and commerce_product entities, but it served its function on several projects here.

vincent_jo’s picture

Hi, thanks @flyke, I tried it out immediately. I‘m getting the filter in the filter criteria. It asks me to choose an operator.
Unfortunately none of my selections work to eliminate the duplicates.
Any further advice?

regards

... having high hope for this :)

flyke’s picture

StatusFileSize
new51.74 KB

Hi Vincent_Jo, just adding it should be enough, no need to set anything (leave default and just save). Maybe someone can improve my code to clear the settings for that filter. Maybe I should just return an empty array in defineOptions() and do absolutely nothing inside buildOptionsForm(), not sure, did not test that.

vincent_jo’s picture

Hi flyke, thanks for the response ... none of the operators worked. Maybe it‘s on the custom image field in this vocabulary? ... oh, just realized, when adding a custom description field it showed up in all four languages, so I get four duplicated items with each description field in another language. Even though I have a translation language filter (Content language selected for page) active. So this could be the issue here in my case.

well ....

flyke’s picture

Vincent_Jo: I listed the entity types that my workaround code works for, taxonomy terms / vocabularies is none of them. Working entity types for my demo code are 'node', 'media', 'commerce_product'. Of course you can add more types in the swich statement inside the query() function if you add the correct db field that contains the id. For everyone else: my apologies for polluting this issue queue. I do think that this custom filter can help others (only as a workaround) who come here looking for a solution until this issue gets resolved.

vincent_jo’s picture

@flyke, thank you very much anyway. Even it didn‘t help in my (strange) case, it surely will help others :)

best regards

to clarify: I have a list of nodes with related taxonomy terms (and need it also vice versa)

jsimonis’s picture

StatusFileSize
new97.48 KB

Any update on this? I've tried the patch. It said it patched correctly, but I am not seeing any change on my page. As you can see, I get one item for the image, another for the body.

mrupsidown’s picture

#117 applies for me on 10.2.2 but I see no change whatsoever to my view results. I have tried with "Distinct" and "Disable automatic base field for relationships" or any combination of the two. I still have the exact same duplicates showing.

My view is setup to show 18 items. I see one node 8 times, another node 3 times, 2 other nodes 2 times. In other words, only 3 nodes on my page are not duplicates! This is driving me nuts. The only thing I was able to achieve is by using hook_views_pre_render which allows me to remove duplicates but this will break the number of items displayed in the view. I see 7 items instead of 18, which corresponds to every non-duplicated item in my original view result.

My view displays nodes and has 2 required relationships with taxonomy terms referenced on the nodes (categories and brands). The categories can be multiple. The view also has a filter and a sort on these 2 entity references.

I have a second view for a different content type with the 2 same entity reference fields which works fine. The only difference is that the nodes shown in this view only have a single category and single brand.

mrupsidown’s picture

I finally got it working by altering the query this way :

/**
 * Implements hook_query_alter().
 */
function my_module_query_alter(Drupal\Core\Database\Query\AlterableInterface $query) {
  if ($query->hasTag('my_view_tag')) {
    // Add the nid field with a groupby function.
    $query->addField('node_field_data', 'nid', 'node_field_data_nid', ['function' => 'groupby']);
    // Group results by node ID.
    $query->groupBy('node_field_data.nid');
  }
}

I have added a specific tag in the View settings so I can identify the views I am after. Not sure whether this is the (temporary) way to go but it seems to work for my use case. I see no more duplicates and the number of results per page is correct.

bensti’s picture

@mrupsidown With this query_alter, i get a this error.

this is maybe related to https://drupal.stackexchange.com/questions/316086/how-do-i-disable-only-... ?

SQLSTATE[42000]: Syntax error or access violation: 1055 Expression #1 of SELECT list is not in GROUP BY clause and contains nonaggregated column 'databaseXXX.node_field_data.langcode' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by:

almador’s picture

For me, the combination of:
- custom module from #3 (thank you @Jorge Navarro);
- using the Format "Boostrap table" #6(thank you @david.qdoscc) with an added hidden field from (and sorting the table by this hidden field).
solved the problem with duplicates in View of content and paragraph as a relationship.

Also, I tried the patch from #117, both with Distinct on and off, but I can't find the difference.

I'm using Drupal 10.3.10, btw.

stolzenhain’s picture

The patch from #117 did not work for me either, as did the grouping tricks.

Since we needed COUNT sorting on aggregation I now resolve to using the Views Field View module and passing the output of multiple fields as an embedded field view, queried by node id and containing this field only. This works reasonably well but of course a solution to allow for the expected behaviour would be great.

Version: 11.x-dev » main

Drupal core is now using the main branch as the primary development branch. New developments and disruptive changes should now be targeted to the main branch.

Read more in the announcement.