This is an emanation from http://drupal.org/node/73081

In short, CCK fields with 'multiple' values currently result in duplicate nodes in a View (one per value of the field).

This patch corrects this behaviour : the (view) field for a multiple (cck) field now displays ONE list of ALL the values.

The patch is for Drupal 4.7, CCK 4.7, Views 4.7 - recent or latest versions are probably best.
You have to empty the cache after applying, in order to rebuild the Views tables data.

Note 1 :
This implementation is suboptimal. It costs one additional query for retrieved node - just like the "Taxonomy : All terms" field (meaning : it's probably not _that_ awful).
As Earl (merlinofchaos) stated in http://drupal.org/node/73081, the way CCK currently implements multiple fields does not play well with views.

In fact, I started by playing with an idea for this patch that could have integrated much better (no additional queries, everything bundled in the views-generated query), but it involved building a query that looks like
SELECT node.nid, GROUP_CONCAT(field_data.field.value ORDER BY field_data.delta) FROM node INNER JOIN field_data (...) GROUP BY node.nid
This worked quite well, but GROUP_CONCAT means MySQL 4.1+. No PostgreSQL, no MySQL 3 or 4.0.
This is a little heartbreaking, but for now, I dropped the idea. Any suggestion/workaround welcome :-)

Note 2 :
There are probably interesting (and complex ?) things to do with multiple cck fields and views arguments or filters. This is out of the scope of this patch, which only deals with views fields for now.

PS : Earl, if you read this, I'd love to have your opinion :-)

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

yched’s picture

In fact the missing GROUP_CONCAT aggregator could probably be added in postgreSQL (there are other places in core where "custom" postgreSQL functions are added)

Which leaves us with the problem of MySQL 3 and 4.0.

Anyone knows if there is a way to branch code depending on the version of mysql used ?

yched’s picture

Hell, in fact there is a way to branch code ... I found this in core (database.mysql.inc) :
if (version_compare(mysql_get_server_info(), '4.1.0', '>=')) {

Well, I guess I'll work on a new version of the patch, that uses the "optimal" method when possible, and the "suboptimal" for lower mysql versions...

What do you mean, talking to myself ?

yched’s picture

FileSize
3.66 KB

OK, I'm a jerk, my "group_concat" solution does not work after all (does not work when there are several multiple fields - I guess I was being naïve...)

So we are left with my original "1 additional query per returned node" solution, and for now I have nothing better to propose. I guess that's the trade-off for the flexibility that multiple fields allow, and anyway it's better than the current behaviour (return one instance of a node for each value of the multiple(s) field(s))

Here's a slightly refactored version of the patch - basically, it's the same as in my original post.

sym’s picture

+1

This patched applied fine and worked as expected on mysql 5.0.22/php 5.1.4. I've not tested on mysql 4.1 yet.

KarenS’s picture

@yched, can you re-roll this patch against latest version? I can't get it to apply so I can test it.

Thanks!

Karen

yched’s picture

Karen : Sorry, I can't do that, I'm currently away from my coding environment (vacation :-)
Won't be back before two weeks.

You shouldn't have too much trouble applying it manualy, though.

moshe weitzman’s picture

Status: Needs review » Needs work

subscribing ... i have needed this in the past, and need it in og. i will try to get to it.

sym’s picture

I've just got it to apply on the latest 4.7 CCK modules and it worked fine.

This is good to go IMHO

becw’s picture

this is great; all of the multiple values for a node show up with the node. However, the nodes themselves are still listed multiple times.
(mysql 4.1, php 4.4)

yched’s picture

@bec : strange - they shouldn't.
I'm currently away from my coding environment so I can't test further ATM, but remember for sure that on my test views, nodes didn't show up multiple times...

Can anyone confirm the behaviour described by bec ?

sym’s picture

No, with a list view this works fine and only copy of each node is returned. This is with the 2nd patch. The first patch resulted in x copies returned.

yched’s picture

bec, could you be more specific about the circumstances under which you see node duplication :
- decribe your content type
- "export" your view and paste it here

Plus please make sure that the node duplication is not triggered by something else in the view (taxonomy maybe ?...). Remove everything in your view that's not related to the multiple cck field, and see if that works...

treylane’s picture

The patch didn't work for me either. The multiple field is working better now, but the node is still listed multiple times. I'm using a custom template based on the code that the views theme wizard spits out. I added an ugly hack to template.php that filters out the duplicate nodes, but this will most likely mess up paging. There's no problem with taxonomy/etc, since the node doesn't list multiple times when there's only one value in the multiple field.

yched’s picture

FileSize
3.66 KB

Here's a re-rolled version against current 4.7

I can't reproduce the node duplication thing with my own test views / content types.
Bec or treylane, could you provide the additional info I mentioned earlier :
- describe your content type (a "content type" import / export feature would be useful here - well, later...)
- "export" your view and paste the result here

Please also make sure to empty your drupal cache and re-save your view.

It seems there's a number of people interested in this, so I'd really like to find the bug if there is one...

yched’s picture

Status: Needs work » Needs review

As a matter of fact, the more I think of it, the more I'm convinced the behaviour you describe is caused by outdated Views tables in cache.
As I said earlier, empty your cache (re-submit the "admin/modules" page, or use Devel module) and re-save your view - and please report here if the problem gets fixed or not :-)

treylane’s picture

You are correct! Problem went away, thanks! :)

yched’s picture

Status: Needs review » Reviewed & tested by the community

OK, then this might be RTBC...

sym’s picture

Testing again on 4.7.4, latest views and cck.

It works fine, but I need to add "Node: Distinct" as a filter or I get x nodes the same, depending on how many options I've selected. This isn't a problem for me, but I guess on a large site it could slow things down quite a bit?

otherwise +1

yched’s picture

Sorry to ask again, but :
Are you sure it's the multiple CCK field that triggers node duplication ? (I don't really understand your "depending on how many options I've selected" part).
If you remove this field in your view, do you still get node duplication ?
Last : Did you try emptying the drupal cache and re-saving your view ?

With this patch, a multiple cck field does not add anything to the view-generated query (the 'multiple' values for the field are retrieved separately, in a subsequent query in the field handler function).
So there is no reason why it would trigger node duplication.

sym’s picture

>Are you sure it's the multiple CCK field that triggers node duplication ? (I don't really understand your "depending on how many options I've selected" part).

Ok, so I have a content type with 1 'multiple field' (a field with multiple values allowed) and for all nodes with 3 values in that field I get 3 nodes in the view, all that have 2 I get 2 nodes in the view, etc. I don't know what is causing it, but I though it was worth reporting.

>If you remove this field in your view, do you still get node duplication ?

I've not tested this, because I can fix it by adding "Node: Distinct"

>Last : Did you try emptying the drupal cache and re-saving your view ?

Yes, before I did anything else (applied patch, saw problem, clear cache, problem still there.)

Having said that, I have not removed the "Node: Distinct" filter from the view, cleared the cache and I don't have the problem any more. Maybe it's nothing to do with your patch at all.

yched’s picture

Status: Reviewed & tested by the community » Needs work

Er, your answer is confusing me...

Let's sum up.
This patch should _not_ require the "node:distinct" filter to avoid node duplicates.
Previous reports about duplicates were in fact caused by an outdated cache.
Before starting a bug chase, I need to know this not also the case for you, and your previous post doesn't allow me to make that out for sure.

Could you please try the following steps, in that order :
- (apply the last patch - that you obviously already did)
- empty drupal cache
- re-save the view (remove the "node:distinct" filter if any, but re-save the view anyway)
- see if the nodes still get duplicated

If yes, then we do have a problem, and in order to debug this I need you to :
- "export" your view and paste the result in this thread
- provide a 'textual' description of your cck content type (sorry, cck has no 'export' feature at the moment...)
- please confirm : you're using mysql 5.0.22/php 5.1.4, right ?

Thanks for your cooperation :-)
I'd really like to fix any possible bug, so that we can commit this - it seems some people are interested in the feature.
I don't reproduce the behaviour you're describing, so I'm depending on your precise answers here.

yched’s picture

Status: Needs work » Needs review
FileSize
5.77 KB

OK, I reworked this patch slightly.
The previous version was not satisfying in that it required the views tables to be rebuilt each time a field got its 'multiple values' status modified. Current patch keeps views tables constant, by deferring the differentiation between multiple and 'regular' fields in a field_query_handler function.

BUT when you change the 'multiple' status of a field, you still need to recompute the stored queries for the views that use this (cck-)field as a (views-)field.
This patch provides a _content_views_rebuild_views function to do that, and calls it when a field settings change (I'm not fond of this views-related part getting in content_admin.inc, but for now I could not find a cleaner way - suggestions welcome)

This patch also drops the 'list item' (<li>) theming, the bullets seemed to sort of get in the way. It uses the same display as 'regular node view' (multiple values in div's) by default, but lets the user theme this easily.

Notes :
- empty your cache and re-save your views after applying - when committing this, I'll add an update function to take care of that automatically.
- after some research (and sym not answering...), I think the "nodes get listed several times" issue in #18/#20 is not related to this patch, but rather to multiple fields being used as filters (or possibly sorts) - I'm working on this separately.
With this patch, Views with no filters / no sorts should NOT require the "Node: Distinct" filter.

yched’s picture

Component: content.module » Views Integration
KarenS’s picture

Thanks yched for all this work. This looks like a rational approach and I'm going to try this out. My only other issue is that there might be times when someone wants the previous behavior -- a table line for each value. Is there some way we can make this optional?? We're already using the option field in the view, so that won't work. I'll see I think of anything else while I try it out...

KarenS’s picture

One thing that occurs to me is to rework this as a 'Field: Distinct' filter that has the option to make each field distinct or not. That is consistent with the way that Views handles the Node: Distinct filter, i.e. Views doesn't assume you always want the node to be distinct but lets you choose. I think that we could do essentially what is done in this patch with a Field: Distinct filter instead of a query handler without many changes to this code. Sorry yched for throwing a monkey wrench into this after all your work, but I am pretty sure it wouldn't take a lot of changes and it would add a lot of flexibility. I'll take a stab later at rewriting the patch to work that way if you don't get to it first. Or jump in and say so if you think that's a bad idea.

KarenS’s picture

Hmm... of course the downside of doing this as a Field: Distinct filter is that you can't differentiate between fields -- all with either be distinct or not. Still, one way or another I'd like to find a way to make this behavior optional...

KarenS’s picture

my typing is not so good, meant to say 'all fields will either be distinct or not'.

yched’s picture

Karen : both your last posts make valid points IMO
The "filter : all fields" is a good idea, I did not thought of that.
But yes, it has the drawback of being "one setting for all fields".
And, true, I can't think of a place where we could put a "group multiple values" option on the (view-)field.
(sort of related : see also my patch for a "is empty / is not empty" filter, and my proposal for "where to put it ?" : http://drupal.org/node/101050)

Maybe we could move this setting into a "Views settings" section on the (cck-)field settings page ? Not really ideal, but here at least we can stick all the options/params we like...

yched’s picture

Er, this might actually be much more simple :

Currently the 'handler' column in the views-field table is empty for cck fields, because we define a single handler.
We could simply define two handlers : 'Group multiple values / Do not group multiple values', and the user picks one for each of the cck fields in his views...
Only thing is this would be displayed even for non-multiple fields, because we do not want to have to rebuild the views table when a field 'multiple' status is changed.

KarenS’s picture

OK, I like the sound of that solution. Do you have time to work that into your patch?

yched’s picture

New patch, does as explained in #29.

The patch itself was not really hard to derive from my previous patch, but the feature adds "cryptics" to the code, and much complexity to testing it : what if the field is sortable ? what if the view has a sort / a filter on this field ? what if "Node : distinct" is used ?
Plus some peculiarities, making it hard for users to figure how their view will look, and which settings they should apply :
when "group" + make field sortable on table view, then "node: distinct" is required to avoid row duplication.
but if you choose "ungroup", "node: distinct" will filter out "redundant values" for the field
(e.g if a number field has the values 1 2 1, it will only show 1 2)

Well, this IS complex, and there's only so much we can do - multiple values are _not_ SQL friendly...
I'm quite sure we want the ability to group multiple values in a row.
Are we sure we want the ability to keep the current behavior ("ungroup") ?

yched’s picture

OK, I can probably answer my last question :
Keeping multiple values separate can be useful when you want to provide a list of all the values in a given field.
So I guess we want both options
So the patch in #31 should be more or less what we want...

KarenS’s picture

Testing this now. One quick question. When using this in 4.7 the function views_load_cache() does not exist. Do we need views_invalidate_cache() in 4.7 or nothing at all?

KarenS’s picture

Status: Needs review » Reviewed & tested by the community

I tested this in 4.7 and this seems to work fine. Thanks yched! This is very nice. It seems it is easy to convert to 4.7, just change one 'module_exists' to 'module_exist' and whatever needs to be done with the cache function. If I hear back from you I'll make that change and commit this, or you can go ahead and commit it yourself.

yched’s picture

Oops - I forgot about module_exists / module_exist, and was not aware that views_cache.inc was 5.0 only...
I developped and tested in 5.0, and copied the relevant parts in 4.7 code to generate the patch. Not very serious I guess :-(.

_views_load_cache() is just here to include views_cache.inc, that contains _views_get_tables in Views 5.0
For 4.7, _views_load_cache() can go away.

I'll commit this (carefully) when I have some more time - thanks for the review and ideas !

KarenS’s picture

Status: Reviewed & tested by the community » Fixed

This has been committed. Thanks yched, nice solution!

Anonymous’s picture

Status: Fixed » Closed (fixed)
doublejosh’s picture

Fixed with just view Theme wizard. Chances are you generated the view before you added on the CCK field with multiple input capabilities.

Just go to Admin > Views > Theme wizard and recreate the view entry WITH THE PROPER GROUPING CONFIG. You're new field that's causing the problems will be listed in the drop down so that you may group it in the query appropriately.

FYI: Drupal 5, PHP 4, mySQL 4