Problem/Motivation

Reported problem: fatal PDO error if you try to filter by field language in Node view

If you attempt to use a field language filter on a Node view, such as "Content Body:language" in Views, it doesn't work. You get a PDO error like this from the query:

SQLSTATE[42S22]: Column not found: 1054 Unknown column 'node__body.language' in 'where clause': SELECT COUNT(*) AS expression FROM (SELECT 1 AS expression FROM {node} node INNER JOIN {node_field_data} node_field_data ON node.nid = node_field_data.nid LEFT JOIN {node__body} node__body ON node.nid = node__body.entity_id AND node__body.deleted = :views_join_condition_0 WHERE (( (node_field_data.status = :db_condition_placeholder_0) AND (node_field_data.type IN (:db_condition_placeholder_1)) AND (node__body.language IN (:db_condition_placeholder_2)) ))) subquery; Array ( [:db_condition_placeholder_0] => 1 [:db_condition_placeholder_1] => article [:db_condition_placeholder_2] => fr [:views_join_condition_0] => 0 )

The problem is that the field data tables have a column called "langcode" not "language". This is set up in field_views_field_default_views_data().

This problem would also occur if you use the Field, Sort, or Argument on the field language of any field. It is not specific to the Body field.

Deeper problem: Field integration in Views in general

See meta-issue #2313159: [meta] Make multilingual views work for a summary of the deeper problem, and the plan of how to fix Views in general for multilingual, and further discussion.

Proposed resolution

On the meta-issue, we identified that the right solution to this problem is just to remove the broken sort/filter/argument handling for field-level languages. Rows in entity views should have a language associated with them, and we should filter/sort based on language at the entity/row level, not at the field level.

There are separate issues (see meta-issue) for fixing two Core entities that currently don't have a data table that would facilitate filtering by language.

Remaining tasks

Patch in #15 removes the language filter, sort, and argument from fields.

User interface changes

If we remove support for field language, these options will not be available when making Views. However, they weren't really available before because if you tried to use them, you got a PDO error.

There would also be different language options, like a setting on the display for "Display language" and the "Field display language" setting would be removed.

If we're not able to fix User and Custom Block entities, they would become untranslatable (at least for Views).

API changes

Entities would not be allowed to be marked as translatable without having a data table, at least for Views purposes.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

jhodgdon’s picture

Issue summary: View changes
Status: Active » Needs review
FileSize
2.23 KB
42.93 KB

Here is a patch that fixes the language field, argument, filter, and sort for Field API field languages. At least, they all result in valid queries and work the way the field API seems to have intended that they work.

However... I think having a filter here is ... very weird in the first place, and pretty much never going to lead to the results anyone would want or expect.

Here's what happens.

I create a site with 1 article in English, translated into Spanish and French.

I create a View of articles. Initially, I get 3 rows of output for the 3 translations, since the only filters are article content type and published. So far, so ... at least what I now expect.

But if I add a filter on "Content Body:language" and check all 3 languages, I now get 9 rows of output. There is one row for each "body: language" for each translation. Here's a screen shot -- my view is showing Fields, with Title and Content Body: language being shown.
nine rows of views output, ugh

(Adding to that problem, if I actually display the body, it always comes out in English, even when the body: language field says it is in French or Spanish -- I didn't put that in my screen shot because it's presumably related to #2217569: Fields row plugin: Translation is non-uniform for base fields, Field UI fields, links; no way to choose "this row's language" and not this issue specifically. I'll add a comment there regarding filtering.)

Anyway, here's a patch that fixes the PDO problems with the field, filter, argument, and sort on field:language.

jhodgdon’s picture

Regarding the 9-row result... In reality, the field data tables are correlated with particular rows in node_field_data, right? So when we join the field data tables, we should be joining on something like node__body.langcode = node_field_data.langcode, shouldn't we? With the patch in #1, the query I'm getting is:

SELECT node_field_data.title AS node_field_data_title, node.nid AS nid, node__body.langcode AS node__body_langcode, node_field_data.created AS node_field_data_created
FROM 
{node} node
INNER JOIN {node_field_data} node_field_data ON node.nid = node_field_data.nid
LEFT JOIN {node__body} node__body ON node.nid = node__body.entity_id AND node__body.deleted = '0'
WHERE (( (node_field_data.status = '1') AND (node_field_data.type IN  ('article')) AND (node__body.langcode IN  ('fr', 'en', 'es')) ))
ORDER BY node_field_data_created DESC, node__body_langcode ASC
LIMIT 10 OFFSET 0

so it's just joining to the {node} table on the nid field, and ignoring language.

jhodgdon’s picture

And as another note, the "9 rows" problem also occurs if you filter on "Content: Body" using a "Contains" filter. All of my translations have the word "Paris" in the body, and if I filter on Body Contains "Paris", I get 9 rows of output. Title contains works fine, because this is directly in the node_field_data column.

So I'm fairly convinced that all field table stuff should join on node_field_data, with a match on the .langcode columns. Should this be a separate issue?

jhodgdon’s picture

rpsu’s picture

Status: Needs review » Reviewed & tested by the community

Tested and patch in #1 fixes the error in views. However #2218065: Need to join fields to the entity field data tables, not entity tables, or filtering increases number of results needs also attention before translated nodes can be used in views.

alexpott’s picture

Status: Reviewed & tested by the community » Needs work
Issue tags: +Needs tests

If we're getting a database error it looks like we have missing test coverage here

jhodgdon’s picture

Priority: Normal » Critical

This is also probably a critical issue, or at least major, if something in the UI gives you a db error?

Gábor Hojtsy’s picture

Issue tags: +sprint

Woah that indeed is bad, needs tests indeed :(

jhodgdon’s picture

I hadn't looked at this issue for a while...

So on #2217943: Views cannot be filtered by language of translation we added a (completely missing) filter on Node Translation Language, which lets you filter a list of Nodes to particular translation languages. If you're not using this filter, views of Nodes are really views of Node Translations, so if you have 3 nodes each translated into 2 languages, you will get 6 rows in your view.

Obviously, each individual field (like Body) value is associated with one translation (or in some cases, if the field doesn't get translated, one field value could be associated with several translations of the same node). So logically, you'd think that if you were looking at a view row in Spanish, it would show you the Spanish fields, because that is what you'd see if you were viewing/editing that particular translation. That doesn't currently happen, due to #2217569: Fields row plugin: Translation is non-uniform for base fields, Field UI fields, links; no way to choose "this row's language"... but if you show the Teaser or some other Content view mode, then you do get content in the right language.

So... What I don't understand is what it would mean, if you then added a filter to the view like Body Field: Language. What would you expect to see? Which rows would be shown, as compared to the 6 rows that you have originally, and why would you want to do that rather than filtering the view by the language of the translation as a whole?

I really think that field language filters do not make sense, and rather than trying to fix them (as this patch does, or at least did) and make a test that tests for some reasonable behavior of the filters (I cannot even define what that would be), what we should really do is remove these filters. They are at best confusing and kind of worthless, in my opinion.

Thoughts?

Gábor Hojtsy’s picture

Yeah I am not sure how field language filtering ended up in views. At least in Drupal 7 there was no entity level language negotiation and instead it happened on the field level. That may be the reason. In Drupal 8 the negotiation is on the entity level, which views have a setting now to specify, so you can say to negotiate it specific languages or use the page content language, etc.

So for fields, the only use case I can imagine is if you want to somehow filter for specific fields, eg. create a list of nodes that have a body field not translated to French (even if others may be). This is very limited a use case, pretty contrived. It may be useful if you add a field and then start to translate it in entities, and want to have a view of what is left... Something like that. Probably more confusing then it is worth?! I'll ping @plach to provide short feedback if he can, but it indeed looks like we want to remove this feature.

jhodgdon’s picture

Yeah. So, if the Body field is designated as translatable, then if you make a French translation of the node, the Body field value will be marked as French. If the Body field is not translatable, then the Body values are shared across translations, and they'll all have the source language of the node.

The person making the view for a site should know how the site's fields are configured, right?

In the first case, the "body is French" filter is the same as "translation is French", which (at least after #2217943: Views cannot be filtered by language of translation exists on the node, so it's useless to have it on every field as well.

In the second case, the "body is French" filter is the same as "original is French", and that also exists on the node, so again it's useless to have it on every field as well.

So, it's a useless duplicate in both cases, and confusing as well. I say we should get rid of it.

As far as why we have it, the Views integration of Field module seems to have taken the tactic of "Expose all pieces of information from each field". This is a great thing for things like Image fields, so that you can use all of the data associated with the image (Alt, Title, etc.) in Views -- this was spotty for fields in 7.x.

But it's not so great for this particular field data -- languages are special.

catch’s picture

Let's remove the feature. If there's a use case we can always put it back, working and tested, later on.

Gábor Hojtsy’s picture

Not sure I fully understand what plach means but:

[3:44pm] plach__: GaborHojtsy: in D7 field language is used mostly to dedupe results when filtering or sorting on translatable fields
[3:44pm] plach__: I tihnk that is still needed
[3:44pm] plach__: the field language setting in the advanced tab I never used
[3:44pm] plach__: not sure how it works
jhodgdon’s picture

D7 is quite different in structure of translations and fields from D8 -- for one thing, each translation is a separate node. And in a D7 database for a multi-lingual site that I manage, which has quite a few translated nodes in it, I could not find any field whose 'language' column in its data table had anything but 'und' in it. The language is apparently being stored on the node/revision, not on the fields in D7. Completely different structure to the database.

The only filter I found in Views that was related to language was "Content: language". Fields do not have language filters in 7, which is especially appropriate there since fields do not apparently have languages to themselves anyway.

Also, the Field Language setting in the "Advanced" section of 7 referenced by plach is about the display of fields in output, not about filtering the view. In 8, this also exists, and actually there is a separate issue about the display of fields in #2217569: Fields row plugin: Translation is non-uniform for base fields, Field UI fields, links; no way to choose "this row's language" which since I just noticed this setting and maybe that issue is wrong, I will need to test.

So.. yes, let's get rid of the field:language filters.

As a note... We also have field:language available in Fields, Sort, and Contextual Filters. Does it make sense there either? I don't think so... Maybe in Fields, but again (see comment #11), the field language is either the translation language or the original language, depending on site configuration, so I do not think it makes sense to have a separate Field available in Views for each field's language either.

jhodgdon’s picture

Status: Needs work » Needs review
FileSize
2.05 KB

Here's a patch that completely removes field language in all its variations. Let's see if it passes tests... I didn't check to see if it was being used in any tests etc. (I suspect not, since it was broken, see original issue report here.)

Gábor Hojtsy’s picture

@jhodgdon: plach was referring to entity_translation used with this views feature (not core translation), but I am not clear if this feature even exists in D7 and what it was used for :)

jhodgdon’s picture

Title: Field language filters for Views have wrong db column name » Field language filter/sort/etc. for Views do not work and are not needed
Issue summary: View changes

Guess I should change the title and update the summary.

plach’s picture

I think we might be removing a feature that is still needed. To tell if I am right or wrong we can do a simple experiment: let's setup a view and sort on a translatable configurable field, for instance a "weight" field that may vary by language. In D7 in such situation the join on the field table causes duplicates to appear in the result set. To get rid of the duplicates, in D7 we have to setup a (current) language filter for the field we are sorting on. This ensures (at most) one result for each node is displayed.

If we can't achieve the same in D8 without field language filters, we still need this stuff.

jhodgdon’s picture

Good idea.

But let's wait to do this experiment until fields and languages really actually work. There is a bug blocking this:
#2218065: Need to join fields to the entity field data tables, not entity tables, or filtering increases number of results
I do not know how to fix that issue, but it has a patch for a test that fails. So... At the moment, *all* field-related filtering is broken in the presence of translations, such as filtering on the main Body text value with a "contains" filter.

Berdir’s picture

I think that is exactly what @plach meant, this option exists exactly to prevent that problem, but it's possible that it's broken and not surprsingly for title, because this stuff has been written and very likely not touched since base fields can be translatable.

jhodgdon’s picture

You cannot fix the problem this way, because you cannot filter field language to be "the language this row is in" or "the appropriate field that goes with the translation that is selected in this row". So you would be looking at a French translation row, and you could filter based on the Spanish node body? ?!? That makes zero sense.

Gábor Hojtsy’s picture

@jhodgdon: So D7 language negotiation happened on the field level. Also in D7 fields may be shared between entities I believe. So you could theoretically make a view all bodies from all kinds of entities. In Drupal 8 fields are not shareable and negotiation happens on the entity level. Not all entities are translatable in themselves (eg. user and custom block will not be because they do not have sensible base fields to translate). However all of them may have translatable fields. So not all of them may have all the languages available yet on the entity DB level. Which is maybe the crux of this problem?

Gábor Hojtsy’s picture

In other words, some entities have base field translations and therefore data tables which may be filtered by language. Others don't have base fields translatable (files, custom blocks and users don't make sense to have them based on their data and taxonomy is in the works). Those cannot be filtered for translations on the entity level, only on the field level. So I guess a field level filter would be useful / needed for when an entity level filter is not possible. Ideally there could be a filter that may be on the entity level if possible or if not the field level. But I think that may not be possible with views. That means views builder users would need to understand the particularities of different entity data models to create the proper views? And also that both the entity level and field level translation language filters make sense but not at the same time?

plach’s picture

A possibility could be adding an ON clause to match the field language column with the entity data table language column when joining the field data table.

Gábor Hojtsy’s picture

But for entities which have no data table?

jhodgdon’s picture

What a mess!

So... Let me step back and see if I have this right... and I think we need a meta issue for this maybe, but anyway...

For Nodes (and other entities that work the same way?), a view of nodes joins automatically on the node_field_data table, making it become essentially a view whose rows are node translations [row == (nid/langcode, and pick out the latest revision)]. The way translations work, each translation is associated with a set of (mostly translated) field values, as well as the node_field_data stuff like title. So logically, when we use a field in a view (for display, filtering, sorting, etc.), we should join the field table to node_field_data on nid/langcode (but we aren't currently doing that, leading to #2218065: Need to join fields to the entity field data tables, not entity tables, or filtering increases number of results). And when we display the view, whether we're displaying stuff like title from node_field_data, or displaying a view mode such as a teaser, or displaying field values, we should be displaying in the row's language (which we're not currently doing for fields, which is #2217569: Fields row plugin: Translation is non-uniform for base fields, Field UI fields, links; no way to choose "this row's language"). And for nodes, we do not need or want the field-level language filters (which is this issue), or the global field display language setting.

Then for User accounts (and other entities that work the same way), translations are done at the field level and there is no equivalent of the node_field_data table, and a view of Users doesn't have any definite language associated with each row, and one row == one user. Then if you use a field in a view (for display, filtering, sorting, etc.), the only table we can join on is the base users table, so if the field has translations, each join is going to multiply the number of rows in the view. So, for users, we would need to filter by language for each translated field that is being used in the view in any way, to get back to the desired number of rows (unless we actually did want to show all the translations as separate rows the view for some reason). Then when we get to display... It seems like logically, if we are doing a field-level display, then each field in the query output for a given row does have a language on it (either because we filtered by language, or because we're displaying all languages as separate rows on purpose), so we should display that field in that row with that row's that-field language. We are definitely not doing that right now -- we are instead using that global field display language setting, which I don't think we want or need for this use case either. Then if we are using a view-mode-based display for the rows instead of field-based display, we would need to figure out how to choose the languge for each row... and I think in this case, it should probably be a setting on the display, shouldn't? So when you choose to display "Teasers" or whatever the view mode is called, you would select what language to display it in? (I don't think field-level language filters could apply to a display mode, because they are on each field, and there is no guarantee that you have exactly one in a view -- you could have zero, or multiple of these filters and they do not have to have the same values).

So... If I've gotten this right, it seems like what we would ideally want is this (on this or other issues):

a) For entities that behave like nodes, we need:
- Join field tables to the equivalent of node_field_data, not the base entity table, so you'd automatically be filtering to the row's language.
- Display fields, properties like title, and entity view modes based on the row's language.
- Do not offer field-level language filtering.

b) For entities that behave like users, we need:
- Join field tables to the base entity table like we are now doing. Make sure that the view has access to the language database table field in the field data table.
- Users will need to filter each field they are using (as Field, Filter, Sort, or Argument) by language to avoid multiplication of the number of rows in the output. So we need field-level language filters (patch #1 in this issue, plus tests).
- Display fields based on the language field from the field data table for that row of the view.
- Display entity view modes based on a language that the user configures in the display.

c) Are there any entities that don't fall into one of these two categories?

So... Does this make sense? Is it possible to accomplish -- can we maybe put something into the hook_views_data for the individual entities, or the entity annotation, that clues the Field module's Views integration in so it can divide entities up into these two types? Do we need a separate meta-issue for this stuff?

plach’s picture

I tried to re-organize my thoughts around this problem space, based on my work on the entity storage and my previous experience with D7.

The analysis Jennifer presents in #26 is correct and properly describes the current (poor) status. I would like to add a couple of notes just to clarify my following thoughts: views listing entities always load an entity object for each result row, no matter which row plugin has been configured (full entity + view mode vs single fields). When working with a field row plugin, the configured fields do not affect the columns in the query SELECT clause, only filter and sort handlers affect the query (i.e. imply JOINs). This means that only those are affected by the language conditions we apply to the query, in fact when rendering the entity row we have all translations available (due to the entity load) and we just need to choose which is the display language, based on some logic configured in the view. For instance we can select only nodes having a French translation but then displaying them in English (+ fallback), if for any reason we wish to (at least, this is possible at API level).

As Jennifer correctly points out, the notion of entity translation is represented at storage level by a row in the data table, thus filtering on the translation language actually means adding a language condition on the data table. In D7 the same role is played by the {entity_translation} table, which is provided by (surprise!) the Entity Translation module. In D8 the corresponding table is the {content_translation} table, which is going to be removed in #1916790: Convert translation metadata into regular entity fields, as its data will be stored in the entity data table, like any other regular base field (after #1498720: [meta] Make the entity storage system handle changes in the entity and field schema definitions lands).

This actually means that we cannot (and don't want to) support two storage flavors for translatable entity types: a translatable entity type leveraging core entity storage needs a data table, no point in trying to get around that. The fact that we still have hybryds like the user entity is only caused by the fact that all the entity storage is still work in progress. What we should do for those is either converting them to multilingual (if we can achieve that before beta), or marking them as untranslatable by default and switching them to translatable through a contrib (or custom) module, by altering the entity type definition. This will trigger the creation of the data table. Personally I'd prefer the first option and so I posted an initial patch in #1498664: Refactor user entity properties to multilingual.

To sum up, I think we should support only option #26.A and possibly extend the entity renderers mechanism to the field row plugin, if this makes any sense :)

plach’s picture

Btw, only Custom block and User are marked as translatable but have no data table, so it should be feasible to have them converted before beta.

jhodgdon’s picture

Issue summary: View changes

#27 - Great, it sounds like we are thinking the same way. I think having entities that "work" like #26 (b) is crazy.

I've updated the issue summary. Take a look?

jhodgdon’s picture

Issue summary: View changes

Should have previewed. Cosmetic changes to issue summary... Oh, there's no preview, no wonder I didn't preview. :)

Gábor Hojtsy’s picture

It would really suck if you could not translate fields on entities like user and custom block just because their base schema was not as elaborate as the other entities in core. I do not subscribe to the idea that you would need to install a contrib module to do that :/ Much rather that we find a way in views core if need be. If the schemas are aligned to the same setup all the better though.

plach’s picture

I think we should really aim for aligning the table layouts of users and custom blocks, since they are the missing two. Those have rather straightforward table layouts, so with the entity schema generation already in place the two task should not be hard at all.

That said, I think we will definitely need a contrib Entity Storage module allowing to dynamically switch between table layouts when data is already existing, so this would be nothing new.

plach’s picture

Definitely needs work :)

jhodgdon’s picture

I've created a meta-issue that covers the over-arching concerns:
#2313159: [meta] Make multilingual views work

This issue can now go back to the simple "remove the non-working filters".

Status: Needs work » Needs review
Gábor Hojtsy’s picture

Needs issue summary update then. Sent the patch for a retest.

jhodgdon’s picture

Issue summary: View changes
Issue tags: -Needs tests, -Needs issue summary update

Yeah. Looks like the patch in #15 still applies and I think it's all we need for this issue. Updated summary.

plach’s picture

Yep, it's ok: I thought we would start implementing alternative approaches here.

jhodgdon’s picture

Let's keep this issue just about deleting those filters. We have other issues (see meta) for the other parts of the issue. If we keep them all small, we have I think a better chance of getting them done? :)

Gábor Hojtsy’s picture

Status: Needs review » Reviewed & tested by the community

Looks like we are in agreement this should go away.

catch’s picture

Status: Reviewed & tested by the community » Fixed

Yes agreed on removing the dead/broken code, we can add back working code as part of the meta, but no reason to keep fatal errors sitting around in the meantime.

Committed/pushed to 8.x, thanks!

  • catch committed 7420742 on 8.0.x
    Issue #2218025 by jhodgdon: Fixed Field language filter/sort/etc. for...
Gábor Hojtsy’s picture

Issue tags: -sprint

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.