Problem/Motivation
What are the steps required to reproduce the bug?
- Add a text (short) field to a node allowing multiple values
- Create a new node (of the type edited in the previous step)
- On the node edit form click on "Add another item" of the created field multiple times. Don't fill in any value!
- Save the node
- Edit the node again
What behavior were you expecting?
We would expect to only see one empty field item, ready to be filled.
What happened instead
As you can see the empty field items are still there.
Oh noo! How to get rid of these empty field items ever?
Proposed resolution
After saving a node / entity the empty field items should be removed.
This is the same behaviour as in D7.
The following quick fix shows how it should suppose to work in the above case.
Replace the line:
if (isset($value) && !isset($this->properties[$name])) {
with:
if (!empty($value) && !isset($this->properties[$name])) {
in the function isEmpty() at:
/core/lib/Drupal/Core/TypedData/Plugin/DataType/Map.php
Remaining tasks
- Confirm core bug
- Rethink the way we check if a field item is empty, because currently if we check if a field item is empty ($item->isEmpty()) we can't rely on the output. Are there more side effects elsewhere?
- Create patch
- Create tests
- Review patch
- Run tests
Comments
Comment #1
jmuzz commentedYes string fields should override isEmpty. Most of the other field types do. The isEmpty() from email fields should work here.
Comment #3
swentel commentedShould be fixed by #1988492: Avoid having empty field items by default
Comment #4
swentel commentedTalked to Amateescu, we should probably fix this in StringItemBase
Comment #5
swentel commentedWrong patch
Comment #8
swentel commentedWhut ?
Comment #11
swentel commentedWow, let's see what it does on StringItem
Comment #13
swentel commentedComment #15
swentel commentedComment #17
swentel commentedFago suggested filling in Anonymous at install ..
Comment #19
yched commentedI think SrtingItem::isEmpty() not considering an item with value '' (empty string) to be an empty item, is definitely a bug, and a major one IMO.
As the symptom in the initial bug report shows, this goes against what any widget for the "string" field type would do : a form element without user input, but not considered empty ?
Imagine the field is required :
- leaving the input empty would fail Form API #required validation
- but $entity->validate() would accept empty string as a value, because the the field has an item (with value empty string) ?
WTF :-)
Comment #20
yched commentedIf we fill in a value for anon user name, then we can make the user.name field actually required ?
Comment #21
fagoYes, that would make a ton of sense - wouldn't it?
Comment #22
swentel commentedSo, this means we can remove the hack in UserStorageSchema right ?
Comment #23
yched commentedWell, even if the field is required, SqlContentEntityStorageSchema::getSharedTableFieldSchema() would still not mark the column as "not null = TRUE", because of that line :
:-)
As we said yesterday, now that User 'name' field is actually required, we can try a quick check see if we still have cases of entity types with a non-required label, but until then the 'name' column needs that hack in UserStorageSchema to keep its "not null = TRUE" :-/
Comment #25
swentel commentedHmm yeah, interesting :/
A lot of failures are now due to the title of a node being NULL ...
IIRC, this is now because filterEmptyItems() now also gets into the StringItem and calls isEmpty() right ?
Comment #26
swentel commentedFirst batch of changes
Comment #28
swentel commentedLet's see if the bringing back the defaultValue on Node triggers less fails
Comment #29
swentel commentedComment #31
yched commentedMarked #2478077: Markup (and label) of empty fields get rendered as a duplicate.
Also, better title.
Comment #32
plachBetter better title :)
Comment #33
jhedstromNot only are empty fields still there after edit, but each revision adds a new empty field item to the storage table.
Comment #34
leksat commentedHere is a "workaround" patch. Just to let existing D8 websites work normally.
Some notes:
- The StringItemBase::isEmpty() only proceeds on the user-added fields having "field_" prefix.
- I was not able to use $this->getFieldDefinition()->getName() because in some cases, the "field_name" key was missing in the definition and the BaseFieldDefinition::getName() produced a notice. (Maybe that's a sign of another bug?)
- Please note that I used $value['value'] instead of $this->value (as it was done in previous patches). Probably this was changed recently?
Let's see if it fails some tests...
Comment #36
leksat commentedComment #37
leksat commentedIt seems that previous patch was green because it did not work at all :D
Comment #39
leksat commentedComment #40
yched commentedWe shouldn't have a Core class depend on a class provided by field.module.
More generally, it would IMO be really confusing to have a field type behave differently between base and configurable fields.
I'm afraid we need to fix the fails from the previous approach :-\
Comment #41
leksat commented@yched, as I said in #34, it's just a quick workaround to make life of editors easier right now. Of course it needs to be fixed in some general way.
Comment #42
dawehnerI agree that such a workaround would be awful.
Regarding #26, I was wondering whether in
\Drupal\Core\Entity\Sql\SqlContentEntityStorage::mapToStorageRecordwe could store, forshared tables, the default value of the items? They should still be considered as empty, in case the default value is empty, but then workaround all the various test failures.
I think fixing all those test failures by adding the values in the tests might not be the base idea, given that it could break other existing usecases out there.
Yes, node.title is marked as required but this requiredness is sadly just enforced on the validate method.
Comment #43
dawehnerReuploading the latest non workaround patch from swentel #26 so see the test failures.
Comment #45
larowlanDuplicate of #2444585: NotNull validation constraint does not fail on empty strings?
Comment #46
larowlanOver in #2444585: NotNull validation constraint does not fail on empty strings there is extra code to ensure Anonymous is reserved username - do we need that here - or is unique username enough - guess it depends on case sensitivity
Comment #47
larowlanCould we sub-class StringItem as UserNameItem and have a different isEmpty() definition to bypass the crazy in user_install()?
Comment #48
dawehnerInteresting thought, working on it.
Comment #49
dawehnerLet's try that out.
Comment #50
amateescu commentedI was also working on this and a new user_name field type seems like the easiest way out. The interdiff is to #43 and some more tests are also fixed.
Comment #52
amateescu commentedPromoting to critical per #2444585-10: NotNull validation constraint does not fail on empty strings.
Comment #53
dawehnerIs it really needed to have an empty string here?
Urgs, this is so horrible that we require this
Comment #55
amateescu commentedFixed a few more tests and included the test from #2444585: NotNull validation constraint does not fail on empty strings.
Yes, because we still have 'not null' => TRUE for the database field.
Uhm.. why? :)
Comment #56
yched commentedAgreed, this is one of the reasons why I find "one-off business specific field types" suck (can't look at code atm, so no real opinion on the specific one introduced here, just speaking generally)
A field type whose name is so specific that it looks like the name of a specific field instance is kind of a sign that the specifics would ideally live on that field's definition rather than on a "general field type" used for one field only :-) Dunno if that's possible here though.
But I think a given fIeld definition can specify a different Item class, while using a generic field type, can't it ?
Comment #58
amateescu commentedYep, that's also an option and it works :)
Comment #60
larowlanthis would match a string id ('foo' == 0) is TRUE, can we use ===
Comment #61
larowlanOnce @marthinal has commented can a maintainer add him to the commit credits as bits of our patches from the duplicate #2444585: NotNull validation constraint does not fail on empty strings have been brought over here?
Comment #62
marthinal commented@larowlan many thanks for considering my contribution :)
Comment #63
dawehnerwow, I think this is quite a clusterfuck don't you think so? Making it possible to remove the item class like that won't make debuggability easier. It is a string item, but in fact its not really one. I honestly prefer the more explicit variant we had before
Comment #64
amateescu commentedFixes #60 and a lot more tests.
Re #63: No opinion on that, I'm fine with either way :)
Comment #66
alx_benjamin commentedComment #67
dawehnerWell, I'm not into entity API but IMHO we should avoid changing the used classes for a specific plugin and instead change the plugin ID itself.
Comment #68
amateescu commentedI'm done for today...
Comment #70
webchickAdding credit to marthinal, per request from larowlan.
Comment #71
larowlanprodding
Comment #72
yched commented"Defining a new field type for a one-off use only by a specific field" vs "the definition of that field specifying its different behavior" :
Yeah, I can't say I'm thrilled by either options, ideally user_name string fields wouldn't need that weirdness.
If they really do, then IMO encouraging a pattern of "if one specific field needs some very specific behavior then pollute the list of field types with one-off field types" is a bad pattern to encourage. Field types are supposed to be about reusable, semantic-agnostic data types in a data modelling toolbox.
If some behavior / semantic is specific to a given field, it belongs to that field's definition.
If the behavior can be formulated in a way that makes sense generically for "more that one" field to reuse, then a field type makes sense.
Here, the behavior is "for entity 0, I accept an empty string as a non-empty value", that's ... pretty specific :-)
I *think* @fago added the setClass() methods precisely for this kind of cases, but we could ask for his opinion.
Other (crazy) thoughts :
- This is about user 0 passing validation without having a name, right ? Can't we just say that user 0 name is 'anonymous' internally ? That's what we display anyway, so allowing a non-anon user to chose 'anonymous' as a name sounds like an open door for social trickery anyway ?
- rather than switching a different item class, maybe we could add a $field_definition->setIsEmptyCallback($callable), that would be called instead of the field type's isEmpty() ? Could even happen in a followup ?
Comment #73
yched commentedMore down-to-earth remarks :
If we stick to this approach, the comment should be adjusted, this does not define a "field type".
Minor, but given that getItemDefinition() returns an object which is by ref, we don't really need to re-set it back with setItemDefinition() after modifying it ?
Could be just
$fields['name']->getItemDefinition()->setClass('\Drupal\user\UserNameItem')?The comment a bit brief, so the reason why user_name needs a different isEmpty() logic than the parent remains a bit cryptic. IMO this could use a more detailed explanation. Having not read the recent issue comments fully, I don't really understand it :-)
Also : thanks a TON for picking that one up, folks :-)
Comment #74
larowlanSome more fixes.
Need a migrate expert for the migrate fail, and a views expert for the views one - over my head.
Comment #76
benjy commentedThe migrate fails are a bug in the setup, specifically for MigrateCckFieldValuesTest so I added the node creations there rather than MigrateNodeTestBase as not to impact the node tests that use the same base class.
I think we need a follow-up though to more gracefully handle nodes with an empty title, which is likely to be common from non-drupal sources.
Arh, just noticed an extra space i removed in MigrateNodeTestBase, can remove that in a future patch.
Comment #77
mpdonadioI think I can fix the Views fails tonight.
Comment #79
mpdonadioThree hours to add two lines to the test...
The Views fail is because EntityViewDisplay::buildMultiple() has a call to FieldItemList::filterEmptyItems(). The user name for uid==0 in the test wasn't being initialized properly, so the new isEmpty() was filtering out this item, so nothing was being rendered. HandlerFieldUserNameTest comes up green for me locally now.
I already have the patch rolled and tested, but it may be better to fix this in UserTestBase::setUp(). Too tired to test try this, but also worried about ripple effects.
Comment #80
mpdonadioOh, if you are wondering how that gets traced back, start at \Drupal\views\Plugin\views\field::getItems(), go into \Drupal\views\Entity\Render\EntityFieldRenderer::render(), then you end up in ::buildFields, and then you are in the Field API at EntityViewDisplay::buildMultiple(). Whhhhhheeeeeeeeeeee!!!!!!
Comment #81
larowlan#72 and #73 still need to be addressed
Comment #82
amateescu commented#72.crazy thought 1: This is my only knowledge reference for why we can't store the anonymous user name in the storage: #2259323-12: Fix the sub-query part in UserSelection::entityQueryAlter.
#72.crazy thought 2: That would make it even harder for debugging, imo.
#73.1, 2 and 3: Fixed.
Comment #83
benjy commentedCreated #2550033: Handle empty fields gracefully as the follow-up i mentioned in #76
Comment #84
dawehnerIMHO some quick comment here would be great. What makes passwords special
Not a blocker:
/me still shakes his HEAD, well, this is why Drupal is so complex. Small microopts all over the place
Comment #85
jibranDo we need an update hook for this?
Comment #86
larowlanFixes #84 and #85
Comment #87
jibranLet us fix it.
Comment #89
larowlanSo we did need the 'name' => ''
Do we need an update path for it - no - that's what my current HEAD install gets for that column,
Comment #90
jibranRTBC+1
Just for the record currently in HEAD the name is empty so no need for upgrade path. Thanks @dawehner & @larowlan for clearing that up.
Comment #91
leksat commented/sorry, misclicked/
Comment #92
yched commentedAgreed with @dawehner that the "$fields['name']->getItemDefinition()->setClass('\Drupal\user\UserNameItem');" part could use a comment :-)
Other than that, the weirdness of user.name, and the fact that we need to override StringItem::isEmpty(), still bugs me.
- It's currently a non-required field, which implies there was a will to allow it to be empty (as in "no value", not as in "the value is an empty string"), and that default validation will not reject it if empty.
- The fact that we accept it as empty only for user 0 could be enforced by a custom constraint ? (the field already uses some custom business logic constraints)
- user 0 having no name should mean we store NULL, NULL typically means no value, right ? Yet we explicitly enforce "NOT NULL = TRUE" in UserStorageSchema::getSharedTableFieldSchema() (without that special code it would be NOT NULL = FALSE like any field in a shared table)
So it seems we intended to allow NULL user name for user 0, yet we go out of our way to store that as empty string rather than NULL, and then need to add ugly hacks to override the field type's isEmpty() so that in some cases this empty string is not filtered out as an empty value and can reach the storage.
That sounds... backwards :-) Why not just store NULL as you normally would for an empty field, and avoid that growing mess ?
Comment #93
amateescu commentedThe answer to that is just above the place we declare the 'name' column as not null in UserStorageSchema::getSharedTableFieldSchema():
Comment #94
yched commentedSure, saw that, I'm just saying that it's artificial for a field that is semantically nullable, and forces us to use weird tricks to make it work.
non-null indexes might be more performant, not sure it warrants running through hoops to forcibly make a nullable value not null :-)
[edit : also, the field being non-required and the hoops we use to make it NOT NULL seem to contradict each other]
Comment #95
amateescu commentedI don't think we ever had that intention, tbh :) As far as I see it, this is a field type where the difference between NULL and an empty string matters (i.e. for db performance reasons).
IMO, the ugly hacks are the other way around: this field should be required, but, if we mark it as such, the NotNull constraint would be added automatically, which by default checks the isEmpty() typed data method, which brings us circling back to the need for a custom field item implementation...
Comment #96
yched commentedYes, and this patch adjusts isEmpty() to be FALSE if the field contains an empty string, so it means if we go that way, the field always has a value and there's no reason to keep it non-required ? That's why I'm saying we're not being consistent by having the field both non-required and able to have an empty string as an actual value.
I disagree, that's not a distinction we need here. We're not trying to be able to semantically differentiate between "field has no value" and "field has a value that happens to be an empty string", because we don't have both cases. We have a field that either has a value or doesn't, and in that latter case we'd just prefer to store an empty string instead of NULL. That is just for SQL-specific reasons (NULL and index performances), which should IMO not leak into the data model and field types by introducing IMO potentially highly confusing subtleties (empty string but non-empty Item) if we don't actually have an actual need for that distinction on the data level.
(also, widgets can't really make the difference between "I want no value and I want a value of empty string")
IMO keeping performant indexes could be an optimisation handled at the SQL storage level : for text / varchar properties in fields in shared tables, keep the column as NOT NULL, store empty string instead of NULL, and translate back and forth on load / save. Those considerations shouldn't reach the field type logic.
Can totally be a followup though, so that we can finally get rid of the more general StringItem brokenness, which is a major yay :-)
Comment #97
amateescu commentedOk, let's try and see if anything breaks.
As for the "distinction between NULL and an empty string" part, that also makes sense, thanks for the thorough explanation :)
Comment #98
larowlanThe other approach could be what we do in #2404331: Can't serialise an entity with an entity reference to an unsaved entity (remove EntityReferenceItem::$NEW_MARKER in favour of a validation constraint) where we use a different data definition, instead of a different data-type. In the alternate data definition we override getConstraints and substitute the NotNull for a different implementation. We could do that here too for name.
Comment #99
jibranDifferent DataType makes a lot of sense to me. It will also help us remove
$fields['name']->getItemDefinition()->setClass('\Drupal\user\UserNameItem');which is really bugging @dawehner.Comment #100
yched commentedWell, the whole point of the discussion since #92 is about finding ways to avoid having to special case a new field type (either an actual field type or a one-off override just for $fields['name']->getItemDefinition()->setClass()).
I'm still kind of convinced about what I wrote in #96 - more briefly :
- We don't need a field type that can be a 3-state ("has a value" / "has an empty value" / "has no value"). That adds unneeded confusion and is not actually what is needed here.
- We just want SQL storage to store "has no value" as empty string rather than NULL for text columns, for SQL-index-specific reasons. That's an optimization that could be done in SQLStorage, probably in a separate issue because we really need to fix the StringItem WTF, which is rightly critical.
So the way I see it it's more a question of "what is less of a WTF til we get there (if we do) ?"
- I'd say it's not creating a new field type, because we couldn't remove it later and would be stuck with it
- The one-off override (as in the current patch) leaks less and would be easier to remove without breaking BC too much
Can't really tell for the approach @larowlan proposes in #98. Where would that "custom definition class" be ? I'm not sure I'm a fan of an approach that would require something lilke :
This bugs me less in #2404331: Can't serialise an entity with an entity reference to an unsaved entity (remove EntityReferenceItem::$NEW_MARKER in favour of a validation constraint) where the custom definition class is for a TD property (more "hidden"), but scattering several BFD classes scares me more as a practice to initiate (because it's a more "public" API, or maybe just because it's more "my" API than TD, not sure :-p)
[edit: actually, I know why - there are already several DataDefinition classes, while "there is only one BaseFieldDefinition class" is an epxlicit DX decision that was made a couple years ago now :-)]
Comment #101
gábor hojtsyI don't think our code style suggest a comma at end of inline arrays.
Would this run into #2542748: Automatic entity updates can fail when there is existing content, leaving the site's schema in an unpredictable state given that it will add NOT NULL to the schema? I guess either we'll need to add the update path here or there depending on which one lands first.
Comment #102
gábor hojtsyFirst a reroll because UserUnitTestBase was renamed to UserKernelTestBase.
Comment #103
gábor hojtsyFixing #101/1.
Comment #104
amateescu commentedI was cross-posting the same patch :)
#101.2: Nope, because that column is already not null; see
\Drupal\user\UserStorageSchema::getSharedTableFieldSchema(and the entire discussion above :P)Comment #105
gábor hojtsyIn that case, looks like we can go back to RTBC. The other concerns seem to me not affecting what is critical about this patch.
Comment #106
effulgentsia commentedAdding credit for yched and jibran for their several helpful comments. And for dmsmidt for reporting.
Comment #107
effulgentsia commentedComment #108
effulgentsia commentedComment #109
effulgentsia commentedOverall, this patch looks great.
Is this meant to be part of this patch? If so, why?
Node::baseFieldDefinitions() still has a setDefaultValue('') on a required field (title). Some other classes also have it on non-required string and email fields. Do some/all of those need this removed as well? Do we need a CR informing contrib to remove it too? What happens if you don't remove it?
Comment #110
amateescu commentedSee #76 for an anwser :)
I think @swentel removed it for Shortcut in the initial patches since it simply doesn't make sense to set the default value as an empty string if the field is required. Before this patch, this was useful for nodes because it allowed tests to not specify a value for 'title', but since we have to do that now, we should also remove it from Node.
It would be a very small one that says to remove it for required string fields. So not sure if it's worth it or not, I'll let others decide :)
Nothing, it just doesn't give you the same benefit as without this patch, i.e. you would still have to explicitly set a value for a required string field.
Comment #111
gábor hojtsyLooks fine with me again :) I did not track #76 either, makes sense.
Comment #112
amateescu commentedThis is just a cosmetic change to keep things consistent, so leaving at RTBC.
Comment #113
jibranIn
UserNameItemwe are using magic getter and inStringItemBaseandPasswordItemwe are explicitly getting values. Can we be consistent here?Comment #114
amateescu commentedGood spot :)
Comment #115
jibranRTBC +1
Comment #116
effulgentsia commentedJust posting some notes for follow-ups if anyone's inspired. Does not need to block the commit of a critical.
I don't think we need this doc. It's pretty clear that the PasswordItem implementation is unique.
This doesn't seem ideal. I'm not sure if this is a weakness in entity_load_multiple_by_properties() or something else, but it seems like we should be able to loadByProperties from an existing value of the same property without needing to deal with converting one kind of empty to another. Unless there's a quick and obvious answer on why this is good the way it is, if someone can post a followup about that, I'd appreciate it.
A lot of test changes in this patch like this one. Perhaps we need a followup to centralize this somewhere?
Thanks for the link to #76, but this is not at all easy to understand when just reading the code. Why these specific nids, etc.? Some docs with @see links to where these nids come from and maybe a @todo link to #2550033: Handle empty fields gracefully would be helpful.
setClass() isn't a method on the @return type of getItemDefinition(), so a
/** @varline would help, plus a brief comment explaining why we're doing this would help. Maybe also a follow-up to discuss whether there's a way to not need this?A comment explaining why this isn't in the
Pluginnamespace would be useful: i.e., that it's not a discovered field type, but an override of the 'string' implementation without changing its type.Comment #118
effulgentsia commentedPushed #114 to 8.0.x. Thanks! One or more follow-ups for #116 would be nice, but not critical.
I do think a CR would be good though. Primarily for contrib modules implementing field types that extend StringBase. And secondarily a mention of what a module that does a
setDefaultValue('')needs to now know.Comment #119
jibranI disagree with #116.1. Thanks for the commit @effulgentsia.
Comment #120
webchickCo-wrote this with @gaborhojtsy https://www.drupal.org/node/2554995
Comment #122
neclimdulI don't know that anyone will actually see this since its been autoclosed but it seems this might be the cause of a pretty serious problem. #2567899: Empty node titles cause failures when migrating because empty strings are treated as NULL I was tempted to mark it as critical. Thanks @dawehner for pointing me to this.