Updated: Comment #0
Problem/Motivation
Follow-up from #2136197-75: Move field/instance/widget/formatter settings out of annotation / plugin definition.
Currently this code assumes that field settings always have values as plain structure
+++ b/core/lib/Drupal/Core/TypedData/TypedDataManager.php
@@ -259,6 +259,13 @@ public function getPropertyInstance(TypedDataInterface $object, $property_name,
}
$key = $definition->getDataType();
if ($settings = $definition->getSettings()) {
+ // Do not throw notices if settings values is array.
+ foreach ($settings as $k => $v) {
+ if (is_array($v)) {
+ debug($v, "key: $k");
+ $settings[$k] = ''; // stub for array settings or maybe serialize?
+ }
+ }
$key .= ':' . implode(',', $settings);
}
$key .= ':' . $object->getPropertyPath() . '.';
This implode()
throws notice:
Array to string conversionimplode(',', Array) Drupal\Core\TypedData\TypedDataManager->getPropertyInstance(Object, 0, NULL) Drupal\Core\TypedData\Plugin\DataType\ItemList->createItem(0) Drupal\Core\Field\FieldItemList->__construct(Object, NULL, NULL) Drupal\Core\TypedData\TypedDataManager->createInstance('list', Array) Drupal\Core\TypedData\TypedDataManager->create(Object) Drupal\system\Tests\Entity\EntityFieldTest->testEntityConstraintValidation() Drupal\simpletest\TestBase->run() simpletest_script_run_one_test('227', 'Drupal\system\Tests\Entity\EntityFieldTest')
Proposed resolution
Allow field settings with nested arrays,
Remaining tasks
Find a way to replace implode()
or refactor this code
User interface changes
no
API changes
no
Comment | File | Size | Author |
---|---|---|---|
#38 | 33-38-interdiff.txt | 986 bytes | alexpott |
#38 | 2221577.38.patch | 1.57 KB | alexpott |
#33 | 2221577.test-only.patch | 618 bytes | alexpott |
#33 | 2221577.33.patch | 1.28 KB | alexpott |
Comments
Comment #1
andypostComment #2
andypostLet's just calculate a hash, not sure how this affects performance
Comment #3
BerdirDo we even need the hash? As long as you don't have more than 2-3 settings, just the json string is shorter than the hash and likely also easier to debug.
Also needs a comment, something like "Add a string representation of the settings to the key to ensure that different objects are created for data types with different settings."?
Comment #4
andypostAdded a comment, also explained the reason for hash
Comment #5
fagoI see. Given we've objects as data definitions now, I wonder whether we couldn't just the spl object id for identifying definitions and start with that as the key? That means, that as long as we start off with shared definition objects (what we really should make sure it's the case anyway), we'll have the prototyping.
Given that, I think we could even vastly simplify the logic and just prototype based on a passed in $definition object of the data type to be created.
Comment #6
andypostNice idea, let's see
Comment #8
andypostSeems list part is needed
Comment #10
andypostComment #13
andypostDiscussed this with @fago, we still need the property path here
Comment #15
andypost@fago there's still a lot of exceptions somehow
Comment #16
yched CreditAttribution: yched commentedWeird - I ran a couple of those tests locally, they passed.
Comment #18
andypostSuppose one-line should be fine
Comment #19
andypostGreen! That's was bot flux...
Comment #20
alexpottI'm hitting this issue in #2271419: Allow field types, widgets, formatters to specify config dependencies - like the solution.
Comment #21
alexpottHmmm... actually i think the solution is not quite right.
In order for this static cache to work then we have to be using exactly the definition object. That was not the case with the old code.
What is wrong with
$key .= ':' . serialize($settings);
?Comment #22
andypostThe reason to not use serialization is performance that's why I initially used a hash()
So to prevent duplication of definition objects the spl_object_hash() looks better approach so no static cache needed
Comment #23
alexpott@andypost but my point is that once we use spl_object_hash() the static cache that is being set up here is less likely to be used because using that means that exactly the same field definition object needs to be pass in - not just the same values - the same instantiated object.
Comment #24
BerdirWhich should be the case if you create multiple nodes of the same bundle for example. Also used to be the case to be the same for all bundles for base fields, not sure if that changed. With overrides, certainly.
Comment #25
yched CreditAttribution: yched commentedThis is blocking #2271419: Allow field types, widgets, formatters to specify config dependencies which is critical, so this is critical ?
I tend to agree with @andypost and @Berdir here. The perf optim here are mostly about "this is the *same* definition object". I'd think this is good enough, and more robust, than building a compound key with "all the things that matter, including serializing stuff" ?
The existence of overrides means definitions *objects* differ for base fields across bundles, sure. I'd say that's fair game ?
Since this is also the approach suggested by @fago, temptatively kicking back in @alexpott's yard ?
Comment #26
yched CreditAttribution: yched commentedThat could really use a comment, though.
Comment #27
yched CreditAttribution: yched commentedAlso, no need to use both : and . as separators between the spl_hash() and the propertyPath().
Comment #28
swentel CreditAttribution: swentel commentedIt's hierarchy no ?
Comment #29
alexpottWasn't the . separator used because getPropertyPath() returns stuff with dots in - see
Comment #30
yched CreditAttribution: yched commentedHem, so much for harmless post-RTBC changes...
But actually, I looked into the surrounding code a bit more, and now I'm just confused :-/
I overlooked that that line at the top of TDM::getPropertyInstance() so far:
$definition = $object->getRoot()->getDataDefinition();
The $definition that is being looked at to answer "do I have a prototype I can reuse for the new Data object I'm asked to create" is the definition of the root object.
(The method then adds to the confusion by later using the same $definition var name for "the DataDefinition of the actual object I need to create" - i.e the FieldItemList, FieldItem, etc...)
So the logic currently is : re-use the prototype if we have already created objects at the same property path (e.g body.0) on a "similar" root object - for some definition of "similar".
In an Entity/FieldItemList/FieldItem context, that root object is always the entity, and $definition is the EntityDataDefinition for the entity.
Then :
- I have to admit I don't understand what this specific issue here is about :-)
$entity_data_definition->getSettings() always returns an empty array anyway.
- As a side note, not sure what the "if ($definition instanceof ListDataDefinition) {" block is used for, it looks like it's never executed - at least not when manipulating entities/fields. Was added in #2112239: Convert base field and property definitions, (discussed in #11 / #13 / #23), so was definitely in a "field" context. For the record, it's the same issue that added the "settings" lookup we're discussing here.
- The $key used to determine if a prototype can be re-used is, in current HEAD, 'entity:[entity_type]:[bundle]:.[property_path]".
We never look at the field definition. "Same entity type + bundle + field name ? Reuse the prototype". I don't know, that sounds... overly accepting ?
I'm a confused atm about what exact logic we'd want here, but yeah, we cannot go with "similar = same runtime EntityDataDefinition object according to spl_object_hash()", because after #2002138: Use an adapter for supporting typed data on ContentEntities, EntityDataDefinition objects are created at runtime for each entity (Entity::getTypedData() / EntityTypeData::createFromEntity())
As a proof - if I add the following debug code in TDM::getPropertyInstance() :
Then running the following session in "drush php" (node 1 and 2 are both article nodes):
HEAD:
PATCH (spl_object_hash):
In short :
- the code and logic in TDM::getPropertyInstance() could really use some cleanup / clarification / possibly rethinking ?
- in the current logic, spl_object_hash() is not an option...
- but it's not clear to me what's the actual issue around field settings we're trying to address here to begin with :-)
Comment #31
alexpott@yched this is the problem we are trying to fix since this can fail if $settings is a multi-dimensional array - this
Is not always true.
Comment #32
yched CreditAttribution: yched commented@alexpott: ok, then what I don't get is when is this not an empty array ? It was in the couple cases I stepped through.
+ I might be overlooking something, but I don't see what would be the settings of an EntityDataDefinition ?
Maybe it would help to have an example dump of what is found in getSettings() if not empty ?
Or more simply put : how do we reproduce the bug discussed here ? :-)
Comment #33
alexpottExplanation in #2271419-49: Allow field types, widgets, formatters to specify config dependencies.
Test only patch attached.
Patch attached should also address the cache misses reported in #30
Comment #35
yched CreditAttribution: yched commentedOK, so the fails are caused by EntityFieldTest::testEntityConstraintValidation() creating a FieldItemList directly without a parent entity.
Then, when an Item is created, getPropertyInstance() is called with $object = the FieldItemList
- $object->getRoot() = $object, since it has no known parent
- $definition is the ListDataDefinition instead of the EntityDataDefinition
- special code for "if ($definition instanceof ListDataDefinition) { use the ItemDefinition instead }"
- that ItemDefinition is the FieldItemDataDefinition, and its ->settings() are the field settings...
This is not supposed to happen in the regular flow (we don't support FieldItemLists without a parent entity), but we do have a test that does it. This screws the "prototype cache" logic, but no biggie in a test.
Also, I guess outside of the "Entity/FieldItemList" case, we do want to support standalone ItemLists without a parent, and with ItemDataDeffinitions that might have non-scalar settings.
So, yeah, serializing looks like the only valid option in that case.
I do intend to try to cleanup / clarify the content of getPropertyInstance(), but for that issue here, #33 should be good ?
Comment #36
andypostAs this one blocks other issues the fix is fine.
OTOH memory consumption should grow with a straight serialized data so needs profiling.
I still think that we need to hash serialized serialized data to make data structure smaller.
Also having ":" does not make any sense because serialized data could have a lot of it
Comment #37
yched CreditAttribution: yched commentedI'm not really worried about memory consumption. As explained above, $definition->getSettings() is an empty array in of the 99% actual use cases - at least in the current logic for prototype reuse that is based on comparing definitions of the root object.
Also, the presence of ':' in the serialized data is not really problematic, since we're only concatenating cache key parts, and never try epxlode them back apart.
But well, since the case for serialize() is pretty rare anyway, we could also chose to serialize() + hash. No strong opinion, TBH.
Comment #38
alexpottWell better safe that sorry - let's hash the output of serialise so the size of the output is predictable.
Comment #39
yched CreditAttribution: yched commentedAlright. Confirming RTBC.
Comment #40
Dries CreditAttribution: Dries commentedCommitted to 8.x. Thank you!
Comment #42
yched CreditAttribution: yched commentedPosted a cleanup patch : #2364267: Clarify the logic in TypedDataManager::getPropertyInstance()
Comment #43
yched CreditAttribution: yched commentedAbout the use of Crypt::hashBase64() :
#2346763: Improve views field rendering performance by caching entity display objects has a similar case of serializing an array of settings for a static cache key.
Over there, @msonnabaum used crc32(serialize($array)).
Quoting him from #2346763-10: Improve views field rendering performance by caching entity display objects :
So, should we change to crc32() here as well for consistency ?
Comment #44
andypostconsistency++ maybe do that in #2364267: Clarify the logic in TypedDataManager::getPropertyInstance()