Problem/Motivation

Currently, onDependencyRemoval() of Search API Index entity doesn't handle the case when dependent fields are removed. It results in removing whole index config entity.
In addition, we set a dependency on field definition instead of field storage definition.

Proposed resolution

Remaining tasks

Extend onDependencyRemoval() to consider removing field storage dependencies.

User interface changes

API changes

Data model changes

Estimated Value and Story Points

This issue was identified as a Beta Blocker for Drupal 8. We sat down and figured out the value proposition and amount of work (story points) for this issue.

Value and Story points are in the scale of fibonacci. Our minimum is 1, our maximum is 21. The higher, the more value or work a certain issue has.

Value : 3
Story Points: 5

CommentFileSizeAuthor
#51 2541206-51--field_dependency_removal.patch23.21 KBdrunken monkey
#51 2541206-51--field_dependency_removal--interdiff.txt2.31 KBdrunken monkey
#48 2541206-48--field_dependency_removal.patch21.75 KBdrunken monkey
#45 2541206-45--field_dependency_removal.patch24.04 KBdrunken monkey
#45 2541206-45--field_dependency_removal--interdiff.txt11.7 KBdrunken monkey
#39 2541206-39--field_dependency_removal.patch20.49 KBdrunken monkey
#39 2541206-39--field_dependency_removal--interdiff.txt14.32 KBdrunken monkey
#34 consider_field_storage-2541206-34-interdiff.txt7.28 KBmbovan
#34 consider_field_storage-2541206-34.patch16.03 KBmbovan
#33 2541206-33--field_dependency_removal.patch13.26 KBdrunken monkey
#33 2541206-33--field_dependency_removal--interdiff.txt1.81 KBdrunken monkey
#31 consider_field_storage-2541206-31-interdiff.txt638 bytesmbovan
#31 consider_field_storage-2541206-31.patch13.44 KBmbovan
#28 consider_field_storage-2541206-28.patch12.82 KBmbovan
#25 2541206-25--field_dependency_removal--interdiff.txt15.97 KBdrunken monkey
#25 2541206-25--field_dependency_removal.patch17.74 KBdrunken monkey
#22 consider_field_storage-2541206-22.patch14.55 KBmbovan
#22 consider_field_storage-2541206-22-interdiff.txt3.82 KBmbovan
#18 consider_field_storage-2541206-18-interdiff.txt4.42 KBBerdir
#18 consider_field_storage-2541206-18.patch12.64 KBBerdir
#14 consider_field_storage-2541206-14-interdiff.txt7.82 KBmbovan
#14 consider_field_storage-2541206-14.patch11.69 KBmbovan
#11 consider_field_storage-2541206-11-interdiff.txt9.17 KBmbovan
#11 consider_field_storage-2541206-11.patch7.57 KBmbovan
#5 consider_field_storage-2541206-5-interdiff.txt997 bytesmbovan
#5 consider_field_storage-2541206-5.patch3.17 KBmbovan
#4 consider_field_storage-2541206-4-interdiff.txt3.11 KBmbovan
#4 consider_field_storage-2541206-4.patch3.43 KBmbovan
#1 consider_field_storage-2541206-1.patch2.94 KBmbovan
Support from Acquia helps fund testing for Drupal Acquia logo

Comments

mbovan’s picture

Status: Active » Needs review
FileSize
2.94 KB

Changed dependency name for fields to field storage definition config dependency name.

Extend onDependencyRemoval() on Search API Index to support removing field storage dependencies.
There are 2 @todos:
- Making the field name generic through datasource
- Removing additional fields

Berdir’s picture

+++ b/src/Entity/Index.php
@@ -1403,6 +1417,12 @@ class Index extends ConfigEntityBase implements IndexInterface {
+      if ($entity instanceof FieldStorageConfig) {
+        // @todo Make this generic through the datasource.
+        // Remove the deleted field.
+        $field_id = 'entity:' . $entity->getTargetEntityTypeId() . '/' . $entity->getName();
+        $changed = $this->removeField($field_id);
+      }

So the hard part is being able to route this through the data source so that it can give us the field_id to remove.

Maybe something like $datasource->getFieldIdsForDependencies($dependencies) or something like that?

drunken monkey’s picture

Status: Needs review » Needs work

Thanks a lot for reporting this problem and providing a patch! It already looks quite good. A few notes, though:

  1. +++ b/src/Entity/Index.php
    @@ -1403,6 +1417,12 @@ class Index extends ConfigEntityBase implements IndexInterface {
    +        $changed = $this->removeField($field_id);
    

    That needs to be a |= instead of an =, unless I'm mistaken. Otherwise, other changes might be ignored.

  2. +++ b/src/IndexInterface.php
    @@ -558,4 +558,15 @@ interface IndexInterface extends ConfigEntityInterface {
    +  public function removeField($field_id);
    

    That's a bit weird to have if there's no addField() method.
    There is already the method of $index->getFields()[$field_id]->setIndexed(FALSE, TRUE) which is a bit more verbose, but should work fine.

- Making the field name generic through datasource

How about just making getFieldDependencies() public and letting it return a mapping between Search API fields and dependencies (no matter which way – we just have to keep in mind that that's an N:M relationship). It would then be called from the index automatically, of course, the datasource would just return (in the case of the ContentEntity datasource) the provider of the entity type it's configured for as the sole dependency. And the onDependencyRemoval() code would have it easy to determine which fields need to be removed. (So it seems for both (existing) use cases the format [$dependency => $field_ids] would be handier.)

- Removing additional fields

I think that would better be handled along with the previous item – when getting the dependencies from the datasource, it would just list all Search API fields that depend on each of the Field API fields – i.e., we would know that both entity:node/field_editor and entity:node/field_editor:entity:name depend on the field_editor field. (However, we of course have to keep in mind when defining the return value of getFieldDependencies() (and implementing it) that, e.g., entity:node/field_editor:entity:field_project would depend on two fields – field_editor and field_project.

The point that we should depend on field storages, not field instances, was already discussed in #2472917: Make index configuration depend on field storage not instances. Since there's no patch for that issue, yet, it's OK to fix it here, of course. I've just closed that issue as a duplicate now.

Anyways, thanks again for the great work!

mbovan’s picture

Uploading a new patch with some points applied from #3. Additional fields are still not considered.

There are quite some caching issues (?), not directly related to this one:

1. Create a new field (Drupal field), it won't appear on the Search API Index page until you clear the cache.
2. Create an index, enable some fields, field dependencies are not calculated each time you save changes? You have to clear the cache and save it again to see configuration updated.
3. getFieldsByDatasource() sometimes returns empty array even if there are fields present in the index configuration. You have to clear the cache and re-save the index configuration...

mbovan’s picture

Just removing unused removeField() method.

drunken monkey’s picture

Status: Needs review » Needs work

Thanks for the reworked patch! Looks already better, but I don't think you've correctly understood my proposal?
It was about re-using the already existing (protected) getFieldDependencies() by making it public and letting it return a mapping of all dependencies mapped to the fields that depend on them.
If done properly, this would already take care of additional fields, too.

+++ b/src/Entity/Index.php
@@ -1425,6 +1425,14 @@ class Index extends ConfigEntityBase implements IndexInterface {
+        $this->getFields()[$field_id]->setIndexed(FALSE, TRUE);

If, for some reason, the field isn't present in the index, this will lead to a fatal error. Please check for the existence of the field before calling the method (and only set $changed if it is there.

If the caching issues are not related, please create a new issue for them. It seems we just shouldn't cache the unindexed fields?

Berdir’s picture

We discussed that and couldn't really make sense of what you were suggesting since the function seems to be opposite of what we need (needs fields, returns dependencies).

But I think I understand what you are saying now. That it should just return the mapping for all fields and their dependencies, not just those that are passed in?

We can try that tomorrow.

drunken monkey’s picture

But I think I understand what you are saying now. That it should just return the mapping for all fields and their dependencies, not just those that are passed in?

Exactly, yes. Instead of just returning an array with the dependency keys as values, put the dependency keys into the returned array's keys and store the Search API field(s) each dependency comes from in the values. That way, you can easily use the same method for both determining the dependencies (array_keys()) and determining the fields which should be removed when a certain dependency is removed (foreach ($dependency_fields[$dependency] …). Less code, less potential for bugs. And the issue with additional fields should also already be solved (though it seems we don't handle non-standard entity references, yet (i.e., non-Field API properties that reference entities) – but that's a different issue).

The last submitted patch, 4: consider_field_storage-2541206-4.patch, failed testing.

The last submitted patch, 5: consider_field_storage-2541206-5.patch, failed testing.

mbovan’s picture

Providing a mapping of dependencies and fields approach patch.

We tried to solve caching issues here by introducing $useComputedFieldsCache flag on Index because we are blocked with it and cannot properly test things. #4 1 and #4 2 should be solved with it.

Currently, I have a problem on onDependencyRemoval() in Index when calling $this->getFieldsByDatasource($datasource_id). It gives me list of fields on the given datasource but without (Search API) fields that have dependency on the Drupal field that is going to be removed. I guess that's happening because of computeFields() and the $useComputedFieldsCache flag we added, but cannot confirm.

I would appreciate some feedback here.

drunken monkey’s picture

Status: Needs review » Needs work

Thanks a lot for your work on this!

First off, some remarks on the code (interdiff) I had at a glance:

  1. +++ b/src/Entity/Index.php
    @@ -636,7 +648,7 @@ class Index extends ConfigEntityBase implements IndexInterface {
    +      \Drupal::cache()->set($cid, $this->fields, Cache::PERMANENT, Cache::mergeTags($this->getCacheTags(), ['entity_field_info']));
    

    We currently don't use the short array syntax in this module, I think mixing this is more confusing than useful.

  2. +++ b/src/Entity/Index.php
    @@ -1425,6 +1427,38 @@ class Index extends ConfigEntityBase implements IndexInterface {
    +          $fields = array();
    +          // @todo: There is no field that is going to be removed.
    +          foreach ($this->getFieldsByDatasource($datasource_id) as $field) {
    

    What is that @todo supposed to mean?

  3. +++ b/src/Entity/Index.php
    @@ -1425,6 +1427,38 @@ class Index extends ConfigEntityBase implements IndexInterface {
    +                    $this->getFields()[$field_name]->setIndexed(FALSE, TRUE);
    

    I think you can just save the fields in a variable before and then call setIndexed() and unset() when removing fields.

  4. +++ b/src/Plugin/search_api/datasource/ContentEntity.php
    @@ -580,13 +580,14 @@ class ContentEntity extends DatasourcePluginBase {
    +    // @todo: Consider additional fields when calculating dependencies?
    

    What do you mean with that? Don't you do that already?

  5. +++ b/src/Plugin/search_api/datasource/ContentEntity.php
    @@ -580,13 +580,14 @@ class ContentEntity extends DatasourcePluginBase {
    -      $this->addDependencies(array('config' => $field_dependencies));
    +      $this->addDependencies(array('config' => array_keys($field_dependencies)));
    

    As said, now that that method is public I would move this code to the Index class (since it will be the same for all datasources).

  6. +++ b/src/Plugin/search_api/datasource/ContentEntity.php
    @@ -625,7 +626,26 @@ class ContentEntity extends DatasourcePluginBase {
    -            $field_dependencies[$field_definition->getFieldStorageDefinition()->getConfigDependencyName()] = TRUE;
    +
    +            // Make a mapping of dependencies and fields that depend on them.
    

    Please remove the empty line.

Then, I'm not sure I understand the purpose of the $useComputedFieldsCache property correctly.
At the very least, if it is FALSE, I don't think you should set the persistent cache to the newly calculated value?

Also, for #4.1, isn't that what the new entity_field_info cache tag solves? Otherwise, why was that added?

And #4.2 should be solved by just clearing our cache when the index is saved. Which, I think, our cache tags should already take care of? Otherwise, we should debug it, but if it hasn't got anything to do with this issue, I wouldn't want to introduce a haphazard fix here, which might (as you say yourself) interfere with our solution for this issue.
(If it makes testing harder/impossible, we'll just have to solve that issue first, before working further on this one.)

Speaking of tests: If possible, it would be great to get some automated test coverage for this issue. It's turning out really complex, and once it's fixed we should ensure that it stays fixed.

Currently, I have a problem on onDependencyRemoval() in Index when calling $this->getFieldsByDatasource($datasource_id). It gives me list of fields on the given datasource but without (Search API) fields that have dependency on the Drupal field that is going to be removed. I guess that's happening because of computeFields() and the $useComputedFieldsCache flag we added, but cannot confirm.

Hm, maybe for this code, with the sensitive time it's executed (in between various config changes) it might just be easier/better to only use low-level operations instead of our helper methods. I.e., loop through $this->options['fields'] directly, and unset the removed fields directly there. Maybe that will help? And it might be possible to merge some of the loops that way, because currently your looping (explicitly and implicitly) a lot through the datasources and fields. Maybe just try to build a more suited data structure for the method and then go with that.

If I can help you with something else, feel free to ping me again via mail or IRC.

Berdir’s picture

On the caching changes, we worked on that together, it is a bit complicated. Field definitions are pretty wrong in HEAD because of that, they're at least one save delayed. Which isn't *directly* related but it makes it very hard to debug/work on this issue.

First, those caches need to be invalidated if new fields are added, which is why we add the entity_field_data cache tag. However, that's only the start.

When you are saving the index and change the fields, we need to request them again during presave. in HEAD, that doesn't work at all because it just keeps using the computed field arrays that haven't been reset. So when the field configuration changes, then we now unset the computed field arrays. But even that is not enough, because it falls back to the cache and loads that again (which has not yet been invalidated as the view has not yet been saved).

That's why we added that flag to prevent it from using the cache at all when it has been changed in the current request. We could also delete the cache, but that method is called for every field so we would repeatedly have to delete that cache.. and there might be parallel requests to the site that are building it again, who knows. But yes, we also shouldn't write the cache if that flag is set.

mbovan’s picture

Re #12:

1. Changed to array()
2. That was the line where was the problem - not loading the right fields.
3. Changed.
4. This is the line where we add dependencies. I was wondering if we need to add dependencies if there are only additional fields selected, but probably not.
5. Moved to Index class now.
6. Removed.

Then, I'm not sure I understand the purpose of the $useComputedFieldsCache property correctly.
At the very least, if it is FALSE, I don't think you should set the persistent cache to the newly calculated value?

It should be set only if it is TRUE?

Speaking of tests: If possible, it would be great to get some automated test coverage for this issue. It's turning out really complex, and once it's fixed we should ensure that it stays fixed

Added new test assertions in IntergrationTest which should fail now (with this patch and without).

Hm, maybe for this code, with the sensitive time it's executed (in between various config changes) it might just be easier/better to only use low-level operations instead of our helper methods. I.e., loop through $this->options['fields'] directly, and unset the removed fields directly there. Maybe that will help? And it might be possible to merge some of the loops that way, because currently your looping (explicitly and implicitly) a lot through the datasources and fields. Maybe just try to build a more suited data structure for the method and then go with that.

Cannot use $this->options['fields'] as it only contains type and not property path what we currently use. Should we modify options, or getFieldDependencies()?

Status: Needs review » Needs work

The last submitted patch, 14: consider_field_storage-2541206-14.patch, failed testing.

drunken monkey’s picture

@ Sascha: Thanks for the explanation! Makes sense, of course.
Still, I a) think the solution looks a bit hacky and b) worry that with just the check in setOption() this will not cover all cases. I.e., there's also setOptions() and set() which might change the fields, and maybe we'll add other code later.

Safer, I think, would be to maybe store alongside the cached data the fields options for which it is valid – i.e., make the cache self-validating. Maybe with just a hash of $this->options['fields'] (and additional fields) attached to the data? That way, you could right away detect when the information that was used to build the cache isn't the one for which you want the fields data now – no matter what events led to this situation. (On the other hand, I don't think this is a pattern employed elsewhere (e.g., in Core), so maybe there's a good reason not to do that. I'ts just an idea.)

Also, if we end up agreeing on using the flag instead, the property should be added to the __sleep() method (I'd say).

It should be set only if it is TRUE?

Yes, I'd certainly say so. Otherwise you'll cache the fields information for a version of the entity which isn't yet saved in the database – if something changes the fields along the way, you'll end up with a bad cache.

Cannot use $this->options['fields'] as it only contains type and not property path what we currently use. Should we modify options, or getFieldDependencies()?

If that's the only issue, the property path can be trivially extracted from the field's machine name, with list(, $property_path) = Utility::splitCombinedId($field_id).

Nick_vh’s picture

Issue summary: View changes
Issue tags: +beta blocker
Berdir’s picture

Ok. This works. Removed the flag and instead doing it with that hash.

Also had the problem that it still removed the index despite saying it would not. Which was somehow related to static caches that had already removed the field and didn't know about it anymore. So if I find an unexpected field now, I'm just flagging the entity as changed so that core recalculates the dependencies and then we're fine.

Nick_vh’s picture

This is great. I discussed this with Berdir in Drupalcamp Vienna and we need one more test that tests the same use case but with the additional fields. Eg, when you add a the title of an image to the search index as well and we remove that field, we need to make sure that also updates the index so that it no longer uses that additional field.

Nick_vh’s picture

Status: Needs review » Needs work
borisson_’s picture

Issue tags: +Needs tests

Added tests tag.

mbovan’s picture

I made a small modification, added support for nested Search API fields and provided tests for additional fields.

We still don't have support for deep nested fields, but as this is a rare case probably we can do it in a followup issue.

Berdir’s picture

Status: Needs review » Reviewed & tested by the community
Issue tags: -Needs tests

Yeah, the dependencies code certainly isn't perfect yet, but this seems like a good start and this improves the situation a lot.

borisson_’s picture

Looks great, rtbc++.

drunken monkey’s picture

Thanks for implementing it like this, this does look a lot a lot better now.

Still, there were several things suboptimal about this, see the attached patch.

Also a few remarks:

  1. +++ b/src/Entity/Index.php
    @@ -323,6 +324,10 @@ class Index extends ConfigEntityBase implements IndexInterface {
    +    if ($name == 'fields') {
    +      // Make sure we recalculate the computed fields.
    +      $this->fields = NULL;
    +    }
    
    +++ b/src/Entity/Index.php
    @@ -331,6 +336,8 @@ class Index extends ConfigEntityBase implements IndexInterface {
    +    // Make sure we recalculate the computed fields.
    +    $this->fields = NULL;
    

    These could also check whether the fields actually changed. But probably unnecessary for now, maybe when we move it to its own property (cf. #2317771: Move processor and/or field settings to top-level properties on the index?).

  2. +++ b/src/Entity/Index.php
    @@ -1169,12 +1181,12 @@ class Index extends ConfigEntityBase implements IndexInterface {
    +    $this->applyLockedConfiguration();
    +
         // Stop enabling of indexes when the server is disabled.
         if ($this->status() && !$this->isServerEnabled()) {
           $this->disable();
         }
    -
    -    $this->applyLockedConfiguration();
    

    Why this change, shouldn't that be completely the same? Neither does the locked configuration depend on the status, nor is it possible to lock the index's server.

  3. +++ b/src/Entity/Index.php
    @@ -1405,8 +1425,49 @@ class Index extends ConfigEntityBase implements IndexInterface {
    +          else {
    +            // If the field dependency does not exist, it is possible that we
    +            // already updated the configuration to remove that dependency.
    +            // Ensure that we return TRUE so that the new dependencies are
    +            // checked.
    +            $changed = TRUE;
    +          }
    

    Doesn't mean that we'll always return TRUE when a field gets deleted, no matter whether we index it, or whether it's even on the same entity type?
    That sounds pretty inefficient, would be great to find out how we can more reliably detect that case, and/or what causes it.
    Or does returning TRUE not effect a significant performance penalty?

Also, what exactly doesn't work about ("deeply") nested entities? To me, it looks like it would just recurse again. (But the code is pretty complicated, of course, so I can't be sure.

It would also be great not to hard-code :entity: as the separator for an entity reference – even though that's in Core like this, it's still possible to have a reference to an entity in some other way.

But sure, after these questions are cleared up we can at least commit this as a temporary partial fix and try to improve on it subsequently. Would be great if one of you could then create the follow-up issue, to make sure a) we don't forget it and b) it explains correctly what is still missing.

Berdir’s picture

3. No. We're only called if we actually depend on the thing being deleted. There is no other way as far as I see.

drunken monkey’s picture

3. No. We're only called if we actually depend on the thing being deleted.

Ah, of course, my mistake.

mbovan’s picture

After #2638116: Clean up caching of Index class method results (especially fields) the situation about Items/search api fields is much more cleaner.

I adapted patch #25 to the latest Search API changes, removed some unnecessary workarounds, refactored etc... Beside automatic tests, tested manually as well, the patch worked fine.

Btw, no interdiff since last patch is ~2 months old.

Status: Needs review » Needs work

The last submitted patch, 28: consider_field_storage-2541206-28.patch, failed testing.

The last submitted patch, 28: consider_field_storage-2541206-28.patch, failed testing.

mbovan’s picture

Ah, forgot to implement TestDatasource::getFieldDependencies().

Edit: Created a followup issue #2656916: Overhaul or remove ContentEntity::getFieldDependenciesForEntityType() based on #25.3

borisson_’s picture

Status: Needs review » Needs work
+++ b/src/Plugin/search_api/datasource/ContentEntity.php
@@ -899,11 +897,14 @@ class ContentEntity extends DatasourcePluginBase {
+        // @todo Support deep nesting.

Is there an issue we can link to?

Overall I think the code here looks great, is it possible to add a testcase for this in the dependencyRemovalTest as well? That's a kernel test so probably a different test to the one added in the integration test. But that makes it easier to test the code in isolation.

drunken monkey’s picture

  1. +++ b/src/Datasource/DatasourceInterface.php
    @@ -201,4 +201,17 @@ interface DatasourceInterface extends IndexPluginInterface {
    +   * @return string[][]
    +   *   An associative array with the keys being dependency names of config
    +   *   entities on which the given fields depend. The values are containing the
    +   *   IDs of the fields which depend on the given config entity.
    +   */
    +  public function getFieldDependencies(array $fields);
    

    I wonder if that isn't needlessly restrictive. Why not also allow non-config dependencies there, by using an additional level of nesting?

  2. +++ b/src/Entity/Index.php
    @@ -1394,6 +1395,32 @@ class Index extends ConfigEntityBase implements IndexInterface {
       /**
    +   * Retrieves information about the dependencies of the indexed fields.
    +   *
    +   * @return string[][][]
    +   *   An associative array, keyed by datasource IDs of this index and
    +   *   containing the respective field dependency information of that
    +   *   datasource, as described for the return value of
    +   *   \Drupal\search_api\Datasource\DatasourceInterface::getFieldDependencies().
    +   */
    +  public function getFieldDependencies() {
    

    Several things here:
    Why is this public? And, if there is a good reason, the method should definitely also be on the interface. No public methods without interface, as a rule.
    Why does it return the dependencies nested by datasource? Does that really provide any benefit? If there is really a need for that, we could also fulfill it by just adding an $datasource_id = NULL parameter.

  3. +++ b/src/Entity/Index.php
    @@ -1450,6 +1477,11 @@ class Index extends ConfigEntityBase implements IndexInterface {
    +    // Calculate field dependencies.
    +    if ($field_dependencies = $this->getFieldDependencies()) {
    +      $dependency_data['config'] = call_user_func_array('array_merge', $field_dependencies);
    +    }
    

    This will have the complete wrong format for $dependency_data. Please add a loop that correctly nests it within 'always' and 'field'.

  4. +++ b/src/Entity/Index.php
    @@ -1548,6 +1580,23 @@ class Index extends ConfigEntityBase implements IndexInterface {
    +        // Remove a field from the index that is being removed from the system.
    +        if (isset($dependency_objects[$name]) && $dependency_objects[$name] instanceof FieldStorageConfigInterface) {
    +          if (!empty($dependency_sources)) {
    

    This is a completely different style than the rest of the method. Instead, add your code within the current framework, checking for a $plugin_type of 'fields' and then removing the fields when they come up there.

  5. +++ b/src/Entity/Index.php
    @@ -1548,6 +1580,23 @@ class Index extends ConfigEntityBase implements IndexInterface {
    +            $changed = TRUE;
    +            $this->save();
    

    Unless I am very mistaken, the entity should never be saved inside an onDependencyRemoval() method.

See the attached patch for some further corrections.

Furthermore, I think the ContentEntity::getFieldDependenciesForEntityType() method takes a needlessly difficult approach. Instead, it should just take the property paths, descend into them (see \Drupal\search_api\Utility::retrieveNestedProperty() for example code in that direction) and, at each level, check whether the property is an instance of FieldDefinitionInterface – if so, add the necessary dependency. (That way, #2656916: Overhaul or remove ContentEntity::getFieldDependenciesForEntityType() would also be fixed – and adding code for other kinds of dependencies would be much easier.)

+++ b/src/Plugin/search_api/datasource/ContentEntity.php
@@ -899,11 +897,14 @@ class ContentEntity extends DatasourcePluginBase {
+        // @todo Support deep nesting.

Is there an issue we can link to?

More importantly: is this actually correct? As said before, to me it looks like this would recurse correctly also for "deeply" nested fields.
We should definitely add a test for this, either to verify it fails and needs to be fixed or to make sure it works and stays working.
As you say, adding this to the existing DependencyRemovalTest would also make a lot of sense, yes. Although, of course, having both is even better, so we can naturally also keep the addition to the integration test.

Also, the age of the last patch is not an issue, an interdiff could still make sense. But if you mostly just re-rolled, you're right, it would indeed be pointless.

mbovan’s picture

Tried to address some points #33 (1-5), but it seems that main purpose of the patch is not fulfilled anymore. :D Anyway, it might help someone to continue work.

Re #33:
- (2) Changed to private. Field dependencies are returned by datasource and that was the reason to return index fields dependencies keyed by datasource id.
- (3) Added 'always', 'fields', not sure if it is right tho...
- (4) Changed, I guess this is the part where new patch breaks.
- (5) Removed.

// @todo Support deep nesting.

Removed @todo, as this seems to be working, indeed.

As you say, adding this to the existing DependencyRemovalTest would also make a lot of sense, yes. Although, of course, having both is even better, so we can naturally also keep the addition to the integration test.

Added a test in DependencyRemovalTest class as well...

Edit:

+++ b/src/Entity/Index.php
@@ -1111,7 +1111,7 @@ class Index extends ConfigEntityBase implements IndexInterface {
+      ),

Not sure why PhpStorm changed this.

Status: Needs review » Needs work

The last submitted patch, 34: consider_field_storage-2541206-34.patch, failed testing.

The last submitted patch, 34: consider_field_storage-2541206-34.patch, failed testing.

mikemiles86’s picture

mikemiles86’s picture

drunken monkey’s picture

Managed to find the bug, re-wrote the field dependency detection code, but it's still there. The problem is (both with the previous and my current code) that, if we depend on the field storage config (instead of the field config itself), that at the point where onDependencyRemoval() is called the actual field configs have already been deleted – and with them our internal record of having them on the datasource. Thus, at the point where the field is actually deleted (not at the time where the resulting config chances are displayed – that correctly lists the index as "to be updated"), we can't actually find any dependency to the field, thus don't remove any dependency, $changed remains FALSE and the index is deleted (even though it wouldn't even have any (reported) dependency on the deleted entity anymore (the field would remain on the index, though, which is of course not OK).
The underlying problem is that the deletion of the storage upon deletion of the field config is not actually triggered by a dependency, but by a postDelete() operation. (Come to think of it, it's a bit of a mystery why the index is even listed on the field deletion page. Probably special code to include both dependents of the field and the storage (when both would be deleted).)

In any case, I'm attaching my current state of work. Any ideas or input would be much appreciated – not sure how we can solve this properly.

Status: Needs review » Needs work

The last submitted patch, 39: 2541206-39--field_dependency_removal.patch, failed testing.

The last submitted patch, 39: 2541206-39--field_dependency_removal.patch, failed testing.

Berdir’s picture

(Come to think of it, it's a bit of a mystery why the
index is even listed on the field deletion page. Probably special code to
include both dependents of the field and the storage (when both would be
deleted).)

Yes. That was a critical that I worked on, in Barcelona IIRC.

drunken monkey’s picture

Ah, that at least explains that. Thanks!

Any ideas for solving our problem here?

What I thought about that might help is adding caching for our dependency data. That way, the dependency data built for the "Predicted config changes" list would also be used for the actual dependency removal process. The caching would have to be persistent, though, as a cloned version of the index will be used for the "prediction" operation.

The problem here, however: I don't think it's impossible, internally, to remove a field config (or config entity in general) without first doing a "dry run" dependency check. And in that case, this would still fail. So, probably not practical either.

drunken monkey’s picture

Next idea: Just storing a field's dependencies along with its other information in the index. A bit brute force, maybe, but should generally be fine. Would at least be a solution if we can't think of any cleaner one.

drunken monkey’s picture

This implements the "saving dependencies with the fields" method.
Seems to work well enough – I guess we should just commit that, we can always come back and improve on it later, if we have an idea.

Anyways, reviews welcome!

Status: Needs review » Needs work

The last submitted patch, 45: 2541206-45--field_dependency_removal.patch, failed testing.

The last submitted patch, 45: 2541206-45--field_dependency_removal.patch, failed testing.

drunken monkey’s picture

Status: Needs review » Needs work

The last submitted patch, 48: 2541206-48--field_dependency_removal.patch, failed testing.

The last submitted patch, 48: 2541206-48--field_dependency_removal.patch, failed testing.

drunken monkey’s picture

drunken monkey’s picture

Still applies – no volunteers for reviewing?
Then unless anyone objects, I'll commit this early next week. High time this finally got fixed.

Berdir’s picture

This is not an easy patch to review ;)

Can't spot anything obviously wrong, seems to work, so +1 from me.

  • drunken monkey committed cd3c79f on 8.x-1.x authored by mbovan
    Issue #2541206 by mbovan, drunken monkey, Berdir: Fixed reaction to...
drunken monkey’s picture

Status: Needs review » Fixed

OK, then let's go the alternative route: commit first and then wait for people to complain.
Anyways, thanks for trying it out and looking it over!

Committed.

borisson_’s picture

I looked at this while on the plane earlier today, these were my remarks:

Overall, I think this looks great, I did spot a couple of small suggestions.

  1. +++ b/src/Datasource/DatasourceInterface.php
    @@ -195,4 +195,17 @@ public function getEntityTypeId();
    +   *   An array of property paths on this datasource, keyed by field IDs.
    

    /s/IDs/ID/?

  2. +++ b/src/Entity/Index.php
    @@ -1121,6 +1123,11 @@ public function preSave(EntityStorageInterface $storage) {
    +    // Since we change dependency-relevant data in this method, we can only call
    +    // the parent method at the end (or we'd need to re-calculate the
    +    // dependencies).
    

    This is a great comment!

  3. +++ b/src/Entity/Index.php
    @@ -1503,8 +1536,25 @@ public function onDependencyRemoval(array $dependencies) {
    +                // In case the field is locked, enable it before removing.
    

    This is a great comment, it perfectly describes what's going on. I'd only change "enable" for "unlock".

drunken monkey’s picture

I think "property paths keyed by field IDs" is fine like it is, but you're right about 3. – "enable" doesn't really make sense there. Committed the follow-up, thanks!

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.