OK, everyone, please help me think. I tried to explain it to my duck, but it wasn't in the mood.

In the Index class, we have a method resetCaches() to reset all the caches pertaining to the index (in particular cached fields lists, properties, plugins, etc.). It has a parameter $include_stored which, when set to FALSE, will skip resetting the permanent, stored cache and just reset the cache properties on the index object itself.

However, here comes my question: if you do that, doesn't that just mean it that next time getFields() is called (which uses both a static and stored cache) the stale data from the stored cache will be loaded into the static property and returned? Seems to me, this doesn't make any sense.

Please tell me if I'm missing something there.
Otherwise, to save performance, we should either use something like cache tags, to avoid clearing all caches when just one part has changed, and maybe also have some property during index save that disables the stored cache for a while (to avoid dozens of cache sets and deletes with no gain whatsoever). Or, in general, maybe try to be a lot more thorough in resetting caches exactly when necessary, neither more often nor rarer – just like Core did, which brought great benefits, too.
In any case, we shouldn't allow clearing one cache, but leaving the other with the old, stale data.

Estimated Value and Story Points

Value and Story points are in the scale of fibonacci. Our minimum is 1, our maximum is 21. The higher, the more value or work a certain issue has.

Value: 3
Story Points: 5

Files: 

Comments

drunken monkey created an issue. See original summary.

borisson_’s picture

We probably shouldn't need this method and having correct cache tags, keys and contexts should be sufficient. So in theory, we should be able to remove this method when we fix all other cacheability problems.

drunken monkey’s picture

Title: Does Index::resetCaches(FALSE) make any sense? » Clean up caching of Index class method results (especially fields)
Priority: Normal » Major
Issue tags: +Release blocker

We probably shouldn't need this method and having correct cache tags, keys and contexts should be sufficient. So in theory, we should be able to remove this method when we fix all other cacheability problems.

Not really, no, since I'm pretty sure we'll still want static caching for most methods. Loading the fields anew from stored cache for each getFields() call would be insane. Same goes for all the plugins, which currently aren't even stored in permanent cache (which we probably don't want to change).
Also, the normal cache system, with its cache tags etc., can only work on index save. The problem we're having here is that, while the index is being edited, the getFields() method should always retrieve the fields based on the current settings on the index object, but the stored cache should not be rebuilt (since the changes might not be saved). So while we definitely want to use the stored cache for the first getFields() call of a page request, it mustn't be used once the index has been modified, until it has been saved and the cache thus been cleared.

We probably need a broader concept for the whole thing, but it can wait until Beta (when all of the basics are in place so that we really know the scenarios under which this would have to work).

borisson_’s picture

Status: Active » Postponed

That makes sense.

drunken monkey’s picture

Actually, I just ran into major trouble due to this over in #2640982: Fix "unsaved changes" code in the new Fields UI, so we might want to tackle this earlier.
On the other hand, that other issue is also only a UI change, so not really a beta blocker.

drunken monkey’s picture

Issue summary: View changes
Status: Postponed » Active
Issue tags: -Release blocker +beta blocker

Now that #2317771: Move processor and/or field settings to top-level properties on the index? is in, I think we can un-postpone this, and the way I have it currently in mind I think we need to make this a beta blocker after all, since it will change the internal API somewhat.

My current plan for this:

  1. Remove Index::[gs]etFieldSettings(), and also don't change $this->fields directly anywhere inside the index class.
  2. Instead of a mere "cache", the loaded array of field objects becomes the canonical representation of the index's fields. It's loaded once, from the index's stored field settings, and can then be manipulated to change any field settings. Then, only in preSave(), we write the array of fields objects back to the fields property.
  3. Consequently, we should revert the change from #2574969: Add a Views-like UI for adding fields and not have the array of field objects in the cache property after all – that would just send the wrong message. Instead, maybe fieldObjects? The name is a bit tricky, I admit.
  4. (Possible solution, but seems rather hack-ish and potentially dangerous: actually override $this->fields on the first call of getFields() with the field objects and only make the translation back in toArray(). But probably really not worth it just to avoid the naming problem.)
  5. addField() will therefore only be used for adding a new field, not for re-saving the settings of an existing one. For changing an existing fields, just use the field's methods for changing its properties and you're done. (Except for changing the field ID or removing the field.)
  6. Since we now store the field's label, I don't think the computation in getFields() will be that tricky anymore, so even having a stored cache might not really make sense anymore. It does add some additional complexity. However, in this new system, we'd never have to reset the array of field objects (since it's not really a "cache" anymore), so the stored cache would only be used when first loading the fields from the stored settings anyways.
  7. getFieldsByDatasource() and getFulltextFields() can still use normal static caching, I think. Resetting their caches will only be necessary in the index's own (add|rename|remove)Field() methods, I think. In theory, changing a field's type could invalidate the static cache of getFulltextFields(), but I'd say that's too much of an edge case to take it into account. (I don't think that method is ever used when changing the index's fields.) resetCaches() should be dropped in favor of something more fine-grained, at least for this use case. (Or for all, if we do the following, too.)
  8. For all the different kinds of plugins, we might want to go excatly the same route – handle all of them as objects in general, and only write back their configuration and other settings when the index is saved. Might have some advantages, and would certainly improve consistency in our API.

Also, Index::$properties isn't even used anymore.

borisson_’s picture

I have opinions:

  1. I can agree with not changing $this->fields anywhere in the code.
    We probably have to keep the code to add / remove a field to the fields array in AddField/removeField though.
  2. -
  3. -
  4. I strongly disagree with storing 2 completely different object in 1 property.
  5. I agree with this, let's also throw a new exception from the addField method when trying to save a field that already exists?
  6. -
  7. I agree on keeping the static caching for getFieldsByDatasource() and getFulltextFields().
    We should probably keep resetCaches but add resetFieldCaches, resetProcessorCaches, ...
    I think we should reset the cache of getFulltextFields on changing a field type as well, it's probably not complex to include that check (if $old_field->type != $field->type) { $this->cache[getFulltextFields] = NULL; }) and on the edge-case that this changes, it will save a ton of WTF's
  8. I agree we should make them consistent.

I agree, let's make things more consistent; that's a good goal.

We should be making changes for:
- fields
- tracker
- datasource
- processors

I think I'd prefer to have 2 properties for all of those: things + thingInstances

- fields + fieldInstances
- tracker + trackerInstance
- server + serverInstance
- datasources + datasourceInstances
- processors + processorInstances

I also created #2642284: Remove the index's "properties" property for the other remark you made, that's a very novice issue.

borisson_’s picture

As an addition to my previous comment:

The following is an example of how I think we might be able to change things, this is close to how fields work now and I think we should be doing this for all of the mentioned plugins.

class Index {
  /** Only touched in __construct + preSave */
  protected $fields;

  /** Never directly called, except in add*, remove* and get* */
  protected $fieldInstances;

  /** Only touched in __construct + preSave */
  protected $processors;

  /** Never directly called, except in add*, remove* and get* */
  protected $processorInstances;

  public function __construct() {
    $processor_plugin_manager = \Drupal::service('plugin.manager.search_api.processor');

    foreach ($this->fields as $k => $saved_information) {
      $this->fieldInstances[$k] = Field::fromSettings($saved_information);
    }
    foreach ($this->processors as $k => $saved_information) {
      $this->processorInstances[$k] = $processor_plugin_manager->createInstance($k, $saved_information);
    }
  }

  public function getFields() {
    return $this->fieldInstances; 
  }
  
  public function addField(FieldInterface $field) { 
    $this->fieldInstances[$field->getFieldIdentifier()] = $field; 
  }

  public function removeField(FieldInterface $field) {
    unset($this->fieldInstances[$field->getFieldIdentifier()]);
  }

  public function preSave() {
    $field_settings = array();
    foreach($this->fieldInstances as $k => $field) {
      $field_settings[$k] = array('settings' => $field->getSettings();
    }
    $this->fields = $field_settings;

    // do something similar for processors, ...
  }
  
}

Did I understand it correctly?

borisson_’s picture

So, after the mess we just made in #2642534: processors shouldn't be unset in Index::__sleep, processorPlugins should be, swentel noted that having things + thingInstances is not super clear.
I think we could probably do $thingSettings + $thingInstances as variables and getThings + getThingInstances as methods?

drunken monkey’s picture

We probably have to keep the code to add / remove a field to the fields array in AddField/removeField though.

Yeah, sure, but that also shouldn't touch the $this->fields array, just $this->fieldInstances.

I agree with this, let's also throw a new exception from the addField method when trying to save a field that already exists?

Makes sense, yes. That will also get rid of that insanely complex and still error-prone $would_overwrite check.

We should probably keep resetCaches but add resetFieldCaches, resetProcessorCaches, ...

Depends on what that would do. It shouldn't touch any of the $this->*Instances properties, since those aren't caches anymore.
Then the question is, when should the caches even be reset? And the answer is, I think, that only specific caches should ever be reset – the fields caches in addField() and removeField(), for example, or just the getFulltextFields() cache when a field's type changes to/from fulltext (see below).

I think we should reset the cache of getFulltextFields on changing a field type as well, it's probably not complex to include that check (if $old_field->type != $field->type) { $this->cache[getFulltextFields] = NULL; }) and on the edge-case that this changes, it will save a ton of WTF's

The complexity here is that that check would need to be in Field::setType() and then invalidate a cache on the index object, and a) Field does currently not contain any such complex logic, it's more a data wrapper, and b) all the other index method cache logic is, unless I'm mistaken, currently encapsulated in the index, which makes a lot of sense.
I feel that adding such a check and cache invalidation logic would cause unnecessary complexity and possible confusion, while I'd perceive the scenario in which it would be needed as extremely unlikely.
However, I might be mistaken, and of course you're right in saying that if it does occur it might cause some pretty large WTFs. I just can't really picture a scenario where it would.

Did I understand it correctly?

Yes, more or less.
In my opinion, though, we shouldn't load all the instances in the constructor, that can in all cases remain in the associated getter method, I'd say. We just need to make sure to load that array properly in the add*/remove* methods (and rename* for fields). But otherwise we might waste considerable performance on some page requests.

So, after the mess we just made in #2642534: processors shouldn't be unset in Index::__sleep, processorPlugins should be, swentel noted that having things + thingInstances is not super clear.
I think we could probably do $thingSettings + $thingInstances as variables and getThings + getThingInstances as methods?

The reason we currently have it like this is that the settings properties of the index directly correspond to the fields of the exported index config, and having field_settings, processor_settings, etc., there would be rather ugly in my opinion. (As would then having $this->field_settings and $this->fieldInstances.)
I guess we could override toArray() or get() to remove this direct mapping, but that is bound to cause more confusion than it solves, I'd say.

But yes, I do admit that it's currently confusing. That's also why I suggested overwriting $this->fields with the objects internally (but pretty surely another case of creating more confusion than resolving). But maybe if all the four or five sets of properties work the same it will become clearer and simpler over time.

borisson_’s picture

In my opinion, though, we shouldn't load all the instances in the constructor, that can in all cases remain in the associated getter method, I'd say. We just need to make sure to load that array properly in the add*/remove* methods (and rename* for fields). But otherwise we might waste considerable performance on some page requests.

Sure, that makes sense.

The reason we currently have it like this is that the settings properties of the index directly correspond to the fields of the exported index config, and having field_settings, processor_settings, etc., there would be rather ugly in my opinion. (As would then having $this->field_settings and $this->fieldInstances.)

I don't think it'd be ugly to have them as thing_settings, it is more verbose but I like variables to be as explicit and verbose as needed. I agree that thing_settings + thingInstances is weird.

I guess we could override toArray() or get() to remove this direct mapping, but that is bound to cause more confusion than it solves, I'd say.

This would a very search_api specific solution for a common problem, and I don't think we should do that.

But yes, I do admit that it's currently confusing. That's also why I suggested overwriting $this->fields with the objects internally (but pretty surely another case of creating more confusion than resolving).

This would a very search_api specific solution for a common problem, and I don't think we should do that.

But maybe if all the four or five sets of properties work the same it will become clearer and simpler over time.

++

drunken monkey’s picture

I don't think it'd be ugly to have them as thing_settings, it is more verbose but I like variables to be as explicit and verbose as needed. I agree that thing_settings + thingInstances is weird.

OK, yes, the first part is disputable, and probably not too bad.
Having thing_settings and thingInstances is definitely aweful, but apparently can't be helped. So just go for it, if you want! The most important thing here is probably just consistency.

As said, maybe even make the tracker settings consistent? We could even have $index->tracker_settings[$tracker_id] = $tracker_config, even though there will always just be one member in that array – would simplify the code and also make it more consistent. But would probably need good comments to not make it too confusing.

borisson_’s picture

This still breaks in a couple of wonderful ways and I can't really figure out why yet. This patch is just intermediate work and probably shouldn't be reviewed yet. I'll post an update later.
There are also some things that aren't really related in this patch. I'll figure out if we should split part of that off into another issue after tests here are green.

Status: Needs review » Needs work

The last submitted patch, 13: clean_up_caching_of-2638116-13.patch, failed testing.

The last submitted patch, 13: clean_up_caching_of-2638116-13.patch, failed testing.

borisson_’s picture

I had a little scare earlier that an index with an integer as id was broken. I added a test to verify this. This also fixes at least one test (RenderedItem). Let's see how the bot feels.

Status: Needs review » Needs work

The last submitted patch, 16: clean_up_caching_of-2638116-16.patch, failed testing.

The last submitted patch, 16: clean_up_caching_of-2638116-16.patch, failed testing.

borisson_’s picture

Status: Needs review » Needs work

The last submitted patch, 19: clean_up_caching_of-2638116-19.patch, failed testing.

The last submitted patch, 19: clean_up_caching_of-2638116-19.patch, failed testing.

drunken monkey’s picture

Wow, great work so far, thanks!
Quite a huge patch already …

A quick optical review (just Dreditor, not even in the IDE yet):

  1. +++ b/config/schema/search_api.index.schema.yml
    @@ -76,23 +76,30 @@ search_api.index.*:
    +          plugin_id:
    +            type: string
    +            label: 'The plugin ID of the datasource'
    +          settings:
    +            type: plugin.plugin_configuration.search_api_datasource.[%parent.plugin_id]
    

    This additional nesting is pretty pointless for datasources and the tracker, I'd say. The plugin_id proprety is really just needed for config schema reasons for processors, since they have both their settings and their weights nested. The other plugins can just have the ID directly mapped to the settings.

    Or did you choose this on purpose, to be even more consistent?

  2. +++ b/src/Entity/Index.php
    @@ -147,30 +139,23 @@ class Index extends ConfigEntityBase implements IndexInterface {
    -  protected $trackerPlugin;
    +  protected $trackerInstance = FALSE;
    

    Why FALSE and not NULL?

  3. +++ b/src/Entity/Index.php
    @@ -352,53 +327,93 @@ class Index extends ConfigEntityBase implements IndexInterface {
    -    return (bool) \Drupal::service('plugin.manager.search_api.tracker')->getDefinition($this->getTrackerId(), FALSE);
    +    $tracker = $this->getTrackerInstance();
    +    return ($tracker instanceof TrackerInterface);
    

    This will throw an exception if an illegal/unknown tracker plugin is set, which would make the method pointless.

  4. +++ b/src/Entity/Index.php
    @@ -352,53 +327,93 @@ class Index extends ConfigEntityBase implements IndexInterface {
    +    if ($settings === array()) {
    +      return 'default';
    +    }
    

    Maybe use the default_tracker setting here? (And just !$settings.)

  5. +++ b/src/Entity/Index.php
    @@ -352,53 +327,93 @@ class Index extends ConfigEntityBase implements IndexInterface {
    +    $tracker_settings = array_values($settings)[0];
    

    You can just use reset() for that. Also, use key() afterwards, then you don't need the plugin_id property.

  6. +++ b/src/Entity/Index.php
    @@ -352,53 +327,93 @@ class Index extends ConfigEntityBase implements IndexInterface {
    +  public function getTrackerInstance() {
    +    if ($this->getTrackerSettings() === NULL) {
    +      return NULL;
    +    }
    

    Can the tracker settings ever be NULL? If so, this needs to be taken into account in getTrackerId(), too (or, better, changed). Otherwise, this should go.

  7. +++ b/src/Entity/Index.php
    @@ -352,53 +327,93 @@ class Index extends ConfigEntityBase implements IndexInterface {
    +  public function getTrackerSettings() {
    +    return $this->tracker_settings;
       }
    

    As mentioned, I'd say we shouldn't even provide access to those raw settings – just to the tracker plugin itself, with the settings in its configuration.
    Also, especially in the case of the tracker, I wouldn't return the whole array, even if the method stays, but just the settings for the tracker that is actually set.

  8. +++ b/src/Entity/Index.php
    @@ -486,13 +501,13 @@ class Index extends ConfigEntityBase implements IndexInterface {
    +        if (class_exists($processor_definition['class']) && empty($this->processorInstances[$name])) {
    

    While we're at it: I'm pretty sure the class_exists() isn't needed here? I think createInstance() would take care of that. We should just catch exceptions thrown from it (and then log, like currently in else here).

  9. +++ b/src/Entity/Index.php
    @@ -787,13 +802,16 @@ class Index extends ConfigEntityBase implements IndexInterface {
    -    if (!$search_objects || $this->read_only) {
    +    if (!$search_objects) {
    +      return array();
    +    }
    +    if ($this->isReadOnly()) {
           return array();
         }
    

    Why?

  10. +++ b/src/Entity/Index.php
    @@ -787,13 +802,16 @@ class Index extends ConfigEntityBase implements IndexInterface {
    -    if (empty($this->fields)) {
    +    if (empty($this->getFields())) {
    

    When it's a method call, I'd just use ! instead of empty().

  11. +++ b/src/Entity/Index.php
    @@ -1167,19 +1191,23 @@ class Index extends ConfigEntityBase implements IndexInterface {
    +    if (!$original->status() || !$this->status()) {
    +      throw new SearchApiException("Invalid operation, this should only be called when the index was enabled before the change and remained so.");
    +    }
    

    Why add this, I don't think that's new? I think since it's just our own, internal code, if you really want to document this an assert() would make more sense.
    (Same for the others below.)

  12. +++ b/src/Form/IndexForm.php
    @@ -262,7 +262,7 @@ class IndexForm extends EntityForm {
    -      '#default_value' => $index->hasValidTracker() ? $index->getTracker()->getPluginId() : key($tracker_options),
    +      '#default_value' => !$index->isNew() && $index->hasValidTracker() ? $index->getTrackerInstance()->getPluginId() : key($tracker_options),
    

    Why that addition? And why not just use $index->getTrackerId()?

  13. +++ b/tests/search_api_test_backend/src/Plugin/search_api/backend/TestBackend.php
    @@ -128,6 +128,18 @@ class TestBackend extends BackendPluginBase {
    +
    +    // Searching on an index without datasources is useless, return an empty
    +    // result set and log this so the admin knows what's going on.
    +    if ($datasources === array()) {
    +      $logger = \Drupal::getContainer()
    +        ->get('logger.factory')
    +        ->get('search_api');
    +      $logger->notice($this->t('Tried searching on an index with no datasources.'));
    +
    +      $results->setResultCount(0);
    +      return $results;
    +    }
    

    Does an index without datasources ever exist? It really shouldn't. And why add this to the test backend, of all places?

Also, it seems you're still writing to the field, datasource, etc. settings properties directly, not only when saving the index. Is that part still coming, or did you decide against it, or did I just not explain it very well?

Or should I just not have reviewed this at all yet? I wasn't sure whether that only applied to the patch in #13 or was still valid for #16.
Anyways, I hope some of my comments still help.

borisson_’s picture

  1. I did this to be more consistent, yes
  2. Fixed.
  3. Reverted.
  4. Fixed.
  5. Uses reset() now, using plugin for consistency
  6. Fixed
  7. Will change this later
  8. Fixed
  9. Reverted.
  10. Fixed
  11. This was in the documentation of the method and was never reflected in the actual code, I can revert this if you want.
  12. This addition was because of some failures I had; possibly only during testing though.
    switched to getTrackerId, good suggestion.
  13. I encountered this while testing, will remove when tests are green again

I will do more work about writing into the fields later on; I'd like to get the tests green before trying that.

Status: Needs review » Needs work

The last submitted patch, 23: clean_up_caching_of-2638116-23.patch, failed testing.

The last submitted patch, 23: clean_up_caching_of-2638116-23.patch, failed testing.

borisson_’s picture

Status: Needs review » Needs work

The last submitted patch, 26: clean_up_caching_of-2638116-26.patch, failed testing.

The last submitted patch, 26: clean_up_caching_of-2638116-26.patch, failed testing.

drunken monkey’s picture

OK, then let's go with consistency for the plugins and keep plugin_id.

This was in the documentation of the method and was never reflected in the actual code, I can revert this if you want.

Then better just change the documentation, yes, but maybe really do add an assert().

I will do more work about writing into the fields later on; I'd like to get the tests green before trying that.

Since there are otherwise some weird problems with caching, implementing this might actually help with the tests.
On the other hand, since it seems you didn't yet remove and resetCaches() calls, that shouldn't be much of a problem. So, yes, proceed as you like.

borisson_’s picture

Changed those calls to asserts, also removed [g|s]et[Processor|Field|Tracker]Settings. Did some more work on DependencyRemovalTest, but that still fails. Loads of schema errors but I'd like to see what else fails.

Status: Needs review » Needs work

The last submitted patch, 30: clean_up_caching_of-2638116-30.patch, failed testing.

The last submitted patch, 30: clean_up_caching_of-2638116-30.patch, failed testing.

borisson_’s picture

Status: Needs work » Needs review
FileSize
6.09 KB
82.36 KB

Status: Needs review » Needs work

The last submitted patch, 33: clean_up_caching_of-2638116-33.patch, failed testing.

The last submitted patch, 33: clean_up_caching_of-2638116-33.patch, failed testing.

borisson_’s picture

Fixes at least he RendereditemTest, should also fix the LanguageIntegrationTest.

This patch + the one in #33 also add a new test in the rendereditemTest that was really useful while getting the other bug fixes out. Not sure if we should keep this in.

Status: Needs review » Needs work

The last submitted patch, 36: clean_up_caching_of-2638116-36.patch, failed testing.

The last submitted patch, 36: clean_up_caching_of-2638116-36.patch, failed testing.

drunken monkey’s picture

This patch + the one in #33 also add a new test in the rendereditemTest that was really useful while getting the other bug fixes out. Not sure if we should keep this in.

Sure, why not? It's a Kernel test anyways, so cheap enough, I'd say. Can't do any harm, in any case.

I guess I'll just wait until the tests pass before reviewing again. Or would you like me to take a look right away? Just say so/ping me.

Nick_vh’s picture

FileSize
9.02 KB
94.65 KB

Gave it a try to reduce some of the errors in the dependency test. Most of them were in the test themselves. What I ended up with is the following:

Starting test 'Drupal\Tests\search_api\Kernel\DependencyRemovalTest::testTrackerDependency with data set "Remove dependency" (true)'.

$this->index->get('tracker_settings'); equals to
Array
(
)

Starting test 'Drupal\Tests\search_api\Kernel\DependencyRemovalTest::testTrackerDependency with data set "Keep dependency" (false)'.
F

$this->index->get('tracker_settings'); equals to
Array
(
    [default] => Array
        (
            [plugin_id] => default
            [settings] => Array
                (
                )

        )

)

This means that when the dependency is set for removal it deletes the whole tracker settings array and it does not revert to the default tracker. When the dependency is not set for removal it somehow ends up with the default tracker. This leads me to believe that the new tracker we actually want to install to test our dependencies on is not really installed or configured correctly.

Nick_vh’s picture

Status: Needs work » Needs review
Nick_vh’s picture

FileSize
94.65 KB

Wrong extension...

Status: Needs review » Needs work

The last submitted patch, 42: 2638116-40.patch, failed testing.

The last submitted patch, 42: 2638116-40.patch, failed testing.

borisson_’s picture

@Nick_vh and I discussed this, we figure it's a smarter idea to try starting over. attached are 2 patches, 1 with the new tests added over the previous patches + 1 with changes to the field-storage. Testsuite is green.

Status: Needs review » Needs work

The last submitted patch, 45: 2645098-2--clean_up_field_class.patch, failed testing.

The last submitted patch, 45: 2645098-2--clean_up_field_class.patch, failed testing.

borisson_’s picture

borisson_’s picture

Status: Needs work » Needs review

Go testbot, go.

borisson_’s picture

borisson_’s picture

DependencyRemovalTest::testDatasourceDependency, both @Nick_vh and I don't really understand these lines:

    // Depending on whether the plugin should have removed the dependency or
    // not, make sure the right action was taken.
    $datasources = $this->index->get('datasource_settings');
    if ($remove_dependency) {
      $this->assertContains('search_api_test_dependencies', array_keys($datasources), 'Datasource not removed');
      $this->assertEmpty($datasources['search_api_test_dependencies']['settings'], 'Datasource settings adapted');
    }
    else {
      $this->assertNotContains('search_api_test_dependencies', array_keys($datasources), 'Datasource removed');
      $this->assertArrayNotHasKey('search_api_test_dependencies', $datasources, 'Datasource config removed');
    }

That test currently passes but I don't really understand it. We should probably explain more in code. There are several other things that fail regarding item counts. I have no idea why those fail either, just uploading this interdiff for now and diving back in.

Status: Needs review » Needs work

The last submitted patch, 51: clean_up_caching_of-2638116-51.patch, failed testing.

The last submitted patch, 51: clean_up_caching_of-2638116-51.patch, failed testing.

drunken monkey’s picture

$remove_dependency determines whether the test plugin in question (in your case, the datasource) should remove the dependency from its configuration.
If $remove_dependency == TRUE, in Plugin::onDependencyRemoval() it clears its configuration (and thus its dependency, in those test plugins) and returns TRUE, which the index will take as "all OK, dependency removed" and leave the plugin where it is, only with updated configuration.
If $remove_dependency == FALSE, Plugin::onDependencyRemoval() will do nothing and just return FALSE, the index says "oh, that plugin still has that removed depenency, so I should better remove the plugin" (yes, indexes can talk) and the plugin gets removed.

Therefore, if ($remove_depenendcy) we check that the datasource plugin is still there (was not removed from the index in Index::onDependencyRemoval() but that it's configuration was correctly updated (to an empty array, in the case of our test plugins).
else we check that the datasource really was removed, i.e., doesn't appear in $index->datasources anymore. (Though I see now that there is a small typo/mistake/copy-paste error in there, because the two lines in else of course test the exact same thing in different ways.)

Does that answer your question? Otherwise, what exactly is unclear?

Anyways, you are of course welcome to update the comment to something that makes sense to you. I'm usually not very good at seeing what is obvious and what needs explanation, and how to best explain (or name) things.

borisson_’s picture

FileSize
8.33 KB

I added that as a comment in the code. This makes more sense but I can do this is another issue or revert the change if you want.

borisson_’s picture

borisson_’s picture

Status: Needs work » Needs review

Run all the tests, testbot.

Status: Needs review » Needs work

The last submitted patch, 56: clean_up_caching_of-2638116-56.patch, failed testing.

The last submitted patch, 56: clean_up_caching_of-2638116-56.patch, failed testing.

borisson_’s picture

Status: Needs review » Needs work

The last submitted patch, 60: clean_up_caching_of-2638116-60.patch, failed testing.

The last submitted patch, 60: clean_up_caching_of-2638116-60.patch, failed testing.

borisson_’s picture

Improved the description in the schema.yml file. Still fails

drunken monkey’s picture

That was really hidden in a mean way.

Status: Needs review » Needs work

The last submitted patch, 64: 2638116-64--clean_up_index_caching.patch, failed testing.

The last submitted patch, 64: 2638116-64--clean_up_index_caching.patch, failed testing.

drunken monkey’s picture

The three remaining fails seem to be a different problem, which will hopefully be easier to solve. Can I leave that to you?
Otherwise, just ping me.

borisson_’s picture

Fixes remaining tests. They were new assertions, added in #56. The fails were a case a bad counting on my side, they are green now.

Thank you so much for your awesome help @drunken monkey.

Status: Needs review » Needs work

The last submitted patch, 69: clean_up_caching_of-2638116-69.patch, failed testing.

The last submitted patch, 69: clean_up_caching_of-2638116-69.patch, failed testing.

borisson_’s picture

Rerolled.

borisson_’s picture

borisson_’s picture

borisson_’s picture

borisson_’s picture

Still fails in a couple of places, getting this up as intermediate work. - removed get/set Processor Settings.

Also adds a couple of extra checks in the ProcessorIntegrationTest.

Status: Needs review » Needs work

The last submitted patch, 76: clean_up_caching_of-2638116-76.patch, failed testing.

The last submitted patch, 76: clean_up_caching_of-2638116-76.patch, failed testing.

borisson_’s picture

Fixes one of the fails; this is of a new assertion added in #76, so oops. 2 other fails still expected.

What is weird is that the failure occurs in checkProcessorChanges and configureFilter, when triggered trough testFramework but not when triggered trough testIntegerIndex.

Looking into that.

Status: Needs review » Needs work

The last submitted patch, 79: clean_up_caching_of-2638116-79.patch, failed testing.

The last submitted patch, 79: clean_up_caching_of-2638116-79.patch, failed testing.

borisson_’s picture

Fixes the remaining 2 failures. I think this is ready for an initial round of reviews. We should improve the caching as well but this is a big step in the right direction imho.

borisson_’s picture

Status: Needs work » Needs review

Go testbot, go!

Status: Needs review » Needs work

The last submitted patch, 82: clean_up_caching_of-2638116-82.patch, failed testing.

The last submitted patch, 82: clean_up_caching_of-2638116-82.patch, failed testing.

drunken monkey’s picture

Wow, awesome job! That patch is insane … (Although it might be even worse to review, with all those refactoring changes the IDE does automatically …)
And still a lot to do, I'm afraid. I really feel bad that you took this issue …
If you want, we can also swap? I think I've solved most of the hard problems in the Views fields issue (haven't yet posted a new patch, though), so maybe that would be fairer.

Anyways, my patch review:

  1. +++ b/search_api_db/src/Plugin/search_api/backend/Database.php
    @@ -1287,7 +1287,7 @@ class Database extends BackendPluginBase {
    -                if (empty($index->getProcessorSettings()['search_api_tokenizer']['status'])) {
    +                if (empty($index->getProcessors(FALSE)['search_api_tokenizer']->getConfiguration()['status'])) {
    

    That line was completely out-dated even before the patch. Let's fix it.

    Also, I actually want to get rid of that $only_enabled parameter, it's almost never used and just complicates things (a lot).
    Incidentally: doesn't it complicate this patch, too? Without it, you could work directly with the properties, but with it you have one intermediate step again, and dealing with datasources and processors becomes both more complicated and inconsistent (compared to the tracker and fields).

  2. +++ b/src/Entity/Index.php
    @@ -147,30 +138,28 @@ class Index extends ConfigEntityBase implements IndexInterface {
    -  protected $tracker_config = array();
    +  protected $tracker_settings = array(
    +    'default' => array(
    +      'plugin_id' => 'default',
    +      'settings' => array()
    +    ),
    +  );
    

    Would initializing this with an empty array cause any errors? I'd say this is a pretty unclean way of setting a default for the tracker (and it also doesn't (and can't) use the config setting). Even though it's of course just the translation of the previous one, I get that.

  3. +++ b/src/Entity/Index.php
    @@ -352,9 +341,10 @@ class Index extends ConfigEntityBase implements IndexInterface {
    +    return array_intersect_key($this->datasourceInstances, array_flip(array_keys($this->datasource_settings)));
    

    array_flip(array_keys())? Seems like a no-op (in this case).

  4. +++ b/src/Entity/Index.php
    @@ -480,19 +482,43 @@ class Index extends ConfigEntityBase implements IndexInterface {
    +  public function addProcessor(ProcessorInterface $processor, array $settings = array()) {
    +    $this->processor_settings[$processor->getPluginId()] = array(
    +      'plugin_id' => $processor->getPluginId(),
    +      'weights' => array(),
    +      'settings' => $settings,
    +    );
    +
    +    // Reset processors static cache.
    +    $this->processorInstances = array();
    

    The weights should be set to the defaults, I reckon, passing the settings shouldn't be necessary and I'd just add it to the processorInstances property – if we directly manipulate the processors only in a few methods on the index, we can just keep the array up-to-date.

    As said earlier: not a "cache", but really the canonical representation. Then we wouldn't even change the processor_settings property there, but just in preSave() write back the configuration from the plugins. (Would need a storage and retrieval mechanism for processors on the processor plugin, though – that could be tricky, or would at least be another larger framwork change.)

    Also, if we do reset the processorInstances property, I'm pretty sure it should be set to NULL. And the check in loadProcessors() changed to isset(). Doesn't make sense this way, you could never cache an empty processors array. (Although that won't happen in practice anyways, of course, due to the locked processors.)

  5. +++ b/src/Entity/Index.php
    @@ -578,7 +589,7 @@ class Index extends ConfigEntityBase implements IndexInterface {
    -    $this->fields[$field_id] = $field->getSettings();
    +    $this->field_settings[$field_id] = $field->getSettings();
    

    As above: don't work with field_settings, work with fieldInstances. Take care of the settings only on save. That should save a lot of trouble with the cache (which is the whole point of this issue).

    And with the addition of a getSettings() method to IndexPluginInterface (maybe even on a new common interface with FieldInterface) you could simplify the code for that a lot – just reset the *_settings arrays, go through *Instances and save getSettings() back. (Would also require the above change for weights on the processor plugin, though. So maybe really do that?)

  6. +++ b/src/Entity/Index.php
    @@ -1234,8 +1231,8 @@ class Index extends ConfigEntityBase implements IndexInterface {
       protected function reactToProcessorChanges(IndexInterface $original) {
    -    $original_settings = $original->getProcessorSettings();
    -    $new_settings = $this->getProcessorSettings();
    +    $original_settings = $original->getProcessors();
    +    $new_settings = $this->getProcessors();
    

    The whole code in this method now seems quite confusing, with "settings" as names for arrays of processor objects, and then even the additional $processors array. I cleaned that up a bit.

    However, this part:

  7. +++ b/src/Entity/Index.php
    @@ -1268,7 +1265,7 @@ class Index extends ConfigEntityBase implements IndexInterface {
    -            if ($processors[$key]->requiresReindexing($old_processor_settings, $new_processor_settings)) {
    +            if ($processors[$key]->requiresReindexing($old_processor_settings->getConfiguration(), $new_processor_settings->getConfiguration())) {
    

    is actually wrong now, since the method parameters should be arrays containing both weight and settings.

    Maybe another reason to store the weights inside the processors – maybe even just in the settings? Doesn't seem to make much sense to store them elsewhere, it would simplify the code and we could even get rid of plugin_id in all plugin settings properties.

  8. +++ b/src/Entity/Index.php
    @@ -1522,18 +1519,30 @@ class Index extends ConfigEntityBase implements IndexInterface {
         // The handling of how we translate plugin changes back to the index varies
         // according to plugin type, unfortunately.
    

    No, it doesn't (anymore). This can be fixed and unified now. (But could also be done in a follow-up.)

  9. +++ b/src/Form/IndexForm.php
    @@ -526,15 +526,17 @@ class IndexForm extends EntityForm {
    -    $index->set('datasource_configs', $datasource_configuration);
    +    $index->set('datasource_settings', $datasource_settings);
    

    Why does this not use methods like for processors and fields?
    Same for the tracker settings below.

  10. +++ b/src/Tests/Processor/ProcessorIntegrationTest.php
    @@ -43,18 +48,68 @@ class ProcessorIntegrationTest extends WebTestBase {
    +    $this->assertEqual(sort($enabled), array_keys($this->loadIndex()->getProcessors()));
    

    Here and in the rest of the method: sort() returns a boolean and sorts the array by reference – comparing its return value won't do what intended.
    Also, why even add this? This should already be taken care of in enableProcessor(), more or less. Maybe check in the end whether all of them are enabled, if you like, but I don't think checking after every method call will get us anything.

  11. +++ b/src/Tests/Processor/ProcessorTestBase.php
    @@ -87,37 +88,41 @@ abstract class ProcessorTestBase extends EntityUnitTestBase {
    +    $field_subject = new Field($this->index, 'subject');
    +    $field_subject->setType('text');
    +    $field_subject->setPropertyPath('subject');
    +    $field_subject->setDatasourceId('entity:comment');
    +    $field_subject->setLabel('Subject');
    +
    +    $field_title = new Field($this->index, 'title');
    +    $field_title->setType('text');
    +    $field_title->setPropertyPath('title');
    +    $field_title->setDatasourceId('entity:node');
    +    $field_title->setLabel('Title');
    +
    +    $this->index->addField($field_subject);
    +    $this->index->addField($field_title);
    +
    +    if ($processor) {
           /** @var \Drupal\search_api\Processor\ProcessorPluginManager $plugin_manager */
           $plugin_manager = \Drupal::service('plugin.manager.search_api.processor');
           $this->processor = $plugin_manager->createInstance($processor, array('index' => $this->index));
    +      $this->index->addProcessor($this->processor, array());
         }
    

    I see no reason why fields and processors shouldn't just use set(), or be included in the initial $values, in a test case. We don't need to use the API there.

  12. +++ b/tests/src/Kernel/DependencyRemovalTest.php
    @@ -79,13 +87,13 @@ class DependencyRemovalTest extends KernelTestBase {
    -   * @param bool $remove_dependency
    +   * @param bool $dependency_removal_return_value
    

    I'd say that's a bit overly verbose for a parameter name – and it also doesn't really express the whole meaning, since it doesn't only influence the return value, but also the functionality of Plugin::onDependencyRemoval().
    If you find the parameter is unclear, please expand the (doc) comments, but I don't think the rename helps.

  13. +++ b/tests/src/Kernel/DependencyRemovalTest.php
    @@ -210,15 +223,14 @@ class DependencyRemovalTest extends KernelTestBase {
    +      $this->assertContains('search_api_test_dependencies', array_keys($datasources), 'Datasource not removed');
    

    Instead of assertContains() and array_keys(), just use assertArrayHasKey().

  14. +++ b/tests/src/Kernel/DependencyRemovalTest.php
    @@ -210,15 +223,14 @@ class DependencyRemovalTest extends KernelTestBase {
    +      $this->assertNotContains('search_api_test_dependencies', array_keys($datasources), 'Datasource removed');
    +      $this->assertArrayNotHasKey('search_api_test_dependencies', $datasources, 'Datasource config removed');
    

    Those two lines test exactly the same thing, one can be removed now. (Keep the lower one, but with the message of the upper one.)

  15. +++ b/tests/src/Kernel/DependencyRemovalTest.php
    @@ -439,13 +461,24 @@ class DependencyRemovalTest extends KernelTestBase {
    +   * take as "all OK, dependency removed" and leave the plugin where it is,
    +   * only with updated configuration.
    

    Line could be broken later!
    Hah! That I live to see this day … ;P

  16. +++ b/tests/src/Kernel/DependencyRemovalTest.php
    @@ -439,13 +461,24 @@ class DependencyRemovalTest extends KernelTestBase {
    -      'Remove dependency' => array(TRUE),
    -      'Keep dependency' => array(FALSE),
    +      'Keep dependency' => array(TRUE),
    +      'Remove dependency' => array(FALSE),
    

    Why? That doesn't make any sense. For TRUE, the plugin is kept, yes, but the dependency isn't.

Also, we could now probably unify most of the methods for all three plugin types, right? Could save quite a bit of code. But that's also possible in a follow-up, of course. (While we're at it, we could also add a public createPluginInstance() method to the Index class, which would be an easy way to substitute the current $only_enabled parameter on the getter methods.)
In general, we should probably take the same approach here as in #2574969: Add a Views-like UI for adding fields: make it work, then commit it, then polish it.

And to summarize my checklist for whether this issue is RTBC yet:

  • Code style (as always).
  • Tests pass (as always).
  • Index::resetCaches() is gone.

A bit of a simplification, but that's the main goal of this issue, I guess. I'd also just remove our persistent cache for now, and see later if we really need anything cached. Due to #2574969: Add a Views-like UI for adding fields it serves considerably less purpose now, I'd say. "Get rid of it and add it once someone complains" seems to be the smart move here to me.

borisson_’s picture

Status: Needs work » Needs review
FileSize
15.22 KB
84.12 KB
  1. You fixed that in your patch, thanks. Not sure about the $only_enabled parameter for now.
  2. It would, because of schema errors, I agree this looks ugly but it replaces 2 properties (both $tracker_config and $tracker)
  3. Yup, thanks for fixing
  4. This is another big change; but I like it. The change to NULL is easy, done.
  5. Sure, I'll see if I can do that.
  6. Thanks, this does look better.
  7. This is the reason the tests are currently still failing as well.

    I like your notice of removing the plugin_id and moving the weights to the settings.

  8. Let's do that in a follow-up and leave this as-is for now.
  9. It should, let's do that.
  10. I added those because I broke a couple of things while porting code and I broke so much.
    This helped in fixing the code. (The problem I had was that all processors were enabled all the time and the tests didn't catch that). I only added the sorts after I fixed the code for the first 2 processors there. I've changed the test so it actually tests what we expect it to test now.
  11. I don't think using the API hurts either.
  12. Sure, reverting that change. the new doc comment for the provider already helps a ton.
  13. Yep, done, thanks
  14. Done
  15. :-D I had to reflow the lines again because of the variable change in 14.
  16. The doc comment actually makes everything clear now.

Remaining work:

  1. Don't work with _settings arrays as much. see #86.4 & #86.5
  2. Remove all notices of plugin_id and use the array key as plugin_id
  3. Move weights into settings for processors
  4. Add addDataSource / removeDataSource and setTracker as methods and remove all $index->[s|g]et("key", $value); calls for datasources and tracker with those.
  5. Create a followup to unify code for plugin types.
  6. remove resetCaches

Status: Needs review » Needs work

The last submitted patch, 87: clean_up_caching_of-2638116-87.patch, failed testing.

The last submitted patch, 87: clean_up_caching_of-2638116-87.patch, failed testing.

drunken monkey’s picture

It would, because of schema errors, I agree this looks ugly but it replaces 2 properties (both $tracker_config and $tracker)

Why would having no tracker in the array lead to schema errors? That doesn't seem to make sense to me.
Also, we still would never save an index without a tracker, it's merely about initializing. So the config schema shouldn't notice that change.
Or is the tracker actually never set currently? Then fixing that would of course be a separate issue.

I don't think using the API hurts either.

Of course not, it just looks weird to have both styles in the same method. But yes, not a big issue, just keep it like that if you prefer.

borisson_’s picture

Attached is the work I did on this patch yesterday, I moved the weights into settings and still get the same failures. For some reason there's a difference between $old_processors and $new_processors in IndexProcessorsForm. I'll try figuring that out further now. I really want to get tests green before doing any of the other todo items.

Status: Needs review » Needs work

The last submitted patch, 91: clean_up_caching_of-2638116-91.patch, failed testing.

The last submitted patch, 91: clean_up_caching_of-2638116-91.patch, failed testing.

borisson_’s picture

This fixes the failing IntegrationTest, I feel very sad that this 2kb fix cost me close to a day of work.
Anyway, next up is removing all calls to resetCaches.

borisson_’s picture

borisson_’s picture

Removed resetCaches, still 1 fail in \Drupal\Tests\search_api\Kernel\CustomDataTypesTest that I can't explain.

Status: Needs review » Needs work

The last submitted patch, 96: clean_up_caching_of-2638116-96.patch, failed testing.

The last submitted patch, 96: clean_up_caching_of-2638116-96.patch, failed testing.

borisson_’s picture

I figured this was something related to the caching of the field, but it looks like it isn't. More lines of testcode - still the same fail.

Status: Needs review » Needs work

The last submitted patch, 99: clean_up_caching_of-2638116-99.patch, failed testing.

The last submitted patch, 99: clean_up_caching_of-2638116-99.patch, failed testing.

drunken monkey’s picture

    $this->index->getField('name')
      ->setType('search_api_test_data_type')
      ->setLabel("Test");

    $name_field = $item->getField('name');

Since you removed caching from getFields(), this can't work. $this->index->getField() will return a new field object, you set something on it, but right afterwards it's lost forever and can't possibly influence $item->getField(), which will get a completely new field object from the index.

The proper solution here is the next step – i.e., have one Index::$fieldInstances property with all the field objects, which you only ever populate a single time during a page request, and write back to field_settings in preSave().
For the moment, though, just calling $index->addField() should also do the trick.

borisson_’s picture

I spent some work on that removing all $index->set(thing, value) and $index->get(thing) calls and got the tracker to work, that one's easiest. I also spent about 2 hours on doing the same for datasources and another hour on doing that for processors as well, but those are a little bit harder.

Anyways, patch attached and I'm trying to do more on this today. I would've loved to get this patch in before sitting down to discuss all caching related things with Wim tomorrow.

Also: hurray 30+ patches.

borisson_’s picture

Part of the processor changes are now done, still some fails in the integration tests that I can't pinpoint now but my day 's done for now.

Status: Needs review » Needs work

The last submitted patch, 104: clean_up_caching_of-2638116-104.patch, failed testing.

The last submitted patch, 104: clean_up_caching_of-2638116-104.patch, failed testing.

Berdir’s picture

I've just look at the issue summary which is probably outdated at this point and doesn't contain the solution but that sounds vey similar to the issues that we've been fighting with in #2541206: Consider field storage dependency removal on Index? If this addresses that, then we can hopefully simplify that other patch and finally get it in :)

borisson_’s picture

I'm not sure if this patch will fix the other issue. This will 100% break that patch though. I did more work on #102 - adding a $fieldInstances variable that is the canonical storage.

Status: Needs review » Needs work

The last submitted patch, 109: clean_up_caching_of-2638116-109.patch, failed testing.

The last submitted patch, 109: clean_up_caching_of-2638116-109.patch, failed testing.

Berdir’s picture

Sorry, I possibly wasn't clear. I don't expect that it will *fix* that issue. But while testing that, with tests and manually, we also did run into all kinds of issues with persistent and static caching of $this->fields. We tried to solve that there too. If this does indeed fix that problem then it should allow us to focus on the actual problem there.

And yes, I expect it will conflict, that's OK :)

drunken monkey’s picture

I've just look at the issue summary which is probably outdated at this point and doesn't contain the solution but that sounds vey similar to the issues that we've been fighting with in #2541206: Consider field storage dependency removal on Index? If this addresses that, then we can hopefully simplify that other patch and finally get it in :)

Yes, exactly that is the plan, and more or less the main motivation behind this issue: solve all those intricate problems, like the one in the other issue, caused by our chaotic caching code.

borisson_’s picture

I don't understand the remaining fail yet but it's going in the right direction.

Status: Needs review » Needs work

The last submitted patch, 114: clean_up_caching_of-2638116-113.patch, failed testing.

The last submitted patch, 114: clean_up_caching_of-2638116-113.patch, failed testing.

drunken monkey’s picture

Awesome job, Joris, looks really great already!
Since I'm gonna be on vacation next week, which is of course pretty bad timing, I'm gonna review this now so hopefully when I get back I can just commit. ;)

  1. +++ b/config/schema/search_api.processor.schema.yml
    @@ -2,6 +2,12 @@ plugin.plugin_configuration.search_api_processor.add_url:
    +    weights:
    +      type: sequence
    +      label: 'The processor''s weights for the different processing stages'
    +      sequence:
    +        type: integer
    +        label: 'The processor''s weight for this stage'
    

    Maybe just create a new search_api_processor_config config type and use that as a base instead of mapping? That way you wouldn't have to copy that "weights" definition to all individual processor definitions.
    Views handlers, e.g., work a lot with that (since they all have a gazillion options, and mostly from the base class).

  2. +++ b/src/Entity/Index.php
    @@ -368,23 +368,42 @@ class Index extends ConfigEntityBase implements IndexInterface {
       public function getTrackerId() {
    -    return $this->tracker;
    +    return $this->getTrackerInstance()->getPluginId();
       }
    

    We can't use getTrackerInstance() in getTrackerId(), otherwise this and (worse) hasValidTracker() might throw an exception.
    It should be the other way round – you already have a way to get the tracker ID in getTracker(), just move it to getTrackerId() instead and call that there.

  3. +++ b/src/Entity/Index.php
    @@ -453,15 +478,15 @@ class Index extends ConfigEntityBase implements IndexInterface {
    -        if (!empty($processor_settings[$name]['weights'][$stage])) {
    -          $processor_weights[$name] = $processor_settings[$name]['weights'][$stage];
    +        if (!empty($processor_settings[$name]['settings']['weights'][$stage])) {
    +          $processor_weights[$name] = $processor_settings[$name]['settings']['weights'][$stage];
    

    I think weights should have getters and setters. We store it in the processor configuration by default, but I really don't think we should hard-code it – it's still the processor's "autonomous" configuration. (Also, that way the code for getting the default can just be internal to processors and we don't have to deal with it here (and in the admin UI).)

  4. +++ b/src/Entity/Index.php
    @@ -596,15 +629,14 @@ class Index extends ConfigEntityBase implements IndexInterface {
    -      throw new SearchApiException(new FormattableMarkup('%field_id is a reserved value and cannot be used as the machine name of a normal field.', $args));
    +      throw new SearchApiException(new FormattableMarkup('%field_id already exists and can\'t be used as a new field id.', $args));
    

    Minor nit-pick: use double quotes here instead of escaping the apostrophe.

  5. +++ b/src/Entity/Index.php
    @@ -1073,33 +1009,61 @@ class Index extends ConfigEntityBase implements IndexInterface {
    +    $this->field_settings = array();
    +    foreach ($this->getFields() as $field_id => $field) {
    +      $this->field_settings[$field_id] = $field->getSettings();
    +    }
    

    Exactly what I meant, great!
    However, instead of doing it twice, maybe just do it once at the very end of the method. And instead of resetting the locked/hidden flags in the settings, just reset them on the field objects and see whether the processors change them back.
    I don't know whether that will work, though, there's probably a reason why you're currently doing it twice.

  6. +++ b/src/Entity/Index.php
    @@ -1073,33 +1009,61 @@ class Index extends ConfigEntityBase implements IndexInterface {
         // We first have to check for locked processors, otherwise their
         // preIndexSave() methods might not be called in the next step.
    +    $this->processor_settings = array();
         foreach ($this->getProcessors(FALSE) as $processor_id => $processor) {
    -      if ($processor->isLocked() && !isset($this->processors[$processor_id])) {
    -        $this->processors[$processor_id] = array(
    -          'processor_id' => $processor_id,
    -          'weights' => array(),
    +      if ($processor->isLocked() && !isset($this->processor_settings[$processor_id])) {
    +        $this->processor_settings[$processor_id] = array(
    +          'plugin_id' => $processor_id,
               'settings' => array(),
             );
           }
         }
     
    -    // Reset the cache so the used processors and fields will be up-to-date.
    -    $this->resetCaches();
    +    foreach ($this->getProcessors() as $processor_id => $processor) {
    +      $this->processor_settings[$processor_id] = array(
    +        'plugin_id' => $processor_id,
    +        'settings' => $processor->getConfiguration(),
    +      );
    +    }
    +
    +    // Reset the canonical representation of the processors by running
    +    // ::getProcessors() again here after resetting them back to NULL. This is
    +    // only needed when creating a new Index.
    +    $this->processorInstances = NULL;
    +    $this->getProcessors();
    

    (Mega-quote, sorry.)
    It seems to me that if you merge the writing of the settings with the locked-status check above, and update the processors with addProcessor() accordingly, you can get rid of those ugly last two lines.
    But I might be mistaken, of course, there seems to have been a need for them.

  7. +++ b/src/Entity/Index.php
    @@ -1560,22 +1536,18 @@ class Index extends ConfigEntityBase implements IndexInterface {
           foreach ($plugin_configs as $plugin_id => $plugin_config) {
             switch ($plugin_type) {
               case 'processors':
    -            $this->processors[$plugin_id]['settings'] = $plugin_config;
    +            $this->processor_settings[$plugin_id]['settings'] = $plugin_config;
                 break;
               case 'datasources':
    -            $this->datasource_configs[$plugin_id] = $plugin_config;
    +            $this->datasource_settings[$plugin_id]['settings'] = $plugin_config;
                 break;
               case 'tracker':
    -            $this->tracker_config = $plugin_config;
    +            $this->tracker_settings[$plugin_id]['settings'] = $plugin_config;
                 break;
             }
           }
    

    I think this whole block can just be removed now.
    However, a lot of onDependencyRemoval() can now be improved, so we can also just leave it here and overhaul the whole method in a follow-up.

  8. +++ b/src/Item/Field.php
    @@ -167,7 +167,7 @@ class Field implements \IteratorAggregate, FieldInterface {
    -    if ($this->index->id() != $index->id()) {
    +    if ($this->index instanceof IndexInterface && $this->index->id() != $index->id()) {
    

    I don't see any reason for that, I see no way how this could be NULL, or not an index. (Except a botched unserialization, but at that point I'd say failing is a pretty good option.
    Or, at the least, we should then check for whether the $this->indexId property is there and compare that. (And not unset it in __wakeup() if the index load failed.))
    Is this needed for some tests, maybe? In that case, maybe just leave it in there and create a follow-up for fixing this according to my suggestion.

But all in all, again, great work! Thanks!
And maybe one of my suggestions will actually fix the patch fail, some of them could become bugs in certain cases. (Though the test fail doesn't look like it.)

Once you think this is RTBC, I guess either work on Facets, or create Search API patches already based off of your patch, and keep branches so you can easily rebase them?
Sorry, really bad timing with the vacation, having such an uncommitted giant lying around.

borisson_’s picture

#117

  1. Sure, let's do that in a non-critical follow-up?
  2. I think we can, that way, the only method that actually talks to the tracker_settings. It has a fallback to the default tracker - because there always has to be one tracker I don't think this will lead to any problems. The tests don't seem to think so either.
  3. Not sure if I understand this. - I'll give it a go in the next patch though.
  4. Sure, done.
  5. There was a reason, but doing it at the end is okay as well, all tests are green. Thanks for the tip.
  6. I can't get it to work with your suggestion I'm afraid, we can always do that as a non-critical follow-up though.
  7. Let's leave that in for now and do that in a non-critical follow-up
  8. I think it was needed during tests because I kept breaking all kinds of things, tests pass again though

Attached patch fixes/answers most of the remarks from #117. I don't really understand .3 - but I'll give it a go now.

Anyway, tests are green again locally - hopefully the testbot agrees with that.

borisson_’s picture

Ok so I didn't really understand #117.3 - I did however read over the entire patch again and found a couple of things that could be better, so I fixed them. Some other fixes still to come.

borisson_’s picture

There are still a couple of instances of $index->set('datasource_settings', ...) left in the code, going to rip those out now. This attached patch did that for processor_settings.

borisson_’s picture

As promised earlier - I ripped out all occurrences where datasource_settings was being talked to from outside the index class.

borisson_’s picture

drunken monkey’s picture

Again, thanks a lot for your work! Looks very good now, I guess we are about to finally commit this.

+++ b/config/schema/search_api.datasource.schema.yml
@@ -3,7 +3,7 @@
   label: 'Entity datasource configuration'
   mapping:
     default:
-      type: string
+      type: integer

Shouldn't this be a boolean? But it seems it's unrelated anyways, so maybe just open a new issue for that?

Otherwise, I have just written my comments in patch form, please see the attachments.

Status: Needs review » Needs work

The last submitted patch, 123: 2638116-123--clean_up_index_caching.patch, failed testing.

The last submitted patch, 123: 2638116-123--clean_up_index_caching.patch, failed testing.

drunken monkey’s picture

Status: Needs work » Needs review

Status: Needs review » Needs work

The last submitted patch, 127: 2638116-127--clean_up_index_caching.patch, failed testing.

The last submitted patch, 127: 2638116-127--clean_up_index_caching.patch, failed testing.

drunken monkey’s picture

Status: Needs review » Needs work

The last submitted patch, 131: 2638116-131--clean_up_index_caching.patch, failed testing.

borisson_’s picture

I think the failures in #131 are unrelated, there is one change in #127 I disagree with though.

  1. +++ b/tests/src/Kernel/DependencyRemovalTest.php
    @@ -342,14 +342,14 @@ class DependencyRemovalTest extends KernelTestBase {
    -    $tracker = \Drupal::getContainer()
    +    $tracker_id = \Drupal::getContainer()
           ->get('plugin.manager.search_api.tracker')
           ->createInstance('search_api_test_dependencies', array(
             $dependency_key => array(
               $dependency_name,
             ),
           ));
    -    $this->index->setTracker($tracker);
    +    $this->index->setTracker($tracker_id);
    

    I see you changed this variable, this is not correct though, we're creating an instance of the tracker plugin, not getting the ID.

drunken monkey’s picture

Thanks for spotting that!
That's just an unintended side effect of a change further down in the same method, where $tracker was used to hold the tracker ID. All the better that you spotted it!

The test fail looks indeed very strange. Let's just give the test bot another chance …

drunken monkey’s picture

Status: Needs review » Fixed

Excellent, passing again. Got the OK from Joris, too, so: committed!
Thanks again everyone, especially of course Joris!

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.

mpp’s picture

The following change causes a Fatal error: Nesting level too deep - recursive dependency? because old_processors != $new_processors will recursively compare an array with recursive data.

   protected function reactToProcessorChanges(IndexInterface $original) {
-    $original_settings = $original->getProcessorSettings();
-    $new_settings = $this->getProcessorSettings();
+    $old_processors = $original->getProcessors();
+    $new_processors = $this->getProcessors();
 
     // Only actually do something when the processor settings are changed.
-    if ($original_settings != $new_settings) {
+    if ($old_processors != $new_processors) {