For things like #2421935: User entity is re-indexed after every user login, we should make it possible (or easier) to react on item state changes and possibly alter or prevent the calls to TrackerInterface::trackItems*().
How to best do this? I don't know. But would be nice to have, so we should look into it at some point.

Estimated Value and Story Points

This issue was identified as a Beta Blocker for Drupal 8. We sat down and figured out the value proposition and amount of work (story points) for this issue.

Value and Story points are in the scale of fibonacci. Our minimum is 1, our maximum is 21. The higher, the more value or work a certain issue has.

Value : 1
Story Points: 13

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

drunken monkey’s picture

Possibly a new stage and method for processors? If there are no processors with that stage, there shouldn't* be much of a performance drawback, and it would be a rather clean solution, I guess.
Remaining question: do we need to support it for all three CUD operations, or only for updates? And, in the former case (which I think I'd prefer): do we then also need three additional stages (I'd say: no)?
In any case, I don't think we should expose this stage/these stages in the UI – or should we?

Other questions: Should there also be a hook? Should there only be a hook? And what should the exact mechanism be: return a boolean? How would this then be aggregated over multiple processors?

In any case, if we implement this, we should probably provide a solution right in the module that rejects user updates where only the last login time changed (unless that's indexed, of course). We'd have to figure out how to cleanly check which fields changed, though. Due to "related fields", just iterating over all fields is really not enough.

On the other hand, once we're there we can really just implement this for any entity update and save a whole lot of unnecessary re-indexing for all indexes everywhere. Which would of course be great.
Then we could either not implement this at all, or offer it as a method for modules to override this, when necessary.

* Probably should verify this, though. We would have to instantiate all of the index's processors when an item changes, on page loads where we didn't need them before.

m1r1k’s picture

Issue tags: +drupaldevdays

1) For me it seems like we have to split FieldsProcessorPluginBase into some other Base plugins instead of messing with stages. At least for Index, Query, SearchResults processors, because we don't have any processor that uses at least two of them (except ContentAccess, but it is mostly exception then rule)

2) Lets avoid using hooks for redcing items for indexing in D8 and use IndexProcessors only

3) For #2421935: User entity is re-indexed after every user login example I'd like to create kind of *hidden* IndexProcessor where we could put all out-of-box settings which do not require configuration and have to be enabled always. If is is acceptable I will create new processor and add ability to make it hidden.

drunken monkey’s picture

ad 1) At least half the processors preprocess during both indexing and searching, so I don't really know where that's coming from. Also, I don't see how that's relevant to this issue?

ad 2) Why?

ad 3) See #2090341: Add a way for marking fields/processors as "locked" or "hidden" for making the processor hidden. Apart from that, the question would be what other functionality you would want to add there?

m1r1k’s picture

Btw, issue is not relevant anymore, because login does not trigger entity_update (user.module:user_login_finalize()):

  $account->setLastLoginTime(REQUEST_TIME);
  \Drupal::entityManager()
    ->getStorage('user')
    ->updateLastLoginTimestamp($account);

But I will create a patch with appropriate hook. Something like this:

/**
 * Allows to skip tracking of entity on CRUD operation.
 *
 * @param \Drupal\Core\Entity\EntityInterface $entity
 *   Passed from hook_entity_delete() so contains $entity->origin.
 *
 * @param string $op
 *   'update', 'delete' or 'insert' event.
 *
 * @return boolean
 *   TRUE if given entity should not be tracked to be indexed.
 *
 * @see search_api_entity_delete()
 * @see search_api_entity_update()
 * @see search_api_entity_insert()
 */
function hook_search_api_skip_tracking(\Drupal\Core\Entity\EntityInterface $entity, $op = 'update') {
  if ($op == 'update' && $entity->getEntityTypeId() == 'user' && $entity->id() == '123') {
    return TRUE;
  }
}
drunken monkey’s picture

The hook as you sketch it would then only work for the content entity datasources of the Search API. Would be a possibility, though, having just a module-specific hook instead of a framework-level one (in which case we also shouldn't allow processors to influence this, though). Modules with other datasources would then need to provide their own hooks, if they want to.
On the other hand, if we want this to work for all datasources, we would probably have to mandate that modules providing datasources invoke the hook and/or processor method prior to tracking the operation – or change a lot how CRUD operations are tracked.

In any case, the code of the search_api_entity_*() hook implementations would get even more complex than they currently are. So, especially if the problem with the user login is resolved for now, I would wait for a decision on this issue for the moment, and concentrate on other areas.

Nick_vh’s picture

Issue summary: View changes
Issue tags: +beta blocker
drunken monkey’s picture

After a bit of discussion, the only real use case for "altering CRUD reactions" we could come up with is avoiding to track an item as updated (or, maybe, inserted).

One solution to implement this we came up with, if that's possible, is looking for a certain flag/property on the entity and skip the tracking in this case. Then those people who need this can just implement the appropriate entity CRUD hook(s) to set that flag/property before we look at it.
We'd just have to see if this is even possible with current entities.
(This would of course only be a solution for our datasources – but that should be good enough.)

Otherwise, we could still just call a hook, and either mandate that other datasources invoke it, too, or make it again specific to our entity datatsources.

drunken monkey’s picture

Status: Active » Needs review
FileSize
7.21 KB

The attached patch should implement this, and comes with tests.

Luckily, $entity->search_api_skip_tracking works very well, and even automatically returns NULL if it wasn't set previously (without issuing any PHP notices).
One question, though: Should we also unset it after checking it?(The check itself creates the value, if it wasn't there before.) Or explicitly check for isset() first? (empty() would probably do both at once in this case, too.)

Also, do we need to document this anywhere (else)?

borisson_’s picture

Status: Needs review » Reviewed & tested by the community

This should be enough documentation.

  • drunken monkey committed 946295d on 8.x-1.x
    Issue #2454517 by drunken monkey: Added support for skipping tracking of...
drunken monkey’s picture

Status: Reviewed & tested by the community » Fixed

OK, great, thanks for your review!
Committed.

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.