Problem/Motivation
If you create a translation of an article in Ukrainian, for example, and then delete it, then on page /admin/config/search/pages we see the following: "There is 1 item left to index."
Running the cron does not solve the problem.
Result:
On page /admin/config/search/pages we see the following: "There is 1 item left to index."
If this is repeated for 100 articles, the indexing of the site will stop.
There is such an entry in the database:
SELECT * FROM `d8_search_dataset`
- sid = 1
- langcode = uk
- type = node_search
- data = "some text"
- reindex = 1618155506
Steps to reproduce
1. Install Drupal 8.9.13 (Latest version for now)
2. Install modules (from core): Configuration Translation, Content Translation, Interface Translation,
Language
3. /admin/config/regional/language - Add the Ukrainian language, for example
4. /admin/structure/types/manage/article - "Enable translation"
5. create an article and add a translation to it.
6. Runs Cron
7. Delete article translation.
8. Runs Cron
Proposed resolution
In Node::postSave() execture a new private method nodeSearchRemoveDeletedTranslations that will remove deleted translations from the search index.
Remaining tasks
User interface changes
API changes
Data model changes
Release notes snippet
Problem/Motivation
Issue fork drupal-3208247
Show commands
Start within a Git clone of the project using the version control instructions.
Or, if you do not have SSH keys set up on git.drupalcode.org:
- 3208247-new-branch
changes, plain diff MR !8211 /
changes, plain diff MR !7966
- 11.x
changes, plain diff MR !7804
- 3208247-after-deleting-a
changes, plain diff MR !5357
Comments
Comment #2
cilefen commentedDoes this item remain forever despite cron executions?
Comment #3
orb commentedRunning the cron does not remove this item. Over time, there will be more and more of these items.
Comment #4
cilefen commentedThis may be related to, or currently part of, this effort #2723323: [META] Clean up references to deleted entities.
Comment #5
orb commentedI don't know, but the Search module should remove the index.
Comment #10
quietone commentedI tested on Drupal 11.x and, using the steps in the Issue Summary, I can confirm the problem still exists.
Comment #13
nikolay shapovalov commentedProvided tests to display described issue.
I created draft MR.
Comment #15
sukr_s commentedunable to change the draft status on the MR
Comment #18
sukr_s commentedunable to change the draft status in MR so reverted the changes and created a new mr.
Comment #19
smustgrave commentedNot sure a hook is the best place for this. There's NodeSearch.php shouldn't that leveraged to handle a deletion. Or does content_translation need a search plugin as well?
Comment #20
smustgrave commentedComment #21
sukr_s commentedWhen a node is deleted, it's removed from the search index in the node::preDelete event. It's not handled by NodeSearch or content_moderation. On the same lines when a translation is deleted, it must be removed from the search index using the translation_delete hook.
Comment #22
smustgrave commentedStill think there could be a better way to go about it. But at a minimum believe the change and test should belong in the search module. Not all sites use the search module and even though this hook wouldn’t do anything it will still get called every translation deletion.
Comment #23
sukr_s commentedw.r.t to alternate approach to solve this, it would be far more expensive to find what was deleted in a cron run e.g. run though list of languages configured for the site, then the list of languages currently available to for the node and then to clear the index for missing languages. So I believe it's best done at the time of deletion.
I also think that this should be in search module. However this would result in hard coding the type node_search (used in clear method) in search module which would not be right since search module does not own the 'node_search' plugin. I could not find a method to dynamically determine the search plugin type based on entity type. Therefore chose to place it in the node module.
is it possible to find search plugin type based on entity type or is it ok to hard code the type in search module?
Comment #24
smustgrave commentedWill see if the sub maintainer takes a look in a few weeks.
Comment #25
pwolanin commentedNode module is also responding to various changes in hooks (e.g. comment update) and calling
node_reindex_node_search()This feels like a similar case, so I think the approach in the MR is reasonable given the limitations of the system.
Comment #26
smustgrave commentedThanks @pwolanin for taking a look!
https://git.drupalcode.org/issue/drupal-3208247/-/jobs/1453892 shows the test coverage
Applied 1 more nitpicky change but this got sign off from the sub-maintainer so seems fine solution that addresses the problem.
Comment #27
alexpottSeems tests are failing.
Comment #30
smustgrave commentedI have 0 idea how the suggestion of :void broke so many tests. But started a new MR as the branch name 11.x wasn't good for local (since that's the main branch). Applied the suggestion by @alexpott and tests are green again.
Still very weird.
Comment #31
smustgrave commentedSorry @sukr_s @alexpott for all the noise
Comment #32
alexpottI think we should open a follow-up to add a post update to clean up the incorrect data. This issue stops the rot but the bogus data hangs about.
Comment #33
smustgrave commentedOpened #3446137: Post update for clearing search index of deleted translations
Comment #34
catchOne question on the MR.
Comment #35
catchActually two questions.
Nodes are marked for reindex in Node::postSave()
If we could detect that a translation is being deleted there (can we?) then we would not need to mark the node for indexing, and then delete it again in the hook implementation added here.
Or... is there somewhere else that has this information instead?
Comment #36
sukr_s commentedchanged to postsave and invoking
node_reindex_node_search($this->id());only if the translation was not deleted.Comment #37
alexpottLeft a comment on the MR. I think we need to always mark for reindex. The reindexing is not language specific but doing this is not going to recreate any row.
Comment #38
sukr_s commentedComment #43
catchCommitted/pushed to 11.x and cherry-picked back through to 10.3.x, thanks!
Comment #49
catchI missed a test failure in the last pipeline here, which then broke HEAD, reverted from all four branches.
Comment #51
sukr_s commented"commit db83ea46 - Better names and no need to pass translations around" broke the fix, which has now been resolved by reverting the relevant changes.
Comment #52
smustgrave commentedEdit.
Tests appear green again.
Comment #53
quietone commentedI read the issue summary, comments and the MR. The found two things that need attention
1) The proposed resolution is not what is implemented in the MR.
2) This needs a follow up for https://git.drupalcode.org/project/drupal/-/merge_requests/7966#note_310136
Comment #54
alexpott@quietone the approach changed after that comment and is no longer relevant.
Comment #55
alexpottI've moved the private function to be inline - keeping old translation objects around after \Drupal\Core\Entity\ContentEntityBase::postSave feels wrong.
Comment #56
larowlanLeft a comment on the MR - is there really no way to do this from search module or similar, putting it in a specific entity class feels wrong - what about other translatable nodes?
Comment #57
catch@larowlan all the search integration for nodes is in node module, like the NodeSearch plugin. I agree it would be much nicer if search module had entity support which was then enabled for nodes by default, but for now this should stay with the rest of the integration.
Comment #58
alexpottI think catch's answer in #57 menas this can go back to rtbc.
Comment #59
quietone commentedI tested this on 11.x today and found that this does remove the deleted item from the search index, but only after a cron run. It still adds it to the search index, so that the search page shows
98% of the site has been indexed. There is 1 item left to index.after the deletion. That is incorrect. And not what I expected from reading the issue title and summary. I expected search to not want to re-index it.There are the steps I used with the diff applied.
If my testing is confirmed correct then it looks like a follow up is needed for correcting the information on /admin/config/search/pages. Or have I misunderstood something again on this issue?
Comment #60
sukr_s commentedWhen any translation is removed, the original content is also added for index. This behaviour has not changed. Pervious to this fix, after the cron run the original article would get reindexed but the removed translation would be left hanging and still gets reported as "1 item requires indexing".
After this fix, the translation is removed from the index therefore before the cron run it shows 1 item which is the original article and after the cron run, it's zero items.
Comment #63
catch#60 makes sense to me, the entity will be re-indexed but the translation won't.
Since this caused a regression before, committed to 11.1.x and cherry-picked to 10.4.x, thanks!