There is a data model issue. Taxonomy terms are content, Vocabularies are config. Changing the taxonomy term content can change the config This is wrong. If the vocabulary hierarchy is indeed content, it should be stored somewhere else. The Vocabulary hierarchy field and the taxonomy hierarchy table are two different things, don't confuse them. This is about Vocabulary.hierarchy, which is just used by the UI.
As a User Interface configuration field, there is a also UI issue. If you configure a vocabulary to be hierarchical and the user removes all child terms, making the hierarchy flat, this changes the configuration from hierarchical to flat. Why does user content override the site builder's configuration? Further, if the user makes such a change, converting the vocabulary from hierarchical to flat, they can no longer add hierarchical terms on the taxonomy edit page, they can only do so on the overviews page.
I believe that Vocabulary.hierarchy is only used by the UI:
* If it's set to 0 (disabled), it's used on the overview page to not allow re-ordering. In our case, if it's set to 0, then the UI prevents re-ordering, and our change should never be called.
* If it's set to multiple, the term edit page displays related terms differently.
Our options are:
1. Treat the Vocabulary hierarchy as configuration only. Site builders can set this. Data changes do not change the hierarchy setting.
2. Add a vocabulary content entity that stores the configuration.
3. Remove the hierarchy property from vocabularies and compute it at runtime when needed.
I think (1) is the right option.
The patch from #4 implements option 3.
Comment | File | Size | Author |
---|---|---|---|
#33 | interdiff-33.txt | 4.82 KB | amateescu |
#33 | 2957381-33.patch | 32.17 KB | amateescu |
#32 | 2957381-25-8.6.patch | 29.49 KB | GaëlG |
#25 | interdiff-25.txt | 10.52 KB | amateescu |
#25 | 2957381-25.patch | 29.33 KB | amateescu |
Comments
Comment #2
Fabianx CreditAttribution: Fabianx at Tag1 Consulting commentedI think what I would do is:
- Split it in 3 parts:
* taxonomy_term, content entity
* taxonomy_hierarchy, content entity
* vocabulary, config entity
One vocabulary "contains" (like references, is combined with, ...):
- 0..1 taxonomy_hierarchy entities
- 0..n taxonomy_term entities
That also sweetly solves the use case of CPS (and related modules), allows generic hierarchy's if there is a generic hierarchy entity, and data model wise it can be in a table, but that is determined by the hierarchy entity itself.
Comment #3
douggreen CreditAttribution: douggreen at Appnovation for Pfizer, Inc. commentedComment #4
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedThe fact that a change on a content entity can also affect a config entity is a huge problem for the Workspace module, because we want to disallow any config changes at the API level when you are in a non-default workspace.
So I think we should simply remove this
hierarchy
property from vocabularies altogether and replace it with a very simple run-time query:SELECT parent_target_id as parent_id, MAX(delta) AS max_delta FROM `taxonomy_term__parent` WHERE bundle = '<vid>' GROUP BY parent_target_id
Vocabulary::getHierarchy()
changes into a simple static cache for the result of that query.Without this fix, all the work we did in #2880149: Convert taxonomy terms to be revisionable is meaningless since saving a revisionable term won't be possible in a non-default workspace, so promoting to major.
Comment #6
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedMissed a few spots.
Comment #7
jibranWow, #4 is quite simple and smart soultion. We should update the IS. We also need a change record. The patch looks great this is the only thing was able to find questionable:
Do we have upgrade path tests?
Comment #8
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedWrote upgrade path tests, a change record (https://www.drupal.org/node/2979986) and updated the IS.
Comment #9
jibranThanks, patch looks ready to me. Do we need an approval from subsystem maintainer?
Comment #10
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedI don't think so, but if @catch is going to look at committing it, he's also a Taxonomy maintainer :)
Comment #11
alexpottCan we document that this is not a property from configuration but now a static cache. Also we've introduced a subtle bug. If you:
It won't be updated but I think it would have been before.
Is the anonymous function actually necessary here? Won't just saving them remove the
hierarchy
because it is not in the export properties?Comment #12
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedThanks for reviewing!
1. Very good point! You're right that the hierarchy would have been updated before. I added a static cache reset for that property in a way that should also work if you do those steps via a REST request, which was not handled before.
2. Nope, because the default callback provided by
\Drupal\Core\Config\Entity\ConfigEntityUpdater::update()
only checks if the dependencies have been changed, and returns FALSE in our case, so the entities are not saved. Here's a test run without the anonymous functions:Comment #13
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedOops, the interdiff is correct but I forgot to stage the changes in git :/
Comment #15
alexpottI think we still some comment that this property is a static cache and not a config property. Mostly because this has been a config property in the past.
It'd be good to know how expensive this query is. We already have performance problems with the term overview page and this might add to them. On really large term tables I think this might be expensive.
Comment #17
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedRe #15
1. As discussed on slack, let's remove that variable from
Vocabulary
altogether and move it to the place which actually computes it, the term storage :)2. Here's some performance number for that query, with various number of terms:
2649 terms:
Query result: 1307 total, Query took 0.0031 seconds.
7789 terms:
Query result: 3841 total, Query took 0.0214 seconds.
16294 terms:
Query result: 8089 total, Query took 0.0036 seconds.
At this point, I wasn't even able to load the term overview page with a 128M memory limit, so I think the performance of the query is fine :)
Comment #18
alexpott@amateescu thanks - there's another issue to fix the ability to load the page :)
I think we should deprecate this setter can just make it reset the static cache on the TermStorage - as the reality is this is not something you can set.
I don;t think we should introduce this as this is not something to set.
Comment #19
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedRe #18, that's perfectly fine with me, I don't know why I thought about introducing a new setter for it :)
Comment #21
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedHere's the proper patch :)
Comment #22
jibran@alexpott that was some amazing reviews. Patch looks really nice now and change notice is also up to date so back to RTBC.
Comment #23
alexpottLet's deprecate in favour of calling the term storage - and replace core's usages.
There's no setting from outside anymore.
Nice catch. Extra bug fixes here too .... hmmm.
Should we be telling people to get it from the Vocabulary or the term storage direct? I think term storage.
Comment #25
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedFixed all of #23, rerolled to 8.7.x and opened #2994830: TermStorage::__sleep() does not unset the 'ancestors' variable for #23.3.
Comment #26
Wim Leers#25 looks great!
Comment #28
catchApproach looks really good here, do we need to run EXPLAIN on the query with a decent sized database though?
Comment #29
catchHere's the query
And the current indexes:
Do we need something on bundle, delta? Maybe replace the current index that's just on 'bundle'?
Comment #30
amateescu CreditAttribution: amateescu for Pfizer, Inc. commented@catch, see #17 for some performance numbers. It would be interesting to see if there are any improvements with a
[bundle, delta]
index, but the result there with 16294 terms seems to indicate that it's not really needed.. I think.Comment #31
catchhmm the 'using temporary, using filesort' and that it's looking at 8k out of 16k rows from the EXPLAIN suggests to me that it does need the improved index (assuming that fixes it).
Comment #32
GaëlGHere's a backport of patch #25 for Drupal 8.6 (applies at least on 8.6.0 and 8.6.1), to be referenced in a composer.json file.
Comment #33
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedI tried adding an index on
bundle, delta
but it didn't change anything, so instead I looked a bit into the reason why that query was using temporary and filesort, and apparently it's caused by theGROUP BY
clause. After inspecting our requirements more carefully I realized that we don't really need to select all parent IDs, we just need to know if we have anyparent_id > 0
, so here's the simplified query that we can use:And with an index on
bundle, delta, parent_target_id
the EXPLAIN looks like this:Comment #34
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedThe performance problems with the query have been addressed, so I think the patch is ready for another review from core maintainers :)
Comment #35
catchCommitted 70e4010 and pushed to 8.7.x. Thanks!
Comment #37
Wim LeersI just want to say thank you very much, @amateescu for simplifying one bit of Drupal, which will result in one bit of complexity less in Drupal 9! 🙏
Comment #38
Manuel Garcia CreditAttribution: Manuel Garcia as a volunteer and at Appnovation for Pfizer, Inc. commentedAwesome work here guys, so glad to see this in!
Should we publish the CR or does it need any final touches?
Comment #39
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedThank you both for the kind words :) I published the CR now.
Comment #40
Wim LeersThis proves the REST API was impacted. In principle, this was a BC break.
This also broke JSON:API: https://www.drupal.org/pift-ci-job/1122706
Comment #41
Wim LeersJSON:API issue: #3014289: VocabularyTest is broken by Drupal core >= 8.7 which removed the `hierarchy` key.
Comment #43
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedWhile working on #2880149: Convert taxonomy terms to be revisionable, I found a problem with the test that was added here: #3031479: TaxonomyVocabularyHierarchyUpdateTest::testTaxonomyUpdateParents() is prone to breaking on PostgreSQL