Support for Drupal 7 is ending on 5 January 2025—it’s time to migrate to Drupal 10! Learn about the many benefits of Drupal 10 and find migration tools in our resource center.
Maybe this is a support request because of my brain not working...
When taxonomy_get_tree() gets called it stores the tree in static vars for that page load, but if you programmatically create a term, say with drupal_execute(), then try to add a node to that term with drupal_execute() it will tell you you have selected an illegal choice. I've attached a patch to illustrate my point, but I'm sure it could get cleaned up.
Comment | File | Size | Author |
---|
Comments
Comment #1
RobRoy CreditAttribution: RobRoy commentedSure of this now, better title.
Should the $reset just exit instead of processing the whole tree again?
I'm thinking this should probably get called on taxonomy_* API functions too because of...
Any other suggestions before I re-roll?
Comment #2
RobRoy CreditAttribution: RobRoy commentedHere is the code for D5 to cause this error. Say you wanted to create a forum container and then a child forum under that container in an install profile? Not too uncommon.
Do this and you'll get an "invalid choice selected" error from the last drupal_execute().
I guess this is a feature since you don't HAVE to use drupal_execute, so maybe for D6 we need to refactor since this patch feels a bit hackish. Could we use Drupal's caching mechanism to cache taxonomy? Then, we could do a cache_clear_all('vid:3', 'taxonomy') or whatever after modifying terms so the tree would be rebuilt.
I need some guidance from above here. ;)
Comment #3
RobRoy CreditAttribution: RobRoy commentedHere's the start for a caching patch. I haven't really tested it, just wanted to get it up now while I remember. It should probably go in it's own cache_taxonomy, but I wanted a bit of direction before heading in that direction.
Comment #4
killes@www.drop.org CreditAttribution: killes@www.drop.org commentedJust for the record: I don't think we need an extra table for cached taxonomies. The plain cache table contains only a single entry on a default Drupal install, there is plenty of space, especially for cached itesm that won't change that often.
Comment #5
RobRoy CreditAttribution: RobRoy commentedOkay, updated this. I'm stoked! This is a huge speed boost for sites with large taxonomies. There are a couple things to point out.
1. This only caches the tree when called directly and since there is recursion I had to add another param to not cache recursive calls.
2. Some int casting for the cache ID because I'm not sure where NULL or '' is passed in, should I cast to (string) so '' and 0 return different trees?
3. On term delete I added a forward-looking feature to clear caches where a term could have a child term in a different vocabulary. Not sure if any contrib modules do this, just to be safe.
Could use some testing, but seems to work fine so far. Maybe some benchmarks too?
Comment #6
killes@www.drop.org CreditAttribution: killes@www.drop.org commentedLooks good, some benchmakrs would indeed be helpful.
Comment #7
RobRoy CreditAttribution: RobRoy commentedOkay, I have a vocab with 509 terms and hierarchy built with devel. I did some super ghetto benchmarking using -c1 -n10 just to get an idea because running -c10 -n500 on the uncached tree was timing out!
It's pretty apparent that not generating the tree on each request is a huge speedup. I'll leave it to someone else to recommend or do a better benchmark than I have time for atm. I even left the query cache on. This is just of the taxonomy overview page and keep in mind this is on my bogged down laptop, but you should get the idea.
# ab -c1 -n10 http://local.drupalhead3.com/?q=admin/content/taxonomy/1
# HEAD without caching on taxonomy_get_tree()
Requests per second: 0.27 [#/sec] (mean)
Time per request: 3746.387 [ms] (mean)
# HEAD with caching on taxonomy_get_tree()
Requests per second: 4.42 [#/sec] (mean)
Time per request: 226.325 [ms] (mean)
Comment #8
RobRoy CreditAttribution: RobRoy commentedChanged title to sell it more. :P
Comment #9
ChrisKennedy CreditAttribution: ChrisKennedy commentedI created a vocabulary (page type, multiple hierarchy, related terms, no freetagging, multiple select, not required, 0 weight) and added a term, but "list terms" keeps showing "No terms available." and the terms aren't showing up as possible related terms or parent terms when I try to add another term.
If I then revert to HEAD the terms will show up. They are being inserted correctly but apparently the cache check isn't finding them.
Comment #10
RobRoy CreditAttribution: RobRoy commentedOkay, here is a re-roll just in case. The cache is clearing for me...hmmm.
Comment #11
RobRoy CreditAttribution: RobRoy commentedOkay, here is a re-roll just in case. The cache is clearing for me...hmmm.
Comment #12
RobRoy CreditAttribution: RobRoy commented#10 is the one. I was getting a 500 error on d.o so the last one didn't come through.
Comment #13
beginner CreditAttribution: beginner commentedpatch in #11 didn't upload.
If this issue is indeed a duplicate of http://drupal.org/node/83057 , then it is not a feature request but a bug.
Comment #14
beginner CreditAttribution: beginner commentednever mind about patch in #11. I notice a bit late that it's a duplicate post of #10.
Comment #15
RobRoy CreditAttribution: RobRoy commentedOkay, revisited this patch and think I actually got it working this time. I guess I'm a little too late for D6 as this adds an optional parameter to taxonomy_get_tree() but I will leave for the powers that be. But this is a frickin huge speedup, did you see the title? :P
Also...
- Fixed a bug where vid was being passed to taxonomy_form_term() in taxonomy_admin_term_edit() making the parent dropdown not show. Should roll into another patch if this is postponed to D7.
- Some notices.
Comment #16
RobRoy CreditAttribution: RobRoy commentedRe-roll from root folder. Oops.
Comment #17
RobRoy CreditAttribution: RobRoy commentedComplimentary issue at http://drupal.org/node/146688 it seems.
Comment #18
Wim LeersSubscribing.
Comment #19
Zothos CreditAttribution: Zothos commented*Subscribing*
get in in dudes ^^
Comment #20
catchThis needs a re-roll after the split patch went in.
Comment #21
Wim LeersReroll. While this is an API change (an additional boolean $cache parameter in taxonomy_get_tree()), it doesn't break anything. So please don't make this patch wait for D7!
The changes in the admin area of the previous patch are no longer necessary, all of that is already fixed.
This is potentially a huge speed up for taxonomy browsers, and also for the Hierarchical Select module.
Comment #22
Wim LeersForgot the patch...
Comment #23
Wim LeersComment #24
Wim LeersThe previous patch included a change to index.php. Use this one instead.
Comment #25
Wim LeersBetter title.
Comment #26
catchWell this applies cleanly, the only implementation in core that's obvious is the forum inde and that works fine with a couple of levels of hierarchy. Not sure how to go about testing the performance implication to actually get this RTBC though.
The index improvements are now over at: http://drupal.org/node/164532
Comment #27
Wim LeersSee RobRoy's tests in #7?
Comment #28
catchWell in that case it's has had reviews by various people, it's a big boost, and it's a small API change. The main boost within core would be forum_get_forums, which is a very slow query. Patch applies cleanly and doesn't throw up any errors etc.
If the API bit stops it getting into D6, it's a decision for Goba and Dries to make, so bumping to their queue.
Comment #29
Gábor HojtsyI looked through the patch. Comments:
- having a specific cache_clear_all() before a generic one looked like interesting, but I see that the generic only clears cache_page and cache_block, so we need the specific clears still
- removing the array cast from taxonomy_del_term()s taxonomy_get_term() modifies the API for module_invoke_all('taxonomy', 'delete', 'term', $term) called later
- the caches seem to be very fine grained based on vid, parent tid, depth and max depth, which in combination could make up for a huge number of cached entries, which could easily clutter up the generic cache table (I know adding another cache table might be even more API change now)
- what testing is required still to clean up: "// Don't cache tree if empty, may be able to remove this after more testing."?
The benchmark looks impressive for sure, but there were changes in the code since then, so it should be rebenchmarked IMHO. Also I wonder what killes has to say about the current state of the code.
I am marking needs work because of the array cast API change. Others seem to need some discussion, but might not need code change.
Comment #30
Wim LeersComment #31
Wim LeersI walked into this bug again today. I'll reroll with Gábor's comments taken into account.
Comment #32
funana CreditAttribution: funana commentedHi there,
could anybody give me a patch for Drupal 5.7? Or tell me which patch I can use?
Thank you!
Comment #33
aclight CreditAttribution: aclight commentedSubscribing. The inability to reset the static cache in taxonomy_get_tree() makes it rather difficult for the drupal.org testing installation profile to create forums and containers via drupal_execute().
Comment #34
Xen CreditAttribution: Xen commentedThe static cache is also a problem for migration code that automatically creates terms and show a node edit form. If the pathauto module is active, its hook_taxonomy calls taxonomy_get_tree which caches the tree after the first term is added, and the rest doesn't show up in the node form. They do, however, get created.
Comment #35
doq CreditAttribution: doq commentedSubscribing.
Comment #36
lilou CreditAttribution: lilou commentedAdd tag.
Comment #37
JeremyFrench CreditAttribution: JeremyFrench commentedSubscribe.
Comment #38
JeremyFrench CreditAttribution: JeremyFrench commentedI ran into this issue on D6, looking at the D7 code, this now uses drupal_static to declare the variables. So they can be cleared.
Should this ticket be put back to D6 where the issue persists?
Comment #39
catchThis is still a performance issue in HEAD and that was the main part of the patch IMO - maybe open a separate issue for D6 but reference this one for the $reset param.
Comment #40
JeremyFrench CreditAttribution: JeremyFrench commentedI have hacked a D6 taxonomy module to do some caching and cache clears. I may see if I can port this to D7.
Personally I don't think that taxonomy_get_tree can scale as a function, if you have a very large vocabulary it takes an age just to deserialise the data. My preferred approach would be to only allow something like and Iterator, or dom like functions (get parents, get children, get next sibling) for taxonomies and terms, but I think that is probably too much of an API change at this stage.
Thoughts anyone?
Comment #41
catchWe can't change the API, at most tweaks. If there's something which helps some cases, that'd be great. We really need to work on pluggable systems, paging etc. in Drupal 8.
Comment #42
JeremyFrench CreditAttribution: JeremyFrench commentedThis patch might do it. From a look at the code places which edit terms call cache_clear_all(). The unit tests pass locally.
Comment #43
JeremyFrench CreditAttribution: JeremyFrench commentedI have changed the title for clarity. I think the other issues raised in this ticket originally have been split out.
Comment #44
JeremyFrench CreditAttribution: JeremyFrench commentedComment #45
c960657 CreditAttribution: c960657 commented$depth is never -1, because it was incremented three lines above. (BTW, unrelated to this issue, but wouldn't it be more intuitive to set the default value to 0, remove the $depth++, and then instead pass $depth + 1 in the recursive function call?).
You should return $cached_tree->data instead of the entire cache object, but ...
Instead of returning early, why not move the cache_get() into the
if (!isset($children[$vid])) {
block and then always fetch the cached value, even if $max_depth != NULL and $parent != 0. Then just put the cached value into $children[$vid][0] and let the rest of the function do its magic.Missing space after comma.
I suggest using colon as separator before $vid in the cache cid. This pattern is used elsewhere in core.
I'm not sure why you removed the new cache_clear_all('taxonomy_get_tree_' . $vid, ...) calls? cache_clear_all() without arguments only clears cache_page and cache_block.
Comment #46
JeremyFrench CreditAttribution: JeremyFrench commentedThanks for the review, I'll try to get round to fixing. Most of the comments make sense, the only one that I would question is:
This would create a lot of cache keys for vocabs with a deep hierarchy. Many of these cache keys would only be set when taxonomy_get_tree calls itself. So would not give a performance gain.
Where as you get the biggest performance hit when querying the root of a vocabulary. So there would be a bigger cache for very little benefit most of the time.
Ideally you could set a flag to determine weather you are on a recursive call or calling the function from elsewhere and only cache on the non hierarchical calls. But this may require an API change so I went with the simpler option.
Comment #47
c960657 CreditAttribution: c960657 commentedYou can use $depth for that. However, on second thought I don't think my initial thought about always using the cache can be implemented as easy as described, but something can be done:
WRT to $max_depth you could get the cached copy of the whole tree and the prune it in PHP code without hitting the database.
Using the cached copy of the whole tree for $parent != 0 is a bit tougher, because there is no easy way to find the appropriate subtree in the array unless you also save the $parents array to the cache. With this you can find the path of parent tids on the way from the specified $parent to the root and then use that to descend down through the cached tree. I don't know how this will perform.
Comment #48
Mir CreditAttribution: Mir commented#42: taxonomy_get_tree_cache.patch queued for re-testing.
Comment #49
janusman CreditAttribution: janusman commentedCode seems fine, but it really seems to be a solution looking for a problem.
Would anyone care to comment why this is important? =)
Comment #50
JeremyFrench CreditAttribution: JeremyFrench commentedWith very large taxonomies, taxonomy_get_tree can become a bottleneck on quite a few pages. I think some of the other changes in D7 improve this, by not calling it so much.
It may be worth testing a D7 install with a 100,000 term vocabulary, to see how it performs.
Comment #51
slantview CreditAttribution: slantview commentedSubscribing.
Comment #52
Gerhard Killesreiter CreditAttribution: Gerhard Killesreiter commentedsetting to "needs work" since the concerns expressed in #45 haven't been addressed.
Comment #53
phantomvish CreditAttribution: phantomvish commented+1 subscribing
Comment #54
JeremyFrench CreditAttribution: JeremyFrench commentednow that taxonomy_get_tree() memory issues is done, is there a genrall feeling as to whether this is needed any more.
Comment #55
JeremyFrench CreditAttribution: JeremyFrench commented#42: taxonomy_get_tree_cache.patch queued for re-testing.
Comment #57
vin247 CreditAttribution: vin247 commented#42: taxonomy_get_tree_cache.patch queued for re-testing.
Comment #59
catchMoving this to D8.
Comment #60
xjmCrossposting #1207326: Refactor taxonomy hierarchy API for performance, consistency, and convenience.
Comment #61
bailey86 CreditAttribution: bailey86 commentedtaxonomy.tree.patch queued for re-testing.
Comment #62
bailey86 CreditAttribution: bailey86 commentedSo far this seems to be working for me:
Comment #63
xjmThere isn't a current patch in this issue.
Comment #64
bailey86 CreditAttribution: bailey86 commentedBTW - I'm not (yet) familiar with patching - so if someone could create a patch that would be great.
When I've created a patch for a project I'm working on I'll come back to this and create a patch if needed.
Comment #65
xjmRegarding #64, check out:
http://drupal.org/patch
http://drupal.org/patch/create
It's not too complicated, I promise. :)
Comment #66
jibranComment #67
andypostThis is a tricky one.
Suppose we need to store only tree structure because whole entities are entity cache territory.
otoh any static cache affect "statability" of the controller/service...
So maybe better to abstract "tree-handling" into separate service?
We already have one after https://www.drupal.org/node/2226481
This should allow contib to implement a site-specific handling and caching?
PS: I know that @amateescu have a Tree module ported for d8 so glad to get his opinion on that
Comment #68
amateescu CreditAttribution: amateescu commentedMy opinion is that I will just do it in contrib and we can see later about moving it to core :)