It runs it's own query just including fields from the taxonomy_term_data table, instead of just loading all the tids and passing them to taxonomy_term_load_multiple(). This has been causing problems with the [term:parent:path] token I've implemented in token.module since entity_uri expects $term->vocabulary_machine_name to exist.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

Dave Reid’s picture

Status: Active » Needs review
FileSize
4.54 KB

Patch for review that:
1. Changes taxonomy_get_parents to use a simple non-DBTNG query since we get automatic things like taxonomy access and translatable queries with the entity load system used in taxonomy_term_load_multiple().
2. Converts the [term:url] token to use entity_uri() to demonstrate the problem without the changes to taxonomy_get_parents
3. Adds some prevention to prevent token errors if a term has no parents. Added tests assertions for this.

Dave Reid’s picture

Issue tags: +token

Change affects token/pathauto tests for D7, so adding a tag.

Dave Reid’s picture

Issue tags: +pathauto

Status: Needs review » Needs work

The last submitted patch, 870528-taxonomy-get-parents-D7.patch, failed testing.

Dave Reid’s picture

Status: Needs work » Needs review
FileSize
4.51 KB

Status: Needs review » Needs work

The last submitted patch, 870528-taxonomy-get-parents-D7.patch, failed testing.

mr.baileys’s picture

It seems that the current version of taxonomy_get_parents does not actually cache the results of the query (tids is never assigned a value after the initial $tids = &drupal_static(__FUNCTION__, array());). The version in this patch *does* cache the results.

The test fail seems to be caused by taxonomy_term_delete (called by taxonomy_vocabulary_delete) relying on taxonomy_get_parents to get an accurate count of parents for a term during mass deletion (terms with multiple parents are not deleted). After deleting some terms, taxonomy_get_parents starts to return stale data (parent terms that were deleted still appear in the returned array), causing the taxonomy_term_delete skip deletion of terms that would otherwise be deleted.

Since the current implementation of taxonomy_get_parents() does not actually cache the values, may be we should leave that as it is and fix the cacheing in a seperate issue?

+++ modules/taxonomy/taxonomy.module	1 Aug 2010 21:25:16 -0000
@@ -702,34 +702,21 @@ function taxonomy_vocabulary_get_names()
  * Find all parents of a given term ID.

Since we're tinkering with the documentation anyway, this should become "Finds all parents..."

+++ modules/taxonomy/taxonomy.module	1 Aug 2010 21:25:16 -0000
@@ -702,34 +702,21 @@ function taxonomy_vocabulary_get_names()
+ * @return

Needs to be preceded by a blank line.

Powered by Dreditor.

Damien Tournoud’s picture

This query has to use the query builder and add the term_access tag, as entity_load_multiple() doesn't do any access control, it's not its job.

Dave Reid’s picture

Then what is this doing?

class TaxonomyTermController extends DrupalDefaultEntityController {

  protected function buildQuery($ids, $conditions = array(), $revision_id = FALSE) {
    $query = parent::buildQuery($ids, $conditions, $revision_id);
    $query->addTag('translatable');
    $query->addTag('term_access');
    // When name is passed as a condition use LIKE.
    if (isset($conditions['name'])) {
      $query_conditions = &$query->conditions();
      foreach ($query_conditions as $key => $condition) {
        if ($condition['field'] == 'base.name') {
          $query_conditions[$key]['operator'] = 'LIKE';
          $query_conditions[$key]['value'] = db_like($query_conditions[$key]['value']);
        }
      }
    }
    // Add the machine name field from the {taxonomy_vocabulary} table.
    $query->innerJoin('taxonomy_vocabulary', 'v', 'base.vid = v.vid');
    $query->addField('v', 'machine_name', 'vocabulary_machine_name');
    return $query;
  }
Damien Tournoud’s picture

That should not be there.

Dave Reid’s picture

Is there an issue where it's being fixed/removed? We'd still need to add the term_access but not the translatable tag to the parents query since we're only resulting with rids right?

Dave Reid’s picture

Status: Needs work » Needs review
FileSize
4 KB

Revised patch that clears taxonomy cache after each term is deleted, which fixes the taxonomy vocabulary test since that cache was broken as pointed out by mikey_p (and I think we need to fix). Also converts the query back to DBTNG so it can use term_access tag when taxonomy entity loading is cleaned up.

Dave Reid’s picture

BTW if you want to blame the cache not working in taxonomy_get_parents() was introduced in #142051: Static cache for taxonomy_get_parents and taxonomy_get_children. :)

Dave Reid’s picture

Title: taxonomy_get_parents() does not return a full term object » taxonomy_get_parents(), taxonomy_get_children(), and taxonomy_get_tree() do not return a full term objects
FileSize
8.17 KB

Also fixes taxonomy_get_children() and taxonomy_get_tree() since it's the same problem there as well. This is frustrating when non-term objects are returned.

Dave Reid’s picture

Darnit I kenw that would happen.

Status: Needs review » Needs work

The last submitted patch, 870528-taxonomy-get-parents-children-tree-D7.patch, failed testing.

Dave Reid’s picture

Status: Needs work » Needs review
FileSize
7.59 KB

Found the bug with my changes overwriting the $parent parameter is taxonomy_get_tree(). This should get the green light.

Dave Reid’s picture

So summary of the patch in #17:
- Resets the taxonomy term static caches after every term is deleted in taxonomy_term_delete(), to ensure we always have fresh information since functions like taxonomy_get_children() and taxonomy_get_parents() are used in term deletion.
- Converts taxonomy_get_parents(), taxonomy_get_children() and taxonomy_get_tree() to use as simple of a query as possible (usually fetch just the single 'tid' column), which is then passed into taxonomy_term_load_multiple() so that we always have full term objects loaded through the entity system.
- Converts the [term:url] token to use entity_uri() which exposes a the original bug if [term:parent:url] is used. Tests added to back this up.

moshe weitzman’s picture

Logic and code look very nice.

$query->orderBy('t.weight'); gets added twice in successive lines. Probably a copy/paste snafu.

Unrelated to the patch, taxonomy_terms_static_reset() is an abomination. we clear 7 statics when a term is added/edited/deleted? Thats excessive caching folks. Its bound to cause subtle freshness problems.

Dave Reid’s picture

Thanks moshe. Fixed the copy/paste error and also removed an obsolete line of code in taxonomy_get_tree() since $term->parent is no longer even defined thanks to the query.

I agree we need to make taxonomy_terms_static_reset() smarter somehow (in a separate issue) or if it's something we can figure out before D8. :/

Dave Reid’s picture

moshe weitzman’s picture

Status: Needs review » Reviewed & tested by the community

It is RTBC for me.

Dave Reid’s picture

It helps when I upload the right patch... still RTBC if bot confirms.

Dries’s picture

Status: Reviewed & tested by the community » Fixed

Looked at the surrounding code, and this makes sense. Committed to CVS HEAD.

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

mh86’s picture

Status: Closed (fixed) » Active

I was testing the taxonomy system and this patch broke the multi-parent support when using taxonomy_get_tree (see term overview with a simple multi-parent vocabulary), it worked in alpha6 before this patch was committed.

further, using taxonomy_term_load_multiple really slows down taxonomy_get_tree (an already slow function). this can be easily reproduced by generating a taxonomy with around 5000 terms, where just the loading of terms with taxonomy_term_load_multiple in taxonomy_get_tree will take 2-3seconds (at least on my notebook). without this call, the fetching of terms from the database is very fast.

mh86’s picture

Status: Active » Needs review
FileSize
2.75 KB

The uploaded patch adds a test case, which verifies that multi-parents are working in the term overview (uses taxonomy_get_tree).

Further I reverted the taxonomy_get_tree function, therefore removed the full entity loading. As I pointed out in my previous comment, the taxonomy_get_tree really gets slower when adding the entity loading and when I take a look at the taxonomy module, the taxonomy_get_tree function is usually used to build forms and overviews of terms, tasks where a full entity is not needed. I understand that having the full object can be useful, but in this case it means a lot of overhead and actually breaks multi-parents.

The patch doesn't affect taxonomy_get_parents and taxonomy_get_children, as they normally do not load that many terms.

Dave Reid’s picture

Not providing the full object is also broken. We should be able to fix it somehow, but I'm pretty sure we'd want to still use the entity loader since the function is designed to provide the full term objects.

fago’s picture

>Not providing the full object is also broken.

Agreed. We need to fix the multi-parents, but also the performance of taxonomy_get_tree() is critical as it is used in quite some taxonomy functions. Best we deal with that in separate issues.

mh86’s picture

Status: Needs review » Fixed

added a separate issue with the bug report: #899258: Multi parents broken

mh86’s picture

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

catch’s picture

Priority: Normal » Critical
Status: Closed (fixed) » Active

This should never have gone in, and was committed within five days of being opened without either of the taxonomy maintainers being contacted.

It has made #556842: taxonomy_get_tree() memory issues, many, many times worse, and that issue was already major. I just saw a form validation completely blow up trying to validate a vocabulary with 3,500 terms via taxonomy_allowed_values(). 3,500 terms is nothing.

Re-opening this as a critical bug to roll back the taxonomy_get_tree() hunks. If that's not an option then #556842: taxonomy_get_tree() memory issues needs to be promoted to a release blocker, but at the moment even small-ish sites are going to hit the PHP memory limit much, much quicker than they ever did in D6, making this a serious regression from something that was already very bad in the first place.

moshe weitzman’s picture

I'm OK with rollback if thats the desire of the taxo maintainers. It is a small patch, and I doubt much depends on it. Hopefully bot will flag any problems.

webchick’s picture

Well. The problem is it was in there when we announced beta1, so my guess is contrib is actively using this. (for example, almost assuredly pathauto and token since they're tagged here) So IMO we need to revert, but keep some sort of a replacement. Is that replacement separate API functions, or an extra "return object = TRUE" argument to these functions? Not sure. Let's discuss.

Dave Reid’s picture

The worst offender is taxonomy_get_tree() which I think can be selectively rolled back. The other functions (taxonomy_get_parents and taxonomy_get_children) are being used and do need to return full objects and should stay that way. Rolling back those would cause the additional test case in core taxonomy token tests to fail.

catch’s picture

Status: Active » Fixed

taxonomy_get_parents() is fine. taxonomy_get_children() is fine most of the time, if you had a vocabulary where one term had thousands of children that would also blow up, but it'd be an edge case.

I took another look at #556842: taxonomy_get_tree() memory issues and the patch was a lot more ready than I thought, re-rolled and Moshe has RTBCed, that definitely gets us back to where D7 was in August, the extra optimization may be improvement on D6 too. There are rough times at http://drupal.org/node/556842#comment-3464092 but no memory numbers. For now I think it makes sense to close this one again, and bump that to critical. If it gets bogged down then we'll need to go for the straight revert though.

Status: Fixed » Closed (fixed)
Issue tags: -token, -pathauto

Automatically closed -- issue fixed for 2 weeks with no activity.