i have this site set up with Image.module, image gallery, taxonomy and tagadelic.

Active tags work OK, but Popular tags show unpopular tags!
whatever i try, the tags are always the same (single occurring)

what can i be doing wrong?

Comments

wmn’s picture

Category: support » bug
batbug2’s picture

have the same issue, trying to fugure out how these tags are selected

wmn’s picture

the tags have changed by now though, but they still aren't so popular
there are way more "popular" tags that have never showed up here.

arhak’s picture

subscribing

batbug2’s picture

As far as I found out, popular tags are taken from another vocabulary, not from the one they are attached to.

cassus’s picture

Same issue here, please fix it!

cassus’s picture

I suggest replacing a line in active_tags_popular_callback.

orig:
$result = db_query_range(db_rewrite_sql("SELECT t.tid, t.name FROM {term_data} t WHERE t.vid = %d GROUP BY t.name ORDER BY COUNT(t.name) DESC", 't', 'tid'), $vid, 0, variable_get('active_tags_popular_no_'. $vid, 10));

new:
$result = db_query_range(db_rewrite_sql("SELECT t.tid, t.name FROM {term_data} t INNER JOIN {term_node} tn ON t.tid = tn.tid WHERE t.vid = %d GROUP BY t.tid ORDER BY COUNT(tn.nid) DESC", 't', 'tid'), $vid, 0, variable_get('active_tags_popular_no_'. $vid, 10));

I've tested it, this one gets the popular tags at my site!

What do you think?

mafitzpatrick’s picture

cassus: Your fix is correct. Currently the module simply gets terms from the term table - which should be unique entries (i.e. there should not be more than one with the same value) making the COUNT(*) redundant. The data for how many times a term is used is in the term_node table (one entry per use).

Can this fix be applied to the module? Currently it's broken.

dragonwize’s picture

Title: Popular tags are not popular » Popular tags are not popular & SQL group by issues on pgSQL and MySQL 4

Merging #316401: SQL group by issues, pgSQL and MySQL 4 with this issue because they affect the same line of code and will be easier to fix with one patch and more people testing.

dragonwize’s picture

Testing cassus' suggestion in #7 and playing around with this issue in general, there is still one major issue not fixed:

The count of terms includes revisions. So that if you have the below situation the popularity of terms is still not correct.

Use case: You have a site with
1 node tagged with term #1 and 5 revisions
and
4 nodes with term #2 and no revisions on any of the 4 nodes

In this situation term #1 is pulled as more popular as it is counted as being used 5 times versus term #2 is only counted as being used 4 times. Obviously term #2 is more popular but it doesn't come through with current efforts.

Any ideas on the best SQL to pull this and be cross DB compat?

dragonwize’s picture

Status: Active » Needs review
StatusFileSize
new1.13 KB

Ok, its not pretty and I am not sure how performant it would be on a system with a lot of terms and nodes but this is the best way I see.

SELECT t.tid, t.name 
FROM term_data t 
INNER JOIN term_node tn ON t.tid = tn.tid 
RIGHT JOIN node n ON tn.vid = n.vid 
WHERE t.vid = %d 
GROUP BY t.tid
ORDER BY COUNT(tn.tid) DESC

If I can get some reviews on this, especially from MySQL 4 and PostgreSQL people, I will get this committed and released.

dragonwize’s picture

oops. Wrong join left in there.

SELECT t.tid, t.name
FROM term_data t
INNER JOIN term_node tn ON t.tid = tn.tid
INNER JOIN node n ON tn.vid = n.vid
WHERE t.vid = %d
GROUP BY t.tid
ORDER BY COUNT(tn.tid) DESC
darrenmothersele’s picture

This is the query used by the suggested terms module...

SELECT 
  t.name AS name, 
  COUNT(t.name) AS total 
FROM {term_data} t 
INNER JOIN {term_node} n ON (t.tid = n.tid) 
WHERE t.vid = %d 
GROUP BY t.name 
ORDER BY total desc

It doesn't look like it supports versioning either.

So +1 for your version unless we can come up with something more performant.

dragonwize’s picture

Well, we are only using 2 inner joins which isn't very bad in the scheme of things. Using views you can easily do a normal view with many joins and it still runs well enough for most sites.

We could also make use of caching easily to speed things up if there are performance issues because the top most popular terms are not going to be changing every second.

dragonwize’s picture

Looking back at some of the DB compat comments and SQL standards, have a small fix for the group by.

SELECT t.tid, t.name
FROM term_data t
INNER JOIN term_node tn ON t.tid = tn.tid
INNER JOIN node n ON tn.vid = n.vid
WHERE t.vid = %d
GROUP BY t.tid, t.name 
ORDER BY COUNT(tn.tid) DESC
sgriffin’s picture

#15 still throws a user warning.
user warning: Invalid use of group function query: SELECT t.tid, t.name FROM term_data t INNER JOIN term_node tn ON t.tid = tn.tid INNER JOIN node n ON tn.vid = n.vid WHERE t.vid = 2 GROUP BY t.tid, t.name ORDER BY COUNT(tn.tid) DESC LIMIT 0, 10

dragonwize’s picture

Status: Needs review » Needs work

sgriffin: In what DB?

sgriffin’s picture

MySQL database 4.1.22

dragonwize’s picture

ok, seems MySQL 4 doesn't like using group by aggregates with hidden fields.

sgriffin: can you test this please?

SELECT t.tid, t.name, COUNT(tn.tid) AS term_count
FROM term_data t
INNER JOIN term_node tn ON t.tid = tn.tid
INNER JOIN node n ON tn.vid = n.vid
WHERE t.vid = %d
GROUP BY t.tid, t.name
ORDER BY term_count DESC
dragonwize’s picture

Status: Needs work » Needs review
Issue tags: +PostgreSQL

would help to set the proper status.

If there are any Postgres users listening, I would love to hear from you on this.

sgriffin’s picture

Appears you nailed it and they appear on node edits.
Good work.

dragonwize’s picture

Great.

Seeing that it may be hard to get someone to verify this on Postgres(PG) and that the previous issue report (#316401: SQL group by issues, pgSQL and MySQL 4) on PG was fixed by including all not aggregate fields in the group by clause as defined by SQL standards which this patch has included, I would like to propose that we commit this patch.

Are there any objections? Anyone else want to review this?

darrenmUK, what are thoughts?

darrenmothersele’s picture

As you said, the pgsql bug was fixed by making sure all fields were in the Group by, or an aggregate function. I have no way to test, but I would think this was ready to commit.

dragonwize’s picture

Status: Needs review » Fixed

Committed. Released in 1.4.

Status: Fixed » Closed (fixed)
Issue tags: -PostgreSQL

Automatically closed -- issue fixed for 2 weeks with no activity.