This is an awful query. Content Profile support absolutely kills performance until this is optimized.. CONCATs in subqueries?

mysql> EXPLAIN SELECT realname, u.name, u.uid, u.mail, g.is_active, g.is_admin FROM users u LEFT JOIN (SELECT uid, is_active, is_admin FROM og_uid WHERE nid=41996 AND is_active) g ON u.uid = g.uid LEFT JOIN (SELECT uid, CONCAT(field_last_name_value, ", ", field_first_name_value) realname FROM content_type_profile JOIN node n USING (vid) WHERE n.type = "profile") r ON u.uid = r.uid WHERE u.uid  ORDER BY  realname ASC LIMIT 0, 50;
+----+-------------+----------------------+--------+-----------------------------------+-----------+---------+--------------------------------+-------+----------------------------------------------+
| id | select_type | table                | type   | possible_keys                     | key       | key_len | ref                            | rows  | Extra                                        |
+----+-------------+----------------------+--------+-----------------------------------+-----------+---------+--------------------------------+-------+----------------------------------------------+
|  1 | PRIMARY     | u                    | ALL    | NULL                              | NULL      | NULL    | NULL                           | 42637 | Using where; Using temporary; Using filesort | 
|  1 | PRIMARY     | <derived2>           | ALL    | NULL                              | NULL      | NULL    | NULL                           |   897 |                                              | 
|  1 | PRIMARY     | <derived3>           | ALL    | NULL                              | NULL      | NULL    | NULL                           | 37003 |                                              | 
|  3 | DERIVED     | n                    | ref    | vid,node_type                     | node_type | 14      |                                | 15096 | Using where                                  | 
|  3 | DERIVED     | content_type_profile | eq_ref | PRIMARY,vid                       | PRIMARY   | 4       | mydb.n.vid |     1 |                                              | 
|  2 | DERIVED     | og_uid               | ref    | PRIMARY,nid_is_active_uid,nid_uid | nid_uid   | 4       |                                |   896 | Using where                                  | 
+----+-------------+----------------------+--------+-----------------------------------+-----------+---------+--------------------------------+-------+----------------------------------------------+

Comments

Andy Inman’s picture

Category: bug » feature
Priority: Normal » Minor

Please feel free to offer an improved query and/or patch.

Josh Benner’s picture

I don't have the time on this project to try to refactor the module code, especially since it eschews significant swaths of Drupal best practice and convention. However, this might help:

On a client's site, groupadmin was generating this query (formatting added):

SELECT realname, u.name, u.uid, u.mail, g.is_active, g.is_admin
FROM users u 
JOIN (
	SELECT uid, is_active, is_admin 
	FROM og_uid 
	WHERE nid=27782 AND is_active != 0
	) g ON u.uid = g.uid 
LEFT JOIN (
	SELECT uid, CONCAT(firstname, " ", realname) realname 
	FROM (
		SELECT uid, value as firstname 
		FROM profile_values WHERE fid = 1
		) _a 
	JOIN (
		SELECT uid, value as realname 
		FROM profile_values WHERE fid = 2
		) _b USING (uid)
	) r ON u.uid = r.uid 
WHERE u.uid 
ORDER BY realname ASC 
LIMIT 0, 10;

We have thousands of users, and hundreds of groups. This query took 6.5 seconds on a powerful machine -- and even longer (20+ seconds) on test resources.

This is the query, rewritten to be performant:

SELECT
	TRIM(CONCAT(fn.value, ' ', ln.value)) AS realname,
	u.name, u.uid, u.mail, ou.is_active, ou.is_admin
FROM users u
INNER JOIN og_uid ou ON (ou.uid = u.uid)
LEFT JOIN profile_values fn ON (fn.uid = u.uid AND fn.fid = 1)
LEFT JOIN profile_values ln ON (ln.uid = u.uid AND ln.fid = 2)
WHERE ou.nid = 27782
ORDER BY ln.value ASC, fn.value ASC, u.name ASC
LIMIT 0, 10;

This query is roughly 2,700 times faster, clocking at around 2.4 milliseconds on the same machine and data that yielded the 6.5 second benchmark for the previous query.

AaronBauman’s picture

Category: feature » bug
Priority: Minor » Major

A 20-second page load is a major UX bug.

Andy Inman’s picture

Josh, please explain how the module "eschews significant swaths of Drupal best practice and convention". In case you're referring to lack of Views support, that is discussed here.

Ok, so it's clear that the generated query is seriously inefficient when Content Profile is in use. The function _groupadmin_cp_getquery() in groupadmin_cp.module is probably all that needs to be changed. That function modifies the basic query used by GA to add support for name fields held in CP. It was a first-pass at providing the required basic functionality, never optimised. I have GA in use in a number of sites, both my own and clients, and this issue doesn't affect me, either because data sizes are small or because CP isn't used. So, by all means let fix it, but "I don't have time" is as convenient an excuse for me as anybody else.

For now, I'll add a warning on the project page that performance is an issue with CP support and large datasets.

martinwjones’s picture

Has there been any progress, patch or workaround on this.
I have just timed a groupadmin page taking almost two minutes to load (118 seconds)
Our users didn't think it was working and filed a bug report. I discovered it was just taking a loooong time to respond.
Unfortunately I am new to Drupal so I don't understand how all the hooks and queries are implemented.
For the record my site has about 10,000 users and will be hosting hundreds of projects.

AaronBauman’s picture

Now that og in D7 is fields-based, i bet this module could be replaced by some clever views and maybe views bulk operations.

At any rate, this module is clearly no longer maintained.

Andy Inman’s picture

martinwjones, please confirm that you're using the Content Profile add-on? Could you try testing without it just to confirm that is the issue?

Josh's suggestion in #2 looks good and I'm willing to implement it if some people are available to do testing.

martinwjones’s picture

Hi netgenius

Apologies for not replying before now. I didn't get a notification that you had updated the thread.
We are not using the Content Profile add-on, we are using the "Drupal Core" Profile module in conjunction with the OG Group Admin Profile module.
If I disable the Group Admin Profile the results are retrieved in seconds so it would appear that Josh's suggestion is a good one.

We would be happy to do some testing.

Many thanks
Marty

Andy Inman’s picture

Hi Martin, so I guess you didn't try modifying the code? When I said I was willing to implement the change I was hoping that somebody else would try changing the code to use the suggested query (#2) - that query doesn't include searching by name/email etc. so that logic would need to be added. The relevant code is in groupadmin_profile.module function _groupadmin_profile_getquery().