Problem/Motivation
RocketChatChannelBlock Build iterates over users and makes unnecessary API calls before rendering Block. This makes its use unscalable and useless for any decent amount of users. It appears to be checking if all the group members are added to the channel and creating them if they aren't prior to allowing the Chat Block to render. The time to add/check users is at the time of user addition to the Group (which I believe the module already does). So this is redundant and adds insane latency to the page as the number of users grow.
In a test on a site with almost 100 users in a group, an authenticated group page that has the Chat Block on it loaded in 9 seconds. Disabling the block allowed the same page to load in 4.5 seconds. And by turning on a proxy cache, we were able to prove that all the iterative getUserProxy calls were completely unnecessary Simply providing the same page, statically cached, allowed the chat to work as expected without all the calls and allowed users to login via OAuth in the iframe without issue and the HTML loaded in only 81ms.
Unless there is a reason I'm missing, as I said, I believe these calls are redundant and serve no purpose but to make it unusable in groups with even moderate numbers of users.
Steps to reproduce
Enable & configure Rocket_Chat and Rocket_chat_group.
Add a group.
Add about 100 users to the group.
Check the load speed of the page.
Add the Chat block to the group page.
Recheck the load speed.
Proposed resolution
Remove all unnecessary API calls and iterations in the Build process for the RocketChatChannelBlock. If the calls are all redundant, this could even probably be reduced to simply returning the iframe markup alone if minimal checks were desired.
Comments
Comment #2
digitalfrontiersmediaUpon another glance at the code, it appears that the getUserProxy iteration only occurs if the user is not a "member" already and our user was. Our tests show that something in this process is definitely slowing things down. So someone will need to do a more detailed trace of what's going on here.
Comment #3
digitalfrontiersmediaTitle changed to reflect symptom since I am now unsure of exact cause. Also changing to support request in light of this.
Comment #4
sysosmaster commented@DigitalFrontiersMedia if you need to mange users in bulk, its better to use the api in a bulk async operation instead of the page load itself. as you can see if you have huge amounts of users and do it in page it will slow down a lot. (basically you are doing a poor mans cron task...)
as to what is slowing it down this is the flow as I remember it:...
Most of these steps involve 1 or multiple calls to the Rocket chat. each with a timeout and possible under rate limit. (with mitigation for rate limits in the code to not hammer the poor server).
There is certainly room for improvement in this code. but we need tighter interactions with the rocket chat server (or possibly a dedicated runner that keeps the Drupal up to date with changes).
I hope this helps your understanding.
Comment #5
digitalfrontiersmediaI understand what you're saying, but if the module is coming out with the rocketchat_group module distributed with it including a chat block, I think the onus is on the distributed code, not on the user recreating the features that come with the distributed module in a way that works correctly.
The source causes should be identified and the source code fixed.
I have New Relic running on an instance of this and New Relic logged 52 API calls to the Rocket.Chat server in one page request that had the chat block enabled. If I find time or get further information on the specifics that could direct the corrective measures necessary, I will come back to share them.
Not sure if this should be changed to Task, Bug Report, or what, but I don't think it should be closed/forgotten.
Since the latency builds as you add more users, on Pantheon (and many others) if you have an upper max_execution_time for a PHP script run, you could end up with a 504 or timeout once you reach a couple hundred users. That would basically make the module unusable.
So it's a performance problem, but one that could turn into a critical bug in many situations.
Thanks for getting back to me with an explanation.
Comment #6
sysosmaster commentedI agree... so this is a task now.
(the Groups integration was created for OpenSocial and paid for by OpenSocial. its currently in its minimal viable state)
what we need:
Comment #7
florisg commentedThey never payed.
Anyone taking this up, it's a great improvement?