Problem/Motivation

We noticed lots of external requests being made by the Mailchimp API when loading a user's account page. It appears to stem from mailchimp_lists_field_formatter_view(), which calls mailchimp_get_list() (subsequently calling mailchimp_get_lists()). This process seems to be backwards; shouldn't the field formatter call mailchimp_get_lists($list_id) to load just the one requested?

Instead, it appears that even when a single list_id is specified, all lists are queried, and all their interests are also retrieved. Since these items are loaded from the MC API every time, for sites with multiple MC lists and multiple interests per list, this results in lots of MC API queries being made simply when viewing a user account.

The lists are filtered out at the end of the function, but only after lots of external requests to the MC API.

Proposed resolution

1. (Easiest) Call $mc_lists->getList() directly from the field formatter.
2. (Hardest) Refactor some of the get_list() functionality so that not all lists (or interests) are retrieved when not requested.

Bonus:
1. Consider checking if interest groups are configured for the field before retrieving them
2. Consider caching interest data
3. Consider caching each list in its own cache entry, instead of one big cache of "lists"

Remaining tasks

Discussion and review of options.

User interface changes

None.

API changes

Ideally, mailchimp_get_lists() would respect that only a single $list_id was requested.

Data model changes

None.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

torgosPizza created an issue. See original summary.

torgosPizza’s picture

Attached is a patch which helped us with a quick-and-dirty fix, since we don't necessarily need to get and (re-)cache all of the data for every list every time a single user account is viewed by node_view().

The main issue I have going this route fully is that it doesn't address the actual root issues (the way all lists are retrieved regardless of which ones are requested by the API call) but perhaps this should be the way forward since it's assumed that any instance of a Mailchimp field will be associated with just one list.

torgosPizza’s picture

Here is another patch which skips interest categories that are hidden in MailChimp.

This caused significant performance improvements for us, since we have a bunch of admin-only categories that we use for customer acquisition stuff, but which they'll never see, so it doesn't make sense to retrieve them with every mailchimp_get_lists() call.

Also, I think the mailchimp lists should be CACHE_PERMANENT across the board. (I added a patch which makes the list cache permanent as an example.) Since there's a hook_flush_caches, I think it's safe to instruct users of the module that, anytime they make major changes to their MC lists, they'll want to clear the caches so the changes are reflected.

I think these changes would help performance a lot; having to ping the MC API every single time the caches are cleared and we need to load one list (which results in looping through all lists, loading all their interest categories, and then the interests) is causing a huge performance bottleneck.

mstrelan’s picture

This is quite problematic. We have a domain access site with 31 domains, each with their own mailchimp list, but all on the same mailchimp account. There is a signup block in the footer of every page for each domain. Frequently when loading the block it will attempt to re-download all data about all lists including all interest groups and all mergevars, which can take around 5 minutes. This seems to be cleared out quite frequently. If multiple domains try at the same time to load the block we get the following error:

An error occurred requesting list information from MailChimp. "429: Too Many Requests - You have exceeded the limit of 10 simultaneous connections."

It seems it would be best to use a separate cache id per list, eg lists::bb49720d9f. This loads far quicker, but still very slow. It seems like it would be better to use something less volatile than cache tables. Any ideas?

mstrelan’s picture

My workaround in a custom module. Note that I am not using any of the sub-modules except for mailchimp_signup.

<?php
/**
 * Implements hook_module_implements_alter().
 */
function MYMODULE_module_implements_alter(&$implementations, $hook) {
  // Don't flush mailchimp cache!
  if ($hook == 'flush_caches') {
    unset($implementations['mailchimp']);
  }
}

/**
 * Implements hook_cron().
 */
function MYMODULE_cron() {
  // Refresh the mailchimp cache daily.
  if ($object = cache_get('lists', 'cache_mailchimp')) {
    if ($object->created < (REQUEST_TIME - 86400)) {
      mailchimp_get_lists(array(), TRUE);
    }
  }
}
?>
helmo’s picture

I applied the patches from #3 last week and it seems to have improved things on our staging site ... thanks.

ruscoe’s picture

Assigned: Unassigned » ruscoe
ruscoe’s picture

Here's a patch that refactors mailchimp_get_list() so it no longer calls mailchimp_get_lists(). It didn't make sense to load all lists and interest groups then throw most of them away.

Loading interest groups is now optional and won't happen if a signup form isn't going to use them. Lists are also cached individually if loaded that way now.

  • ruscoe committed 24eb273 on 7.x-4.x authored by torgosPizza
    Issue #2839768 by torgosPizza: Performance problems with...
  • ruscoe committed 859896d on 7.x-4.x authored by torgosPizza
    Issue #2839768 by torgosPizza: Performance problems with...
  • ruscoe committed 91896ba on 7.x-4.x
    Issue #2839768 by torgosPizza: Performance problems with...
ruscoe’s picture

Status: Needs review » Fixed

Patches committed! Thank you, @torgosPizza. I'm seeing much better performance in local testing; hopefully this also looks good on your side.

MegaChriz’s picture

Here is another patch which skips interest categories that are hidden in MailChimp.

This caused significant performance improvements for us, since we have a bunch of admin-only categories that we use for customer acquisition stuff, but which they'll never see, so it doesn't make sense to retrieve them with every mailchimp_get_lists() call.

Crap, this broke my custom module that called mailchimp_get_list() to retrieve all interest categories, including the hidden. See #1808704-30: Ability to put a Member into a MailChimp Group with a Rule Action.

torgosPizza’s picture

D'oh! Sorry about the unintended consequence. Perhaps we should allow users the option of whether a list should be loaded with all interests or just visible ones? Or if there's another, better solution, we should discuss some ideas.

Another idea I had that could work would be another function that loads "visible" interest groups for things like user-facing subscription forms, and another function (or an API function option) that loads them all. That way different use cases can still get the data they need and it isn't an all-or-nothing approach.

MegaChriz’s picture

@torgosPizza
I fixed the issue in my custom module now with making an extra API call (see 7a556dd and f126e67). The main reason why I used mailchimp_get_list() first, was because the data got cached. So when there is a taste in whether or not hidden groups should be loaded, you would need two cached versions.

torgosPizza’s picture

Agree that things should be cached a bit more. If you have a proposed way to make things happen more easily in this module, so that you don't have to hack your code, I would humbly suggest making a patch for all of us to review! :)

MegaChriz’s picture

@torgosPizza
I agree that I should make a patch. It would probably need an extra (optional) parameter to the mailchimp_get_list() function that is called $include_hidden_interests_groups.

ruscoe’s picture

Status: Fixed » Needs work

Changing this back to needs work so the issue in #11 can be fixed.

hanoii’s picture

This is happening on 8.x too!

I also saw #2870632: Large lists crash the cache rebuild & miss accounts with more than 500 lists which has interesting proposals. I think we should move all discussions to one issue and plan/decide on a fix and then work on implementing it. I think making some of what mailchimp_get_lists() does configurable could help. Not all sites need interest groups and not all sites need merge vars. That could help initially.

Then moving some/all of this logic to the cron should help. I think the other issue have good suggestion, so maybe close this one in favor of the other one?

dobe’s picture

I needed what was brought up in #15. I want to see the hidden interests.

Honestly not sure why we even hide them on this side of the fence as it seems like more of a front end issue then a backend one.

I also separated out the interest calls into there own functions and cache them. Not really what needed to happen but it seems cleaner it also gives better control to use the module calls to separate an ajax based form, calling them only when we need to. I think caching them will increase cache calls but in the end will likely improve performance. I use redis cache on most my installs so it may affect mysql differently.

Let me know.
-Jesse

dobe’s picture

Assigned: ruscoe » Unassigned
Status: Needs work » Needs review
Perignon’s picture

After a day of debugging, I found the root cause of my marketing workflow to stop working for 9 months and it is the commits in this issue. You must provide a way to access the hidden interest groups. I use hidden interest groups as a way to trigger funnel marketing automation in Mailchimp. The groups are my funnels.

Perignon’s picture

Version: 7.x-4.x-dev » 7.x-5.2
Priority: Major » Critical

Updating the applicability of this.

Perignon’s picture

I marked this issue as critical because the changes here were breaking changes.

I re-rolled the patch and made a PR on Github: https://github.com/thinkshout/mailchimp/pull/239

Perignon’s picture

There is a bug in the patch of #18. I fixed it in my PR.

Perignon’s picture

Status: Needs review » Reviewed & tested by the community

I have this patch running in my production site and I ran through a quick positive and negative test.

rjacobsen0’s picture

Rerolled the reroll from @dobe (comment #18). Changed the order of parameters so that the new parameter is the last one and it must have a default value (which it did). If these are not true then it is a breaking change.

rjacobsen0’s picture

Status: Reviewed & tested by the community » Needs work

The last submitted patch, 26: mailchimp-get-lists-performance-2839768-25-reroll.patch, failed testing. View results
- codesniffer_fixes.patch Interdiff of automated coding standards fixes only.

rjacobsen0’s picture

A fix for the failing tests (works on my local).

  • rjacobsen0 committed b3f4976 on 7.x-5.x authored by dobe
    Issue #2839768 by rjacobsen0, torgosPizza, ruscoe, dobe, Perignon,...
rjacobsen0’s picture

Status: Needs work » Fixed

Committing for the next release.

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.