Problem/Motivation
Drupal 8.8 introduced an API (see #2991683) to manage configuration transformation by allowing event subscribers to edit configuration before import or export. It works by loading the new configuration in a storage object and emitting an event to allow modules to modify the config before import. See ImportStorageTransformer.php.
Importing configuration can be a costly effort in terms of number of database queries, but usually there are no configuration differences, resulting in fast configuration management commands like config:status and config:import. However, since the new API was introduced, the complete site configuration is written to the database for each invocation of the config:status or config:import command, even if there are no changes.
In optimistic scenario's where the amount of configuration is small or the database is optimized for write speed (instead of data consistency) this has a minor but still noticeable impact on config sync operations. In other scenario's, even for sites with a small amount of configuration, config:import takes more than 10 times longer to complete compared to Drupal before the new API was introduced.
The cause of this performance regression is using a database storage backend to temporarily store the complete site configuration. The problem is aggravated by not using transactions causing databases optimized for data consistency (like InnoDB with innodb_flush_log_at_trx_commit set to a value lower than 2, which is the default for MySQL and MariaDB) to consider each write query a separate transaction.
Steps to reproduce
The problem can easily be observed by comparing the time the complete config:status or config:import before and after Drupal 8.8. Another way to observe the difference is by manually changing the backend used in ImportStorageTransformer. In the transform() method (see attached patch), change:
$mutable = new DatabaseStorage($this->connection, 'config_import');
into:
$mutable = new MemoryStorage()
The smallest improvement I've seen with this change is 20% (for example, the operation taking 7 instead of 9 seconds), the largest improvement was 95%:
DatabaseStorage:
$ time drush config:status
[notice] No differences between DB and sync directory.
real 1m0.115s
user 0m2.469s
sys 0m0.605s
MemoryStorage:
$ time drush config:status
[notice] No differences between DB and sync directory.
real 0m2.608s
user 0m1.954s
sys 0m0.360s
Proposed resolution
An obvious solution would be to use the MemoryStorage instead of DatabaseStorage. The comment in ImportStorageTransformer however reads:
We use a database storage to reduce the memory requirement.
Which would mean this problem requires a different solution. We could wrap the write queries in a single transaction:
$transaction = $this->connection->startTransaction();
// Copy the sync configuration to the created mutable storage.
self::replaceStorageContents($storage, $mutable);
$this->connection->popTransaction($transaction->name());
In my test cases, wrapping the queries in a transaction results in almost the same performance gain as using the MemoryStorage. This might be different in other setups.
I managed to find one comment where the use of DatabaseStorage instead of MemoryStorage was discussed. The comment mentions "concerns with memory usage". I can't talk for the many people using Drupal in different ways, but in my situation - a Drupal site not very big, not small either - with 7MB of yaml config, I do not notice any difference in peak memory consumption.
Remaining tasks
- Discuss: should the transformer use a memory or database backend?
- Should the backend perhaps be configurable?
- If database storage is to stay: wrap in transaction
I'd also like to point out the StorageCopyTrait::replaceStorageContents() method has a similar issue. It is used in two places:
- ExportStorageManager: export would be faster if a MemoryStorage or FileStorage is used
- ConfigSnapshotSubscriber: creating a full config snapshot in the database after each import (when at least one item is changed) - could someone explain the purpose? Why would someone want this? Snapshotting would also benefit from being wrapped in a transaction.
Changing the ExportStorageManager to use the memory storage or implement transactions seems like a good idea. Doing the same for the snapshot subscriber might defeat its purpose. Though making it optional might be a valid use-case. I consider those separate issues.
User interface changes
None.
API changes
None, the API itself is fine.
Data model changes
None.
Issue fork drupal-3217783
Show commands
Start within a Git clone of the project using the version control instructions.
Or, if you do not have SSH keys set up on git.drupalcode.org:
Comments
Comment #2
jsst commentedPatch for alternative solution using transactions (patch #1 and #2 are mutually exclusive).
Comment #4
longwaveIf MemoryStorage is suitable for all but extremely large sites, it looks like we could inject the storage to the transformer, default to MemoryStorage and sites that want to opt out can switch it for DatabaseStorage instead?
Comment #5
idebr commentedComment #7
bircherIn my original patch I was using a memory storage. But I never did any benchmark to see what the overhead would be in comparison.
If we were to use a memory storage we would also have to remove the check for the persistent config importer lock and verify that the config importer uses the same memory storage across the batch import. (I think it does but we should verify that)
I also opened #3223976: Use StorageComparer in StorageCopyTrait which solves this also for many cases. But they are not duplicate issues because they both improve performance in their own right and are compatible with each other. (Of course if we use the memory storage then the StorageComparer would only apply when writing the files to disk on export with drush, or with modules such as Config Filter 2.x that also copy storages around.)
That said, I am also ok with using db transactions.
Comment #8
nigelcunningham commentedI wasn't able to get this patch to work for me - regardless of what I tried, I was still getting the same time taken.
I have developed an alternate solution, along the lines of the storage comparer but much simpler. I don't believe the complexity of the storage comparer is needed, and am also not convinced the heuristic in it will always give the right approach.
Instead of deleting of all the contents of the target collection, I am always only applying the changes to the target collection. This saves both a delete and insert for every unchanged file, which I anticipate being 90+% of the files in most cases. In testing on a staging server for a client that was previously timing out when seeking to load /admin/config/development/configuration, the page load time reduced to approximately 3 seconds - the same amount of time drush cim --diff takes. (The web page previously took much longer than drush cim, even with the patch mentioned in the next paragraph).
One additional change was also required in ImportStorageTransformer.php, where a lock is taken and not released. The /admin/config/development/configuration page causes the method to be invoked twice, so without the lock release it deadlocks against itself, causing a good portion of the timeout issues mentioned above.
If it is agreed that this is a better solution, I'll replace the code in PR.
Comment #9
nigelcunningham commentedVersion that applies against Drupal 9.3
Comment #10
bircherGreat idea, I spun that off in its own issue #3232494: Optimise StorageCopyTrait for slow write operations, that way it will not get into the discussion about database transactions and locks etc.
Maybe we can close #3223976: Use StorageComparer in StorageCopyTrait then.
Comment #11
alexpottUsing transactions is a great idea and so is #3223976: Use StorageComparer in StorageCopyTrait.
The merge request here is RTBC.
Comment #13
xjm9.2.x is security-only, so all bug fixes should be targeted for 9.3.x. (This was an oversight that is partly my fault in the automated issue update above. Oops.)
Comment #14
catchLet's add a comment explaining why we're using a transaction here.
Comment #15
trogie commentedVery interested in this for a 'big' Drupal Commerce site: lots of product (variation) types with lots of fields and product attributes with an external PIM system. So we also have to ignore lots of fields and config. Doing a cim after deploy takes +-20min...
Comment #25
prudloff commentedComments have been added.
Comment #26
smustgrave commentedLeft a small comment on the MR
If you are another contributor eager to jump in, please allow the previous poster at least 48 hours to respond to feedback first, so they have the opportunity to finish what they started!
Comment #27
prudloff commentedComment #28
smustgrave commentedFeedback appears to be addressed, not sure about test coverage.
Comment #29
catchThis isn't possible to test.
We should open a follow-up to look at whether the same trick could help in other situations, one I can think of might be cache garbage collection, which runs a delete against potentially a lot of rows.
Comment #30
godotislateNot a blocker, but a couple questions about this:
ExportStorageManager, shouldn't it also be forImportStorageTransformer?Comment #31
catchYes that also looks out of scope, might be better to revert those changes if the answer to the last two bullet points is 'yes' and open a follow-up. I think it's fine to do property promotion/remove constructor docs when already changing a constructor, but otherwise best handled in dedicated issues.
Comment #32
alexpottI think the constructor changes here are in scope because we promote $connection to a property so we are changing the constructor. Imo the ImportStorageTransformer is not in scope because we're not changing them.
Comment #33
godotislateOh, right, I missed that bit. Makes sense.
Comment #35
catchCommitted/pushed to 11.x, thanks!
Comment #37
nigelcunningham commentedWow. Great to see this merged. Thanks all!
Comment #38
kim.pepperI had to look into why
unset($transaction)works and it's because we have a\Drupal\Core\Database\Transaction::__destruct()method that calls$this->transactionManager()->unpile()