Fieldable entities (users, taxonomies, nodes and files) iterate through their fields for each row, issuing a separate query to the source database for each field of each record. If an entity type has 10,000 entries and 10 fields, that results in 100,000 queries to the source server just to gather the data required to save the entities. There is a HUGE amount of overhead in establishing these requests and running these queries, which makes large migrations very difficult or impossible.
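To make the arithmetic concrete, here is a minimal sketch of the per-field query pattern against a toy in-memory SQLite database. The schema (an `entity` table plus one `field_<name>` table per field) and the function names are assumptions made up for illustration; they are not the actual migration code or the real source schema.

```python
import sqlite3


def build_source(num_entities, field_names):
    """Create a toy source database: one base table plus one table per field."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE entity (id INTEGER PRIMARY KEY)")
    conn.executemany(
        "INSERT INTO entity (id) VALUES (?)",
        [(i,) for i in range(1, num_entities + 1)],
    )
    for name in field_names:
        # Field names are trusted identifiers here, so interpolating them is fine.
        conn.execute(
            f"CREATE TABLE field_{name} (entity_id INTEGER, delta INTEGER, value TEXT)"
        )
        conn.executemany(
            f"INSERT INTO field_{name} VALUES (?, 0, ?)",
            [(i, f"{name}-{i}") for i in range(1, num_entities + 1)],
        )
    return conn


def load_per_field(conn, field_names):
    """Load every entity with one query per field per row.

    Returns (rows, query_count) so the cost of the pattern is visible.
    """
    ids = [r[0] for r in conn.execute("SELECT id FROM entity ORDER BY id")]
    queries = 1  # the initial base query
    rows = []
    for entity_id in ids:
        row = {"id": entity_id}
        for name in field_names:
            cur = conn.execute(
                f"SELECT value FROM field_{name} WHERE entity_id = ? ORDER BY delta",
                (entity_id,),
            )
            row[name] = [v[0] for v in cur]
            queries += 1
        rows.append(row)
    return rows, queries
```

With 100 entities and 10 fields this issues 1 + 100 × 10 queries; scaling to 10,000 entities gives the 100,000 field queries described above.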

Ideally, all the data for an entity's single-value fields should be gathered within the initial data query. Multiple-value fields would still need separate queries, since joining them would return the entity's row once per value and produce duplicates.

The attached patch attempts to do just that. It gathers the fields, determines which ones are single-value fields (by validating the data rather than trusting the configuration value), then joins those tables onto the initial source query. Any multiple-value fields are gathered later using the existing per-field way of getting field data.
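The approach described above can be sketched roughly as follows. This is not the patch itself: it is a simplified Python illustration against a hypothetical SQLite schema (an `entity` table and one `field_<name>` table per field with `entity_id`, `delta` and `value` columns), and all function names are invented for the example.

```python
import sqlite3


def make_demo_db():
    """Build a tiny source database with one single-value and one multi-value field."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE entity (id INTEGER PRIMARY KEY)")
    conn.executemany("INSERT INTO entity VALUES (?)", [(1,), (2,), (3,)])
    for name in ("title", "tags"):
        conn.execute(
            f"CREATE TABLE field_{name} (entity_id INTEGER, delta INTEGER, value TEXT)"
        )
    conn.executemany("INSERT INTO field_title VALUES (?, ?, ?)",
                     [(1, 0, "t1"), (2, 0, "t2")])
    conn.executemany("INSERT INTO field_tags VALUES (?, ?, ?)",
                     [(2, 0, "a"), (2, 1, "b"), (3, 0, "c")])
    return conn


def single_value_fields(conn, field_names):
    """Return the fields where no entity has more than one stored value.

    This checks the data itself instead of trusting a configured cardinality.
    """
    single = []
    for name in field_names:
        cur = conn.execute(
            f"SELECT 1 FROM field_{name} GROUP BY entity_id "
            "HAVING COUNT(*) > 1 LIMIT 1"
        )
        if cur.fetchone() is None:
            single.append(name)
    return single


def load_joined(conn, field_names):
    """Join single-value fields onto the base query; query the rest per row."""
    single = single_value_fields(conn, field_names)
    multi = [n for n in field_names if n not in single]

    select = ["e.id"] + [f"f_{n}.value" for n in single]
    joins = [
        f"LEFT JOIN field_{n} AS f_{n} ON f_{n}.entity_id = e.id" for n in single
    ]
    sql = f"SELECT {', '.join(select)} FROM entity AS e {' '.join(joins)} ORDER BY e.id"

    rows = []
    for record in conn.execute(sql).fetchall():
        row = {"id": record[0]}
        row.update(zip(single, record[1:]))
        # Multi-value fields keep the original one-query-per-field behaviour.
        for n in multi:
            cur = conn.execute(
                f"SELECT value FROM field_{n} WHERE entity_id = ? ORDER BY delta",
                (record[0],),
            )
            row[n] = [v[0] for v in cur]
        rows.append(row)
    return rows
```

In this sketch `title` is detected as single-value and joined, while `tags` is still fetched separately, so each entity appears exactly once in the result. A `LEFT JOIN` is used so that entities with no value for a field are not dropped.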

I've lost the profile data to show the performance improvement, but it was significant: on the order of 30%-40% faster. If you are attempting large-scale data migrations, this will help speed things up a lot.

Comments

netw3rker created an issue.

netw3rker

Status: Active » Needs review
File size: 9.06 KB

Updated patch to have proper file paths.

netw3rker

Found a small bug in the way multiple value fields were being detected. Here's a re-roll of the patch with the fix.