Problem/Motivation
I have mapped several External Entities that are currently referenced by Node entities. One Node type in particular has 7 external entity reference fields.
When I make edits to the configuration of these entities, I get the following warnings on save:
external_entities.external_entity_type.bill_type:data_aggregator.config.storage_clients.0.config.create missing schema,
external_entities.external_entity_type.bill_type:data_aggregator.config.storage_clients.0.config.read missing schema,
external_entities.external_entity_type.bill_type:data_aggregator.config.storage_clients.0.config.update missing schema,
external_entities.external_entity_type.bill_type:data_aggregator.config.storage_clients.0.config.delete missing schema,
external_entities.external_entity_type.bill_type:data_aggregator.config.storage_clients.0.config.list missing schema,
external_entities.external_entity_type.bill_type:data_aggregator.config.storage_clients.0.config.count missing schema,
external_entities.external_entity_type.bill_type:data_aggregator.config.storage_clients.0.config.connection missing schema,
external_entities.external_entity_type.bill_type:data_aggregator.config.storage_clients.0.config.placeholders missing schema,
external_entities.external_entity_type.bill_type:data_aggregator.config.storage_clients.0.config.filter_mappings missing schema,
external_entities.external_entity_type.bill_type:data_aggregator.config.storage_clients.0.config.queries missing schema,
external_entities.external_entity_type.bill_type:data_aggregator.config.storage_clients.0.config.debug missing schema
These warnings appear to be related to the fields on the entity configuration form.
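To make the pattern in these warnings easier to see, here is a minimal Python sketch that groups the warning lines by config object and parent path. It only parses the text pasted above; the helper name `group_missing_schema` is hypothetical and is not part of Drupal or the External Entities module.

```python
import re
from collections import defaultdict

# Warning text copied from the report above (trailing commas included).
WARNINGS = """\
external_entities.external_entity_type.bill_type:data_aggregator.config.storage_clients.0.config.create missing schema,
external_entities.external_entity_type.bill_type:data_aggregator.config.storage_clients.0.config.read missing schema,
external_entities.external_entity_type.bill_type:data_aggregator.config.storage_clients.0.config.update missing schema,
external_entities.external_entity_type.bill_type:data_aggregator.config.storage_clients.0.config.delete missing schema,
external_entities.external_entity_type.bill_type:data_aggregator.config.storage_clients.0.config.list missing schema,
external_entities.external_entity_type.bill_type:data_aggregator.config.storage_clients.0.config.count missing schema,
external_entities.external_entity_type.bill_type:data_aggregator.config.storage_clients.0.config.connection missing schema,
external_entities.external_entity_type.bill_type:data_aggregator.config.storage_clients.0.config.placeholders missing schema,
external_entities.external_entity_type.bill_type:data_aggregator.config.storage_clients.0.config.filter_mappings missing schema,
external_entities.external_entity_type.bill_type:data_aggregator.config.storage_clients.0.config.queries missing schema,
external_entities.external_entity_type.bill_type:data_aggregator.config.storage_clients.0.config.debug missing schema
"""

# Each warning reads "<config name>:<property path> missing schema".
PATTERN = re.compile(r"^(?P<config>[^:\s]+):(?P<path>\S+) missing schema,?$")


def group_missing_schema(text):
    """Map (config name, parent path) to the list of keys that lack schema."""
    grouped = defaultdict(list)
    for line in text.splitlines():
        match = PATTERN.match(line.strip())
        if match is None:
            continue  # skip anything that is not a "missing schema" warning
        # Split the property path into its parent and the final key.
        parent, _, key = match.group("path").rpartition(".")
        grouped[(match.group("config"), parent)].append(key)
    return dict(grouped)


if __name__ == "__main__":
    for (config, parent), keys in group_missing_schema(WARNINGS).items():
        print(config)
        print(f"  {parent}: {', '.join(keys)}")
```

Run against the text above, every warning lands in a single group: all 11 keys sit under `data_aggregator.config.storage_clients.0.config` of the `bill_type` external entity type, which suggests the gap is in the schema for that one storage client's configuration rather than in the entity type as a whole.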
A second, performance-related issue affects the Node entity with the 7 mapped external entity reference fields: a custom Drush command that creates or updates nodes of this type is extremely slow. My other custom Drush commands, which handle node types with only one or two external entity references, run much more quickly.
For comparison, the commands that create Node entities with one or two external entity references take a matter of minutes to create ~3,200, ~7,700, and ~14,000 Node entities, while the Drush command that creates ~9,200 Node entities with 7 external entity reference fields takes hours. Unfortunately, this is not viable for my use case.
I don’t know if these two issues are related. If you think they perhaps are not, let me know and I’ll split them into different issues.
Comments
Comment #2
lisa.rae commented
Comment #3
lisa.rae commented
Comment #4
lisa.rae commented
A followup: Because of the significant performance hit, I’ve switched a number of the External Entities over to Taxonomy Vocabularies and modified my import scripts.
The performance sweet spot seems to be around 2-3 entity reference links to an external entity on a content entity. Once I reduced the number of external entity reference fields to 2-3, I got a significant performance increase in the import scripts, which will need to run as frequently as hourly to keep data updated, with a full data refresh occurring weekly.
I am leaving about 20 external entities total in the project, so the overall experience was a bit of a time saver, and I learned a ton about how this module works. I would definitely use it again, for specific use cases.
The performance issue is concerning. It would probably not be noticeable on smaller data stores, but in this case I’ve got 20,000 records in some of the Node tables, and this is not the largest dataset I’ll be using the project with.
Please let me know if you have a chance to look at this and provide any insight. I’m happy to share the codebase I’m developing so that you can really put this module through its paces, as the datasets are available to be used by simply requesting an API key from the provider.
Looking forward to hearing back on this.