Problem/Motivation
setLastInstalledFieldStorageDefinition first reads the cache, then adds the new field storage definition then writes the database table and deletes the cache. However, another process might have read the old data and sets it into the cache after the cache delete causing the next import to lose the freshly set data. The fields which do not have their field storage definition stored like will cause a fatal if used in entity query.
Steps to reproduce
Way too hard. But people can check whether they are affected by running the following with drush -- this does not cover base fields:
use Drupal\field\Entity\FieldStorageConfig;
use Drupal\Core\Entity\EntityLastInstalledSchemaRepositoryInterface;
foreach (array_keys(\Drupal::entityTypeManager()->getDefinitions()) as $entity_type_id) {
$config_prefix = "field.storage.$entity_type_id.";
$config_names = \Drupal::service('config.storage')->listAll($config_prefix);
$config_field_names = array_map(fn($x) => str_replace($config_prefix, '', $x), $config_names);
$installed_fields = \Drupal::keyValue('entity.definitions.installed')->get("$entity_type_id.field_storage_definitions", []);
if ($not_installed_field_names = array_diff($config_field_names, array_keys($installed_fields))) {
printf("$entity_type_id is missing %s\n", implode(', ', $not_installed_field_names));
}
}
Proposed resolution
Make setLastInstalledFieldStorageDefinition read the key-value table instead of the cache.
Remaining tasks
Should we add an update to add the missing field storages? Smartsheet over the years accumulated over a dozen. Continuing the script from the reproduce section:
use Drupal\field\Entity\FieldStorageConfig;
use Drupal\Core\Entity\EntityLastInstalledSchemaRepositoryInterface;
$repository = \Drupal::service('entity.last_installed_schema.repository');
assert($repository instanceof EntityLastInstalledSchemaRepositoryInterface);
foreach ($not_installed_field_names as $field_name) {
$field_storage_definition = FieldStorageConfig::loadByName($entity_type_id, $field_name);
$repository->setLastInstalledFieldStorageDefinition($field_storage_definition);
}
| Comment | File | Size | Author |
|---|---|---|---|
| #33 | 3363732-33.patch | 10.92 KB | andypost |
| #33 | interdiff.txt | 1.61 KB | andypost |
| #31 | 3363732_31.patch | 10.68 KB | ghost of drupal past |
Issue fork drupal-3363732
Show commands
Start within a Git clone of the project using the version control instructions.
Or, if you do not have SSH keys set up on git.drupalcode.org:
Comments
Comment #2
ghost of drupal pastComment #3
ghost of drupal pastComment #4
ghost of drupal pastComment #5
ghost of drupal pastComment #6
ghost of drupal pastComment #7
ghost of drupal pastComment #8
bramdriesenQuick issue review. The issue description is mentioning the
setter, but the patch is changing thegetter.We should also target 11.x and backport to 10.x
This PHPDoc is not complete. Missing an function description (first line)
Return type mixed? The PHPDoc only shows returning an array of storage definitions.
Comment #9
ghost of drupal pastFixed the phpdoc and the return type.
The patch factors out the uncached getter from the current getter and makes the setter use it instead of the current getter.
Comment #10
bramdriesenThanks Charlie! Patch looks better now. Will leave it for NR so one of the core committers also can pick in.
Comment #11
smustgrave commentedCan we get a test case showing the bug
Thanks.
Comment #12
bramdriesenI guess, if it's such a race condition it might be difficult 😅
Comment #13
ghost of drupal pastCan we get a test case showing the bug
Sorry but no. I do not know how to write a meaningful test case for this. You need multiple processes running in parallel at a very precise point in time.
To recap
Comment #14
ghost of drupal pastFollowing the example of #3312001: Race condition with automatic deploy steps on ConfigImporter I added more comments and I am removing the needs test tag. The enormous difficulty of writing one vs the trivial change here just not worth it.
Comment #15
ghost of drupal pastExcept I put the comment before the wrong line... out of two didn't manage to hit the right one. SIgh.
I am noting there's still a race chance here with two field storages being installed at the very same time: field1 reads, fields2 reads, field1 writes, field2 writes but this window is really really tiny against a data structure rarely written. We can decide to protect against this one as well by taking out a lock.
Comment #16
larowlanShould we be adding a lock here to make this multi process safe?
Comment #17
ghost of drupal pastMaybe, it's just much messier core process wise to pass in a new argument especially because this being a bug this will need to be backported all the way to 9.x which has a deprecated parameter pass already and this race is much less likely than the one originally described in the issue. The original race merely requires an ordinary request to race with a field install one, this is between two field installs. But technically, to be super correct, yes. Note this lock doesn't mean the original race is solved with it -- we do not want to add even observing the lock to get, that's an unnecessary slowdown when the original race is solved with this rather simple change to set. Yes, set becomes marginally slower but who cares about set speed.
Comment #18
ghost of drupal pastI really should read my patches before I upload the, not after :)
Comment #19
bramdriesenShould be
lockBackend?Comment #20
ghost of drupal pastIt totally should be, that's what phpstan was complaining about too.
Comment #21
andypostPlease use type for the new property
New argument should be optional as serialized container will carry old definition before running updates
Comment #22
bramdriesenJust add $lock to the PHPDoc with the proper type and description and it should pass the first stage :-)
Comment #23
ghost of drupal pastY'all understand this drives contributors away, right?
This has negative value. It adds absolutely nothing, the interface already tells you what it is and it makes the class harder to read. I'd rather add a phpcs bypass than this.
But here it is, hoping for the day when we can stop cluttering our patches with this. Is there a policy issue already not to add doxygen if the type already tells all there is?
Comment #24
ghost of drupal pastComment #25
ghost of drupal pastYou know what, this is not a great patch, this is not a visionary solution.
Why are we using a serialized blob here.
If we were writing one database row per field to key-value , none of this would occur.
We could still cache the information together after a single getAll. There's no discussion on this in #2554235: Make the content entity storage and entity query use the last installed definitions instead of the ones living in code.
What do you think.
Comment #26
larowlanFyi #3238192: Allow omitting @var for strictly typed class properties - we're just blocked on governance issues I think, something to do with the TWG not having active members
Comment #27
ghost of drupal pastSomeone should file an issue to replace the meat of getLastInstalledDefinitions() with
because IMO it's much easier to read and only iterates the array once.
But anyways.
What about the attached patch. Needs a healthy dose of an update function to unserialize, of course. And maybe still a lock this time in setLastInstalledFieldStorageDefinitions but I am not so worried about that one, all hell already breaks lose if a racing process starts just in time for an entity type to exist but EntityTypeListener has not yet called setLastInstalledFieldStorageDefinitions because then the cache will store empty. I am not even sure that's in scope for this issue.
Comment #28
ghost of drupal pastQuickfix.
Comment #30
ghost of drupal pastOh yeah, update tests are unhappy.
Comment #31
ghost of drupal pastThe first failure is from #2560237: UpdatePathTestBase saves the root user before updates have run but fixing that I found drawing the update page calls hook_theme which calls views_theme which requires a working entity field API. Then there are broken tests which presume entity field API is working, one such is #3364503: BlockContentRemoveConstraint presumes a working entity field API I provided something that makes that one pass.
I really can't emphasize this enough: as far as I am able to tell, all the failures above are not from this patch, it just sheds light on tests which have unfounded trust in the entity field API working with an unupdated database.
Comment #32
ghost of drupal pastWhoever wants to bother with the misery core development became can continue, I am unsubscribing from this issue and until #3364512: [Meta] Run tests with phpcs fail I will submit the absolute minimum possible to make my issues go away and immediately unsubscribe from everything. This process is no longer tenable.
Comment #33
andypostHere's fix
Comment #37
kumudbIn Drupal version 11.x, I have successfully tested patch #33 in my local environment, and it is functioning as expected. However, I believe that my review may not be sufficient to ensure all aspects are covered. Therefore, I recommend moving this patch to the "Needs Review" status so that it can receive further evaluation from the community.
Thank you!