Spin-off from #1609760-62: hook_image_style_*() is not invoked for image styles upon Image module installation

Problem

  • Module APIs being invoked during a config import operation may not save the to be imported configuration (at all), and as such, cannot be trusted.

Goal

  • Make the import mechanism save every Config object during an import, regardless of whether is has been handled by the owning module callback already.

Proposed solution

  1. Invoke the MODULE_config_import_*() callback for a Config object to allow the owning module to take over the saving/deleting.
  2. Overwrite the Config object saved by the module API with the exact data from the source storage (or respectively, enforce its deletion).

Details

  • - Unknown -

Notes

(Discussion between @beejeebus and @sun)

  • There has been an original assumption that when we import new config, we write out the values from the import tree exactly as they are.
  • But now that we are actively involving the module API, it appears wrong to some of us to overrule the module API (which owns the config object).
  • The provided examples about setting defaults and stuff do make a point. They are not necessarily correct, but they are cases we need to consider.
  • Taking this "feature" into consideration, one could (hypothetically) write a new view config file with just, say, it's "name" property, and assume that Views will fill in a bazillion default fields for me. And there's not really something wrong with that.
  • The original assumption was that such a config would be rejected by the import mechanism. Specifically with regard to scenarios, in which you push from dev/test to production. In that scenario, you export config and push them. And you do not expect to see differences.
  • Technically, that is already the case. The module API just saves what it gets/imports (i.e., idempotence in ->load()->save()). So you do not see a difference.
  • However, if we'd just simply overwrite what the module API wrote before, then we apparently would exactly not show you a difference.
  • If you'd want to see/catch any errors/differences during/after import, then we'd rather have to invoke the module API, and after doing so, reload what has been written, and compare that with the config that was supposed to be written originally.
  • So, on one side, we have this idea of "what goes in from an import is what goes out" vs "modules may augment imported config instead of rejecting it". The idea of setting defaults etc seems to be very useful. That would be a real feature, but it does complicate the assumptions around the system.
  • The idea of strictly saving what's supposed to be imported is well understood; many of us assumed that for a long time. But it does not necessarily make sense to enforce this rule at all times (specifically with regard to installing a module's default config).
  • It might be possible to add a flag to the import mechanism to conditionally enable that post-module-import-callback-validation for "true" imports from exported config files -- although it's not clear whether that would help.
  • You do not really care for whether the default config ends up 1:1 in the system after installation. You only care about that in the situation in which you import from exported files (or more generally, the passive store). In that case, we can make all sorts of assumptions about the target system.
  • So we could add another argument to config_import_invoke_owner(), which has a reasonable default, but which essentially enables that post-import-validation after invoking the module API by comparing the effective result with the expected result, and somehow, erroring out on a mismatch. That, in order to validate "what gets written is the same as what goes in."
  • Actually, the module callback might have to decide on its own whether the config to be imported shows essential differences, so as to make a clear difference between error conditions and minor diversions.
  • In any case, the module code has no idea about the system it is being installed on. It could be a fresh install, could d.o., whatever. So, when importing module config at install time, it makes sense that what gets written may differ, because many modules augment others, or add something to each view or whatever. So you can't look at the set of module default config files and say "ok, that is what is in the active store".
  • However, this is also the reality of having to deal with a multi-dimensional module scenario. Because you do not know what the target system/environment is.
  • In turn, the question pretty much boils down to: In a staging scenario, you can and should reasonably expect identical import results if your target system/environment is 100% identical to the previous dev/stage environment. And, there is no reason why the import would yield different results if everything is identical.
  • You can only get differences if the target environment is not the same. And this is where the conflict with original assumptions appears.
  • The original assumption is that this does not feel right. The system should do what it is supposed to do, the obvious thing. In natural language, "import this config", just that.
  • However, if you want to make the system to do exactly what you think it should do, then you just let it do its thing. If, for whatever reason, that turns out to not yield the result you were intending, then you better know by seeing differences, instead of failing + overwriting silently. That's a valuable point.
  • At the point you're writing the exact data that lives in files, you're eliminating the potential difference. So we should probably blow up upon detecting a difference. Although that begs the question of "how".
  • There are two options: 1) Just overwrite what the module API might have written. 2) Detect a diff and blow up.
  • But it really boils down to trusting the module API, or not. If we do not trust it, then we must always write, and we don't care if what is written doesn't match what was exported. If we trust it, then we trust what it writes.
  • The original assumption also contained: "Look, I exported that tree, that's what prod looks like." + We're building something to move config between environments. But now, we're building something that allows you to take some config from anywhere, try to import it.
  • If you want to see what prod looks like, just export it. That's always going to be 1:1 export. But if you want import that tree of config, then you need a system/environment that is 100% identical to the one it was exported. If that is the case, no differences.
  • Only if the target system/environment is NOT identical, you naturally get differences.
  • So essentially, an exported tree from a dev/staging environment is just a "guide" to what will be written to your prod system.
  • And yes, we're going to run into epic/monster bugs with the export/import mechanism, since we didn't even think a minute about module update paths and version conflicts. But luckily, that's a topic for code freeze.
  • It probably wouldn't be just a guide, if there was an overall system compatibility validation operation happening before the import -- e.g., comparing enabled modules, their versions, Drupal version, PHP version, PHP extensions, etc.pp. - whereas that overall system validation would be an attempt to validate 80-99% system equality before the import.
  • And that's pretty much the essence: If you make the assumption that the source and target systems are 100% identical, then no module API will yield any differences in what is being saved. It is probably safe to make that assumption.
  • But in light of that, an overall system validation as a first step before trying to import exported config might make sense, too. That overall system validation might help us to resolve conflict scenarios later -- e.g., if you push new code + new config at the same time, the system would be able to tell you that you first need to run update.php, before you can execute the import.
  • As a result, the only disagreement is whether to trust module APIs to write changes or not. It appears to be a minor detail, but apparently questions the entire concept.

Counter arguments

  • Overwriting the config object after the module API saved it will potentially overwrite differences. The effective result of what is being saved after being passed through a module API can depend on other factors in the system/environment. Doing so will only hide errors and effective differences.

    For example, a module API may reasonably decide to inject its current API version into each configurable thingie's object that is saved through its API. Other reasonable examples would be:

    • Ensuring default values for all properties.
    • Injecting system/environment-specific values. (e.g., auto-generating the system.cron:key in case none is defined yet)
    • Validating references to other resources in the system, and possibly adjusting a status/enabled flag depending on their existence in the target system.
    • ...
  • Making the overwrite less rigid by writing the data in $new_config does not account for the fact that there is no hard requirement for the module API to use $new_config as-is/literally. In other words:
    1. There is no guarantee that the module API saves the $new_config object. This breaks the assumption:
        $config = config($new_config->get());
        $config->save();
      
    2. There is no guarantee that the module API uses the $new_config object. There is no requirement for the module API to be based on the literal Config object as is. A module may use a completely different way for managing its dynamic configuration thingies. Have a look at Image module, whose image styles happen to be plain arrays:
        $style = $new_config->get();
        return image_style_save($style);
      

    This essentially repeats #132 and #136 from #1447686-132: Allow importing and synchronizing configuration when files are updated

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

sun’s picture

Issue summary: View changes

Updated issue summary.

sun’s picture

Issue summary: View changes

Updated issue summary.

Anonymous’s picture

so, i posted a comment over at https://drupal.org/node/1609760#comment-6193010

had a long discussion with sun and chx about #70.

to try to summarise:

  • my concern in #70 was based on an assumption that during an import (not module enable) we would write the same data to the active store as what was in an exported config file
  • i'm alone in this assumption. i can live with that, so i'll withdraw my objections to the code that was committed in this issue
  • the disagreement between sun and chx on this issue is whether or not to trust that modules will write out config. i don't have a strong opinion either way, so i suggest if the two of them can't work it out, let's invoke the heyrocker hammer. doesn't feel like a big issue to me
Anonymous’s picture

Issue summary: View changes

Updated issue summary.

sun’s picture

Status: Active » Needs review
FileSize
1.8 KB

A dead simple test assertion that demonstrates what a module API might reasonably do.

This is just one of many possible examples.

- Ensuring default values for all properties.

- Injecting system/environment-specific values. (e.g., auto-generating the system.cron:key in case none is defined yet)

- Validating references to other resources in the system, and possibly adjusting a status/enabled flag depending on their existence in the target system.

- etc.

chx’s picture

Status: Needs review » Closed (won't fix)

You will not derail the critical discussion into a normal followup. No way. This stays in the other issue. That said, good that I save via $new_config->get() so that all those changes persist.

sun’s picture

Status: Closed (won't fix) » Needs review

1) There is no guarantee that the module API saves the $new_config object. This breaks your assumption:

  $config = config($new_config->get());
  $config->save();

2) There is no guarantee that the module API uses the $new_config object. There is no requirement for the module API to be based on the literal Config object. A module may use a completely different way for managing its dynamic configuration thingies. Have a look at Image module, whose image styles are plain arrays:

  $style = $new_config->get();
  return image_style_save($style);

EDIT: This essentially repeats #132 and #136 from #1447686-132: Allow importing and synchronizing configuration when files are updated

chx’s picture

Status: Needs review » Closed (won't fix)

Next you reopen I will delete the node. The other issue is the one where the discussion happens and you will not add another issue to the already impossible to follow pile of CMI issues.

chx’s picture

Issue summary: View changes

Updated issue summary.

gdd’s picture

Priority: Normal » Major
Status: Closed (won't fix) » Active
Issue tags: +Needs issue summary update

Reopening and escalating to major after discussion in #1609760: hook_image_style_*() is not invoked for image styles upon Image module installation. There's some context there that might want to be wrapped into this issue summary as well.

sun’s picture

Issue summary: View changes

Updated issue summary.

sun’s picture

Issue summary: View changes

Updated issue summary.

sun’s picture

Issue summary: View changes

Updated issue summary.

sun’s picture

Issue summary: View changes

Updated issue summary.

sun’s picture

Patch is supposed to fail. Test-only one not.

chx’s picture

FileSize
5.59 KB

That's basically a made-up test case that is not supposed to happen. Here's what is supposed to happen. It's hard to see why would a module create a new config object with the same name when it already has one.

chx’s picture

FileSize
5.88 KB

Taking a deep breath, he patch here contains the test from #7 and the config.inc simplification from #8 with the module invoke call after the blanket delete / save. I feel the only remaining technical issue is whether we want to move perhaps the delete after?

sun’s picture

Writing/deleting before the module callback breaks the semantics of Config::isNew().

Especially module APIs and hook implementations being invoked by the APIs heavily rely on the value returned by isNew().

chx’s picture

Does it? I have, deliberately left in $target_storage->write($name, $new_config->get()); instead of $new_config->save(); to avoid changing the value of isNew. Edit: you can now say that the meaning is falsified, that there is now a value in the config store which is not acted upon by the module introducing a race-like condition where another process might fail. That's true. But it's not new. An operation like this (for eg module enable) have moments like that already, when we are flushing / rebuilding one structure after the other and they are dependent on each other. A module is already disabled but the field cache is not yet cleared, etc.

chx’s picture

Assigned: Unassigned » gdd

Assigning to heyrocker hoping for a decision.

gdd’s picture

I'd like to see some more discussion from people other than just you and sun about the implications of this. I'll ping some of the active CMI contributors to see if they can get some more data.

merlinofchaos’s picture

After some discussion with chx via IRC, this patch needs a validate step that can prevent the write.

I'm still concerned about double writes, but I want it to be explicit that the module callback will be allowed to write the data, even if this causes a change or simply writes the data existing. The reason is that I don't want to be required to write two separate code paths, because ultimately changes in a $view should all be written out via $view->save() in my opinion, and the forced write here makes it *look* like that will be a bad option. A double write must be explictly okay and even then I'm still a little concerned.

sun’s picture

I'm not exactly sure what kind of validation step @merlinofchaos thinks of, but an extra (+ optional) validation step — instead of the enforced write — was also the idea that came up in the discussion (see OP).

That would make much more sense to me, as it essentially means that e.g. Views API can do whatever it wants with $config » $view->save(), and the import mechanism only optionally allows to validate that input/output is identical.

For example, the import of default module config does not necessarily have to be 100% identical. And whether exported configuration that is imported has to be 100% identical is most likely also specific to config import use-cases. And even for those cases that might assume validation to be required by default, I could imagine a global setting of its own that allows to override that decision.

In other words, let there be an additional, optional validation step, as in attached patch.

Anonymous’s picture

the only thing i can think of that's not 'we're doing this wrong already, who cares?' is - can we please keep the concept that writing during an import cycle may be required to break a dependency cycle?

sun’s picture

please keep the concept that writing during an import cycle may be required to break a dependency cycle?

Can you clarify what you mean?

I guess you're hinting at interdependencies between imported configuration (field instances imported before the bundle or even entity type exists), which would be the same what @yched argued for. The current import mechanism has been stripped down to not care for dependencies at all for now, and the entire discussion has been deferred to #1605460: Implement dependencies and defer process for importing configuration. The module callback design and question of enforced writing does not seem related to that. So far, none of the patches and change proposals here intend to change that. Please correct me where I'm wrong. :)

Further discussion on properly handling dependencies should happen on #1605460: Implement dependencies and defer process for importing configuration -- but that needs to be unblocked first by #1668820: Concept, base class, and interface for Configurables (e.g., node types, image styles, vocabularies, views, etc), so we have some config with dependencies in the first place. ;)

Anonymous’s picture

i thought i saw a proposal to defer writing during module hooks and instead just change content in memory in a config object, and set a dirty flag or something, instead of writing to the active store? guess i misunderstood.

but yeah, the dependency issue is what i was concerned about, so if we have another issue, i'm back to not caring how this lands.

sun’s picture

The actual config import functions are in, so let's see whether #15 reveals any mismatches.

Status: Needs review » Needs work

The last submitted patch, drupal8.config-import-validate-write.19.patch, failed testing.

sun’s picture

Status: Needs work » Needs review

The test failures are all caused by config files that are manually written to FileStorage in ConfigImportTest.

Perhaps, the default value for $validate_identical_writes in config_import() should not be TRUE. Strict equality validation still appears use-case-specific to me. Not sure.

sun’s picture

However, if this is what it takes to make everyone happy in terms of strict equality validation for the config import mechanism, then I will happily adjust, complete, and harden the tests for the new expectations.

moshe weitzman’s picture

Anything to do here?

sun’s picture

Wanted to re-roll this, but:

The expectation being added here essentially means that the two (or more) systems between configuration is staged have to be 100% identical in terms of installed extensions and their versions.

Which is closely related to #1677258: Configuration system does not account for API version compatibility between modules

That is, because even the slightest difference between the exported/staged config object and the final/resulting imported config object will lead to a persistent difference that won't vanish.

A typical and "innocent" example I ran into already is the lack of a UUID for an exported config object (see this interdiff). You can import such a config object, and doing so will add a UUID (because it is saved through the module API), but in the end, the imported config object that is written is different from the exported/staged config object.

Which does not mean the exported object is bogus or wrong or invalid. The module API was able to save it and cope with it.

Do we want to enforce this dependency on strict system equality? I'm not sure.

OTOH, we might have to. Since, otherwise, the config import mechanism would persistently show a difference between staging and active config where no "actual" difference is.

However, if we to do that, then I really want to go with the latest patch in here -- which means to diff the staging config against the resulting active config after the import, and bailing out on any difference.

That is, because the former/original idea of simply (over)writing what the module API might have written would totally hide essential differences. Even worse, with the given example above, the resulting, written config object would not have a UUID at all, even though the module API added/wrote one (since the idea of overwriting would destroy any kind of difference to the config object). In turn, every consecutively executed config export/import process would generate a new UUID for the exact same config object.

yched’s picture

Relying on same module versions being present for CMI deploying doesn't seem unreasonable. Support of varying code bases between environment seems like a can of worms I'm not sure I see how we can support.

I'm more worried about different modules being enabled between environments. It's fairly common to have some modules enabled on prod vs dev/staging.
Of course, if "list of enabled modules" gets moved to CMI, then import will sync the set of enabled modules. The only way to get different behavior between environments is settings.php overrides and <? if (config('some_dev_module.settings')->get('enable_behavior')) ?> in code.

However, if we to do that, then I really want to go with the latest patch in here -- which means to diff the staging config against the resulting active config after the import, and bailing out on any difference

I'm not sure what bailing out would mean. If the import process has run, there's no way to undo the side-effect actions that were done in hook_config_import() (e.g field data data tables have been created).

sun’s picture

Status: Needs review » Needs work
Issue tags: +staging

I'm not sure what bailing out would mean. If the import process has run, there's no way to undo the side-effect actions that were done in hook_config_import() (e.g field data data tables have been created).

Right. It would just simply mean that the config import process blows up upon detecting a difference:

+++ b/core/includes/config.inc
@@ -212,6 +216,13 @@ function config_import_invoke_owner(array $config_changes, StorageInterface $sou
+        if ($validate_identical_writes) {
+          $written_data = $target_storage->read($name);
+          if ($data !== $written_data) {
+            throw new ConfigException('Configuration data for ' . $name . ' was not imported identically.');
+          }
+        }

Already imported things cannot be reverted in that case. The import also cannot be continued.

It doesn't really add any magic to the import mechanism. The only point that we're trying to address here is that we might need a built-in validation of what is being written to the active store, with the idea that what gets written there should be identical to what has been exported for staging before.

I agree this is far from perfect and we might want/need to take a step back to rethink what we want and need and expect within the scope of staging.

sun’s picture

Assigned: sun » Unassigned
Priority: Major » Normal

Demoting priority. Recent comments in here indicate that the suggested change is not really possible, and we need to rethink the situation and expectations.

gdd’s picture

Status: Needs work » Closed (won't fix)

My undertanding is we addressed this topic in other patches already, so I'm closing. Feel free to reopen if my understanding is mistaken.

gdd’s picture

Issue summary: View changes

Separating the discussion summary into notes.