Problem/Motivation
Over in #2224887: Language configuration overrides should have their own storage we've been exploring providing a separate storage for language configuration overrides. The approach has several advantages as:
- config schema just work as the config object name is the same
- the amount of config objects in each storage does not become a huge mess - think of how many files a staging directory could contain on a big multilingual site
- fixes issues with key length since we don't have to prefix language overrides
However the approach falls down when trying to make it work with config synchronisation because:
- If we are only importing language overrides at the moment this will not work because the config sync page is only looking for changes in the regular config storage
- The language config sync code has to make assumptions about how staging in structured and that it is in fact a file storage
- If another module was providing overrides through a similar method of storing in pseudo config files it is going to have to write much of the same code
- If the language module is not enabled then it is totally impossible with this approach for the sync process to inform the user what is going to happen if you stage installing language and create language configuration overrides in the same sync
Proposed resolution
Add collections to config storages. A configuration storage can contain multiple sets of configuration objects in partitioned collections. The collection key name identifies the current collection used. The storage interface contains the method getCollectionNames() to list all the available collections. This means that config sync will be able to compare staging to active even if the module that writes to that particular collection is not installed. So for example, if staging contains a collection called language.de, we can discover this and display to the user that this configuration will be synchronized even when the language module is being installed at the same time.
Use createCollection() which returns a new instance of the storage with the collection set. It does this to avoid side effects caused by the storage object being injected into each Config object - changing the collection on the storage of one Config object would change them all.
For databases this will add an extra column and in file systems this will use directories.
The patch here just adds the ability and tests it. Implementation of language to use this should be done in #2224887: Language configuration overrides should have their own storage
Remaining tasks
Review patch
User interface changes
None
API changes
Only additions.
| Comment | File | Size | Author |
|---|---|---|---|
| #34 | 2262861.34.patch | 131.25 KB | alexpott |
| #34 | 30-34-interdiff.txt | 4.48 KB | alexpott |
Comments
Comment #1
alexpottComment #2
alexpottWe need to make CachedStorage work and introduce a proper test for it since it is not being used anymore. And add code to the config sync process to be able to handle collections.
Comment #3
alexpottAmateur - forgot to rebase :) Ignore patch in #2
Comment #5
gábor hojtsyCollections need to be explained both in the summary and in the patch :)
Comment #6
alexpottPatch adds:
Now to work on ConfigSync and ConfigImporter support.
Comment #7
alexpottPremature save!
Comment #8
alexpottPostponed #2224887: Language configuration overrides should have their own storage on this - so marking this a beta blocker.
Comment #9
alexpottComment #10
xjmComment #11
alexpottOpened:
To remove changes that will help keep the patch focus on just adding the concept of collections to the configuration storages.
Comment #12
gábor hojtsyPostponing this on #2263287: Test the CachedStorage class using ConfigStorageTestBase since it does part of the same as here.
Comment #13
gábor hojtsy7: 2262861.6.patch queued for re-testing.
Comment #15
gábor hojtsyWith #2263287: Test the CachedStorage class using ConfigStorageTestBase committed, this surely does not apply anymore.
Comment #16
alexpottPatch attached is a reroll after the 3 clean up issues have gone in. Plus it adds collection awareness to:
One interesting issue is whether or not collections should support config entities - this opens up a huge can of worms so I've implemented a method of the storageComparer::supportsConfigurationEntities() that basically says that only the default collection does. If you think about language overrides this only contain the overridden configuration keys and therefore can not be treated as an entity. I've tried to solve this in a way that makes sense for how things work at the minute but would allow us to explore other possibilities.
A test is added to test the importing of collection configuration change.
Comment #17
alexpottAdded ability to diff config in that is stored in collections.
Comment #18
alexpottHow to test this patch:
drush core-clito do this.admin/config/development/configuration/full/importto a tarball and extract to the staging directory. Or you can use the Drush patch attached to this issue to usedrush cex -yadmin/config/development/configurationto test the diff, listing and finally the ability to import changeComment #19
berdirHere we go, first review.
Didn't look closely at the tests and the config import/compare changes, but from what I've seen it's just "loop over collections" + "pass collection around" all over the place :)
Is it documented somewhere what a valid collection name is? Can't we require [a-z_] only as valid characters?
Edit: Ah, now I understand, you use . for namespacing/sub-collections, makes sense then. Maybe add a short inline comment?
We don't support partial changes, so there's no point in limiting the actions, so we can safely remove it.. not really related to this I think, but good cleanup ;)
wondering if drierectory has something to do with Dries... :)
Minor: Wouldn't need {} anymore
Existing code, but how could this happen if we ensure that the directory exists?
mixing @inheritdoc with additinal documentation isn't officially supported (and not supported by api.drupal.org). Maybe move the recursion to a protected helper method like discoverCollectionDirectories/Names() ?
This code isn't trivial to understand (the part about if it has sub-directories, we only use them, otherwise us the directory itself as collection).
I think it's also not going to work for more than two levels, because if you have:
a/b/c
then, when it calls it it to iterate on b, it will only use b and c for the collection name, which would result in b.c, but it should really be a.b.c?
I think you would need to pass only the relative directory path through, and then use that to build the collection name instead of just the current directory name.
Once fixed, some inline documentation on what it is doing and why would help.
I still didn't get to the documentation of the collection string, but this also means that it is not valid to mix collections and collection namespaces, if you use language.$langcode, you are not allowed to put something in a collection "language". Might be worth documenting if it's not already.
Interesting, if support of collections is mandatory, should there be a way to have a collection on the null storage as well? Probably doesn't matter, just wondering what would be correct here...
I think we're missing documentation for $collection here?
Here we are :)
Yep, as mentioned above, should possibly mention that when you nest, then you can't use the parent anymore, looks good other than that.
The wording is a bit confusing.
For someone using the method, it's not that relevant that it is cloned I think, it's more an instruction for implementations that they have to clone it, so maybe change it to "Implementations must clone themself.." or something?
Also, it says instance of storage class, but maybe it would be more correct to say that it's an instance/version/something of that storage *backend* as it is going to be the same backend (class + configuration)?
Comment #20
gábor hojtsyI wanted to review this but Berdir's points are great :) So reviewed the issue summary instead. One of the points of #2224887: Language configuration overrides should have their own storage was that we need a new storage for language overrides, so we can skip assuming the things we assume about config, such as the forced casting on save. Once its the same storage again but a different collection, how will that assumption hold up then? Will these behaviours be collection specific?
Comment #21
alexpott@Gábor Hojtsy - actually one of the aims of the language storage patch was to have the same config object name so that schema would apply and casting on save would work :)
There is a question about how to prevent overriding overrides but this does not occur on the StorageInterface level so it is something that can be explored in #2224887: Language configuration overrides should have their own storage
Comment #22
gábor hojtsyRight, we don't use the storage to load something directly, we load it from upper level APIs which expose the data with the appropriate interfaces. It should be the task of those levels to expose the right collections with their appropriate behaviour, not the storage.
Comment #23
alexpottComment #24
berdirHelpful comment ;)
be = by?
Comment #25
alexpottFixed up commentes
Comment #26
alexpottWhilst working on #2224887: Language configuration overrides should have their own storage discovered a bug in DatabaseStorage::exists(). Patch fixes and adds tests.
Comment #27
gábor hojtsySo this patch shows the extent of changes needed to handle multiple parallel sources/storages of config, which was what we shied away when we implemented language overrides as files using config key prefixes so the API changes would not have been so extensive on all levels of config. I did not try implementing separate storages but I think the effect would be similar, one would need to be able to get different storages from a storage container or somesuch and iterate over them like this patch iterates over collections in one storage. So given that we want to store multiple things with the same config name, we need to differentiate on a different level, in essence the storage/collection level. The issue summary has several good reasons for this to be the case. The patch seems to be taking collections to all the places it needs and #2224887: Language configuration overrides should have their own storage is proof it works :) So I think some minor cleanups are in order and we can get this in :) Only found these by reading the whole patch top to bottom:
So the default collection is defined as an empty string and this constant is used extensively but there are still several places where the empty string or the emptyness is explicitly used as equivalent to the default collection as far as I can tell. Eg. docs says "default unnamed collection". Marked some, possible not all of them.
This bakes in assumptions that the constant was supposed to avoid as far as I see.
$directory is never used? Also this is from the general interface, which would not define $directory as an argument either?
Looks like the helper is separated for recursion support and this $directory arg may be a prior version of the recursion implementation.
Not true for collections in general? Would be true for collections outside of the default collection now?
An new
Comment #28
alexpottI think some of the work on the installer in #2224887: Language configuration overrides should have their own storage can be generalised so that we fire an event to list active collections and install config from them. The patch attached explores this - I'll reroll the language storage patch on top - it should have far less non test code. There are lots of changes to the ConfigInstaller service to support this.
Comment #29
alexpottDrupal\config\Tests\ConfigInstallTestis the test I've updated to test all the permutations around collections and config installComment #30
alexpottMissed uninstall.
Comment #32
sunHrm. To some very large extent, this effort appears to duplicate #2190723: Add a KeyValueStore\FileStorage to replace e.g. ConfigStorage :-(
Limited to major concerns:
(IETF/RFC notation just for clarity, not meant to yell)
Collection names MUST NOT have a concept of nested collections.
The collection name is an arbitrary identifier that is passed into storages. We MAY make special use of the
'/'character to denote some level of "namespacing" and theFileStorageMAY use that character literally for subdirectories. But otherwise, the$collectionis just a string without meaning, which only controls the hashmap that a specific storage instance operates on.A collection-specific storage instance MUST NOT be aware of any other collections.
If such an introspection is required for any reason (that I do not understand), then that facility MAY live on the engine-specific factory, but not on the storage itself, because that instance is specific to a specific collection already.
This means to move the collection logic outside of the storage classes. Having that knowledge baked-in is a clear breach of concerns.
The caller has to know the collection name upfront. A storage itself is pure CRUD and I/O — you instantiate it with the right parameters, and the storage just simply operates on those. Mixing any additional logic into that is a recipe for a flawed architecture and thus, unrecoverable errors.
Comment #33
gábor hojtsy@sun: how would a caller know about the collections in an arbitrary collection that is needs to work with? Eg. config being staged to be synced? The storage would be the only point to ask about the collections within, no?
Comment #34
alexpottTo go down the route of decoupling all knowledge how collections are stored by the storage we need to a factory per storage "area" so one for active, staging and snapshot.
The new patch attached removes the collection name restrictions so both 'collection' and 'collection.new' are equally valid - tests added.
At the moment each storage has it's own factory method
createCollection()that returns a new instance. And a methodgetAllCollectionNames()to list all the collections that are in use. ThecreateCollection()method I think obeys the rules @sun outlines in #32. The question is does the abstraction of a factory per storage buy us anything? I think not really - just another level abstraction in an already very abstract system. As @Gábor Hojtsy explains in order for config synchronisation of collections to work we need to be able discover collections before a module that uses it is installed - without decoupling collections from the module that uses that particular collection we have no way to inform the user what will occur during a sync.To say that this patch "to some very large extent" duplicates #2190723: Add a KeyValueStore\FileStorage to replace e.g. ConfigStorage is misleading - that patch is about architectural purity of the config storage system - it happens to introduce collections to config storage because our key value solution has the concept of collections. That patch does not touch config synchronisation, config installation of config storage comparison with respect to what collections mean for config. I have a number of concerns with respect to building config storage on top of key value storage because we have two requirements - the ability to list by partial key and the ability to list all collections in use that are not currently part of the key value sub system.
The fail in
Drupal\locale\Tests\LocaleUpdateTestappears to be random.Comment #35
gábor hojtsySo this patch moves us from using parts of keys to collections with the same keys in config. Our original approach of config key prefixes was basic and this is a huge improvement in itself, while putting lots of the burden all across config. I think the biggest benefit of our original approach with the key prefixes was that the config API in general was not affected. It did not work with dependent config synching, exploded config keys, ran into key limitations, and so on. I think this collection system is a huge improvement over that and impacts the config API enough that we should not complicate this further if possible.
Since this is a hard-blocker for moving forward with #2224887: Language configuration overrides should have their own storage, proven to work well and got great reviews from berdir and myself, I think this looks good to go. Assigning to catch for core committer review :)
Comment #36
alexpottFixing issue summary to be inline with patch.
Comment #38
dries commentedGabor walked me through this patch. This is a solid step in the right direction and allows us to make progress. We can make it more "awesomer" (per sun's ideas) in follow-ups. Committed to 8.x. Thanks everyone.
Comment #39
olli commentedFollow-up #2270917: Limit collection name due to MyISAM restrictions and test ConfigCollectionInfo class.
Comment #40
xjmComment #41
jose reyero commentedLate to this -interesting- thread, just wondering whether it would make sense to store config schema as a collection?
(Then flattening the folder structure in 'config' a little bit and simplifying file storage classes)
Comment #43
gábor hojtsy@reyero: I believe the point of the runtime configuration storage is for things that may change in environments and need deployment that may involve merging changes, etc. not just a flat-out push like the schema files that cannot have different changes per environment. Which is why they are deployed with code.