Architectural Changes

Storage Classes

A base StorageInterface needs to be created, with different storage mechanisms that implement it (currently the file storage and database storage are completely separate classes.)

Storage Manager

The configuration system was designed to have two storage systems - files and active store. However, it turns out that there are actually more storage systems than that. For instance, there is the default configuration that ships with modules, which is read-only storage. This potentially needs to be referred to at install time, to check if config have changed from defaults, and to reset config to default values. There may be other storages which come down the line. In order to manage this, we should adjust the config system to work with 1:N pluggable storage engines. There are several adjustments that should happen for this to work.

  • Currently the config class manages the interaction between the file storage mechanisms. A potentially better solution would be to create a StorageManager class which manages the interaction between the storage engines.
  • Define what all of our read/write situations are so that we can architect this properly.
  • Manager contains instances of all storage objects and attemps to read or write from or to them as appropriate.

When these two are done the architecture of the config system will look more like this:

Base Interface/Objects

It would be useful to provide an interface and base class that implements functionality common to configuration more complex than a collection of settings (image styles, content types, views, etc.) There has been a lot of discussion about what these should be named but the only agreement is they should NOT have the words ‘entity’ or probably ‘config’. For the sake of this discussion we will call them ‘thingy’. Thingy is not really a part of the configuration system, it is just a useful base class for people to extend for more complex use cases.

Roadmap

  1. #1567812: Remove "Verified" from configuration class names
  2. #1500238: Encode/decode data via PHP serialize()/unserialize() for DatabaseStorage (storage controller specific data formats)
    • Unblocks YAML FileStorage format change.
  3. #1470824: XML encoder can only handle a small subset of PHP arrays, so switch to YAML
  4. #1589174: Configuration upgrade path is broken
  5. #1605324: Configuration system cleanup and rearchitecture
  6. #1447686: Allow importing and synchronizing configuration when files are updated
    • Unblocks Configuration UI work and starts to untangle storage controllers.
    • Introduces basic import/synchronization support.
  7. #1324618: Implement automated config saving for simple settings forms
    • Required to properly convert variables into configuration.
  8. #1609760: hook_image_style_*() is not invoked for image styles upon Image module installation
  9. #1175054: Add a storage (API) for persistent non-configuration state
    • Required to fully convert settings forms into configuration forms.
  10. #1599554: Tutorial/guidelines for how to convert variables into configuration
  11. #1552396: Convert vocabularies into configuration
    • Separates the actual configuration object from the storage controllers.
    • Introduces "typed" configuration objects; e.g., Vocabulary + ImageStyle configuration objects (mimicking Entity objects to some extent).
    • Introduces high-level Configuration API CRUD functions for module configuration.
    • Unblocks all conversions for typed configuration; e.g., entity types, entity bundles (node types, etc), fields, field instances, text formats, etc.
  12. #1609418: Configuration objects are not unique
  13. #1605460: Implement dependencies and defer process for importing configuration [CONFIG_DEFER_SYNC]
    • Enhances import/sync mechanism for complex configuration having (inter-)dependencies (mostly typed config, such as entity types, entity bundles, fields, field instances).
  14. #1605338: Implement pluggable storage for configuration system
    • we need pluggable back ends for config storage so that sites can use something other than the db
  15. #1605124: Configuration system performance
    • Fixes the performance of retrieving configuration and repetitive calls to config().
  16. @todo Generic Property API.
  17. #1448330-19: [META] Discuss internationalization of configuration
    • Leverages Property API to introduce meta data for config object keys; e.g., required, validation, multiple/cardinality, translatable, etc.
    • Unblocks translation UI for configuration.
  18. @todo Configuration UI

Parallel / Anytime


Original report by @heyrocker:

The config system was designed before any implementations were done, and several threads have discussed potential cleanups. I'd like to bring all those threads together here and move forward on options. Ideally, these changes would not affect the public API.

One of the most glaring issues is that there is no commonality between the file storage class and the sql storage class. I would like to see this rearchitected so that we have a ConfigStorage interface, and there are ConfigFileStorage and ConfigSQLStorage implementations of it. This will also make a lot of issues like #1447686: Allow importing and synchronizing configuration when files are updated a lot easier to implement. There has been discussion that this new interface should be a generic key/value store (this was discussed somewhat in #1202336: Add a key/value store API) however I'm not sure this will work out. msonnabaum, beejeebus and I tried building this and it had some complications, like that the config system isn't really a key/value store, its a key/key/value store if you will.

While this is happening, much of the functionality that currently lives in DrupalConfigStorageInterface would move up to the DrupalConfig class (for instance synching between files and the active store in the situations where that happens.) DrupalConfig would now hold references to two storage objects - files and active store. Then it can manage the interactions between them within its existing public API functions.

Another thing that needs to happen is we need to figure out how/where to statically cache config data. In #1270608: File-based configuration API catch suggested we statically cache the config objects in config(), but we could also just statically cache the data within each class. Regardless, a more sane architecture would make this easier at whatever level we decide to implement it.

I know someone is going to bring up whether or not we should do automatic writethrough of to files on save, but I think that is irrelevant to this issue and needs its own discussion where we decide to add a flag to control it that is on by default. (see what I did there?)

I can start mocking this up into a real patch in the next day or two if nobody beats me to it.

Files: 
CommentFileSizeAuthor
#21 current.png24 KBalexpott
#21 proposed.png33.3 KBalexpott

Comments

heyrocker’s picture

Issue tags: +Configuration system

tagging

sun’s picture

  1. The patch in #1447686: Allow importing and synchronizing configuration when files are updated (originally authored by @beejeebus)

    already introduces an early stab at a FileStorage class (next to the existing DatabaseStorage). Both implement the StorageInterface.

    Note: The actual current class name is DrupalConfigTree over there, DatabaseStorage is currently named DrupalVerifiedStorageSQL, and StorageInterface is currently DrupalConfigVerifiedStorageInterface. - Just strip away all of those weird names and this entire thing starts to make sense! Seriously. ;)

  2. DrupalConfig would now hold references to two storage objects - files and active store.

    I strongly disagree with this, because it does not account for the reality we're facing today already. #1484690: Implement $conf overrides in the configuration system introduced a 3rd storage, $conf, which is unconditionally, fully loaded into memory and is write-locked by design.

    When considering an ideal architecture:

    1. DrupalConfig is a wrapping controller that manages 1-N storage controllers.
    2. DrupalConfig is designed from the ground up to handle 2-N storage controllers.
      (1 is an edge-case that does not require DrupalConfig in the first place.)
    3. DrupalConfig's import/export mechanisms boil down to synchronizing key/value pairs across 2-N storages.
    4. DrupalConfig handles high-level control of instantiated storages; e.g., locking reads/writes, unlocking reads/writes, reading/writing from a particular storage, fallbacks to other storages, etc.
  3. Each storage engine uses a custom storage-specific format. DatabaseStorage may use PHP serialize()/unserialize(), FileStorage may use YAML (or currently XML). MemoryStorage ($conf) uses native PHP arrays.
  4. In general: All storages should only implement functionality that can be reasonably resembled in other storages. Next to DTDs and element attributes, we once played with the idea of using file subdirectories as a mechanism for overriding config with language-specific values. Pretty much undoable in all other storages.

    The $conf in-memory PHP storage should not serve as the leading example, because 1) that presumes all capabilities of the in-memory php structure storage to be in place in all storages (which is not guaranteed; e.g., data types, serialization of class instances, etc), and 2) it's quite a difference from understanding each storage as "dumb" key/value store that can be re-implemented in any possible storage engine that currently exists or will come up in the future.

  5. Regarding caching, we want and need to investigate usage of Drupal\Utility\CacheArray for DrupalConfig::$data (array). Speaking of, DrupalConfig should only be instantiated once, $data should contain the full collection of cached config objects, etc.
catch’s picture

catch suggested we statically cache the config objects in config()

I think I said we should have a single storage instance for the request, that should deal with instantiating and returning the config objects itself - that'd make the storage classes responsible for holding config objects in memory themselves (or not) - as opposed to now where the storage class is instantiated every time config() is called.

I'd very much like core to support stacks of controllers as sun outlines - this would be useful for caching and potentially things like language negotitation handlers where you need to run one after the other (but not always). Not sure whether we necessarily want to tackle that here though but would be good to keep in mind.

heyrocker’s picture

Title: Cleanup config system architecture » [meta] Cleanup config system architecture
sun’s picture

Issue summary: View changes

Updated issue summary.

sun’s picture

Issue summary: View changes

Updated issue summary.

sun’s picture

Issue summary: View changes

Updated issue summary.

alexpott’s picture

I think this issue #1576322: page_cache_without_database doesn't return cached pages without the database might need to be included in the summary as the resolution might require architecture changes

andypost’s picture

Is there any ability of new config system to enumerate sub-keys of config. This functionality required for
#1588422: Convert contact categories to configuration system & #1479466: Convert IP address blocking to config system

sun’s picture

#1479466: Convert IP address blocking to config system seems to turn into a different direction. So let's discuss that sub-keys question in #1588422: Convert contact categories to configuration system. I followed up already.

andypost’s picture

Probably it's time to file issue about conversion of node-types to config object

sun’s picture

Before we head on to further high-level configuration like node types, we need to get #1552396: Convert vocabularies into configuration done. It adds the necessary high-level concepts and API functions to deal with such configuration.

yched’s picture

@andypost : note that #1479492: [policy] Define standards for naming and granularity of config files has some (unresolved) considerations that would apply to the case of node types.

sun’s picture

Extensions to #2: (I'll rewrite the issue summary in a bit)

  1. Data in configuration objects has to be replaced entirely upon save.

    Otherwise, we're running into a much larger problem than we previously had with orphan variables. Currently, config() operations are only calling ->set() to insert/update specific keys. This means that configuration objects will contain plenty of obsolete/orphan keys over time. Only by replacing the entire object's data, we're able to ensure that no orphan values remain within configuration objects.

  2. Configuration objects should be namespaced by extension, and the namespace denotes the "owner".

    When uninstalling System module, then system.* is deleted.

  3. Discuss and ensure proper + concise terminology throughout code and documentation.

    E.g., when dealing with more than one storage, #1447686: Allow importing and synchronizing configuration when files are updated introduces $source_storage and $target_storage to clarify the actual relationship between the storages and intentionally remove the hard-coded knowledge of "files" and "databases".

    E.g., stuff like "default configuration" was never really thought through in detail. Might be sufficiently clear for everyone to mean the configuration being shipped with a module. But perhaps there's a more concise terminology; e.g., "vendor configuration", "factory defaults", etc.

pounard’s picture

I agree with 6 and 7 this is mandatory for keeping the system clean.

pounard’s picture

Issue summary: View changes

d'oh, wrong order/relationship. ;)

sun’s picture

Issue summary: View changes

Moved list of issues to the top.

sun’s picture

Issue summary: View changes

Better format.

beejeebus’s picture

i'd like to see us move to storing config values in a key value structure in the active store, at least for core's default backend.

i don't want to see us replace the races we have with the current variable system with (worse) races in the new system. also, i want it to be possible to write a cache that doesn't have to worry about locking-at-a-nasty-course-grained-scope to function.

the races in the D7 variable system are 'just' in the cache layer. we don't actually write out multiple keys at once, so the inconsistent data can be cleared up with 'drush cc all'.

it's worse with the current config system. simple scenario: you and i load 'foo.bar', which contains a,b,c,d,e. i unset/change 'a', you unset/change 'b', (or many variations), we both write 'foo.bar'. somebody loses. and really loses, because clearing the cache won't help. and if i lost, and a change i made to 'foo.bar:a' was required by the change i made to 'foo.baz:d', my site is hosed.

so, i'd like us to change {config} to 'filename', 'name', 'value', where 'filename'-'name' columns are a unique key. (we can't just concatenate the keys, because we're not dealing with a real tree.)

so, to load all of the values in a file we do:

SELECT * FROM config WHERE filename = :filename

to load some subset we do:

SELECT * FROM config WHERE filename = :filename AND name IN (:names)

not sure how this sits with #11.6. IMO the problem with 'orphaned' values is lazy programmer problem, not a problem with the API. config('foo')->clear('bar')->save() should be enough to clear out dead variables, plus whatever we come up with for the 'automatically clear out files on module disable'.

beejeebus’s picture

Issue summary: View changes

Updated issue summary.

sun’s picture

Title: [meta] Cleanup config system architecture » [meta] Configuration system roadmap and architecture clean-up
sun’s picture

Issue summary: View changes

More roadmap.

alexpott’s picture

I think at some point in the roadmap we are going to have to address performance issues and that the way we address these will impact our architectural choices. I've started #1605124: Configuration system performance to track this work.

To outline the importance of this - currently 40% of the time doing cache('bootstrap')->get('system_list') is spent in the config() function and this does not decrease with repeated calls.

alexpott’s picture

Issue summary: View changes

Updated issue summary.

beejeebus’s picture

Issue summary: View changes

Add issue for moving sync methods into DrupalConfig

sun’s picture

Issue summary: View changes

Updated issue summary.

sun’s picture

Issue summary: View changes

Updated issue summary.

sun’s picture

Issue summary: View changes

Updated issue summary.

beejeebus’s picture

Issue summary: View changes

add issue for instantiating DrupalConfig only once

beejeebus’s picture

Issue summary: View changes

add pluggable backend issue.

beejeebus’s picture

Issue summary: View changes

add config defer issue link

sun’s picture

Issue summary: View changes

Updated issue summary.

sun’s picture

Issue summary: View changes

Updated issue summary.

sun’s picture

Issue summary: View changes

Updated issue summary.

chx’s picture

In general I very very seriously disagree with a generic property API and the introduction of meta data for config object keys. It adds way too much complexity to the config API. Required / validation should be handled by a simple hook on pre save (edit: here by save I mean "any way to get data into the active store, be it UI, API or load from the file store"), the validators raising exceptions. There is absolutely no way we can store the information necessary for validation into config as meta data because it's not data but logic. One image effect might accept only positive integers. Another might accept negative integers too. Yet another wants floats between 0 and 1. Unless you write a DSL (which we are not doing) you will never get that sort of stuff into metadata. It's logic.

Multilanguage is a different thing. I thought we agreed on a getter with an optional langcode which clearly shows it's a different topic.

catch’s picture

catch’s picture

Issue summary: View changes

Updated issue summary.

heyrocker’s picture

I have been out of the loop for a while, going back to work and facing an agressive travel schedule. Spending today going through the issues and I will comment on them as necessary, but ultimately I find myself reasonably disturbed at the amount of knowledge we are putting into the config system as a result of the issues outlined here. The whole point of the config system is that it should be basically a stupid wrapper around storage. As merlin pointed out in IRC today, it is ultimately not that much different from calling a database update(), but nobody has ever suggested that we wrap hooks into that. Generally I would like to keep as much of the responsibility for these things on the caller as possible. I also find myself pretty much against the introduction of metadata into the config system as a concept, with the possible exception of that which is needed for multilingual purposes.

It has taken me quite a while to get to the gist of many of the issues in the roadmap, and I discovered that one reason is because they are focused on implementations as opposed to the architectural change being done. For instance 'Implement vocabs as config' should really be 'Introduce typed config objects' with vocabs as an example implementation. This makes it a lot more clear what is being proposed, and focuses the discussion on what is IMO the most important issue at hand. I think that if we could focus more issues on architectural changes, and less on implementations, we would have a much better and easier time moving forward. An implementation might require six separate architectural changes, but wrapping them all into one patch (again as the vocab and sync patches do) makes it a bitch to review. Of course, this means lining up more patches in a row, but it also means that each one SHOULD be able to move more quickly. Focusing the issue summaries on the actual changes being done would help me personally a lot too.

heyrocker’s picture

Issue summary: View changes

added "Config saving is completely inconsistent in terms of hook invocation and etc." issue

sun’s picture

Issue summary: View changes

Updated issue summary.

sun’s picture

Issue summary: View changes

Updated issue summary.

fago’s picture

There is absolutely no way we can store the information necessary for validation into config as meta data because it's not data but logic.

That's the point. The metadata should not be part of the config, it should live on a higher level, i.e. let the modules provide it to make working with the config system easier (translation..). To the config system itself, the metadata would be irrelevant.

Crell’s picture

it is ultimately not that much different from calling a database update(), but nobody has ever suggested that we wrap hooks into that

Pedantic point: There were a few people calling for hook_query_alter on insert/update/delete queries, too. I shut that down hard, but there were people suggesting it.

Is there anything besides DB schema changes resulting from a config change that we actually NEED all these hooks for, or is it just hypothetical "could do anything" flexibility? If the latter, I would be inclined to wait for actual use cases that cannot be solved otherwise. Remember, we're not removing direct DB access from modules; if one particular module has a totally bizarre use case where it needs to do some really complex thing with config, there's nothing stopping it from still using the database as now. We just want to minimize the use cases for that, by focusing on the major sources of configuration: Basic module settings (already solved), Compound config objects (Image styles, Views, etc.), layouts (TBD), Entity configuration (the hard one because of schema implications).

We don't need to solve every possible configuration use case right now, just the big and annoying ones. :-) Let's not over-engineer the system so badly that everyone just falls back to the database anyway to avoid dealing with it.

alexpott’s picture

FileSize
33.3 KB
24 KB

At devdays we had plenty of discussion about the architecture of the config system.

The current architecture

current.png

The proposed architecture

proposed.png

alexpott’s picture

Issue summary: View changes

Updated issue summary.

heyrocker’s picture

Issue summary: View changes

Updated issue summary.

chx’s picture

Issue summary: View changes

Updated issue summary.

beejeebus’s picture

Issue summary: View changes

D8 contrib

sun’s picture

Issue summary: View changes

Updated issue summary.

sun’s picture

Status: Active » Fixed

With regard to this particular, initial architectural revamping roadmap, we're done :)

I'll create a new meta issue for the new/current challenges ASAP.

Automatically closed -- issue fixed for 2 weeks with no activity.

Anonymous’s picture

Issue summary: View changes

Updated issue summary.