Support for Drupal 7 is ending on 5 January 2025—it’s time to migrate to Drupal 10! Learn about the many benefits of Drupal 10 and find migration tools in our resource center.
Problem/Motivation
- During a cold start, we load a lot of config files from disk. These files must be stored on the shared filesystem when using multiple web heads. This is slow and often unreliable.
- We have an active config directory which looks editable but if you actually dare to edit it, things will break horribly. This is unnecessary brittleness.
Proposed resolution
- We retain the pluggable storage backend support for active configuration. Core ships with db storage by default, with no caching.
- The config.storage service is aliased to an un-cached config so that follow-up issues (or site-builders) can swap in a caching or a pre-load layer
Advantages of this approach
- discourage people from hacking their active config
- Less potential confusion about which files are which in the import/export workflow
- less YAML parsing and file stats
- more secure - the active configuration in no longer in files
- More familiar development workflow - live database contains everything needed aside from the code
Cons or points of debate
- harder to recover from invalid config causing a site-wide WSOD (#23), but similar to Drupal 7 in this regard
- "discoverability" of the active configuration by browsing files is lost. However they can already be browsed using the single export form
- The quick "copy from active" staging and export is gone. Import/Export UI already partially addresses this, but we would need to document expected workflows (with drush, with UI export, example below, etc.) in the handbook so that we can have people test them once the change is made.
- There are some cases where there is no UI for what's in config (e.g., Breakpoint, Tour, REST modules) and editing active config directly (rather than via export/import) is safe (i.e., config entities aren't being created, deleted, or renamed, and there are no hook implementations that need to respond to what's being changed). For these cases, the development process is made more tedious by requiring either manually editing serialized db rows or going through the export/import workflow for each iterative change. However, #2226761: Change all default settings and config to fast/safe production values is an open issue for making most default Drupal settings be what is best for production, and then providing a mechanism for overriding for developer convenience. So this point could be addressed by using file storage when in DEV mode.
- Git workflows changed. Use case described in https://drupal.org/node/1831818 would no longer be possible (but may be outdated in any case). There will be no more "native" diffing against staging stored in git; there's at least an extra step of "export your config, wipe out your separately tracked config repo that is in staging or somewhere else, and replace it with the export". Drush config-export allows to specify which config directory to export to, see this Git CMI workflow: video and example.
Remaining tasks
review/decide/commit
User interface changes
None.
API changes
No actual API changes, but some changes to developer workflow.
Follow-ups:
#2232791: Make the default core config service a db-backed storage with a cache layer.
Comment | File | Size | Author |
---|---|---|---|
#116 | install-env.png | 36.89 KB | sun |
#108 | 2161591-108.patch | 36.25 KB | pwolanin |
#108 | increment.txt | 6.29 KB | pwolanin |
#100 | 2161591-100.patch | 34.91 KB | pwolanin |
#100 | increment.txt | 5.26 KB | pwolanin |
Comments
Comment #1
sunInteresting proposal — with wide-ranging consequences, so I'm not able to vote in any way yet.
Two questions though (as mentioned in IRC):
There are claims about performance problems — Do we know what those actually look like? Can we back that up with numbers?
Please note: I can certainly imagine that property-querying many config files will be costly. But it would be good to know what exact amount of performance regression we're talking about.
Isn't it likely that config entities will be "full-blob"-cached at some layer anyway?
(Entity cache or Config cache is an irrelevant detail, as long as there's a full-blob caching layer in front of the raw data)
If so, shouldn't we rather think about materialized views for config entities?
Materialized views would involve pretty much the same architectural challenges: (1) Auto-generating database tables, (2) using a query builder for them, and (3) maintaining their relational SQL database schema.
The only addition would be (4) data pollution and (5) cache invalidation. However, those two are managed by D8's vastly improved caching (tags) system already, and yeah, the current config (entity) system is fully cached, so we're able to invalidate caches (derivative presentations) at the right time already.
As an extra benefit: If we're able to provide materialized views for config entities, then we're also able to provide them for content entities. :)
Comment #2
chx CreditAttribution: chx commentedMaterialized Views have a lot of problems the proposal doesn't. The proposal will use a single table (keyvalue) which will practically never change: only in minor versions with upgrade functions but it's hard to imagine what the heck would we want to change on a K-V storage. So there are no tables to autogenerate. Also, MV is data duplication with all the invalidate / rebuild headaches it involves, even with tags that's something to code. The beauty of the proposal is the simplicity it takes to implement it, here's how a StorageInterface implementation can talk to keyvalue:
Comment #3
chx CreditAttribution: chx commentedFiled #2161643: Add a KeyValueStore\IterableKeysInterface (allowing to retrieve keys with a given prefix)
Comment #4
chx CreditAttribution: chx commentedFiled #2162161: Change DatabaseStorage to use XML instead of serialize
Comment #5
alexpottIn the issue summary the problem is stated as
Afaics the relevant code is
Drupal\Core\Config\Entity\Query::execute()
, specifically:Running this through the debugger once the config cache is warm shows 0 hits to the disk. The call to
Storage::listAll($prefix)
results in a cache hit and so do the calls toDrupal::config($name)->get()
.Admittedly reading this code shows that we could have a brilliant optimisation just swapping out the loop here for a
ConfigFactory::loadMultiple()
!Comment #6
alexpottCreated #2163371: Replace \Drupal:config()->get with ConfigFactory::loadMultiple in Drupal\Core\Config\Entity\Query
Comment #7
yched CreditAttribution: yched commentedOther optim for trivial cases (won't help for the blocks though) : #2163919: Optimise Config EntityQueries in case there are conditions on the 'id'
Comment #8
yched CreditAttribution: yched commentedAlso, from the OP:
I'd say this is primarily a bad design choice from block entities to not have the theme be part of the file name (block ids are currently just the block machine name, not
|theme].[machine_name]
).It's been established like 2 years ago that storing config in files means they are not efficiently queryable other than by filename prefixes, and that the choice of file name structure thus has to be driven by the "critical path" query pattern (here, get all blocks for a theme) - that, or a cache layer, which is what field does with FieldInfo::$fieldMap
Not sure why we'd suddenly find this unsustainable and re-put "store in DB" on the table after 2+ years ?
Comment #9
BerdirRemoving the theme prefix was explicitly changed, that was initially there. Not idea why.
Comment #10
alexpottThe issue that did this was #2043527: Theme name is included in block machine name but should be stored as a key instead
Comment #11
yched CreditAttribution: yched commentedLeft a note in there...
Comment #12
tim.plunkettEvery config entity is $config_prefix.$entity_id.yml.
Blocks used to be $config_prefix.$theme_name.$entity_id.yml, and that was strange and confusing (on the API level) and there were many hacks spread throughout the block system to handle that. It was also confusing to the user, because it seemed like you should have been able to have the same block entity ID for each theme, but they had to be unique...
There were no performance concerns there one way or another.
Comment #13
chx CreditAttribution: chx commentedCMI took many twists and I am not sure whether the active store was editable at any point but the thing is, it's not now. If it's not human editable, why does it matter what format it is in? You can export it into YAML, version control that, edit that, etc.
Comment #14
yched CreditAttribution: yched commentedNot really, Blocks are $config_prefix.$entity_id like everyone else.
It's just that $entity_id used to be $theme_name.$machine_name, and is now only $machine_name.
What made this confusing is that $machine_name is in fact named 'id' - so yeah,
$entity_id = $entity->id() = $entity->theme.$entity->id was confusing, but that's only a matter of property naming.
There are a lot of other config entities that have a compound entity id, exactly for the same kind of use case blocks have - loading the right ones for the typical business cases without parsing all of them.
Comment #15
moshe weitzman CreditAttribution: moshe weitzman commentedI support this.
My current plan with Acquia cloud is to use Config_DB for storing Active Config in the DB. The only other option is on the shared filesystem and thats pretty scary from a performance and reliability point of view. I know that reads will usually be serviced by the cache backend but cold starts happen and then we have a serious issue.
The other motivation, as @chx mentions, is that people can't mistakenly edit an active directory because we won't have one!
So, I agree that Config_DB makes sense for core, in the spirit of Fast by Default.
Comment #16
chx CreditAttribution: chx commentedYou'll need to find someone to work on this cos I won't as you know.
Comment #17
Anonymous (not verified) CreditAttribution: Anonymous commentedi'm 100% happy to work on this if we can get consensus on the following:
1. yaml for module default config and import/export become part of the contract supported by Drupal core's CMI system. we stop the nonsense that everyone wants to rely on yaml, despite the fact that in our design yaml is an implementation detail of a storage driver.
2. we retain the pluggable storage backend support for active configuration. core ships with db storage by default, with no caching.
3. core also ships with a db-backed storage with a cache layer. this will allow sites with memcached/redis support to take advantage of their existing non-db cache layer for config without writing a new storage driver. they can simply configure Drupal to use the cache-layer storage with memcached/redis as the cache backend.
i'm ok to explain the above in more detail, but i don't really want to lead the work to get consensus - i've tried and failed for some time now already.
Comment #18
webchickAssigning to alexpott to weigh in on #17.
Comment #19
alexpott#17.1 why is this part of this issue?
#17.3 means that we have to drop the "queryable" aspect of talked about in the original issue summary.
We need to decide what format we're going to store configuration in the db.
The fact that we've got the config export ability in the UI and Drush leads me the agree that the move of the active store to the database is a good thing. The lack of export and Drush tools was one of the reasons we went with the active store in files. Another reason why active storage in files has seemed to be a good thing was the ability to place it into version control. This idea has been discarded since this would lead to deploying directly into the active directory and not through a configuration import. The current ideas around version control and configuration are to keep the staging directory in a VCS or just another directory of exported configuration in your VCS repository. There are other reasons this is a good idea too:
Comment #20
pwolanin CreditAttribution: pwolanin commented#17.1 I agree with, but indeed should probably be a separate issue.
re: #17.3 I assume we could cache the result of each particular query?
Comment #21
moshe weitzman CreditAttribution: moshe weitzman commentedLooks like have consensus on how to proceed. I updated the Issue Summary.
As to serialization format in the DB, it seems that there aren't significant advantages between var_export() and json_encode() and serialize(). See http://stackoverflow.com/questions/804045/preferred-method-to-store-php-... for some good discussion.
Comment #22
moshe weitzman CreditAttribution: moshe weitzman commentedComment #23
sunOn architectural grounds, "I agree", because the entire point of rewriting CMI 2-3 times was to decouple the storage from the application-level configuration concept.
The underlying storage is nothing else than a key/value store.
The custom concept of a "ConfigStorageInterface" should not exist to begin with. Even the current FileStorage is a key/value store, because key=filename, value=string, collection=subpath. There should only be one concept of a key/value store in Drupal and it should be used wherever applicable. Let's ensure that the configuration system does not re-invent that wheel.
Because of that, I wholeheartedly agree with issues like #2161643: Add a KeyValueStore\IterableKeysInterface (allowing to retrieve keys with a given prefix), since that brings us a big leap closer to replacing that custom k/v store concept.
Likewise, the encoder/decoder for the serialization format should simply be injected, the storage doesn't care.
I wanted to suggest Drupal\Core\Serialization first, but then I wondered how the REST/Serialization module is handling serialization formats. I learned that we apparently have the Symfony Serializer component in core and REST/Serialization is a simple wrapper around that (for REST-specific purposes, not sure I get why those are separate modules...)
→ Use whichever serialization encoder you want to use. The key/value store does not and should not care.
In case Serializer is too slow/big/complex of a dependency, then let's simply do Drupal\Core\Serialization and provide standard implementations for serialize/YAML/JSON/YML that can be injected.
With that out of the way, the following detail in the outlined plan/reasoning concerns me:
Quoting myself from #2161995-3: Faulty module cannot be removed from module list; DrupalKernel rebuilds container on every page request
The entire architecture of Drupal 8 relies on the fact that the application is able to operate based on valid configuration data. If that is not the case for whichever reason, then the system very possibly freezes infinitely.
In that case, the concept of rebuild.php does not help either, because "rebuild" naturally implies "rebuild data structures based on the current active configuration". If configuration happens to be malformed for whichever reason, then a rebuild will fail.
Right now, (1) you can easily go to your active configuration to (2) fix whichever malformed configuration happens to blow up the system and (3) successfully rebuild.
What I'm pointing out is primarily a site building/development situation that can easily happen on a site that is "under construction." As a testament to that, the specific issue of #2161995 occurred after switching git branches. — However, I am NOT saying that a Drupal production site should have to deal with the possibility of fragile/broken configuration. So to perhaps clarify my overall stance here:
+1 for killing the custom concept of ConfigStorage. → Replace with KeyValueStore + inject Serialization.
→ I couldn't care less about the default in core.
(My personal goal is to use raw JSON files via PHP-native JSON encoding/decoding without caching layer.) In other words, +1 to what @beejeebus stated with regard to keeping everything swappable/pluggable (whereas that does not mean that core supports any kind of syncing concept to aid with swapping implementations).
For production sites, core should assume that the active configuration is "stable" and can be performance-optimized.
I'm concerned about the DX of an active store that lives in the database or any other inaccessible location for the case of site development (and equally core development for that matter, but it's really one and the same thing).
→ If we're going to move the active store into a place that is not easily debuggable/fixable, then I wonder what kind of additional/alternative solutions we can offer to developers and site builders?
(Any such solution needs to be able to operate without having access to anything "Drupal", not even DrupalKernel).
Thoughts?
Comment #24
moshe weitzman CreditAttribution: moshe weitzman commentedThanks for the feedback. It sounds like you support this patch, but you have some other (valid) concerns about "discourage people from hacking config". I think we should discuss those in a CMI conf call or in another issue.
I'm hoping that others will chime in about the proposal to use KeyValueStore. Sounds reasonable to me.
Comment #25
yched CreditAttribution: yched commentedNote that switching to key/value storage still keeps the "active store" non-queryable other than by config name. Meaning the current naming scheme for blocks is still sub-optimal (#8 / #14).
Comment #26
sunFWIW, I just filed a PoC in #2190723: Add a KeyValueStore\FileStorage to replace e.g. ConfigStorage
Comment #27
xjm@alexpott, @mtift, @Gábor Hojtsy, and I discussed this issue yesterday. While I'm not entirely opposed to this change since CMI has evolved a lot since we originally moved active config to the filesystem and all the things discussed here are legit, I'm also concerned that we have numerous critical beta blockers in CMI that must be resolved before we can ship, whereas we could totally ship with active config stored in files.
I'm not comfortable calling this a beta blocker since it's not a release blocker, but tagging "revisit before release" to make sure we look at it before we roll a beta.
I strongly recommend solving the top issues in #2187339: [META-0] CMI path to beta before we consider devoting resources to this.
Comment #28
pwolanin CreditAttribution: pwolanin commented@xjm - based on our experiences trying to use config on an actual server, I would consider this a release blocker and very high priority.
I agree with beejeebus that we need to get this out of files as soon as possible, and then we can worry about tinkering with the internal implementation.
Comment #29
alexpott@pwolanin care to outline what those "actual" experiences were?
Comment #30
catchTag futzing.
Comment #31
pwolanin CreditAttribution: pwolanin commented@alexpott - trying to finish up a related blog post, but I wasn't going to detail the hiccups there.
One major problem is that since we (by default) store the active config with the sites files, copying or failing to copy or update the files at he same time as the database can kill the site. In contrast, have the default active store in the DB would make it a lot easier to make a dev site that mirrors the live site since you know the config and data are in sync. In addition, people now can use tricks like redirecting file requests to live so they don't have to keep the files in sync.
We were also playing with this on Acquia hosting, and the initial effort to support Drupal 8 sets the config directory name differently for each environment for the site. This could be correct in as much as you might not want to overwrite your config when just copying uploaded files between environments, but then meant it was a lot harder to install in one environment and get a copy running in another.
Comment #32
pwolanin CreditAttribution: pwolanin commentedHere's the blog post (and screencast) https://www.acquia.com/blog/moving-your-drupal-8-configuration-local-ser...
Comment #33
quicksketchThe config being stored as files certainly brings a new challenge to large-scale deployments of Drupal. Just as old problems (like sharing the files directory via NFS) have been considered solved for Drupal, solutions to the config directory across web servers are most likely already out there already. Something like AFS might work for config directories: local reads and writes would prevent slowness that you might get with the more typical NFS setup used for the file system. Pantheon's existing Valhalla filesystem probably already works with config files across servers (but I haven't validated that).
I don't think the summary statement is incorrect; shared filesystems *are* often slow and unreliable. But that doesn't mean there aren't solutions out there that would fit the requirements of the config directory. However, any technical solution at the OS level won't solve the issues of multiple-get or end-user visibility of the files, so those may be the better justifications here than problems with a distributed file system.
Please take my input with a fistful of salt, I'm just looking to provide additional data and understand the problems with each storage option.
Comment #34
Anonymous (not verified) CreditAttribution: Anonymous commentedlet's postpone this on #2190723: Add a KeyValueStore\FileStorage to replace e.g. ConfigStorage.
Comment #35
pwolanin CreditAttribution: pwolanin commentedok, then let's try to get that one in fast.
Comment #36
Anonymous (not verified) CreditAttribution: Anonymous commentedhere's a first cut patch that makes drupal install with db storage.
Comment #37
pwolanin CreditAttribution: pwolanin commentedFrom discussion at DevDays, the need to support file to DB storage would make releasing this in any form pre-beta much easier than post-beta.
While we could swap the implementation to unify with K/V, it's really not needed for the goal of getting active into the DB rather than in files.
The patch above also makes Drupal install much faster.
Comment #39
pwolanin CreditAttribution: pwolanin commentedTrying to make the Frankentest code work
Comment #40
Anonymous (not verified) CreditAttribution: Anonymous commentedthis follow up fixes the import/export config tests.
Comment #43
Anonymous (not verified) CreditAttribution: Anonymous commentedComment #44
Anonymous (not verified) CreditAttribution: Anonymous commentedComment #45
Anonymous (not verified) CreditAttribution: Anonymous commentedfix use of db_like() in DatabaseBackend. lulz.
Comment #47
pwolanin CreditAttribution: pwolanin commentedfixes update.php and trying to scrub out use of CONFIG_ACTIVE_DIRECTORY
Comment #49
pwolanin CreditAttribution: pwolanin commentedFix some more test fails, and add apparently assumed validation to Config class.
Comment #51
pwolanin CreditAttribution: pwolanin commentedneeds some doxygen cleanup, but this is more correct.
Comment #53
sunCan we revert all the changes to
config_get_config_directory()
, constants, and also the installer, please?There's really no need to change that here.
I'd like to retain the file based active configuration for tests. — Being able to easily study the actual configuration that got written in a test has become an enormous lifesaver for me in tons of issues that were exceptionally hard to debug. Removing that would definitely slow down progress on many major core issues.
In essence, all you want to change here are the service definitions to change the default active config storage from FileStorage to DatabaseStorage.
Lastly, this change should probably go hand in hand with #2226761: Change all default settings and config to fast/safe production values — use the database in production, use files for local development.
Comment #54
sunIn addition to #53, I understood the agreement of potentially doing this first in the way that #2190723: Add a KeyValueStore\FileStorage to replace e.g. ConfigStorage will still happen in parallel and/or afterwards.
That is one more reason for retaining the current installer and config code in HEAD unmodified and just simply replace the config storage implementation here.
This means to (1) add [back] the
{config}
table insystem_schema()
and (2) change the service definition to useDatabaseStorage
instead ofFileStorage
. Further changes shouldn't be necessary, sinceDatabaseStorage
actually was the active config storage some moons ago.To keep the changes here compatible with #2190723, we should either leave the
config.cachedstorage.storage
service name alone or rename it toconfig.storage.active
. — Renaming/re-purposing it intoconfig.storage
does not look right to me, becauseconfig.storage.staging
+ others still exist, and the service name should make clear which config storage is being returned.Comment #55
pwolanin CreditAttribution: pwolanin commented@sun - I'm a little perplexed by the suggestion to use file active for tests. The attempt to over to DB has been turning up various places where we were cheating or assuming that behavior specific to File storage was generic to all config back-ends. If we wanted to be truly honest, we'd figure out how to install and test with 3 implementations, but that should be a follow-up.
In terms of the constants, etc, you mean we should create the default active directory even if we are not going to use it?
Comment #56
pwolanin CreditAttribution: pwolanin commentedSome test fixes, esp for the Frankentest, and puts back some of the active directory setup to minimize controversy/patch size.
Comment #57
pwolanin CreditAttribution: pwolanin commentedComment #59
pwolanin CreditAttribution: pwolanin commentedDiscussed with catch - the better pattern is to not have the table as part of system_schema(), but instead create it when needed as the DB cache back-end code does as of this issue: #1167144: Make cache backends responsible for their own storage.
Taking the pattern from that patch and applying it to the DB config storage.
This also allows us to avoid changing a some of the test code
Comment #61
pwolanin CreditAttribution: pwolanin commentedThe ConfigStorageTestBase wants some things to fail that now succeed, and DatabaseStorageTest was trying to create the table in advance.
Here's a bit of a hack around, or maybe we should change the test?
Comment #63
pwolanin CreditAttribution: pwolanin commentedUgh, the interface and concrete classes and tests don't really agree on when/if exceptions are thrown.
This tries to make it a little more consistent and document some exceptions on the interface, and hopefully makes all the test pass.
Comment #65
pwolanin CreditAttribution: pwolanin commentedA few minor doc tweaks.
Comment #66
pwolanin CreditAttribution: pwolanin commentedoops - throwing an exception for listAll() is blows up install. ok, let's fix the test, rather than coding to the test.
Comment #67
Anonymous (not verified) CreditAttribution: Anonymous commentedpwolanin++ some nitpicks below.
one too many spaces after '='.
this talks about an emtpy array, but seems to return false.
missing blank line after proctected $yaml.
i guess you meant to delete that FileStorage line?
Comment #68
pwolanin CreditAttribution: pwolanin commentedThanks, here's with those cleaned up.
Comment #69
sunGood progress. Thanks for leaving the config directory setup of the installer in place!
I still need to apply this patch and actually review more in-depth, and perform some manual installer tests, but here's a code-only review of the patch:
→ config.storage.active
Let's also move it down to the other config.storage.* definitions.
In case 'config.storage' is required as-is, the service can be defined as an alias pointing to config.storage.active.
Hm. This looks misplaced here, since this validation is a detail of the serialization format being used by the storage.
Hm. All of this special-casing should not be necessary... — at least with regard to the installer.
Happy to help to debug this — however, the comment states "situations, like in the installer"... I assume the installer is the only cause?
I agree that the storage/backend should have methods that allow the backend to create/manage its storage.
However, for config (and key/value stores), I'm not sure whether automatically creating the storage on the fly is legit.
The creation/management of actual storage bins should be left to the application setup/installation code that wraps or runs before the actual usage/consuming code.
In particular in the installer, this automation can cause the storage to be automatically created before it is supposed to be created — possibly using preemptive storage parameters that have not been set up yet and thus were not intended. We have a good amount of bugs caused by similar automations in the installer right now, we shouldn't add a new one.
In short, the installer (or "a" installation process) should be responsible for creating the storage bin. It should not be created automatically on demand.
For completeness, I also think that
createStorage()
(ensureTableExists()
) method should be accompanied by adeleteStorage()
method.Under which circumstances is a PDOException thrown?
Part of the #2190723: Add a KeyValueStore\FileStorage to replace e.g. ConfigStorage effort that is so grossly ignored here:
#2208633: Add a Yaml class to Drupal\Component\Serialization
...fully eliminates these manual overrides.
Do we need to change DUTB tests?
Comment #70
pwolanin CreditAttribution: pwolanin commented@sun - the auto-creation pattern was suggested by catch, and makes the whole thing easier to manage. However, I don't feel strongly abotu this.
The docs about what exceptions are thrown when needs some cleanup, but it would throw some kind of PDOException when the DB error is not that the table is already present.
The existing tests expected validation of values, hence
$this->validateValue()
. If we are testing that certain things like a file resource fail, then it's clearly not specific to serialization. I think this is reasonable since it's just applying essentially the same basic check that's applied in castValue() when we have a schema.The DUTB change is certainly needed - we want to test the default active storage.
Comment #71
xjmThis already has the Needs issue summary update tag, but let's try to make it as complete as possible with the pros, the cons, and what we'd do to alleviate the cons. This decision--whichever way we go--has significant implications for all users of Drupal, sites small or large, site builders or developers.
I listed out the disadvantages I saw on during a CMI meeting today. I feel that we need to file additional issues to address these points.
Also, based on a discussion with catch and berdir at Szeged, I feel we do actually need to make this beta-blocking if we are committed to it, as the upgrade path for it would be ghastly.
Comment #72
pwolanin CreditAttribution: pwolanin commentedThere is already a UI of sorts for browsing your config - the single export lets you see all the existing types, and look at any individual one. That feels like it covers the need to browse, but is there additional functionality or UI changes that are needed there?
While there is no native file diff to git, this also seems like something drush can/should handle? I haven't tried recently, but Moshe discussed adding this as part of the config-import command.
The git workflow discussion you linked to seems like it may be outdated. Did you look at https://www.acquia.com/blog/drupal-8-configuration-workflows-using-git ?
Updating the issue summary a bit. I don't think the adding back a caching wrapper is for this issue and for now we are sticking with native PHP serialize()
Per your suggestion, marking this as a beta blocker.
Comment #73
pwolanin CreditAttribution: pwolanin commentedThis patch adds a createStorage() method to the interface and classes (delete seems totally unneeded imho).
Also, sets up a service alias so site builders could swap in a caching layer if desired.
Comment #74
pwolanin CreditAttribution: pwolanin commentedA couple small doc fixes.
Comment #75
tim.plunkettI also think the issue summary needs an update. We need to be able to weigh the pros and cons properly.
Just one note about the code:
Hm, didn't know this worked.
We used to have
Maybe we should use that pattern instead?
Comment #76
scor CreditAttribution: scor commentedComment #77
catchIf it's a beta blocker it's also critical.
I personally find the active/staging reversal workflow for git that relies on files very confusing. Export/commit seems a lot more predictable with less chance of messing around in the wrong directory or getting your local.settings.php variables switched one day.
Additionally, export/commit/import is a workflow that will work regardless of what storage is used for active configuration. Whether it's MySQL, Redis, YAML you'll always be able to export/commit/import.
If we ship with file storage in core and encourage the staging/active directory swap, but every large site + hosted solution ends up using database storage, then some people are going to get a nasty surprise when they take on a site that's using the opposite method.
Comment #78
pwolanin CreditAttribution: pwolanin commentedChanged the alias format in the yml file and made some more issue summary updates
Comment #79
sunThanks! :-)
Hm. Can we rename this to
.active
, please?Since there are multiple different config storages, the purpose of each service ID is to tell which config storage is being returned.
All of the individual
config.storage.*
services are not cached (they cannot), they provide direct access to the storage.Still not happy with these... the system should know when it's able and ready to use a storage.
I could help to sort out the installer environment to make these changes obsolete?
(I'm asking upfront, because I don't really feel "welcome" on this issue here, and I don't want to cause any drama...)
Is there a situation in which
createTable()
is called after the storage has been set up already?I can't really imagine one?
In other words, can't we simply make
createTable()
create the table and be done? (sanstableExists()
check)Can we make the schema definition method protected? If it's public, then it would have to go onto the interface, but it only makes sense for the database storage. Doesn't appear to be called from anywhere else.
We have dedicated tests for each config storage implementation, which assert that each implementation meets all requirements. Therefore, tests can use any of the implementations.
Debugging problems and failures related to configuration is a lot easier when you have direct/verbose access to the configuration files. You're instantly able to see that e.g. a config file is entirely missing, or that a bug causes a config file to suddenly miss a value, or that some values might be using a wrong data type, etc.pp.
(Sadly) there's still a lot that can go wrong with configuration, and often times it's hard to pinpoint the root cause that led to a problem. That is, because the nature of configuration is that it just configures parameters, which instruct the application to do what's being configured. Thus, if the configuration is wrong, the application may still work, but just simply doesn't behave as expected.
In a properly decoupled system (that is D8), the chance for unexpected configuration to get written is large(r), because many different components, controllers, and plugins are interacting with each other autonomously. Unlike static service definitions, configuration is dynamic by design.
That's my primary (development) concern of moving the active config storage away from files into a storage that is harder to access and introspect.
The concern only applies to development. I do not have any objections at all for using the database as active config storage on production sites.
Therefore, I see a clear difference with regard to which storage adequately fulfills the needs of different environments/use-cases. From an architectural standpoint, using a particular storage backend for a particular environment is not different to using e.g. Memcache as cache backend in production, but a Memory backend in development.
— And whether you like it or not ;-), that is what ultimately brought me to the conclusion that it is wrong to hard-code any kind of storage implementation. You should be able to use the implementation that meets the requirements of your environment best.
As with any other persistent, non-volatile storage, you are locked into an implementation as soon as you hit the installer (i.e., there's no way to easily switch to a different storage later on). There's no reason for why that cannot or would not work. → In fact, the effort to make it happen is 90% done already, various child issues are RTBC + just waiting for commits.
In the end, we're shooting for pretty much the identical goal, and this patch does not actually introduce a problem that would present a hard blocker for completing the effort. I especially want to thank you for not removing the config file directory setup as part of this patch. The other issues do not make an implementation create its own storage yet, so that is definitely a step in the right direction, too (the new code here will just simply be moved into the k/v
DatabaseStorage
).And just to further underline that: Even if this issue here would not happen, but the others would, then production sites using multiple webheads (e.g., sites hosted on Acquia's infrastructure?) will be able to use the database as active storage (or whichever implementation suits best).
Long story short, I see the point of avoiding slow YAML parsing in production, I can also see the problem with infrastructures involving multiple webheads, and thus the proposal makes some level of sense for production environments. However, at the same time, I'm very concerned about development environments (not limited to debugging and testing), and that's why I believe we need a solution that is able to address environment-specific concerns separately. But due to the reasons outlined above, there's really no need to make a drama out of this; we're able to deliver a sensible solution and most of the work is done already.
Therefore, I can live with this patch going in before the other patches, if we're committed to do them, too. In other words, it would merely be great if we could make this change "compatible" (mostly done already, only remaining issues in the review above, AFAICS), and I also hope that we can agree that one hard-coded default assumption won't be able to meet the (conflicting) requirements of all possible environments. :-)
Comment #80
pwolanin CreditAttribution: pwolanin commented@sun: brief comments in response.
we can change to .active or whatever, I don't have a strong opinion as long as there's some agreement that it's a clear naming.
Regardless of whether it's during install or otherwise, we should catch exceptions unless they are documented on the interface. So maybe we should re-write the comments with that in mind.
making the schema method protected is fine.
Possibly there is a better place to call the createStorage() method instead of in BootstrapConfigStorageFactory, or we can adjust that in a follow-up. I thought it made since there since you might implement the function named in setting drupal_bootstrap_config_storage and never want the table created at all.
Comment #81
sunBriefly discussed with @pwolanin in IRC. A few adjustments:
This patch does not revert the active storage back to
FileStorage
for tests. — That's still my only remaining issue with this patch. (The change to DB also does not improve test performance; I would notice, because I have a terribly slow filesystem a.k.a. Windows.)Comment #82
effulgentsia CreditAttribution: effulgentsia commentedI haven't read the patch, but I'm +1 to the issue. I haven't read every comment, so I'm curious: is there anyone following this who's against it?
Comment #84
tim.plunkettI need more time to gather my thoughts and explain my opinions.
But just to avoid the false assumption of consensus:
I'm firmly -1 on this issue.
Comment #85
sun@pwolanin and I had a minor disagreement behind the scenes on the question of whether
createStorage()
should be(A) explicit; i.e., calling it twice should throw an error/exception, or
(B) idempotent; i.e., essentially
ensureStorage()
, calling it twice does nothing.After debating and disagreeing for a while, we asked others. :-) Since I/we plan to move/port that code into
KeyValueStore
later on, @msonnabaum was an adequate field expert to ask.@msonnabaum wants to see the same implementation as for cache bins (idempotent), and the methods should be protected. — That is, because alternative storage implementations don't need any storage to get created first.
That said, he later mentioned an interesting idea that might be a fair middle-ground: Instead of try/catch in all methods, make the constructor ensure it once.
I guess we can leave that constructor idea/approach for later. For completeness, I was for the explicit approach, mainly because (1) this is about a required, non-volatile, persistent storage (!= cache) and (2) I don't like the idea of arbitrary code being able to call into
createStorage()
at any time "just to be sure". — However, @msonnabaum's recommendation eliminates that entirely, because the method cannot be called from anywhere else to begin with. So I'm happy to concede :-)This means that the installer would essentially do "nothing", and the first time the
config.storage.active
service is requested and a write operation is attempted, it will try to create the storage.Comment #86
webchickWe just had an epic discussion in IRC about this issue. The big hairy things that folks were concerned about to my recollection were:
a) The first point in the current "con" list: "harder to recover from invalid config causing a site-wide WSOD" In fact, "harder" seems like a pretty major understatement. Assuming you somehow found your way from a WSOD to the specific config key in the database that was causing the problem, your only recourse is directly modifying serialized text with SQL. :\
b) There are a few things in core (breakpoints, Tour) that do not provide any UI for editing, beyond the single file export browser thing. The inability to iteratively develop on those without either the SQL dance seems like a huge drawback, DX-wise. Yes, there's always install/uninstall but it's totally conceivable to want to write tours for existing data-providing modules as new features are added, and you would not want to lose all of your data just to tweak a sentence.
Whoever else was there, feel free to add your 2 cents.
Comment #87
webchickOh and a couple of counter-points to the above that I missed. beejeebus pointed out that if we're worried about invalid config, using files as storage is a much riskier proposition than the DB, which is atomic. And also that a lot of the workflow issues could likely be worked around in contrib.
Comment #88
pwolanin CreditAttribution: pwolanin commentedBased on the discussion with sun, here's back to the automatic storage creation
Comment #89
xjmI don't think this is a huge problem--or it's a problem that exists in HEAD anyway--it's just a little more awkward with DB storage. As far as I know, this would be the workflow:
And yeah, I can see devel including a tool to hack config directly in the UI to save module and theme developers time.
Comment #90
pwolanin CreditAttribution: pwolanin commented88: 2161591-88.patch queued for re-testing.
Comment #91
catchThis would mean running db_table_exists() every request, it was already discussed and rejected for cache bin creation.
Comment #92
pwolanin CreditAttribution: pwolanin commentedWaiting for sun to comment with some of thoughts about making it easier to swap the active storage for different installs. However, I think that can and should be a follow-up since this patch simply replaces on hard-coded class with another, and actually reduces the number of places it's fixed by using the service alias.
Comment #93
tim.plunkettI don't think its responsible to continue to hardcode it. Either we make it swappable or we leave it alone.
Comment #94
pwolanin CreditAttribution: pwolanin commentedI tried to make the config.storage.active added in CoreServiceProvider like the uuid service, but I'm getting exceptions during the installer since the container builder is perhaps looking for it earlier due to the aliases.
I'm not sure it's possible to really get around hard-coding the storage for the kernel during install.
This patch adds a cleaner setting to get the instance to use in bootsrapping the kernel, but someone else would need to figure if it's possible to override the service definition that early in install. Possibly this was failing since the Settings class was still empty.
Comment #96
pwolanin CreditAttribution: pwolanin commentedoops, LanguageServiceProvider also uses the bootstrap config. This should fix it.
Comment #98
pwolanin CreditAttribution: pwolanin commented96: 2161591-96.patch queued for re-testing.
Comment #100
pwolanin CreditAttribution: pwolanin commentedSo, this was relying on the setting being in settings.php, but it fails for the test that starts install with an empty settings.php
This moves back closer to what in already in 8.x, but still provides documentation in settings.php and a method to get file storage if desired.
Comment #101
effulgentsia CreditAttribution: effulgentsia commentedComment #102
effulgentsia CreditAttribution: effulgentsia commentedYeah, I think it's that exact "little" more awkwardness that tim.plunkett objects to. I added that as an item to the cons list in #101.
Comment #103
pwolanin CreditAttribution: pwolanin commentedWe also have the single export/import UI. I would expect that to really be the place you'd do this, as discussed above.
Comment #104
tim.plunkettIf the single export/import UI is used as justification for shoehorning this issue in, I'll be very sad I ever wrote it.
Another thing I've used the file-based active store for:
As a module dev, symlinking specific files back to their original module, so that I can immediately see the changes to the module as I make changes.
This has been immensely useful when working on the core admin views.
Now, every save in the Views UI would require me to run an export of some sort, and copy it into the module I was working on.
Granted, that is not a need for a production site, but that is one of many tricks that could evolve in D8 if file storage is still available for development.
That's part of why I'm pushing hard for it to be swappable.
That said, the last couple patches seem much better. I'll have to have another look.
Comment #105
tim.plunkettThis would be good to leave intact with some name like config.storage.file, so that a module trying to switch it back can refer to it in its ServiceModifier.
I think this is one of the few changes that just looks *wrong*. Can we add a method to the storage instead, or is this just too early?
Looks out of scope, what's up?
Not clear how these are different, and that's vaguely unfortunate.
a) I think the example should have getFileStorage() here.
b) We need to clarify if this is actually needed. I swapped out the services locally without this line (after installing), and things seemed to work. If this is just for the installer, let's rename it. Otherwise we need to actually document when we need this.
I did notice that \Drupal\language\LanguageServiceProvider calls it, but that seems wrong on its own.
If we were to leave the old file storage service, a module (like devel) would only need this:
Another feature we will need is the ability to export from the DB to the active directory while enabling a service provider like this. That of course, would be something for contrib.
Finally, I'd like to see a test module with a service modifier like the above, in order to prove that things do still work with file storage.
I think that would get us to an acceptable level of swappability.
Comment #106
pwolanin CreditAttribution: pwolanin commented@tim.plunkett
Comment #107
pwolanin CreditAttribution: pwolanin commentedok - talked to tim an dI misunderstood hist point #4 - we need to make the dump call consistent
Comment #108
pwolanin CreditAttribution: pwolanin commentedI think this addresses Tim's concerns.
Comment #109
tim.plunkettOkay, this is really the last thing wrong with this. I *think* we want to always use
, PHP_INT_MAX, 0, TRUE
in all places, to be consistent with core.Comment #110
pwolanin CreditAttribution: pwolanin commentedSo, I agree it's not consistent in core, really we'd only want to throw an exception on export.
the calls to dump() in \Drupal\config\Form\ConfigSingleExportForm and \Drupal\Core\Config\ConfigManager don't allow it to throw exceptions, and it's not clear that they should. If we change that behavior we'd also need to annotate all the methods with @throws.
Comment #111
tim.plunkettYeah I missed the ones in ConfigManager. We talked through each of the cases I called out, and this is fine as is.
I'll crosspost the devel.module issues to restore sanity for local dev when I open them, but this looks to be properly swappable.
Comment #112
xjmThanks everyone! Dries needs to take a look at this issue, so assigning to him.
Comment #113
sunAny chance to simply commit #2208633: Add a Yaml class to Drupal\Component\Serialization first, so as to avoid all of these manual/custom parameters to
Yaml\Dumper::dump()
throughout core? That issue + its sister issues are RTBC for several days/weeks already (and painfully chasing HEAD...)Comment #114
aspilicious CreditAttribution: aspilicious commentedBu this one is a beta blocker...
Comment #115
xjmYep, what @aspilicious said. And Dries needs to review it either way.
Comment #116
sunSince others and I raised major concerns against this change proposal, I assume it is a good idea to (1) clear up some (wide-spread) confusion and (2) outline a concrete path that is going to resolve each and every single concern that has been raised against this change proposal.
Doing so requires lots of context and background knowledge, but I'll try to keep it as short and structured as possible:
The storage engine [+ format] used for active/runtime configuration is a regular, non-volatile data store.
The data can be stored in whichever way you like, as long as the storage (1) is persistent, (2) is available at runtime, and (3) is not suddenly swapped out with a different one.
→ A single instance of a Drupal site will operate just fine, as long as the storage to use is defined at install time and sticks forever.
The active config storage is not used for staging purposes. — The process of staging configuration is a unique and complex concept of its own that is not to be confused.
To stage configuration between multiple instances of your site, you have to use the built-in facilities for exporting and importing configuration.
The "export" mechanism retrieves configuration from the active storage, performs validations, and writes to the staging storage.
The result of that process depends on the staging storage implementation. It may be a directory of files, it may be tarball that is offered for download, or it might be a shell process that commits to git.
→ In order to stage configuration from site instance A (dev) to site instance B (prod), you have to use the export/import process.
If you do not follow the config export process, then it is not guaranteed that the exported configuration can be imported.
Identically to the active storage, the staging storage equally is a one-time (cross-)environment-specific decision.
Both storages are nothing else than data carriers. You can use whatever suits your needs best, as long as the higher-level application subsystems are able to rely on them:
The active config storage is specific to a particular site environment/installation. It is the exact same decision as choosing a database driver in the Drupal installer; i.e., MySQL vs. Postgres vs. SQLite.
Active config storage:
Site instance A != Site instance B != Site instance C
The staging config storage is specific to a set of site instances that are representing the same "universally unique site." As long as all instances agree with each other, they can use whichever staging storage implementation they want.
Staging config storage:
Site instance A == Site instance B == Site instance C
This implies the following:
The requirements for environments are vastly different.
Developers need:
Production sites need:
→ None of the concerns should be after-thoughts. At the same time, it is impossible to meet the requirements of all environments with a single implementation.
→ None of both can be prioritized either. — We can't just simply ruin the DX for thousands of developers to account for potential problems on large-scale, high-traffic, and potentially cloud-hosted production sites leveraging multiple web-heads.
To resolve all concerns and cleanly resolve the problem space, we need a proper separation of environments.
→ #2226761: Change all default settings and config to fast/safe production values
By now, I have a very concrete + feasible plan and vision for how that would work:
Look at this in action:
This is not a pipe-dream. The parallel effort is >90% done already.
The following issues are RTBC and merely waiting for commits:
#2216527: Inject a serialization format into database key/value storage
#2208609: Move Json utility class into Drupal\Component\Serialization
#2208633: Add a Yaml class to Drupal\Component\Serialization
The following issue is blocked on a naming bikeshed only:
#2161643: Add a KeyValueStore\IterableKeysInterface (allowing to retrieve keys with a given prefix)
With those (+ this), the following primary change will boil down to some minimal changes only:
#2190723: Add a KeyValueStore\FileStorage to replace e.g. ConfigStorage
As clearly visible, the parallel effort even goes one step further:
→ All available key/value stores are immediately available to be used as active config storage.
To omit the storage defaults in the installer - as with any other swappable storage engine - you customize
settings.php
before hitting the installer:(The only remaining issue with that is autoloading of custom namespaces in the early installer, but that is a completely different topic that needs to be resolved separately. — Likewise, such environment-specific service overrides are really supposed to live in a
$conf_path/services.yml
file, but that's yet another separate topic.)A custom storage override is a one-time install-time decision and sticks forever.
Just like the installation profile sticks forever. You cannot simply switch to a different installation profile either.
Just like the public files directory sticks forever. You cannot simply change that filesystem path either.
In case all you want to do is to change the config storage of a particular site instance from one implementation to another:
→ #1613424: [META] Allow a site to be installed from existing configuration
Steps in words:
To summarize:
I'm +1 on this change, if we are committed to resolve the challenge of conflicting requirements for different environments in D8.
That is not a pipe-dream, but doable within the next days — progress is only blocked on commits.
I'm –1 on this change, if there's no buy-in for the concrete plan outlined above.
In that case, large-scale/high-traffic/cloud-hosted sites can and should have to tweak Drupal's storage defaults in the same way they always had to in the past. The proposed change of this issue optimizes for 1% of the total user-base. → Use Pressflow, thanks.
Lastly:
ConfigStorage
improvements will be moved toKeyValueStore
implementations. The customConfigStorage*
implementations will vanish entirely. The changes in this patch are helpful.Comment #117
alexpott@sun in order for the dev prod choice that you've mocked up in #116 to be relevant don't we need #1613424: [META] Allow a site to be installed from existing configuration too? Otherwise how can you have a production and dev version of the same site?
Comment #118
chx CreditAttribution: chx commentedComment #119
chx CreditAttribution: chx commentedAs it is usual it is almost impossible to follow which CMI issue does what and where should one chime in.
I will say this is a good change; I will say that going key-value store is a bad idea unless somehow the config-ness (that the data is recursive arrays and nothing else) is communicated to the store -- this is necessary for the store to store config for config entities in a smart way that helps config entity queries to be fast. You can't do that if you know nothing about the data you store, a generic k-v store can only run serialize over the data. Don't dumb down more than you need to!
Agreed. The discovery in DrupalKernel needs a bit of love and cleanup and namespaces is part of that. Maybe I will do that.
There already is code migrating from one storage to another: http://drupalcode.org/project/mongodb.git/blob/0674b7b054032f4bae463cff5... it is not even horribly complicated. The only challenge was figuring out when to run this code. I delete the files but just as easily I could do a write through in my config storage and keep files up to date and then removing mongodb would be possible. I just don't want to support that stuff.
Comment #121
Dries CreditAttribution: Dries commentedI'm comfortable with this approach (like many others are). Committed to 8.x. Thank you!
Comment #122
alexpottConsidering that we've been talking about active config in files for 2+ years I think we should have an 8.x to 8.x change record for this.
Comment #123
pwolanin CreditAttribution: pwolanin commentedhere's a draft change notice: https://drupal.org/node/2241059
Comment #124
giorgio79 CreditAttribution: giorgio79 commentedSomewhat related to performance: for querying this key value store we could use the innodb memcache protocol as of mysql 5.6 that can be 9x faster than simple innodb #2223757: Automatically use the memcache protocol on MySQL 5.6: 1.5x - 2x faster queries
Comment #125
xjmThanks @pwolanin. We likely also need to update a bunch of existing change records.
Comment #126
xjmI published https://drupal.org/node/2241059, because it's going to break everyone's existing D8 installs. Leaving this open though to improve it and check for other docs that need updates.
Comment #127
sunHow to use files for debugging configuration and hacking on D8 (or if you don't want to re-install):
https://gist.github.com/sun/10732324
Comment #128
Gábor HojtsyI updated the change notice with https://drupal.org/node/2241059/revisions/view/7140545/7145433 so it explains the benefits for people who are freaked out :D
Comment #129
xjmThanks Gábor -- that's what I had in mind. :)
That leaves updates to existing ones. Looks like we need to update:
https://drupal.org/node/1500260 (soooo out of date, lol)
https://drupal.org/node/2164727 (how?)
That's all I could find, actually.
Comment #130
xjmGábor and I updated https://drupal.org/node/2164727/revisions/view/6803915/7149123. The original CMI change record still needs significant work.
Comment #131
xjmAnd I fixed https://drupal.org/node/1500260 by gutting it and linking to the far more current handbook documentation about the configuration API.
Gábor is also filing a followup to fix obsolete text in settings.php wrt the active config.
Comment #132
Gábor HojtsyWhile preparing the change notices, I found #2243439: Both file and db configuration storage are said to be default in settings.php and hit that bump in understanding which storage is used when, etc. The current docs are entirely untrue :)