This is a tracking issue for the file-based configuration work going on as part of the configuration management initiative. The actual work on this is going on in the initiative sandbox in the branch 8.x-file-config, but this issue is for the purpose of providing periodic updates and occasional patches to demonstrate progress and help keep the issue on the radar of members of the core team. The goal is to provide a standard api through which configuration information can be stored, exported, and imported in a standard format (currently XML.) Prior architecture discussions on this issue can be found at http://groups.drupal.org/deployment-build-systems-change-management.
The goal for the initial patch is
- A full API for interfacing with configuration information
- At least one core implementation (ideally more though)
- Whatever work needs to be done at install-time to make this work
Anatomy
- This is a configuration subsystem that intends to read and write configuration objects.
- Configuration objects are basic key/value pairs. However, the value may be in a structured format, such as XML or JSON (or in more usual terminology, the value may use a Content-Type other than 'text/plain'). [« sun has an idea...]
- Configuration objects can be stored in different storages. The subsystem supports multiple storage engines. Each storage engine has to implement methods defined on the
ConfigStorageInterface
. - One of the main purposes of the subsystem is to allow configuration objects to be stored in multiple storages (e.g., files and database) and be synchronized between them. This is primarily for performance reasons, as filesystem access and value parsing can be a bottleneck. It also allows for an explicit "reload" operation, which protects the system from things like syntax errors in the config files. The subsystem therefore differs between "active" storage and "non-active" storage.
- Every configuration object is additionally verified through a signature based on its content.
- The subsystem comes with a bare
SignedFile
handler that uses native PHP functions to read and write a file including its signature. - For Drupal, there is a base (abstract)
DrupalConfigStorage
storage engine, which partially implements the storage engine interface and provides additional methods that use theSignedFile
handler to read and write files to disk. Its main purpose is to implement the synchronization between the file storage and the "active" storage. Most notably, it implements thewrite()
anddelete()
methods, which save or delete a configuration object from both storages in one go. - There is a
DrupalConfigStorageSQL
storage engine, which Drupal uses by default as active storage. It reads and writes configuration objects to the SQL database. Very similar tovariable_get()
,variable_set()
, andvariable_del()
in the past. - Lastly, there is a
DrupalConfig
class, which is the main application programming interface for Drupal modules and all code. It is instantiated and returned by theconfig()
factory function.
In its current design, an instantiated DrupalConfig
class only handles one configuration object. To access a different one, a new DrupalConfig
has to be instantiated.
The internals can be pseudo-depicted like this:
$config = config('file.key');
|
----------- new DrupalConfigStorageSQL('file.key')
|
v
new DrupalConfig(DrupalConfigStorageSQL)
|
v
->read() <------------------------------------ DATABASE
|
v
config_decode()
|
v
->set([property], [value])
|
-----------|= RESULT: $config (DrupalConfig)
->storage = DrupalConfigStorageSQL
->data (array)
[property1] => [value1]
[property2] => [value2]
...
$value = $config->get('name');
|
v
drupal_array_get_nested_value($config->data, array('name'))
$config->set('name', 'value');
|
v
drupal_array_set_nested_value($config->data, array('name'), 'value')
$config->save();
|
v
config_encode($config->data)
|
v
DrupalConfigStorage->write($data)
|
-----------> DrupalConfigStorageSQL->writeToActive($data) ------> DATABASE
|
v
->copyToFile()
|
-----------> new SignedFile('file.key')
|
| DrupalConfigStorageSQL->read() <--- DATABASE
| |
v v
SignedFile->write($data)
Discussion issues
- File formats: #1288090: Investigate better alternatives for encode() / decode() functionality
- system_settings_form(): #1324618: Implement automated config saving for simple settings forms
- Internationalization: http://groups.drupal.org/node/185609
- Synchronizing/re-loading configuration updates from disk: http://groups.drupal.org/node/190729 (especially http://groups.drupal.org/node/190729#comment-632198)
- UI for reloading new config after update
- Server-specific configuration overrides (e.g., database, etc)
- Upgrade path from variable system to configuration system
- #13 Retain Default/Override/Revert/Enable/Disable functionality instead of ditching it?
- #15 Cr1) DX for getting a single configuration value?
- Procedural config_get() helper proposal in #26 - #15 Cr2) Default value handling in contrib/modular scenarios?
- #15 Cr3) Enforced sub-namespacing of configuration objects?
- #15 Ma1) Chainable DrupalConfig methods?
- #15 Ma2) Wildcard configuration lookups?
- #15 Ma3) Replace glob() with DirectoryIterator?
- #15 Ma4) Format encoding/decoding responsibilities?
- #15 Ma5) Proper separation of functionality in classes and methods?
- #15 Mi3) Verified storage vs. unverified storage?
- #22 Do not expose the notion of "files" in the API?
- Singular vs. plural terms in config object names? E.g., the object holds only one, so
image.style.medium
vs.image.styles.medium
?
Comments
Comment #1
cweaganssubscribe.
Comment #2
gddHere we are folks, welcome to the initial core patch for the configuration management initiative. Please note that this is VERY MUCH a work in progress patch. It has known bugs and all sorts of stuff missing and there is lots of cleanup to do. However, it is basically functional and gives an opportunity for public comment on the API along with some examples of how implementations might look.
Before I get into the good stuff, I want to acknowledge the people who contributed to this patch and the system's design to this point, all of whom should be credited when/if this patch lands: Angela Byron (webchick), Yves Chedemois (yched), Larry Garfield (crell), Claes Gyllenwald (letharion), Daniel Kudwien (sun), Karoly Negyesi (chx), Justin Randall (beejeebus), Katherine Senzee (ksenzee), Mark Sonnabaum (msonnabaum), Fabian Sörgvist (fabsor), David Strauss (davidstrauss). I would also like to publicly acknowledge NodeOne for donating 50% of my billable hours to this project. Additionally, my thanks and gratitude to anyone who has allowed me to bounce ideas off them or participated in the gdo discussions to this point. The feedback and response to this has been really fantastic.
For background on the system and its architectural concepts, please review
http://groups.drupal.org/node/191283
It will answer a lot of questions in terms of the how and why.
This patch contains four main areas of interest:
IMPORTANT
You will have to apply the patch and then do a brand new fresh Drupal install, including a fresh settings.php. There is currently a bug in the installer that causes it to not work if the database settings form is skipped.
Beyond that this should be working and I'm looking for general feedback and thoughts. The image styles stuff is pretty rough, there are probably still some bugs and there is still a LOT of cleanup and now-unused code since it is getting refactored pretty heavily.
I know there will be lots of thoughts on XML as the file format, and I have added some of my own based on my current experience with the system at http://drupal.org/node/1288090#comment-5340754. Please move all commentary on the file format there.
There are also some issues around converting system_settings_form() to CMI that I punted on for the purposes of this patch, instead choosing to write my own submit function for the form. Please see the issue at
http://drupal.org/node/1324618.
Things I'm interested in feedback on
Some things that I know are missing
Finally, if you are interested in helping out, work will be continuing in the CMI sandbox at http://drupal.org/sandbox/heyrocker/1145636, in the branch 8.x-file-config. I don't have a lot of time to be handholding or educating people on the system at the moment, but if you get a good grasp on things and want to start attacking specific issues, ping me and I'll get you commit access.
Comment #3
gddComment #5
sunVarious clean-ups and fixes. Mainly to ease reviews.
Comment #7
aspilicious CreditAttribution: aspilicious commentedLets try to delete that todo...
According to http://eprint.iacr.org/2010/548.pdf sha-512 is always faster than sha-256. So I don't see any problems with 32 bit CPU's.
Comment #8
Anonymous (not verified) CreditAttribution: Anonymous commentedi pushed a commit that fixes page caching and makes the hook_boot/hook_exit tests pass:
http://drupalcode.org/sandbox/heyrocker/1145636.git/commit/5e86ed702d791...
too tired to get a patch going at this point.
Comment #9
xjmPushed some code style cleanup. :) (Greg also merged in sun's cleanups from above and further.)
Comment #10
aspilicious CreditAttribution: aspilicious commentedPushed a fix for the config test
EDIT:
Shouldn't
and
be part of the DrupalConfigVerifiedStorageInterface
Comment #11
fietserwin- I agree on that "default" styles should not be kept in code. Keep config at 1 place, not in a config store and scattered in hook_default_... functions. Thus save it to config on install. The flag of being overridden or not could be kept though, but probably needs a "restore" hook besides an install hook.
- New code should follow the new documentation standards: document all param and return types.
- Don't repeat functionality of drupal_array_get_nested_value() (premature micro optimization?):
Simpler:
(occurs multiple times in the code.)
Comment #12
amateescu CreditAttribution: amateescu commentedI also believe that the 'Overridden' flag might still be useful. It's nice that we can get rid of 'in code' and 'in database', but it would be easier for a developer that 'inherits' a Drupal site to see what to expect from it (overridden image styles, default views, etc.) :)
Comment #13
gddIt is possible that the Overridden/Restore functionality could remain, since we keep the original copies of the default config along with the module's code files. We would need to be able to link files in the live config directory back to these originals in order to determine whether or not they are changed and display that information. Not sure how we would do that, but it could be done. I don't think it needs to be part of this patch though, we can do it in a followup and if it doesn't get done I don't think its a dealbreaker either (personally.) I made an issue in the sandbox queue about this, and it can be moved into the core queue after this patch lands.
Comment #14
fietserwinAgree with #12 and #13. Thus yes, keep that information available in the admin screens, but don't make it part of this patch. Comparing stored config with the hooks that defined them shouldn't be too difficult.
Comment #15
sun[ - Moved description/anatomy/callgraph into summary -]
Initial code review issues
Critical:
$value = variable_get('cache_lifetime', 0);
becomes
$config = config('system.performance');
$value = $config->get('cache_lifetime');
To remedy that, I wonder whether we could get back to one key somehow:
$value = config()->get('system.performance.cache_lifetime');
(mayhaps using a different delimiter (:).
If I'm not mistaken, it looks like there's a chance for code getting way more complex:
// We don't depend on someothermodule, but if it's there and 'thing'
// was enabled, account for it.
if (variable_get('someothermodule_thing', FALSE) == 'special_condition') {
mymodule_thingie();
}
becomes a bit more:
// Diff 1: isset() obviously doesn't work on the expression.
// Diff 2: No (potentially custom) fall-back default value.
$enabled = config('someothermodule')->get('thing');
if (isset($enabled) && $enabled == 'special_condition') {
mymodule_thingie();
}
variable_get('mymodule_check');
needs to be "categorized" somehow now? Otherwise, you'd haveconfig('mymodule')->get('check')
and in turn a config file with namemymodule.xml
...?Major:
$config = config('foo.bar');
$config->set('foo', 'bar');
$config->save();
should allow for
config('foo.bar')->set('foo', 'bar')->save();
config_get_signed_file_storage_names_with_prefix()
states:* Given the following configuration files:
* - core.entity.node_type.article.xml
* - core.entity.node_type.page.xml
*
* You can pass a prefix 'core.entity.node_type' and get back an array of the
* filenames that match. This allows you to iterate through all files in a
* branch. Note that this will only work on the level above the tips, so
* a prefix of 'core.entity' would return an empty array.
I'd expect to get the full tree in any case though. Since the comment does not clarify why that limitation exists, I'm challenging it, and don't see why it shouldn't be possible.
verify($contentOnSuccess = FALSE)
) are doing too many different things. Such methods are normally split out into multiple, whereas each one sets and accesses data properties on the class.Minor:
$drupal_config_directory_name, $drupal_config_key
variables i) shouldn't start with 'drupal_' and ii) are not really clearly named. I wouldn't expect a directory basename in the first (but a full path). And the "key" doesn't mean anything to me. Rename to:$config_directory_basename
and$config_signature_key
.DrupalVerifiedStorageSQL
should be renamed toDrupalConfigVerifiedStorageSQL
, and likewise,DrupalConfigVerifiedStorageInterface
should be renamed toConfigVerifiedStorageInterface
.Comment #16
yched CreditAttribution: yched commentedCongrats on the recent code rush, @heyrocker !
For people interested in CMI, and as @hejrocker pointed in #2, I'd like to stress the fact that "reload config files from disk and react to changes" is the one last glaring hole in the API. E.g. for the scope of the current patch (image styles) : the computed image files for a style should be wiped when the definition of the style changes. The current state of the CMI API has no way to detect that.
AFAIK there are currently not two people with the same vision for this, and discussion on the topic has stalled for the last two months. Sorting this out is critical before the CMI code and concepts can be proof-tested against more complex subsystems (like Field API, which will be non-trivial...)
See http://groups.drupal.org/node/190729 - more precisely:
- the section 'CMI : the “from files to active store” flow' in the OP, for the proposal that came out in BAcamp
- my 'Reload flow" comment (http://groups.drupal.org/node/190729#comment-632198) for why I have doubts it will float, and an alternate proposal.
Comment #17
yched CreditAttribution: yched commentedcrosspost
Comment #18
yched CreditAttribution: yched commentedand resetting tag as well.
Comment #19
David StraussSubscribe. Also, I don't care much about performance on 32-bit CPUs, even if SHA-512 does turn out to be slower. The signature check is not a critical-path operation.
Comment #20
aspilicious CreditAttribution: aspilicious commentedFrom irc, as a response to sun:
Comment #22
Anonymous (not verified) CreditAttribution: Anonymous commentedre. #15 - last time i argued we should kill the "you must know that this is about files to use the API" requirement, it was pointed out to me that this is a feature, not a bug.
we don't have a configuration system with data tree, we have a directory of arbitrarily named files. as such, forcing people to think about files makes the API and system more consistent.
it was also pointed out that it would be difficult to figure out where the file path ended, and the data within the file started.
those who support the idea can probably put it better than me, because i completely disagree with the argument :-)
Comment #23
Anonymous (not verified) CreditAttribution: Anonymous commentedre #16, my latest thoughts here - http://groups.drupal.org/node/190729#comment-662963
next task is to try to explain via use cases why we don't need to write to files except on export.
Comment #24
pounardDeleted because duplicate.
Comment #24.0
chx CreditAttribution: chx commentedUpdated issue summary.
Comment #24.1
sunUpdated issue summary.
Comment #25
pounardI agree with sun about
config()->get('system.performance.cache_lifetime');
.Is not really revelant; in both case you have to do multiple calls, so in term of code typed it's more or less the same.
I disagree too, one expose the configuration as one single configuration tree, which is more natural; The second is confusing.
Choosing one of the other doesn't impact the storage backend, nor the internal representation of the data: it's a public facade. But if we choose a public facade that exposes the configuration as one single tree, we may end up having a nice API such as, for example, being able to replace the active layer by a configuration backend such as GConf on Linux or the Windows registry: it sounds really fun (and may be performance wize in some case)!
Plus having a public facade that considers one single tree would be a step more to decorellate the original split from the day to day usage, and would allow to use non biased (incremental?) caching mecanism based on usage and not on internal representation for reading optimization.
Comment #26
sunNote: I fully understand that the Configuration system has to differentiate between the configuration object and the configuration key within that object. I am not questioning nor criticizing that. (Although the granularity of configuration objects remains to be an open question, in light of Drupal being a modular/extensible system.)
In a sense, it is comparable to currently doing:
Contrary to above claims, this current code allows to get multiple values from a single configuration object:
And with some additional tweaking, I can easily see it providing access to the full object, too; as in:
However, it enforces this code logic also on simple/scalar configuration values, which - as even the initial conversion in this patch nicely demonstrate - is the most common use-case for configuration lookups.
Technically, we could plant a tweak into the config() factory or into a separate config_get() helper function, which automatically splits the passed in argument into $file/$name and $key; e.g.:
I've committed some class/interface naming clean-ups yesterday. After sleeping over this, I have many more considerations on the code/implementation to discuss. Will consult @heyrocker for how and where to put them.
Comment #26.0
Anonymous (not verified) CreditAttribution: Anonymous commentedAdded branch information
Comment #26.1
sunUpdated issue summary.
Comment #27
Anonymous (not verified) CreditAttribution: Anonymous commented"Note: I fully understand that the Configuration system has to differentiate between the configuration object and the configuration key within that object. I am not questioning nor criticizing that. (Although the granularity of configuration objects remains to be an open question, in light of Drupal being a modular/extensible system.)"granularity of the object? it's not about the object, it's about files - you can't use this interface without understanding that you are reading values clumped together in a file.EDIT - this is unclear, even to me, please disregard.
Comment #27.0
Anonymous (not verified) CreditAttribution: Anonymous commentedUpdated issue summary.
Comment #28
gdd@sun thank you for the detailed review and architecture description! Hopefully this will be very helpful for future reviewers. Some comments on the concerns
Cr1: I am not against this is we can figure out a way to make it work, and having a colon delimit the file and the key is the way I would suggest if we decide we want to go this route. However we also shouldn't prevent the possibility of being able to init a config object and do a bunch of stuff in it (as bojanz mentioned in #20.) It's possible we can support both maybe? There are serious issues and edge cases with doing image.styles.large.name.effects and forcing the code to figure out where to break from name-of-file to data-in-file. David and Larry and Chx spent a good amount of time on this at our first sprint in Denver and that was when we moved towards this model because it was becoming too problematic.
Cr2: Yes this is just something we're going to have to be more careful with. In many respects, some config changes will essentially be API changes. However in some ways that is good. If a module you are integrating with removes a config value as part of an upgrade, wouldn't you want to know about it? If a config value isn't set, then you should probably deal with it in a different way than if it has a value, and there is no way to make this distinction in the current variable system. For module developers it does create a little more work - when you decide you need a new config value, you actually have to go into your module's default config file, add the value, install it, and reload config before you can use it. I don't personally think this is a huge burden to add though. Will be curious to hear what others think.
Cr3: Yes, that's correct. my_module would provide either my_module.xml or maybe my_module.settings.xml, and within that would be the keys. This actually forces values to be namespaced, whereas with the variables system, modules would (and did) simply ignore any naming conventions within variables.
Ma1: Some chainability is already possible. For instance, you can chain any function from config() (aka
print config('image.styles.large')->get('name');
works.) To get the others to work, the functions would need to return the config object back, so set() would need to return $this. Not a big deal conceptually, but want to make sure it makes sense everywhere (except get() as aspilicious notes.)Ma4: Do these functions have some general utility outside of the config system? If so, it might make more sense to move them into common.inc or something. At least the xml->array->xml functions might qualify in this way.
Beyond that I don't really have any issues with the suggested cleanups and such.
Comment #29
pounard@heyrocker How does the code behaves if some modules gives you:
a.b.var
in filea.xml
and another one gives youa.b.var
ina.b.xml
? Will I have two config files in common dira.xml
anda.b.xml
? If so, how does the API differenciate them?Comment #30
aspilicious CreditAttribution: aspilicious commented@hejrocker and with a.xml in module X and a.xml in module Y. Those files get collected and grouped in one directory.
Comment #31
gdd@pounard this situation is pretty easy right now.
That situation is however one of the cases that makes it very difficult to put everything in one string without a special delimiter (like a colon).
@aspilicious The second module installed would overwrite the first module's file. However I would consider that a bug in the modules, not a deficiency in the config system. Modules should namespace their config files with their module name, just like they have to in various other areas of Drupal. I don't really see that requirement as a problem.
Comment #32
pounard@heyrocker That's what I guessed right after writing my comment, thanks for the confirmation.
EDIT: Did you made a mistake, shouldn't it be:
?
Comment #33
aspilicious CreditAttribution: aspilicious commentedyeah pounard you're right, and thnx hejrocker for your answer :)
Comment #34
fagoI don't see the benefit of having files as part of the API either. To me, files are a storage engine, but not a necessity so there shouldn't be any "file" in any interface.
Yep, having hooks for reactions is a really critical feature.
I find it confusing that parts of the code are in the Drupal namespace and parts are in the config namespace. Any reasons for that?
I'd have expected a single namespace.
Again, it's strange to have config() and DrupalConfig.
Why encode the data before passing it on? I'd assume it's up to the storage mechanism whether the data should be encoded in XML or not?
Also, that pattern of config-objects with an associated storage system is very similar to entities + their storage mechanism. At least, we should compare solutions to learn from each other.
Caching (static + permanent), as well all the associated CRUD hooks is something what the entity api already takes care of. Not sure if there are ways to benefit from each other, but maybe exposing config-objects as entities via a DrupalConfig entity storage backend would make sense? At least, we should streamline all the CRUD function names and hooks with those of entities for DX I guess.
Howsoever, I'd like to see the config system to take care about caching for me.
I find it confusing that parts of the code are in the Drupal namespace and parts are in the config namespace. Any reasons for that?
I'd have expected a single namespace.
That said, the term "DrupalVerifiedStorage" confuses me too. What makes the storage verified?
Again, it's strange to have config() and DrupalConfig.
Why encode the data before passing it on? I'd assume it's up to the storage mechanism whether the data should be encoded in XML or not?
That said, is it planned to support dedicated storage mechanisms per config object, e.g. to put node-types or any other exportable in their own db-table? That's important to don't have to serialize big-chunks of xml to load a single config object, e.g. a view.
Then another random question: Assuming nodes types become config objects too, how would a module expose an additional property to the config object?
Comment #35
Anonymous (not verified) CreditAttribution: Anonymous commentedjust pushed a new branch called 8.x-file-config-reload, with this commit:
http://drupalcode.org/sandbox/heyrocker/1145636.git/commit/f57b542
hejrocker - do you want this in another issue?
discussed this a bit with hejrocker in IRC, but the code doesn't reflect our discussions 100%. not really happy with it, but it's something more than vapour-ware to talk about.
- i've only implemented a reload from disc --> active store, the 'git pull' use case. i have NFI how to do the other way, or how that fits with the current design.
- we're not making config CRUD operations cause reloads, because the current idea is that code that calls config() CRUD operations should know whether a reload is necessary or not, not the config system. so why don't we just let the code that calls config() CRUD do what needs to be done? it seems a config reload hook is only necessary when we don't have any such code, such as when loading in new config from disc. in that case, we do need something to allow 'business code' to run.
- no reload on CRUD #2: if config CRUD always writes to disc to keep it in sync, then we need an in-memory way to keep track of all config changes in case a reload is called. some sort of automagic tracking, and an API for accessing it, that is ready to be used in the (rare?) case calling code decides that a CRUD operation needs to trigger a reload. or do we make calling code responsible for passing in the changes? or ...?
smaller issues from the implementation:
- i thought it would be a good idea to let modules know what has changed up front, as every single module will need to check for this to know if they should react in some way.
- as the code is now, it's impossible to get changed files loaded unless they've been signed by $drupal_config_key. do we really want to force knowledge of $drupal_config_key outside of a production environment?
Comment #36
gddThis is a reroll to core since a bunch of stuff broke with recent conversions of the core cache system to PSR-0. It also includes a lot of cleanups from aspilicious, ksenzee and sun. Some of the bigger changes sitting in branches will have to wait for closer inspection which may not happen for a couple weeks as im in the middle of moving back to the states.
Comment #37
gddComment #38
gddI just realized I merged in the currently-broken head. Will reroll again when all is fixed.
Comment #39
pounardI'm not sure, but if this is the root folder, I guess that I cannot share .info, .xml, .sig, .po, .sh or .sql files anymore using my Drupal blog: this is not something we can allow to be released! Hiding PHP or VCS files is legit, but disallowing all of theses seems radical to me.
In fact anything that is not PHP or . folders should be kept public: the real problem is Drupal not being outside of webroot, not the opposite.
If I'm not mistaken, please move the file match disallow rules into a specific .htaccess into the config folder, but not everywhere else.
Comment #41
nedjoThis initiative seems to have developed so far with primary reference to site staging and deployment use cases.
A second set of use cases relevant to configuration storage relate to bundling and distributing sets of configuration-based functionality, as is done in Drupal distributions. This second set of use cases raises a number of distinct questions and issues. Perhaps these have already been explored, but I haven't found where. It seems important to make sure we're explicitly capturing and addressing configuration needs relevant to distributing configuration.
To begin to scope these, I've posted a wiki page at http://groups.drupal.org/node/205673. It could use attention both from Configuration Management Initiative contributors and distribution and Features maintainers. From there, potentially, considerations can be moved back to this issue or additional issues can be posted.
Comment #42
Dries CreditAttribution: Dries commentedAssigning to myself for a closer look. This shouldn't stop other reviews from happening.
Comment #43
Dries CreditAttribution: Dries commentedSo I looked at this and I'm already very happy with this. Great job.
First and foremost, I think the design and architecture are sound -- it is really not complicated.
I think that (for now), it is OK to require an explicit reload when configuration files are changes on disk. This means that there should be a button in the administration page that explicitly reloads configuration from disk into the active store. I don't think we should support dynamic reloading of configuration files at this stage. Similarly, in the other direction, I'd just write configuration changes in the UI immediately to disk -- at least for the time being.
Let's go with the minimal viable even and than we can worry about more fine-grained caching strategies in follow-up patches when we can do some actual benchmarking and measurements. Different caching strategies could exist; and I can see us exposing a couple of alternatives through, guess what, some configuration settings.
Comment #44
cweagansI have a crazy proposition:
Do we need to keep the idea of an installation profile after this patch lands in core?
I mean, most of what we use install profiles for is configuration of a Drupal site in some predefined way. With the file based configuration api, it seems like the kinds of applications that we build as installation profiles currently would best be distributed as a bundle of configuration files.
Comment #45
Dries CreditAttribution: Dries commentedDo we need 'Verified' in the name? It seems redundant.
Do we need to prefix classnames with Drupal? It may be noise, especially as we add more classes?
Why not call this 'ConfigSQLStorage' or so?
Comment #46
moshe weitzman CreditAttribution: moshe weitzman commentedLooks pretty clean to me. Nice work!
I raised an eyebrow at the format gymnastics here. Do we really need to json_encode and decode right afterwards?
Similarly, we I guess we can't get pretty XML with just simpleXML extension? I do agree that pretty is important.
I noticed that copyToFile() method below is actually reading $data from DB even though it knows we know the value of $data in this method. Perhaps optimize this.
Comment #47
Dries CreditAttribution: Dries commentedSome more comments -- more to come.
#1.
I think it would be good to explain _why_ we want to duplicate configuration files. This is somewhat unusual but I can see why it could matter for security purposes.
#2.
I think it would be good to explain why we want to cryptographically sign this file.
#3.
Again, don't really understand what it means for the storage to be 'verified' -- and whether it is important enough to expose in the class name.
#4.
I thought it was a bit unbalanced that we 'write' to the active store, but that we 'copy' to file.
Two possible alternatives:
Comment #48
gdd@dries Verified makes sense when you think about the fact that data in the active store has been confirmed to be good - it has no syntax errors (otherwise it would not have been loaded) it signature matches the actual data (thus it hasn't been tampered with.) We are probably going to make another class that directly access the files instead of the active store for the update mechanism, this class would specifically NOT have "verified" in its class name. It's an important distinction worth keeping but it could probably be spelled out more clearly. I agree Drupal is probably redundant.
There have been a lot of suggestions about class naming both here and in IRC, but I haven't had the chance to process them all. The move is taking a lot of my mental energy for the next couple weeks.
The duplication of config file is more for modules to be able to ship with default configuration, and for the system to be able to revert to that default configuration. It has nothing to do with security actually. A lot of these comments definitely need fleshing out.
@moshe The gymnastics with json_encode()/decode() are simply the fastest way to get from SimpleXML to an array. I agree it looks hacky and weird but it is very efficient. All that said, I am more and more becoming convinced that XML was a mistake and that the things it buys us are to painful to implement (see discussion at http://drupal.org/node/1288090#comment-5340754). While I hate to reopen this topic, I honestly don't think many people are going to cry if we get rid of it and move to JSON at this point. I am a bit reluctant to commit this first patch without resolving this issue, as we would have to go back and convert a lot of default XML to JSON. Any comments here greatly appreciated.
@pounard that line is stock in Drupal 7 already, the only thing I did is add XML. The rest are already blocked by default.
In IRC, webchick mentioned that one of the things she wanted to test first was to push new configuration to a site, reload it, and see it take effect. That functionality doesn't exist right now and she said it makes it harder to test and grok. Do we consider this a requirement for the first patch?
Comment #49
geerlingguy CreditAttribution: geerlingguy commentedShort answer is no... at the risk of bikeshedding, the reason I say no is that install profiles (especially for distribution on drupal.org) are a lot more than a few configuration directives; they're packages of modules, patches, themes, etc. that are wrapped up and often modified by the install profile itself to make site setup (and setting of configuration options) an easier task.
Comment #50
Anonymous (not verified) CreditAttribution: Anonymous commented"In IRC, webchick mentioned that one of the things she wanted to test first was to push new configuration to a site, reload it, and see it take effect. That functionality doesn't exist right now and she said it makes it harder to test and grok. Do we consider this a requirement for the first patch?"
i have some basic code for this in the 8.x-file-config-reload branch. i don't like it, because the write-to-file-on-all-changes decision just makes this scary.
i don't agree with #43 - i don't think this design is good. we have no answer to the basic reload use case that webchick asked for. we have no answer to 'we have no way to stop your pushed changes from being overwritten'. i've said that before, and i'll shut up if i'm the only one who thinks so.
Comment #51
gdd@beejeebus If we lock the ability to write files during a reload (which has been long discussed) then do we have a problem? We have known for a while we will need to lock things down on reload and we will, and in fact with everyone writing through a single api it should be reasonably simple to do when we start putting everything together. I don't think this is a particularly big deal to fix myself. You don't want your pushed changes overwritten? Either you set this lock manually, or you set as part of your deployment process. Every process has steps to be followed and this can be one of them.
As I mentioned in the original post, we don't even have a consensus on how the update process will look, but these are solveable problems and I personally don't think they reflect badly on the design. If we need to write this before the patch lands that's fine, and maybe its not a bad idea as it would provide a workflow for people to see.
For the record I'm not ragging on beejeebus who has been an enormous help to CMI through the process.
Comment #52
Anonymous (not verified) CreditAttribution: Anonymous commentedre #51, yep, locking out writes during the reload would work.
for it to be effective, the locking has to start before the new files start appearing in the config dir. getting the new files into the config dir is not going to be anything the config system is aware of, so the lock has to be set manually, or by a process outside of drupal code. there is nothing in the design that makes this just work, instead you need an extra step - 1. lock all config writes, 2. git pull/scp/make your edits, 3. run the reload, 4. unlock writes.
i can live with "you have to manually lock all writes, load your change files into the config dir, reload, then unlock", but i don't like it as a design, because the "loading new config from disc" is, IMO, one of the most important use cases for this API.
Comment #53
catchI'm not at all sure about writing straight back to files either, especially since it seems that's potentially going to make things harder rather than easier. Is there an issue where this is being discussed in more detail?
I think some people might cry with joy ;) No complaints from me at all.
Comment #54
Sborsody CreditAttribution: Sborsody commentedThoughts... I apologize in advance if what I write has been discussed before somewhere else and I'm not about to pretend I know what the proposed code is exactly doing.
- "Pushing new configuration to a site" and "write-to-file-on-all-changes"
I normally take this to mean copying configuration files from an authoritative master (i.e., staging) and using something like rsync to put the files on a production instance. Not something Drupal should handle because it can't assume the underlying architecture of the production operation system (I imagine someone creating a module that would push/pull configuration from a master to production using a web-based service as transport.). I think what you may mean by "pushing new configuration" is writing out a new set of configuration files based on whats in an "active store" (which did webchick mean?). I don't know what the write-to-file-on-all-changes decision was about, but as a change manager I'd want to always be in control of when new configuration files are written out because each new configuration set represents a baseline and I'd want to determine when baselines are established (like maybe you don't want to automatically commit to your VCS and tag it as the next version after saving a file). Equally, I'd always want to be in control of when new configuration is loaded into an "active store" as it is a potentially service impacting event. My push process would probably send a reload command after files are copied into place. I would have little need to write new configuration files on a production instance because I've designated my authoritative master to be where such changes are managed (unless production _is_ my authoritative master and I'm writing to files in order to back them up or something). When I push (rsync) new files to production, I always intend to overwrite whatever is in production's "active store". If there's a service impacting issue related with configuration (maybe dblog got left on and the server suddenly got real busy), I'd want to make changes on the fly on the production instance. I might want to do this by editing a config file on production and reloading it or I might want to do this by directly affecting the "active store". During this time, I'd want to disable anything that automatically overwrites the production instance's config files or "active store" temporarily, depending upon however I've architected the production operation system.
- Reload it and see it take effect
Absolutely this is the core functionality that I think most people would expect to see from the CMI initiative but I think it useful to divide "making configuration changes" from "reading and storing configuration". Different types of changes necessitate different implementation methods as some changes are simple assignments while others require a little bit of scripting intelligence/conditional checking/contextual awareness (cf., a variable change vs. adding a new field to a content type vs. adding a new view which queries a new content type). It seems beyond the scope of a subsystem that "reads and stores configuration" in files to handle actual change implementation. Who knows, there might be use cases when an operator won't even use files to read and store configuration in. Maybe reading configuration is just a series of JSON responses from a master server (scary thought?). As for "making configuration changes", change management always involves making changes from a baseline, which is theoretically always known ahead of time. I can't help but always think of hook_install, hook_update_n, etc., when this subject comes up. These functions assume a specific baseline (how many thousands of errors have been thrown when someone's database schema didn't match the baseline hook_update_n assumed? :) ).
Comment #55
Crell CreditAttribution: Crell commented@heyrocker: JSON as a config format would have the advantage of being more common in PHP projects from what I've seen than XML these days, and would be consistent with using Composer to replace info files should we go that route (cf, #1398772: Replace .info.yml with composer.json for extensions). On the other hand, I am increasingly convinced that we are going to need an XML canonical serialized entity format (cf, http://groups.drupal.org/node/197583), even if we also have a JSON version. So really, we're damned if we do, damned if we don't. I can live either way. Based on what you've found with the difficult of working with XML, I'm thinking you're probably right that JSON is less-bad than XML at this point.
@Sborsody: The basic idea is that we shift from a "database is always right for config, and sometimes we hack in a way to push it to code so it can be checked into git" model (as now) to a "config lives natively on disk where it is accessible to version control, but we also have an in-DB (or similar) copy of it for speed and to allow for diffing after doing a git pull on production". That DB copy is the "active store". Of course, keeping those two copies in sync creates all sorts of fun and exciting questions that are still being worked out in this thread. The canonical baseline for a configuration state in this model becomes a tag in your VCS.
Comment #56
gdd@catch @crell One way or another, either format is going to require a schema definition of some type. For instance, we will need to have properties like 'translateable' (already discussed this with Gabor) and some of these will be private to the file. XML has this natively with attributes, JSON does not and will require creating some drupalism around it. I'm not necessarily against this but it should be part of the discussion. I do see that, if we look at general trends in the 'web world' whatever that means, JSON seems to be gaining a lot of ground as a config format.
Comment #57
Sborsody CreditAttribution: Sborsody commented@Crell
Yes, that's the goal, to have the configuration live in a file so it can be checked into a VCS and tagged in order to establish baselines. The secondary purpose of using files is to package configuration (in a standardized format) into an easily transportable form between different environments. That makes it great for those of us building and maintaining/upgrading/expanding sites or managing a staging/prod system. It may not be so great for some operator who sees no need to write configuration to a file. Maybe the operator is just someone running a blog and they don't use any staging environment thus have no need to transport files between environments and they don't use any VCS to manage changes. Or maybe the operator likes to use Services to transport configuration. I think the decision of when to write from active store to files needs to be left up to the operator. Any kind of "write-to-file-on-all-changes" function should be ...wait for it... configurable.
Comment #58
gdd@sborsody there is already plans to make a setting to bypass writing to files, but the assumption currently is that it will be writing files by default
Comment #59
Sborsody CreditAttribution: Sborsody commentedOK, great. I apologize if I bring some use case up that has been discussed elsewhere before. Just trying to think through possibilities and how I'd use this system.
Has the idea of having separate locations for reading and writing files come up? Like, someone is changing the active store and the change gets written out to an "export" file automatically, it won't overwrite someone trying to make changes via the "import" file.
Comment #60
Dries CreditAttribution: Dries commentedI'm going to (temporary) un-assign this from myself as I've signed off on the approach above and given my initial set of feedback. I want to do a second, more detailed review, but I'll give heyrocker and the others time to incorporate the community's outstanding feedback first. Until then, this doesn't need to be assigned to me. Feel free to re-assign it to me when you want my opinion.
Comment #61
Gábor HojtsyI've added #1431292: Migrate date format configuration to CMI (postponed for now) which would be the first to tinker with once we are in a state to work with migrating configuration to CMI, because that is the prime example of translatable config in core (and is also small enough to be doable in a short timeframe while demonstrating how it would work).
Comment #62
Gábor HojtsyAlso marking with D8MI tracking tags because this is a required pre-requisite (not expecting/advocating we add language support right away in this first patch!).
Comment #63
EclipseGc CreditAttribution: EclipseGc commentedSo, I'm actively using this at this point and it'd be really nice if we could get all the values in a config object simultaneously. Currently working to implement this in a plugin solution for blocks and there are definitely instances here where we push the block's configuration down to the theme layer. Passing a stdClass down from a certainly point would be really helpful. I was thinking adding to DrupalConfig something like this:
I think that would make people's lives a lot easier in many places, especially if we're in a situation where we're not 100% sure of all the keys that might exist and would like to offload decisions about that sort of thing.
Comment #64
David StraussI reviewed the patch and the discussion since its posting. Here are my thoughts.
We switched to XML for numerous reasons, including better UTF-8 support (versus inconsistent standards about escaping characters in JSON), support for comments (rather than a Drupalism around JSON keys), consistent IDE editing support, and extensible attributes on tags (potentially useful for i18n and other context). If we get all of that at the cost of some silly-looking code around loading and saving, I think that's a net gain. The only other decent alternative is Yaml, which is prettier (than even JSON), gaining ground as a configuration format, and is less limited than JSON. Unfortunately, there's no native PHP support for the format.
I'd be okay with switching out "verified" for some other moniker like "active" or "validated" in our class names. But, as heyrocker explains, it's an important distinction to maintain in some way.
I'd like to take on-disk formats to a higher level of canonical-ness by sorting the keys. Example files should also get updated to be in canonical format. This will minimize the diff in version control, a design goal we have.
The SQL store should probably just use serialized PHP. It's faster than JSON, and the copy in the database isn't supposed to be human-friendly.
The rest of the patch looks good. Other than XML and putting signatures in separate files, I don't see many functional differences versus what we designed in Denver. I'm glad to see some integration with core.
Comment #65
David StraussFix tag deletion.
Comment #66
gddThis is a new patch containing
- Conversion of existing classes to PSR-0 (thanks to EclipseGC)
- Addition of ability to retrieve the entire data array for a file by passing no key to $config->get()
Also trying to get a handle on where our current core test failures are.
Comment #68
sunFixed SimpleTest integration.
Comment #69
Dries CreditAttribution: Dries commentedWe talked about this patch at length at the WSCCI sprint over the weekend. We spent 4+ hours discussing the current patch. Specifically, we talked about the architecture and suggested some changes to the storage mechanism (i.e. move to a re-usable key-value store). We also discussed some of the known issues such as (1) internationalization support, (2) re-loading of configuration, (3) the various security aspects, (4) upgradability, (5) reacting to configuration changes, etc. (Greg and others are going to share more details on what we discussed.)
After a lot of discussion, we decided that none of the known issues are show stoppers for a first patch to go in -- other than writing tests and making all the core tests pass. Fixing the known issues can happen in follow-up patches as we don't really expect these to change the API in a big way. In other words, the plan and consensus is to commit this patch once the tests are in place and all pass. It will help other work on Drupal 8 make progress.
Comment #70
Anonymous (not verified) CreditAttribution: Anonymous commentedComment #72
Dave ReidAdding xml to the deny list is going to prevent modules like XML sitemap from working at all. I have seen this has been discussed previously but I don't see this is actually been addressed? Seems like these config file type .htaccess restrictions should live in the actual config directories?
16 days to next Drupal core point release.
Comment #73
gddAnother fix to see if we can get the testing profile working
Comment #75
gddThis patch removes .xml from the global .htaccess, which we believe is the cause of the aggregator failures. Working on adding new functionality to auto-create a .htaccess in the config folder at install-time.
Comment #77
sunThe .htaccess bug on .xml files was fixed long ago in my to-be-merged branch already...
#75 did not contain all of @heyrocker's changes in #73. Please make sure you merge the latest 8.x-file-config main branch before rolling patches ;)
Attached patch fixes some more things, and adds a first chunk of basic CRUD tests. All individual changes can be found in my sun-config branch. The interdiff is against cmi's main branch, not against the previous patch.
Comment #79
sunAttached patch resolves almost all test failures. (but not all)
However, I had to add an insane amount of critical @todos, which I wasn't aware of before:
- Config values are not casted to strings (despite promised).
- Config keys are not casted to strings (although promised, too).
- XML can be invalid and not parse-able for many reasons.
- Config keys are not validated/sanitized.
- config()->clear() should really be ->unset().
- Configuration must not be additionally cached by a module in any way (static cache / database cache).
- Some modules invoke drupal_alter() on configuration (e.g., image styles).
- Need a way to list config object names/suffixes _after_ a specified prefix.
- Need a way to determine whether a config object exists.
- Some modules might have a valid use-case for retrieving/listing config objects using a wildcard within the name (instead of only searching by prefix).
- The key of a retrieved value is unknown; since you only get the value. Configuration values (or sub-values) may be passed forward to another function/callback, and thus, that function no longer knows about the key of the value. (unless the key is contained in the value, which is a very very wonky duplication)
- Config keys must not contain periods (within a specific key).
Comment #80
catchFor the generic key/value store there is #1202336: Add a key/value store API open. I'd love to see work on that (or find time to do it), but definitely not a pre-requisite.
Hasn't that been done?
Or does this mean that the procedural functions in includes/config.inc will be converted to a PSR-0 class?
I'd like to benchmark/profile this (ideally both) to see how this impacts page caching performance before we commit. I'm OK with committing a regression to get this in - but let's know if there is one and how much and have an issue open to resolve it if so.
Oh it is catchable if we do this :)
#1247666: Replace @function calls with try/catch ErrorException blocks
I ran out of time, I'll try to look more soon.
Comment #81
catchdreditor status change.
Comment #83
sunAttached patch fixes the ImageDimensionsUnitTest once for all.
Interdiff is against #79.
Comment #85
sunAlrighty, @heyrocker merged my branch into the mainline. Did some minor follow-ups on that to reduce the patch size.
Interdiff is against the 8.x-file-config branch.
@beejeebus wants to look into the remaining Image module test failure tonight (which I can reproduce locally, but fail to explain).
I'll keep on working on completing basic CRUD tests tomorrow.
Comment #87
sunTerribly sorry, ->set() was not chainable yet. Fixed those new test failures.
Comment #89
gddThe ieid for image effects saved through this test is not getting generated, and thus the tests for image effects are failing because they are posting to URLs that don't exist. I have not dug into the code to see why this is, but that's what is going on.
Comment #90
gddOK I think we should be down to two with this one. The last two are related to imagefields not getting updated properly when a style is updated. Can anyone look at those?
Comment #92
gdd#90: cmi.patch queued for re-testing.
Comment #94
tim.plunkettRerolled around #1050746-70: HTTPS sessions not working in all cases
Comment #95
gddHm lets try this
Comment #97
sunRe-rolled, taking all changes into account since #87.
Comment #99
David Strauss@Dries
"Specifically, we talked about the architecture and suggested some changes to the storage mechanism (i.e. move to a re-usable key-value store)."
This sounds like it ought to be created as a persistent derivative of the cache API. (Alternatively, the cache API could be a non-persistent derivative of the key/value store API.) This would also be a great place for session data, form tokens, etc.
"We also discussed some of the known issues such as (1) internationalization support, (2) re-loading of configuration, (3) the various security aspects, (4) upgradability, (5) reacting to configuration changes, etc. (Greg and others are going to share more details on what we discussed.)"
I would definitely like to see those comments. ETA?
Comment #100
effulgentsia CreditAttribution: effulgentsia commentedThis just changes image_style_form_submit(), so that in a rename, the new name is saved first, then the old deleted, so that the new name can be passed to the delete function for updating image field formatter settings. Locally, that makes the remaining 2 tests pass for me. If bot comes back green, I'll post just that change to the CMI sandbox issue queue.
[Edit: looks like @heyrocker already committed my changes to the sandbox, so no follow-up issue there needed.]
Comment #101
sun@David Strauss: While I agree with your thinking and initially had the same considerations, cache backends actually use a schema that is more than a simple key/value store. A cache backend at the very least needs to record and track when a cache item was created and when it expires (and be able to query/filter all items based on that), and the backend also needs to support advanced methods like flushing all items or deleting all items by prefix. While that could be an extension of a mere key/value store, the concept already sounds vastly different merely based on that. But I'd agree we totally should investigate that direction.
Comment #102
catchLet's talk about that in #1202336: Add a key/value store API.
Comment #103
gdd@david I hope to make followup issues on those and other topics over the weekend. There are a lot of things floating around and I want to make sure they get captured.
@catch when I made that comment about PSR-0, I had forgotten there was still procedural stuff in config.inc. So the comment should probably just be removed for now, and a followup issue created about possibly converting those functions to classes.
@sun
Can you elaborate on why this is a problem?
The duplication issue is really tied to XML as much as anything, and while it's a little messy I don't really have a problem with it to be honest. As we discuss the file format again (yet another followup discussion) this can be discussed at the same time. Regardless I don;t think it should be a blocker for this to be committed. The other two I think are good candidates for followups as well.
Yes, the implications of this have to be carefully considered
I think these cleanups should be pretty easy to get in now, while we're also writing tests.
Comment #104
gddThis patch adds extensive tests for get(), set(), and clear(). There are probably some variations on casting still not covered (object->array is one I just thought of) but it covers most of the possibilities. I also looked into some of sun's todos:
- We cannot rename clear() to unset() because unset is reserved. I remember now that his is the whole reason I called it clear() in the first place.
- Sun added some very nice casting functions which are here and part of the tests now.
Comment #105
gddand the patch
Comment #106
sunre: #103:
Can only lead to bugs, really. The config system does not provide a module hook to allow modules to react when configuration is updated. Thus, if a module uses an additional static/db caching layer around the configuration value(s), you can be 99% sure to work with potentially stale and outdated configuration data.
You'd never ever cache the results of a variable_get() either. Wrapping config lookups into a cache is fundamentally wrong.
This inherently means that the config system needs to be as performant as variable_get(). I don't see why that shouldn't be the case already, or at least, as we've already discussed, we most likely want to switch the DB store to use serialized PHP values instead of XML (which needs rather performance-heavy encoding and decoding).
The problem is not related to the file format in any way; it would be identical with JSON or whatnot. The issue is that when retrieving a nested key of a configuration object, and passing that key's value forward to some other function, then the other function no longer knows about the original nested key within the configuration object.
The recent fixes for Image effect tests demonstrated the issue. Without adding the nested key's keyname to the value that is passed forward or saved, functions being called down the line are not able to regain the nested key's keyname from where the value originates from.
However. After further discussing that issue with @beejeebus in IRC, I strongly believe that code like the Image Effect API is something we do not want support in the first place.
That is, because with the changes of this patch, the Image Effect API is now a relatively stupid wrapper around the Image Style API. The effects are contained in a style. This means that the Image Effect API tries to perform CRUD operations and manage a deeper section within a configuration object.
To put this into relation, consider the following scenario:
You'd never design an API to manage value within another variable in the current system. ;) And so I believe the idea of doing so is bogus with the config system, too.
Effectively, we want to remove pretty much all parts of the Image Effect API functions, and adjust the entire Image module code to work on image styles only. Since we fixed all test failures already, that can be a follow-up issue.
I actually think that this is totally bogus, too. Again, would you ever do the following?
I don't think so. Altering configuration on the fly via drupal_alter(), whenever it is loaded, looks completely nonsensical to me.
AFAICS, this (nicely) forces modules to actually update configuration when they need to add or alter something. This will definitely have a lot of implications on all of the existing APIs involving configuration currently, but I strongly believe it's a unforeseen, implicit step into the right direction. We actually had a lot of discussions on DrupalCons and in IRC around this topic in the past, and if I'm not mistaken, pretty much everyone agrees that it's what we always wanted and needed to do, but never came around to implement. :)
In other words, we want to remove that drupal_alter(). As that is a precursor for a relatively huge architectural change, I could see the potential requirement to figure out the replacement code/behavior for such drupal_alter()s in this first patch. Personally, I'd also be happy with a follow-up issue though.
Comment #108
gddRe 2: OK, so as long as the effect is never individually passed around, we can just leave the machine name as its key and refer to it through there ($style->effects->image_desaturate->weight = 5). Yes, this makes a lot of sense.
Re 3: This also makes a lot of sense. At first glance it SEEMS like anyone who would previously alter() a value could simply change / add to the config by hand. One implication of this is that when that module is uninstalled it would also need to undo its changes (assumedly through hook_uninstall()) as opposed to with an alter() where this would just happen as a matter of consequence. This dovetails somewhat into a discussion we had this weekend about cleaning up configuration on module uninstall.
Comment #109
effulgentsia CreditAttribution: effulgentsia commentedRe #106.3: field_get_display() is one use-case where an alter step is absolutely necessary, because run-time information may influence what's desired above and beyond what's saved as the configuration. But I think it makes sense for now for such alters to be called from the client code, and not have the alter embedded in the config system. Especially since such use-cases will almost always require additional context beyond what the config system is aware of.
Comment #110
yched CreditAttribution: yched commentedI'd say an alter() hook on stuff that is loaded (from the db in D7-, from db or config files in D8) and can be saved back is nonsensical to begin with - on next save, the altered stuff gets merged in and becomes the "official" version, that gets re-altered on next load... Larsen effect :-)
I think that the only things that currently have alter() hooks are :
- stuff defined in code : forms, render arrays, menu router paths, views data mapping, *_info() hooks...
- run-time stuff that do not get saved back : e.g. the field_get_display() example @effulgentsia mentioned (adjust the display settings used for a field just before it gets rendered - but do not touch the $instance definition)
That's the very reason why I always resisted a hook_field_alter($field), btw. We don't know how to handle that pattern.
In fact, I'd say this distinction is part of the heuristics about "what goes in config files / what goes in code". The content of config files cannot be altered through hooks in code.
(FWIW, It's also why I'm really unconvinced by the WSCCI / Plugins API proposal to move "list of available plugin implementations" from info hooks to config files - although I'm not sure whether that is still on the table in @EclipseGc's latest code)
Comment #111
yched CreditAttribution: yched commentedUnintended tag change
Comment #112
hefox CreditAttribution: hefox commentedre: 106.3: variable_get somewhat has an alter hook that modules are using: hook_init (or any hook really), because $conf was/is a global.
Modules using it: Spaces, Domain config, Strongarm.
I've also used it in custom module, for example, to change the maintenance theme to the currently active theme so batch operations don't suddenly switch themes (d6).
It's ugly hack but it's been needed, specially for per context configuration.
Comment #113
gddNote that, at least thus far, $conf will remain pretty much unchanged (albeit with a new access mechanism most likely.) The config system currently relies on the database, which means it comes in way late in boostrap. Anyone whose solution currently relies on mucking with $conf will still be able to do that without much change.
Comment #114
bojanz CreditAttribution: bojanz commentedFeels a bit odd to have #1444620: Remove file signing from configuration system as a followup (and not a part of this issue), since we are then adding code that will get removed right away.
I see the alter discussion starting up in other places as well (for example, by nedjo the Distributions groups page he linked to).
Note that right now complex configuration such as Views and Rules (which I envision using CMI one day) is alterable.
This allows a Commerce submodule to alter a default rule to add its own condition, or at least disable the rule so that another one can take over.
The solution depends on general "overriding" ideas and functionality that gets built over time. A great option would be having the ability to say "module X made this change. The user made these other changes". If not, at least having a simple alter when the configuration gets moved from the module to the central place would make it behave close like the current implementation (module gets uninstalled, the configuration can get reverted). Can't say it's very clean or nice, though.
It is a complex problem that doesn't need to be solved by the first few patches, but it is definitely something that needs to be discussed (from here to Denver).
Comment #115
gddIn response to @catch in #80, I ran some simple benchmarks tonight checking page caching with the patch and without. I randomly generated five nodes with devel and loaded a cached home page five times with the patch and five times without. This generated a range of data, and I took the middle one in both cases. They are available here:
http://heyrocker.com/cmi.html
http://heyrocker.com/no-cmi.html
The high level view is that the CMI version is a slight regression, which is not completely surprising. We have no caching at all (not even static caching on repeated requests for the same data) and the code hasn't been seriously optimized. Catch already pointed out a call to db_table_exists() which I can avoid, I'll roll that into a new patch tomorrow. There are definitely a lot more function calls in the cmi version.
I have been hoping to be able to avoid adding another caching layer on top of the config system in order to keep the architecture simple, but if we have to we have to.
Comment #116
gddNew patch. Includes the following:
- Removes the check on db_table_exists() in read()
- Adds a ton more tests
- Rerolled to head
Based on the specs we had laid out last weekend as far as the requirements for this patch to land, I believe we have met those goals. I know that catch and others have had comments about the object design, but we are planning to address those in followups, even if they are forced to introduce API changes. Again, the main goal here is to unblock WSCCI and get something in place for them and the plugin people to work from.
In the last few days I have also added these followup issues:
#1448330: [META] Discuss internationalization of configuration
#1447686: Allow importing and synchronizing configuration when files are updated
#1444766: Determine method for automatically cleaning up configuration after a module is uninstalled
#1444620: Remove file signing from configuration system
There is also discussion of changing the classes around so that they extend a basic key/value store class, which would have a lot of use in situations around Drupal in general. That discussion is happening at
#1202336: Add a key/value store API
I think we're ready to go if everything comes back green.
Comment #117
cosmicdreams CreditAttribution: cosmicdreams commented@heyrocker: newbie question here:
When CMI is in full effect should we be pulling all values of all variables from the $config object?
For example, in _drupal_bootstrap_page_cache() we have
...but later we don't use the $cache object to do ...
See? We're doing a variable-get here instead of pulling the value from $config (which I assume is because the value of 'page_cache_invoke_hooks' isn't stored there.)
Comment #118
Tor Arne Thune CreditAttribution: Tor Arne Thune commentedWhy were the assertion messages ('Expected img tag was found.', etc.) removed in #77 by sun (I suppose)? Aren't those useful?
Comment #120
gddYes, as soon as this patch lands we will open up a huge pile of followups about converting various subsystems to the new system
Comment #121
sun#117 @cosmicdreams: All variable_get()s and variable_set()s will be converted to this configuration system after this patch lands.
At this point, however, we didn't figure out yet how to deal with "internal" variables like the one you mentioned. That is, because it exists via $conf in settings.php only (which is early-early-bootstrap, not to say, pre-Drupal bootstrap), cannot be configured through the UI in any way, and thus rather maps to a global variable (instead of a $conf value). We'll figure out those edge-cases along the way.
All other $conf values are regular variables (variable overrides, to be exact). This hack to override variables won't be supported anymore, AFAICS.
#118 @Tor Arne Thune: Test assertion messages like that are hiding essential debug information that is important in case a test fails, since you do not see why the test fails. I'm actually trying hard to educate people that custom assertion messages should only be used when the default assertion message is bogus. (the raw, asserted values are helpful, not bogus)
I had to do this particular tweak for this exact test twice (for another patch) in the meantime. In both cases, the custom message prevented the developer from seeing what's actually wrong.
Comment #122
Tor Arne Thune CreditAttribution: Tor Arne Thune commentedOkay, fair enough. Good to know for the future.
Comment #123
cweagansDang, I went through the latest patch with Dreditor, but about halfway through, Chrome decided that using my trackpad to scroll down was the same thing as the "back" gesture, so I lost everything.
I don't have time to go through it all again, so here's only a few notes:
Why do we need a wrapper function here? Why not just call config_get_verified_storage_names_with_prefix() directly?
People that have been following the CMI conversations know why this function behaves the way it does, but I'd like to see a bit of documentation that says why we need to create a SimpleXMLElement, then json_encode and decode it.
Coding standards: Should start with a capital, and end with a full stop. There's also trailing space in a few locations throughout the patch. This patch should probably not be held up on coding standards: we can capture that in followup patches, but if somebody has to reroll it, maybe try to fix those (there's only a handful of them).
There's a few places throughout the patch where there's @todo items inside functions (rather than in the function docblock). For some reason, I thought that having @todo comments inside functions wasn't allowed by the coding standards or by the phpdoc spec, but I could be wrong.
Out of the issues I mentioned above, I think only the wrapper function issue is a blocker for this patch (unless there's a good reason for it, that is).
Comment #124
cosmicdreams CreditAttribution: cosmicdreams commentedwith #116 passing the test suite, what does this patch need? User testing? If so I can give this a go tonight.
Comment #125
catchThere's a couple of things that bother me about this. These don't block a commit, but I'd like us to have a follow-up issue (if one doesn't already exist) so we can resolve it earlier rather than later.
First, the storage object is instantiated for each $name. Normally with storage stuff we have a single instance (database, cache) then it's up to that to fetch data as needed. If we changed this, it'd mean the various crud methods on the storage class would need a $name method. So that'd be an API change for storage engines, but not for anyone else.
This would allow a storage class to do similar stuff to DrupalCacheArray (i.e. cumulatively cache requested config objects over time). Or just for static caching of configuration that could be handled by the storage class. Without having to do it static class properties or similar. Also if we move to dependency injection, I'd expect a single class to be passed around (although that class could be a factory that individually instantiates the storage backend each time but that feels a bit odd).
I'm also not sure why we're allowing the caller to specify which config class gets returned or what the use case for that is. I can think of use-cases for the storage engine to swap out the config class (i.e. one that actually calls back to storage rather than holding any data at all or something), but not so much the caller. There's also no explanation of why you might want to specify a different class in the phpdoc so that doesn't help to figure it out.
6 days to next Drupal core point release.
Comment #126
gdd@cwaegans attached patch cleans up these issues. I think I left one of the inline @todos because it made a lot more sense where it stood. The d.o doxygen standards do allow for inline @todos (see http://drupal.org/node/1354).
@catch So if I understand this right we would go from this
to assumedly a singleton of some sort (?) and something like this
right? I admit my OO design skills are not the greatest so I want to make sure I am getting it. If so then guess I see this as a DX vs performance issue. I agree it would be helpful (even likely necessary) to be able to do static caching of requests since right now the get() process is really inefficient because we may check all sorts of things multiple (even dozens) of times in a page load and we're hitting the database for each one right now. But there is no doubt the DX is a lot uglier there and the original process is way more smooth.
This also harkens back to the earlier discussion about possibly combining the filename and key into a single parameter such as
As far as another config class, I think this could become useful down the road as other systems begin to want to use a file-based storage system but they don't make sense within the framework of our two-tier configuration. For instance, there has been talk of putting the module list into a file-based system. However it makes no sense to have such a thing in this config system. We actually need the module list in order to operate and we don't (yet) plan to be available this early. So a thought I had was you could make a new class that wraps all the same functionality DrupalConfig does but operates on a different set of files and in a different way (like it doesn't have an active store.) And again, this all works back towards making all these classes be more generic and decoupled for greater flexibility which we've already started discussing elsewhere.
Comment #127
catchI was thinking more like:
<?php
function config($name, $class = 'Drupal\Core\Config\DrupalConfig') {
// @todo factory.
static $storage_class = new DrupalVerifiedStorageSQL(();
return $storage_class->get($name, $class);
}
So the end-user API is exactly the same, but the storage API changes.
Comment #128
sunAgreed with @catch in #127. I don't think the public API needs to change for this. We merely change config() to act more like a factory with regard to the storage backend.
This means that only the internals of each individual storage backend will change -- collecting the requested keys on an internal property and operating on a key of that property, instead of instantiating an entirely new storage backend class instance for every config object that is operated on. Most of that logic can likely be forked from the existing CacheArray class.
Comment #129
gddAh yes, this does seem to make a lot more sense. However, given that we're already discussing reworking the classes I think it makes sense to work these ideas into that rearchitecture rather than doing it now and tearing it apart again later.
Comment #130
cosmicdreams CreditAttribution: cosmicdreams commentedSo is the changes requested in #127-129 all that is needed for this?
Comment #131
gddNo I'm arguing those should be pushed off for a followup patch and we're ready to go as is right now. What needs to happen is someone who is not me needs to RTBC it, preferably someone who has been working on it and has familiarity with it.
Comment #132
webchickSounds like catch's main concern was around how the storage API is figured out. This was responded to in #129 as something that could be a follow-up with other class re-architecture, since it shouldn't impact the public-facing API. This would still allow us to get something committed that other D8 work could build from. I asked heyrocker where the best place for that discussion was, and he said it was at http://drupal.org/node/1202336#comment-5625320. (#1202336: Add a key/value store API)
So given we have a follow-up for that discussion, given that we have to example implementations, given that this system has been reviewed at various points by dozens of very smart people, and given that we have an entire gaggle of tests for CRUD operations and various data types, I'm going to be bold and mark this patch RTBC.
Comment #133
cweagansNot that it matters much, but I second the RTBC. The patch is good enough right now and, as other have noted, committing it will unblock other work on D8.
Comment #134
fagoI'm a little confused of what's the latest position in regard of caching an reactions.
From #106:
Sure, that's fine for the variable_get() use-case. But for the configuration system to be usable for storing configuration objects we need custom caching functionality on top of it. The same for reactions. Is it still the plan to work on that as follow-ups?
Related, please also make sure there are hooks that allow modules to act when a bunch of configuration objects is read into the active store. I recently ran into the need for this when working with the exportable entities of the entity api module, which kind of already implements the active-store idea by syncing all defaults into the database. More details at #1458568: efficiently rebuild caches when rebuilding defaults.
Comment #135
catchI think the storage back end should be responsible for caching if at all possible, that's part of the reason for #125/#126/127. If there's a need for custom caching over and above that then we need a proper follow-up issue to discuss it before this goes in.
If #1202336: Add a key/value store API is the issue to deal with caching in storage back ends this then we need to bump that to critical task since the absence of either static or persistent caching is going to regress performance once we start converting more things to the new API. I'm not sure that issue is really the right place to track this, but if heyrocker is happy with CMI relying on that issue then it's probably OK.
I'd like to profile this myself before it goes in - Greg's profiling flagged up the db_table_exists() issue, would like to see how a couple of other things look like before commit. We're still way over thresholds and I'd prefer if this didn't go in until we're under. Will probably leave the actual commit to Dries either way.
Comment #136
fagoI don't think that flies, as there are cases where the cache is special and does not correspond 1:1 to the configuration object storage. In Rules, configurations are processed and prepared for fast execution before they are cached. Also, there are custom aggregations like all rules that react on a certain event, which are aggregated to single event-set and cached for efficient execution when the event occurs. (Just getting all X rules would not be the same).
Not sure whether there are more modules that do something like that and have such needs, but at least Rules does.
I don't think supporting such "custom" caching layers is a big deal though, as all we need is hooks for reacting upon changes. And we are going to need that anyway if we want to be able to store bundle-objects as config objects as the field API has to be notified when bundles change, so it can care about the necessary db-changes.
Comment #137
sunHad a quick Skype call with @fago to bring him up to speed on the config system and also discuss recent comments.
Braindump:
@fago was mainly concerned about "entity-ish" configuration objects (such as node types, image styles, rules). Modules currently provide API functions for loading, saving, and deleting configuration. Especially the save and delete API functions invoke alter, insert, update, and delete hooks in all modules. Additionally, Entity API module's exportables approach naturally and natively provides the old/previous/original configuration when invoking the update hook. The hooks and original config have many use-cases and are important for managing configuration in a modular system.
We get behind the idea that the low-level config system should be able to work as independent as possible. Thus, the low-level config system should ideally have no dependency on the module system. We also discussed the idea of two implementations, an independent low-level Config class that would be extended into a DrupalConfig, which would then depend on the module system in order to additionally invoke module hooks - but in the end, the idea of invoking hooks for every single config read/save/delete sounded nonsensical. Rather:
We can get behind the idea that module-specific configuration objects should be managed and maintained through a module's dedicated API functions. No real difference to now; if there's a module API function to read/save/delete a variable, then you better use that function and don't fiddle with the variable directly.
When looking at the patch, it's relatively clear that most probably all module API functions for "entity-ish" configuration, such as the converted image style functions, will all look pretty much identical. We therefore see a potential for adding generic configuration CRUD functions to the Configuration module (config.module) at a later point, which would be wrappers around the low-level config system, but additionally have a dependency on and account for the module system, in order to harmonize those module API hooks for config CRUD operations. That would eliminate a lot of duplicate code (similar to the other entification changes currently happening for D8). Custom functions like image_styles(), image_style_save(), and image_style_delete() would entirely go.
Can totally be a follow-up issue.
The above shouldn't be confused with #1447686: Allow importing and synchronizing configuration when files are updated. The reload/synchronization operation will also need module hooks. That said, it's possible that above mentioned configuration CRUD API functions and hooks will be helpful/re-usable for the reload/synchronization operation.
We agreed that any additional static/db caching layer on top of the config system can only lead to problems. Not to be mistaken with configuration performance optimizations -- the configuration system itself may, similar to some current subsystems, implement some advanced techniques for prefetching known-to-be-used configuration objects for a particular request, or any other means that improve its performance.
We also agreed that drupal_alter() on configuration is nonsensical. @yched nicely clarified in #110 why.
However, no drupal_alter() and whatsoever also means that configuration is effectively no longer "modular". Variables weren't either. But anything non-modular in a modular system always causes problems down the line. Above mentioned config CRUD module API functions might help to remedy that.
Would be a good idea to remove the commented out code and also the left-in drupal_alter() in image.module (and image.api.php) as part of this patch.
It wasn't very clear to @fago whether default module config would be retained within a module's config directory (and read from there). That part could probably use some better documentation. Speaking of, a
@file
or@defgroup config
explaining the high-level functionality and architectural design of the configuration system would be a good idea.We weren't really sure how "default" (?) configuration for installation profiles would be handled. The current patch only cares for module-provided default config. Specifically, we brainstormed about how an installation profile would override a module's default configuration.
With regard to all config, both default and custom/active, it's also not clear how maintenance updates would work. Both from a module perspective, but also with regard to installation profiles (as well as Feature modules, which, like installation profiles, will mainly consist of configuration + updates).
We discussed that the automated write to disk on every config()->save() doesn't fully make sense from a staging perspective. The automated write is suitable for the development site (on which you want to update the possibly versioned file config), but that same automated and unconditional write can hi-jack and break the architectural design idea of the envisioned reload operation, if it happens on the staging or production site, because that would potentially overwrite new but not yet reloaded config in the filesystem.
At the same time, you may not want to completely lock your staging/production configuration, so you can still perform config changes there. In turn, the write to disk should be at least an opt-in/opt-out; in any case, not unconditionally be part of the config()->save() operation.
That would get us much closer to the design idea of original (inactive) and overridden (active) configuration, and we'd have properly and cleanly encapsulated "reload" (synchronize) and "dump" (write-to-disk) operations in the configuration system.
(Speaking of, @sun is still tempted to rename this issue into "Synchronized configuration system", which is the actual abstracted architectural design idea and goal.)
Comment #138
gddDoes it really make sense to profile the system now, when we know we're already going to rework it pretty majorly? We know we will be (almost certainly) adding static caching and optimizing the code in ways that will make the current version pretty irrelevant. The main goal here is just to unblock the other initiatives.
As for the reloading of config, please make sure your comments are put to #1447686: Allow importing and synchronizing configuration when files are updated
Comment #139
catchThis mentions JSON (and even json.php) but the API is currently using xml.
Also there is no upgrade path for the converted variables and I don't see a follow-up issue linked (and that would have to be a critical bug since we'd be immediately breaking the upgrade path on committing this).
Comment #140
gddNew patch rerolled to HEAD and fixing the comments above. I discussed the upgrade path with catch on IRC and we agreed that if it doesn't get done before this gets committed then that should be opened as a new critical bug. It shouldn't be that tough to write for the two subsystems that have been implemented.
Comment #141
webchickOk, testbot came back with a pass at http://qa.drupal.org/pifr/test/231588 (not sure why this doesn't show up on the site yet).
catch's review was around comments and such, which were subsequently fixed in #140. Confirmed the patch applies to 8.x. Therefore, marking back to RTBC.
heyrocker's requested that this commit be done as a merge, rather than via a patch, in order to provide proper credit to everyone who worked on it. The commands for that (apparently) are:
Assigning to Dries to take a look at.
Comment #142
webchickAnd actually doing what I said I did now. :)
Comment #143
catchMy review was not just around comments, we need a critical bug opened to add an upgrade path for the converted variables if it's not done as part of this patch. We already have plenty of critical upgrade path bugs open and are just about under thresholds, so there's space for one more, but let's not forget to open that.
Comment #144
David_Rothstein CreditAttribution: David_Rothstein commentedThe following looks like it would require a critical bug to be opened also:
This might need a little thought on how to solve, but I guess since both $config_directory_name and $config_signature_key are random strings, they could be dealt with the same way as $drupal_hash_salt is (in the case where it's not provided in settings.php)?
Alternatively, we could require that people who want to install Drupal without ever making settings.php writable by the web server must specify a $config_directory_name by hand (in addition to the database credentials which they must specify now).
Comment #145
gddI think the second answer is the best one as far as that goes. This definitely needs a bug report but I don't know if its critical or not, I guess it is since it prevents Drupal from working. I might try and fix that before commit if I can get to it before Dries gets to this.
This patch is a reroll to HEAD, I merged my branch with the changes from the PSR-0 DB:TNG commit and am running it past the bot as a precaution. If It comes back green I'm just going to put this back to RTBC.
Comment #146
gddand we are green!
Comment #147
Dries CreditAttribution: Dries commentedRead up on all the new comments, looked at the code once more, and decided to commit this to 8.x. Epic win.
Great work everyone. See you in the follow-up issues.
Comment #148
gddReopening because we need the following
- change notice
- CHANGELOG.txt entry
Comment #149
gddChanging title to match standard as well
Comment #150
xjmSetting priority/category.
Comment #151
webchickThis can come off Dries's plate now.
Comment #152
effulgentsia CreditAttribution: effulgentsia commentedWhat's the change notice that's needed? We have not yet removed variable_get().
Comment #153
xjmThat we added a whole file-based configuration system, with all its new parts? :)
Comment #154
gddhttp://drupal.org/node/1500260
This is mostly the information from webchick's introduction post cleaned up and streamlined a bit. I don't know if it is overkill or not, but I figured it was worth providing some level of detail. I'll open a followup about CHANGELOG.
Comment #155
gddComment #156
Tor Arne Thune CreditAttribution: Tor Arne Thune commentedLooks good :) The change notice does reference Drush, but that's not really a problem.
Comment #157
webchickHooray!
Comment #158
plachMoving the title back to ease future reference.
Comment #159
xjmAnd, untagging. :)
Comment #160
webchickCross-posting #1508872: Image effects duplicate on edit which is probably related.
Comment #161.0
(not verified) CreditAttribution: commentedAdded a little more detail to the description of the active store