Problem/Motivation

When a config patch is created it may create lots of entries to add and remove just because the integer keys don't match up.
We need to improve this to show only what is really added or removed. This would become more problematic when things are edited and then the diff shows that everything changed while it in fact only changed slightly.

ie we don't want to keep indexes when they are not relevant.

Steps to reproduce

split something affecting permissions for example.

Proposed resolution

Unfortunately it means we can not do the clever trick with the array utilities.. we have to copy that code to config_split and improve it to handle the situation with integer keys. Maybe we could patch drupal core as well, but maybe we will have to use the config schema and that would be outside of the relam of what the utility can do.
tbd.

Remaining tasks

Write a test with permissions and other config where this is an issue (in general sequences are a pain)

User interface changes

none

API changes

none

Data model changes

slimmer config patch.

Command icon Show commands

Start within a Git clone of the project using the version control instructions.

Or, if you do not have SSH keys set up on git.drupalcode.org:

Comments

bircher created an issue. See original summary.

pyrello made their first commit to this issue’s fork.

pyrello’s picture

I made a first attempt at getting the logic for sequences working better. Next step is to write a test for this, but I'll probably need some help on that front, since I don't have a lot of experience doing it.

pyrello’s picture

I'm starting over with a new branch and MR so that I can establish passing tests to start with.

pyrello’s picture

Ugh. Didn't realize that using the ProphecyTrait would be a problem, so I had to remove it to get the test runner not to error out. Also reverted the last commit prior to that to establish the initial passing test.

pyrello’s picture

Last commit should fix the new test that was failing, but will fail on the existing testSimpleMergeExample test. I'm still not totally clear on the logic of what is happening in that test, but it fails on the last assertion.

pyrello’s picture

Status: Active » Needs review
StatusFileSize
new1.09 KB

Adding a patch with just the new permission (sequence) test that was added to the MR in the last commit to demonstrate that it is failing prior to the rest of the changes in the MR.

Status: Needs review » Needs work
bircher’s picture

This is looking great already.

One small thing we need to add to the test you are adding (in #10): We should also test that the merging works again. I know the sorting can be off, that is a problem for another day.
But basically if you take the config patch (, revert it) and apply it to one of the examples you should get the other one.

We can also add more contrived examples that add and remove things etc with a dataprovider but that is not a hard requirement for a first version.

I think the failing test is testing pretty much this, though in a more involved workflow. So if you can make an easier-to-understand version like I described in the first paragraph of this comment then that will be helpful for all contributors.

bircher’s picture

maybe it is as simple as switching the boolean value to the preserve integer keys on the merge utility.

Oh and the method should probably not be public, none of this is an API.

pyrello’s picture

The problem that is causing the previously existing patch test to fail seems to happen because of the NestedArray::mergeDeepArray() method, which is unfortunate. When you look at the existing test for this method (https://git.drupalcode.org/project/drupal/-/blob/9.3.x/core/tests/Drupal...), I think it is pretty clear that this use case wasn't envisioned. If $preserve_integer_keys is FALSE, there is no array_search to check if the value already exists so that it doesn't get added twice.

I'm not sure if, in the world of config, it is a valid use case to have multiple elements in a sequence that are identical to one another or not. If that is a valid use case, then I'm not sure how to solve for both that and being able to merge config that may have duplicative elements that are not intended to be duplicated.

pyrello’s picture

Assigned: Unassigned » pyrello

pyrello’s picture

I opened MR 11 because I was needing to test out the code from https://www.drupal.org/project/config_split/issues/3232667 and this issue at the same time and be able to show progress on the corresponding working that I am doing related to this. I can close it when I am able to create a "finalized" patch or alternately these two issues' MR's get merged.

pyrello’s picture

Based on my understanding of the issues that were raised in Slack about the necessity of possible duplicate entries in config sequences, it seems like we have two situations that we are trying to account for:

  1. Some sequences should not allow duplicates. Their presence likely indicates that the config is wrong. An example of this would be permissions in a user.role.*.yml file.
  2. Some sequences may need to allow for duplicates. I am not personally familiar with any examples for this, but there is nothing preventing from using config this way, so we need to allow for the possibility.

I believe that the first case is actually the more common use case for config and if we were going to solve one versus the other in the near term, that it should be that case. @bircher and I have discussed the possibility of using some sort of plugin system that would provide a finer degree of control over individual cases. I think that is probably going to be necessary without changes to core config schema properties. If such a system comes into being, I still think the default case should be de-duplicating config sequences.

I think MR 9 is mostly accomplishing the first case above.

As the most recent test failure shows, there may still be room for adding some sort of post-merge sorting to sequences, so they will be in a predictable order.

pyrello’s picture

Expecting this to go from 1 failing test to 2 failing tests.

pyrello’s picture

And then back to 1 failing test.

bircher’s picture

I just had a crazy idea, what if we make the keys of sequencial arrays string keys with some smart logic before doing the diff?

Like if the array contains unique strings (such as for permissions or dependencies) we make the key csp_unique $value. Or if the array contains another array we check if it has a uuid or id component and make the key csp_uuid $value['uuid'].

Then after merging we clean it up again and remove all the keys where we added the csp_ prefix. For legitimate duplicates we have its own key, and for arrays which dont have a uuid or where it is not unique we leave it as is or just pefix it but keep it an integer so that it will be removed and added as now.

Basically the idea is to try to identify the elemnents better so that the diff doesn't compare by identifying it as "the first element"

pyrello’s picture

We had discussed how we might handle things like core.entity_view_display.node.*, especially in cases where Layout Builder is turned on and being used to provide a content type template. I think ideally, we would be able to have some sort of object comparison go through and figure out what is different for each object it finds. But I also think that how it is working right now is fine for the time being.

Here is an example of a patched config file of the above mentioned type: https://github.com/uiowa/uiowa/blob/0115ed14b9ed5115f21c149301d3a16488f2...

The issue we are trying to solve in this thread is something that makes CS2 unusable for certain types of configuration, which would be necessary for us to move forward with using it. Solving for more intelligent LB template handling and similar use cases seems like more of a nice to have that we could add in the future. So I propose a separate issue for tackling that and decoupling it from this issue.

If we are removing that from consideration, then I think we just need to figure out how to fix the sorting problem in the tests and make whatever adjustments are necessary there to make the tests work in a way that we have confidence in them.

alokbhatt’s picture

I was facing an issue while run config-split:import for user role permissions, in which I was changing permission for some roles, so I added a few permissions with config-split.patch file. Once import config-split, I noticed that some of the permissions has been changed assigned in main user.role permission configuration file and due to that admin toolbar was removed for a role.

I have applied a patch https://git.drupalcode.org/project/config_split/-/merge_requests/11.patch and that fixed the problem. Now only permission affected written in config-split.patch file.

Thanks for the patch.

pyrello’s picture

@alokbhatt - Just a word of warning that work in this issue is ongoing and so the format could yet change.

Glad to hear the patch worked for you!

pyrello’s picture

@bircher - If you are okay with deferring the more complex object handling that we have discussed until this issue: https://www.drupal.org/project/config_split/issues/3238855, then I think the only thing needed here is to resolve the issue that causes the merge test to fail, which is that the merged sequences don't match because we are not sorting them. I'm not 100% sure that the test itself doesn't cause this problem because we are using fictitious config that is not associated with a schema that would allow the Sorter to do its thing. Some guidance on how to proceed would be appreciated.

pyrello’s picture

Status: Needs work » Needs review

This solution resolves the failing test. Looking for feedback on whether this approach is okay.

pyrello’s picture

StatusFileSize
new9.46 KB

Adding a patch that I can reference from my project. Based on my testing, this seems stable enough for me to use for the time being.

bircher’s picture

This looks already very good. I am just a bit afraid of the sorting, but maybe it is just me being paranoid.

I will have to test this a bit more with some real data and maybe commit it in two steps so that we have more traces in the git log of how the array utility methods change.

It may be hypothetical (and in Drupal there is always someone that has a case where it becomes a practical case) but I don't know if this could handle things that are on purpose duplicated in sequences. (like the same layout builder block twice?).
That is why I thought it could be an option to do something more defensively, ie change keys from numeric to predictable strings in cases we know what to expect (ie know the type of values etc).

Also if this works for you and you are not @pyrello or @alokbhatt (and not working on the same teams) then by all means chime in and let me know! The more people that say that this fixes their problem and nobody that says that it makes things worse the more confident I am with the approach.

pyrello’s picture

For what its worth, I believe that for block in layout templates in the core.entity_view_display.node.*, each block (even duplicates) would have a different UUID, so this would not be an issue in that case. I can’t say for sure that this would never come up (your point about there always being a practical case for hypotheticals in Drupal is well taken). My testing with those particular files has seemed to indicate that they patch as expected.

I get being paranoid about the sorting. That is that part of this that I am the most unsure about. But, it also doesn’t break any existing tests, so it is not entirely clear without additional information what the problem with it might be.

tedfordgif’s picture

Here are the configs in one of my codebases that have explicit integer array keys:

$ grep -r '^ *\d+:' -P config -l | sort | head -3
config/dev/devel.settings.yml
config/sync/system.site.yml
config/sync/views.view.[redacted].yml
# many more views

The integer key in devel.settings is for the error_handlers, which is used in devel.module:

      $error_handlers = devel_get_handlers();
      if (!empty($error_handlers[DEVEL_ERROR_HANDLER_STANDARD])) {
        \Drupal::messenger()->addMessage($msg, ($severity_level <= RfcLogLevel::NOTICE ? MessengerInterface::TYPE_ERROR : MessengerInterface::TYPE_WARNING), TRUE);
      }
      if (!empty($error_handlers[DEVEL_ERROR_HANDLER_BACKTRACE_KINT])) {
        print kpr(ddebug_backtrace(TRUE, 1), TRUE, $msg);
      }
      if (!empty($error_handlers[DEVEL_ERROR_HANDLER_BACKTRACE_DPM])) {
        dpm(ddebug_backtrace(TRUE, 1), $msg, 'warning');
      }

system.site has 403 and 404, but those are siblings of the "front" key, so this would technically be OK with your patch. Feels a little squishy to me, if not fishy.

For views.view.* I have four scenarios:

  • Filter groups: Pretty sure this would be affected (see core/modules/views/src/Plugin/views/query/Sql.php:919), but I've never had to split config at this level
  • Grouped Filters: Ditto, but in core/modules/views/src/Plugin/views/filter/FilterPluginBase.php:858. This sometimes has an 'All' key in the code, but that doesn't get added to the config.
  • Views bulk operations: Also need to preserve these keys.
  • Entity ID filters (e.g. term ID): Didn't bother checking if these need preserved, but I can imagine a site with a terrible development workflow where dev has different term IDs than prod, and they resort to config split, even as a temporary measure. Frankly they should pay the price, but this is just the nail in the coffin for this approach as a general one.

Why don't we just make all configs strongly typed? <ducks>

pyrello’s picture

@tedforgif Thanks for bringing up these cases! This truly is not an easy problem to solve.

I have no idea how to resolve the case of devel.settings.yml:

error_handlers:
  1: 1

I think by the time we get to the point where we are doing the array merge, the above would be indistinguishable from:

error_handlers:
  - 1

For the system.site.yml, that is exactly the reason why it only considers the keys to be numeric if all of them are numeric. I'm open to suggestions about how to improve it.

I'll have to look into views a bit more. While I don't think we are currently using config split to override any existing views (which is where this patch merge code would come into play), it is something I had been thinking we would be able to take advantage of after moving to Config Split 2.

I will agree that this solution breaks (or potentially breaks) some of the above config entities if they are partially split. However, I would ask, does that seem like a bigger bug than the way it currently handles something like a user.role.*.yml without these changes? Here is an example of a patch export of a permissions file from my project: https://github.com/uiowa/uiowa/blob/b2f9d8847917aff4ca62637fe1055fffa931...

tedfordgif’s picture

Edit: I know these don't address the "do we ignore the keys" question, but they are related. I don't have time to track down the existence of the "ignore?" ticket.

pyrello’s picture

I don't think that first issue will have any affect on the problem with sequence handling outlined by this issue. That issue is driven by the fact that the merging method used doesn't have an elegant way of handling sequences (which really just reflects the same underlying problem with PHP's array_merge_recursive() function).

I added a test to cover the devel.settings.yml case.

bircher’s picture

hmm instead of checking if the keys are integers, I think we could use the same logic the yaml formatting is doing to decide whether to use a dash or the number. (I am not suggesting to call that function but to copy that code or idea)
I think it comes down to check if the array keys match range(0, count($arr)) or similar.

I think it would also be good to clearly see where we deviate from the core utility.

I am not sure that we shouldn't also be able to deal with duplicates, but I don't have any good example of any config where that would be a thing. But for that I think the only way is to transform the keys so that this information can be kept.

What I like about this is that it solves a lot of problems.

And I think we should not interfere with the sorting here, its fine if it gets added to the end. The sorting should be a different concern. So for the test we can add a sorter that sorts everything recursively and add a dummy test name (I will refactor it anyway sooner or later to make the name required)
Or alternatively we make assertions that ignore the order.

pyrello’s picture

I am not sure that we shouldn't also be able to deal with duplicates, but I don't have any good example of any config where that would be a thing. But for that I think the only way is to transform the keys so that this information can be kept.

To clarify, if we are going to do a check for duplicates, this would be within each array prior to merging. The merge by its very nature might produce duplicates and we wouldn't want to keep those. So, we would need a check prior to looping through the arrays to check this and to set the $is_sequence variable (introduced in the last commit) to be FALSE if duplicates were detected.

pyrello’s picture

For duplicates, I think we could add another private static method that essentially does something like this: https://stackoverflow.com/a/43635394/1375741 and checks each array before processing and sets $is_sequence to FALSE if duplicates are present.

However, since we are not actually aware of any cases where valid duplicates might be used, would it make sense to leave this as is, for now, and add the duplicate check later if someone actually reports it as an issue?

pyrello’s picture

StatusFileSize
new10.33 KB
new0 bytes

Adding another patch based on current progress.

pyrello’s picture

StatusFileSize
new2.86 KB

Messed up the interdiff. Here's a new one.

bircher’s picture

RE #37: Yes that makes sense, we can ignore duplicates for now, and then deal with it when there is a real case.

I suspect that your proposed solution for duplicates won't work though because we would need to detect duplicates also when creating the patch. So we could have duplicates before merging, we could have duplicates in the patch or we could have duplicates in the array before applying the patch.

So in order not to get the perfect in the way of the good I think we can live with it for now.

Just a little thing: the new methods need to be private because they do certainly not constitute an API and I can already see people using them and then their code breaks... All of this here is just a sandbox to find out what an API in core or another module would have to do.

bircher’s picture

Status: Needs review » Needs work

I'll give it a proper review when I have some more time, but for now the public methods need work. (ie the tests need to be updated too)

I would prefer if we tested it through the public api of creating the patch and applying it, but I can live with a closure that uses call to access private functions of other classes in a test or even reflection.

pyrello’s picture

Status: Needs work » Needs review

Seeing what happens with the tests.

pyrello’s picture

I haven't quite figured out the testNonSequenceNumericKeyMerge test yet, but I think I have the rest of them working. If its obvious what I'm missing about how that test should be set up, let me know.

pyrello’s picture

Status: Needs review » Needs work
bircher’s picture

Ah that is because the test you copied is way too complicated for what we want to test here.

All we need is:
$configA
$configB

then we create a $patchAB and $patchBA

then we assert that the patch applied to the config gives the other config. (ie it works both ways.)

And that one patch is the inverse of the other.

bircher’s picture

oh and of course the config A and B could come from a data provider, and we could optionally add a expected patch.

pyrello’s picture

Status: Needs work » Needs review

Refactored the test for numeric keys which are not a sequence. In the process of doing that, I noticed that the diffArrayRecursive method was not respecting the non-sequence keys, which was causing the test to fail. I refactored to fix this issue.

Ready for review!

pyrello’s picture

Status: Needs review » Needs work

Changing the logic in diffArrayRecursive broke another test.

pyrello’s picture

Status: Needs work » Needs review

The test was implicitly using a mixed array (numeric and string keys), which is not what I think we are trying to test.

bircher’s picture

I refactored the tests a bit. I hope it will be easier in the future to add test cases to the provider.
I also added it to the beginning so that it is the first thing you see and not the complicated test I made for myself.

pyrello’s picture

These changes look great! Much more robust testing tools for the patch side of things!

pyrello’s picture

Status: Needs review » Reviewed & tested by the community

I checked with @tedfordgif and he indicated that he was okay with the changes here. I tested again against my setup and am still getting consistent results for user.role.* config files on export. Marking as RTBC.

  • bircher committed 110b275 on 2.0.x authored by pyrello
    Issue #3232243 by pyrello, bircher, tedfordgif: Improve config patch and...
bircher’s picture

Status: Reviewed & tested by the community » Fixed

Thanks for all the contribution!

This is a big improvement even though we left some cases unaccounted for.

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.