Problem/Motivation

If users modify scaffold files (e.g. ".htaccess"), then they might not get changes to this file made in newer versions of Drupal.

Scaffold files should not be modified in Composer-managed Drupal sites; instead, users are expected to use "append" or "patch" capabilities to add their custom content to the Drupal-provided files. Some users do not know this yet, though, so some additional behavior to detect and warn users who modify the generate files is being proposed.

We ask people to modify their .htaccess file in the following use cases, and there is probably an expectation that their modifications will not be lost.

  • using Drupal in a subdirectory
  • suggesting a rewrite from or to a sub-domain

Additionally hacking .htaccess can help when redirecting to a front-end application not managed by Drupal.

However allowing composer to update .htaccess, web.config or other files is important because those files may change in new versions of Drupal due to security vulnerabilities, new features and so on.

There are three available ways to manage this in my repository: excluding the files from scaffolding altogether, or adding an append configuration (which is ++ for robots.txt, but not applicable to they types of changes in .htaccess), or by patching the scaffold files with cweagans/composer-patches. Managing scaffold file modifications through the scaffold process (i.e. with append or cweagans/composer-patches) is the recommended method, but poses some risk to developers who are not yet familiar with these techniques, and who might be uncertain about how to configure their file-mappings in order to get the desired behaviors. During this learning period, developers are in danger of accidentally losing any ad-hoc modifications that they may make to their scaffold files, as scaffold files are overwritten every time composer install runs.

Proposed resolution

User Expectations

  1. I would expect to get the latest version of all the various files at the time I install Drupal
  2. I would expect that if I make modifications to those files, that they’re left alone
  3. The exception to that would be if there are updates “upstream” to the file(s) [for example, a change to resolve a security issue, which has definitely happened with .htaccess in the past]
  4. Then, I want some means of knowing there's a change between my local files and upstream, so I can back my stuff up to compare/constrast the two and figure out what I need to do (either accept the changes wholesale or merge them into my modifications somehow)

The scaffold plugin now takes the following actions:

  1. When a scaffold file is written, its contents are hashed, and the hash values are stored in a cache.
  2. When a scaffold file is to be rewritten, then the contents of the source file are once again hashed; if the hash value matches what is in the cache (the source file has not changed), and the scaffold file still exists at its destination path, then it is skipped (not written).
  3. If a scaffold file is modified at the destination path, this is detected because its content hash no longer matches the cached hash value.
    If the scaffold file does not exist at the destination path, it does not count as a modified file.
  4. When a modified scaffold file is rewritten, then the scaffold plugin will warn the user that they may loose data, and will prompt them for the action to take
    • They may choose to discard their modifications.
    • They may abort the scaffold operation, in which case none of the scaffold files are changed.
    • They may permanently keep their modifications, in which case the file is removed from consideration and is no longer managed by the scaffold plugin.

Note that in the last option, where a scaffold file is no longer managed by the scaffold plugin, the user presently (in the current implementation of the most recent version of the patch) will not be warned if the scaffold file changes in a newer version of Drupal. It is desirable if the user knows when a modified upstream file has changed. The proposal is to print a warning during scaffolding. It would not be consistent to prompt, as the user has explicitly removed the file from being managed by the scaffold code. A prompt may be missed in the output of composer install / composer update; however, the user currently is not warned at all e.g. when running drush pm-update, so perhaps this is sufficient.

The hash cache data used by the scaffold plugin must be stored in either a transient location or a non-transient location. The non-transient location was selected.

Discussion on transient vs non-transient hash cache storage

An example of a transient location is inside the vendor directory, e.g. vendor/drupal/scaffolded.json. The advantage of the transient location is that the file does not have to be managed by the Drupal site. It may be committed to the repository if vendor is committed, but users who do this are accustomed to ignoring changes to vendor contents during updates. The disadvantage of the transient location is that it may be inadvertently removed; for example, if the user removes their entire vendor directory and then runs composer install, then the scaffold operation will overwrite any modified scaffold file. This may be unexpected, as scaffold files are written to locations outside of the vendor directory.

An example of a non-transient location is inside the project root, e.g. .scaffolded.json. The advantage of the non-transient location is that it is not subject to accidental removal if the user should decide to delete their vendor directory. While it is still possible for the user to deliberately delete their hash cache, with all of the consequences described above, this is less likely to happen. The disadvantage of the non-transient location is that there would be a new dot file at the project root of every Drupal site, and its contents will change inscrutably during Drupal upgrades. Every site owner must be aware of this file and decide whether to commit or .gitignore it -- although we can place it in example.gitignore to help with this decision, and document the file.

The proposal is to use a non-transient location for now, and add a comment to the file pointing the user to the scaffolding documentation. Since the hash cache keeps transient information that is only useful to users who are not using the recommended scaffold file management techniques (append or patch), we would then have the option to perhaps move the cache to a transient location in Drupal 9 or Drupal 10, once we felt that the recommended method was sufficiently socialized.

Original proposed resolution from becw

  • Update the default behavior in core to specify overwrite: false for some scaffolding files (likely .htaccess, robots.txt, and web.config)
  • Update comments in those files to include a note like "If you modify this, update your composer.json to contain ___", since each of these contain comments suggesting how to modify different parts of their contents
  • Reduce the frequency that the scaffolder touches any files, so that they're only updated when core is installed or updated, which makes it much easier for developers to manage the diffs -- and allows developers to routinely incorporate core updates that affect these files.

Remaining tasks

  • We need to decide what to do about scaffold files that the user has removed from consideration (set to "false" in file-mappings). Currently, no warning is provided if the source scaffold file changes once this is done.

User interface changes

Possible changes to how running composer behaves

API changes

None

Data model changes

Introduces a "transient-cache" field that allows users to decide whether to keep their hash cache in a transient location (inside the vendor directory), or whether it should be stored in the project root, where it may be committed to the repository (the default).

Release notes snippet

TBD

CommentFileSizeAuthor
#144 3092563-144.patch30.22 KBvacho
#138 interdiff_137-138.txt2.24 KBnarendra.rajwar27
#138 3092563-138.patch30.22 KBnarendra.rajwar27
#137 3092563-137.patch30.23 KBdevkinetic
#117 3092563-117.patch45.71 KBravi.shankar
#102 3092563-102.patch45.27 KBgreg.1.anderson
#102 101-to-102-interdiff.txt2.79 KBgreg.1.anderson
#101 3092563-101.patch46.27 KBgreg.1.anderson
#101 97-to-101-interdiff.txt20.75 KBgreg.1.anderson
#100 3092563-100.patch42.63 KBgreg.1.anderson
#100 97-to-100-interdiff.txt13.57 KBgreg.1.anderson
#97 3092563-97.patch1.07 KBgreg.1.anderson
#97 94-to-97-interdiff.txt1.07 KBgreg.1.anderson
#94 3092563-94.patch27.99 KBgreg.1.anderson
#94 88-to-94-interdiff.txt10.05 KBgreg.1.anderson
#88 3092563-88.patch32.35 KBgreg.1.anderson
#88 76-to-88-interdiff.txt9.13 KBgreg.1.anderson
#76 3092563-76.patch26.49 KBgreg.1.anderson
#76 75-to-76-interdiff.txt1023 bytesgreg.1.anderson
#75 3092563-75.patch26.39 KBgreg.1.anderson
#75 72-to-75-interdiff.txt1.13 KBgreg.1.anderson
#72 3092563-72.patch26.01 KBgreg.1.anderson
#72 63-to-72-interdiff.txt4.04 KBgreg.1.anderson
#64 interdiff_63-64.txt910 bytesshubham.prakash
#64 3092563-64.patch25.7 KBshubham.prakash
#63 3092563-63.patch25.69 KBgreg.1.anderson
#62 3092563-61.patch1004 bytesgreg.1.anderson
#62 59-to-61-interdiff.txt1004 bytesgreg.1.anderson
#59 3092563-59.patch25.68 KBgreg.1.anderson
#59 57-to-59-interdiff.txt6.66 KBgreg.1.anderson
#57 3092563-57.patch25.46 KBgreg.1.anderson
#56 3092563-56.patch9.23 KBgreg.1.anderson
#56 52-to-56-interdiff.txt9.23 KBgreg.1.anderson
#52 3092563-52.patch19.29 KBgreg.1.anderson
#52 51-to-52-interdiff.txt1.32 KBgreg.1.anderson
#51 3092563-51.patch19.25 KBgreg.1.anderson
#51 49-to-51-interdiff.txt9.28 KBgreg.1.anderson
#49 3092563-49.patch18.46 KBgreg.1.anderson
#46 3092563-46.patch1.37 MBgreg.1.anderson
#46 43-to-46-interdiff.txt11.4 KBgreg.1.anderson
#43 3092563-43.patch18.48 KBgreg.1.anderson
#43 40-to-43-interdiff.txt9.17 KBgreg.1.anderson
#40 3092563-39.patch15.38 KBgreg.1.anderson
#40 37-to-39-interdiff.txt7.76 KBgreg.1.anderson
#37 3092563-37.patch13.1 KBgreg.1.anderson
#7 drupal-3092563-7-comment_to_avoid_overwriting_htaccess.patch1.94 KBbecw
#6 drupal-3092563-6-avoid_overwriting_htaccess.patch1.96 KBbecw
#2 drupal-3092563-2-avoid_overwriting_htaccess.patch1.95 KBbecw

Issue fork drupal-3092563

Command icon Show commands

Start within a Git clone of the project using the version control instructions.

Or, if you do not have SSH keys set up on git.drupalcode.org:

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

becw created an issue. See original summary.

becw’s picture

Here's a patch updating the default behavior in core to specify overwrite: false for the .htaccess, robots.txt, and web.config scaffolding files.

Mixologic’s picture

Status: Active » Needs review
Issue tags: +Composer initiative

This looks good to me.

greg.1.anderson’s picture

The solution in #2 is probably not desirable, because .htaccess changes in Drupal core may have security implications. With overwrite: false, sites will never get updates to .htaccess on upgrades.

I think that (b) is the best solution, provided that the comment also contains a warning that users who go this route will not get future updates to the .htaccess file.

Option (c) is a desired future optimization for other purposes, but it's sort of a time bomb for this use case, as it may lead site builders to think their .htaccess customizations are safe, only to have them silently removed at random. We'll probably end up doing this anyway at some point in the future; folks will just have to be careful.

greg.1.anderson’s picture

Status: Needs review » Needs work

Although I think that we should switch techniques completely here, I also noticed that the paths to the scaffold assets are wrong in #2.

  1. +++ b/core/composer.json
    @@ -196,14 +196,26 @@
    +                    "path": "assets/scaffold/files/htaccess",
    

    n.b. wrong path.

  2. +++ b/core/composer.json
    @@ -196,14 +196,26 @@
    +                    "path": "assets/scaffold/files/htaccess",
    

    Also wrong path.

becw’s picture

becw’s picture

If it makes more sense to manage this via a comment / option (b), here's a patch with the type of information that I think would be relevant.

There might be some wordsmithing necessary to add a warning about receiving updates -- since the expectation for most sites is that there will be some modification to the .htaccess file (generally redirecting requests to https, or enforcing the presence or absence of the www subdomain).

webchick’s picture

After discussion with @Mixologic, @greg.1.anderson, @mradcliffe, and @becw, I'm tentatively escalating this to a critical bug report, since it represents data loss.

It sounds like currently, whenever you composer install, composer update, or composer require, your outstanding changes to all files managed by the scaffold get overwritten. That's fine (well, not FINE but...) if it's code stuff we don't want you hacking anyway, but very un-good when it's a file like .htaccess that people often make modifications to, very deliberately (@mradcliffe pointed out that there are even instructions in the file telling you it's ok to do this).

And it sounds like the overwriting just happens, and you're told about it after the fact, vs. given an opportunity to opt-out. :\ Which... if you're not checking these files into Git (which AFAIK checking Composer-managed stuff into Git is not a recommended best practice?), you are quite effed and lose all of your modifications unless you backed up beforehand (which people would probably think to do ahead of a composer update, but not necessarily a composer require which you might do a dozen times in a day when working on a new feature. Also, .htaccess is a hidden file, so not always backed up by default, depending on what flags you passed to your copy program)

I was asked to chime in with what the user expectations would likely be, and I think they are:

  1. I would expect to get the latest version of all the various files at the time I install Drupal
  2. I would expect that if I make modifications to those files, that they’re left alone
  3. The exception to that would be if there are updates “upstream” to the file(s) [for example, a change to resolve a security issue, which has definitely happened with .htaccess in the past]
  4. Then, I want some means of knowing there's a change between my local files and upstream, so I can back my stuff up to compare/constrast the two and figure out what I need to do (either accept the changes wholesale or merge them into my modifications somehow)

If people think that's being absolutely unreasonable, then I guess we can talk about it, but I think we both a) do NOT want indiscriminate data loss (especially at unexpected times), but also b) want a decent shot at getting covered right away (or nearly so, after some manual "diffing") when upstream security patches happen.

webchick’s picture

Category: Feature request » Bug report
Priority: Normal » Critical
webchick’s picture

Issue tags: +Needs security review

Would love whatever solution we come up with here to get checked over by members of the sec team, too. I don't think #2/#6 are going to fly, for example, but maybe so! :)

mradcliffe’s picture

Issue summary: View changes

I modified the issue summary with the template and to try to bring in some of webchick's comment above based on the Slack conversation with @greg.anderson, @Mixologic, @becw, @webchick and myself into the issue summary.

greg.1.anderson’s picture

If we want to know if a file has been modified since it was scaffolded, we could calculate a checksum on the file after it is written and cache it somewhere. This would allow us to detect changes to the scaffold file on the next scaffold run. However, there is a question as to where the checksums should be stored, and what we do if the user deletes the checksum file. There is also a question of what we do if a change in a scaffold file is detected when we are running in no-interaction mode.

Mixologic’s picture

So, the proposal in slack, as I understood would be one of the two following scenarios:

If we are *just* comparing whats on disk vs what we're trying to scaffold:

This is the simple case where we do not attempt to distinguish between whether a file has *changed* from the last time we scaffolded it or not.

If the scaffold tool is trying to write a different version of a file over one that exists on disk,

  1. An update from upstream, (local=A, upstream=B)
  2. Local modifications, upstream hasnt changed (local=B, upstream=A)
  3. Local modifications, upstream has new features (local=B, upstream=C)

Then we need to prompt the user to tell them:

Scaffold is trying to overwrite assetA with new contents? Allow (y/N)?

  1. If the user allows, overwrite the file with the new contents.
  2. If the user says no, we should update the root composer.json with scaffolding configuration to say "[web-root]/.htaccess": false so that the user does not get prompted again. We should also emit a message telling them what we're changing
  3. If the user is in non-interactive mode, we should default to No, but continue to prompt later, without saving the changes to

If we are track what we scaffold

If we *do* attempt to keep track of that somehow, as per #12, then the rules would probably be different.

  1. If there were no local changes, we would just update files as normal, and not prompt (local=A, previoiusly scaffolded=A, upstream=B)
  2. If there *were* local changes and we were not trying to update, then we would do nothing, and not prompt (local=B, previously scaffolded=A, upstream=A)
  3. If there *were* local changes from what was scaffolded, and we were *also* attempting to update, then we would somehow warn the user that "upstream has new changes to this file" (local=B, previously scaffolded=A, upstream=C) and try and give them hints on how to find out the difference between A and C.

Finally, the settings in "drupal-scaffold": { should take precedence over any of the above rules.

Mixologic’s picture

the A's, B's, and C's above are say, example shortenend md5 sums.

Mile23’s picture

I think this should be in the scaffold extra.

Assets already have overwrite: false, and this should also allow overwrite: ask. So for instance .htaccess files are always ask.

The project template should also allow overwrite-ask:never, so that overwrite:ask is promoted to overwrite:true. overwrite-ask:always would then do the opposite and cause all overwrites to be prompted.

We should also have a clear report of what wasn't scaffolded, and a clear path to retrying.

I don't think we need to store checksums. I think we need to teach people how to have their own scaffold path repos, so they can scaffold their own custom .htaccess files.

greg.1.anderson’s picture

I think we need to teach people how to have their own scaffold path repos, so they can scaffold their own custom .htaccess files.

Yes, this is the design for the scaffold plugin; if you are doing something that is completely custom, then you should replace the upstream scaffold file completely. This does not give you security updates, but neither does the customize-in-place workflow.

The question is, how careful do we need to be with folks who are doing it "wrong" / "the old way" / "the way they've always done it"? We also need to track down all of the documentation related to modifying scaffold files in place and fix it up.

greg.1.anderson’s picture

Thinking about this a little more, I thought to compare the behavior of scaffold files with the standard behavior of Composer projects in the vendor directory.

  1. By default, you are not supposed to modify files in the vendor directory.
  2. Some people do it anyway.
  3. Composer even has a feature to install "source" versions of packages that make it easier to modify them in place.
  4. If you run `composer update` et. al., Composer will warn you before overwriting your modified packages.

I think that we should behave similarly with scaffold files. This would imply that we need to checksum, so that we can detect modified files.

Mixologic’s picture

In this case it's *very* typical to want to modify scaffold files. These files, by their very nature, are the 'core that is actually, kinda okay with being hacked', and always has been. I don't think we can assume that modifying robots.txt/.htaccess amounts to "doing something completely custom" -we've got instructions in .htaccess itself that tells you to customize it for various scenarios.

I think we need to teach people how to have their own scaffold path repos, so they can scaffold their own custom .htaccess files.

Yes, this is the design for the scaffold plugin;

This is advanced usage of the scaffold plugin, and hopefully not the default expectation for end users. We'd like to enable advanced workflows, not require them.

What we're trying to handle is the situations where the end user has no idea about the features detailed in composer/Plugin/Scaffold/README.md. , such that we're adhering to the principle of least surprise.

Keeping track of the checksums ultimately feels like a better ux than not, because of the rarity of the scenario that requires management.

ressa’s picture

I think we need to teach people how to have their own scaffold path repos, so they can scaffold their own custom .htaccess files.

That sounds like the correct approach, affecting the least number of users, if I understand it correctly. Applying the 80/20 rule here, the majority of users probably won't need to edit the .htaccess file.

As a user, I expect updating Drupal core will overwrite files like .htaccess and robots.txt, perhaps a left over from using Drupal for many years? From this also follows that I expect to have to re-apply any customization.

Perhaps it is worth considering adding a bit of warning text to the current .htaccess file: It's possible to edit it, but certainly not advisable?

NaheemSays’s picture

I think many novice users will amend the .htaccess file - atleast to set whether to require "www" or not.

Additions are a different thing.

greg.1.anderson’s picture

We should encourage users to scaffold rather than edit their scaffold files, both in the documentation and in the scaffold file comments (where appropriate).

We should also advise users to disable the scaffold file update (by setting its location to false ) in their composer.json if they do edit it. We could even consider doing this for the user, as suggested above.

We should also checksum, so that users who do not read the provided advice are not surprised by the overwrite.

ressa’s picture

I think many novice users will amend the .htaccess file - atleast to set whether to require "www" or not.

That's probably true, and it gets deleted when they update ... An alternative solution is Redirect to HTTPS with settings.php, as well as www or non-www:

You can also force SSL and redirect to a domain with or without www in settings.php, the benefit is that it won't get overwritten after updating Drupal.

slasher13’s picture

You can also use patches:

"post-drupal-scaffold-cmd": [
  "cd docroot && patch -p1 <../patches/htaccess-ssl.patch"
]

see https://git.drupalcode.org/project/drupal/tree/8.8.x/composer/Plugin/Sca...

greg.1.anderson’s picture

I'm going to work on a patch here. The first iteration will not change behavior of the scaffold plugin at all, it will only change the responsibility / scope of existing classes.

Today:

  • Scaffold operations are responsible for deciding what to do, telling the user what was done, and doing the action.
  • Scaffold operations return Scaffold results which represent what was done.
  • Scaffold results are used by the gitignore manager to decide whether to add any files to a .gitignore file.

Future:

  • Scaffold operations are responsibile for deciding what to do and returning Scaffold action objects.
  • Scaffold actions are responsible for telling the user what was done, and doing the action; they also hold their contents, which they can checksum.
  • Scaffold actions return scaffold results, which behave as they currently do.

Relationship between operations and actions:

  • ReplaceOp returns either a Copy File action or a Symlink action
  • AppendOp returns a Copy File action
  • SkipOp returns a Skip action

In the new design, the ConjoinOp will disappear; the parent operation will be passed in to any op that wants it (only append op for now). Maybe the append op will get the action, not the op; haven't decided yet.

Once this is in place, it will be possible to adjust the behavior per #8 and #13. It isn't really possible to implement those ideas in the current design without a refactor, as currently the append op first runs the parent op before appending to it, so there's no good way to calculate a checksum to determine whether or not a scaffold file has changed.

greg.1.anderson’s picture

From a slack thread, a more lightweight alternative:

  1. After calling the post-scaffold-cmd hook, iterate over all of the scaffold files from the scaffold file result list.
  2. Checksum the scaffold files and store the relative path and checksum somewhere.
  3. At the beginning of the next scaffold command, load the cached list of checksums.
  4. Calculate the current checksum of all of the scaffold files and prompt the user [y/n] "do you want to overwrite these changed files?" iff any changed, showing the list of files that changed.

This solution would prompt the user every time a scaffold operation ran (EDIT / CLARIFICATION: the prompt would happen every time if there were modified files), regardless of whether the upstream file changed or not.

Question: Is this good enough for 8.8.x, or do we want to hold the release on #24? Does the answer to that question depend on how long it would take to do #24?

Mixologic’s picture

Though for #25.4 I dont think we should do any prompting, and merely state "The following files have been modified manually, so we will do nothing to them during scaffolding. Please read the docs
to learn how to set this up correctly"

That way we dont have to handle prompting responses /worry about non-interactive etc.

webchick’s picture

Hm. A prompt that comes up every time regardless if there's something to react to or not is just a "boy who cried wolf" situation. It means users will get used to just passing in -y blindly (or typing Y and hitting enter) and not thinking about "oh CRAP, I didn't mean it THIS time!!" until it's too late.

greg.1.anderson’s picture

I think we need to prompt, as Composer does.

If the user says they want to keep their modifications, we could alter the top-level composer.json to exclude that file from scaffolding. That would stop the prompting. The user would have to take explicit action to get the modified file to start scaffolding again.

This was suggested earlier, but I forgot to include it in my summary.

Mixologic’s picture

I suppose I was thinking about the way git handles this, but then again, probably shouldnt take ux cues from git.

mradcliffe’s picture

I think applying some common user scenarios could help evaluate the approach. Like how would Johanna the baker react or what would Enterprise Company X with a CI system do?

In both cases, after thinking about it, some sort of failure/exit state would be appropriate. Automated systems are going to send alarm bells for companies, and then they can see how to adapt to a patch-workflow or adjust their patch. Johanna would see the failure and then be able to make a decision about the changes.

mradcliffe’s picture

For some reason, probably due to hungry, I thought Johanna was a baker. I don't want to bake dogs! They're a dog groomer :-)

ressa’s picture

I agree with @webchick that too many prompts might cause a "boy who cried wolf" situation, and I am just trying to understand, but should this bit be "the prompt would happen every time if there were modified files, but not in a default installation, with no tampered files"? (My addition in bold text)

This solution would prompt the user every time a scaffold operation ran (EDIT / CLARIFICATION: the prompt would happen every time if there were modified files), regardless of whether the upstream file changed or not.

Scenario 1 (default)

  1. A user installs Drupal 8.8.1 using Composer
  2. No files (such as .htaccess) or in core, vendor etc. are modified
  3. It is updated to Drupal 8.8.2 with Composer

Scenario 2 (.htaccess modified)

  1. A user installs Drupal 8.8.1 using Composer
  2. The user modifies the .htaccess file directly
  3. It is updated to Drupal 8.8.2 with Composer

Assuming a security issue was fixed in .htaccess in Drupal core between version 8.8.1 and 8.8.2, I would expect a silent over-write of .htaccess with no prompting in scenario 1, but a prompt in scenario 2. Is that correct?

Having scenarios for different user roles is great idea @mradcliffe!

greg.1.anderson’s picture

My updated / clarified proposal:

  1. Do not prompt at all unless the user modifies a file after it has been scaffolded.
  2. If a modified file would be overwritten by a scaffold operation, prompt the user.
  3. If the user says "no problem", overwrite the file. There will be no further prompting for this file, unless it is modified again.
  4. If the user says "I want to keep my modifications", then add "path-to-file": false to the file-mappings section in the root composer.json file. There will be no further prompting for this file, and this file will never be overwritten, unless the user modifies their composer.json file.

The only remaining question is, what should we do in non-interactive mode? I think for CI systems, I would expect a scaffold operation to overwrite any modified file (e.g. a test modifies a fixture; it would expect that running composer scaffold should revert that change).

For users, though, I would sort of expect the opposite. Whether Johanna is trying to groom her dog or bake her dog, I would think that she would expect that composer require -n would not delete local modifications and would not modify composer.json. Just reverting all scaffold operations to overwrite: false does not feel right either, though (in addition to not being easily supported with the proposed implementation), so I think that implies that modified files discovered in non-interactive mode would cause the action to fail with an exception, unless, perhaps, an environment variable was set to stipulate the default behavior (overwrite or modify composer.json to preserve).

Mixologic’s picture

I feel like in the absence of user input, we should default to 'do nothing destructive' / 'make no changes'

So, I think in the Johannas cake grooming service example, it would behave as you say, not delete local mods, and also not modify composer.json.

I think that even for CI systems that would be the desired behavior, as we dont cannot make any assumptions about whether or not changes should be reverted.

In both cases, a message should be emitted saying "Modified file not overwritten" etc.

ressa’s picture

Thanks for clarifying the process @greg.1.anderson, and spelling it out in layman's terms. It's much clearer to me now, and the logic in step 1 through 4 in the new proposal seems correct to me.

becw’s picture

The only remaining question is, what should we do in non-interactive mode?

I would expect an automated build (whether CI or part of a deployment system) to fail rather than overwriting the developer customizations to these files -- aside from running tests, this is exactly the kind of issue I hope CI and automated builds will catch.

greg.1.anderson’s picture

Version: 8.8.x-dev » 8.9.x-dev
Status: Needs work » Needs review
FileSize
13.1 KB

Here's the patch, explanation coming in a bit.

greg.1.anderson’s picture

Before starting in on this patch, I checked what Composer does if you modify a file in a git clone of a dependency in the vendor directory, and then attempt to overwrite it via composer require. It looks something like this:

  - Downgrading consolidation/log (dev-master b2e8873 => 1.1.1):     The package has modified files:
    M src/ConsoleLogLevel.php
    Discard changes [y,n,v,d,s,?]? ?
    y - discard changes and apply the update
    n - abort the update and let you manually clean things up
    v - view modified files
    d - view local modifications (diff)
    s - stash changes and try to reapply them after the update
    ? - print help
    Discard changes [y,n,v,d,s,?]? 

This is from the Git Downloader, so it is easy for that class to implement view / diff / stash, as it has the git repository at its disposal. In the case of scaffold files, we have no such luxury, so I skipped these commands.

If you answer "no" to the question above, the composer install aborts. For consistency, I made "no" also abort from the scaffold operation. I also added a [k]eep option that continues with scaffolding, but does not overwrite any modified files. Of course if you answer "yes", then the modified files are overwritten. I also added a DRUPAL_SCAFFOLD_DISCARD_MODIFIED environment variable to force a discard without bringing up the prompt. If you run in non-interactive mode, it preserves any modifications to scaffold files.

Aborting the whole operation on a "no" might be a little too heavy-handed, though, especially considering that's probably going to be the go-to response for many users, not all of whom will read the ? text. I therefore thought that perhaps we should change the behavior somewhat.

  • Answering "yes" still discards all changes, overwriting modifications with scaffold files.
  • Non-interactive mode preserves modified files and makes no changes to composer.json
  • Answering "no" behaves like non-interactive mode (does not abort, prompts user again next run)
  • Answering "keep" behaves like "no", and also modifies the root-level composer.json so the user will not be prompted again (and updates from the upstream will also be skipped)

Note that #37 does not modify the composer.json file at this time.

Also, since I did not do the refactoring suggested in #24 (per a slack discussion, this seems too invasive for 8.8.x), there is no way to react differently based on whether or not the upstream scaffold file contents changed. It was suggested that we would only prompt the user about their modifications when a change in the upstream file necessitated the overwrite. In lack of changes to the upstream file, the modified file would just be skipped without prompting.

alexpott’s picture

  1. +++ b/composer/Plugin/Scaffold/Operations/ScaffoldFileCollection.php
    @@ -72,6 +72,41 @@ public function __construct(array $file_mappings, Interpolator $location_replace
    +          $this->scaffoldFilesByProject[$project_name][$destination_rel_path] = new ScaffoldFileInfo($scaffold_file->destination(), $op);
    

    Do we need to clone $op here? Just so we don't have any unexpected side effects? Maybe not... SkipOp is stateless.

  2. +++ b/composer/Plugin/Scaffold/ScaffoldHash.php
    @@ -0,0 +1,214 @@
    +    return "$vendor_dir/scaffolded.json";
    

    I would have thought we need to store this information in composer.lock - putting it in vendor means that if you remove vendor and run composer install this information is lost and your files will be overwritten.

greg.1.anderson’s picture

Here's an update that modifies composer.json if requested to keep the modified files perpetually. The prompt behavior is now as follows:

The following managed scaffold files have been modified:
  - [web-root]/robots.txt

    Discard changes [y,n,k,a,?]? ?
    y - discard changes and rewrite scaffold files
    n - keep modified files in their current state; ask again the next time Composer runs
    k - keep modified files and modify composer.json to avoid being asked again
    a - abort scaffold operation
    Discard changes [y,n,k,a,?]? 

SkipOp is stateless, so we can use the same one.

I'm not sure how to store the hashes in composer.lock. We can't really rewrite composer.lock directly because of the hash, and if we did, then we'd just lose it again the next time Composer rewrote it. If we put the information into an "extra" field in composer.json and then updated composer.lock it would go in, but it doesn't feel right to modify a source file (and I don't believe the root-level composer.lock's extra is included anyway). Would a dot file at the project root be a little better than the vendor directory? Then it would be in a place where it might be committed, which also doesn't seem right.

Not sure what the best thing to do is. If we did the refactor in #24 we'd have a bit more info about whether files were modified or not, but that seems like too much for 8.8.x. Suggestions welcome.

greg.1.anderson’s picture

So, thinking about it a bit more, #40 now gives the user the ability to decide they want their modified files to be preserved indefinitely, which updates their composer.json file with persistent information about which files were modified. If this is not done, then they are at risk of losing their modifications, e.g. if they remove the vendor directory. It seems this is a fair trade-off.

The other question I have is that maybe /drupal/scaffolded.json is a better place for the transient hash file cache.

mradcliffe’s picture

Status: Needs review » Needs work

Some formatting nit picks.

  1. +++ b/composer/Plugin/Scaffold/Operations/ScaffoldFileCollection.php
    @@ -72,6 +72,41 @@ public function __construct(array $file_mappings, Interpolator $location_replace
    +    // If we were instructed to keep modified files, then filter them out
    +    // of our list of scaffold files to process.
    

    I think that "of our" would fit on the first line here.

  2. +++ b/composer/Plugin/Scaffold/ScaffoldHash.php
    @@ -0,0 +1,248 @@
    +      // TODO: Composer also offers 'v' (view) and 'd' (diff); we could add
    +      // 'view' as a follow-on task, but 'diff' would be difficult here.
    

    TODO should be @todo.

  3. +++ b/core/tests/Drupal/Tests/Composer/Plugin/Scaffold/Functional/ComposerHookTest.php
    @@ -116,6 +116,45 @@ public function testComposerHooks() {
    +    // 'composer drupal:scaffold' and confirm the modification is not overwritten
    ...
    +    // And if we run it yet again, then the file should not be modified any more.
    

    I guess these comments are over 80 characters (by 1) so I guess a second line?

    "Overwritten" should end in a full stop.

    Opinion: "Modify the same test scaffold file, run 'composer drupal:scaffold', and then confirm the modification is not overwritten." might be better if this is a list of sequential steps. Or be consistent with the comment that breaks the confirm bit into its own sentence?

    Opinion: Not having any space after each assertion and inline comment makes it a bit hard to read.

greg.1.anderson’s picture

Fixes for #42, plus avoid JSON_THROW_ON_ERROR since it is not available in all PHP versions we support, plus move scaffolded.json to vendor/drupal instead of putting it at the root of vendor.

Mile23’s picture

Status: Needs review » Reviewed & tested by the community

Here’s how I reviewed this. Some of these steps might screw up your dev environment:

$ git checkout 8.9.x
$ git reset —hard
$ git clean -fdx .
$ rm -rf vendor
$ composer install
$ wget https://www.drupal.org/files/issues/2019-11-08/3092563-43.patch
$ git apply 3092563-43.patch
Modify drupal/drupal to have a path repo for the scaffold plugin:
$ composer config repositories.scaffold path composer/Plugin/Scaffold
Edit compser.json to require drupal/core-composer-scaffold:self.version:
"drupal/core-composer-scaffold": "self.version"
Update the scaffold plugin so it’s installed:
$ composer update drupal/core-composer-scaffold
Should look like this:
- Installing drupal/core-composer-scaffold (8.9.x-dev): Symlinking from composer/Plugin/Scaffold
It should also do a bunch of scaffolding, but git status should then show you that this doesn’t represent a modification to any of the scaffold files it knows about.
Modify .htaccess:
$ echo "modified" >> .htaccess
Now try to scaffold again:

$ composer drupal:scaffold
The following managed scaffold files have been modified:
  - [web-root]/.htaccess

    Discard changes [y,n,k,a,?]? ? 
    y - discard changes and rewrite scaffold files
    n - keep modified files in their current state; ask again the next time Composer runs
    k - keep modified files and modify composer.json to avoid being asked again
    a - abort scaffold operation
    Discard changes [y,n,k,a,?]? 

K looked the most interesting to me, so I did that. It leads to this change to composer.json:

        "drupal-scaffold": {
            "file-mapping": {
                "[web-root]/.htaccess": false
            }
        }

Now we try to scaffold again:

$ composer drupal:scaffold
[…]
Scaffolding files for drupal/drupal:
  - Skip [web-root]/.htaccess: disabled

And here’s a test:

$ rm -rf vendor
$ composer install
[…]
Scaffolding files for drupal/drupal:
  - Skip [web-root]/.htaccess: disabled

Remove extra.drupal-scaffold.file-mapping from composer.json and try again:

$ rm -rf vendor
$ composer install
[…]
  - Copy [web-root]/.htaccess from assets/scaffold/files/htaccess

So remember kids, don’t change this config if you like your .htaccess file. :-)
Let’s try with other prompts:

The following managed scaffold files have been modified:
  - [web-root]/.htaccess

    Discard changes [y,n,k,a,?]? ? n
[..]
  - Preserve [web-root]/.htaccess: modified by user.

This is all excellent and nifty. My main suggestion would be that this help screen:

    y - discard changes and rewrite scaffold files
    n - keep modified files in their current state; ask again the next time Composer runs
    k - keep modified files and modify composer.json to avoid being asked again
    a - abort scaffold operation

Also say: “Learn how to scaffold your own files at http://some.docs/“ so that people could avoid this whole thing by scaffolding their bespoke .htaccess files.

But that's a bit of a nit, and generally we're in the RTBC mood here. :-)

alexpott’s picture

Status: Reviewed & tested by the community » Needs work
  1. +++ b/composer/Plugin/Scaffold/Operations/ScaffoldFileCollection.php
    @@ -72,6 +72,41 @@ public function __construct(array $file_mappings, Interpolator $location_replace
    +  /**
    +   * Filter out any item in the list whose destination path has been modified.
    +   *
    +   * @param array $modified
    +   *   List of destination paths
    +   */
    +  public function ignoreModified($hash_manager) {
    

    Needs type hinting and @param docs.

  2. +++ b/composer/Plugin/Scaffold/ScaffoldHash.php
    @@ -0,0 +1,248 @@
    +  /**
    +   * Cached list of modified files.
    +   */
    +  protected $modified = [];
    

    I'm concerned about have to maintain this list as it means in order to call decideHowToHandleModified you must have called checkModified first. Maybe we can turn all of this around by instead having a method ScaffoldHash that accepts a ScaffoldFileCollection and then add a method to that that allows a file to be removed.

  3. +++ b/composer/Plugin/Scaffold/ScaffoldHash.php
    @@ -0,0 +1,248 @@
    +  /**
    +   * ScaffoldHash constructor.
    +   *
    +   * @param string $dir
    +   *   The directory where the project is located.
    +   */
    +  public function __construct(Composer $composer, IOInterface $io, $dir) {
    

    @param docs missing.

  4. +++ b/composer/Plugin/Scaffold/ScaffoldHash.php
    @@ -0,0 +1,248 @@
    +  public function __construct(Composer $composer, IOInterface $io, $dir) {
    +    $this->composer = $composer;
    ...
    +    $vendor_dir = $this->composer->getConfig()->get('vendor-dir');
    

    We're trying this object to the entirety of a composer object when all we need from this is the vendor directory. That said we're also parsing and writing to the root composer.json - perhaps this object has tools to do that safely?

  5. +++ b/composer/Plugin/Scaffold/ScaffoldHash.php
    @@ -0,0 +1,248 @@
    +  /**
    +   * Load the cached hashes from disk.
    +   */
    +  public function load() {
    +    $hash_path = $this->pathToHashCache();
    +    if (file_exists($hash_path)) {
    +      $this->hash_cache = json_decode(file_get_contents($hash_path), TRUE);
    +    }
    +  }
    

    This object's functionality is limited and incorrect until this is called. Perhaps a better pattern here would be to have this as a public static method that returns a loaded instance and make the __construct private. That way we can guarantee this is loaded prior to be used.

  6. +++ b/composer/Plugin/Scaffold/ScaffoldHash.php
    @@ -0,0 +1,248 @@
    +  /**
    +   * Ask the user how to handle modified files.
    +   *
    +   * @param string[] $modified
    +   *   An array of destination paths that were modified
    +   *
    +   * @return string[]
    +   *   An array of destination paths that should not be modified.
    +   */
    +  public function decideHowToHandleModified() {
    

    @param doc is not needed.

  7. +++ b/composer/Plugin/Scaffold/ScaffoldHash.php
    @@ -0,0 +1,248 @@
    +    // Interactive? Ignore envirnment variables and always default to '?'
    

    Misspelt environment. And needs to end in a full stop.

  8. +++ b/composer/Plugin/Scaffold/ScaffoldHash.php
    @@ -0,0 +1,248 @@
    +  /**
    +   * Find the list of modified scaffold files.
    +   *
    +   * @param array $file_mappings
    +   *
    +   * @return string[]
    +   *   List of modified files
    +   */
    +  protected function findModifiedHashes(array $file_mappings, Interpolator $location_replacements) {
    

    Missing @param doc

  9. +++ b/composer/Plugin/Scaffold/ScaffoldHash.php
    @@ -0,0 +1,248 @@
    +  /**
    +   * Calculate the on-disk hash value of a scaffold file.
    +   *
    +   * @param ScaffoldFilePath $destination
    +   *   Path to the scaffold file, at its destination path.
    +   *
    +   * @return string
    +   *   Hash value.
    +   */
    +  protected function hashFile(ScaffoldFilePath $scaffold_file) {
    

    $destination vs $scaffold_file

greg.1.anderson’s picture

Status: Needs work » Needs review
FileSize
11.4 KB
1.37 MB

Here's the updated patch - explanation to come.

greg.1.anderson’s picture

#44: Added a reference to the documentation.

#45.1 & 3 & 6 & 8 & 9: Updated all of the docblock comments in ScaffoldHash class.

#45.2: Removed $this->modified, which is now tracked by a local variable.

#45.4: The Composer class does not have any facility to manage a composer.json file, but there is a JsonFile class that provides light validation on a composer.json file; we are now using that class to read and write the root composer.json file. This file validates per the composer.json schema, so we are not using it to read / write the hash cache.

#45.5: I usually prefer to avoid doing work like reading files in a constructor, but it was not convenient to call various protected methods from a static constructor, so I mode load part of the constructor so that it would not have to be public.

#45.7: Fixed comment.

greg.1.anderson’s picture

Whoops I accidentally rolled a bunch of unwanted stuff into the patch above, but need to go afk. Will reroll it later.

greg.1.anderson’s picture

FileSize
18.46 KB

Fixed patch; same as #46, except without unwanted junk. See #46 for interdiff.

alexpott’s picture

Status: Needs review » Needs work
  1. +++ b/composer/Plugin/Scaffold/Operations/ScaffoldFileCollection.php
    @@ -72,11 +72,32 @@ public function __construct(array $file_mappings, Interpolator $location_replace
    +  /**
    +   * Filter out any item in the list whose destination path has been modified.
    +   *
    +   * @param array $modified
    +   *   List of destination paths
    +   */
    +  public function removeModified($modified) {
    

    Let's call this method skipFile and pass in the reason. Because (a) we're not removing anything and (b) it makes it slightly more re-usable.

  2. +++ b/composer/Plugin/Scaffold/Operations/ScaffoldFileCollection.php
    @@ -72,11 +72,32 @@ public function __construct(array $file_mappings, Interpolator $location_replace
    -    return new \RecursiveArrayIterator($this->scaffoldFilesByProject, \RecursiveArrayIterator::CHILD_ARRAYS_ONLY);
    +    return new \ArrayIterator($this->scaffoldFilesByProject);
    

    Is this change related?

  3. +++ b/composer/Plugin/Scaffold/ScaffoldHash.php
    @@ -0,0 +1,249 @@
    +  private $vendor_dir;
    

    $vendorDir - and why private when we've used protected for $io and $dir?

  4. +++ b/composer/Plugin/Scaffold/ScaffoldHash.php
    @@ -0,0 +1,249 @@
    +  protected $hash_cache = [];
    

    $hashCache

  5. +++ b/composer/Plugin/Scaffold/ScaffoldHash.php
    @@ -0,0 +1,249 @@
    +  /**
    +   * ScaffoldHash constructor.
    +   *
    +   * @param IOInterface $io
    +   *   The input/output object.
    +   *
    +   * @param type $vendor_dir
    +   *   The location of the vendor directory.
    +   *
    +   * @param type $dir
    +   *   The directory where the project is located.
    +   */
    +  public function __construct(IOInterface $io, $vendor_dir, $dir) {
    +    $this->io = $io;
    +    $this->vendor_dir = $vendor_dir;
    +    $this->dir = $dir;
    +
    +    $hash_path = $this->pathToHashCache();
    +    if (file_exists($hash_path)) {
    +      $this->hash_cache = json_decode(file_get_contents($hash_path), TRUE);
    +    }
    +  }
    

    If you what to make it more explicit that you need load the hash manager before it is useful you can do something like:

    
      /**
       * ScaffoldHash constructor.
       *
       * @param IOInterface $io
       *   The input/output object.
       * @param string $vendor_dir
       *   The location of the vendor directory.
       * @param string $dir
       *   The directory where the project is located.
       */
      private function __construct(IOInterface $io, $vendor_dir, $dir) {
        $this->io = $io;
        $this->vendor_dir = $vendor_dir;
        $this->dir = $dir;
      }
    
      /**
       * ScaffoldHash constructor.
       *
       * @param IOInterface $io
       *   The input/output object.
       * @param string $vendor_dir
       *   The location of the vendor directory.
       * @param string $dir
       *   The directory where the project is located.
       * 
       * @return static
       *   A loaded hash manager.
       */
      public static function load(IOInterface $io, $vendor_dir, $dir) {
        $hash_manager = new static($io, $vendor_dir, $dir);
        $hash_path = $hash_manager->pathToHashCache();
        if (file_exists($hash_path)) {
          $hash_manager->hash_cache = json_decode(file_get_contents($hash_path), TRUE);
        }
        return $hash_manager;
      }
    

    And then in the calling code do: $hash_manager = ScaffoldHash::load()

  6. +++ b/composer/Plugin/Scaffold/ScaffoldHash.php
    @@ -0,0 +1,249 @@
    +  /**
    +   * Filter out any item in the list whose destination path has been modified.
    +   *
    +   * @param Drupal\Composer\Plugin\Scaffold\Operations\ScaffoldFileCollection $scaffold_files
    +   *   Collection of destination paths.
    +   */
    +  public function findModified(ScaffoldFileCollection $scaffold_files) {
    

    Let's call this processModified() and also fix up the docs. It doesn't filter out necessarily. It let's the user choose what happens when a modified scaffolded file is detected.

  7. +++ b/composer/Plugin/Scaffold/ScaffoldHash.php
    @@ -0,0 +1,249 @@
    +    foreach ($files as $scaffold_result) {
    +      if ($scaffold_result->isManaged()) {
    +        $hash = $this->hashFile($scaffold_result->destination());
    +        $relative = $scaffold_result->destination()->relativePath();
    +        $this->hash_cache[$relative] = $hash;
    +      }
    +    }
    +    $hash_path = $this->pathToHashCache();
    +    file_put_contents($hash_path, json_encode($this->hash_cache));
    

    If a file has become unmanaged as part of the composer changes I think we should end up removing it from composer.json here - no? I.e.. should we set $this->hash_cache = [] before we process $files?

  8. +++ b/composer/Plugin/Scaffold/ScaffoldHash.php
    @@ -0,0 +1,249 @@
    +use Composer\Composer;
    ...
    +use Composer\Util\Filesystem;
    +use Drupal\Composer\Plugin\Scaffold\Interpolator;
    

    Not used.

  9. +++ b/composer/Plugin/Scaffold/ScaffoldHash.php
    @@ -0,0 +1,249 @@
    +   * @param IOInterface $io
    +   *   The input/output object.
    +   *
    +   * @param type $vendor_dir
    +   *   The location of the vendor directory.
    +   *
    +   * @param type $dir
    +   *   The directory where the project is located.
    

    There shouldn't be gaps in between @param declarations and "type" is not a valid type.

  10. +++ b/composer/Plugin/Scaffold/ScaffoldHash.php
    @@ -0,0 +1,249 @@
    +   * @param Drupal\Composer\Plugin\Scaffold\Operations\ScaffoldFileCollection $scaffold_files
    

    Needs a leading \

  11. +++ b/composer/Plugin/Scaffold/ScaffoldHash.php
    @@ -0,0 +1,249 @@
    +        if ((!$scaffold_file->op() instanceof SkipOp) && !in_array($destination_rel_path, $checked)) {
    

    This must be untested or not needed because SkipOp is not imported.

  12. +++ b/composer/Plugin/Scaffold/ScaffoldHash.php
    @@ -0,0 +1,249 @@
    +    // Remove the modfied files if so instructed by the user.
    

    Typo in modified

  13. +++ b/composer/Plugin/Scaffold/ScaffoldHash.php
    @@ -0,0 +1,249 @@
    +   * @param Drupal\Composer\Plugin\Scaffold\ScaffoldFilePath $scaffold_file
    

    Needs a leading \

  14. +++ b/composer/Plugin/Scaffold/ScaffoldHash.php
    @@ -0,0 +1,249 @@
    +   * @param Drupal\Composer\Plugin\Scaffold\ScaffoldFilePath $scaffold_file
    +   *   One scaffold file that is about to be placed.
    +   */
    +  protected function checkModified(ScaffoldFilePath $scaffold_file) {
    

    Missing @return

  15. +++ b/composer/Plugin/Scaffold/ScaffoldHash.php
    @@ -0,0 +1,249 @@
    +   * @param Drupal\Composer\Plugin\Scaffold\ScaffoldFilePath $scaffold_file
    

    Needs a leading \

  16. +++ b/composer/Plugin/Scaffold/ScaffoldHash.php
    @@ -0,0 +1,249 @@
    +   * @return string
    +   *   Hash value.
    ...
    +    if (!file_exists($path)) {
    +      return 0;
    +    }
    

    This is not returning a string. Should we have the !file_exists() check here?

  17. +++ b/composer/Plugin/Scaffold/ScaffoldHash.php
    @@ -0,0 +1,249 @@
    +    return md5_file($path);
    

    Let's change this to sha1_file() just because there are automated scanners that detect md5 usage and this would be a run-time-ish usage.

greg.1.anderson’s picture

Status: Needs work » Needs review
FileSize
9.28 KB
19.25 KB

2. The existing iterator method was unused. I needed it for this refactor, but the recursive array iterator was not behaving as expected. I switched to using an array iterator instead for simplicity.

5. Argh, I could have swore that I wrote it exactly like that the first time, but I had some problem with it. It must have just been a typo, because it worked fine when I put your example back in.

7. We cannot simply erase the hash cache. If a user chooses to keep a modified file, but NOT override it in the file-mappings of the root composer.json file, then file will be handled as if it was skipped. Skipped files count as unmanaged files, so the skipped file would then not be written back to the hash cache, which means that it would be overwritten on the next run of composer scaffold.

11. Not needed.

16. Reworked it to return empty rather than zero when the file no longer exists. It is important to have the 'file_exists' check here. By the current implementation, if you DELETE rather than MODIFY a scaffold file, it does not count as a modification. Deleted scaffold files are always replaced on the next scaffold operation, as wanting to replace deleted files is a common scaffold operation, and there is no data loss in replacing a deleted file. If you want to get rid of a scaffold file completely, you need to override it in your composer.json file.

17. md5 is not a security problem in this context, of course, but replaced this with sha1 anyway to avoid undo attention.

Other items adjusted as suggested.

greg.1.anderson’s picture

alexpott’s picture

Do we have test coverage of

7. We cannot simply erase the hash cache. If a user chooses to keep a modified file, but NOT override it in the file-mappings of the root composer.json file, then file will be handled as if it was skipped. Skipped files count as unmanaged files, so the skipped file would then not be written back to the hash cache, which means that it would be overwritten on the next run of composer scaffold.

That would seem important.

greg.1.anderson’s picture

Yes, 7 is fully tested. Comments from the test:

// Delete one scaffold file, just for test purposes, then run
// 'composer update' and see if the scaffold file is replaced.

// Delete the same test scaffold file again, then run
// 'composer drupal:scaffold' and see if the scaffold file is
// re-scaffolded.

// Delete the same test scaffold file yet again, then run
// 'composer install' and see if the scaffold file is re-scaffolded.

// Modify the same test scaffold file, then run 'composer drupal:scaffold'
// and confirm the modification is not overwritten.

// Run 'composer drupal:scaffold' twice in a row to make sure that the
// hash file of the modified file is retained, such that drupal:scaffold
// still recognizes that the modified file is modified.

// Finally, run 'composer drupal:scaffold' yet a third time in a row, this
// time with an environment variable set that tells drupal:scaffold to act
// as if the user had requested that modified files be overwritten.

// And if we run it yet again, then the file should not be modified.

// Modify the test scaffold file even again, then run
// 'composer drupal:scaffold' with DRUPAL_SCAFFOLD_KEEP_MODIFIED. Confirm
// that the file is now excluded in the composer.json file's file-mapping.

The comment that directly relates to your question is "Run 'composer drupal:scaffold' twice in a row to make sure that the hash file of the modified file is retained, such that drupal:scaffold still recognizes that the modified file is modified."

greg.1.anderson’s picture

+++ b/core/tests/Drupal/Tests/Composer/Plugin/Scaffold/Functional/ComposerHookTest.php
@@ -110,12 +115,57 @@ public function testComposerHooks() {
     @unlink($sut . '/sites/default/default.settings.php');

Here is where we delete a scaffold file in the tests. This test existed prior to this patch.

greg.1.anderson’s picture

Here's another full set of scaffold functional tests where we delete and modify two files in the same scaffold run to cover more permutations.

greg.1.anderson’s picture

alexpott’s picture

Status: Needs review » Needs work
  1. I think the logic here looks good. We're putting the developer in control and informing them about what's happening so they can make informed choices.
  2. +++ b/composer/Plugin/Scaffold/ScaffoldHash.php
    @@ -0,0 +1,277 @@
    + * Keep a record of scaffold files and their hash values to detect changes.
    

    Keeps... there's a documentation rule - https://www.drupal.org/docs/develop/standards/api-documentation-and-comm... - that class docs and method docs must start with " a third person singular present tense verb".

  3. +++ b/composer/Plugin/Scaffold/ScaffoldHash.php
    @@ -0,0 +1,277 @@
    +class ScaffoldHash {
    

    Is this the best name? It's not a hash - it's a` collection of hashes. We assign this to a var called $hash_manager - maybe we should use that to name the class.

  4. +++ b/composer/Plugin/Scaffold/ScaffoldHash.php
    @@ -0,0 +1,277 @@
    +  /**
    +   * ScaffoldHash constructor.
    +   *
    +   * @param \Composer\IO\IOInterface $io
    +   *   The input/output object.
    +   * @param string $vendor_dir
    +   *   The location of the vendor directory.
    +   * @param string $dir
    +   *   The directory where the project is located.
    +   */
    +  private function __construct(IOInterface $io, $vendor_dir, $dir) {
    

    Let's document that people should use ::create() instead.

  5. +++ b/composer/Plugin/Scaffold/ScaffoldHash.php
    @@ -0,0 +1,277 @@
    +  /**
    +   * ScaffoldHash constructor.
    

    Creates a...

  6. +++ b/composer/Plugin/Scaffold/ScaffoldHash.php
    @@ -0,0 +1,277 @@
    +  /**
    +   * Process the cached hashes to determine which files have been modified.
    +   *
    +   * The user is prompted about how they wish to handle any modified files,
    +   * and if they want to preserve them, they are skipped in the scaffold file
    +   * collection.
    +   *
    +   * @param \Drupal\Composer\Plugin\Scaffold\Operations\ScaffoldFileCollection $scaffold_files
    +   *   Collection of destination paths.
    +   */
    

    I think we mention in the docs that we manipulating $scaffold files here. Perhaps also returning it would make it more explicit.

    Also should start with "Processes"

  7. +++ b/composer/Plugin/Scaffold/ScaffoldHash.php
    @@ -0,0 +1,277 @@
    +   * Check to see if a single scaffold file has been modified on disk.
    

    Checks

  8. +++ b/composer/Plugin/Scaffold/ScaffoldHash.php
    @@ -0,0 +1,277 @@
    +   * Ask the user how to handle modified files.
    

    Asks

  9. +++ b/composer/Plugin/Scaffold/ScaffoldHash.php
    @@ -0,0 +1,277 @@
    +   * Determine what our default action should be for prompt.
    

    Determines

  10. +++ b/composer/Plugin/Scaffold/ScaffoldHash.php
    @@ -0,0 +1,277 @@
    +   * Modify composer.json to preserve modified files on subsequent runs.
    

    Modifies

  11. +++ b/composer/Plugin/Scaffold/ScaffoldHash.php
    @@ -0,0 +1,277 @@
    +   * Given a list of scaffold results, produce a record of hash values.
    

    This needs to be rewritten to start with "Stores..."

  12. +++ b/composer/Plugin/Scaffold/ScaffoldHash.php
    @@ -0,0 +1,277 @@
    +   * Returns path to the file where the list of hashes is stored.
    

    Gets

  13. +++ b/composer/Plugin/Scaffold/ScaffoldHash.php
    @@ -0,0 +1,277 @@
    +   * Calculate the on-disk hash value of a scaffold file.
    

    Calculates

  14. +++ b/core/tests/Drupal/Tests/Composer/Plugin/Scaffold/Functional/ComposerHookTest.php
    @@ -76,46 +68,168 @@ protected function tearDown() {
    +   * Test to see if deleted / modified files are handled correctly.
    

    Tests

  15. +++ b/core/tests/Drupal/Tests/Composer/Plugin/Scaffold/Functional/ComposerHookTest.php
    @@ -76,46 +68,168 @@ protected function tearDown() {
    +   * Test deleting and modifying two different scaffold files at the same time.
    

    Tests

  16. +++ b/core/tests/Drupal/Tests/Composer/Plugin/Scaffold/Functional/ComposerHookTest.php
    @@ -76,46 +68,168 @@ protected function tearDown() {
    +   * Test to see if create-project scaffolds correctly.
    

    Tests

greg.1.anderson’s picture

Updated per #58.

Mile23’s picture

Status: Needs review » Needs work

The patch in #59 takes care of everything from #58. Just a few points to work on:

  1. +++ b/composer/Plugin/Scaffold/HashManager.php
    @@ -0,0 +1,283 @@
    +   * Constructs a HashManager; use HashManager::create instead.
    

    Let's avoid semicolons for the sake of non-English-speakers; make it two sentences, please.

  2. +++ b/composer/Plugin/Scaffold/HashManager.php
    @@ -0,0 +1,283 @@
    +   * @return string[]
    +   *   List of modified files to preserve; empty if there is nothing to keep.
    

    I don't actually think we need to return this value, but this patch did successfully make that change per #58.6.

    Also: Avoid semicolons, please.

shubham.prakash’s picture

Assigned: Unassigned » shubham.prakash
greg.1.anderson’s picture

Fixed grammar in comments per #60.

greg.1.anderson’s picture

Fumbled my patches again. See #62 for the interdiff.

shubham.prakash’s picture

Wrapped the commented line.

greg.1.anderson’s picture

#64: "keep." fits in 80 columns. Are we supposed to stick to 79?

alexpott’s picture

#63 was correct. It's just dreditor making it look wrong.

I feel I haven't explained #58.6 well enough.

The point of \Drupal\Composer\Plugin\Scaffold\HashManager::processModified is to modified the input object ScaffoldFileCollection $scaffold_files. This works because all objects are passed by reference. I think the return should be return $scaffold_files as that's what we want to use or it should return NULL. But it is really important that we document that this method relies on side-effects of calling \Drupal\Composer\Plugin\Scaffold\Operations\ScaffoldFileCollection::skipFiles()

greg.1.anderson’s picture

#66: Could we simply document the side-effect and not return the object? It might actually send a false signal to the caller if they saw the ScaffoldFileCollection being returned, and not realize that there was a side-effect.

If you feel strongly about the side-effect, then we could clone the collection and return the modified collection.

Mile23’s picture

+++ b/composer/Plugin/Scaffold/Handler.php
@@ -161,6 +161,14 @@ public function scaffold() {
+    // Filter out the collected items that were modified (if user decided that
+    // it was okay to overwrite, then this list will be empty).
+    $hash_manager->processModified($scaffold_files);

If I'm reading #66 correctly...

Maybe this bit of code could look like this:

$modified = $hash_manager-> decideHowToHandleModified(
  $hash_manager->processModified($scaffold_files)
);
if ($modified) {
  // Remove the modified files if so instructed by the user.
  $message = '  - Preserve <info>[dest-rel-path]</info>: modified by user.';
  $scaffold_files->skipFiles($modified, $message);
}

This way the user interaction is not a side-effect of processModified().

alexpott’s picture

#68 looks nice. @Mile23++

Mixologic’s picture

Im guessing you mean #68?

alexpott’s picture

Indeed I do! oops. Thanks @Mixologic - I'll update the comment...

greg.1.anderson’s picture

Here's a patch that is essentially #68, with some slight touch-ups.

greg.1.anderson’s picture

I added a change record for this issue. At the moment, I have marked this CR as being part of the 8.9.x release, although we hope that this will ultimately be backported to 8.8.x as well.

I don't think a release note is necessary here, but if anyone feels one would be desirable, I'll prepare a summary.

Mile23’s picture

Status: Needs review » Needs work

Revisiting my review in #44 I did all the same steps.

When I modified .htaccess and then chose K to store the change, I got this in my composer.json file:

             ]
+        },
+        "drupal-scaffold": {
+            "file-mapping": {
+                "[web-root]/.htaccess": false
+            }
         }

So far so good.

However, running composer drupal:scaffold again, it asks me what to do with this file:

$ composer drupal:scaffold
The following managed scaffold files have been modified:
  - [web-root]/.htaccess

    Discard changes [y,n,k,a,?]? 

So I think we had a regression since #44 on this behavior.

greg.1.anderson’s picture

Status: Needs work » Needs review
FileSize
1.13 KB
26.39 KB

Argh, I thought we had all of the permutations tested, but there was one missing. @alexpott even called it in #50.11. Here is a failing test.

greg.1.anderson’s picture

The last submitted patch, 75: 3092563-75.patch, failed testing. View results

Mile23’s picture

Status: Needs review » Reviewed & tested by the community

Excellent. :-) I can re-perform the manual test in #44 and the plugin remembers my exclusions.

We addressed the naming things and side-effect issues from #66, and the new test inspired by #50 proves that adding modified files to composer.json extra config works properly.

LGTM.

alexpott’s picture

Status: Reviewed & tested by the community » Needs review

Nice work everyone! I was going to commit this because I think it is important but whilst having a final think I believe there are some extra scenarios that we should at least test and/or think about. One of things people often do is to rm -rf vendor && composer install - doing this is going to lose the hash information. What do we think should happen in this situation? #44 hints that this has been tested at least manually but are we really sure that blowing away the hash information is a good thing?

mradcliffe’s picture

+1 on the Change Record, @greg.1.anderson.

I think it might help to add a note about security that site maintainers who override these should review their modified scaffolded files. Or maybe to direct towards documentation page Using Drupal's Composer Scaffold?

greg.1.anderson’s picture

In #40 I suggested that the hash cache might be written to a dot file at the project root; however, locations outside of vendor are at risk of being accidentally committed to the repository. This doesn't really seem to be what we want, although technically speaking it would probably work out okay, as the hashes stored in the hash cache contain the values for the contents that the scaffold plugin writes, so they should be valid if pulled on a remote system.

If the scaffold hash cache is removed (e.g. via rm -rf vendor, with the cache stored in its current location), then all files that are modified that were never [k]ept will be overwritten on the next composer install. Our expectation is that the user should use [k]eep if they want to preserve their modifications.

LMK which way you think it should go, and then I will update the code and/or documentation.

alexpott’s picture

at risk of being accidentally committed to the repository.

Would this be so bad? I'm not sure it would be. I think I prefer putting this outside vendor but other opinions would be great. It is kinda operating like composer.lock.

mradcliffe’s picture

I think the risk is low if scaffolded.json was prefixed with a dot as that would be ignored be the web server in case Drupal is not installed with vendor/webroot separation. It might be more annoying to have hash changes in commits, but it also might be a good thing because it would require someone to review that file in code reviews.

We could also add it to example.gitignore?

greg.1.anderson’s picture

The thing about PROJECT_ROOT/.scaffolded.json is that this placement promotes this file to something that every Drupal site will notice and must manage. I think that's fine for a primary mechanism, e.g. if this were the way we were telling users to manage their modifications to scaffold files. As it is, though, it is not, and it seems to me that this file would just be an encouragement for users to do it wrong.

If this is really what we need to avoid users running drush sqlq 'FLIP table USERS;', then we can do this, but I have reservations.

Suggested mitigations if we go this route:

  • Put the file in example.gitignore per #83
  • Add a warning message any time a user selects "no" from the prompt
  • Put the hash values in a "hash" element, and add a "_README" element advising the user to manage their scaffold file properly
greg.1.anderson’s picture

Issue summary: View changes

Update issue summary.

Mile23’s picture

If the scaffold hash cache is removed (e.g. via rm -rf vendor, with the cache stored in its current location), then all files that are modified that were never [k]ept will be overwritten on the next composer install. Our expectation is that the user should use [k]eep if they want to preserve their modifications.

I think we should not have an N option. Either scaffold keeps the files, or they get overwritten. The hash only tells us whether we need to prompt the user to K.

The other option is that N is similar to K in that it keeps an extra for that file, and the extra is the modified hash that the user wants. So for instance:

        "drupal-scaffold": {
            "file-mapping": {
                "[web-root]/.htaccess": "ignore_for_this_hash:234123421342134"
            }
        }

Now you know that if .htaccess matches that hash, it should be ignored during scaffolding. And it's kept in composer.json so it won't be deleted if you rm -rf vendor.

The hashes under vendor/ only allow you to keep track of whether a file changed since last time. The only vulnerable time is if the user deletes vendor/, because you don't know if they changed the scaffold since then. However, if they changed one that was previously Nd, we can ask about it because we still have the hash. If they changed one that was Kd, then it gets ignored like before.

greg.1.anderson’s picture

I do not agree with removing the N option. If we run composer install --no-interaction, it has to do something. Modifying the root composer.json file in this instance would not be correct. N is the correct default behavior.

I suppose that we could hide the N option from the prompt and the help text, but it needs to be there for the non-interactive case.

I am still thinking about what I think about "ignore for this hash". We want to prompt the user again if they edit the file again? That doesn't seem necessary. What we want to do is prompt the user again if the file changes in the upstream; however, we do not have the facility to do that yet.

greg.1.anderson’s picture

Here's a patch that implements the suggested resolution in the issue summary. This is not meant to close the door on #86 and other discussion, this is just an illustration in code of one possible solution.

Status: Needs review » Needs work

The last submitted patch, 88: 3092563-88.patch, failed testing. View results
- codesniffer_fixes.patch Interdiff of automated coding standards fixes only.

alexpott’s picture

  1. +++ b/composer/Plugin/Scaffold/HashManager.php
    @@ -250,7 +261,29 @@ public function storeResultsHash(array $files) {
    +  /**
    +   * Returns the _README contents to be stored in the hash cache file.
    +   *
    +   * @return string
    +   *   Comment to be written into the hash cache.
    +   */
    +  protected function getHashCacheReadme() {
    +    $docs_url = self::SCAFFOLD_DOCUMENTATION;
    +    return <<< __EOT__
    +This file contains a list of hashes of the contents of all scaffold files
    +managed on this site. It is recommended that this file be .gitignore'd, although
    +it is not harmful to commit it.
    +
    +For details on how to correctly manage customizations to a site's scaffold
    +files, see $docs_url.
    +__EOT__;
       }
    

    I think this is a great idea. Putting docs in the file saying what it is is great.

  2. +++ b/composer/Plugin/Scaffold/README.md
    @@ -357,6 +357,22 @@ assets from being altered.
    +
    +### transient-cache
    +
    +The `transient-cache` property selects whether the internally-managed scaffold
    +file hash cache is stored in a transient location or not. The hash cache keeps
    +track of scaffold files which have been locally edited. This is *not* a
    +recommended workflow; sites should use the "append" feature or
    +cweagans/composer-patches, as described below, to alter scaffold files. Local
    +modifications are tracked in an attempt to avoid data loss.
    +
    +By default, the hash cache is stored at the project root. Set `transient-cache`
    +to "true" to instead store it inside the vendor directory. It is more
    +convenient to use the transient location if scaffold files typically are not
    +locally modified (e.g. when using the recommended workflows described in
    +"Manageing Scaffold Files", below).
    

    I'm not entirely sure why anyone would ever set this to true.

  3. On more reflection about #79 - I think storing stuff that can not be rebuilt by running composer install in vendor is breaking a key expectation about how composer works and if someone discovered this be accidentally deleting the transient cache on a project that they'd just taken over they'd be really surprised and a bit annoyed.
mradcliffe’s picture

+++ b/composer/Plugin/Scaffold/README.md
@@ -357,6 +357,22 @@ assets from being altered.
+"Manageing Scaffold Files", below).

Spelling nitpick. I didn't have any chance to review more and saw this in @alexpott's review in #90 in my email.

"Manageing" should be "Managing"

greg.1.anderson’s picture

I'm not entirely sure why anyone would ever set this to true.

Any site that's managing their scaffold files per the recommended methods does not need the hash cache. These users can use this feature to get the file out of the way. They can also just .gitignore it, so I care 2/10 about keeping this feature. Putting it in gives us a route to making the hash cache transient by default e.g. in a profile or in an ISP template, or in Drupal 9 or Drupal 10.

On more reflection about #79 - I think storing stuff that can not be rebuilt by running composer install in vendor is breaking a key expectation about how composer works and if someone discovered this be accidentally deleting the transient cache on a project that they'd just taken over they'd be really surprised and a bit annoyed.

The transient cache does not contain any data that the user needs. It only identifies files that have been modified out-of-band. This is not how scaffolding is supposed to work; you are supposed to append or patch your scaffold files. If someone does modify a scaffold file out-of-band, then they should commit it to git, and must commit it to git if they're going to hand it off to someone else.

I just feel like putting the .scaffolded.json file right in the user's face is going to make people think this is the primary way that you're supposed to customize scaffold files.

[Edited down to sound less panic-y.]

Could you give clarification on the desired changes for #90.2 and #90.3? At this point I'm really okay with any outcome; I gave my opinion, now I just want to wrap things up.

alexpott’s picture

My gut says let's not and the transient stuff now - we can always add that as an addition if we want to later.

greg.1.anderson’s picture

Here's an update without the transient switch. n.b. #88 failed due to a missing test fixture, but that is no longer needed here.

greg.1.anderson’s picture

Status: Needs work » Needs review
greg.1.anderson’s picture

Issue summary: View changes
greg.1.anderson’s picture

Oh, one more thing I forgot; we can add .scaffolded.json to the example.gitignore file.

greg.1.anderson’s picture

On slack, @mlhess suggested:

I think the default should be to replace the htaccess file unless someone has set a config change someplace. I am not sure if it is possible , but we could put a comment at the top of the .htaccess file that says something like:

#DO_NOT_OVERWRITE_ON_UPDATES=false
Warning, setting the above may have security implications, please read drupal.org/page

Then have composer read the .htaccess file and parse the comment or something

We can certainly read modified scaffold files and check to see if the first line contains a pattern, e.g. we could ignore the modified status of a file if it does not contain "DO_NOT_OVERWRITE=true". The difficulty comes with the "on updates" part. Doing that would require #24, though, which was considered too complex to get into for 8.8.0 in a different slack thread.

We need to decide how important this behavior is, and whether there's a happy compromise we can make now, and add the "on update" stuff in 8.9.x. We really hope that people will learn how to do it "the Composer Scaffold way" and we won't have many instances of folks locally modifying their scaffold files. Ideally this on-modified handling is just fallback behavior.

greg.1.anderson’s picture

In slack @mlhess suggested that perhaps we should do nothing in 8.8.x (replace all files, per default), and do something more complex for 8.9.x.

A number of folks have said that "on update" is an important part of the behavior. Perhaps we should reconsider doing #24, commit it to 8.9.x, and then decide whether the backport to 8.8.x is reasonable, depending on the complexity of the implementation and how long it takes to get it into 8.9.x.

greg.1.anderson’s picture

Here is a new version that implements something along the lines of #24; however, this version is a little simpler than what was originally proposed.

This patch does not change behavior at all; it merely allows the scaffold operations to calculate their contents in advance, prior to writing them to the destination path. This will allow us to calculate a hash without changing the filesystem, which will allow us to detect when a source file changes after being scaffolded.

greg.1.anderson’s picture

Boom.

This patch notices when the source scaffold file changes after it has been scaffolded to disk. In these instances, if the file still exists on disk, we will skip scaffolding. This means that any modification made to scaffolded files will only be in danger of being overwritten when the source file changes.

n.b. "source file changed" is only tracked only in the ".scaffolded.json" file; if this file is removed, then this protection is lost.

This patch does not change the way we handle modifications when they are about to be overwritten. Some improvements should now be possible in this area.

greg.1.anderson’s picture

Now that modifying a scaffolded file is a no-op unless the source scaffold file has been updated, it is possible for us to take the advice in #86. This patch makes 'n' act like 'k' used to: it writes the list of modified files into the root-level composer.json.

Since it is now unlikely that non-interactive mode (e.g. in CI) will encounter the modified files prompt, the default option for non-interactive mode has been changed to "abort".

The new behavior is simpler than before, so some of the test permutations are no longer necessary.

greg.1.anderson’s picture

This gets us a lot closer to a desirable behavior, but we're not done yet. The feedback from the security team was that getting the changes from an updated .htaccess file is more important than avoiding data loss to modifications in the same file when updating a site. Currently, if the user chooses to override .htaccess in their top-level composer.json file (i.e. set it to false), then it will no longer participate in scaffolding at all. This means that if someone has overridden the .htaccess file, they will no longer be notified, warned or prompted if a newer version of the file becomes available.

We now have the information we need to make this behave the way we want, though. The question is, how should the system behave in interactive and non-interactve mode when a scaffold file that has been modified locally is also update in the source project.

greg.1.anderson’s picture

Workflow example, as currently implemented:

  • User installs Drupal and runs composer install: .htaccess copied and .scaffolded.json created
  • User modified .htaccess
  • User runs composer install (repeatedly): no change of state (.htaccess still modified)
  • User updates Drupal Core, and .htaccess is unchanged: no change of state
  • User updates Drupal Core which contains a change to .htaccess: User is prompted "discard changes?"
  • User answers "no" to prompt: .htaccess is added to list of "overridden" scaffold files in composer.json
  • User updates Drupal Core Yet Again with Yet Another change to .htaccess: no prompt is presented

This probably does not meet the level of safety desired by the security team, so we should keep going. Note that users may also manually add scaffold files to the override list in their root composer.json file. This is supposed to mean "stop paying attention to this file, I am managing it." This is why the scaffold tool does not presently prompt any longer once the file is overridden.

  • Is it sufficient to just print a big warning if an overridden file has changed since it was updated?
  • Should we prompt every time there is a change (overridden just means "modified")?

We also need to decide what to do when .scaffolded.json is lost. Right now this causes the scaffold tool to assume that .htaccess (or whatever) was not modified. I'm not sure there is a way we can tell in this situation whether the .htaccess file was modified by the user, or whether it is simply an older version of the file from a previous version of Drupal. @mlhess suggested that maybe there could be a comment on the first line of .htaccess that would dictate behavior, e.g. always overwrite .htaccess unless the user added a well-known "I accept the risk" sort of string. If we did this, then we could perhaps choose which files were thus protected, and follow default rules for other scaffold files.

nikathone’s picture

Is .scaffolded.json created at the project or Drupal/web root folder? If the former, may I suggest that file created at the Drupal/web folder since most of the scaffolded files are under this folder?

greg.1.anderson’s picture

Issue summary: View changes

Update issue summary to describe current behavior of patch.

greg.1.anderson’s picture

Issue summary: View changes
greg.1.anderson’s picture

Issue summary: View changes

More issue summary refinement.

alexpott’s picture

Discussed the remaining challenge from #103 a bit with @greg.1.anderson

I think this is a tricky call to make. If a user follows the instructions currently in the root .htaccess file and changes the line to always forward to https - which is a reasonable thing to do from a security perspective. Then in order to comply with best practice they need to use cweagans/composer-patches to apply their change as a patch to the root .htaccess in a post scaffold hook. Or with this patch they can choose to ignore upstream changes to the scaffold files (note that one answer applies to all files been scaffolded). And then we want some way to warn / inform that their choice here might get them into a security issue because they won't get upstream changes.

I wonder if instead of all of this complexity we try a whole different process.
1. On composer install only scaffold files that are not there.
2. On composer update run scaffold all the files and overwrite the existing files. BUT we take care to copy the existing files if they are different to a .scaffold directory so the user can recover any changes. And we warn users.
3. If users run the scaffold command manually we do the same as for composer update.

This brings us inline with the behaviour of drush pm-updatecode and I think will match the behaviour users expect. I.e. they only have to deal with scaffold file updates when the thing that is scaffolding them is updated.

greg.1.anderson’s picture

#109 sounds like a good strategy. I don't know that will be ready for 8.8.0-rc1 this week. Should we just focus on working on what we think is the closest approximation to the right thing, and potentially do nothing at all for 8.8.0, or is there anything from the most recent patch that we might want to try to capture for the upcoming release?

greg.1.anderson’s picture

greg.1.anderson’s picture

Some of the work from refactoring done in this issue would be helpful to some other work that is currently in progress. I split out #3103090: Avoid re-scaffolding unchanged files (and printing scaffold file information over and over) to capture that work and hopefully get it committed to unblock the other critical, #3091285: Composer scaffolding fails when permissions on default.settings.yml or default.settings.php is not writable..

greg.1.anderson’s picture

I do have some concerns about #109. Creating backup files inside the project will often create files the user does not want (since we cannot tell the difference between user-modified files, and scaffold files whose source changed from being updated). If the user relies on the backup files being created, there is a danger that two subsequent updates might erase a file the user needs.

greg.1.anderson’s picture

Postponing on #3103090: Avoid re-scaffolding unchanged files (and printing scaffold file information over and over). We can decide how much of #109 we're going to do once that refactoring is in.

greg.1.anderson’s picture

Status: Needs review » Postponed
greg.1.anderson’s picture

Status: Postponed » Needs work

#3103090: Avoid re-scaffolding unchanged files (and printing scaffold file information over and over) has been committed to 8.9.x and 9.0.x, so we could re-roll and continue here now.

ravi.shankar’s picture

Status: Needs work » Needs review
FileSize
45.71 KB

Here I have added a re-roll of patch #102.

greg.1.anderson’s picture

Status: Needs review » Needs work

Thanks for the re-roll; this will help with the continuation of the work here.

Regarding the continuation, #102 does not address #103; the other concern we have is that under #102, is that the behavior will be adversely affected. Maybe this is unavoidable; fallback behavior should favor upgrading the file over preserving user changes.

sic’s picture

i wondered why my htaccess is reset to default again because i removed it from the update before copying the files. now i see this.

drupal 8 never seizes to amaze me...

Version: 8.9.x-dev » 9.1.x-dev

Drupal 8.9.0-beta1 was released on March 20, 2020. 8.9.x is the final, long-term support (LTS) minor release of Drupal 8, which means new developments and disruptive changes should now be targeted against the 9.1.x-dev branch. For more information see the Drupal 8 and 9 minor version schedule and the Allowed changes during the Drupal 8 and 9 release cycles.

thursday_bw’s picture

I've just had a read through all this as my introduction.

I can't help wondering if the simple solution of adding more documentation into .htaccess itself solves 99% of use cases.
Simply adding comments to .htaccess that says "the recommended way to modify this is by managing your scaffold, see the docs here".
And update other relevant docs that refer to modifying .htaccess to use different language, such "modifying the .htaccess scaffold".

It's the same kind of thing you see in `/etc/resolv.conf`

cat /etc/resolv.conf 
# This file is managed by man:systemd-resolved(8). Do not edit.
#
# This is a dynamic resolv.conf file for connecting local clients to the
# internal DNS stub resolver of systemd-resolved. This file lists all
# configured search domains.
#
# Run "resolvectl status" to see details about the uplink DNS servers
# currently in use.
#
# Third party programs must not access this file directly, but only through the
# symlink at /etc/resolv.conf. To manage man:resolv.conf(5) in a different way,
# replace this symlink by a static file or a different symlink.
#
# See man:systemd-resolved.service(8) for details about the supported modes of
# operation for /etc/resolv.conf.

I must say that I love the suggested implementations here though. Having the feedback for the user of composer right there in compser with
clearly stated problems and choices is fantastic.

thursday_bw’s picture

I've come at this issue fairly cold so I've easily noticed a few places where code readibiltiy could be improved.

  1. +++ b/composer/Plugin/Scaffold/Operations/ScaffoldFileCollection.php
    @@ -72,11 +72,52 @@ public function __construct(array $file_mappings, Interpolator $location_replace
    +  /**
    +   * Filter out any item in the list whose destination.
    +   *
    +   * @param array $files_to_filter
    +   *   List of destination paths.
    +   */
    +  public function filterFiles(array $files_to_filter) {
    +    foreach ($this->scaffoldFilesByProject as $project_name => $scaffold_files) {
    +      foreach ($scaffold_files as $destination_rel_path => $scaffold_file) {
    +        if (in_array($destination_rel_path, $files_to_filter)) {
    +          unset($this->scaffoldFilesByProject[$project_name][$destination_rel_path]);
    +        }
    +      }
    +      if (empty($this->scaffoldFilesByProject[$project_name])) {
    +        unset($this->scaffoldFilesByProject[$project_name]);
    +      }
    +    }
    +  }
    

    The word item is used in the description of filterFiles and files in the function name and it's parameter.

    The function should be called something like filterFilesByDestination to make its function more explicit.

    Nested foreach loops with multiple if statements are difficult to read on first glance. Prime candidate for
    extracting to other private methods.

    With an incomplete description, ambiguous naming, It's difficult to know what this method does.

  2. +++ b/composer/Plugin/Scaffold/Operations/ScaffoldFileCollection.php
    @@ -72,11 +72,52 @@ public function __construct(array $file_mappings, Interpolator $location_replace
    +  /**
    +   * Skip any item in the list whose destination path has been modified.
    +   *
    +   * @param array $files_to_skip
    +   *   List of destination paths.
    +   * @param string $message
    +   *   Explanation of why files were skipped.
    +   */
    +  public function skipFiles(array $files_to_skip, $message) {
    +    $op = new SkipOp($message);
    +
    +    // If we were instructed to keep modified files, then filter them out of our
    +    // list of scaffold files to process.
    +    foreach ($this->scaffoldFilesByProject as $project_name => $scaffold_files) {
    +      foreach ($scaffold_files as $destination_rel_path => $scaffold_file) {
    +        if (in_array($destination_rel_path, $files_to_skip)) {
    +          $this->scaffoldFilesByProject[$project_name][$destination_rel_path] = new ScaffoldFileInfo($scaffold_file->destination(), $op);
    +        }
    +      }
    +    }
    +  }
    

    Same items/files naming mismatch

    "Adds files with modified destinations to scaffoldFilesByProject

    It's also confusing that it's called skipFiles. It doesn't actually skip the files, it marks them for skipping (or more accurately just removes them from a list).

    Again we have nested foreach statements with embedded ifs. Another candidate for extraction. Will go a long way to making this code much more comprehensible.

Frank HH-germany’s picture

Title: Avoid overwriting .htaccess changes during scaffolding » Avoid overwriting .htaccess changes during scaffolding > security problem

Hi,
I have the same little problem.

When I install a module with Composer, the htaccess is always overwritten.
Even if I set the file permissions to 404.

How can it be that the access rights are simply bypassed.
These should actually be set on the server side.

I see a security problem here.

  - Installing drupal/cookiebot (dev-1.x 9cd0f09): Cloning 9cd0f099bf
Writing lock file
Generating autoload files
Scaffolding files for drupal/core:
  - Copy [web-root]/.htaccess from assets/scaffold/files/htaccess

Version: 9.1.x-dev » 9.2.x-dev

Drupal 9.1.0-alpha1 will be released the week of October 19, 2020, which means new developments and disruptive changes should now be targeted for the 9.2.x-dev branch. For more information see the Drupal 9 minor version schedule and the Allowed changes during the Drupal 9 release cycle.

RBeast’s picture

I have been dealing with the .htaccess file being over written and have tried several scaffolding methods. I am not an experienced developer and found it pretty finicky. The last scaffolding approach I worked on is based on this one:

"post-drupal-scaffold-cmd": [
  "cd docroot && patch -p1 <../patches/htaccess-ssl.patch"
]

I am sure someone can better my htaccess patch, but here it is for reference.

cat htaccess_ssl.patch

182a183,194
>
> # Append to htaccess - RB 20210207
>
> # Redirect to HTTPS - RB 20210205 - works to force https://
> RewriteCond %{HTTPS} off
> RewriteCond %{HTTP:X-Forwarded-Proto} !https
> RewriteRule ^(.*)$ https://%{HTTP_HOST}%{REQUEST_URI} [L,R=301]
>
> # Redirect all users to access the site WITH the 'www.' prefix - RB 20210205
> RewriteCond %{HTTP_HOST} .
> RewriteCond %{HTTP_HOST} !^www\. [NC]
> RewriteRule ^ http%{ENV:protossl}://www.%{HTTP_HOST}%{REQUEST_URI} [L,R=301]

It seems like the possible way to deal with this would be to have a tested patching routine in the Drupal composer.json file and perhaps a patch folder in the Drupal directory above docroot (drupal/patch above /web). That folder could be populated with diff patches, prepends and appends that point to the file to be patched. If it just ran at the end of a composer command, it would just be part of the eco-system. I am not sure how the scheme would work. Perhaps the patch path is in the patch file or the patch file gets a naming convention like templates. These are considerations for pro devs, lol.

greg.1.anderson’s picture

If your additions to a file can be done strictly with an append, then you are better off using an append rule:

{
  "name": "my-org/my-project",
  "extra": {
    "drupal-scaffold": {
      "file-mapping": {
        "[web-root]/.htaccess": {
          "append": "assets/my-htaccess-additions.txt",
        }
      }
    }
  }
}

Version: 9.2.x-dev » 9.3.x-dev

Drupal 9.2.0-alpha1 will be released the week of May 3, 2021, which means new developments and disruptive changes should now be targeted for the 9.3.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

RoSk0’s picture

NaheemSays’s picture

This is also relevant to the phpunit.xml file that needs creation/modifying to develop drupal or modules: #3220203: phpunit.xml is deleted when doing composer update

devkinetic’s picture

Here is a quick guide to the patching of the htaccess file.

Clone the version of Drupal you are on:

git clone git@github.com:drupal/drupal.git
cd drupal/
git checkout X.X.X

Make your edits:

vi core/assets/scaffold/files/htaccess
git diff --patch > htaccess.patch

Add the patch to your project:

cd your-project
composer require cweagans/composer-patches
cp ../the-patch-file/htaccess.patch scaffold-modifications/
vi composer.json
"extra": {
    "patches": {
        "drupal/core": {
            "Htaccess Modifications": "scaffold-modifications/htaccess.patch",
        }
    }
}

Update your project:
composer install && composer update --lock

mlncn’s picture

I think we're letting the perfect be the enemy of the good here. We had an RTBC patch two years ago that would have saved lots of developers hours of time per site over the course of those two years, even if we had reduced its ambitions a good bit.

The current behavior is that every time any composer addition or update is run, any changed scaffolding file gets overridden.

Greg Anderson has done all the hard work so we can at least make this improvement:

  1. If you have changed a scaffolding file, and core has not, your change does not get overridden.
  2. When core changes a scaffolding file, your files get overridden whether you've made a change or not.

We're expected to have these files committed to git— or at any rate, it's much easier to get a full picture of how Drupal's changes interact with your changes that way, than trying to maintain one's own core patches just to redirect from www to non-www domain. The change above moves us from perhaps a hundred times a month needing to undo (git checkout -p) Drupal's overrides during active development to once every six months or so.

We can put aside any concerns about messaging, not getting security updates, etc. (and make it far less likely for people to take these files out of scaffolding entirely, thus enhancing security) and get the above benefit by taking a subset of #102, yes?

douggreen’s picture

Please don't let this thought get in the way of the solution from @ greg.1.anderson. I have an alternate idea. We could do both.

We've only considered a composer patching solution. Why not a code solution? We could rewrite the .htaccess in Drupal from the scaffolding, have various config options that can be deployed or the user can using a UI. read changes from config and apply against the scaffolding, and offer an alter hook to allow more modules to modify it.

This solution would allow admins to quickly add changes for things that are immediate, like blocking sites, and/or path redirects.

I'm a little surprised that we don't have a contrib solution for this. I'm also surprised that https://www.drupal.org/project/htaccess has gone unmaintained in 8.x/9.x, although my thoughts are something a little different than what that module did.

Version: 9.3.x-dev » 9.4.x-dev

Drupal 9.3.0-rc1 was released on November 26, 2021, which means new developments and disruptive changes should now be targeted for the 9.4.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

stephencamilo’s picture

Status: Needs work » Closed (won't fix)
andypost’s picture

Status: Closed (won't fix) » Needs work

Version: 9.4.x-dev » 9.5.x-dev

Drupal 9.4.0-alpha1 was released on May 6, 2022, which means new developments and disruptive changes should now be targeted for the 9.5.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

devkinetic’s picture

So I started working on rerolling against 9.4.7 with the patch from #117. I can apply the patch locally, but not through composer for some reason. Hopefully this help keep the ball rolling.

narendra.rajwar27’s picture

Status: Needs work » Needs review
FileSize
30.22 KB
2.24 KB

Adding patch by trying to fix failure.

Status: Needs review » Needs work

The last submitted patch, 138: 3092563-138.patch, failed testing. View results

Version: 9.5.x-dev » 10.1.x-dev

Drupal 9.5.0-beta2 and Drupal 10.0.0-beta2 were released on September 29, 2022, which means new developments and disruptive changes should now be targeted for the 10.1.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

nhoeller’s picture

I just got hit by this issue, fortunately on a test system. On October 8th, I had updated .htaccess to enabled PHP 8. When I ran composer to upgrade CATCHA this morning, I checked the Drupal Status report and noticed OpCache was disabled. I tracked down the problem to .htaccess being changed. At first, I thought my site had been hacked, but I then realized the system is running on a shared server two timezones west - the adjusted time of the 'new' .htaccess file now matched when I ran composer.

I have not read all the posts in this thread but other Linux components handle changes to key configuration files by asking for permission. If permission is not granted, the 'updated' configuration file is saved with an easily identifiable name so that I can compare the files.

BramDriesen’s picture

What was said in #131: "Bail" in update.php to me sounds like the ideal solution.

Version: 10.1.x-dev » 11.x-dev

Drupal core is moving towards using a “main” branch. As an interim step, a new 11.x branch has been opened, as Drupal.org infrastructure cannot currently fully support a branch named main. New developments and disruptive changes should now be targeted for the 11.x branch, which currently accepts only minor-version allowed changes. For more information, see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

vacho’s picture

Rebasing the patch #138

wheelercreek’s picture

I realize that patching or appending is the recommended way to do this (to keep up with possible future security updates), but if anyone is looking for a quick way to prevent scaffolding from reverting a file, you can also just exclude it:

https://www.drupal.org/docs/develop/using-composer/using-drupals-compose...

I have a bunch of custom commands in my .htaccess to accommodate another PHP application running on the same web root, and this issues has hit me 4-5 times now. Not dealing with it any more.

crutch’s picture

Our main problem is htaccess. We are going to need to do #145. It's better for us to review what changed rather than get the update and overwrite every time composer is used.

We have custom apps that require ssl. Composer updates/overwrites htaccess. If we don't immediately overwrite it with our custom version support requests start coming in. #132 sounds nice to be able to set common configurations to htaccess that stay in place no matter what core changes are made.

Bhanu951 made their first commit to this issue’s fork.

cilefen’s picture

With #145 is this even a bug issue at this time? Excluding, altering, and patching files is documented. We do it regularly without problems.