Problem/Motivation
Follow-up based on #2948579-15: Block web.config in .htaccess (and vice-versa)
The .htaccess and web.config files provided with core both contain rules to block access to source and config files that should not be viewable through the web server, e.g.:
# Protect files and directories from prying eyes.
<FilesMatch "\.(engine|inc|install|make|module|profile|po|sh|.*sql|theme|twig|tpl(\.php)?|xtmpl|yml)(~|\.sw[op]|\.bak|\.orig|\.save)?$|^(\.(?!well-known).*|Entries.*|Repository|Root|Tag|Template|composer\.(json|lock))$|^#.*#$|\.php(~|\.sw[op]|\.bak|\.orig|\.save)$">
However, as @longwave pointed out in the linked issue, the rules seem to have got out of sync (i.e. they don't match exactly) across the two files:
I did also notice that the regexes differ elsewhere in a few places - web.config blocks code-style.pl (not sure why) and what looks like SVN repository files, but not .make files - we should probably open another issue to unify this properly. Both files also block .xtmpl files, which are surely long obsolete.
Proposed resolution
Review the relevant rules in .htaccess and web.config and alter them so that they match, as appropriate.
Remaining tasks
* Review the rules, and decide what changes need to be made to each.
* Make those changes.
* Check whether any tests need to be updated.
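One way to approach the review step is to run a list of sample filenames through both patterns and report where they disagree. The following is a minimal sketch in Python, not a definitive tool. HTACCESS_PATTERN is copied from the FilesMatch rule quoted above. WEBCONFIG_LIKE_PATTERN is a hypothetical stand-in and not the real web.config rule: it only reflects the two differences @longwave described (no .make, plus code-style.pl). The function name find_disagreements and the sample filenames are also assumptions made for illustration. Python's regex engine is not the one Apache or IIS uses, so results should be confirmed against the real servers.

```python
import re

# Pattern copied from the .htaccess <FilesMatch> rule quoted in the summary.
HTACCESS_PATTERN = re.compile(
    r"\.(engine|inc|install|make|module|profile|po|sh|.*sql|theme|twig|tpl(\.php)?|xtmpl|yml)"
    r"(~|\.sw[op]|\.bak|\.orig|\.save)?$"
    r"|^(\.(?!well-known).*|Entries.*|Repository|Root|Tag|Template|composer\.(json|lock))$"
    r"|^#.*#$"
    r"|\.php(~|\.sw[op]|\.bak|\.orig|\.save)$"
)

# HYPOTHETICAL stand-in for the web.config rule (NOT the real file contents):
# the .htaccess pattern minus "make", plus "code-style.pl".
WEBCONFIG_LIKE_PATTERN = re.compile(
    r"\.(engine|inc|install|module|profile|po|sh|.*sql|theme|twig|tpl(\.php)?|xtmpl|yml)"
    r"(~|\.sw[op]|\.bak|\.orig|\.save)?$"
    r"|^(\.(?!well-known).*|code-style\.pl|Entries.*|Repository|Root|Tag|Template|composer\.(json|lock))$"
    r"|^#.*#$"
    r"|\.php(~|\.sw[op]|\.bak|\.orig|\.save)$"
)


def find_disagreements(pattern_a, pattern_b, filenames):
    """Return the filenames that one pattern blocks and the other does not."""
    return [
        name
        for name in filenames
        if bool(pattern_a.search(name)) != bool(pattern_b.search(name))
    ]


# Illustrative sample filenames (assumed, not taken from core's tests).
SAMPLE_FILENAMES = ["example.make", "code-style.pl", "index.php", "settings.yml"]

if __name__ == "__main__":
    for name in find_disagreements(HTACCESS_PATTERN, WEBCONFIG_LIKE_PATTERN, SAMPLE_FILENAMES):
        print("rules disagree on:", name)
```

Swapping the stand-in for the actual pattern extracted from web.config, and extending the sample list, would turn this into a rough checklist of the changes each file needs.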
User interface changes
None.
API changes
None... I think; although it's possible that changing these files could change the behaviour of existing sites... so that'd need to be considered.
Data model changes
None.
Comments
Comment #2
mcdruid commented
Comment #3
ayesh commented
Comment #4
ayesh commented
Hi @mcdruid and everyone,
I did some research around the web.config and .htaccess regular expressions, and you are right that there are things we should align. There is also a lot of room for improvement.
https://regex101.com/r/WZAIsO/1
1. Performance
This regex takes 1789 steps. That is too many, considering we only need to match the file extension and certain directories. Because we will need to release an .htaccess update, I think it's better if we optimize this as well.
2. Add-only modifications over time
The regular expressions are long and somewhat difficult to maintain. In addition, we have one more rule that blocks all files and directories starting with a dot. The longer regex is hard to wrap your head around; it is so long that you could write and publish a small npm package before you finish reading it. I suggest that we remove some redundant rules.
Comment #5
ayesh commented
I created another issue because I didn't want to hijack this one, whose main focus, I believe, is reconciling .htaccess and web.config. However, I also believe a wider clean-up of these files is in order.
Comment #14
mastap commented
I am wondering why we don't reverse this logic and list only the extensions and file patterns that we want to serve, excluding all others from being served.
I feel this would secure the environment even further.
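To make the reversed logic concrete, here is a minimal sketch in Python of an allowlist check, as opposed to the current blocklist. It is only an illustration of the idea: the function name is_served and the extension list are hypothetical, not anything core ships, and a real site would also need to account for PHP entry points such as index.php and for whatever file types its modules and themes serve.

```python
import os

# HYPOTHETICAL allowlist of static asset extensions; a real list would need review.
ALLOWED_EXTENSIONS = {".css", ".js", ".png", ".jpg", ".gif", ".svg", ".ico", ".woff2"}


def is_served(filename, allowed_extensions=ALLOWED_EXTENSIONS):
    """Serve a file only if its final extension is explicitly allowed."""
    _, extension = os.path.splitext(filename)
    return extension.lower() in allowed_extensions


if __name__ == "__main__":
    for name in ["logo.png", "settings.yml", "backup.sql", ".htaccess", "index.php~"]:
        print(name, "->", "serve" if is_served(name) else "deny")
```

With this approach, editor backups, dot files, and any new file type are denied by default, so the rule does not need to grow each time a new sensitive extension appears. The trade-off is that legitimate new file types are blocked until someone adds them to the list.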