Problem/Motivation
There's competing discussions on how to tackle moving to Level 2 for PHPStan (#3322735: [meta] Fix as many PHPStan level 2 issues as possible before bumping the rule level vs #3425412: [Sep 14, 2026] Bump PHPStan rule level to 3) and we can see that such discussions cost a lot of time. In the meantime we're actively adding new code that might not even live up to level 2 yet (because we're not testing for it).
Additionally there's a lot of discussion in the Coder issue queue about better supporting PHPStan types in PHPCS which causes a lot of work to be done on improving the sniffs. However, there's significant overlap in the things PHPStan can do at higher levels and the discussions we're having in Coder. For example making sure that a @param references an actual parameter for the function.
Additionally we're rapidly introducing new typed code which will only make our lives at higher levels a lot more difficult because we're not taking proper typing into account yet (we're not letting PHPStan enforce it). PHPStan also explicitly mentions this as a risk in the docs about the baseline.
An example of the cool things that validate at level 9 and can be used to prevent errors by forcing proper error handling is the introduction of a #3425201: Implement a Result type in Drupal Core result type. The return type of Result::getValue() will be 1 of two types, but calling Result::isOk or Result::isError first will narrow that to a single type. This helps force calling code to make sure the result was actually successful or an error. However, PHPStan will not actually help us with any of these issues until we go to level 7 which will "report partially wrong union types - if you call a method that only exists on some types in a union type, level 7 starts to report that; other possibly incorrect situations". Until that time any new code introduced can simply treat the value of a Result as equal to the success case without getting any feedback, potentially causing breakage in the error case.
One argument that has been raised against moving PHPStan to a higher level is that this leads to a very large baseline which can cause memory and performance issues. To solve this issue PHPStan has introduced a PHP baseline format instead of NEON which improves performance and memory usage for baselines that reach megabyte sizes (which Drupal may hit, but our level 1 baseline is only 100kb). This further emphasizes PHPStan's mindset of "Make sure you introduce new code at the highest level and don't get stuck in existing errors".
Steps to reproduce
Proposed resolution
We should follow PHPStan's recommendation to convert the baseline from Neon to PHP and make it capable of being a lot bigger while not negatively impacting performance too much.
Next we should bump PHPStan to level 9. Level 9 was chosen because all lower levels leave something on the table. We're already seeing a lot of new code in Drupal where we use mixed as an escape hatch. mixed in PHP is similar to TypeScript's any type in that it's poisonous and eliminates type-safety where it's desired.
Similarly level 8 prevents accidental crashes through "Tried to call X on NULL" and level 7 helps ensure things like the Result type mentioned above work properly.
ℹ️ The baseline captures currently existing errors allowing us to add new code at higher levels without having to fix existing code. This ensures that we aren't working against an ever increasing amount of things to fix. This issue is focused on stopping the increase of new issues, not resolving the existing one (although that should also be easier with the help of higher PHPStan levels).
Exceptions added
In order to not frustrate Drupal's use of render arrays and instead allow people to slowly start adding types to those we ignore the error ... has no value type specified in iterable type array. We do not disable the type requirement for all iterable types because specifying them does have a lot of value outside of render arrays in improving the type-safety of loops in code.
Performance
One of the concerns with the large baseline was performance but PHPStan is really good on this front.
A PHPStan run on Drupal without cache will complete in about 2.5 minutes on a 2021 Apple M1 Pro. Once the PHPStan cache is built the analysis completes in 10 seconds.
Given the amount of bugs that PHPStan can find for us at higher levels and the improved developer experience that Drupal can provide through better types (e.g. typing AccessibleInterface::access's $return_as_object without help from PHPStan Drupal) the small time to wait for an initial analysis seems like a very worthwhile tradeoff.
Follow-ups
Of course we now have a large amount of errors in our baseline which will need solving. A good first step that I've found on other projects is to find the exceptions that say "always" and fix those, before moving on to other things. However, making sure new code and refactorings end up with better typed code (using all of PHPStan's power) will make cleaning up older un-/mistyped code as well.
In #13 mglaman mentioned "Which made me realize that Drupal core is scanning itself and having it's phpdoc overridden by phpstan-drupal for defining iterables". The types currently defined by PHPStan Drupal might be good initial follow-ups to improve Drupal's typing and reduce the reliance on type-hints that live outside of Drupal's codebase.
Remaining tasks
- Agree on the proposed configuration for PHPStan
- Receive release manager sign-off
- Write release notes
- Create change record if needed
User interface changes
API changes
Data model changes
Release notes snippet
| Comment | File | Size | Author |
|---|---|---|---|
| #52 | Drupal 10.1.x to 11.x at PHPStan Level 9.png | 98.93 KB | kingdutch |
| #52 | Drupal 10.1.x to 11.x at PHPStan Level 1.png | 55.34 KB | kingdutch |
| #42 | 3426047-nr-bot.txt | 90 bytes | needs-review-queue-bot |
| #38 | 3426047-nr-bot.txt | 10.4 KB | needs-review-queue-bot |
| #16 | 3426047-nr-bot.txt | 90 bytes | needs-review-queue-bot |
Issue fork drupal-3426047
Show commands
Start within a Git clone of the project using the version control instructions.
Or, if you do not have SSH keys set up on git.drupalcode.org:
Comments
Comment #3
kingdutchComment #4
longwaveI worry here that just jumping straight to level 9 will mean contributors have more work to do when submitting MRs against existing code, sometimes we have to do things for backward compatibility reasons and I think satisfying PHPStan is not always possible without breaking BC? In turn that means having to make decisions whether or not to explicitly add more things to the baseline.
Comment #5
mondrake-1 (for now)
It will simply make impossible to contribute new code, since you'll have to develop huge workarounds to get sorted all the info which is missing from core codebase at the moment (description of iterable types, narrowing of types, etc etc).
An average contrib module can get to L5 relatively simple, L6+ is more work in satisfying PHPStan than in anything else.
Comment #6
kingdutchI'd want to respectfully disagree. I definitely think there's a learning curve. We'll have to go through that as a community at some point, but if we do our lives will become easier. Better documentation on how to solve common type issues can help there and I'm willing to help write that if it can help speed up Drupal's move to strict-typing.
One habit that I see with developers is the impulse to add
/** @var */annotations to solve typing warnings when upstream types are not yet set. That is indeed something that becomes harder to maintain. A much better way of solving that is to useassertinstead. It has the benefit of ensuring that your tests fail if what you're saying is false and PHPStan will let you know when the upstream types have improved and theassertis no longer needed.If iterable types is truly a problem then we can disable that behaviour specifically with
checkMissingIterableValueType: falsebut having the iterable types specified makes yourforeachloops so much nicer and gives you type completion without having to do extra type annotations or asserts within your loop.As a counter-example, I would consider the Open Social distribution a non-trivial codebase with about 69 top-level modules (and a whole bunch of nested modules). We've been running at level 8 for three years and counting and our code has become significantly higher quality with a measurable reduction in bugs as a result.
Comment #7
kingdutchI think that's a fair worry, specifically because they'll have to think about types in a way that we don't have to do now though. It's a trade-off of short-term pain for long term gain. The work they'd have to do now would otherwise be work we'd have to do at some point in the future.
The baseline is also very good at ensuring that these kinds of changes can happen locally without having to go and fix the whole file just because you're changing a single line.
I'd be curious to see examples of this and whether I can offer ways around it! In general PHPStan allows a lot of ways to add type information without actually changing the functioning of the code (e.g. through
assertor@annotations).What might happen is that Drupal becomes better at typing its code which will cause downstream projects to suddenly see PHPStan errors where there were none before. However, I personally don't consider that a breaking change because it's clarifying intention. As long as the PHP code keeps working without errors and doing what it promises then that would still be backwards compatible.
Comment #8
mglamanIf we're worried about size, can't we add the baseline as an ignore in https://git.drupalcode.org/project/drupal/-/blob/11.x/.gitattributes ?
Then it won't be part of the distributed archive
Comment #9
alex.skrypnykI really like this approach to working with the baseline config: there are 486 exclusions in the baseline config, but they are all laid out nicely in a single file. This means that those items can be "sliced and diced" into standalone tickets and tackled by the community. At the same time, all the new code will be coming at the level 9.
Comment #10
acbramley commentedCould we target something slightly lower at first?
My main concern with a very large baseline is not performance, but conflicts in the baseline file - making rerolls tedious. Conflicts in baselines are sometimes hard to know how to resolve especially for non-senior developers. This usually results in having to push follow up fixes to MRs creating more noise/CI minutes/$$$
Comment #11
mstrelan commentedI'm keen for something like this but agree with #10 that level 9 might be a bit ambitious. I'm thinking something more like level 4-6. For example, level 4 would have prevented #3426217: Fix @param docs for $deriver on plugin attribute classes.
Comment #12
dpiFor me, 6 is always the minimum for projects I'm introducing PHPStan to. I've found it has most value, without getting in the way. Though for new projects, I agree 9/max is fine.
9 seems nuts, especially maintaining a baseline.
When we do end up going there, I'd want us to consider bringing in
phpstan/phpstan-strict-rulesAnd at least having configs like:
So we can have a manageable baseline.
Stepping stones
Comment #13
mglamanVia Slack I had mentioned we should use
checkMissingIterableValueType: false. I'm just bummed it kills analysis for all iterables and not just arrays :/But I guess this is OK because all of the phpdoc enhancements for that live in phpstan-drupal right now. So eventually they can be moved here once the level is bumped and they're actually analyzed.
🤔 Which made me realize that Drupal core is scanning itself and having it's phpdoc overridden by phpstan-drupal for defining iterables
Comment #14
kingdutchI'm skipping #8 for now until we know what the baseline will actually look like. At that point we can decide whether it needs to be excluded from generated archives. I think it's a good proposal and an easy addition though.
Replying to #10, #11, and #12, paraphrased:
I think there's two things I could bring up to hopefully alleviate this concern.
Firstly, I think the decision to use PHP for the baseline rather than NEON will help git reduce the number of conflicts. The NEON baseline is basically a giant list of YAML which I believe git treats as plaintext which often causes it to not know which array item marker (
-) belongs to which rule. Example excerpt from Open Social below.For PHP there are more clearly defined boundaries because the start and end of a rule are distinctly delineated by different tokens (the PHP open and close array syntax). An excerpt from the new PHP baseline from Drupal Core:
This should help git better differentiate between different changes and reduce the number of merge conflicts that are actually flagged. It should also help developers better understand conflicts because they're dealing with PHP code, rather than the middle of a huge YAML list.
Secondly, however, if developers are manually fixing conflicts in the baseline file then I think we're doing a disservice to them. The baseline is the result of the changes in our code and is a file that's automatically generated by PHPStan so we really shouldn't be fixing conflicts manually. My recommendation here would be to educate developers to fix conflicts by regenerating the baseline file (during a rebase or merge commit):
PHPStan Baseline reviews are also pretty straightforward to know whether such a regeneration was successful. Any removals are improvements in code-type quality and will be caught by the code review itself, so they can be blindly accepted. Any additions need some attention and hopefully spark discussion for code improvements.
I'm a big fan of the strict rules, but I think there's value in the higher PHPStan levels even without it. So I would propose we evaluate it in a follow-up issue :)
I can happily say that
reportUnmatchedIgnoredErrorsis enabled in PHPStan by default, so there's no need for us to add it to the config unless we want to turn it off (which we shouldn't).For
checkMissingIterableValueTypeI already had a short discussion with mglaman on Slack which I think is worth repeating here:PHPStan tells us the number of errors in the baseline with the new level 9 settings is 98503. Of those, 19447 contain the sentence "no value type specified in iterable type array" which brings our error count down to 79056. About 372 further errors contain missing "in iterable type" in a non-array types. I still think there is value in fixing those non-array type iterables (think about things like field item lists) and thinking about how we introduce other iterable types. So my proposal would be to manually ignore errors of the type
*in iterable type array*then we can still fix other iterable types but at least don't need to go out and fix all the render arrays that are being pushed around immediately.Disabling
checkGenericClassInNonGenericObjectTypewould remove another roughly ~500 ignored errors. Though I would argue that use of generic typing in Drupal might increase (as we add@templateannotation to indicate a class is generic over a certain type) and having that information available can make consuming such code a lot easier (because it ensures the return types don't need to bemixed). So I'd urge keeping the generic check enabled. Disabling it would for example negate almost all benefits of the types for #3425201: Implement a Result type in Drupal Core.On Slack alex.skrypnyk mentioned we probably want to have a core committer review this issue. Given the impact this issue has on the requirements for future code to be merged and reviewed, I agree. I believe the release manager tag is most appropriate (and they can escalate to "Needs committer feedback" if needed (since it's mentioned as use sparingly).
Comment #15
acbramley commentedThat would be nice, but I still need convincing 😅 - Maybe we open a separate issue to swap the baseline to use PHP as that seems much more likely to get in quickly?
I agree, perhaps yet another issue to add a script to make this easier to run?
Comment #16
needs-review-queue-bot commentedThe Needs Review Queue Bot tested this issue. It no longer applies to Drupal core. Therefore, this issue status is now "Needs work".
This does not mean that the patch necessarily needs to be re-rolled or the MR rebased. Read the Issue Summary, the issue tags and the latest discussion here to determine what needs to be done.
Consult the Drupal Contributor Guide to find step-by-step guides for working with issues.
Comment #17
catchYes this would be an easy change and then we can see how it goes with the existing level of conflicts we have to deal with.
My old laptop, which was not that old - it was a 6th gen carbon x1 with 16gb ram, used to max out all CPUs and memory doing a full phpstan run and go into swap. I used to have to patch core to limit the number of CPUs that phpstan could use to 3 or 4 so that it would stay within bounds, then it would be OK, but sometimes I would forget - the config is in git so couldn't just change it permanently. On my current machine that doesn't happen, but it can still take a long time to do a full run - this is just the scan, not generating a baseline. I think @Gabor has run into that probably more recently than me on completely different hardware too.
Comment #18
kingdutchI've split out the change from Neon to PHP into #3426548: Convert the PHPStan baseline from NEON to PHP
That also includes the update to the GitLab CI to output the baseline in the PHP format there. The functionality to output the baseline as artifact was previously implemented in #3403653: Incorporate improvements to how contrib runs PHPStan to core
This branch has been rebased on top of 11.x and the work from #3426548: Convert the PHPStan baseline from NEON to PHP has been cherry-picked in to see the pipeline status in this one. That means this issue is now postponed on the other one :)
Comment #19
quietone commentedMentioned this to the other release managers and catch suggested adding a tag.
Comment #20
larowlanI wonder if we can run this in CI against files in `git diff --diff-filter=A --name-only` i.e. require level 9 for all new files but avoid the conflict hassles of the baseline?
Comment #21
mstrelan commentedI think we had that previously and it didn't really work, you need to scan the whole codebase.
Comment #22
kingdutchThe maintainer of PHPStan wrote a blogpost about why we shouldn't do that :D
It also makes intuitive sense because if I change the type of an interface then that's the only file that's changed. However, we know that changes in interfaces might affect all classes that implement them, so we can't rely on only analysing the interface.
Comment #23
cmlaraI want to throw this out there as a suggestion in case the baseline rebase becomes a blocker to this being accepted.
We could at least consider changing the .gitlab-ci code-quality report asset generation run of PHPStan to use L9 while the enforcement run remains L1.
This would allow us to visually see errors in GitLab and manually enforce fixing them in changed lines without otherwise failing the builds.
Not as ideal as having the MR fail however I would contend even that change would create an improvement in code quality compared to the current use of L1 only.
If we wanted to keep the L9 quality report small or see overall progress we could add a second PHPStan L9 baseline that we manually bump every few weeks.
IIRC a config file change invalidates the PHPStan cache so this would likely be trading a longer test execution time, or adding a second PHPStan Job to avoid wall time increase for higher quality code.
If we did add a second job we could consider use both reports, and possibly downgrade (using sed) the severity of errors in the L9 to info while allowing the L1 run to show Blocker to allow easily determine if it’s a “must fix” vs “should fix”
Comment #24
alexpottI think there needs to be a conversation about adding a 15 megabyte file to the core directory and at the very least decent instructions about how to not deploy this to your site.
Comment #25
longwaveLooking at https://phpstan.org/user-guide/rule-levels (and from past experience with work projects) I am not sure we should go past level 5 in core just yet. My reasoning is that level 6 requires us to add typehints and we are still quite far from adding typehints everywhere - everyone wants to do it but there is a lot of backward compatibility to figure out first. In the MR, 21k of the almost 60k errors are "...has no return type specified".
Comment #26
cmlaraIn my experience with contrib the majority of these are usually due to our implicit expectation of
@return voidwhen the @return block is missing in docblocks. See https://www.drupal.org/node/1354#functions and #1510838: Define and document @return void policy.While we have a standard now that all new code should be using strict typehints and return typehints it relies on human review to observe.
In my opinion, having a baseline for this would also help us remove these warnings. Once we know what the errors are we can write automated tooling to bulk add
@return voidto any remaining methods (in small batches as the core team will likely not want to cause a mass need to re-roll all pending MR's at once) and by just reviewing for errors (anywhere we now see "should return void but returns something") we know we have an API breach that we need to exclude for further followup.Adding
@return voidwould not have BC concerns as that is already the current specified for the method, we would just be documenting it in code.Comment #27
catchEven if @return void is OK to add
: voidin interfaces/base classes etc. is not without breaking existing classes - i.e. we need to add it in a major release after triggering deprecations for extending/implementing classes.I don't think we should add a baseline which requires adding type hints to existing code until we actually have an approach for how to do that in core, which is #3050720: [Meta] Implement strict typing in existing code, #3333824: Enable existing interfaces to add return type hints with a deprecation message for implementors etc. If we don't have the approach, we're adding baseline we can never remove. Help on those issues would be appreciated.
If bulk adding @return void to methods will reduce the baseline significantly (without having to add the actual return type hint), then a spin-off issue to do that first and bring the baseline down seems sensible too - even if there's a window where some could creep in between that commit and this one, seems like an easy way to get the file size down and reduce merge conflicts over time.
Comment #28
kingdutchI made a small script (attached as txt file) that allows us to use the PHP baseline to get some insight into what type of errors we're dealing with, which can be a bit more precise than my previous estimates. I still believe that we should embrace the baseline to make sure we don't increase the count of errors we're introducing in new code (and a lower PHPStan level simply isn't going to do that).
However, I am open to seeing if we can get better insight into the contents of the baseline and see if we can resolve some issues using PHPDoc annotations in parallel. I'd also be willing to invest some time in that if that helps acceptance of a baseline size and this issue landing sooner.
Here's the output of the current iteration of the script. I'd love to know if there's a preferred way of tackling these (e.g. I could add all return types to Drupal's functions as I expect them to be relatively straightforward, but that might be a patch with 2000 changed lines.
There was a thread in Slack about the issue raised in #24
drumm mentioned "Composer does no archiving, its just automating downloading zip files from from GitHub & ftp.drupal.org, and git clone git.drupalcode.org & elsewhere". That means that the proposed solution by Mglaman of adding the following to
.gitattributesundercore/would ensure the file is not distributed through composer or Drupal.org archives.This does still let the baseline be downloaded when people install with composer using the
--prefer-sourceflag which I would say is proper behaviour since the baseline is part of the drupal/core source.Comment #29
kingdutchAdjusted the script to gain a bit more insights at level 9 :)
The lines that say "always" or "unnecessary" is where PHPStan allows us to remove code :D
Operations on "may be NULL" or "may be FALSE" are code-paths that may manifest in bugs.
Comment #30
bradjones1Crazy idea/question - what size is the baseline file, compressed? Given it is text and (presumably) has a ton of repetition in it, it could be put in the repo in a compressed state (still with some prevention to have it packaged) and that helps solve the repo size issue? That does make it harder to diff "at rest," but uh, yeah. Crazy idea.
Edited to add another idea: What if it doesn't live in the repo but we leverage artifacts and make this a CI "only" construct in some way?
Comment #31
kingdutchI don't think modifying the baseline further or moving it out of the repository is a good idea. It would make the development experience of running PHPStan locally (and updating the baseline) too cumbersome.
While I do understand that 15 MB is a lot on the scale that Drupal is operating, we should also place the size of the file in perspective. I think our primary goal should be that people who deploy Drupal to production shouldn't suffer its cost. For that we have a good proposal which I've committed to the branch (although it does need testing which I can not do against a dev branch in a fork). I believe that as long as the file is not bundled in the archives on Drupal.org or in the dist version of Composer that's on Packagist then we've achieved that goal.
It's unrealistic to try and save the same 15MB when dealing with the raw source repositories (e.g. a
git cloneor--prefer-sourcecomposer install) because those are likely scenarios in which you want to have the baseline present. For an initial clone ofdrupal/drupalthe new baseline file would represent a 7% increase in repo size; for an inital clone ofdrupal/corethe new baseline file would represent an 8% increase in repo size (counting the existing baseline as 0-size in both cases).However that size would only be downloaded once, since it's a text file that git can efficiently update rather than having to re-download it on every pull request.
Additionally we know that this is a file where our aim is to make it smaller quickly (because we'd be addressing real issues and improving the developer experience of anyone using Drupal) and it's not a file that will stay this big indefinitely.
Fresh git clones of both drupal/drupal and drupal/core to show the size of the repo's inital clone for comparison to the increase caused by the baseline.
Comment #32
mfbLooks like when compressed in a git packfile, this baseline compresses pretty well, down to 851KB of space
Comment #33
longwaveFrom a release manager perspective I'm generally fine with the filesize in the core development repo for the reasons stated but agree that we should remove it from
drupal/coreinstalls (composer and tarball) because end users simply don't need this, it's only useful for core development.Comment #34
kingdutchThe other issue landed! Marking this as "Needs work" because it needs a rebase.
In #3426548-21: Convert the PHPStan baseline from NEON to PHP mstrelan mentioned
This needs some more investigation to impact of the large file. My initial response was:
longwave's feedback from #33 was previously addressed in the commit between #30 and #31 assuming the packaging works how we thought it did on Slack.
Comment #35
kingdutchI did some local experimentation with PHPStorm by opening it on the 11.x branch. This allows PHPStorm to settle on 2,23GB of memory used according to the activity monitor. I then switched to the
3426047-bump-phpstan-to-level-9branch for this issue. The indexing time that PHPStorm needed was near instant (basically a progress bar flash) and memory changed to 2.63GB. I'm not sure how scientific this is and whether all 0.4GB of memory pressure can be attributed to the baseline or whether PHPStorm does other things when switching branches that require memory.I added ".phpstan-baseline.php" to "Exclude files" under "Settings > Directories" and retried the experiment. I opened PHPStorm on the 11.x branch which settled at 2.25 GB. This time no re-indexing seemed to be triggered and memory usage changed to 2.36GB when checking out the branch for this issue.
Comment #36
bradjones1Thanks for addressing my hairbrained ideas on the file size.
My perspective on this is tainted a bit by working within codebases that have this problem at scale - e.g., Expo. The repository is legitimately bloated to the point it's difficult to download in a reasonable time on anything other than fiber. We are not there and this is a rare situation.
This is true for the _current_ file, but cloning with no depth limit will still pull in all the prior revisions. Not really gonna change anything, but unless we rewrite history or people routinely clone without depth limit (could be an idea for CI if it isn't already) then this is "here to stay" once it happens.
Comment #37
mstrelan commentedThank you for the PhpStorm test. I did similar and agree that indexing after switching to the MR branch finished very quickly and no noticeable change to memory usage. I foolishly decided to open the 200 000 line file expecting everything to freeze but it opened just fine, although CPU did increase a lot, and remained elevated until I closed the file. The file is not really meant to be opened anyway.
I noticed the file has been excluded from phpstan.neon.dist and phpcs.xml.dist, but should it also be excluded from .cspell.json?.
Comment #38
needs-review-queue-bot commentedThe Needs Review Queue Bot tested this issue. It fails the Drupal core commit checks. Therefore, this issue status is now "Needs work".
This does not mean that the patch necessarily needs to be re-rolled or the MR rebased. Read the Issue Summary, the issue tags and the latest discussion here to determine what needs to be done.
Consult the Drupal Contributor Guide to find step-by-step guides for working with issues.
Comment #39
mglamanActually, the needs review bot and open MR should tell us how often there are conflicts or other issues, right? For that concern.
Drupal core really needs gitattributes to exclude the export of the file (and more.) Then it won't be in the downloaded artifact. Maybe if someone is patching core and Composer downloads via Git they will.Didn't check that it was added to the MR 👆
Comment #40
alexpott#37 - definitely should be excluded from spell checking. There's no point to wasting time doing that.
Comment #41
kingdutchRebased the MR on top of the 47 new commits on the 11.x branch. Also added the baseline file to CSpell as requested in #37 and #40.
I ran the same analysis script I attached in #29 to give an indication of how many new errors (according to the higher rule level) were added in those 47 commits. I've attached the analysis output and manually added (+x) increase, (-x) decrease, or (~) equal indicators for each rule.
From #39
I'm not entirely sure. A merge conflict occurs when two parallel changes touch the same lines of code and git is not able to determine how to reconcile those. This MR makes + 295436/− 1079 changes to the baseline (since we include all currently known possible errors), so it essentially touches every line in the existing baseline and any other change will have a conflict with it once.
The normal number of changes in the baseline are much lower than 300k lines. With that in mind, the chance of two MRs touching the same lines of code in the large baseline go down significantly.
From #36
Unless I've missed something the only open concern (besides the potential for merge conflicts) is that because of the way git works (it captures history) the increase to the repository size with the new baseline size would be permanent for the lifetime of the project. This would affect clones which do not limit depth (by default clones fetch all history depth); we have taken steps to ensure this doesn't affect production deployments which don't use git.
I don't have a good solution for this and with access to 4G/5G and fiber in all my workplaces I think I might be slightly too privileged to have a proper say in this matter. Git has improved support for very large repos (measured in Gigabytes, so much larger than where Drupal is at) but these do require steps taken on the consumer side and are not something we can do from the repository side of things.
Comment #42
needs-review-queue-bot commentedThe Needs Review Queue Bot tested this issue. It no longer applies to Drupal core. Therefore, this issue status is now "Needs work".
This does not mean that the patch necessarily needs to be re-rolled or the MR rebased. Read the Issue Summary, the issue tags and the latest discussion here to determine what needs to be done.
Consult the Drupal Contributor Guide to find step-by-step guides for working with issues.
Comment #43
alexpottI've been running level 9 checks against the new recipe initiative code for quite some time and the checks are proving quite frustrating even for experienced contributors. The job we have does generate a baseline for you and everything but so many of the warning cannot be resolved because core is so non-compliant. See the recent commits to https://www.drupal.org/project/distributions_recipes/issues/3422821 for an example of how frustrated a contributor gets. I think we need to be really careful here.
Comment #44
catchFor me having to add entries baseline (which we'd have to add to core too unless we blocked issues like that on core refactoring) would undermine the entire point of this, it would break the pattern of not adding new entries to the baseline. At the moment we only ever do that for dependency updates (and then often immediately open follow-ups to remove them again), not for regular core MRs.
Comment #45
longwaveThat is basically what I was trying to say back in #4. I think maybe level 5 or so is a reasonable compromise but level 9 will require adding to the baseline for any non trivial issues and that will deter contributors.
Comment #46
phenaproximaOh, dear god, please don't do this.
As @alexpott said in #43, level 9 is a nightmare. Core's current state makes it basically impossible to write even very strict code.
Here's an idea: what if we progressively got to level 9, over time? Let's clear out everything in the current baseline, then bump it to level 2 or 3, clear out everything in that baseline, and repeat until we get to level 9. That will keep the tasks manageable and relatively straightforward. It'll take a long time, but that's what happens when you have a big codebase.
Or, another idea - add a level 9 job to core's CI pipeline, but allow it to fail. That will at least allow us to know if we're introducing new errors, whilst not blocking on preexisting problems.
I am certain that going directly to level 9 will effectively halt core contribution until a lot of disruptive issues are fixed. Inexperienced contributors will never, ever want to go near Drupal core again.
Comment #47
longwaveThis is the current plan anyway, see e.g. issues tagged with PHPStan-1 or #3322735: [meta] Fix as many PHPStan level 2 issues as possible before bumping the rule level
Comment #48
spokjeComment #49
catchI can see the arguments for going to a higher baseline so that new code is compliant, rather than adding to the amount to be fixed later, but this should mean that once we've generated the baseline once, it should either stay the same or go down, not get bigger.
@Spokje with bc policies do you mean type hinting existing code (which is where progress is almost zero) or more generally?
Comment #50
alex.skrypnyk@Kingdutch
I found it very difficult to understand that you are proposing to enforce lvl 9 for NEW code only.
Could you please update the title of the issue and maybe put something in bold in the description to clearly state what this issue is about.
Thank you
Comment #51
bradjones1Re: #50, that's the point of the baseline. Any new code cannot add to the baseline.
Comment #52
kingdutchThanks for the input everyone! And thanks alexpott and phenoxproxima for sharing your experiences with level 9 in the recipe fork.
Responding to the recipe example
If I look at the commits it's a bit difficult for me to see where exactly the friction is coming from (don't get me wrong, I do see the friction). There's two points I think I could summarize it as (and please correct me if I'm wrong): firstly a learning curve, secondly Drupal's use of arrays.
The learning curve is very understandable and I do think it's something we'll have to guide the Drupal community through. We're essentially going from a very loosely "do what you want" language to a very strictly typed language. Adding
@varannotations or preferablyassertstatements (since they fail your test if your assumption is wrong) is a part of the journey we'll have to take. Regardless of our strategy with PHPStan levels, that's something we'll have to go through at some point. The good news is that PHPStan will tell us if asserts are wrong (and thanks to Drupal using Bleeding Edge will also tell us if@varstatements are likely wrong). That provides us with a way to fix issues at the call site ("I know this is type X") while allowing us to fix the origin of the bug (the actual type annotation) and then have PHPStan tell us which call-sites no longer need the asserts/annotations.The second thing is Drupal's use of arrays. This is the only thing that ended up in the baseline. It's probably the hardest thing to fix in Drupal regardless of levels because it'll require re-thinking how we pass around our render data and define a more well-defined structure for it. This is exactly why this issue proposes a singular exception to the strict rules of PHPStan (from the issue summary):
We do not disable generics altogether because there's a lot of places (like Drupal's Entity API which is now partially patched by PHPStan Drupal) where generics can really add value in telling downstream code what type of value exists within a container. Introducing generics without immediately validating them with PHPStan is likely to cause us even greater pain in the future, hence why they're only disable for arrays and we keep them for other classes (where assertions/@var annotations can more easily be written if needed).
Why running PHPStan on a lower level is a fallacy
It's a bit challenging that PHPStan called its configuration steps "levels" because they make people think of them like "levels in a video game" where if you beat one level, the next one becomes easier. However, that's not really the case for PHPStan. I personally see them more as "groupings of complexity" (although depending on your code-base some higher levels might be easier than lower levels).
The challenge with Drupal's current approach is that while we're slowly fixing some issues at the lower levels, we're steadily introducing new things to fix at higher levels which are just sitting there waiting. The value across the levels also isn't equal. For example we all agree that type-hints are incredibly useful but these aren't checked until level 6 (that's the same level that introduces the generics that many people stumble over). It's not until level 8 that PHPStan will help us out in preventing "calling some method on NULL" which are errors that, if they slip through tests, will case a white screen of death for a site. The only difference between level 8 and 9 is PHPStan being strict about "mixed" which is a good thing because without that rule people might see "mixed" as a solution over finding the more specific union they're probably actually interested in. PHP's
mixedis akin to TypeScript'sany, which the TypeScript community has long since learned is dangerous. To remedy that they've introducedunknownwhich changes any from "could by anything" to "you must figure out what this is", that's basically all PHPStan's level 9 does over level 8.Some statistics
To back-up my claim that even though we're slowly chipping away at lower levels, we're introducing issues at higher PHPStan levels at a higher rate that we'll want to fix in some future, I've analysed ~2100 commits from 10.1.x until 11.x (around the end of March) for Level 1 and Level 9 of PHPStan and plotted their difference. This is based on the analysis script that I shared in this issue earlier but with all errors grouped into a specific category/explanation.
The scripts used to do this as well as the raw result data can be found in https://github.com/Kingdutch/analyse-drupal-phpstan
As we can see we're slowly chipping away at the 600 errors that are in our current baseline. Meanwhile we're introducing about 2 errors per commit on average, which dwarfs the work that we're doing at lower levels as we climb from about 55000 errors to nearly 60000 errrors.
This is to me what spurs my motivation to try and change our approach to how we tackle code quality since I believe the benefits that preventing the errors shown by PHPStan can bring us truly outweight the effort needed to prevent them.
In an attempt to make discussing the different levels of PHPStan based on data rather than my own or others' feelings I've analysed commit ec28f25d1e on the 11.x branch at level 1 through 9 and ran the results through the analysis script. The resulted categorisation is listed in a table below to show which errors and how many of them there are for the different levels of PHPStan.
Based on the table above I would love to hear the value people assign to the different categorised errors and whether it's worth preventing them. For example I value preventing null-based bugs because I've seen them as errors a lot and similarly believe that "mixed" is not a good type to use in 90% of circumstances. However if people disagree with me that might give us a more data and decision driven approach to finding which level we should land on going forward. Additionally if that's not 9 it may give us some ideas of what criteria we want to adhere to to move to higher levels in the future and how to ensure the higher levels don't grow as we fix lower level issues.
Comment #53
larowlanThanks for that amazing analysis @Kingdutch
If we did accept a larger baseline and bump the level, do you think it would make sense to then take a horizontal approach instead of a vertical approach to reducing the baseline size? e.g. Working by rows in your table, instead of by columns (which is what we're doing now).
It does feel like a lot of those rows would be easy wins but also with a high return. There are definitely some rows where reward vs effort vs disruption would definitely be difficult.
Comment #54
quietone commented@Kingdutch, thank you for the data, it is very helpful.
This seems reasonable and something to explore.
Comment #55
kingdutchYeah I think that would absolutely be the way to go. I do think there's a challenge that they may not be simple find/replace, so I'm not sure how large we can make batches. For example we may introduce a return type in the PHPDoc of an interface and then fix our classes to adhere to them (where we can introduce it in PHP). That in turn may cause PHPStan to say "Hey these type assertions you're making here are no longer needed" and we can remove those. So there's connection between the different rules.
The benefit is that the higher level of PHPStan can give us the confidence that if the types are correct at a local level (e.g. we make sure to accept
string|\Stringable) they'll also help us out on the project level. That should be a big help to reviewers for these kinds of issues even if they're not copy/paste.I saw the suggestion and like it in theory but I'm struggling with a viable way of making it work. One of the differences with PHPStan vs tools like CSpell/PHPCS is that we always want to run it on the entire codebase because changes in one file can result in behaviour changes in any other file. So with that in mind I think this route would still require us to commit and update the baseline in case of errors, but there'd be a discussion of which errors are then acceptable.
If we don't commit (or update) the baseline then it's going to be really difficult to say what part of the code caused the new errors and what errors were introduced in previous commits. I think that probably leads to more frustration.
The only way I can come up with to do this and not commit the baseline would be to generate the baseline from the base commit of the PR and then run it against HEAD of the PR, so only the change in errors would be reported over which reviewers could make a decision of whether it's acceptable or not. So basically run PHPStan twice in the CI. The downside of this is that it does make people reliant on the CI for their PHPStan work and does not allow to easily perform these checks locally (unless they also run it twice locally). A composer install between runs might be needed because updated dependencies can affect type information.
Comment #56
andypostAs baseline been converted to PHP, now it can be preloaded to negate size-effect of it on performance, ref #3426548: Convert the PHPStan baseline from NEON to PHP
Comment #57
cmlaraI would like to see us maintain a list of known issues created that could have been prevented by higher PHPStan levels so that we can correlate the costs of not increasing the baseline. Linking #3559132: FileSystem::deleteRecursive() leaves files/directories when realpath() returns FALSE as a real world example that if we had adopted L9 in 2024 would likely not of existed
Regarding #55: Is dependence on CI that large of a concern? I can't speak to others, however for Core I don't run full CI runs locally (I leave those to CI) instead I at most run single file, if it is a line I'm touching I will proactively fix it to L9 (this does occasionally cause Core devs to ask me why I am validating XYZ requiring me to point out its an L9 concern) otherwise I just ignore the error existing.
There is some talk of publishing lint caches to S3, perhaps that could work for this too (I believe the composer lock file would give us repeatable results that we don't have to worry about drift, however I've not validated) which would allow the baseline to stay outside of the repository yet still allow the baseline to be referenced.