Problem/Motivation

LLM use in Drupal is not slowing down; it's picking up in Drupal core too. Drupal.org requires disclosure of LLM use when contributing, but contributors can't do that if they're afraid of the response from the community. The result is more "hidden" use of LLMs and more time spent wondering whether a contribution is LLM-generated or not. Shaming people into changing their behavior doesn't work; what it does is make people hide their use, leave, or lash out, none of which is helpful.

To keep things manageable, I think having an AI policy oriented towards harm reduction would be a good thing. In some cases harm reduction methods can be counterproductive; it's up to us to monitor the effects and adjust accordingly. LLMs are not going to solve the issues they cause, but we can make the problem less bad to deal with. Critically, other humans should not be cleaning up unreviewed LLM output. As Dries puts it, never submit code you don't understand, and I would add: never submit text you didn't review.

I've written up more details in a blog post, with links to references and discussions about some of the problems raised in the previous AGENTS.md issue.

Examples

What would it change? From the recent post https://www.hojtsy.hu/blog/2026-apr-10/solving-small-drupal-issue-plenty... we have a sample LLM-generated comment on a real issue. This is not a core issue, but it's a good example of what the guidelines would change: #3570518-8: Core language list parsing is not Drupal 11 compatible anymore.

Original
I have a local setup of potx and I asked Claude this for demo purposes:

why would _potx_find_language_names not be able to parse LanguageManager.php in Drupal 11?

Answer from Claude is spot on:

The _potx_find_language_names() function fails to parse LanguageManager.php in Drupal 11 due to a PHP syntax change.

The Problem

Drupal 11's LanguageManager.php uses modern short array syntax:

public static function getStandardLanguageList() {
    return [
      'af' => ['Afrikaans', 'Afrikaans'],
      // ...
    ];
  }

But potx's parsing logic expects old array syntax:

public static function getStandardLanguageList() {
    return array(
      'af' => array('Afrikaans', 'Afrikaans'),
      // ...
    );
  }

Technical Issue

In _potx_find_language_names() at line 1662:

while ($_potx_tokens[$ti][0] != T_ARRAY) {

The function looks for T_ARRAY tokens, which are only generated by the old array() syntax. The modern short array syntax [] uses different tokens ([ and ] characters), so the parser never finds what it's looking for and can't extract the language names.

The Fix

The function needs to be updated to handle both syntaxes:

Old syntax: Look for T_ARRAY tokens
New syntax: Look for [ bracket characters

The parsing logic would need to:

Detect which array syntax is being used
Handle the different token sequences appropriately
Extract language names from both array(...) and [...] structures

This explains why potx works fine with older Drupal versions but fails with Drupal 11's modernized syntax in the core LanguageManager.php file.

Rewritten with new guidelines

_potx_find_language_names() at line 1662 loops until it finds a T_ARRAY token. T_ARRAY is only emitted for the array() call syntax. Drupal 11's LanguageManager.php uses the short [] syntax instead, which emits bracket tokens. The loop never finds T_ARRAY, so language names are not extracted.

The fix needs to handle both token sequences: T_ARRAY followed by ( for the long syntax, and [ for the short syntax.

Generated with the help of an LLM.

I think it's better. A nice side effect is that shorter means cheaper: it's fewer tokens to generate overall.
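
For anyone who wants to check the tokenizer claim locally, here is a minimal standalone sketch. This is an illustration for this summary, not potx code, and the dual-syntax branching is only an assumption about the rough shape a fix could take:

<?php

// Demo: token_get_all() reports array() as a T_ARRAY token, while the
// short syntax arrives as a bare '[' one-character string token.
$samples = [
  'long'  => '<?php return array("af" => array("Afrikaans"));',
  'short' => '<?php return ["af" => ["Afrikaans"]];',
];

foreach ($samples as $label => $code) {
  foreach (token_get_all($code) as $token) {
    // A fixed scanner would have to accept either form, roughly like this:
    if (is_array($token) && $token[0] === T_ARRAY) {
      echo "$label: found T_ARRAY\n";
      break;
    }
    if ($token === '[') {
      echo "$label: found '[' character token\n";
      break;
    }
  }
}

Running it prints "long: found T_ARRAY" and "short: found '[' character token", which is why a loop that only checks for T_ARRAY comes up empty on Drupal 11's LanguageManager.php.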

Proposed resolution

For a 3-month trial period: add an AGENTS.md file enforcing a few necessary rules aimed at reducing the negative impact of LLM use on human contributors.

This prompt has been tested with the following models to make sure they would apply the guidelines. Some are better at it than others, obviously.

Sonnet
Qwen3 8B
Qwen3.6 35B-a3b
Granite 4 h-small
Mistral Nemo 12B
GPT-OSS 20B
Gemma 4 31B 
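
To give a rough idea without opening the MR, here is an illustrative sketch of the kind of rules the file contains, assembled only from rules quoted elsewhere in this issue; the authoritative text is the one in the MR:

# AGENTS.md (illustrative sketch, not the proposed text)

## Writing
- Do not generate issue comments or summaries as LLM prose; rewrite the
  message in your own words.
- Keep comments short: state the point once, do not restate the issue.
- Disclose LLM use with "Generated with the help of an LLM." Do not add
  a model name, vendor, or tool name.

## Code
- Never submit code you don't understand.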

Remaining tasks

Agree/Commit.

Comments

nod_ created an issue. See original summary.

nod_’s picture

Issue summary: View changes
Status: Active » Needs review
nod_’s picture

Issue summary: View changes
smustgrave’s picture

nod_’s picture

Looks like it'll complement it; the other issue says "If a more specific `AGENTS.md` exists in a subdirectory you are working in, its guidance supplements these instructions.", and here the agent file is scoped to core/.

smustgrave’s picture

As someone who deals with users using AI almost daily at this point, +1 to making the problem smaller. If this is the path forward I'm for it, as I don't have any other solutions. It has almost become a game of whack-a-mole.

I have had the scenario where a user responded to every comment with AI, every piece of feedback with AI, but I know they didn't understand what the code was doing. I think this would have helped, maybe?

Another scenario I've hit that may not have been caught is a user using AI to scan the Drupal issue queue and start mass-posting MRs.

All this to say: yes to helping alleviate the pain, as it does get demoralizing.

webchick’s picture

LLM use in Drupal is not slowing down; it's picking up in Drupal core too. Drupal.org requires disclosure of LLM use when contributing, but contributors can't do that if they're afraid of the response from the community. The result is more "hidden" use of LLMs and more time spent wondering whether a contribution is LLM-generated or not. Shaming people into changing their behavior doesn't work; what it does is make people hide their use, leave, or lash out, none of which is helpful.

10,000% true. Thank you for saying it. 🙏

Post-DrupalCon, several of us have been working on a project called Drupal AI Best Practices which is attempting to attack this from a “filter on output” POV: meaning, help the coding agents write far less stupid Drupal code (or, more accurately, train them on all of our peculiar Drupalisms that go against the other millions of PHP + software projects they’ve been trained on ;)).

That project is still VERY early days, but the “proof of concept” is there:

1) Drupal best practice advice is covered in “skills” (example), which are Markdown files that can be consumed by either humans or robots.

2) To tell if adjustments to skills are making things better or worse, we also have an “evals” framework (example), analogous to automated tests for Drupal. Now when LLMs make *new* bone-headed mistakes writing Drupal code, we can capture those as evals and make sure we don’t regress even as guidance in markdown evolves over time.

3) Now that the basic skeleton is there, what we are hoping to do is take the comparative analysis we did (or, rather, 3 different AI models did :)) in #3585379: Perform analysis on similar repos, which looks deep into a litany of existing projects that have gone their own way, and pull the "best of the best" of these into a DrupalCMS-like starter kit for developers (and hopefully eventually site builders) to give them a strong foundation of basics.

Here’s our path to MVP release: #3585542: [meta] Roadmap to MVP Release

None of this is arguing against the introduction of AGENTS.md in core, just pointing out that this effort exists in case it makes sense to borrow from one another.

webchick’s picture

Out of curiosity… Why:

“Do not add a model name, vendor, or tool name.”

I would think this would be desired, in the interest of transparency, and so that we could see community adoption trends over time.

See Linux’s guidelines as an example: https://github.com/torvalds/linux/blob/master/Documentation/process/codi...

bramdriesen’s picture

I also would not mind seeing the model/tool name used. It can be a great insight into the quality of the answers generated as well.

nod_’s picture

In my tests, smaller local models just hallucinate their model name. This is intended to be automated; it's the minimum acceptable, so vague is better than straight-up wrong. I'm all for more precision, but that's not where we are with local models yet.

We should make sure local LLM options are not excluded. Not everyone can pay for so-called frontier models.

nod_’s picture

I also would not mind seeing the model/tool name used. It can be a great insight into the quality of the answers generated as well.

How would that be relevant? People should make sure the code and content they send is of adequate quality. We're not going to externalize quality checks of LLM output to other people. The AGENTS.md file is for limiting harm, not making a case for more LLM use.

If you want to benchmark how well a model does, the core queue is not the place for that kind of experiment. This issue is about harm reduction; benchmarking models in the core queue is pretty much harm maximization, and possibly against the Policy on the use of AI when contributing to Drupal. One example it gives of what does not meet the standard is "Posting issue comments, MR descriptions, or forum posts that are unreviewed AI output, not your own words", and that is exactly what you would have to post to actually evaluate model output.

gábor hojtsy’s picture

I think the LLM instructions included make sense. We should absolutely aim to reduce harm done by LLMs.

I don't have a strong opinion on whether to disclose LLM models. I can see the use for it, but I don't think it's worth holding this improvement up over that.

I like the example nod gave in the issue summary based on my post :)

The main thing I don't understand is the suggestion to primarily rewrite LLM prose. That sounds like it could be rewriting for the sake of rewriting. I can easily see that those who are extensively exposed to LLM output, conversing heavily with LLMs in their day jobs, will not necessarily improve the quality or clarity by rewriting the text.

The current d.o rules are very vague on this, BTW: they both say you should rewrite the text and that validating the text is enough: https://www.drupal.org/docs/develop/issues/issue-procedures-and-etiquett...

Posting issue comments, MR descriptions, or forum posts that are unreviewed AI output, not your own words.

Unreviewed AI output vs. your own words are two ends of a spectrum; reviewed AI output satisfies the first part of the sentence but not the second.

larowlan’s picture

Thanks for taking the time to create this issue and the research/post that went with it

The proposed text looks good to me - this should be a living document - we should get it in and evolve from there

nod_’s picture

The main thing I don't understand is the suggestion to primarily rewrite LLM prose. That sounds like it could be rewriting for the sake of rewriting. I can easily see that those who are extensively exposed to LLM output, conversing heavily with LLMs in their day jobs, will not necessarily improve the quality or clarity by rewriting the text.

If I spend time working on an issue and someone posts an LLM response, more often than not I feel it's disrespectful.

If you check the text, the primary suggestion is to not use LLMs for prose. The "rewrite in your own words" part is a reminder in case the first suggestion is ignored (it might also trigger a specific user skill if they trained the LLM on their writing style). Same for the writing rules: they're a safeguard in case the person ignores the second suggestion. The writing rules are not best practice on using LLMs to communicate; they're the equivalent of a grayscale filter on a smartphone. We don't know people's circumstances, so we try to account for various levels of LLM use while at the very least keeping the impact on others contained.

alex ua’s picture

Shorter issue descriptions are a win for every reviewer type. Human skimmers read less. Human deep reviewers find detail faster. LLMs with small context windows waste less capacity. LLMs with large context windows spend tokens on signal instead of noise.

The layered design in this proposal works well. Discourage first, constrain second, filter third. A natural extension: keep issue bodies short per these rules, put code-level detail in the MR description, and attach deeper technical context as a file when needed. Attachment contents do not appear in the email body. The reviewer knows more context exists but is not asked to read it until they decide it is necessary. The ai_best_practices project #3585542: [meta] Roadmap to MVP Release seems like the right place to develop a convention for structuring issue context.

I attached a short outline of that idea as a starting point.

Generated with the help of an LLM.

nod_’s picture

Title: LLM use in Drupal core contribution, AGENTS.md guidelines » LLM harm reduction in Drupal core contribution, AGENTS.md guidelines

Not judging people for their LLM use is a prerequisite to applying the harm reduction framework; that doesn't mean we can't say anything about the outputs.

@alex ua: I don't understand what you tried to convey or how it relates to this issue of LLM harm reduction in the core issue queue. I'm not quite sure you reviewed the comment you posted: we do not use emails to communicate in the issue queue, for example, and spending 1.5 paragraphs restating the point and trying to summarize the issue is not helpful; if the issue summary wasn't clear enough, update it instead.

To be honest I didn't expect this issue itself to be a testing ground for the AGENTS.md file, but here we are, refining the AGENTS.md file.

Feel free to propose this to the AI best practices initiative; the core queue is not the right place for it.

nod_’s picture

Issue summary: View changes

Made the file shorter. Used the existing AI posts in the core queue to find new behaviors to stop, hence the expanded "writing" section.
Updated the IS example with the new rules.

Confirmed that the shorter text is as good as the previous one across all the models tested initially.

alex ua’s picture

My point connects directly to your proposal: the optimization you're recommending seems great, but it raises the question of where other information that doesn't fit within the shorter comments/descriptions goes.

I'm also saying that there are three types of reviewers that this should be aimed at:

  1. Those not using any system: more information is available but not pushed at the reviewer. Scan and ignore if you want.
  2. Those using local LLMs who have a smaller context window (I've started working with Qwen and Gemma personally): progressive disclosure allows them to only load what fits.
  3. Those using frontier models: the savings from a progressive disclosure system are dramatic. A ~240-token vs. a ~2,000-token description is a huge difference that grows with every turn.

I don't know how many core reviewers will use a frontier model (seems like very few right now), but I'm suggesting that you structure this to at least allow for that possibility. I'm pointing to a risk: that if you shorten the comments too much, and there's no convention for where the additional context goes, you might remove too much signal. A convention for where the deeper context lives keeps the top layer succinct without info loss. That seems relevant to harm reduction.

This was mostly written by me, though an LLM helped with the token math, spelling, and grammar checks.

alex ua’s picture

...also, regarding the "email" comment, that was really just me using my own oddball method to make sure I keep up with issues (I have a label in my email applied to d.o. issues and I scan for replies). That oddity aside, I think it still stands when you are scanning the issue pages. The question is: in a scenario where you aren't adding an MR, where should additional context go? It seems like an attachment is the natural spot.

nod_’s picture

@alex ua: please open a follow-up in the AI best practices queue. The goal here is to make LLM use less negatively impactful on other contributors in the queue. Spending time talking about how to optimize LLM use is pretty much against the spirit of this issue. You're absolutely free to try and optimize things; just don't make other contributors in this queue part of it. Work it out outside core, where people are willing to spend time and help you with it. We want to reduce the time taken by LLM discussions in the core queue, not have more of it.

alex ua’s picture

@nod_: sure, I will do that. I am hearing that you don't want to hear from the LLM users in figuring out how to deal with LLM users, and I'll not make the obvious mistake of engaging in the conversation again.

Thanks for all you do.

Edit: the issue is now at #3587092: Skill: progressive context convention for issue descriptions and reviews.

nod_’s picture

I'm open to discussion; this is simply not the right place, as I said back in #17. Scope creep is still something that needs to be managed.

penyaskito’s picture

A nit about headers in HTML.


I have opinions on disclosing tools + models too, and even using Assisted-by. But I'm +1 to merging as-is as plumber's tape, and moving forward in follow-ups if needed.

alex ua's picture

@nod_: understood, and totally understandable. I can't blame the LLM for my own challenges with both attention and hyperactivity, and I'll try to do better to avoid the scope creep. Again, thanks for all you do!

nod_’s picture

Going for a trial period with metrics monitoring, to implement: "In some cases harm reduction methods can be counterproductive; it's up to us to monitor the effects and adjust accordingly."

Proposing we go with a 3-month experiment. We add the file to core and see what happens. If it's better, great. If it's not, we analyse what happened and discuss how to move forward.

I have a whole data pipeline set up to monitor the core queue; I'll make a public dashboard that tracks some metrics (in aggregate, never individual users/comments). I'll propose some metrics to look at.

smustgrave’s picture

Status: Needs review » Reviewed & tested by the community

From dealing with users who push AI solutions and, worse, AI summaries, it would be good to finally get something in place, so marking this.