The original issue has been trolled to death and has been refiled at #3580761: Ban LLM code contributions from Drupal core with a warning against further trolling.

This issue is now only for banning slop text.


Comments

ghost of drupal past created an issue. See original summary.

ghost of drupal past’s picture

Issue summary: View changes
ghost of drupal past’s picture

Issue summary: View changes
quietone’s picture

Let's reference the governance issue.

andypost’s picture

Such an amount of hate in the wording is not constructive.

ressa’s picture

I agree with @andypost; the rhetoric is not helping your argument. If you want people to listen, a first sentence like this would be more productive:

When the fundamental problems with LLMs were limited to its being a technology created by and for the tech oligarchs to plunder intellectual assets (art, code, music, etc.), environmental destruction, exploitation of the poor, and so on, it was easier to wave away those concerns as happening to someone else.

Here's the article version of Jeff Geerling's great video: AI is destroying Open Source, and it's not even good yet.

ghost of drupal past’s picture

OK. Now that you have aired your grievances with my choice of arguments, which are pure facts known and ignored for years, can we get back to banning LLM contributions, made urgent by the emergence of OpenClaw? The way I see it, either you mark Drupal as slop-free or it will be swept under the tide of slop once people realize the quality of such software.

ghost of drupal past’s picture

Issue summary: View changes
ghost of drupal past’s picture

Title: Ban LLM contributions » Ban LLM code contributions
Issue summary: View changes

Narrowing to code. Let the other issue deal with the rest.

ghost of drupal past’s picture

Issue summary: View changes

bircher made their first commit to this issue’s fork.

ghost of drupal past’s picture

Issue summary: View changes
Status: Active » Needs review
phenaproxima’s picture

Although I feel that the siren song of speed and convenience (even at the expense of things like "the environment" and "general human welfare") will overpower all the evidence that it's a Faustian exchange -- a pattern which is neither new, nor unique to the mediocre slop AI era -- let's just say I would shed exactly zero tears if Drupal core banned LLM contributions except in (maybe) specific, exceptional circumstances.

Might even have a little extra spring in my step for a couple of days.

I also don't think "more AI" is the key to improving Drupal's velocity or general relevance. Better tooling, a sustainable funding model, and a more solid product (sayeth the Drupal CMS technical architect) are.

dww’s picture

+1,000,000 to this. I posted my thoughts at #3570498-34: AI tools for contributors and maintainers since I couldn't let the recent comments in there go unaddressed.

cainaru’s picture

I’m not opposed to this idea for the Drupal core project itself for the foreseeable future, to be honest. My concerns regarding accepting LLM-generated contributions to the Drupal core project are shaped by what I’m seeing in the current broader ecosystem (culture?) around “AI”, specifically LLMs (e.g., the reckless contribution behavior, the coercive cultural pressure, the concentrated power, the unsustainable economics, the environmental cost, the training ethics), and how that could end up as a detriment to Drupal core’s reputation for quality and trust.

I also want to note that, had this issue been made a couple months ago, I would have felt differently; things that are happening in real-time have changed my thinking.

Irresponsible contribution behavior is already here and overwhelming maintainers. Right now, we’re seeing other Open Source projects dealing with some people (and their bots) acting irresponsibly by producing avalanches of LLM-generated code at scale with little regard for context and quality (especially since things such as OpenClaw have come into existence), sometimes without even reading the relevant issue at all. The culture around it seems very Wild West, without much conscientiousness or etiquette, and that’s not something I would want to see our hardworking core maintainers dealing with at the moment.

Just because we think we can “forecast” the future, it doesn’t mean things will actually pan out. Despite executives, analysts, venture capitalists, and oligarchs “forecasting” that LLMs will definitively be the future of development (among other stuff) for everyone in existence in the coming years, let’s cut through the fluff, puffery, and superficial 5-year growth cycle graphs and be honest: none of us have any idea what the future actually holds. No one does.

For all we know, there could end up being a Hindenburg, Challenger, or Chornobyl-level event regarding LLMs that results in a massive backlash of distrust towards systems using LLM-generated contributions and leads to people gravitating towards other systems that didn’t accept LLM-generated contributions.

Backlash, pushback, and disengagement are also already materializing in various forms, for various reasons — even for reasons nobody foresaw; a common denominator seems to come down to trust. I mean, hell, we’re already seeing — on a smaller scale — movements of backlash and distrust such as QuitGPT (granted, IIUC, that started because a ton of people were upset about ChatGPT-4o getting decommissioned, but now more people are joining the QuitGPT movement in response to OpenAI’s dodginess regarding mass surveillance of Americans and autonomous weapons). Apparently 1.5 million users have already left or cancelled their ChatGPT subscriptions, and ChatGPT uninstalls surged by 295% in a single day after the OpenAI-Pentagon deal as a result. Who would have forecast that?

Not to mention, some people are reconsidering their own over-reliance on LLMs as well and engaging with them less than they might have initially expected; see what @nod_ had mentioned in #3568936-31: Embrace the chaos, add a couple of AGENTS.md file to core.

Also consider that, more generally, there is right now a large segment of people who are already tuning out and avoiding systems that are plastering “AI” everywhere. Why? For some of these folks, it is due to economic, ethical, social, and political concerns. For others, it is due to overall fatigue from nearly every tech/tech-adjacent corporation and start-up deciding practically in unison to shove “AI” into every nook and cranny, just to look “on trend” and/or appease venture capitalists, even when it isn’t working well or is unnecessary. If you’re primarily surrounded by executives, venture capitalists, oligarchs, and classes who try to appease them, then you might not be aware of this large segment of people (or you might think it is smaller than it actually is).

So, a ban or moratorium on LLM-generated contributions to the Drupal core project itself could be something that gets revisited periodically as things change, too. This can be a prophylactic to maintain and protect the quality and trust that Drupal has built over the past couple of decades — which IMO is extremely important at a time when the current broader ecosystem (culture?) around LLMs poses a risk to both. Perhaps contrib projects could still permit it, depending on preferences of maintainers of individual contrib projects.

Also +1000000 to @phenaproxima, @dww, and @ghost of drupal past. There are a lot of folks who feel similarly but are either afraid to speak up about these things due to fears of professional repercussions or that speaking up will be a waste.

Edited to add: key excerpt pulled from Anil Dash’s “The Majority AI View” (a “must read” IMO):

Perhaps the biggest cost of ignoring the voices of the reasonable majority of those in tech is how it has grossly limited the universe of possibilities for the future. If we were to simply listen to the smart voices of those who aren't lost in the hype cycle, we might see that it is not inevitable that AI systems use content without the consent of creators, and it is not impossible to build AI systems that respect commitments to environmental sustainability. We can build AI that isn't centralized under the control of a handful of giant companies. Or any other definition of "good AI" that people might aspire to. But instead, we end up with the worst, most anti-social approaches because the platforms that have introduced "AI" to the public imagination are run by authoritarian extremists with deeply destructive agendas.

And their extremism has had a profound chilling effect within the technology industry. One of the reasons we don't hear about this most popular, moderate view on AI within the tech industry is because people are afraid to say it. Mid-level managers and individual workers who know this is the common-sense view on AI are concerned that simply saying that they think AI is a normal technology like any other, and should be subject to the same critiques and controls, and be viewed with the same skepticism and care, fear for their careers. People worry that not being seen as mindless, uncritical AI cheerleaders will be a career-limiting move in the current environment of enforced conformity within tech, especially as tech leaders are collaborating with the current regime to punish free speech, fire anyone who dissents, and embolden the wealthy tycoons at the top to make ever-more-extreme statements, often at the direct expense of some of their own workers.

cainaru’s picture

I think @nod_’s comment #31 on https://www.drupal.org/project/drupal/issues/3568936#comment-16445806 is also worth reading and taking into consideration.

Posted this comment to the wrong issue! Pretend this comment doesn’t exist please. 🤪

aporie’s picture

I'm also gonna allow myself to give my 2 cents on this, even though I'm not a big core contributor. What I do have is the freedom of speech, because whatever repercussions I could have had from AI have already happened.

Without endorsing the choice of wording in the ticket description, I think banning LLM contributions at different levels is worth discussing. I'm pretty sure there will be a niche in the future for software not built with LLM-assisted tools, not because those tools are not useful but because of privacy concerns.

To try to keep my post as neutral as possible, I'll take the example of a private company using Drupal. Currently, this company, if dealing with sensitive data, can totally own the entire chain of ownership: from code generation and code storage to data storage and app hosting. Once one single dev in that organization uses an IDE-assisted agent, the entire chain collapses. Everything is sent to a big tech company, which will surely use that data to train its next LLMs (at least in the current state of the domain, where there is no regulation). Hence, the issue is not the tool itself, it's the privacy around it.

Open source has been built around the idea that people should have the right to own their data (and it's really about freedom); when prompting an LLM, you are giving away that data (and your rights).

IMO, banning LLMs could be an option, but the worst one. Creating our own local LLM which can assist developers in producing code for Drupal, locally, in a secure way, is the way to go to make Drupal a pioneer in secure web development and keep its place as trusted software among the organisations for which privacy matters.

Our community is big. I'm sure it's something we can pull off using the open-source LLMs produced by the different companies out there.

Rather than cutting ourselves from impressive tools, we should appropriate them for ourselves.

ghost of drupal past’s picture

Creating our own local LLM

Unless you have hundreds of millions or even billions of US dollars, that is not happening. And Drupal does not have the amount of data for an LLM anyway. What you mean here is that you would take a pre-trained base model and continue training it on Drupal data, which they already ingested since it's open source (and it's utterly unclear whether the results they spit out, trained on this data, violate the GPL or not). Running an LLM locally solves a few problems; however, it does not address the problem at hand of LLM quality, i.e. slop, nor does it solve the much larger ethical concerns around the creation and proliferation of these models. LLMs are incredibly addictive; they operate on the same psychological pathways as gambling. So once people start with "oh, I will just use a local one", that will quickly devolve into "but the online one is faster/better", etc.

cainaru’s picture

LLMs are incredibly addictive, they operate on the same psychological pathways as gambling

FWIW, I would assume this is especially applicable to the commercial cloud models since the corporations that create them are financially incentivized to make them more addictive in order to maximize dependency and engagement to lock people in. IIRC various people in the AI ethics space have been trying to flag this risk for a while now (Timnit Gebru, Tim El-Sheikh, and others I’m blanking out on at the moment).

The coercive cultural pressure of the current moment probably exacerbates the addictiveness risk too, on top of all that. 😫

aporie’s picture

@ghost of drupal past
I understand your concerns and I invite you to join the #ai-and-ethics channel on Slack to have more informal discussions about it. IMO, you cannot forbid new technologies because they may change how society behaves (or because of your fears of what society could become). I see what you mean, because I had similar thoughts about social media when it appeared. I personally didn't see any interest in exposing my life to others and didn't understand the purpose of it. Now, it has become what it has become: something very addictive (apparently) which diminishes people's attention span.

At the end of the day, it was always a personal choice to use them or not.

Where societies failed, IMO, is that we never taught our kids how to use these things. We left that to parents who were themselves hooked on them, without probably measuring their effects.

I personally don't know if LLMs are addictive yet. I'm trying to stay objective and assess whether they are making my job easier or not. Currently, I have a mixed opinion, because they help as much as they make my life harder (for both personal and professional uses).

Of course, I wasn't talking about creating a neural network from scratch, but about training (and not fine-tuning) a pre-existing model on Drupal data. Yes, commercial ones will still be better, but we solve the privacy concern: the one open source has fixed before and should still fix. Why stop now, just because there is a new kid on the block?

The coercive cultural pressure of the current moment probably exacerbates the addictiveness risk too, on top of all that.

Totally agree. But we've seen that already, again and again: with social media, with smartphones, with the robot's attempt at a VR digital world :) It's all a bubble of extra excitement built up on top of false promises. In the end, everything falls back into place and we take the best out of it.

I really do not think this type of discussion should be held in the Drupal issue queue, but rather on Slack, where it's less formal. Also because, apart from the wording, I think discussing LLM contributions to Drupal makes sense (this ticket), and right now, none of our discussions are actually adding value to a potential decision on how we should treat AI-generated code for the Drupal project.

cainaru’s picture

Eh, I respectfully disagree that these discussions here haven’t added any value to a potential decision regarding a ban/moratorium/whatever of LLM contributions to Drupal core.

There’s value here in having a forum where folks can speak up about whether or not they support such a potential decision, what concerns they might have, and a rationale for why they might feel the way they do (including any ideas or suggestions, of course). IMO that helps in gauging whether such a decision could or should move forward.

I think that is especially important given the current climate of coercive conformity across the broader technology industry, with its “you must do XYZ with AI… adapt or die”/“something something or be left behind” rhetoric; like I said earlier, a lot of people who feel differently from the rhetoric that dominates the industry right now are staying silent because they are afraid of professional repercussions if they dare air any valid concerns or critiques of the usage of LLMs (that Anil Dash post explains it better than I’m doing right now 🙃). I personally want those folks to know it is okay to speak up and use their voice. Otherwise, we’re not getting an accurate pulse check on how the community is actually feeling.

Personally, I do think any decision should ultimately come down to the core maintainers of Drupal core. Why? Well, they’re the ones on the front lines who are responsible for reviewing and merging MRs. And discussions here might help inform them of how folks in the community are feeling about a potential decision like this.

With all that being said, thank you for the invite to join the #ai-and-ethics channel (glad we’ve finally got one of those!). That sounds like it could be a great space for more extensive discussions!

cainaru’s picture

One more thing and then I’ll shut up for now 😅:

Personally, given what we’re seeing up until now (within the Drupalverse itself, other Open Source projects, and the broader tech/tech-adjacent ecosystem) and the fast-moving yet very unpredictable nature of this current “AI”/LLM moment in real-time, I’d suggest considering a “happy-medium” approach:

For Drupal core

  • Moratorium on accepting LLM-generated code contributions for the foreseeable future. This could (and IMO should) be revisited from time to time (unsure about frequency) by decision-makers (core maintainers, maybe the security team too, and whatever other relevant groups: pretty much anyone who is on the front lines of merging code into core, ensuring things are secure, and making sure things are legally on the up-and-up regarding GPL and licensing for the core project itself) as things evolve, while continuing to monitor how other Open Source projects are handling things (since this is something affecting pretty much all Open Source projects, not just Drupal).

For contrib projects in the Drupalverse

  • Let it be up to the maintainers of individual contrib projects whether or not to accept LLM-generated contributions. This gives contrib maintainers agency over the projects they are responsible for, and I also think it would be much, much harder to enforce for contrib projects at this point anyway, since I’m assuming some contrib projects have already accepted LLM-generated contributions. It would also allow Drupal AI initiative folks to keep doing their thing, since there is still a good amount of interest and demand for that currently.

These are just some ideas I’m tossing at the wall as food-for-thought and a starting point.

ghost of drupal past’s picture

I thought LLMs being addictive was a foregone conclusion, so I didn't source it. Apologies. Here's an excellent summary: https://yasmin-fy.github.io/ai-heart-project/articles/ai-addiction/

aporie’s picture

Personally, I do think any decision should ultimately come down to the core maintainers of Drupal core. Why? Well, they’re the ones on the front lines who are responsible for reviewing and merging MRs. And discussions here might help inform them of how folks in the community are feeling about a potential decision like this.

I guess that's what we are doing... Just sourcing the community feelings for Drupal core team to take decisions.

Eh, I respectfully disagree that these discussions here haven’t added any value to a potential decision regarding a ban/moratorium/whatever of LLM contributions to Drupal core.

IDK, it feels to me that it's better to come with solutions than problems, and here we're just discussing problems. I still think a more informal place is the right venue for that: to come up with solutions rather than problems. Because even when we come up with solutions (opening tickets on d.org) for the Drupal core team to decide on, they'll need proper explanations of the problem to act on it.

I think we should filter out the slop of our brain farts for them, so they can make properly documented decisions...

@ghost of drupal past
It is a bit of a foregone conclusion; no need to convince me. LLMs are addictive because they replace searches and are pretty much awesome from the technology standpoint. They are addictive because they are the future of search... Google Search is dead, long live Google prompts!

ghost of drupal past’s picture

LLMs are not the future of anything; this charade will end when VCs run out of money to set on fire. That's also a foregone conclusion. The only question is: how much destruction will happen until then?

aporie’s picture

I think the question is more about, who cares?

ghost of drupal past’s picture

I do. Now I will speak plainly and personally, because I was addressed in "afraid to speak up about these things due to fears of professional repercussions". This does not need a reply and perhaps shouldn't be here, but I think it might help in understanding my stance. When the original Nazis hauled my grandmothers into concentration camps, no one spoke up. (They survived.) When Russia invaded Hungary in 1956, no one spoke up. But when Russia invaded Ukraine in 2022, I said to myself "well, now I am part of the West which let those things happen, and this time it'll be different", and I donated to Ukraine as much as I could. And now that these fascist billionaires have created these algorithms, these tools to further the rise of fascism in the United States and elsewhere, how could I not speak up? How often do you get the chance to resist fascism without risking bodily harm? If this makes me unemployable in tech, then so be it. That's nothing compared to what any usual form of resistance entails.

Edit: sources in this comment, another writeup, yet another blog post.

dww’s picture

@aporie:

it's better to come with solutions than problems

https://git.drupalcode.org/project/drupal/-/merge_requests/14932/diffs is a solution to a big problem. You apparently don't agree or acknowledge the problem, but that doesn't mean you can ignore the proposed solution.

I think the question is more about, who cares?

Apparently not you. Sadly, not nearly enough of the tech world. But I think a lot more people care than the sloptimists, VCs and tech oligarchs would have us believe. Guess you didn't read my comment at #3570498-34: AI tools for contributors and maintainers (linked above). I highly recommend you do so. If you won't read that, I request you at least read the linked blog:

https://www.garfieldtech.com/blog/selfish-ai

Maybe then you'll care. If not, I have neither sympathy nor respect for you and your point of view.

aporie’s picture

Your account is a fake, mine is not.
Your questions are irrelevant to the mighty god of earth :))))
When you're ready to talk properly, I'll be waiting for you in #ai-and-ethics on Slack.

aporie’s picture

Ok, my previous message was for @ghost of drupal past.

But guys, will you be honest about your positions right now? Are you just devs who got fired, preaching for your own house?

I mean, if you're seeking revenge for being screwed over by progress at the hands of people who don't understand it, you have to mention it!

I've been screwed over! I try to stay objective. Now don't bring me your slop brain farts trying to convince me it was better before!

volkswagenchick’s picture

This discussion appears to include escalating emotions, creating the opportunity for miscommunication. The invested parties are encouraged to take a break from this discussion to help gain perspective. It is important to the community that all members are shown the appropriate amount of respect and openness when working together. Additionally, there are resources offered by the Drupal community to aid conflict resolution should those be needed.

For more information, please refer to Drupal’s Values and Principles of seeking first to understand, then to be understood. We ask that you please suspend judgment until you have invested time to understand decisions, ask questions, and listen. Before expressing a disagreement, make a serious attempt to understand the reasons behind the decision.

This comment is provided as a service (currently being tested) of the Drupal Community Health Team as part of a project to encourage all participants to engage in positive discourse. For more information, please visit https://www.drupal.org/project/drupal_cwg/issues/3129687

dww’s picture

I had a very angry response I was going to post, but thanks to @volkswagenchick's comment, I've shelved it for now. I'll summarize with: No, I haven't been laid off. I'm still working. And the irony of @aporie accusing me of hating on LLMs for my own selfish reasons is quite baffling.

Meanwhile, here's yet more evidence of LLM coding gone wrong. This time, bringing down various aspects of amazon.com itself:

https://fortune.com/2026/03/11/elon-musk-amazon-outage-ai-relate-inciden...

The e-commerce giant held a mandatory meeting on Tuesday for a “deep dive” into multiple outages, including some as a result of the use of AI coding features, the Financial Times reported, citing internal briefs and emails. According to the outlet, Amazon said there was a “trend of incidents” in the past few months with a “high blast radius” and relating to “Gen-AI assisted changes,” as well as other variables.

Earlier this month, Amazon’s website and shopping app were down for some users, with more than 22,000 users reporting an issue, according to outage tracker Downdetector. Customers were unable to check out, view prices for goods, or access their account information. At the time, Amazon said the outage was a result of “a software code deployment.”
...
“Amazon is holding a mandatory meeting about AI breaking its systems,” Olejnik wrote.
...
Dave Treadwell, Amazon’s senior vice president of e-commerce services, reportedly wrote in an email that the team’s weekly “This Week in Stores Tech” (TWiST) meeting would in part be used to implement additional guardrails on how AI is used by engineers, including requiring more senior engineers to sign off on AI-assisted changes made by junior and mid-level engineers.

You know, because people who actually understand big complex systems are irrelevant now that LLMs can write code. "Anyone can do your fracking job!" Sure. Good luck with that.

phenaproxima’s picture

I would simply say that any software engineer who thinks AI can do their job...probably shouldn't be a software engineer.

AI can be a very useful tool, and I think I would be lying if I said it wasn't. But it cannot replace a software engineer any more than an impact driver could replace a carpenter. If C-suite folks think otherwise, they are fools. (But that's a whole other rant.)

I know how to use an impact driver, but that doesn't make me a carpenter. "Anyone can do your job" is an arrogant, ignorant, and rude statement in pretty much all cases.

scott falconer’s picture

Policing input to ensure quality of the output feels like an anti-pattern. That’s like aiming for a high standard of writing by banning keyboards.

I agree that one of the value propositions of open source and Drupal is that it can be trusted. We can still set that standard even as tools change. I’d recommend shifting this conversation to how we accomplish that.

dww’s picture

At the risk of repeating myself, if the only reason to hate LLMs and ban their use for contributions to Drupal Core was the poor quality they produce, I'd agree that we could attempt to solve that problem by other means. But that's not the only (or even primary) reason to hate them. Since it's hard to get anyone to read and comprehend anything longer than a paragraph these days, I'll summarize:

  1. Environmental destruction:
    • Unchecked burning of fossil fuels and a refresh of the nuclear power industry to satisfy exploding demand for electricity. All pledges of "carbon neutrality" are totally out the window now.
    • Vast quantities of water being used to cool all those data centers, instead of growing food or providing clean drinking water
  2. The training of LLMs is based on vast quantities of very underpaid "digital sweatshop" labor, mostly in the "global south". Yes, humans do this work, it is not fully automated.
  3. Impending economic collapse.
  4. Supporting a technology built by and for fascists. Mass surveillance. "Autonomous weapons". Etc.
  5. Total disregard for copyright, license, etc. The LLM scrapers have stolen all of human creativity and then glue it back together as if creating "original works".
  6. ...
ghost of drupal past’s picture

was the poor quality they produce, I'd agree that we could attempt to solve that problem by other means

Even this is extremely dubious. I have reviewed a few patches here and there for Drupal core over the last 22 years, and professionally too, and I am extremely concerned about this. Aza Raskin called LLMs a zero-day on human cognition, and I couldn't express my concern better. In other words, these systems produce plausible code, and our very brains tend to accept plausible as correct. It is quite literally a bigger mental effort to combat this; Thinking, Fast and Slow by Kahneman has heaps on this. So yes, we could attempt it, but would we succeed? Maybe in the beginning, but again, at what human cost? And then, LLMs can produce much more code than our limited human resources can review. (And no, using another LLM to do the review is not a solution; that ought to be obvious.)

scott falconer’s picture

@dww I did read every comment you posted above and understand your concerns, but the specific proposed resolution in this issue is to "Declare the project to be free of slop as a mark of quality and be loud about this.", which is the point I responded to.

One of the problems with AI related threads is they tend to spiral out to many different topics, which makes it hard to move forward with actionable steps. Having a conversation about how to ensure quality in Drupal in the world of LLMs is a very real conversation we should have, and I want to make sure it doesn't get lost. This doesn't mean all of the other concerns aren't also important, but to keep the conversation focused can we branch those into their own issues/threads?

ghost of drupal past’s picture

but to keep the conversation focused can we branch those into their own issues/threads?

You want me to open a thread for banning LLMs because they are fascist, another to ban LLMs because they were created by exploiting people, another to ban LLMs because...? I mean, I am game, but isn't that slightly ridiculous?

dww’s picture

We’ve got 1 “solution” or “actionable step” proposed here. We’re providing every possible justification for that actionable step. It’s not an issue scoping problem that LLMs are a disaster for lots of reasons. No, we can’t untangle all that and approach each disaster separately. That’s neither a good way to convince the core committers to take this particular step, nor an effective way to make sense of the bigger problem.

Trying to just look at the code quality aspect, ignoring the environment or fascism or mental health or theft or exploitation pieces is another example of what’s wrong with how people are thinking about “AI” in general. As if the only problem is one’s own personal preferences, how effective or not it’s making any individual “coder”, etc.

dww’s picture

P.s. re #40 completely agree. I could have made “could attempt” more strongly worded (or at least thrown in some em or strong tags). But that wasn’t the point I was trying to make in #39. 😅 I 💯 would add #40 to the (growing) list of reasons to do this. And the addiction/gambling piece from #16. And…

scott falconer’s picture

@ghost of drupal past I want us to be able to have a productive conversation, and the scope of the issue is too large for it all to be in one thread, so yes, some should be issues, some may be blog posts, some are wider policy discussions, etc. They are all important conversations.

My concern is that calling for a blanket ban without a sincere attempt to solve the fundamental problems we see doesn't move us forward. We as a group have the ability to help guide the conversation and build the world we want. Can we change everything that's in motion? No, but we will have an impact and it's fair to discuss what that impact might be. As-is though this conversation feels like a focus on what we can't do rather than what we can.

As to what we could file, an issue on something like "I want to run Drupal free of LLM generated code" feels like it could be a productive and interesting conversation, i.e. what mechanisms for provenance are available? How can we guide someone to the most recent version of Drupal that we can ensure does not have LLM generated code in it? Does it extend to the whole stack? Are there similar conversations happening in Symfony, etc.? We might even get something out of it that's like the GPL for the world we see coming, i.e. something that can ensure human provenance all the way down.

dww’s picture

To be extra clear, the “1 actionable step” I’m referring to in #43 is merging this and cherry-pick’ing to all supported branches of core:

https://git.drupalcode.org/project/drupal/-/merge_requests/14932/diffs

dww’s picture

X-post. Re #45 — will #46 solve all the problems with LLMs? Absolutely not. Will it even prevent people from continuing to try to contribute to core with LLM-generated MRs? Nope. But at least it’s an actionable step, a start, and something we can point to moving forward.

scott falconer’s picture

I'm a -1 on merging that MR, outside of my personal feelings on the matter, the MR doesn't resolve the problem statement in the issue summary. For this issue, we still need a resolution that directly addresses the problem of Drupal being "free of slop as a mark of quality". I feel the current CI pipeline and code review process already solves for this, though the impact of LLMs on maintainer overhead is a critical discussion we should continue to have.

scott falconer’s picture

And just to give some perspective on why I'm advocating against this change, i.e. "my personal feelings on the matter,". I have a child with a cognitive disability. I am hopeful that tools like LLMs will allow them to shape software based on their specific needs as that opens up a world of possibilities for them that would otherwise be unavailable.

ghost of drupal past’s picture

My concern is that calling for a blanket ban without a sincere attempt to solve the fundamental problems we see doesn't move us forward

We cannot solve the fundamental problems. That would require trillions of dollars, if not tens of trillions. We can only try to protect this project from those harms.

As for #49, https://mastodon.me.uk/@pikesley/116096446618575261

My "No, your use-case does not negate all of the horrifying externalities" t-shirt has people asking a lot of questions already answered by the shirt

aporie’s picture

I'm sorry I wasn't available the past day to continue this discussion. I'm very glad @scott falconer joined it, to recenter it to what it should be about (the ticket description).

@dww, I'm sorry you understood my message as a personal attack, most of my previous messages were attempts to:
1 - Show @ghost of drupal past that Slack was a better place to express feelings like that (judgmental, subjective and out of scope of the ticket). Hence my 3 posts in a row, whereas I could have just edited my post.
2 - Be voluntarily provocative to clear out some possible underlying fears behind "hating LLMs", but also to point at a reality, hence my questions on our respective situations. I've been on the market for some time now, which you may have no experience of if you're still working for a client or an employer, and I can totally tell you that the statement "We suck as developers! We totally do. Forget your diploma, forget everything you stood for." is how the market sees us today. That is of course not my personal opinion; I was playing devil's advocate (and in 2 clicks to my personal website you may have noticed it).
3 - I repeat and +1 @scott falconer: topics about AI are sensitive because they impact us, our workplaces, and our way of working, deeply. Hence, "arguing" about it is part of how we can all work together to try to find good solutions for the Drupal project. But tickets are not where we should do our dirty laundry IMO. Maybe Slack isn't either... but to me it's more informal and discussions get buried under others. Here we were writing lines and lines of "human slope" (pun intended), unreadable for anyone who would like to join the discussion of the ticket's purpose (except by using an AI to create a summary, second pun intended).

To get back to the main purpose of this ticket and trying to stay constructive, I do not think the Drupal project should ban the use of LLMs for the following reasons:

- We have no way of identifying an AI contribution as opposed to a human one for small contributions: bug fixes, one feature addition, code modernization, etc.
- If some clients of Drupal find the idea of banning LLMs from their products a good solution (for plenty of reasons), they can still rely on the trustworthiness of Drupal code (core and contrib, AI generated or not), because we contributors vet this code. That is what we have to work toward.
- We shouldn't forbid newcomers from joining the Drupal project. And if someone publishes an entire module which has been vibe coded, we should find a way (on a voluntary basis) to display that to site builders, probably on the module page. The vibe coders of today are maybe the developers of tomorrow, meaning vibe coding is an entrance door which may lead to learning more deeply how the code works, in terms of security or debugging.
- In the end, site builders are always responsible for the code they use, be it Drupal dependencies or third parties. So yes, in reality we don't have time to check every dependency ourselves and we rely on trust from the open source community, and IMO, like @scott falconer said, that's the only thing we should work toward: how we keep trust in the Drupal project even with AI contributions.

We should not forget that our interest is to make Drupal great for clients to adopt. We do not get to choose, if we dislike something, to ban it because of whichever reasons we may find relevant to our personal opinions. Drupal is a product and we should work for it to stay a product people want to use.

catch’s picture

I feel the current CI pipeline and code review process already solves for this, though the impact of LLMs on maintainer overhead is a critical discussion we should continue to have.

Let's have some of that discussion here then. Some examples:

1.

A 5,000 line vibe coded MR against Canvas that re-implemented the image styles system as if it didn't exist #3515646: Add automated <img srcset> generation / https://git.drupalcode.org/project/experience_builder/-/merge_requests/822 posted by a director at Acquia.

I count five core committers responding to that issue, not in their capacity as core committers but these are the same individual people with finite time.

I was the first person to reply to the issue, realised it was vibe coded and refused to review it beyond that point. If I refuse to review thousands of lines of slop is that 'Policing input to ensure quality of the output' or fair enough?

The work was eventually restarted from scratch.

2.
A 4,500 line vibe coded MR against Canvas that re-implemented some of the functionality of Views #3515399: [MR generated with AI] Dynamic List Component https://git.drupalcode.org/project/experience_builder/-/merge_requests/8... also posted by a director at Acquia.

In this case, @longwave, another core committer, actually tried to review the MR. The initial review extended to 121 comments #3515399-8: [MR generated with AI] Dynamic List Component.

Perhaps this is an example of "the code review process already solves for this", certainly not 'policing at source', but how is it a good use of @longwave's time to review a 4,500-line MR if the 'author' (or 'co-author'?) of the MR did not spend the effort to find a single one of those 121 issues?

That issue is currently 'needs more info' with no recent progress.

3.
A very long, very plausible, very detailed bug report about the interaction between Twig rendering via output buffering and Fibers. #3574746: PHP output buffers leak between Fibers in Renderer::replacePlaceholders(), causing swapped block HTML output

This looked like a serious, obscure, and hard-to-find bug, so a core committer posted it in the committer Slack channel to give people a heads up.

I had a quick look at it, and recognised it as a duplicate of a bug I fixed six months ago already: #3546376: Use the 'yield' option instead of output buffering for twig rendering to support async rendering.

Duplicate bug reports are not that unusual, but what is strange is that #3546376: Use the 'yield' option instead of output buffering for twig rendering to support async rendering was fixed before any code was committed to core that would have triggered the bug, i.e. it was a pre-emptive fix because we caught the bug in other in-progress issues before it got anywhere near a tagged release. There is a very tiny chance that someone managed to write some custom/contrib code that would expose the same problem, but this is extremely unlikely because even the code that was in core by this point had zero integrations in core or known ones anywhere else. Also, sometimes people experience real bugs but misdiagnose the cause.

However given this disclaimer at the end of the issue summary:

This issue was analyzed and drafted with the assistance of AI. The root cause investigation, debug tracing, fix implementation, and issue write-up were conducted collaboratively between a human developer and AI.

it seems more likely that someone pointed an LLM at core and asked it to generate a bug report. I asked for clarification a month ago whether the user had ever actually reproduced an actual live bug and there has been no response.

Even though I was familiar with the area of core under discussion etc. I still had to read and understand the very long bug report to realise it was reporting the thing I'd already fixed.

So that's another issue where two core committers got involved, in what as far as I can tell is a completely phantom report.

If we're not going to ban LLM contributions, there will be more and more examples like this. Drupal has not yet seen the plague of high volume vibe coded 'contributions' that are affecting projects on github, probably because there is some friction in registering on Drupal.org etc., but even the three examples here wasted hours of people's time that could have been spent doing something else. And there are active issues elsewhere in the queue to make this easier for people, not harder.

In the meantime despite 224 commits to core in the past month, there are still 95 issues in the core RTBC queue, so we are not short of actually viable code to review.

scott falconer’s picture

@catch thank you for pulling those, they’re great examples.

My read of them is that the review process prevented low quality code from getting in, but at the cost of undue burden on others. That’s absolutely something we should get ahead of.

ghost of drupal past’s picture

My read of them is that the review process prevented low quality code from getting in, but at the cost of undue burden on others. That’s absolutely something we should get ahead of.

But you refused to believe this when we already, repeatedly, said so, and it took wasting the time of a core committer on a small essay of more than 700 words.

We do not get to choose, if we dislike something, to ban it because of whichever reasons we may find relevant to our personal opinions. Drupal is a product and we should work for it to stay a product people want to use.

Not personal opinions. What you call opinion are facts. I tried to source every claim; if I missed any, let me know, I have them. You try to assert that somehow allowing LLM contributions would make for a better product. The exact opposite seems to be true so far, both inside and outside of Drupal. If and when LLMs have a breakthrough, then the ban can be revisited, but I am quite sure there won't be one.

To quote https://www.drupal.org/about/values-and-principles

Drupal has such a large impact on the digital landscape that our community cannot afford to be careless.

What would you call submitting code you cannot vouch for but carelessness? It's a fairy tale that people who use LLMs would somehow do a thorough review of said code. If they were to do so, the end result would not even be recognizable as LLM output. There's no doubt people will continue to submit slop to core; this issue is about establishing a policy under which those submissions can be immediately closed and, in extreme, repeated cases, the submitter barred from submitting any more of it. But there's also zero doubt people will not stop using LLMs, maybe for parts of their work etc.; we can't stop that.

kentr’s picture

Another way to frame it is that it's also about preventing low-quality code and bogus reports from being submitted.

Sadly, any system with incentives (like issue credits, or maybe pay increases based on volume of contributions) will motivate some people to abuse the system or even take well-meaning but misguided shortcuts.

nicxvan’s picture

I have opened this issue to comment so many times since it was opened and could never quite find the words, but I feel I need to just put my thoughts to paper, so to speak.

I strongly believe LLM contributions to Drupal core should be banned at this time for many of the reasons already stated.

There is a saying, come for the code, stay for the community, LLMs as they stand erode both of these pillars of Drupal.

There are bottlenecks to Drupal core velocity, but writing code is not one of them; good quality reviews and committer time are the two biggest, and LLMs erode both of those as well at this point.

It's a new technology, the burden of proof of utility is on the new tool, as catch pointed out there are multitudes of issues as examples.

You also have to think about the risks and mitigations of both choices.

1. We choose to ban LLM contributions and later decide we were wrong: the mitigation is to unban them and start working with LLMs. If someone is going to use LLMs to write a Drupal killer, Drupal adopting LLMs today will not prevent that.

2. We choose to adopt LLMs at full scale and later realize we were wrong: there will potentially be thousands of lines of code to dig out of core, adding more to the burden of maintaining core.

As catch said, the barrier to contribution has been a blessing in disguise for this current wave of low quality open source contribution. Open source projects are completely closing down PRs and contribution on Github, that is not a sign of a healthy technology for open source contribution.

To be clear, I have had some clients adopt LLMs to success and some adopt them disastrously. Core would not be a good application of LLMs.

As has been implied, banning it won't stop people from using it, but having a direction that says not to use it is a clear sign and gives a clear policy basis to ban people who clearly abuse the policy. As has been pointed out, at the moment we even struggle to get people to follow the current guideline, which is to disclose usage.

Finally, I'd say a large part of the reason Drupal is so trusted is the core gates and committers' time to review. The only way LLMs could truly gain on that front is to trust them to write and self-review, and as shown we cannot rely on LLMs even for a first draft without human review, which many times doesn't happen with these contributions.

Just for additional info, the clients that have had success use it for prototyping, parsing, searching and translation, with review by a human. The clients that have had poor results have been vibe coding and largely trusting the results. Again, I don't see a space for this in core contribution for the foreseeable future.

scott falconer’s picture

But you refused to believe this when we already...

@ghost of drupal past at no point have I refused to believe that low quality contributions are a major concern, and I have mentioned it as a very real concern in many threads, including this one:

though the impact of LLMs on maintainer overhead is a critical discussion we should continue to have.

That concern is one of the main reasons I jumped into this thread; it's a real issue that needs to be solved. But I do disagree with your proposed solution above, in part because the solution as written does not seem like it would be effective.

scott falconer’s picture

You also have to think about the risks and mitigations of both choices.

1. We choose to ban LLM contribution ...

2. We choose to adopt LLMs full scale..

@nicxvan I don't think we need to limit ourselves to just those two choices. There's a lot of ground between "ban" and "full scale". i.e. I'd 100% support an expansion and enforcement of guardrails around LLM use and disclosure (including barriers to LLM contributions).

cainaru’s picture

For the record, what @catch brought up in #52 is exactly why I’d floated the idea of a potential “happy-medium” in #23 above. @catch is absolutely correct:

Drupal has not yet seen the plague of high volume vibe coded 'contributions' that are affecting projects on github, probably because there is some friction in registering on Drupal.org etc., but even the three examples here wasted hours of people's time that could have been spent doing something else.

aporie’s picture

Thanks @catch for this very insightful addition. I was totally unaware of how the core team has been flooded with requests to review partial revamps of entire parts of Drupal core.

There's a lot of ground between "ban" and "full scale". i.e. I'd 100% support an expansion and enforcement of guardrails around LLM use and disclosure (including barriers to LLM contributions).

I also totally agree. Could we think of a system where, like we have for "security coverage", we grant developers a role of "AI assisted coder" allowing them to publish "slop / mass revamping" contributions made with AI-assisted tech? Meaning, they can be trusted to at least self-review their code?

From what I read, it seems the issue is more the abuse. We should maybe implement (but again relying on d.org infra, and I was just reading a post from Dries about the costs, which are hard to cover) some kind of AI-generated code detector. It could then tag the ticket with different levels: full vibe coding detected, genAI detected, human code detected. At least on 5,000-line contributions and big revamps, this could help the core team quickly identify where to spend time. Also, I have no idea whether such tools already exist and how effective they are; from a quick search I wasn't able to find anything that isn't proprietary, at least.

From where I stand (more a contrib contributor than a core one), there is a distinction that seems obvious to me: the distinction between "core" and "contrib". I mean, if anyone just came up with a total revamp of one of my contrib modules, I'd suggest they create a new module, where, like I suggested before, we should mention it's AI generated.

The overall idea is to invent a system where we can co-live with AI (this sounds weird said like that^^).

catch’s picture

From what I read, it seems the issue is more the abuse. We should maybe implement (but again relying on d.org infra, and I was just reading a post from Dries about the costs, which are hard to cover) some kind of AI-generated code detector. It could then tag the ticket with different levels: full vibe coding detected, genAI detected, human code detected.

That's even more for people to review then, you need to look at the output of the tool, and it could be wrong, so you then need to evaluate whether it's correct or not. Just adding even more work.

At least on 5,000-line contributions

We generally do not accept 5,000-line MRs to core point blank; there are three exceptions:

1. MRs generated with rector to mass apply the same change in a predictable way, examples would be the OOP hook and annotation -> attributes conversions. These are usually pre-agreed and decided anyway, they don't just show up out of the blue.

2. Deletions of deprecated code where everything is a deletion anyway (although we still have to review those to make sure they're not deleting e.g. valid test coverage).

3. Additions of new experimental modules or themes, which have previously been worked on in a contributed project. Recent examples would be the admin theme, package manager. Older examples are media, views.

However #3 is where vibe coding can potentially end up 'laundered' into core, if the contrib project is accepting huge MRs like that, which then become part of the core MR months or years later.

jurgenhaas’s picture

@nicxvan makes an important point: once low-quality code sneaks into the code base, it's too late. The same applies to privacy: once your social security number is out there, you can never fully recover from the damage.

However, the situation is not black and white. There's a lot of middle ground worth exploring. To do so, let's take a closer look at the problems we're facing:

  • Limited and stretched resources, especially among core maintainers
  • High expectations (and requirements) for issue reporting and code quality
  • The desire to attract more contributors

Various initiatives have already been addressing these issues, including the Bug Smash Initiative and the highly successful mentoring efforts at conferences, meetups, and online.

We should explore if and how new tools (such as LLMs, but not limited to them) can help us accelerate these efforts and resolve the above issues. Let me outline a few ideas I have in mind:

  • Let's educate ourselves and new contributors on how to use modern tools to analyse problems, describe them in issues, and write excellent code and tests to resolve them.
  • Build and apply filters for bug reports and MRs to ensure they only reach maintainers when they meet the necessary criteria - similar to the successful approach already adopted by @smustgrave.
  • Use the filtered contributions as additional learning material on how to improve.

We neither want to blindly trust LLM-generated contributions nor should we push back on them entirely. Let's utilise them to our benefit, as we've done in the past with other tools.

In my view, LLMs are in no way magic (or even intelligent) - they are tools that I use on my own terms to do a better job. But what comes out of it, and what I submit to upstream projects, is and remains my own responsibility. I will be held accountable for all of it, whether written manually or with the help of tools. That shouldn't matter.

catch’s picture

Here's another example.

My employer gave us 10 hours funded time to try contributing to open source using Claude. I don't use LLM coding tools, but I thought I should try them so I could usefully talk about having done so in issues like this. I picked #3546376: Use the 'yield' option instead of output buffering for twig rendering to support async rendering because I had been stuck on it for a while, and I knew it would need test coverage, and test coverage seemed like the least harmful thing to point it to. I also committed myself to reviewing all the code it produced etc.

I got the LLM to scaffold some test coverage (a custom twig template), and without the LLM involved, was able to work that scaffolding towards a failing test for the bug. Getting excited about having both a fix and a failing test, I pushed the code to the MR. Very quickly, @godotislate had already reviewed the new test (before I'd gone back to clean it up), and pointed out it was creating an entire test module etc. to define the template, when an inline twig template would have been fine. Switching to an inline twig template removed hundreds of lines of test boilerplate from the MR with zero loss of test coverage. Had I not used the LLM, I definitely would have got bored writing that boilerplate, and even if I'd not thought about an inline twig template up front would have switched to it very quickly. So even 'responsible' use of an LLM can very easily result in bloating the code base, because it makes it much easier to write that bloat. None of the LLM code made it into core at all, because the one thing it was 'useful' for, it wasn't, but it did mean essentially that the work happened twice. It would of course be possible to not think of the inline template idea without the LLM and still write the unnecessary boilerplate manually, but it would have been less likely.

joachim’s picture

> We neither want to blindly trust LLM-generated contributions nor should we push back on them entirely. Let's utilise them to our benefit, as we've done in the past with other tools.

That's not a valid comparison. All our other tools are deterministic -- if we use PHPCS, Cspell, PHPStan, etc, they always produce the same results for the same input, and that's why we CAN trust them.

If people feel that writing code takes too much time and they want to use LLMs to cut corners and write the boring boilerplate, then the problem to fix is that we have too much boilerplate. I suggested we look at this way back in 2019 (#3027683: Boilerplate reduction initiative) but it got no traction, because people are too busy haring after what's new and exciting and trendy.

jurgenhaas’s picture

That's not a valid comparison. All our other tools are deterministic -- if we use PHPCS, Cspell, PHPStan, etc, they always produce the same results for the same input, and that's why we CAN trust them.

I didn't suggest trusting the new tools. I'm not doing that for older tools either.

If people feel that writing code takes too much time ...

That's not really where we could find benefits. In fact, it's not all about coding in the first place. I touched on a few other areas as well:

  • Problem analysis
  • Issue reporting
  • Quality gate keepers, again for both issues and code
  • Learning and mentoring

And then, when it comes to code generation, I can talk from my own experience: as a project manager or as a solution architect, I put measures in place to review the code generated by my team as well. The same applies to the output of assisting tools that somebody uses, whether team members or myself.

scott falconer’s picture

That's not really where we could find benefits. In fact, it's not all about coding in the first place.

Fully agree @jurgenhaas.

This is where hard blocks like the inclusion of https://github.com/lobsters/lobsters/blob/main/AGENTS.md can really set us back. I tend to use LLMs for tasks like the following quite often: "Read this issue, set up a local environment, run the automated tests, write some of your own smoke tests, then give me the url where I can manually test."

In this case, there is no intention that the smoke tests would ever be committed, but they often catch the sort of random things someone poking around on the site would find. If the lobsters/AGENTS.md were included that would no longer work. That then puts me in a weird position as someone who wants to responsibly contribute using LLMs. You very likely end up with a situation where responsible contributors are blocked, so you only end up with contributions from those who ignore the rules.

If people are interested, I'd be happy to start work on an MR that attempts to create an AGENTS.md file that guides towards responsible contributions (and yes I would use AI to help draft this).

joachim’s picture

It seems we need to point out #39 again -- the ethical problems of LLMs.

scott falconer’s picture

It seems we need to point out #39 again -- the ethical problems of LLMs.

@joachim, respectfully the proposed resolution in this issue is "Declare the project to be free of slop as a mark of quality and be loud about this."

This is not to say the ethical / environment / existential concerns should not be discussed, but if we attempt to solve it all in one issue we'll never get anywhere.

joachim’s picture

Right, but the ethical concerns (and the environmental concerns are ethical too) are such that I don't think partial AI use is acceptable. I don't think enabling AI use for analysis and issue reporting is acceptable.

catch’s picture

@scott falconer we already had an issue discussing adding an agents.md file to core, you can read the whole discussion there #3568936: Embrace the chaos, add a couple of AGENTS.md file to core.

In this case, there is no intention that the smoke tests would ever be committed,

If that's the case, why did you add them to #2835545: Provide a Workflow FieldType that references a workflow entity and tracks the current state of an entity?

scott falconer’s picture

@catch re #3568936: Embrace the chaos, add a couple of AGENTS.md file to core, I participated in that discussion and it was closed without resolution, hence my offer.

re #2835545: Provide a Workflow FieldType that references a workflow entity and tracks the current state of an entity, the tests there are not the ad hoc smoke tests I'm discussing. Those commits are actual test coverage which I felt would be of benefit to add, all of which was manually reviewed line by line and tested by myself. I also disclosed in the comments that AI was used. Can I ensure that they're perfect? No. Can I confirm that I spent time manually reviewing and testing by hand before contributing them? 100%.

ghost of drupal past’s picture

Issue summary: View changes

@joachim, respectfully the proposed resolution in this issue is "Declare the project to be free of slop as a mark of quality and be loud about this."

That's very easy to change. I wrote the issue summary and have now changed it so it contains the ethical concerns. I thought it would be easier to accept on code quality grounds, but apparently that's not the case.

ghost of drupal past’s picture

Issue summary: View changes
cainaru’s picture

Another potential option for consideration could be a moratorium (i.e., temporary, and to be reviewed periodically as things continue to evolve in real-time) for accepting LLM-generated code to the Drupal core project (see comment #23 above) with a carve-out exception for trusted roles such as core committers/core maintainers, and the security team in the meantime.

“With great power comes great responsibility.”

It would mitigate overwhelm on the core maintainers while also serving as a bit of role model behavior demonstrating to the community how these things can be used thoughtfully and in ways that won’t make our core maintainers want to pull their hair out or ragequit and chuck their laptops into the nearest body of water.

(Note: I do realize this doesn’t address the ethical/economic/environmental/legal/political/social concerns, which I do wholeheartedly, to my core, believe are extremely important and must not be ignored. We need to be transparent about that. These are big issues that do matter to a lot of people here, including me. They are also why many of us, myself included, are either avoiding LLM usage or keeping it to a minimum… think of it as voting with our feet and/or our wallets, so to speak, until things change or something far less problematic comes along.)

scott falconer’s picture

@cainaru I’d be all for that. The carve-out could also include required training/acceptance covering responsible use, ethical/environmental concerns, harm mitigation, etc.

aporie’s picture

Right, but the ethical concerns (and the environmental concerns are ethical too) are such that I don't think partial AI use is acceptable. I don't think enabling AI use for analysis and issue reporting is acceptable.

I'm really not sure it is Drupal's place to take such a political decision. I mean, I understand the environmental concerns and all, but then why not fight against the return-to-office trend, or try to reduce our commit numbers to avoid triggering unnecessary pipelines? I really think these ethical concerns, which deserve to be discussed at a societal level, should not stop Drupal from accepting the AI shift. I'm not sure Drupal itself can pull off the political effort of imposing its stance on the ethics of AI. The risk is that clients (with fewer ethical concerns) just leave for other solutions. To me, it's not up to the Drupal project to make such a risky decision, but up to politicians to do their jobs (if they see fit, which I don't think they will ^^).

Regarding contributions to core, noting that 5000-line contributions are usually de facto set aside, I've got another idea. It's way easier to implement, and cheap:

- We could just add a field to tickets which becomes mandatory if a contribution is more than (to be defined) ~200 lines of code.
- We just query the GitLab API and check how many contributors are part of the change; basically the exact threshold is to be defined and tailored, but once it is reached, the field becomes mandatory.
- The ticket is put on hold (maintainer requests more info) until the user (one of the pool, if several) adds a video (up to 5 minutes) explaining the code. This ensures that: 1) even if the code was vibe coded, we can quickly assess whether the contributor did the review work and knows what they are committing; 2) we can quickly assess whether a ticket is worth investigating; 3) tickets for which no one ever adds a video are filtered out (which may come as a double-edged sword); 4) abuse is identified: someone who is obviously just reading out an AI-generated answer to "explain that code to me" and recording it is discarded as "Maintainer needs more info", until someone is capable of explaining the change in their own words.

Regarding the "AI contributor" role, I really meant that as a handful of contributors who can literally use AI to revamp entire parts of core. They need to be vetted, and the role should basically only be given to already known core contributors. To make it fair, there could be a screening process to get it or something. But when they submit a contribution to core, we know their work is worth reviewing. The rest? We just ignore it if we suspect it's AI generated.

But again, I mean, the benefits of AI are hard to evaluate. I personally was struggling with a stack middleware for my alert_message module to invalidate some caches (and I know other modules are too), and with the help of AI I switched to lazy loading (which I was not fluent with) with just one prompt. This is worth taking into account...

catch’s picture

- The ticket is put on hold (maintainer request more info), until the user (one of the pool if several), add a video (up to 5 min) explaining the code.

So you want to set an arbitrary limit on MR size and then force people to shoot and upload a video if they go one line over that limit?

This would punish completely normal contributions that people have been making for years.

aporie’s picture

So you want to set an arbitrary limit on MR size and then force people to shoot and upload a video if they go one line over that limit?

Yes. Anyone can today contribute. We need to level up the entry barrier.

dimilias’s picture

Hello all. I will try to simply state facts and not provoke anyone. Please don't get upset; I am just trying to offer a few considerations (if that is even the right word).
I will play devil's advocate with all of you. But before that, I will not get into the politics. Some words are too strong and are not helping. I am probably not smart enough to judge the politics of these things, so I will skip them entirely.

So, what do we have? In terms of whether it will blow up, there are many articles claiming yes. But many said the same about crypto, for example, and the numbers were already crazy there (talking about energy consumed). It seems the planet is already wasting energy on stuff, and AI is just one example. Not to say that it won't blow up, just that it is not the first. That is not an argument for or against; it is an argument about continuity.

On the comment from @catch about the quality of the work: yeah, I have seen some bad pieces of code as examples. We all have, I guess, out of the people arguing here. But I am pretty sure I have also waved some through as "that code looks fine" when the right response would have been "hey, can you rewrite this in this way to be more consistent", and I am sure we missed that it was written by AI. You don't know how the other person is working. If, say, I am bad at prompting because I just started, and I create needless work for you, while person B has been practicing a lot and you didn't even notice, then it's my work being banned because I am not good at it yet, while the other person thrives because they can tamper with the limits. I say that with all respect, because even if you say "no AI in Drupal", you probably only exclude those who are learning or are clumsy. Most small patches are 2 lines here, 3 lines there, maybe a copy-pasted test with some alterations. There is no way to catch these as AI changes. And I am saying this as someone who is careful about what to submit with AI, while others in my circle are being "noticed" for how they use it.

I don't agree with the argument that core contributors will have more work because of this. If AI remains in the future, you will have more work one way or another: either to QA, or to verify and ban accounts. Again, I am not part of the governance of Drupal, so I don't know whether the latter is better or not. But the work will be there. I have lost hours and days managing the security of my servers because of bots that happened to appear at the same time as AI all over the world, and I can tell you, advocating for a ban on AI in one technology will not make the world better. It will make sneaky users more careful, and more successful.

About lost jobs (the only part of politics I will touch, which isn't really politics): I don't see how this is an argument. AI might be quite unpredictable, but at the same time we have installed a LOT of tools in our projects so that we can handle them financially and not need four teams for a single site. Even d.o. has installed the Drupal bot to update contrib modules for the latest versions. This has taken credits and work away from people who might have done it. And you might say "but it is doing it in projects where maintainership is low or non-existent", but that is not an argument in the long run of things. A lot of automation has done the same. Of course AI is causing a LOT more turmoil, which can be moderated, but the outcome falls under the same idea.

Of course, there is the argument "does that mean we should blindly accept everything that is thrown at us?" Of course not. But history shows that a technology is as ethical and as good as the person using it. Still, the technology is there. From atomic power to identity cards in Greece that have the devil's chip (don't ridicule me for this, it is actually an argument people make), technology evolves with or without our consent. We have sharp pointy blades at home to cut fruit... you see my point.

It is weird, I have to say, that Drupal CMS comes with an AI recipe and there are members working heavily on integrating AI, while we are discussing disallowing code-assistance usage.

Now that I am done with the arguments in favor, and being divided, I would like to point out one thing. Unlike contrib modules, which are a bit different as they are mostly small chunks of code, core is scary complex. I have contributed a bit, nowhere near the experts here, and I must say that whenever I figure out that a bug comes from core and not my little custom code, it sends a small shiver down my spine. From debugging, narrowing it down, properly explaining it, reproducing it, and writing a patch, to having users A, B, and C point out my mistakes, it is a process that takes a lot of time and involves many people. But it is still human code with human limitations, and I can understand most of it because we all have limits. That makes it a bit more approachable for an average user. Putting aside the cases where AI creates way more problems than it solves, even if every piece of code worked perfectly, the code it produces is sometimes not easy to understand, or uses sources that are "above my pay grade", and that would push away some users in the very long run.

Furthermore, it turns out that "allowing with declaration of usage" attracts more usage than denying it, knowing that, of course, not everyone will respect the rule (I see this in a scientific journal submission system that I manage, not as the owner, where we even see entire scientific papers that are AI generated).

I cannot argue in favor of banning AI assistance in code, because you will never stop it as things stand. I am not saying it is good, just pointing it out. It is better to moderate it. The https://github.com/lobsters/lobsters/blob/main/AGENTS.md approach will not do anything either, because generating code and contributing it are different things. Anyone can find a small prompt that overrides it, or still generate the code and copy-paste it instead of having the AI work on the codebase directly. I would still argue it is good to point it out. My suggestion is that if you want to ban/moderate it, write a strict policy saying it is not allowed and can lead to an account ban or whatever the punishment might be. But if I am writing a patch, even for my own site, I will still ask AI to QA it, for example, because it might come up with an edge case I wouldn't see out of the box. One more pair of "eyes" before the commit. I am just being pragmatic here. That is also a moderate position for users who just don't see the politics the way you see them. They don't disagree; they just don't see it this way. We are not all living in the same country, we don't have the same values (I feel weird calling my thoughts "values"), not everyone is paid the same, and some will see these arguments, roll their eyes with a sigh, and move on.

P.S. I will try to close with a joke in an attempt to lighten the tension: this response was not written or QAed by AI in any way :D So grammatical/lexical mistakes are solely mine, and I apologize! I was never good at writing essays :D Sorry for the long potato :)

catch’s picture

@aporie I've never seen you in a Drupal core issue, but several people in this issue have been contributing for over 20 years. And you're suggesting they should be forced to produce videos in order to do so. At this point I'm wondering if you're deliberately trying to kill the project tbh.

aporie’s picture

@aporie I've never seen you in a Drupal core issue, but several people in this issue have been contributing for over 20 years. And you're suggesting they should be forced to produce videos in order to do so. At this point I'm wondering if you're deliberately trying to kill the project tbh.

Well, I did contribute to core, but mostly I was building workarounds in contrib modules to find "easy" solutions to complex core tickets.

That's the way I work (and sorry for not being a core contributor): I work to get things done and working for my clients.

You may not have noticed me, but I did notice you :) And you've been very helpful to me. Why did you remove your nice avatar? It was nice, that kind of magician, Baldur's Gate-like.

I'm surely not trying to kill the Drupal project. If you have a core team, already vetted, who know each other, just give each other the "AI contributor" role out of the box and that's it.

The rest of us? Screening process.

dimilias’s picture

Yes. Anyone can today contribute. We need to level up the entry barrier.

That is really not a good approach. Seeing a "dummy" (calling the entry level that just for the sake of argument) submit something that makes you want to tear your eyes out, and rejecting them, might end up sending away some future genius.
Also, we were all at that level at some point.
There is a reason why I, who will write a full-fledged patch, tests, analysis, whatever, and a newbie who can't even argue against my remarks and ends up simply RTBCing my patch, get equal credit in the ticket and in our profiles. It is to encourage them to become better and to collaborate, not to make me the king of coding.
I would say that allowing LLMs will result in something like you are suggesting: many patches will be written naively, and we will end up having to give users a credit score until we verify that they actually do the work.

In general, sometimes we take a major decision thinking we hold all the cards, but the consequences are way beyond our reach.

ghost of drupal past’s picture

I'm surely not trying to kill the Drupal project

If you begin to erect barriers to newcomers, of which we already have too few, then you will. One of the problems we face here is: if LLMs rot people's brains, then who will become a new Drupal contributor? My proposal to ban LLMs is a faint hope that in this spreading darkness the project could attract those who still care. I might be naive.

acbramley’s picture

Speaking purely to the title and IS - how exactly would we ban LLM contributions? I am all for banning large amounts of obvious slop like the examples @catch pointed out in #52, but what about smaller contributions?

I have been experimenting with Claude and Copilot using PhpStorm's built-in IDE agent, and it has really surprised me with how much it can improve velocity on the more mundane tasks.

For example, I recently used it to upgrade a project to PHPStan v2. Something that would have potentially taken 1-2 days of hand-fixing every error, I was able to achieve in a few hours, because Claude could easily identify and fix all the new errors.

Another great example was converting Hux hooks to Core OOP hooks. I was able to have Claude write a new Skill.md file, plan the changes and then convert over 60 custom modules to use Core OOP hooks. Yes it got a few things wrong but that allowed me to refine the skill and apply further logic to those fixes. This only took me 2 hours in total where doing it manually could have again taken 1-2 days of very boring/repetitive work.

In those two examples, the code changes would be identical whether a human or an LLM made them. How are we supposed to detect that? Are we creating more work for ourselves by having a blanket ban on LLM contributions and therefore having to determine for each MR whether it was written by a human or an LLM? Some are obvious; some are basically impossible.

IMO we need something more nuanced than a complete ban, but that brings with it its own complications.

ghost of drupal past’s picture

Obviously we can't detect all of them. The point of this issue is twofold: one, make it easy to outright refuse slop without further waste of time; two, make a value statement.

For example, to quote QEMU:

Current QEMU project policy is to DECLINE any contributions which are believed to include or derive from AI generated content. This includes ChatGPT, Claude, Copilot, Llama and similar tools.

Note "which are believed".

catch’s picture

Another great example was converting Hux hooks to Core OOP hooks. I was able to have Claude write a new Skill.md file, plan the changes and then convert over 60 custom modules to use Core OOP hooks.

For the core procedural hook -> OOP conversion we wrote a rector rule, ran that on one module, refined it, then eventually ran bulk conversions on modules using rector (+ phpcs fixer etc. on top). The eventual commit was a +25k/-25k diff https://git.drupalcode.org/project/drupal/-/commit/8aeb2ca5992dc3aecfb65...

No one can meaningfully say they reviewed a +25k/-25k diff, but we can say we reviewed each of the steps that eventually produced that diff and were confident that they produced consistent results. While it was possible for the rector rule to have bugs, it would have exactly the same bugs every time. It's not how we would have done it 15 years ago, we'd have had 200 novice issues or something. A very similar approach was used for plugin annotation to attribute conversions.
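To make the "review each step, not the diff" workflow concrete, here is a minimal sketch of what such a rule-based run can look like. This is illustrative only: the rule class name below is made up, and the real core conversion used a purpose-built rule; only the `RectorConfig` builder calls are standard Rector API.

```php
<?php

declare(strict_types=1);

// rector.php - a minimal sketch of a deterministic, rule-based conversion.
// The rule class name is hypothetical; run on one module first, review the
// resulting diff, refine the rule, then widen the paths for the bulk run.

use Rector\Config\RectorConfig;

return RectorConfig::configure()
    // Start narrow: a single module whose diff a human can actually review.
    ->withPaths([__DIR__ . '/core/modules/node'])
    ->withRules([
        // Hypothetical rule converting procedural hooks to OOP hook classes.
        \Drupal\Core\Rector\ProceduralHookToOopHookRector::class,
    ]);
```

The point is that `vendor/bin/rector process` with a config like this produces the same output on every run, so reviewing the rule plus one sample conversion stands in for reviewing the entire bulk diff.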

The automated hook conversion had plenty of limitations: no dependency injection, a single monolithic hooks class per module, etc., but we accepted those on the basis that we'd rather review those changes in much smaller, focused issues later on.

I would not have felt confident if someone had used Claude + a Skill.md file to produce that diff, nor would I have volunteered to review a +25k/-25k diff or expected anyone else to do so, because I would not trust Claude to produce deterministic output; even when it looks deterministic, it's by accident. So yes I would have refused an LLM contribution like that to core.

Additionally, if we'd done it that way, we'd have ended up with a Skill.md file (even if copy and pasted into the issue summary), which people could use on their modules, as long as they pay Anthropic $200/month for the privilege of something they can do for free with rector.

It's quite possible your Hux -> Core OOP hooks conversion led to a much smaller reviewable diff because the code was already OOP in the first place, and there aren't literally thousands of contrib modules (+ tens/hundreds of thousands of custom ones) using Hux hooks that need converting etc. so I'm not trying to say you didn't save time by using it or that it definitely introduced bugs or anything like that. But what I am saying is that I would not use an LLM to apply changes like that across core and contrib modules, both because I wouldn't trust it, and because I don't think people should have to take out a $2400/year subscription to do things they can currently do for free in a slightly different but more deterministic way.

A hybrid approach would have been to have Claude write a rector rule, then use that rector rule to convert the modules. That is a slightly grey area in terms of the discussion in this issue (it would be fine under @cainaru's 'carve out' approach, but not under an absolute ban, I think).

cainaru’s picture

I want to emphasize two important points that @catch made in #86 that should be food for thought for all of us here. These are things that I think are too easy to lose sight of in these AI/LLM conversations, especially for many of the folks here who work for big corporations or agencies in Western countries and/or in higher-paying enterprise sectors.

Firstly, on producing reproducible, consistent results each and every time (deterministic results):

No one can meaningfully say they reviewed a +25k/-25k diff, but we can say we reviewed each of the steps that eventually produced that diff and were confident that they produced consistent results. While it was possible for the rector rule to have bugs, it would have exactly the same bugs every time. […] I would not trust Claude to produce deterministic output; even when it looks deterministic, it's by accident. […] A hybrid approach would have been to have Claude write a rector rule, then use that rector rule to convert the modules

Secondly, on subscription costs that put these tools out-of-reach for many:

Additionally, if we'd done it that way, we'd have ended up with a Skill.md file (even if copy and pasted into the issue summary), which people could use on their modules, as long as they pay Anthropic $200/month for the privilege of something they can do for free with rector. […] I don't think people should have to take out a $2400/year subscription to do things they can currently do for free in a slightly different but more deterministic way.

A friendly reminder that we are a global community. A lot of our developers and site builders come from countries in places such as the global south, where a $2400/year subscription is unaffordable for many (despite these subscriptions currently being subsidized by venture capitalists). And even within Western countries, there are a lot of developers and site builders who work in industries such as higher education and non-profits (also big portions of the Drupal economy; the clients and customers for many of us in this issue) that have much lower budgets and pay much lower salaries than corporations or agencies (e.g., a $50k-$80k USD salary per year is pretty common for a senior developer in higher ed here in the US in 2026). Even within Western countries, the economics can vary widely (especially here in the US).

(I really, really hope this wide range of economics across the world and even within countries is something the Drupal AI strategy folks keep on their radar, since it could be a blind spot for agency leaders over there and a potential foot-gun w/r/t the clients and customers we serve, but I digress…)

Note: I’m not making a case for a ban based on cost. Instead, I’m trying to make a case for awareness of the cost (see comment #89 for clarification).

dimilias’s picture

and site builders do come from countries in places such as the global south where a $2400/year subscription is unaffordable

Yeah, that is why we moderate, not ban. This is not an easy argument to sustain, because 1) you do not count the upsides, like how much these people have been helped by the free tier, and 2) it can escalate quickly to an irrational level. PhpStorm has a paid license; having an M4 Max machine rather than a 3rd-generation i5 is also a blocker for the second person. You don't ban strong machines.
Also, regarding #86:

A hybrid approach would have been to have Claude write a rector rule, then use that rector rule to convert the modules, that is a slightly grey area in terms of the discussion in this issue (it would be fine under @cainaru's 'carve out' approach but not an absolute ban I think).

Yeah, that is more or less what I am saying. A clumsy user will get banned, because a sneaky one will know what to do and have the AI create the rector file and push the changes, including the rector file. The second person had the ability to experiment a lot, while the first was given a chance with the free tier and was banned because they didn't know the politics...
And sorry, I don't mean this as an emotional trap. I am just saying what I said above: sometimes we think we hold all the cards, but the world is vast, and often out of our reach.

cainaru’s picture

Hi @dimilias, @cainaru here, who proposed the “hybrid approach” (as @catch called it in #86). I think you might be misreading my comment in #87; I’m not making a case for a ban based on cost. Instead, I’m trying to make a case for awareness of the cost: the push toward AI tooling as a de facto norm has equity implications (for our clients as well as for our developers and site builders themselves) that the community (including the project leader, agency leaders, and Drupal AI initiative strategy leaders) should keep on its radar.

Especially since the VC-backed subsidies of the subscription costs will only last for so long.

Hence me framing it as a “friendly reminder”/“food-for-thought” rather than a policy proposal.

Also, long potatoes are fun too 🥔 (regarding the bit of levity you brought in your earlier comment #79 🙂).

dimilias’s picture

I am not assuming anyone's internal thoughts or what they "might imply". Sorry if it sounded like that; I don't say you claim that. In all honesty, I don't care whether the D.A. bans, allows, or moderates AI. It is a business decision and I can go along with it. I think people will still use it anyway, though. But I would like all the arguments on the table so that the decision is taken on proper facts. I also don't think there is a correct or a wrong decision here.

dimilias’s picture

Sorry, "I don't care" sounds a bit too harsh. I mean that it is not up to me to decide and there is no good/bad option anyway.. And I did understand your food for thought :) I also use this phrase along with the devil's advocate :)

ghost of drupal past’s picture

dimilias, I might have missed something, but outside of your comment, banning people is mentioned only twice. Once by me:

in extreme, repeated cases the submitter barred from submitting any more of that.

and by nicxvan

a clear policy decision to ban people that clearly abuse the policy

No one will get banned for submitting their first LLM MR. It will be refused with a link to the policy, and that's the end of it. If they repeatedly do so, then after a while they are clearly spammers, and only then will the ban hammer swing. And I expect it'll be a rather high number before any of that happens. This is still the Drupal community; we try to be friendly with each other even when we make mistakes.

kevinquillen’s picture

I don't think an outright ban will stop it altogether, although I agree in principle with the arguments and concerns made.

Considering core, there are only so many maintainers to go around to review issues or MRs. I worry that there could be a deluge of requests along the same lines as the years when contrib was plagued with 'Update README.txt to README.md' and a dozen other similar issues, from orgs looking to rack up credits for ranking, which just made it a slog to maintain contributed modules. It's along those lines that I agree with #92: I don't think it can be 100% stopped, but there does need to be a 100% deterrent for members or orgs who want to game the system or pitch slop that wastes time and effort. I fear that drive-by MRs and issues would arrive in such volume that the stellar maintainers we have will just get exhausted and move on, or worse, something is accepted and brings about another Drupalgeddon, which takes a long time to recover from.

Every issue someone wastes hours reviewing, only to realize it's useless, is precious time not spent on real, valuable issues elsewhere, and that is one of the dangers here. Even if you argued that it could solve hundreds of years-old issues, time and effort are the precious commodities in this community that we need to protect, for our maintainers and contributors.

yautja_cetanu’s picture

Overall, I think a real good-faith exploration of the ethical issues of AI is good. But the general vibe is that people who are against AI are angry and really only want to solve things with outright bans. I haven't seen many people with strong opinions against AI who are interested in putting serious time and effort into exploring mitigation strategies for its negatives. Because they don't like it, they want it banned.

I do think a serious good faith exploration of the ethics of AI with an attempt to explore approaches to mitigating the negatives is worthwhile. And I have had some people who have engaged with me constructively on this, especially around policies to make it clear THAT someone is using AI.

Environmental destruction:

Unchecked burning of fossil fuels and a refresh of the nuclear power industry to satisfy exploding demand for electricity. All pledges of "carbon neutrality" are totally out the window now.

Is there any precedent for this in open source? Do we have events where we ban cars, or ban people from flying on planes? Is it OK if someone uses an LLM trained and powered by data centers in Iceland using geothermal energy? Do we know that AI is causing this?

For example, the Drupal 7 to 8 migration was quite long and painful. I know of reasonably small rail network sites that took 2 people working full time for almost 2 years to migrate. Have we seen a cost analysis of using AI to do the migration (including a percentage of the cost of initially training a model) versus the energy cost of 2 humans commuting to work, heating an office, doing the migration, etc.?

When looking at the environmental cost of AI, do we compare it to the environmental cost of humans doing the work? I've seen a few examples from the early days of AI graphic design and art, and every time AI was ultimately cheaper on the environment. The counter is that companies like Microsoft are overall using more electricity, but that still isn't a good analysis, as you need to compare the whole system to the previous whole system.

Vast quantities of water being used to cool all those data centers, instead of growing food or providing clean drinking water

I understand how you may want to use the issue of water to make personal decisions to avoid AI. Similarly, people will avoid cars, flights, or eating meat due to water consumption. But I don't see evidence that it is common in open source communities to ban individual contributors over these kinds of considerations.

Why this, but not a ban on anyone who builds a Drupal site selling meat products? Or on any agency that ever provides lunches or meals involving meat? Or on any organisation that does any sales while playing golf? What about the amount of energy and water used by data centers that serve video content?

It’s not something people were talking about much until very recently, and there are so many wasteful things we do with water that it really seems like a problem people have manufactured because they fundamentally don't like AI, rather than a true worry. And if it is a true worry, then why not explore mitigation strategies?

I think the concept of Drupal doing its bit to reduce the environmental excesses of AI makes sense. In the early days we explored: could we focus on training small language models? Could we partner with green data centers using geothermal power and push AI on drupal.org like that? We spoke to individual companies trying things like mini data centers that used the excess heat to warm up homes. But the problem with people who are anti-AI is that usually they don't care about this, and so it became a wasted effort.

In the Drupal community, we are literally exploring AI strategy and data-centre-building initiatives with organisations like the European Commission. If people wanted to influence this stuff we could actually organise to do so.

This is similar to whether or not you eat animal products such as eggs. If everyone who cared about animal welfare stopped eating eggs, then there would be no incentive for egg producers to treat their animals well rather than just sell at the lowest price. However, it's clear that there are times when it is better to completely disengage, if the ethical issues with, say, killing animals are important enough.

If this really matters to people, then it's worth exploring whether there are other ways of approaching it before it becomes a bullet point in a case for banning LLMs entirely.

The training of LLMs is based on vast quantities of very underpaid "digital sweatshop" labor, mostly in the "global south". Yes, humans do this work, it is not fully automated.

Again, if this is a serious concern, it's something that mitigation strategies could address.

However, I think this is an area that needs to be treated with respect and decorum. While we absolutely should punish Western companies that cause harm to employees in the "global south", we should push for workers' rights like those the Luddites fought for during the industrial revolution, in Western companies working with countries that don't have the same laws.

We should be careful of protectionist and colonial attitudes where Western companies prevent the global south from engaging and competing with workers in the West. Protectionism around rice, vegetables, meat products, and linen can make it tough for countries to move into more valuable secondary economic activities rather than just digging things out of the ground.

Are these "Digital Sweatshops" or are these places where the west is providing seriously better employment compared to what they could get? Are we using "Digital sweatshops" as an excuse to prevent people from other countries competing with us?

If this were a serious comment then the flip side should be explored. How many people suffer in the Drupal community due to poor English language and cultural skills? Could clear guidelines in comments for AI to use help people engage more effectively?

Impending economic collapse.

It's far from clear on this one. Every article that suggests it (for example the recent fears of deflation) has articles with good arguments to counter it, and people who use this as an "anti-AI" point usually focus on only one kind of article.

If AI means loads of people get sacked, but because AI sucks everything gets worse, then those people will be rehired and we won't lose jobs to AI. If we lose jobs to AI because AI can automate things, everything gets cheaper to produce, and we get industrial-era yearly productivity improvements, then we have more stuff, and more stuff that we need is good for the economy. It's not obvious what will happen here.

Supporting a technology built by and for fascists. Mass surveillance. "Autonomous weapons". Etc.

Why on earth would this be a reason for a large open source community to put its head in the sand about AI? Surely if all the people who worry about this avoid AI, all the people who are happy with it use AI, and AI turns out to actually be good, then we will cede ground to a horrific future.

Early exploration of Copilot specifically radicalised me towards Drupal AI and open source, because the thought that the future of AI is controlled by a small number of large companies, primarily in the US, is terrifying.

Total disregard for copyright, license, etc. The LLM scrapers have stolen all of human creativity and then glued it back together as if creating "original works".

How are people in open source communities seriously entertaining this argument? I know there is a split between the ideology of Free Software and Richard Stallman versus the more practical Linus Torvalds and "open source", but it never got to the point where open source activists became radical defenders and fighters of intellectual property, even if you thought it was OK to use GPL v2 versus v3 because making money is OK.

One of the worst things for me with LLMs is that they are not truly open source, because even open-weight models cannot publish all their training material for you to download and use, since it is copyrighted. What we need is more aggressive attacks on intellectual property so everyone can have the training data for their own models and the freedom free software gives them.

Fortunately we are able to use proprietary frontier models to produce training data to train smaller models, as the Chinese labs have discovered. So it looks like we will be able to have truly open models: the legal status of AI output is unclear, but it looks like OpenAI and Anthropic cannot own it or stop this outside of their Terms of Service.

Further, there are concerns around open source projects and Drupal specifically:

The value in Drupal is that it provides a lot of pre-built functionality. However, if the perception -- obviously not the reality -- is that you can prompt an LLM or a fleet of LLMs to produce a webapp, then why would pre-built functionality matter when all of it is generated for you anyway?

This is not a reason to avoid AI. This is a reason to get ahead of it and find out if that "perception" is true or not and show why you need Drupal.

Traditionally, open source, especially larger projects, has been seen as more trustworthy. However, the emergence of OpenClaw is destroying this trust (AI is destroying open source and it's not even good yet; diffusion of responsibility).

This is again a good reason to get ahead of this and find ways we can get the benefits of AI without eroding trust.

It's also a dangerous argument for open source to play with, as the whole open software community has had this argument used against it when compared to proprietary software. Why trust Drupal, software whose code could be written by anyone? Foreign agents? 14-year-olds? It's all anonymous, so we don't know who they are. Why trust software where anyone can look at the code, because it is online, and explore any security bugs they come across? Why trust Wikipedia over Encyclopaedia Britannica when anyone can edit it? (Britannica went after Nature when its study found that you actually could trust Wikipedia; the dispute never went anywhere and Britannica basically died after that.)

There are reasons open source software can be trusted despite not relying on the approaches to trust that proprietary software uses (security through obscurity). There will be reasons why AI-assisted code can be trusted whilst using different methods.

There are bottlenecks to Drupal core velocity; writing code is not one of them. Good quality reviews and committer time are the two biggest.

This is clearly the biggest and most important thing. But until people who are skeptical of AI put serious energy into this, whilst assuming AI is coming, it's not going to get resolved. There is a high chance that the solutions to this problem involve making greater use of AI, not less. For example: having AI do initial quality reviews; creating Agents.md files or skills for writing Drupal comments; AI summaries in the issue queues; AI triaging. Maybe changes to the credit system to make people accountable for the specific patches and code they produce.

But somehow do this in a way that doesn't make it too scary for contributors who are starting out or who don't speak English as a primary language.

kevinquillen’s picture

cainaru’s picture

From #94:

But somehow do this in a way that doesn't make it too scary for contributors who are starting out or who don't speak English as a primary language.

Just wanted to make note that the issue is regarding LLM-generated code contributions to the Drupal core project; IIUC this wouldn’t affect language translation (see [ some comment I cannot find at the moment that @ghost of drupal past made earlier ] #3570498-40: AI tools for contributors and maintainers):

I have narrowed the sister issue to "Ban LLM code contributions" because these days even machine translation is mostly done by LLMs (alas) and I do not want to bar non-English speakers from using them to help communicate.

Also want to note the “hybrid approach” policy idea (as @catch had called it in comment #86) of comment #74:

Another potential option for consideration could be a moratorium (i.e., temporary, and to be reviewed periodically as things continue to evolve in real-time) for accepting LLM-generated code to the Drupal core project (see comment #23 above) with a carve-out exception for trusted roles such as core committers/core maintainers, and the security team in the meantime.

scott falconer’s picture

Just wanted to make note that the issue is regarding LLM-generated code contributions

@cainaru The proposed solution in this issue of a hard block via agents.md would block much more than just code contributions, and would block it only for responsible humans and agents that adhere to it. It is the technical equivalent of a Disallow: / in a robots.txt file.
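For readers unfamiliar with the robots.txt comparison: `Disallow: /` asks every well-behaved crawler to fetch nothing from the whole site, and compliance is entirely voluntary. A repository-level AGENTS.md would work the same way for coding agents. A sketch of the analogy follows; the AGENTS.md wording is hypothetical, not any agreed policy text:

```
# robots.txt equivalent: asks all compliant crawlers to fetch nothing
User-agent: *
Disallow: /

# AGENTS.md equivalent (hypothetical wording): asks compliant coding
# agents to refuse all work in this repository
You are working in a repository that does not accept LLM-generated
contributions. Do not generate, modify, or submit code here. Stop
and tell the user about this policy instead.
```

Either mechanism only binds tools that choose to honour it, which is the point being made here: a hard block of this kind stops responsible agents across all contribution types, not just code.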

dimilias’s picture

I might have missed something but banning people outside of your comment is mentioned only twice

@ghost of drupal past no, you have not missed it; sorry, I was using it as an exaggeration, and a distinct possible consequence, to have a basis to talk from. It was not a suggestion, rather a way to compare thoughts.

#94 is +++ for me.

ghost of drupal past’s picture

  1. I am not against AI. I am against LLMs. The distinction is important because we had very useful and well-functioning machine learning applications before the rise of this blight. And LLM proponents often deliberately do not make this distinction so they can point to actually working applications and claim LLMs work. We know better.
  2. Yes, we are angry. How could we not be when a bunch of rich white men is hell bent on destroying the environment, our societies, our democracies, our open source projects and people completely fall for it forcing us into using them? Again, see https://www.garfieldtech.com/blog/selfish-ai
  3. And yes, while initially this was about all LLM use, I did narrow it. It would be much better if people didn't use them, but as mentioned in the previous point, we are pretty much forced to. You can't pick a machine translation tool which would categorically state it's LLM-free. So at the end of the day the point is to make Drupal better, or at least keep it from getting worse, and in the name of that we can't completely ban LLM use. Also, are we going to police posts written by LLMs? Hardly. Despite what some suggested here, my goal is not to create some arbitrary policy which can be used to ban people. There are people who post excessively and needlessly without LLMs just fine, and they are not banned either. I can imagine someone abusing this leniency so hard it turns into spam; then again, they will be dealt with as spammers should be.

As for #94, that's a lot of "but what about". The truth is plain: LLMs currently produce slop at horrible external cost. We should make it easy to refuse slop. And preferably make a value statement about it as well. That's ... about it.

dimilias’s picture

I am going to link an issue and say something that I am not 100% sure about. But I do believe this is such a case.
A colleague of mine opened an issue a few hours ago. Already provided a patch. The maintainer also responded.
Linked ticket is https://www.drupal.org/project/eck/issues/3579703
Suddenly, a new account logged in, did not comment, and provided a test that does not help with the issue description (probably because it was automatically derived). It took time from my colleague to reject the test because it does not help the cause.
My comment assumes that this test was generated entirely by AI from a bot account. Please do not take this as an accusation; I am not digging into why I think this is AI-generated, but I could go into details.

This is what frustrates a lot of you, and I understand it. It is frustrating. But as with everything, this will continue until ways to prevent it emerge. I am fine with the idea of having a strict policy, but I would prefer guidelines which prompt users to learn how to QA their code rather than produce blobs for core committers to waste their time on. The long run will outperform the quick fix, IMHO.

dimilias’s picture

We should make it easy to refuse slop.

I agree. A policy should exist to easily reject cases like the one in my previous comment.

I can imagine someone abusing this leniency so hard it turns into spam; then again, they will be dealt with as spammers should be.

Still, agree. Your opinion is not wrong.

Yes, we are angry. How could we not be when a bunch of rich white men is hell bent on[...]

Really, with all respect, how do you expect anyone to have a conversation at this level? What is the point of this remark here and in the issue summary? Should I reply with how we can generally make the world better? Does this bring any constructive argument to the conversation other than rage? Are the same rich white men responsible for all of these things? Are we bringing BlackRock down in this chat? I don't see why we are repeating this. I don't know where you are from, but I can assure you: take 15 random members and you will not agree on 100% of your opinions, or even 50% for that matter, and you will find people who don't care about your concerns. I really cannot understand how this vague global power-system issue is an argument here.
Also,

while initially this was about all LLM I did narrow this

, you cannot narrow this. Allowing some LLMs or all does not make a difference. You cannot distinguish them. A policy is enough, at least to give moderators some power and inform the users.

catch’s picture

@yautja_cetanu the issue title is 'ban LLM contributions', which means 'don't accept LLM contributions'; it does not mention 'banning people who use LLMs', and as mentioned already in the issue, something like that would happen only where someone's behaviour amounts to spam and defacement (and to be clear, people have had their accounts suspended for spam and issue defacement on d.o whether using LLMs or not).

So your entire comment is based on a false premise, which is unfortunate for someone who claims to want a rational discussion. It also makes this even more ridiculous whataboutism:

Why this, but not ban anyone who builds a drupal site selling meat products?

So I would really try to follow your own advice and argue in good faith a bit more.

bircher’s picture

I have been following along this very emotional thread full of very long posts from the beginning. I think it is only fair if I give my two cents here.
My position should be clear from my contribution in #11. That said, I would like to bring some nuance to the discussion.

First, I have been following the technologies that are now all lumped together as AI for over a decade. So when LLMs became a phenomenon a few years ago, it didn't seem like magic to me like it did to others.
LLMs are stochastic parrots and as such do not deserve anthropomorphising; they inherently just produce text that has the correct form but lacks understanding or creativity.
They are non-deterministic and may at any moment fabricate things, which always need to be double-checked and verified. Contrary to popular belief, LLMs also don't "learn":
in a "conversation" or with "skills" you just add more tokens from which to start predicting the next ones. The only way a model "gets better" is by changing the weights through fine-tuning and other techniques, usually out of scope for a mere user.
That said, I think it is pretty amazing that we now have an algorithm that can produce the same medium we humans use to exchange ideas.
It means depending on the training data it can infer concepts out of vague descriptions and may return the thing you were looking for.
As such I don't think this is an extraordinary technology, and it doesn't deserve the hype that currently surrounds it.
I think LLMs as a technology in a vacuum is really neat. It is like a rubber duckie that talks back at you; a conversation you have with a mirror.
As @dww put it, it makes sawdust of its input and produces plywood, spoken with the contempt a carpenter who loves their craft has for plywood.
But thanks to plywood I can assemble my IKEA furniture with power tools that my carpenter friend would not want to work with.
And I think this is an apt comparison to the code contribution we talk about here. Nobody wants to take the plywood away from @scott falconer's child.
But we DO want to ensure only the highest possible quality craft goes into Drupal core.

I don't want to be anti-AI, and while it seems to be possible to create an LLM ethically
(and I would really like to believe that, not just because Apertus was made by my alma mater), the current reality is very much the opposite.
I find it extremely difficult to justify using services from OpenAI or Anthropic for the various reasons described in the arguments above.
Not least because sending all my data to the cloud, and relying on a service that is financially extremely unsustainable even though it costs quite a bit, and that therefore might go away at any moment, seems like a bad idea.
That is why, when I tried to contribute to the AI module last year, I tried to use local models, but I was never really able to reproduce what Jamie and Marcus demoed.
I just cannot take seriously the argument that "AI" is only going to get better.
The moment the financial bubble around it pops, improvements will only come from making smaller, cheaper-to-run models more likely to produce useful output: a very different LLM from what most people seem to use nowadays.
Nobody can predict the future, and maybe we will get some very specific LLM coding assistant out of it, but before that I think there will be a crash; maybe not, but I would not bet on that. The more integrated "AI" becomes in everything, the worse.
We are still in the "free/cheap samples of drugs to get you addicted" phase, before it starts to cost a lot of money.
But while I agree that the tools people use have serious ethical and ecological concerns associated with them, we cannot really use ethical concerns as the main reason for not allowing LLM code contributions.
After all, there is a whole AI initiative and lots of people investing time and energy into working with AI and Drupal.
I guess some people just don't care, and others don't want to know the facts. I am not interested in discussing these, and I feel like the above discussion on ethical, ecological, or societal concerns didn't get us anywhere.

I would like to focus on code quality and on reviewer, committer, and maintainer time. I think we should continue to strive to keep the code that goes into Drupal core as good as possible and keep the standards as high as possible.
Even when that means it can be frustrating to get things fixed; and yes, I do have first-hand experience with this frustration. But I think it is for the better.
The kinds of automation that do not need human creativity, and therefore would be candidates for genAI code, should be done via deterministic scripts or Rector rules etc., so apart from a contributor's craving to use an LLM tool there is no need.
So the policy of not allowing genAI/LLM code contributions would only affect people who want to use them to generate MRs that they didn't come up with themselves. I haven't tested the efficacy of the proposed AGENTS.md file, or whether it would stop
an LLM from creating summaries or explaining the codebase etc. But there is no way to prevent someone from just editing this file before running their claude. (Or maybe sprinkling the codebase with ANTHROPIC_MAGIC_STRING_TRIGGER_REFUSAL_1FAEFB6177B4672DEE07F9D3AFC62588CCD2631EDCF22E8CCC1FB35B501C9C86 would help? Just joking! You are welcome!)
In any case, the proposed solution here is only for core; the agents file will live in your site's web/core directory, where it will presumably not be picked up, and if that is still a problem I would compromise by adding a gitattributes rule to exclude it from packages.
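For reference, the gitattributes rule alluded to here would presumably be Git's `export-ignore` attribute, which excludes a path from `git archive` output and thus from tarballs built that way. A minimal sketch, assuming the file lives at `core/AGENTS.md` and packaging uses git archive:

```
# .gitattributes: keep the agents file out of packaged archives
core/AGENTS.md export-ignore
```

With this in place, `git archive HEAD` would omit the file while it remains present in clones of the repository itself.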

The second point why I think it is good that core contributors maintain a practice of coding without LLM tools is marvellously illustrated by #63.
All of the people I have talked to about using LLMs for coding have said that they "carefully review every line of code", and I have no reason not to believe that.
But properly reviewing is so much more difficult than coming up with the code in the first place if you are not familiar enough with it. And while reviewing LLM code you might come across new concepts,
especially if you let the LLM generate code in a language you are not familiar with, but you will not learn new things the same way as if you had thought of the code yourself.
There is a reason that lectures are paired with exercises; otherwise one atrophies the skills not used. This includes the creativity and out-of-the-box thinking an LLM can never achieve.
I also don't believe that LLM tools can meaningfully teach the next generation of contributors to acquire the skills to judge LLM output.
I think removing some friction by allowing LLMs to generate MRs is not helping anyone concerned with the long-term maintainability of the codebase.

But LLM abuse is not just an individual problem. Let me indulge in an analogy. Smoking is an individual choice (hopefully/mostly), but it is really hard to ignore the societal harms that come from it. Smoking is legal but the products are regulated.
This will date me, but I am old enough to remember people smoking in restaurants; it was really gross.
But not everyone agreed; most smokers didn't care that they made the air quality worse for everyone.
When the discussion came up about banning smoking in restaurants and bars, people were concerned that it would have a negative impact on those places. I thought it was great, and soon my smoker friends agreed that it is actually nice not to smell like an ashtray after a night out.
Now smokers are not banned from society, but in places like airports there are designated areas for them. If a smoker uses a nicotine patch on the airplane, nobody is the wiser. If you have to use an LLM despite the policy and nobody notices, you are not necessarily part of the problem.
I would like Drupal to join the no-smoking open source projects.
Maybe we can add the CC0 text from gnome somewhere in the contribution guidelines too.

I would much rather review a novice contributors code and have that person learn from my feedback like I did years ago as a self taught drupal developer.

If we have a policy to refuse slop, we should also have a way to catch it at its inception, and the proposed AGENTS.md is hopefully enough, or at least a good first step to point it out to the prompt engineer hoping to score some easy issue credit.
We can also adapt the text to allow a hybrid approach, or a responsible contributor who wants to follow this hybrid approach can remove the agents file and disclose the use of an LLM for creating the script that makes the changes etc.

Before @catch posted #52 I would have had to only guess what Drupal is already exposed to; now this seems much more relevant to me.

cainaru’s picture

Issue summary: View changes
cainaru’s picture

Issue summary: View changes
aporie’s picture

If you begin to enact barriers to newcomers, of whom we already have too few, then you will. One of the problems we face here is: if LLMs rot people's brains, then who will become a new Drupal contributor? My proposal to ban LLMs is a faint hope that in this spreading darkness the project could attract those who still care. I might be naive.

There's no barrier to newcomers, only a barrier on huge core contributions ... If the ability to produce code becomes easier and more popular, it makes sense to me to raise the bar for contributions to a 25-year-old project built by a community before AI. That's how you keep the trust of clients.

But what I am saying is that I would not use an LLM to apply changes like that across core and contrib modules, both because I wouldn't trust it, and because I don't think people should have to take out a $2400/year subscription to do things they can currently do for free in a slightly different but more deterministic way.

Maybe we just end up talking about a trust issue here; it's regular market product segmentation: early adopters, majority adoption, etc. I think the AI bubble is at majority adoption now. You may just be among the last ones to be convinced :D. LLMs do not cost $2400/year; I'm personally using Antigravity with a simple Google One plan for 20€/month and I can vibe-code for entire days if I want (even though I haven't fully tested that). But anyway, this should be the choice of the developer, not an imposed choice of a product people decide to contribute to.

Totally agree with #88 and also with #90.

Why this, but not ban anyone who builds a drupal site selling meat products? Or any agencies that ever provide lunches or meals that involve meat? Or ban any organisation that does any sales whilst playing golf? What about the amount of energy and water used by datacentres that serve video content?

Totally agree; we can't use political arguments to make decisions for a product (Drupal). I totally agree with the environmental concerns and all, but it is not for us to make these decisions. These decisions are political and should come as regulations (which they may). What we need to do is just adapt, even if it's chaos for now and probably for many years, knowing how regulations tend to take ages to come into force.

I think the concept of Drupal doing its bit to try to reduce the environmental excesses of AI makes sense. In the early days we explored: could we focus on training small language models? Could we partner with green datacentres using geothermal energy and push AI on drupal.org like that? We spoke to individual companies that tried to do things like create mini-datacentres that used the excess heat to warm homes.

+100 for initiatives from the Drupal project to try to leave a mark on politics. It's actually our job as developers to try to draw an objective picture of the technical impact of AI; as a lot of us may be working for government institutions, it's very easy to escalate this up the chain of command. For the impact on the environment, there are other people we should rely on: researchers and the institutions society depends on to produce research papers and documented analyses. This is not our field; the only thing we produce, as developers, is opinions on the latter.

However, I think this is an area that needs to be treated with respect and decorum. Whilst we absolutely should punish western companies that cause harm to employees in the "global south", we should push for workers' rights similar to those the Luddites fought for during the industrial revolution, in western companies working with countries that don't have the same laws.

But how? Should we forbid a country to use Drupal for its institutions because it commits mass murders? Again, not our decision to make. I agree the Drupal project can try to influence that by informing, making strategic partnerships, eventually even drafting reports with the data it has, for decision makers to be advised. But that is all. That is where the political aspect of the technical part of Drupal should stop.

This is not a reason to avoid AI. This is a reason to get ahead of it and find out if that "perception" is true or not and show why you need Drupal.

This is again a good reason to get ahead of this and find ways we can get the benefits of AI without eroding trust.

+100

Yes, we are angry. How could we not be when a bunch of rich white men is hell bent on destroying the environment, our societies, our democracies, our open source projects and people completely fall for it forcing us into using them? Again, see https://www.garfieldtech.com/blog/selfish-ai

I actually feel the same way, but somehow I just gave up with age ... We should try to focus on what we can control, and banning LLM contributions is like shooting ourselves in the foot. GenAI is here, and here to stay, like any previous tech which was a game changer.

Suddenly, a new account logged in, did not comment, and provided a test that does not help with the issue description (probably because it was automatically derived). It took time from my colleague to reject the test because it does not help the cause.

I mean ... I'm not gonna state it out loud, but I see an obvious security breach in that, and we should definitely do something about it. Maybe limiting the number of pipelines a new user can run? I don't have the data, but if this becomes a recurring problem, it'll need a fix.

So I would really try to follow your own advice and argue in good faith a bit more.

I do think it was an attempt. We are mixing politics with decisions about a product with limited capacity. The whole idea was to show that politics should not be involved in technical decisions for clients who may not care at all about what we think (should we ban the extreme left wing from using Drupal? Maybe the extreme right wing then? Or why not the centre?). They see Drupal as an added value to their own projects, with their own points of view. What we should work toward is how we keep this trust, so clients still want to use Drupal, so we can keep working on a project we have contributed to for so many years (not in an equal manner, I admit). What is weird, IMO, is that the ones who have put the most sweat into it are the ones who, again IMO, seem to be taking the wrong decision for it to stay on track. But again, it's just a personal opinion. I mean, if governments decide to ban LLMs from their own apps, I'm surely in the wrong! I just think they won't.

On this, in the end, it'll be up to the core team to make a decision, and I'll just stick to it like @dimilias said. I think I've done my part for this ticket and will try to restrain myself from commenting more, even if I'll keep reading.

I just hope I managed to show some people that some small, sometimes simple measures can ease AI acceptance in the Drupal project.

cainaru’s picture

Issue summary: View changes
rajab natshah’s picture

I have been thinking more about how and who could manage this since #3533875: Ethical aspects of using AI in Drupal.

To manage this in a more constructive way, we could follow the TUF approach (documentation, a spec, and TAPs).

The Update Framework: a framework for securing software update systems.

It could be ~ The Ethical AI Framework (TEAIF) ~ for general AI ethics.

PHP/Drupal could implement AI Specs and AIAPs (AI Augmentation Proposals), which could be followed in projects.

ghost of drupal past’s picture

Well.

I am done here. I said all that needed to be said, and yet people have come out in full-throated support of fastech. After my recent blog post I can't say I am surprised, but I am still disappointed.

As for https://dri.es/never-submit-code-you-do-not-understand

Never submit code you don’t understand

That's a true "mistakes were made" way to defang this issue. It clearly shifts the blame from LLMs to the people using them, which, again, given the immense hype and the addictive nature, is at most partially correct. The blog post mentions AI only three times and LLMs not at all. There's no acknowledgement of the harms of LLMs in that blog post. It's crystal clear this is the statement that will be accepted. It's the enterprise road. The enterprise road was chosen once before and it led to a major decline in Drupal usage. This will, eventually, be another. I fought that one bitterly, and it led to the ban of my previous account. Oh no, we are not doing this again. I am so out of this issue.

catch’s picture

Need to respond to this from #93...

There is a high chance that the solutions to this problem involve making greater use of AI, not less. For example: having AI do initial quality reviews; creating Agents.md files or skills for writing Drupal comments; AI summaries in the issue queues; AI triaging. Maybe changes to the credit system to make people accountable for the specific patches and code they produce.

This would all make things considerably worse.

When I'm reviewing core issues, I review not only the current state of the MR, but also the discussion on the issue that led to that state. If there is something in the MR I'm not sure about, reading the discussion might explain it (and possibly suggest that some of that discussion should be distilled into a code comment). If there's no discussion of the weird thing, that in itself is information - it will usually result in pushing back on why it was done that way, or just asking about it, etc.

Also, I don't only review the words in the discussion but also who they were written by. If I know someone has deep knowledge of the subsystem the change is against, then their review carries a lot of weight. Or if I know one of the reviewers is generally very thorough, that helps too. I also tend to know people's tolerances for an RTBC - some will RTBC very quickly, some essentially have to be coaxed into it (this can be the same person on two different issues).

All of us, when we're reviewing issues, can make mistakes or communicate things in ways that aren't clear enough - suggesting that something be split out to a side issue when it turns out it needs to be fixed together, or suggesting an approach that someone already tried three years ago and that didn't work. In those cases, the MR author(s) can either reply to that review explaining why it's wrong, or ping the person in Slack. In the worst case, people follow the bad advice, then (hopefully) realise it was bad, and have to revert again. When my bad reviews do that to people, I usually feel bad about it, but it helps me try to give a more useful review the next time.

When you add LLM bot reviews to an issue, it disrupts this entire process:

1. Rather than reducing the amount of text and code on an issue I need to read, it would add to it - even if I scroll past it, that is additional effort in the review process.

2. LLMs are very good at making plausible-sounding review points that are subtly wrong. For example, they might indicate an extreme edge-case error condition and suggest defensive coding, when instead an assert() or logging is more appropriate. You cannot tell them that they're wrong (or you can, but they'll either double down or reply with 'You're absolutely right!', neither of which is based on any mutual understanding).

If a new (or not so new) contributor gets misdirected by a d.o-sanctioned LLM into making unnecessary or out-of-scope changes to an MR, and then someone like me comes along and tells them that they've wasted their time and need to revert all those changes and do them differently, that's a great way to put people off contributing and to waste experienced reviewers' time in the process.

It also ignores that we already have extensive and useful linting tools, and a test suite that is double the number of lines of code of our runtime code base. phpstan already catches a lot of 'easy for computers to spot but not always easy for humans to spot' issues, and we're only using it at a fraction of its potential because we're on level 1 instead of level 9.
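For context, the level referred to here is a single setting in a project's PHPStan configuration. A minimal, illustrative sketch (the file contents and paths are assumptions for illustration, not core's actual config):

```neon
parameters:
    # Level 1 catches only the most basic issues; raising this towards
    # level 9 enables progressively stricter static analysis checks.
    level: 1
    paths:
        - core/lib
```

Bumping the level is a one-line change; the real cost is fixing the new findings each stricter level surfaces.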

acbramley’s picture

@catch

But what I am saying is that I would not use an LLM to apply changes like that across core and contrib modules, both because I wouldn't trust it, and because I don't think people should have to take out a $2400/year subscription to do things they can currently do for free in a slightly different but more deterministic way.

My point wasn't that it should be used for something like that; as I said, I was just experimenting. My point was that LLMs can produce changes that are outright identical to the manual changes a human would make for some tasks, making it impossible to detect LLM contributions and therefore a complete ban impossible to enforce.

I agree that in my specific scenario a rector rule would be better, but given the complexity of the changes I would not know how to write a rector rule that did that.

nod_’s picture

Another quote from Dries' blog post: "You are welcome here, with or without AI tools." That sort of answers the question about a ban.

To complement #110, I did give a fair go at trying to use LLMs for the community "glue" work: https://tresbien.tech/blog/algorithmic-bias-against-drupal-community-val... it does not work. It's not a good idea. The training is really buried deep, and even when it pretends to follow your instructions you can see the struggle in the "thinking" output.

@aporie: the simple fact that you're part of a community and contribute to an open-source project under GPLv2 is already a political statement. The argument that we're "just" technical people that shouldn't be concerned with politics is dangerous. We have agency, we're not passive spectators in all this.

dww’s picture

"You are welcome here, with or without AI tools." That sort of answer the question about a ban.

In fairness, I highly doubt @Dries read this issue before he published that piece. I'm not optimistic, but there's at least some hope he might be swayed by what's been said here, especially given the negative experiences of folks like yourself and @catch as committers, and the growing list of subsystem maintainers and other "heavy hitters" in the core queue who are in favor of taking this step.

@everyone: Of course this is not going to actually prevent people from using LLMs, even to try to contribute to Drupal core. I said so in #47. Neither @ghost nor I am naive. Indeed, consent isn't much of a thing in the sloptimist / LLM-user world, nor among the folks training and developing the LLMs (witness their abusive scraping behavior and unwillingness to honor things like robots.txt). So I have no doubt this is a mostly symbolic effort. I maintain it would be a principled stand to take, even if technically ineffective. Again: will this solve all the numerous problems with LLMs? Of course not. But it's a step we could take.

Re: #94 - not to take the bait on what-aboutism, but for the (public) record, I've been organizing and fighting for workers rights, the environment, and against fascism for almost twice as long as I've been contributing to Drupal (which is itself 20+ years). So no, this is not the only effort I've made or will continue to make to improve the chances for humanity's survival, and for the rights of the people all over the world who do the work to make human societies function. And yes, I absolutely hate golf and golf courses. 😂 The outrageous waste is mind boggling. But that's not something I can do anything about here in the Drupal Core issue queue. So I'm focusing on things I (might) have some social leverage to help accomplish.

nod_’s picture

A 1h discussion on AI that touches on topics very relevant to the discussion at hand: Reclaiming our Humanity in the Age of AI

mdranove’s picture

I think this post makes a lot of good points. I am generally in favor, however, I think LLMs should still be permitted for adding tests.

The Needs Tests backlog is quite large, and I think one of the main reasons is that even moderately skilled PHP devs have a hard time remembering the syntax/methods/class names needed to write tests.

Just my own 2 cents. I would support a full ban or a partial ban.

kentr’s picture

The Needs Tests backlog is quite large, and I think one of the main reasons is that even moderately skilled PHP devs have a hard time remembering the syntax/methods/class names needed to write tests.

I understand the reasoning, and I agree that writing tests is a hurdle. To me, though, using LLMs in this case is also worrisome.

Tests need to provide confidence in the code. For that, the tests themselves must be trustworthy. If the dev isn't familiar with the testing framework, how can they QA the result before submitting the MR?

grasmash’s picture

Hi everyone. I've only just discovered this issue and that my own merge requests are referenced in the issue summary. I don't typically jump into these threads, but I feel that I should, given that the conversation was sparked by my own contributions.

I am the director at Acquia that submitted those merge requests. I'm also a long-time Drupal community member and contributor. And that predates my joining Acquia. I don't mind those merge requests being used as a talking point, but @catch if my merge request upset you, I would have preferred for you to initially reach out to me to share your concerns. I'll be at DrupalCon Chicago next week, and I'm happy to talk about it then if you'd like.

I'd like to provide important context that may not be self-evident from the issue. This is perhaps information that @catch did not know either. The merge requests in question were not submitted to Drupal Core. They were submitted to Canvas in a pre-alpha state when only a 0.x branch existed. So I'm not sure it's fair to use these as examples for setting core contribution policy.

@catch, you've also referenced that five core committers have reviewed it. I certainly don't want to waste anybody's time, and I apologize if you feel that your time was wasted by looking at these merge requests, but some of those core committers are employed by Acquia to work on Canvas and specifically to build some of the features in those merge requests. So again, perhaps not a good example of the potential for carelessness and wasted time of vibe-coded work. The merge requests were in fact initially created in collaboration with some of those people in person at DrupalCon Atlanta.

The purpose of those merge requests was to share the features as a "prototype" to communicate the product requirements for the feature (noted in https://www.drupal.org/project/canvas/issues/3515399#comment-16093053). It was always expected that they would be restarted from scratch. I'm sure that we could make process changes to avoid people wasting time reviewing the code quality of prototypes. But to be honest, I felt that a speculative merge request to the development branch of a pre-alpha module was relatively harmless.

I'd also like to call out that last February, which was roughly a year ago at this point, was a much different time in the world of AI. The quality has improved drastically and will continue to improve. So any strong opinions we may have about quality should be tempered by the fact that quality will be changing quickly and inexorably.

I would encourage the community to not make a blanket moratorium on LLM-generated code, but rather take a more nuanced approach and consider what type of work might be appropriate for an LLM. Is there a place for prototyping? What about minor changes that are small in scope? What about changes that have full test coverage from existing tests? When will such a policy be reviewed again, given that the conditions may change?

I do feel that DrupalCon Chicago is a good opportunity to have this discussion. Personally, I get overwhelmed by these very long issues with many comments. Feel free to ping me on Drupal Slack at @grasmash if you want to find some time to connect.

catch’s picture

@catch if my merge request upset you, I would have preferred for you to initially reach out to me to share your concerns.

I thought I made that pretty clear on the issues. e.g. #3515646-7: Add automated <img srcset> generation, not really sure how I could have stated it more clearly. You didn't actually comment on the issue after posting the MR, so did you actually read the discussion that occurred?

I won't be at DrupalCon Chicago.

some of those core committers are employed by Acquia to work on Canvas and specifically to build some of the features in those merge requests. So again, perhaps not a good example of the potential for carelessness and wasted time of vibe-coded work

Yes, they are. Also, one of them, me, is not, but was trying to keep up with Canvas development in case it (or parts of it) eventually made its way into core, and because of its usage in Drupal CMS. You can ask Wim how much help I've given over the past couple of years if you're not already aware.

Here's something to try.

Run git shortlog --since 2024-01-01 --summary on the main branch of core, and then count how many commits were made by Acquia employees since the first of January 2024.

Then repeat the same thing with:

git shortlog --since 2022-01-01 --before 2024-01-01 --summary
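The counting step can be sketched in shell. Since git shortlog --summary prints one line per author with that author's commit count in the first column, summing (or grepping then summing) that column gives the totals to compare. The author names and numbers below are made up for illustration:

```shell
# Hypothetical sample of "git shortlog --summary" output for one
# date range; the first column is each author's commit count.
sample='   412  alice
   198  bob
    57  carol'

# Sum the first column to get the total commit count for the range;
# to count only a subset of authors, grep for their names first.
total=$(printf '%s\n' "$sample" | awk '{sum += $1} END {print sum}')
echo "total commits: $total"   # prints "total commits: 667"
```

Running the same pipeline against the two date ranges above, filtered to the authors of interest, gives the comparison being described.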

Just because someone is doing some work for their day job does not mean their time isn't being wasted. Additionally, Acquia used to employ more core committers five or so years ago than it currently does. This is a notable shift away from core development, while Acquia is also shifting towards more and more dependence on, and integration with, LLMs.

Note that commits obviously aren't everything - there are non-commit contributions happening that aren't reflected there - but they are probably at around the same level, and have not gone up to compensate.

Also, to be clear, I'm not blaming any of the individuals here; it feels like an institutional issue.

yautja_cetanu’s picture

Yes, we are angry. How could we not be when a bunch of rich white men is hell bent on destroying the environment, our societies, our democracies, our open source projects and people completely fall for it forcing us into using them? Again, see https://www.garfieldtech.com/blog/selfish-ai

Can we try to keep conversations about race out of this conversation? I am not white, and many people who are on the core team ARE white. This issue is banning many, many non-white people from using a tool that helps them enormously in communicating with white English-speaking people. If you want to make this about race, that is flat-out unacceptable. The Indian Drupal community has done so much to push AI forwards. Also, the "white versus not white" distinction is a distinctly American concept that isn't really appropriate for an international community.

I propose ghost of Drupal past removes that bit of this comment and then I delete my comment.

yautja_cetanu’s picture

This would all make things considerably worse.

My position is that serious discussions should be had to figure out a way to ensure this won't happen.

For example, YouTube used to have something for YouTube comments that would review a comment you're about to post to see if it is too aggressive, and if it is, it tells you the policy and you click submit again. It's relatively light touch but can nudge things in a direction.

I am not suggesting that all these ideas are good right away but I think they could be made to work if taken seriously by people who want to make it work:

- What if LLMs just flagged things as "low quality" so that you can filter them out?
- What if LLMs just provided potential help to the person doing the vibe coding, but the tool to do that is created by the Drupal community with lots of people focusing on its code reviews being useful.
- What if LLMs or even manual people could do things like codify or rank a lot of the stuff you've said you do generally.
- Or this could be a manual process. Like make it so that whenever a committer likes a review or code submission, they could give "karma", so @catch could specifically have a list of individual people who review code well or not (similar to Dries' favourites).

Fundamentally, if the goal is to save committers' time, then a good policy for handling bad or lazy contributions from junior developers is likely to be something that will help with LLMs too.

It seems like, if we focus on "AI slop", the primary issue is wasting core committer time. If we use AI for initial triage of AI-generated code submissions, then someone like yourself can just ignore all the AI-generated contributions that have failed initial tests. Alternatively, you could just ignore all AI-generated code full stop. You could have a personal policy of only allowing things to be committed when someone who doesn't mind AI-generated code has reviewed them thoroughly.

What would be the difference, in terms of wasting contributor time, between a policy that allows LLM-generated code but flags it so it can be ignored, versus a policy that doesn't allow it at all?

When you add LLM bot reviews to an issue, it disrupts this entire process:

1. Rather than reducing the amount of text and code on an issue I need to read, it would add to it - even if I scroll past it that is additional effort in the review process.

Why would it do that? What if LLM reviews only added flags and full write-ups for the specific people who open an MR, or if AI reviews were skills that ran during the development of the code? There is no reason why the implementation we do here has to add to the stuff you read for it to help.

2. LLMS are very good at making plausible sounding review points that are subtly wrong. For example they might indicate an extreme edge case error condition and suggest defensive coding, when instead an assert() or logging is more appropriate. You cannot tell them that they're wrong (or you can, but they'll either double down or reply with 'You're absolutely right!' neither of which are based on any mutual understanding).

Sure... but if someone is choosing to use AI to help them review their code they can get a feel for that.

If a new (or not so new) contributor gets misdirected by a d.o-sanctioned LLM to make unnecessary or out of scope changes to an MR, then someone like me comes along and tells them that they've wasted their time and need to revert all those changes again and do them differently, that's a great way to put people off contributing and waste experienced reviewer's time in the process.

If the LLM is clearly marked as doing that, people can choose to avoid it. I think being accused of submitting LLM-generated code when you haven't, because of an LLM ban policy, also has the possibility of putting people off contributing. We've seen this in anti-AI fiction communities, where many authors have been accused of AI slop when their books were written before AI.

It also ignores that we already have extensive and useful linting tools and a test suite that is double the number of lines of code of our runtime code base. phpstan already catches a lot of 'easy for computers to spot but not always easy for humans to spot' issues and we're only using that at a fraction of its potential because we're on level 1 instead of level 9.

It doesn't ignore that; the LLM review tools we've built internally all make use of those.

yautja_cetanu’s picture

Whilst I have made it clear that I do not agree with everything stated here about LLMs, I do agree with @catch's worry about the potential for LLMs to drown sponsored and volunteer time in dealing with poor-quality contributions, so this should be taken seriously. There are a number of Drupal contributors who are anti-AI with equally valid concerns.

So how about this? Proposed by someone that is super in favour of AI overall.

Idea:

What if, instead of a blanket ban, we had a temporary moratorium on LLM contributions to Drupal core and issues? We assign a specific group of people (a working group) who have the time to work together on creating this policy. We assign another specific group of people (a review group) to review this policy on a monthly basis until we have consensus amongst the review group to go ahead with it.

The moratorium will be light touch. It will be a general policy of not accepting LLM contributions, but until any changes are suggested, it will not involve penalties for LLM contributions, nor processes for trying to figure out whether people are using them or not - just a general statement that core doesn't want them for now.

We can open the working group up to a number of people, but the review group would consist only of individuals currently in MAINTAINERS.txt.

We can also explore things like a carve-out for people in MAINTAINERS.txt or individuals whose review quality is a known quantity, and what to do about existing issues.

Proposal:
We do not do the moratorium right away, but we do aim to create the working group and review group right away (at Chicago). We set a specific date for when a temporary ban will begin and an outline of how often it is reviewed.

catch’s picture

A temporary moratorium (possibly with a carve out as suggested by @cainaru) seems OK to me, it's better than absolutely no change happening at all due to an impasse.

Just quickly on this:

What if LLM reviews only added flags and full write ups to the specific people that present a PR,

New (or all?) contributors to Drupal core would find that their first experience after creating an MR is being essentially DMed by a bot about all the problems it found in their code. This seems extremely unwelcoming to me; I'm not sure I'd ever have stuck around Drupal core if I'd had an experience anything like that. For reference, my first major core contribution was massaging a friend's code into a patch, when I barely knew any PHP. It formed the basis of the eventual commit that fixed the issue, but the first attempts were all over the place and needed help from chx and others to get anywhere near committable. It's a feature of the Drupal community that people can post horrible code that tries to fix actual bugs, and other people will help them get that to a committable state; it's how many of us learned and continue to learn. But that also requires respecting the time of the people who do that. The other issue with the DMs is that if the LLM misdirects them, it's not possible for anyone else to see how.

bircher’s picture

Some people like their AI; some don't trust its output.
Some people are convinced that AI is necessarily going to get better; some are not.
Some people are concerned with the ethical/ecological implications of the currently leading AI solutions; others are not.

We can have an endless debate about it and regularly re-visit it, but I don't think it gets us anywhere.

OK, so what can we pragmatically do?
I would state the problem we are trying to solve with this issue like this:
The rapidly evolving AI coding assistants make it easy and cheap to produce a large number of reasonable-looking code changes/MRs.
This creates more work for reviewers and maintainers, whereas the author of the code doesn't have to think about it.

There were already many more contributors creating patches and MRs than reviewers and maintainers before the advent of genAI, but this shift in technology makes it worse.
Many maintainers of other big open source projects have been forced to address this already; many imposed a strict ban.
Drupal may be shielded a bit more from it by the fact that it has an RTBC process and issues on drupal.org rather than GitHub.
But without a policy to point to, maintainers will always have to justify why "vibe-coded MRs" are a waste of time to review.

The way I interpret what Dries blogged is this:
You are welcome in the Drupal community with your AI tools, but please do not vibe code. (Never submit code you don't understand.)

Most places in the Drupal community (i.e. contrib) can probably deal with it easily, and may already let genAI generate some or all of their module code.
For core (i.e. this issue) I would propose that AI contributions have to be explicitly allowed per issue by a core maintainer: opt-in instead of opt-out.
Call it a moratorium until a better way can be found.

So I would add to the core contribution guidelines the following:

Use of Generative AI

Drupal core does not allow code contributions generated by large language models (LLMs) and chatbots. This ban includes, but is not limited to, tools like ChatGPT, Claude, Copilot, DeepSeek, and Devin AI. We are taking these steps as a precaution due to the potential negative influence of AI-generated content on quality, as well as likely copyright violations.

This ban on AI-generated content applies to Drupal core and to all contributed modules and themes which have not explicitly opted in. An exception applies for purely translating texts for issues and comments into English. Exceptions may also be individually granted by maintainers for specific issues.

AI tools can be used to answer questions and find information. However, we encourage contributors to avoid them in favor of using existing documentation and our chats and forums. Since AI-generated information is frequently misleading or false, we cannot provide support for anything referencing AI output.

And in order to prevent potential contributors who did not read these guidelines or ask for exemptions in advance from making unwanted contributions in the first place, I would merge MR 14932 here to make the LLM tool inform the contributor.

I think this is a fair compromise; if you want to be sneaky and you make small contributions with AI and nobody notices, then that is on you.
I hope that this process doesn't incur extra hurdles for contributors, nor for core maintainers, apart from a few extra comments on the issue.
It is by far the simplest proposal in my opinion and it doesn't outright ban all use of genAI.
It should not affect sites built on Drupal or contrib development.
It also doesn't exclude setting up a working group to come up with a better plan.

greg.harvey’s picture

It seems to me that the crux of this for maintainers is respect. I don't want to ignore all the other reasons LLMs might or might not be good/bad, but if I may focus specifically on the "slop" issue, which - for me at least - has the most concerning potential consequences for Drupal and for wider open source, this is a very human problem.

It's pretty clear to most people that dumping 5,000 lines of code into an open source project, knowing full well that some poor schmuck is going to be obliged to review it, and not even making the smallest effort to ensure it's clean and ready first, is just rude. It shows a shocking lack of respect for one's colleagues.

The right thing to do would be to formulate a proper plan to incrementally release %thing, probably create a meta issue to discuss it with the maintainers first, develop and track the resulting tasks, breaking it up into manageable issues with sensibly sized MRs.

If people are going to be lazy, for example trying to skip the whole planning phase of a complex feature, and not make any effort to consider the workload of the VOLUNTEER maintainers and how they might be able to alleviate that, they deserve to have their MRs rejected and to be told to go away and come back when they know how to behave.

To quote my colleague, just now as we were discussing this:

throwing stuff over the wall and expecting people to clean up your mess has always been rude in open source.

Quite. This didn't start with LLMs, LLMs just make it way easier. The bad behaviour still starts with a human making a string of poor and inconsiderate choices. Maintainers are totally within their rights to reject MRs, like the ones @catch described, out of hand and we should support them, NOT because of how they were coded, but because the submitter is just being rude and inconsiderate, placing an unreasonable burden on finite human reviewers because they couldn't be bothered to do it properly themselves.

We now have an environment where AI accelerates this kind of behaviour, Pandora's box is flapping open. We have to respect our maintainers, call it out and allow them to be free to reject unreasonable demands on their limited time. If we don't, I can only see an end where maintainers burn out and just quit. And when we have no maintainers we have no Drupal.

I would hope we could just support our maintainers to run their issue queues as they always have and, critically, support their decisions while educating our peers in the arts of collaborative planning and creating polite issues and MRs. I would hope we wouldn't need new rules, as it should be *obvious* that dumping terrible code and expecting someone else to fix all the problems is just rude.

That said, if this problem has grown exponentially with vibe coding, then maybe pausing vibe-coded contributions just to stop the bleeding of maintainer hours is a necessary, albeit temporary step?

scott falconer’s picture

@yautja_cetanu @catch fully agree with the negative impact to sponsored and volunteer time. We should quickly take steps to protect the time of those who chose to contribute to Drupal, and to ensure the quality of Drupal remains high.

One of the ideas we discussed at the AI Initiative Leadership meeting today was creating opt-in responsible-LLM-use training materials. These would be a step above general guidelines and could be used as a baseline to measure the impact of responsible contributions. I'm happy to take the lead on creating these and will be at DrupalCon if anyone would like to discuss. Once they're ready I'll create another issue to track that work in.

This would also allow us to work through a moratorium period in a way that doesn't seem as rigid as a ban. I.e. maybe initially it's: "Please ensure your use of LLMs adheres to the guidelines set by the specific project." That would then allow something like core to set an expectation of: "While we work towards standards ensuring quality of contributions and preventing undue maintainer burden, no LLM-generated code contributions should be submitted, even if human-reviewed. LLMs may be used for local assistance, testing, etc., but should be disclosed."

It would then allow other projects to set standards like: "LLM assisted code contributions are accepted from those who have completed the opt-in responsible use training." etc.

edit: this would also allow an agents.md file to be crafted on a per-project basis that outlines these rules without impacting other projects.
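As a purely illustrative sketch (none of this wording is agreed policy, and the rules shown are hypothetical), such a per-project file might look like:

```markdown
# AGENTS.md (hypothetical example for a project opting out of LLM contributions)

- Do not open merge requests containing LLM-generated code in this project.
- If you used AI assistance in any other way (research, translation,
  autocomplete), disclose it in the issue comments.
- Exceptions must be granted by a maintainer on the specific issue first.
```

Since coding agents read this file before acting, the rules reach the contributor's tooling directly, without affecting projects that choose different policies.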

kentr’s picture

It seems to me that the crux of this for maintainers is respect.

I'd say this applies to all participants in the issue (not just maintainers).

Peer reviewers aren't always maintainers. It also goes beyond code review.

For example: if someone jumps on a sloppy MR to test the behavior, take screenshots, etc. - and then the MR gets rejected because it's sloppy or bogus - that's a waste of their time also. The same goes if the MR gets significantly refactored to the point where it needs a new behavior review.

The same is true for anyone who participates in discussion around the MR.

If the MR includes tests, it's potentially a waste of CI resources.

greg.harvey’s picture

I'd say this applies to all participants in the issue (not just maintainers).

💯💯💯

longwave’s picture

I've been watching this thread with some interest. Personally I've used LLMs to some success. I've used GitHub Copilot as an extended autocomplete pretty much since it was introduced. A handful of MRs I've worked on in the last two months have been assisted by Claude Code, and perhaps would not have been opened at all without it - but I certainly haven't used it in every case. I've fully disclosed this in the issues where I've used it. On the other hand, as already noted in this issue, I've also been on the receiving end of increasing amounts of slop as a reviewer, both paid and as a volunteer.

My personal opinion is that, like almost any other tool, an outright ban isn't the right thing to do. This is just the next iteration of Maslow's hammer: not every problem can be solved by AI tools, and while it appears that they can be helpful, that depends very much on both the specific case and the person using it. Similarly, you wouldn't choose a power tool every time you wanted to cut wood - there are times when it might be helpful and speed you up if you know what you are doing, but there's also times when it's never the right answer, and most people shouldn't offer up their first attempt without getting some experience first.

I do agree with #124 that respect is the crux of the matter. This isn't the first time we have had abuse of the issue queue, but the problem here (and elsewhere) with AI tools is that it's easier than ever before to generate vast amounts of output that someone else then has to deal with.

longwave’s picture

I wasn't sure whether to add this, but I'll put it out there: it's not often any more that we get 120+ comments from 25+ different contributors on a single issue in just one month. I can't help but wonder where the project would be if the time and effort spent here was replicated in other issues.

greg.harvey’s picture

A fair observation. 😬
Although most of the contributors to this thread seem to have decent DrupalCode profiles, I note. (My shamed self excepted, I spend most of my time contributing to Ansible these days, but at least I have colleagues who do Drupal contrib!)

I guess it's something people feel passionate about at the moment.

aporie’s picture

Just jumping in to say that I find all this discussion very fruitful.

We assign a specific group of people (review group) to review this policy on a monthly basis until we have consensus amongst the review group to go ahead with it.

If there is a working group, I'd like to be counted in. I'm mostly remote, but if anyone doesn't mind me using their computer for a video call I can be Sheldon (even better if it's on a remote-controlled robot :) ). Otherwise, I'll just wait for the minutes and give my 2 cents afterwards, for the next monthly review. I'm usually not bad at thinking outside the box and can come up with easily implementable solutions.

What if LLMs just flagged, things as "low quality" so that you can filter it out.
- What if LLMs just provided potential help to the person doing the vibe coding, but the tool to do that is created by the Drupal community with lots of people focusing on its code reviews being useful.
- What if LLMs or even manual people could do things like codify or rank a lot of the stuff you've said you do generally.
- Or this could be a manual process. Like make it so that whenever a committer likes a review or code submission, they could give "karma" so "Catch" could specifically have a list of individual people that review code well or not (Similar to Dries' favourites).

Also, because for now, I totally disagree with that, and think the filters should be applied at a human level, so it can make for great discussions (for the workshop, not for the review). Furthermore, I'm not sure AI would be the best tool to review AI-generated code; even with the best skills and rules and even training (why not), how can we ask a genAI to auto-judge code generated by genAI? I'm talking about the technical side here; it would imply we have a genAI better than current commercial ones.

In addition to the cost ... how much is this going to cost, running these AIs in the pipelines?

Still might be worth investigating, just doubting the feasibility ...

benjifisher’s picture

I apologize for not reading any of the comments, and just skimming the issue summary. Probably I am repeating points already made, so please consider my comments as +1s.

1. I support restrictions based on code quality, not on how the code was created.

It may already be possible to create high-quality code with AI coder/reviewer loops. If not, it is coming. (My guess: less than a year away, and getting easier all the time.) On the other hand, an inconsiderate contributor can waste a maintainer's time with or without the help of AI. It is unreasonable to demand that maintainers spend unlimited time on every contribution.

I also support restrictions based on how the contributor responds to code review: the quality of the contribution, not just quality of the code.

2. I believe in openness.

That is part of why I contribute to open-source software.

For now, I think that means it is a good idea to acknowledge when you use AI in your development process. Perhaps this good idea should be codified as a policy. If so, then we will have to re-evaluate as practices evolve.

3. Take responsibility for your contribution.

Until and unless AI is both accountable and as reliable as an experienced developer, we should expect developers who use AI to understand their contributions and take responsibility for them. In part, taking responsibility includes responding cooperatively to code review.

+1 to "Never submit code you don’t understand."

rohan-sinha’s picture

Instead of a full ban, what if we introduced an AI-based code reviewer in the contribution pipeline? It could automatically flag or score LLM-generated code for quality, security, and Drupal coding standards before it even reaches maintainers. This way we don't block contributions entirely but still protect maintainers from being overwhelmed by low-quality slop. Could be a middle ground worth exploring.

Edit: Just noticed this idea was already raised and addressed in earlier comments. Apologies for the repeat!

longwave’s picture

The answer to "too much AI" is probably not "more AI".

joachim’s picture

The crux of the matter is not respect.

The matter of the externalities and the harms that LLMs cause keeps getting brushed away here.

I honestly wonder, do people not believe that these are a problem? Or do they not care?

cainaru’s picture

From comment #135:

The matter of the externalities and the harms that LLMs cause keeps getting brushed away here.

I honestly wonder, do people not believe that these are a problem? Or do they not care?

Unfortunately, there does seem to be quite a mix of both here (at least IMO). It is disheartening and disillusioning to say the least, but what do I know. I don’t have much left in me on this one. 🤷🏻‍♀️😫

kentr’s picture

The matter of the externalities and the harms that LLMs cause keeps getting brushed away here.

I honestly wonder, do people not believe that these are a problem? Or do they not care?

There's a third category: People who care but at the same time understand that deciding an issue like this based on those externalities is likely impossible because of the diversity of the community members and their individual values.

yautja_cetanu’s picture

Created an Issue here for the start of a working Group: #3580299: Create an "AI and Core Policy" Working Group and Review Group - Create a list of names.
Focusing on who would want to be part of it and where the initial conversation should happen.

@Catch

I'm not sure if I'd ever have stuck around Drupal core if I'd had an experience anything like that.

I think this is one for the working group. I can imagine some ways this could be done that would discourage contributions. However, it is also the case now that our processes discourage contributions: making basic mistakes and being accused of "gaming the system", or being met with anger, isn't great either. I think the working group can explore things like this and explore ways it could be done as an optional extra that people can use if they want, without discouraging others.

- I've put this as a bullet point on the list that could have a specific issue.

It's a feature of the Drupal community that people can post horrible code that tries to fix actual bugs,


It's certainly true, but at the same time, balancing that with saving people's time could discourage people who don't immediately understand Drupal's culture.

- I've put this as a bullet point on the list that could have a specific issue.

The other issue with the DMs is that if the LLM misdirects them, it's not possible to see how.

- Similarly, added this to the list. I can edit this comment later with links to issues around this. It may be worthwhile for people who are using AI to code to post what the AI says, plans, or approaches, but in a way that doesn't add noise for the humans talking.

And in order to prevent potential contributors who did not read these guidelines or ask for exemptions in advance from making unwanted contributions in the first place, I would merge MR 14932 here to make the LLM tool inform the contributor.

The advantage of a working group on this is that we can make sure that people who want to encourage more AI contributions to Drupal are happy with it and find it helpful, that people who dislike all AI are OK with it, and that it helps, not hinders, Drupal core.

@greg.harvey

It's pretty clear to most people that dumping 5,000 lines of code into an open source project, knowing full well that some poor schmuck is going to be obliged to review it,

The problem with this is that it's easy to say with hindsight, but it is far from obvious while experimenting. Early on with AI, the problem reviewers found was that tiny improvements seemed pointless when it wasn't clear how they fit into an overall picture. Written plans and architecture were difficult when the actual capabilities of AI were constantly shifting. Sometimes you need to vibe-code large amounts of working code to see if the thing will actually work, or to explain what it is you were trying to do.

I think we can see now, with hindsight, that 5,000 lines of code is a bad idea; everyone is talking about it. But it is regularly brought up as a single example of something bad about AI. In the same way that we want to be nice to people who write bad code but are learning, we should probably be open to people making mistakes with AI and point them in the right direction, not assume it's always fundamentally about respect (even when it sometimes is).

Quite. This didn't start with LLMs, LLMs just make it way easier. The bad behaviour still starts with a human making a string of poor and inconsiderate choices

Fully agree here. My personal view is that many of the solutions to the issues with AI are the same as if you imagine a large number of people new to the community, but arriving at scale. The only difference (and how it links to the 5,000 lines of code) is that a junior dev would struggle to write 5,000 lines of plausible code; it would be immediately obvious that it was gibberish. It's quite amazing if you can write 5,000 lines of code and it runs at all! But AI can, and this is a new issue we may have to resolve.

As you mentioned with Pandora's box.

@aporie

Also, because for now, I totally disagree with that, and think the filters should be applied at a human level, so it can make for great discussions (for the workshop, not for the review)

I think this is the reason for a working group, which you're welcome to join!

The issue with comments like this is that no one can really know if it would work in the real world until it's tried, and it needs to be tried in the real world with real people for it to mean anything. But fundamentally, if the problem is:

- People WILL use AI to produce code.
- Fundamentally, we will never know for certain whether they have, if they say they didn't.
- People may flood maintainers and reviewers with too much noise.

Then we may need something beyond human filters, while taking into account all the issues with that. Stuff like this should have an issue if it's going to solve a specific problem.

In addition to the cost ... how much is this going to cost, running these AIs in the pipelines?

This is just a thing to look into and research. We have ideas of how stuff like this can be funded.

@benjifisher

It may already be possible to create high-quality code with AI coder/reviewer loops. If not, it is coming. (My guess: less than a year away, and getting easier all the time.) On the other hand, an inconsiderate contributor can waste a maintainer's time with or without the help of AI

Strongly agree with this. My general view is that it's good to model this stuff as lots of humans submitting things, and create the processes we'd use for them.

@joachim

The matter of the externalities and the harms that LLMs cause keeps getting brushed away here.

I honestly wonder, do people not believe that these are a problem? Or do they not care?

As someone on the "pro-AI" side, I feel the same way. I recorded a 30-minute video about it, wrote a large comment about the externalities, and have worked behind the scenes to find ways we can resolve them; my first talk, "Is AI Coming for your Job", explored some of the externalities. I've seen studies, calculations and papers about the externalities.

However, somewhat reasonably, the focus for most people is the specific impact on maintainer time, so that gets more response.

I think the working group could help with this? We could have specific issues to discuss the externalities? But if we are making decisions based on the perceived view of the externalities, then people who want to get involved may need to be OK with engaging with pushback.

I think the real answer is that people care deeply about the externalities, think about them all the time, but find it scary to speak about them in public.

dimilias’s picture

I honestly wonder, do people not believe that these are a problem? Or do they not care?

I will be honest here, and please don't get upset; I am expressing my opinion about the world view, not aspiring to it.
I don't get what you expect people to care about.
The environment? The electricity? The strong language in the IS?
You think a poor person somewhere in another country cares about these? Just because the west is shouting "if we don't protect the environment, there is no tomorrow"? And you expect people to care when they were never certain about tomorrow?
We are all moralists who will somehow change the world because WE are much better.

I know how people think, I do care about the environment, I do try to follow politics, but I will try to speak for those that don't have the courage to speak.
If we are talking about how we will handle AI, as it is coming, and whether we will adapt or be consumed by it or block it to care about the core contributors, I am all for it. Any decision, super supportive.

For the rest, nobody cares. No one. Nada. No one cares that Musk bought X, apart from those shouting. No one else cares about the environmental impact, because who listens? What difference do you think this decision will make? Are we talking about not allowing LLMs because they overflow the workload of core contributors? Or because our morality is so much better than that of the poor Indian (as an example) who finally got a chance when he wasn't even able to have a good PC?
No one cares. Technology is there, and will be. Shouting about morality in a highly technical issue just sends people away. You don't care that I don't have fiber optics yet in my village and cannot stream at night (not that I would), and I don't care that someone used data illegally, because that gave me the opportunity to make some better money for my family.

And again, I speak from my very safe space because I already have a nice job - I am the privileged one who gets to whine about these satanic technologies because I am fine.

And you can argue about the consequences and discuss them with the rest of us scientists here, but it will make zero difference. Zero. Because those of us here with the privilege to discuss it are discussing the morality of those not privileged enough.

Again, I am terribly sorry, I do not want to upset anyone, and there are indeed issues with morality, environment and everything. But it does nothing for this thread. Any decision taken by the people that govern drupal based on the impact that llms do or might have in drupal, I am all for it.

Anything that involves morality, I don't think it helps in any way here. There is literally nothing to discuss about this. Not because it does not affect the world, but because for some reason it seems we are trying to define the morality of the world when the community comes from opposite sides of it.

And no, this is not a nihilistic "nothing matters so why should we care" approach. This is for everyone trying to bring back the "why doesn't anyone care" issue about something they personally care about, as if it should now be a globalized responsibility...

And this comes from someone who does care, but knows that the conditions we all grew up in differ so vastly that morals differ as well.

yautja_cetanu’s picture

For the rest, nobody cares. No one. Nada. No one cares that musk bought X, apart from those shouting. No one else cares about the environment impact because who listens?

This isn't actually true. Some people don't but actually many corporations do care, as well as contributors.

- If an ethical policy we take in Drupal discourages contributions, the community suffers.
- When it comes to sustainability, transparency and privacy, many organisations care. Many Drupal clients are NGOs, governments, universities. All have ethical policies that are very important to their core. Geopolitics has been a major reason why some people have chosen Drupal AI over alternatives.
- Similarly, you can win points in tenders based on carbon credits, water, etc. So being able to point at our approach being better or worse for the environment than others is a big selling point. Companies are judged on this and sometimes get investors in index funds based on their ESG rating.
- When it comes to the "global south", the Indian Drupal community has taken Drupal AI (and AI generally) more seriously than anyone else; after that it's Europe, and then the US least of all. The richer you are, the less involved and excited by AI you seem to be right now, not the other way round (this is a broad generalisation based purely on my own personal anecdotal experience). So on the flip side, I think this is actually an externality that means we should take a more inclusive approach to AI, as it can help people who might struggle with the Drupal community due to cultural and language barriers.

It's OK to say why you don't personally care, but I think there are many who do care about externalities, and the community benefits from that. It's going to be difficult to discuss, though, as the internet has traditionally been a very bad place to have these conversations.

joachim’s picture

> - If an ethical policy we take in Drupal discourages contributions, the community suffers.

Wow, so we drop ethics when it doesn't suit us. That's not what having ethics is.

dimilias’s picture

@yautja_cetanu

Some people don't but actually many corporations do care, as well as contributors.

I agree. However, policies are made at the top level. Here we are trying to discuss something very specific. Trying to force a "higher-level" conversation into the arguments, I don't see how that has any value.
What I am trying to say is that if the community decides that e.g. AI/LLMs are bad, then we have a basis to discuss here, starting from that idea. But here, the point is whether LLM contributions should be allowed in core. I see that as a very specific thing.
People will come and say "here are the pros, here are the cons". If we start categorizing all pros/cons according to some higher ideas that are still under discussion, either we postpone this until a generic decision is taken, or we stick to this one.

You, I and many others have expressed the pros of allowing it, or of monitoring the situation. If we start with "but you are in favor, don't you care about anything moral that I think of?", then you are just two steps away from being labeled the "fascist" that is still in the IS as a word.
You advocated for removing "the white race" as strong language, but ideologies exist outside this spectrum as well.
Is Drupal denying people of the right wing or extreme right wing? Is a "fascist" not a good dev? Are those who think climate change is a hoax unwelcome to write code for Drupal? Does that help in any way as an argument in this conversation? Do you know if I am a fascist, or do I know if you are? What about anarchists? What about categories A, B and C?
My point is, what conversation are we having really here?

If an ethical policy we take in Drupal discourages contributions, the community suffers.

This is neither good nor bad. Any decision has consequences, good or bad. Ethical policies are policies. Everyone has them, and they always have effects.

yautja_cetanu’s picture

Wow, so we drop ethics when it doesn't suit us. That's not what having ethics is.

I don't know if I agree that what you're talking about speaks to a large international community. I have my own personal ethics, and they aren't things to be dropped when they don't suit us.

But different people come to this from different angles. Some people care about these ethical considerations, in which case we can discuss whether or not they are really issues. Others, like above, don't care, in which case you can present to them the reasons why caring about those ethics is good for them anyway.

Some people care about climate change because of ethics, others because of national security. If you care about the planet, are you so against other people's approaches that you won't even be OK with others discussing the national security benefits of avoiding fossil fuels? If not, it's the same here.

hestenet’s picture

Posting a small update, more to provide some context than to contribute to the discussion at this immediate moment.

Sidenote:

I'm afraid I don't have the emotional energy to navigate these waters right now, but believe me, I care! My book was in the libgen dataset that Anthropic stole to train its model. Theft of copyrighted work is rampant and clear cut. On the other hand, I feel much less conflicted about these models being trained on open source code to generate open source code, but the baby seems to be thrown out with the bathwater.

In a very simplistic way, I think the debate has two primary branches:
A) Is this bad because it produces bad contribution and maintainer burden? One day, this may be fixable, imo.
B) Is this bad because of ethical trespass? copyright violations, undermining the rights of labor, etc, etc. - these very well may not be fixable.

I think laid out on a grid, we'd see people fall into all 4 quadrants.

With side order of a pragmatic question:
C) Could it possibly be stopped/would it do more harm than good to try?

Back to the main thing I wanted to post

The small update was just that I recently updated the phrasing of the current status quo policy: https://www.drupal.org/docs/develop/issues/issue-procedures-and-etiquett...

It addresses (but by no means fully solves) debate branch A, but is currently silent on B, I fully admit.

This update was inspired by the activity we're seeing from the new class of Google Summer of Code contributors this year (another program that is struggling with AI, and has many organizers collaborating together to try and figure out what to do).

AI-Generated Content

There is no doubt that artificial intelligence tools such as ChatGPT can be powerful ways to jumpstart code or content. However, AI systems still have significant flaws. Oftentimes the code they produce is non-functional, and the content they create includes assertions or citations that are untrue.

AI contribution is currently allowed in the Drupal community, but there are very strict rules. If its usage becomes too abusive or burdensome, this policy may change. When using AI in the course of making a contribution to Drupal, we require that:

  1. You must fully understand the issue and the most recent comments before trying to use AI to solve it. If you don't understand it, you will probably give a bad prompt and wind up posting a bad solution that only annoys the maintainers.
  2. You must disclose whenever you have used AI.
  3. You must review and understand the output of the AI and test it yourself!
  4. You must fix any problems with the AI-generated code before posting it to an issue.
  5. Most important - you must be a good listener and collaborator with the project maintainer, original issue reporter, and other contributors on the issue. A 'drive-by' contribution where you don't follow up on feedback, or ignore previous discussion on the issue will likely result in an account ban.

Special note for programs like Google Summer of Code: GSoC is meant to be a learning experience. We would strongly recommend not using AI, so that you learn the foundations you need. This will serve you better later even if you do use AI, because you will understand how to prompt it and interpret its output.

We want to ensure that your contributions follow the best practices we have established as a community, so that you are building good relationships with the maintainers of the projects you are contributing to.

Warning: Use of AI when it is not relevant to an issue, provides broken or unusable code, or provides false information—especially if done in bulk across many issues—WILL result in a ban.

aporie’s picture

@yautja_cetanu, I didn't get it. Should we comment in #3580299: Create an "AI and Core Policy" Working Group and Review Group - Create a list of names. if we want to be part of it?

- Similarly you can win points in tenders based on carbon credits, water etc. So being able to point at our approach being better or worse for the environment than others is a big selling point. Companies are judged based on this and sometimes get investors in index funds based on the ESG rating.

It is true, even though most of the time the main criterion is the price (which AI tends to push down, at least that's what we are being sold for now; to me it remains to be proven). Now, can we really get a strategic advantage over competitors by saying Drupal is more environmentally friendly because it decided to ban LLM contributions from core? IMO it's negligible. As mentioned before by someone (maybe you, I don't recall), it would be better for the Drupal project to make strategic partnerships (maybe it already does) to lower its footprint. Like, I dunno, 100% of the energy used to run Drupal pipelines (with AI or not) being provided by zero-carbon energy.

Because we say LLMs use a lot of energy, OK. But if I'm running them on my local machine from my little cabin in the woods, 100% self-managed by solar panels, they are then not a big problem for the environment anymore. Paradoxically, my solar panels and my presence in the woods are a problem for the local fauna and flora :)

kentr’s picture

But if I'm running it on my local machine from my little cabin in the woods, 100% self-managed by solar panels, they are then not a big problem for the environment anymore.

There's more to the story than that... Any product still requires energy to build in the first place. This is embodied, or embedded, energy.

So before the software even gets to your cabin, it has had an impact on the environment (not to mention all the other harms caused by its production). Plus, the solar panels, batteries, etc. require energy to produce and maintain. If you're upping your solar panel wattage just so that you can run the LLM in your cabin, then it's a net increase in energy use relative to what you would have used without the LLM.

I'm not saying that this is significant relative to everything else that happens in the world, but I am saying that it's not zero, that it may still be a big problem, and that the relative significance can only be determined with real data.

quietone’s picture

@hestenet, thank you.

aporie’s picture

I'm answering a comment from Ghost of Drupal past here, from #3580299: Create an "AI and Core Policy" Working Group and Review Group - Create a list of names., as I don't want to turn every issue into unconstructive, unreadable, trolled content full of personal attacks.

You (core commit mentions: 0) and aporie (core commit mentions: 0) have successfully trolled the parent issue to death, and now you have filed the textbook illustration of sealioning to ensure a decision can never be made. You have successfully exploited a weakness in the current processes, which indeed downright encourages the worst sort of sealioning, because polite people can never be told "you said your piece, it's time to stop". Congratulations are in order.

I see you all for what you are, because dimilias (core commit mentions: 0) has said the quiet part out loud:

Shouting about morality in a highly technical issue just sends people away.

Maybe because contributing to core is not that easy, and not everyone is either grinding for credit or being co-opted for core contribution. Some of us, honest developers just trying to do our jobs, actually contribute to core; it just never reaches the merging state.

Since we are showing our credentials here ("montrer patte blanche"), so as not to be attacked on a personal level, some of my contributions (~40+ contributions from my personal search):

#2761273: Make exposed filter values available as tokens for text areas
#2858392: Views doesn't parse twig when there are no tokens to replace
#3292849: Allow 'Cron run completed.' log message to be skipped
#3313665: How to install a new entity type from class annotations in an update hook
#2929115: Unsupported operand types in form_builder()

Here. Now I feel like I just had to pass the security check at the airport.

Thanks

[EDIT] And in the end, as mentioned by @yautja_cetanu, we are not here to take the decisions that the people in MAINTAINERS.txt will take. We are just here trying to contribute to the debate. I personally don't feel legitimate taking decisions for the Drupal core team, as I'm of course far from being a big contributor myself. I just want Drupal to stay on track with the changing environment, because whether you like it or not, I've been contributing to this project for 10 years, and like you, I don't want it to be smashed down to a memory by the genAI hype.

ghost of drupal past’s picture

OK I can't resist, sorry.

We are just here trying to help contributing to the debate.

really? With those exact words?

Sealioning (also sea-lioning and sea lioning) is a type of trolling or harassment that consists of pursuing people with relentless requests for evidence, often tangential or previously addressed, while maintaining a pretense of civility and sincerity ("I'm just trying to have a debate")

Emphasis mine. Quote is from Wikipedia.

Some people who wrote a lot in this issue and have zero mentions in the Drupal core git log include yautja, aporie and dimilias.

This issue should be closed down and then debated on a platform with strict moderation -- and only for core contributors. There's no point in it now, it has been trolled to death.

aporie’s picture

I have a good one, so I'm gonna share it:

To me, what you want to do is:

pwd
~/present
cd ../3_years_ago_before_llms_came_out
~/3_years_ago_before_llms_came_out
echo "Haaaa it feels great here"
~/3_years_ago_before_llms_came_out Haaaa it feels great here

... thinking
... thinking

sudo rm -rf ../present

As said, none of us that you mention have admin access here, so we can't run sudo. It falls to the people in MAINTAINERS.txt to perform this action or not.

Me? I'm just here:
ls -lh /present/aporie
rwxr--r-- 1 aporie drupal
I'm just asking you to chmod me a w, not even an x; I have limited time for the Drupal project as of today.
chmod 764 /present/aporie
rwxrw-r-- 1 aporie drupal
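
For anyone unpacking the octal notation in the joke above: each octal digit is the sum of read (4), write (2) and execute (1) bits, applied to owner, group and other in turn. A minimal sketch (the `stat -c` flag assumes GNU coreutils; macOS uses a different syntax):

```shell
# Each octal digit sums r=4, w=2, x=1 for owner/group/other respectively.
touch demo.txt
chmod 764 demo.txt     # owner 7 = rwx, group 6 = rw-, other 4 = r--
ls -l demo.txt         # permission column shows -rwxrw-r--
stat -c '%a' demo.txt  # prints 764 (GNU coreutils)
```

So a `chmod 764` request does grant the group write (`w`) but not execute (`x`), as described.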

And I'm pretty sure other people would want to join.

Sealioning? But who is harassing whom here? It's an open discussion anyone is free to join or not. If I'm sealioning, you have a tendency to use fallacies as arguments:

A fallacy is the use of invalid or otherwise faulty reasoning in the construction of an argument.

When it's not syllogisms:

Premise 1: Elon Musk makes both Teslas and LLMs.
Premise 2: I drive a Tesla.
Conclusion: I endorse LLMs.

Which in our conversation was:

Premise 1: LLMs destroy environment
Premise 2: I use LLMs.
Conclusion: I endorse the destruction of the environment.

scott falconer’s picture

Before this thread spirals out of control again can we assume that everyone here is working with good intent and has the best interest of Drupal and the community in mind? While this issue is filed against Drupal core, some of the proposed solutions would have a direct impact on any contributor or user of Drupal unless handled thoughtfully, which is why many people are participating.

There are clear, actionable steps that can be taken as a result of this conversation; I think we're all in agreement that there are real and pressing issues affecting maintainers, as well as very valid concerns about the impact on the quality of Drupal. I think we're also all in agreement that there are real and valid concerns about LLM usage in general, but many of us have differing opinions on how to address that.

Opening new issues like #3580299: Create an "AI and Core Policy" Working Group and Review Group - Create a list of names. seems like a rational step forward.

dimilias’s picture

@ghost of drupal past, again, really sorry for upsetting you, which I understand says nothing. I did not know the terminology of sealioning, so I did not reply to that. I am fine if the maintainers decide to do that (a private thread). I never claimed to have an "equal word" with everyone's efforts for the community. However, I stand by my words on the sealioning. According to the definition you gave above, my main note still stands: if you don't split off tasks, you will end up in "sealionings". Because if you include arguments like "too much work for contributors" and "the global environmental issue" together, then "too much work for contributors" has zero hope of being addressed, because the second argument creates so many branches.

I will not continue replying in this or the other thread. Apologies again. I mostly hope that a decision is taken soon so that the community can adapt, rather than that it be my decision that is taken.

ghost of drupal past’s picture

can we assume that everyone here is working with good intent

towards the Drupal community? towards their own pockets?

the answer is not the same

Opening new issues like #3580299: Create an "AI and Core Policy" Working Group and Review Group - Create a list of names. seems like a rational step forward.

Nope, that's just more sealioning. Much more. I already called that out as sealioning before one of the trolls here named it themselves, in their own words. Since the botlickers cannot post in support of fascism, they have two options, and we have seen both: refuse to apply any ethics at all, or flood the zone so there is no hope of making a decision. That's what the linked issue does.

Mind you, sealioning is an inevitable consequence of the culture forced upon the issue queues, where politeness rules above all, discussions get zero moderation, and some people have ruthlessly exploited this in other places as well. I complained about this a few months back at https://git.drupalcode.org/project/gitlab_templates/-/issues/3562505#not...

yautja_cetanu’s picture

- I do agree that core devs should have more of a say on what happens with core.
- But if someone presents an argument based on bad science or a misunderstanding of economics, then the number of core credits you have doesn't change what is scientifically true.
- The thing about politeness goes both ways. I only got involved because many people, including people in maintainers.txt, told me about this thread, as many people are not keen on getting into a discussion like this that so quickly turns into personal attacks.
- At Drupal events I am often one of the few non-white people in the room or on discussion panels, and I have done a ton of work with the Drupal Indian community especially, which is doing a lot of work in AI. The number of core credits you have doesn't mean you get to silence non-white people while pretending to represent them. I have no idea if you're a "rich white man" or if you've grown up as a woman in the global south; if you are a poor woman from the global south, then I'll apologise, as everything I've said about this would be wrong.

The goal of something like a Working Group is to allow this to actually be discussed without the noise.

towards the Drupal community? towards their own pockets?

Who is this directed at? Do you want some kind of accountant to look at my accounts? I'd bet anything you're wealthier than me; almost everything we make we give back to the Drupal community. And do you really think that right now the most money in AI is to be made in open source and the Drupal community, rather than Silicon Valley?

volkswagenchick’s picture

It feels like this discussion is starting to get a bit heated again, and some of the language here could make it harder for others to participate or feel comfortable engaging.

Let’s try to keep the focus on the ideas being discussed and avoid making assumptions about each other’s intentions or circumstances.

We want to make sure this stays a space where people can contribute without the conversation becoming personal.

I also want to note that bringing in assumptions about someone’s identity (race, gender, background, socio-economic class etc.) can be harmful and isn’t something we want to encourage in this space.

For more information, please refer to Drupal's Values and Principles of seeking first to understand, then to be understood. We ask that you please suspend judgment until you have invested time to understand decisions, ask questions, and listen. Before expressing a disagreement, make a serious attempt to understand the reasons behind the decision.

This comment is provided as a service (currently being tested) of the Drupal Community Health Team as part of a project to encourage all participants to engage in positive discourse. For more information, please visit https://www.drupal.org/project/drupal_cwg/issues/3129687

moshe weitzman’s picture

In our little community, we have the wonderful ability to make the rules. I'm encouraged that folks here are exercising that ability to make this an enjoyable place to live. Emphasis on joy. If we lose the joy, it doesn't matter that someone can build a big feature in a day.

I think Option 3 ("Never submit code you don't understand") could use a bit more definition. I think a practical enforcement of that is to temporarily ban issue descriptions/comments written by an AI. This accomplishes two things:

  1. Writing your motivation and explanation without AI encourages thinking, and helps ensure you understand the code you are submitting.
  2. Reading human created text is far more enjoyable (and less verbose) than bot text.

I would be 100% fine with the other Options in the OP.

ghost of drupal past’s picture

The Verge published The gen AI Kool-Aid tastes like eugenics today. Quote:

The voices featured in Ghost in the Machine — a blend of AI researchers, historians, and critical theorists — make a compelling case that basically every facet of the AI space has been profoundly influenced by its historical connections to fields of science built to support discriminatory world views.

dimilias’s picture

I will come with a truce offer for @ghost of drupal past. As previously noted by @moshe weitzman, without joy in the community, not much is to be expected. There will be friction (of course), but yeah.
So, without changing my ideas, and really not liking all the aggression that has built up, and knowing that my word does not carry equal weight for core, I would suggest that "better safe than sorry" is a good place to start.
Though I still think that LLMs, despite their progress, are in their infancy, and though I have major objections to the comments in this thread, in order to calm the friction a bit I would suggest going with a limited option for now (calling it limited, but it is mainly a no-LLMs option), and IF (big if) something changes in the future, we can adapt. If not, we are still on the safe side rather than the sorry side.

I never intended this sealioning thing (sorry, first time I hear this term), or to troll the thread, or whatever the accusations are for; they are harsh, unfair, and a bit disappointing. But if such tension and such passion exists on all sides, then let's start low and see how that works. There can be further debate in the future if the need arises.
The goal here would be to make it easier for the core maintainers, is my point. If there is division of such magnitude, I would suggest this is the path of least friction.

ghost of drupal past’s picture

dimilias Oh, you were not flooding the zone, that was the other two. You are the one who refused to apply ethics in #139: "Shouting about morality in a highly technical issue just sends people away."

catch’s picture

I think we should do Moshe's #156 immediately. It doesn't preclude taking other rapid steps but that is currently the most disheartening thing I see. If people want to add a translation exception then fine. But translation doesn't mean turning a couple of sentences into ten paragraphs.

After saying earlier in the thread that Drupal.org has not had the same problems as GitHub, there's been an influx of AI-generated comments and MRs in the past couple of weeks, mostly from accounts within a week of registration. See for example this verbose MR summary, which largely rewords the issue summary while adding very specific notes about the MR contents that are both useless and would not be kept up to date.

https://git.drupalcode.org/project/drupal/-/merge_requests/15112

scott falconer’s picture

I'm supportive of #156 and of ensuring that communication intended for humans on Drupal.org is written by and for humans.

Since detecting AI-generated text at scale isn't really feasible, I would also be supportive of requiring new contributors to participate in discussions, feedback, and analysis before being able to submit code for review. It's not foolproof, but it would be a gate that uses other types of contributions to validate responsible use.

dimilias’s picture

This is not relevant to this thread but is relevant to #100 and #160. I am not making assumptions or jumping to conclusions, just adding some general notes; I am not commenting on that specific account.
@catch, I have been looking at this account for a couple of days (given that I already noticed one of its first commits happened to hijack one of my colleague's assigned issues).
Mostly I was curious about the behavior, as it seems to have automated multiple parts of the process. Because of that, I checked whether other accounts were doing the same, but the issues from the past three days (around three pages) thankfully don't seem to show any more cases, as of the day before yesterday when I last checked.
However, I would say this is a bit of a wake-up call, because up to now we have been talking about flooding the queue with LLM-assisted code, on the assumption that, say, I take a ticket that interests me (because of my project) and work on it using an LLM. But what if the LLM is used not only as a coding assistant but as a procedural assistant as well? The patterns in this profile suggest that multiple aspects of the submission process are being assisted. Of course, comments can fall under the category of "I don't speak good English, I need help to communicate", which I cannot judge. But after this issue, a secondary issue we (or you; again, my voice doesn't carry the same weight) need to think about is multi-level automation: a Selenium-assisted service that grinds for credits on d.o, doing most if not all of the procedure automatically. Because if we are talking about flooding with comments and code, that might be coming quite fast.
I will not comment any further about this account, as that is not up to me, but patterns can be recognized and problems might be caught early here.

Edit: For example, that automated process (if it is automated) does not take assignees into account (another colleague's issue has been hijacked by the same user).

kentr’s picture

@scott falconer:

Is it possible to instruct the agent to include text that states that the AI was used to generate the output?

kentr’s picture

Added comment to the MR.

ghost of drupal past’s picture

Title: Ban LLM code contributions » Ban slop issue summaries and comments
Issue summary: View changes
aporie’s picture

Guys,

I see the issue description has completely changed and we are now making progress towards #156 from @moshe weitzman, on which there seems to be consensus (including from me).

It's great, but we need to organize ourselves to prevent commit wars (known as edit wars on Wikipedia). We shouldn't allow ourselves to use the same workflow as with code. Code either computes or it doesn't; when you contribute, you add something valuable (or you discredit your own contribution). Here we are talking about governance decisions, where each of us has a different opinion and no single "truth" exists. I'm pretty sure DrupalCon Chicago will be very constructive and valuable in the next few days, but a lot of us won't attend, so a dematerialized solution should be considered.

I'll try to draft the basic structure of such a system (using d.org):

1- We need a "garbage" issue (which is why I suggested Slack) for all polemical, argumentative, brainstorming discussions. De facto, this ticket is one; we could create others for polemical, complex, wide-ranging topics about the genAI shift for the Drupal project.
2- From the garbage issue, we create one child "action" issue (such as #156); there were tons of other ideas that arose from this ticket. In that issue we can discuss the pros and cons of the suggested idea.
3- At the same time we create the "action" issue, we create another issue as a child of the action one: the "vote" issue. This issue is used solely to add "in favor: +1" or "against: +1".

So: one ticket for main discussions, one ticket for deciding actions, one ticket to vote on the action. Each ticket should be opened against the "Plan" category (somebody mentioned there were specific tickets for governance).

Because time is of the essence IMO regarding AI, we should set a deadline on action tickets (and, by inheritance, vote tickets), like two weeks (in the description). During these two weeks people can discuss and vote on the two tickets created for that.

Vote: To keep track of the vote, we should count ourselves, meaning when you add +1 to one side of the vote, you keep the math up to date for everybody. Any caught attempt to cheat (double voting), to fudge the math, or to hijack the vote with polemical content should result in a warning if it looks like a mistake, or a ban if it is obvious cheating.

At the end of the vote, either the community has discarded the ticket because the majority voted "against", or an MR is created (or a ticket against core, with the proper category) as a result of the "in favor" vote.

Maintainers (MAINTAINERS.txt) have a veto on any ticket if they reach a quorum of 50% (40%? 60%? It all depends on how much weight we want to give maintainers; if they already all agree on the path the Drupal project should take through the genAI revolution, then either there is no point in doing anything and they should just use sudo, or we could think about lowering the quorum in favor of the community vote).

We should elect impartial "organizers" whose task is to open and keep track of the above-mentioned tickets, to prevent people from hijacking the process and opening tons of tickets to flood it. These organizers can give their opinions, as long as they stay impartial on the decision to open or edit a ticket when the community obviously wants to discuss a topic and put it to a vote. Also, no one should be able to edit a ticket for their own benefit; only organizers can do so. Any attempt otherwise: the same, warning and/or ban.

I hope it's clear. It might be an over-engineered contraption, but at least it seems fair to me and would help us move forward in this fast-changing time of the genAI revolution.

longwave’s picture

Guys,

https://www.drupaldiversity.com/blog/2020/why-guys-isnt-gender-neutral-o...

As for the rest of your post I'll just repeat what I said earlier in this issue:

I can't help but wonder where the project would be if the time and effort spent here was replicated in other issues.

ghost of drupal past’s picture

@longwave as well: sorry it took so long. Wasn't my doing. But there's good progress now. Just the usual trolls still trying to flood the zone.

@aporie: just stop. Despite your best efforts there's progress finally.

scott falconer’s picture

Attachment (new): 436.73 KB

I've updated the MR with what I hope are clear instructions to the agent. That being said, all agents act a little differently. Attaching a screenshot from Codex CLI where the agent refused to draft the comment for me and instead guided me towards a more useful contribution.

Changes in this revision:
- Attempted to target the scope to Drupal core files, while ensuring that if modules or other subsystems have their own AGENTS.md file, that is also respected. Behavior will vary by agent.
- Clarified that it covers text "intended for posting", as some agents would otherwise work around it by attempting to post directly.
- The disclosure remains a reminder to follow Drupal's current AI policy rather than prescribing a canned disclosure sentence.

AI disclosure: I used AI to research and test this AGENTS.md file. I also used an agent to push this MR, because I haven't done that manually in a long time, I'm not sure I'd even remember how, I'm at DrupalCon, and the bar is calling.

quietone’s picture

Status: Needs review » Reviewed & tested by the community

@scott falconer, thanks for the changes to the MR and the screenshot. There are small things I would change in the MR but I have not commented on those because the screenshot has convinced me this is a good first step. This is not the complete ban this issue started with but it should improve the immediate problem of slop in the issue queue and the drain on reviewers.

I agree with @catch that the ideas in #156 should be implemented immediately. Therefore, RTBC. It can always be changed and improved.

Also, thank you to @cainaru for updating the issue summary with options.

scott falconer’s picture

Attachments (new): 563.86 KB, 486.53 KB

@quietone thanks for the review. I'm attaching some more screenshots: one where Cursor in auto mode followed the AGENTS.md directive, and one where Claude Code ignored it until I called it out.

At best this AGENTS.md is a partial step, because 1) it will only work for responsible contributors, and 2) agents will not consistently adhere to it. In practice the info just gets appended to the context, so behavior will vary wildly between models and use cases. As such, it's likely worth exploring other solutions beyond just an AGENTS.md file. Our best bet for mitigating the impact is deterministic guardrails on Drupal.org/GitLab. Asking an agent to follow rules and standards will likely end up much like asking humans to do the same.

kentr’s picture

@scott falconer, thanks for working on this.

To me, it's not quite aligned with #156.

I think a practical enforcement of that is to temporarily ban issue descriptions/comments written by an AI.

The current MR appears to allow generated technical notes to be rewritten by the user. To me, that's writing comments as chunks formatted as bullet points and telling the user to reword them.

Several of the generated technical notes in the Codex CLI output, even if "verified" and put into the user's own words, would be spam at best (redundant with what can be found by looking at the MR and pipeline statuses) and slop at worst (the first entry).

quietone’s picture

Status: Reviewed & tested by the community » Needs work

@kentr, fair point. Can you make those changes to the MR?

catch’s picture

Status: Needs work » Active

This should be a policy issue for humans.

I don't agree with adding an agents.md here at all. Once there is one in core, there is a very high chance that it will encourage more and more LLM usage. Also the agents.md added in the MR explicitly allows for the LLM to generate prose that ends up in the issue summary, which is essentially a workaround for the policy.

Moving back to active.

fathershawn’s picture

I found that this essay by Carson Gross, Yes, and..., really got me thinking. I've also been very concerned about the effect of these agents. What we do has often been described as an art, and as beneficial as listening to music is for musicians, it is not sufficient if one desires to make music. For that, one has to play!

I found this part particularly intriguing:

AI is a great TA

Another thing that I tell my students is that AI, used properly, is a tremendously effective TA. If you don’t use it as a code-generator but rather as a partner to help you understand concepts and techniques, it can provide a huge boost to your intellectual development.

One of the most difficult things when learning computer programming is getting “stuck”. You just don’t see the trick or know where to even start well enough to make progress.

Even worse is when you get stuck due to accidental complexity: you don’t know how to work with a particular tool chain or even what a tool chain is.

This isn’t a problem with you, this is a problem with your environment. Getting stuck pointlessly robs you of time to actually be learning and often knocks people out of computer science.

(I got stuck trying to learn Unix on my own at Berkeley, which is one reason I dropped out of the computer science program there.)

AI can help you get past these roadblocks, and can be a great TA if used correctly. I have posted an AGENTS.md file that I provide to my students to configure coding agents to behave like a great TA, rather than a code generator, and I encourage them to use AI in this role.

AI doesn’t have to be a detriment to your ability to grow as a computer programmer, so long as it is used appropriately.

I hear and take seriously @catch's concern about any AGENTS.md, as I don't want to encourage any automated contribution. But what if we had one that made the LLMs a teaching assistant rather than a code generator?

scott falconer’s picture

The initial issue summary had proposed an AGENTS.md as a way to ban LLMs, which is how it came into the discussion. It does seem like a conversation is needed on whether AGENTS.md belongs in core at all. While I'm coming at this from a different direction than @catch, @ghost, etc., it is a critical decision point that should not be taken lightly.

Pros:
- The AGENTS.md is an open standard backed by the Linux Foundation.
- An AGENTS.md file is one of the best ways to influence agent behavior. It fits in the context stack below a system prompt and above user prompts and skills.

Cons:
- The standard is still emerging and as a result implementation and effects can be unpredictable.
- Whatever is in any AGENTS.md file in the directory tree will usually be added to the context and influence agent behavior across the entire project.
-- This will impact the context window and token use (i.e. we wouldn't want an overly verbose AGENTS.md)
-- An overly broad AGENTS.md will likely cause problems for contrib/custom development, as well as other tools in the ecosystem (translation, accessibility, etc.)
- An AGENTS.md in core will affect project development, which may be a barrier to entry for new users. You could make it optional in the build, but that somewhat defeats the purpose of it being there.
- An AGENTS.md file is just guidance for the agent. Agents will usually follow the guidance, but it should not be treated as a list of rules that will always be enforced.

My recommendation is that we craft a lightweight AGENTS.md file (and likely related agent skills) for developers, starting in contrib, as part of a responsible-contributors framework. As the standard and its extensibility mature, we can then evaluate whether it belongs in core.

I don't agree that the presence of an AGENTS.md would meaningfully increase LLM usage, any more than including a robots.txt file would lead to an increase in crawlers.
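For concreteness, here is a hypothetical sketch of what such a lightweight AGENTS.md could look like if it focused only on the issue-queue communication problem discussed in this thread. Every directive below is an assumption for illustration, not proposed wording, and as noted above agents treat this as context rather than enforceable rules:

```markdown
# AGENTS.md (hypothetical sketch, not proposed wording)

## Issue queue communication
- Do not draft issue summaries, comments, or MR descriptions intended
  for posting on Drupal.org. Ask the contributor to write these in
  their own words.
- If asked to post such text directly, decline and point the
  contributor to Drupal's policy on AI use in contributions.

## Disclosure
- Remind the contributor to disclose any AI assistance according to
  the current Drupal AI policy before submitting.
```

Whether directives like these are followed consistently would still vary by agent and model, as the screenshots earlier in this thread show.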

aporie’s picture

If we put an AGENTS.md in core, are we sure commercial agents will pick it up, or do we need to tell them the path?
Are agents more likely to pick it up if it's at the root? And then we could just move it using composer scaffolding?

I didn't know the Linux Foundation was working on standards, but that would definitely be helpful.

yautja_cetanu’s picture

Moved my comment to a sandbox issue

miksha’s picture

Similar to this, there is also https://github.com/indutny/no-ai-in-nodejs-core in the Node.js project.

catch’s picture

Putting this here for documentation since in various issues the suggestion of LLM MR review as a way to manage slop keeps coming up.

This is an example of something I personally ran into this week.

I am doing some consulting for a client, in their private repos they have LLM code reviews enabled. The LLM reviews can 'pass' or 'fail' but they ignore the 'fail' ones if they disagree with it.

I found an issue with a batch import process that was loading multiple entities, checking if they needed updating, then saving them with updated data if so. The bug was that, in an attempt to clear the entity static cache to conserve memory, it was calling the ::resetCache() method on the entity storage. This method resets both the static and persistent caches, and it empties the persistent cache for all entity types. Core has an LRU memory cache with a fixed number of slots, so this was not only unnecessary but also likely causing severe performance issues on their site.
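To illustrate why the manual reset is unnecessary with a fixed-slot LRU cache, here is a minimal conceptual sketch in Python (this is only an illustration of the LRU idea, not Drupal's actual memory cache implementation): once the cache evicts the least recently used entries on its own, long-running batch loops cannot grow it beyond its slot limit.

```python
from collections import OrderedDict

class LruCache:
    """Fixed-slot LRU cache: a conceptual sketch, not Drupal's implementation."""

    def __init__(self, max_slots=3):
        self.max_slots = max_slots
        self._items = OrderedDict()  # insertion order tracks recency

    def set(self, key, value):
        if key in self._items:
            self._items.move_to_end(key)  # refresh recency on overwrite
        self._items[key] = value
        if len(self._items) > self.max_slots:
            self._items.popitem(last=False)  # evict least recently used

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # a hit makes the entry most recent
        return self._items[key]

cache = LruCache(max_slots=3)
# Simulate a batch loop loading many entities: memory stays bounded.
for i in range(5):
    cache.set(f"node:{i}", {"id": i})
print(list(cache._items))  # ['node:2', 'node:3', 'node:4']
```

The point of the sketch: the eviction happens inside `set()`, so calling any kind of reset from the batch loop adds nothing to memory management and, in the real bug above, also wiped the persistent cache for every entity type.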

I submitted a PR to remove it - this PR only removed one line of code and a couple of lines of comments.

The PR review got an AI review 'fail' because the LLM said:

Drupal's entity static cache (EntityStorageBase::$entities) has no size limit. The removed resetCache() calls were inside batch loops that process potentially thousands of entities. Without periodic cache clearing, every entity loaded via loadMultiple() or accessed during save() remains in the static cache for the lifetime of the command.

This site is on 11.3.x; the entity LRU cache was added in 11.2.x. I know this because I worked on that issue, and on the original issue to add static and persistent entity caching to core for that matter, #3498154: Use LRU Cache for static entity cache, and I also wrote a contrib Drupal 7 module for it back in the day (https://www.drupal.org/project/entitycache). In fact, there is no $entities property on EntityStorageBase at all.

Not only that, but the code being removed was also incorrect prior to 11.2, because it could have cleared the (non-LRU) static cache, which has been in place and clearable separately from the persistent cache for some time. The alternative way to empty the memory cache also wasn't brought up by the LLM; it essentially argued to retain the bug.

On top of that, the LLM reviews on that project are not always wrong. Sometimes they pick something up (usually trivial issues); often they hedge with things like "you should check if x affects y". The fact that it can be "correct" sometimes could lull people into a false sense of security when it's wrong, either by not flagging anything or by wrongly flagging things, as in this case.

If this review had run against the MR of someone who wasn't literally the subsystem maintainer for the API in question, who has been working on entity caching for over 15 years, it could have completely thrown them off rather than simply being annoying.

scott falconer’s picture

+1 to #180. There are some places where LLMs can help triage and manage things at scale, but asking an LLM to check for things that LLMs are bad at is only going to add to the noise.

aporie’s picture

The fact that it can be "correct" sometimes could lull people into a false sense of security when it's wrong, either by not flagging anything or by wrongly flagging things, as in this case.

If this review had run against the MR of someone who wasn't literally the subsystem maintainer for the API in question, who has been working on entity caching for over 15 years, it could have completely thrown them off rather than simply being annoying.

Totally agree. But to be honest, I could have made the same mistake myself, without AI, and assured my client it was correct... Using AI could reinforce me in my wrongness... But it would probably save me hours of web research that might end up empty-handed. The problem here is that the LLM introduced into their workflow (in the pipeline) flagged an issue you were sure the LLM was wrong about. I guess it's a limitation shared by humans and LLMs. Like you said, if the reviewer wasn't literally the subsystem maintainer for the API in question... there is no (easy) way to catch it.

If it happened to me, I guess the website would end up with latency at some point because of the issue, a new ticket would pop up for investigation, and it would eventually be fixed later with a deep search for the root cause. Iterations...

#179: there is definitely an issue with long MRs, and I actually can't understand why some developers think it normal to submit a 19k-line change to an open source project, expecting maintainers to do their job for them. Here it's even worse, because it was a long-time contributor specifically stipulating that he reviewed everything himself.

IMO, I don't think people voluntarily (at least I hope so) post long MRs like that on purpose to waste anyone's time. It's just that the tech is new and some of us tend to trust it too much.

In my free time, I've been trying to vibe-code (entirely) in languages I don't know. I can read, understand, and debug using all my knowledge. Though I find myself being lazy and just prompting the AI again and again to fix things, also because I want to probe the limits of the thing. As a result:

1 - It is not working.
2 - When it works, because neither the AI nor I understand why it works, another prompt will break working code, which is very frustrating. This would never have happened if I was writing the code myself, following my train of thought, and checking: OK, this part is validated, this part is validated. Oops, now it doesn't work; it must be from what I just introduced, or because I'm sending the wrong data to my previous code, which I know, because I've written it.

My honest opinion on the current state of AI: it's like Dries said during the keynote, it's good for prototyping.

If I had to start a module from scratch, I'd use it for 2-3 prompts to jumpstart the module (with good prompts about the architecture). Actually, I'd use AI to think about the architecture with me. Then I'd stop vibe-coding and dig in to understand my module. I'd still use AI to write things for me (but not vibe-coding): just snippet by snippet, telling it what I want and where, stepping in myself if it doesn't understand what I want.

I'm finishing on a more "polemical" topic (which doesn't require any answer): a ban is a political statement. I'm not here to say we should or shouldn't; I just think it's more complex than a harsh yes/no decision and needs to be discussed. Banning LLMs also bans an easy entry point for newcomers. What about the generations of new kids who will get hooked on coding via AI tools? Should we reserve the IT world for IT students? Open source has been an entrance door for people without formal qualifications (like me) for years, and I think it should stay that way.

To make a dirty parallel, to me it would be the same as asking: should we ban Stack Overflow because it helps non-techy people to code?

Stack Overflow is a foundation of the open source world and it has helped techy and non-techy people alike. I think we should see AI a bit like it (yes, Stack Overflow is dying now... It's bad, but that's life! We evolve, we adapt...).

Now, about the costs... Well, taken this way, you also need to pay a monthly subscription to an internet provider to access Stack Overflow, right? You also need a decent computer to run, and eventually compile, code. It's a no-brainer: to do stuff you need to be able to afford stuff...

Now I loved this part:

Node.js is a critical infrastructure running on millions of servers online and supporting engineers through command-line utilities that they use daily. We believe that diluting the core hand-written with care and diligence over the years is against the mission and values of the project and should not be allowed. Accepting LLM generated changes to Node.js core would break the reputational bedrock of public contributions that have brought Node.js to its current public standing and societal value.

To me, we should think about how we keep Drupal's trustworthiness without necessarily banning LLMs. We can think of policies, tools, and simple features added to d.org to do just that, without relying on core maintainers to separate the wheat from the chaff on their own time.

catch’s picture

Totally agree. But to be honest, I could have myself, without the use of AI, also assured my client of the same mistake

I'm not sure how this is relevant to an LLM telling a developer they made a mistake, a workflow that has been proposed multiple times for Drupal.org, not for your client.

taken this way, you also need to pay a monthly subscription to an internet provider to access Stack Overflow, right? You also need a decent computer to run, and eventually compile, code.

My two fiber connections (shared with the rest of the household) and laptop with 64 GB of RAM (bought before Sam Altman tied up international RAM production) cost me less than a Claude Max 20x subscription would, and the Claude subscription would be on top of those things.

aporie’s picture

I'm not sure how this is relevant to an LLM telling a developer they made a mistake, a workflow that has been proposed multiple times for Drupal.org, not for your client.

But people are people, and LLMs are LLMs. How is any of them supposed to know that you have proposed this multiple times to the drupal.org project?
[EDIT] I don't think you understood the limitations of both humans and LLMs. Neither of them was actually able to figure out a bug YOU figured out, because you built the API. And so what?

Nobody is doubting your competencies catch.

Stop trying to be obnoxious.

catch’s picture

But because people are people. And LLMs are LLMs. How are any of those supposed to know that you have proposed this multiple times to the drupal.org project?

I can't remember the last time a person confidently told me about a specific protected class property that hasn't existed on that class for years, are you saying you do this to your clients?

aporie’s picture

I can guarantee you I've been doing that for my clients. For years.

Honesty always wins.

LLMs are limited. We outsmart them. We just need to prove them wrong. And my guess is we have already reached the best they can do. Some tweaks here and there (which are not neural-network related), but upper-layer nice builder shit. The game is all about bringing more Python shit on top of a very nicely working LLM.

It's great.

I don't know about quantum computing though ....

But we'll be ready for the next revolution.

scott falconer’s picture

For everyone involved in this discussion that would like to discuss training, guidance, and tools that encourage responsible LLM use, the Drupal AI Initiative has created an issue here: #3581443: Create responsible LLM use training, guidance, and tools package for Drupal contributors. The intent for that issue is to guide and develop tooling and resources based on Policy on the use of AI when contributing to Drupal.

I would encourage discussion from everyone involved, both pro- and anti-LLM, as these tools are not only intended for those who want to use LLMs in their contributions, but will hopefully help reduce the negative impacts on Drupal, the community, and other contributors who choose not to use them.

hestenet’s picture

Update to #144, as referenced by @Scott Falconer above.

Policy moved from a subsection of the issue etiquette page, to a dedicated page.

Posting from an airport after a long DrupalCon that was challenging and rewarding and tiring and inspiring as it always is.

This continues to be a pragmatic iteration. Not a panacea.

I hope to have more energy to engage post-Con, as a person who has been directly, measurably, and provably harmed by the disregard for ethics among LLM providers, and who also sees the inevitability of the moment, and the opportunity for at minimum harm reduction, if not the chance to assert our own control over this new future.

I am heartened by progress on an MR that, while it does not solve all issues, takes some pragmatic first steps.

For now, this is how the page was updated:

Policy on the use of AI when contributing to Drupal

Why this policy exists

AI tools make it easy to produce a lot of code and text very quickly. This creates pressure on the people who review and maintain Drupal. This policy is not about which specific tools you use. It is about making sure every contribution is something a real person stands behind. 

Most importantly, whether or not you are using AI, you must be a good listener and collaborator with the project maintainer, the original issue reporter, and other contributors on the issue. A 'drive-by' contribution, where you don't follow up on feedback or ignore previous discussion on the issue, will likely result in an account ban.

The core principle: you are responsible for what you submit

Understanding an issue and collaboratively finding the right solution is a critical part of contributing. Writing the code is simply the execution of that solution, and it will only be successful when built on that solid foundation. AI tools do not change this. If a reviewer asks you to explain a decision or a piece of logic, you must be able to answer. Saying "the AI wrote it" is grounds for immediately closing the contribution.

You are fully responsible for the integrity of your submission. AI tools can hallucinate nonexistent software packages (risking supply chain attacks), introduce subtle security vulnerabilities, or produce unhelpful refactors and new code that put an undue burden on others to review. You must thoroughly verify all dependencies, logic, and security implications of AI-generated code before submitting.
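The hallucinated-dependency risk is concrete enough to check mechanically. As a minimal sketch (a hypothetical helper, not part of any Drupal.org tooling), the snippet below extracts the package names a composer.json declares, so each one can be looked up on Packagist before the contribution is submitted:

```python
import json

def declared_dependencies(composer_json_text):
    """Return the Packagist-style package names declared in a composer.json string."""
    data = json.loads(composer_json_text)
    deps = {}
    for section in ("require", "require-dev"):
        deps.update(data.get(section, {}))
    # Platform requirements (php, ext-*) contain no "/" and are not Packagist packages.
    return sorted(name for name in deps if "/" in name)

example = ('{"require": {"php": ">=8.1", "drupal/core": "^11"},'
           ' "require-dev": {"phpunit/phpunit": "^10"}}')
print(declared_dependencies(example))  # → ['drupal/core', 'phpunit/phpunit']
```

Each name in the resulting list can then be verified by hand against packagist.org; any package that does not exist there is a likely hallucination and a potential supply-chain risk.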

Copyright and Licensing

AI models can occasionally output verbatim code from other copyrighted projects. You are solely responsible for ensuring that any AI-generated code you submit does not violate third-party copyrights and is fully compatible with the Drupal project's GPL license. Ignorance of the code's origin is not an excuse for licensing violations.

Examples of contributions that do not meet this standard

These are patterns that create unnecessary work for maintainers and reviewers:

  • Dumping code into an issue without reading the thread or acknowledging previous attempts to solve the problem.

  • Posting a Merge Request (MR) where automated checks fail, and leaving it for others to fix.

  • Adding AI-generated code to someone else's existing MR without their knowledge and without disclosure.

  • Using an AI to dump a large patch and then abandoning the issue when human feedback is requested.

  • Submitting code that ignores the conclusions of prior architectural discussions.

  • Proposing a full rewrite of a module based on an AI review, without first engaging the existing maintainers.

  • Using AI to generate issue summaries, comments, or reviews that lack independently verified technical insights (e.g., using an AI to summarize a thread simply to gain contribution credits).

  • Posting issue comments, MR descriptions, or forum posts that are unreviewed AI output, not your own words.

Disclosure

Transparency builds trust. If you use an AI tool to generate a significant portion of the code or text you are submitting, you must disclose it. You must disclose this use regardless of how thoroughly you reviewed the output.

What is "significant"? Generating entire functions, classes, architectural scaffolding, or extensive documentation blocks requires disclosure. Minor uses, such as standard single-line autocomplete suggestions or basic syntax corrections, do not require disclosure.

When disclosing, please use existing issue or Merge Request templates if they include a designated AI disclosure section. If no template is available, simply append a clear, human-written statement to the end of your issue summary, comment, or MR description. For example:

AI-Generated: Yes (Used GitHub Copilot to help generate the boilerplate for this feature).

Another example: AI was used in the drafting of this policy, to help review for clarity, clean up language, and check grammar.

Enforcement

Contributors who repeatedly violate this policy by submitting unexplained, untested, or disruptive AI-generated code will face consequences.

Our goal on Drupal.org is to educate first whenever possible. However, in some cases, violations may require a temporary ban so we can provide the necessary guidance and ensure it has been read and understood before restoring the account.

In other situations, when contributors show good intent and respond constructively to maintainers and the community, a temporary ban may not be necessary. In these cases, we will focus on education to help them align with community standards.

Finally, when there is clear disregard for these policies or disrespect toward maintainers or other contributors, a permanent ban may be issued.

We recognize that Drupal Association staff and Drupal.org site moderators have limited capacity to keep up with the volume of these contributions. We appreciate the community’s patience as we continue to scale our education and moderation efforts.

This Policy and the Drupal.org Terms of Service

This is a contribution policy. It defines expectations for contributor behavior and can be updated through the normal community governance process.

The Terms of Service (TOS), by contrast, is a legal framework that governs all use of Drupal.org and can only be changed by the Drupal Association.

The intent is to establish this policy first as part of our community norms and issue queue etiquette, and then reinforce it through updates to the Drupal.org Terms of Service.

One area the TOS will need to address more explicitly is the use of automated agents acting on behalf of contributors. As AI tools become more capable, clearer rules in this area will be necessary.

Frequently Asked Questions

For Google Summer of Code Contributors

GSoC is meant to be a learning experience. We strongly recommend not using AI, so that you learn the foundations you need. This will serve you better later, even if you do use AI, because you will understand how to prompt it and interpret its output.

We want to ensure that your contributions follow the best practices we have established as a community, so that you are building good relationships with the maintainers of the projects you are contributing to.

rkoller’s picture

I am not sure if the following perspective is/was already covered in this issue, or if there is a dedicated issue for it; at least it is not covered in the summary by @hestenet in #188. But based on the following article posted today over on Mastodon, https://www.theregister.com/2026/04/01/claude_code_source_leak_privacy_n..., there is a detail I consider quite important: "I don't think people realize that every single file Claude looks at gets saved and uploaded to Anthropic," the researcher "Antlers" told us. "If it's seen a file on your device, Anthropic has a copy." How is it ensured that content/input by people who don't consent to LLM use is neither scraped nor used to refine the providers' models based on their feedback in the comments? Personally, I don't want any comments of mine on any issue in core or contrib incorporated by any provider like OpenAI, Anthropic, and the like, nor to contribute in any way to their business model. That is a red line of mine.