Problem/Motivation

Disclaimer: I am not sure whether this mechanism exists outside of Claude Code, so it may adversely impact cross-provider compatibility. Nevertheless…

Claude Code hooks (https://code.claude.com/docs/en/hooks), much like Drupal hooks, offer a more definitive way to say “do this all the time under these conditions,” whereas CLAUDE.md is more of a “suggestion” for the model to consider.

This seems like an interesting thing to consider for any “right or wrong” guidance, such as coding standards.
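As a rough sketch of what a coding-standards hook could look like in `.claude/settings.json` (the PostToolUse event and stdin JSON payload follow the hooks documentation linked above, but the matcher, the jq extraction, and the phpcs path are illustrative assumptions, not a tested configuration):

```json
{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Edit|Write",
        "hooks": [
          {
            "type": "command",
            "command": "jq -r '.tool_input.file_path // empty' | xargs -r vendor/bin/phpcs --standard=Drupal"
          }
        ]
      }
    ]
  }
}
```

Per the docs, the hook command receives the tool call as JSON on stdin and its exit code determines whether feedback is surfaced back to the agent, which is exactly the “always runs, not a suggestion” behavior CLAUDE.md can't guarantee.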

Steps to reproduce

Proposed resolution

Remaining tasks

User interface changes

API changes

Data model changes

Comments

webchick created an issue.

alex ua’s picture

Hooks run in Codex too, FWIW: https://developers.openai.com/codex/hooks

CLAUDE.md is for sure not the right place either; from what I can tell, the model only consults it around 80% of the time. In my own experience, the biggest improvements in output have come from leveraging skills and workflow improvements:
* planning as a prerequisite to execution, with full test passes as a gate for success
* agent "swarms," each with a unique role and JTBD (agents are really just other chats with specific skills, CLI tools, and MCPs loaded)
* criticism, testing, and review of both plans and execution
* clear output documents that show what was done and allow others to easily replicate the outcome
* forcing it to work from lists like todos and making it fill in forms to prove it succeeded

In my opinion, what would be great is a fully tested, community-maintained review agent that works with any of the big players as well as with self-hosted, international, and/or smaller LLMs.

alex ua’s picture

Building on my earlier point that CLAUDE.md compliance is ~80% — hooks are clearly the right answer for deterministic, right-or-wrong rules. But there's a question upstream of hooks: how do we discover which rules should become hooks?

Not every Drupal-specific mistake an agent makes is a coding standards violation that phpcs can catch. Some are domain knowledge errors — wrong mental model of the render pipeline, outdated understanding of hook-to-event migration status, incorrect assumptions about entity storage. These are too contextual for hooks but too important to leave to CLAUDE.md's ~80% compliance rate.

I've opened an issue for capturing expert corrections, proposing a structured correction-capture workflow that classifies each expert correction by failure type. One of the classifications is HOOK_CANDIDATE: a correction that reveals a deterministic rule that should be enforced by a hook but currently isn't.

The flow would be:

  1. Expert corrects agent during live session
  2. Correction classified — if it's a deterministic, objective rule → tagged HOOK_CANDIDATE
  3. When 3+ corrections for the same rule accumulate → strong signal to implement as a hook
  4. Hook gets an eval case to verify it works (ties back into #3581832)
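The tallying in step 3 could be a small script over the JSONL log. This is a sketch only; the `classification` and `rule_id` field names are hypothetical, since the actual schema would be defined in the correction-capture issue:

```python
import json
from collections import Counter

THRESHOLD = 3  # corrections per rule before it's a strong hook candidate


def hook_candidates(log_path):
    """Return rules with THRESHOLD+ HOOK_CANDIDATE corrections in a JSONL log.

    Field names ("classification", "rule_id") are illustrative; the real
    correction-capture schema lives in the other issue.
    """
    counts = Counter()
    with open(log_path) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines in the log
            rec = json.loads(line)
            if rec.get("classification") == "HOOK_CANDIDATE":
                counts[rec.get("rule_id", "unknown")] += 1
    return [rule for rule, n in counts.items() if n >= THRESHOLD]
```

Because the log is append-only JSONL, this can run as a periodic report (or CI job) that surfaces which rules have crossed the threshold and are ready to be implemented as hooks.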

This gives hooks a data-driven prioritization path rather than trying to anticipate every rule upfront. The corrections people actually make in practice tell you which hooks matter most.

On the cross-provider compatibility point — the correction log itself is plain JSONL (agent-agnostic). The hook implementation is necessarily tool-specific (Claude hooks vs Codex hooks vs whatever), but the discovery of which rules need hooks can be shared infrastructure.
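To make the agent-agnostic point concrete, a single correction record in that log might look something like this (every field name here is hypothetical; the actual schema belongs to the correction-capture issue):

```json
{"timestamp": "2025-06-01T14:32:00Z", "agent": "claude-code", "classification": "HOOK_CANDIDATE", "rule_id": "no-deprecated-hook-usage", "correction": "Use an event subscriber here; this hook was deprecated"}
```

Nothing in the record depends on which agent made the mistake, so logs from Claude Code, Codex, or anything else could feed the same prioritization report.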