Closed (fixed)
Project:
AI Best Practices for Drupal
Version:
1.0.x-dev
Component:
Evals
Priority:
Normal
Category:
Task
Assigned:
Unassigned
Reporter:
Created:
5 Apr 2026 at 19:33 UTC
Updated:
20 Apr 2026 at 07:10 UTC
Comments
Comment #2
zorz commented
Comment #3
zorz commented
I ran compare.py --no-baseline against the coding-standards skill from #3581705 with 8 behavioral evals I wrote for this test. The first 5 cover basic patterns (2-space indentation, elseif, constructor DI, short array syntax, controller create pattern). The last 3 target patterns that contradict general PHP conventions, where I expected Sonnet to struggle:
- elseif and else on their own line after the closing brace, not cuddled as } elseif. This contradicts PSR-12.
- $this->t() via StringTranslationTrait, not the global t().
- ->accessCheck(), required since Drupal 9.2.
Sonnet results (3 runs, 8 cases)
24/24 PASS on both configs. I also re-ran the baseline from an empty temp directory (no repo context, no CLAUDE.md) to rule out the model picking up hints from the project files. Same result: 8/8 PASS.
Sonnet already knows all 8 Drupal coding standard patterns, including the ones that contradict PSR-12. The skill adds 35% cost for zero quality improvement.
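For illustration, the three PSR-12-contradicting patterns could be approximated with simple text checks on generated PHP. This is a rough sketch, not the actual evals used in this test: the function name `check_drupal_patterns`, the result keys, and the regexes are all hypothetical, and a regex pass like this will miss cases a real parser would catch.

```python
import re

# Hypothetical checks for illustration only; not the evals from this issue.

# "} elseif" or "} else" on the same line as the closing brace (PSR-12 style).
CUDDLED_ELSE = re.compile(r"\}[ \t]*(?:elseif|else)\b")

# A bare t( call that is not a method call ($this->t), a static call, or
# the tail of a longer identifier such as format(.
GLOBAL_T = re.compile(r"(?<![\w>:$])t\(")


def check_drupal_patterns(php_source: str) -> dict[str, bool]:
    """Return True per pattern when the PHP text follows the Drupal convention."""
    return {
        # Drupal puts elseif/else on a new line after the closing brace.
        "else_on_own_line": CUDDLED_ELSE.search(php_source) is None,
        # Classes should call $this->t() rather than the global t().
        "uses_this_t": GLOBAL_T.search(php_source) is None,
        # If an entity query is built, it should call accessCheck() explicitly.
        "explicit_access_check": (
            "getQuery(" not in php_source or "->accessCheck(" in php_source
        ),
    }
```

A case passes only when every value in the returned dict is True.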
I also found a bug in compare.py while testing:
load_skill() could not handle directory-based skills (skills/{name}/SKILL.md). Fixed and pushed to #3582953.
The "proving a new skill helps" workflow works. Coding standards is the wrong test case for Sonnet because it already has this knowledge. The value shows when testing genuine knowledge gaps (like RunTestsInSeparateProcesses in writing-automated-tests: 0% to 100%).
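The directory-based skill handling described above might look roughly like the sketch below. This is not the code pushed to #3582953: the signature, the `skills_root` parameter, and the flat `{name}.md` fallback layout are assumptions made for illustration. Only the `skills/{name}/SKILL.md` layout comes from this comment.

```python
from pathlib import Path


def load_skill(skills_root: Path, name: str) -> str:
    """Return a skill's text, trying the directory layout before a flat file.

    Hypothetical sketch: the real compare.py may resolve skills differently.
    """
    candidates = [
        skills_root / name / "SKILL.md",  # directory-based: skills/{name}/SKILL.md
        skills_root / f"{name}.md",       # assumed flat layout: skills/{name}.md
    ]
    for path in candidates:
        if path.is_file():
            return path.read_text(encoding="utf-8")
    tried = ", ".join(str(p) for p in candidates)
    raise FileNotFoundError(f"Skill {name!r} not found; tried: {tried}")
```

Raising with the list of attempted paths makes a misnamed skill easier to spot than a silent empty result.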
Comment #4
zorz commented
Comment #5
pritam-osl commented
Comment #6
pritam-osl commented
Comment #7
webchick
LOL, well THAT's no good. :D On the other hand, it is fantastic that we have a way to know this. :D
Thank you so much for testing (and for finding + fixing the directory bug!)