--- AI TRACKER METADATA ---
Update Summary: Simple approach to bringing advanced metadata into Canvas AI
Check-in Date: MM/DD/YYYY (US format) [When we should see progress/get an update]
Due Date: MM/DD/YYYY (US format) [When the issue should be fully completed]
Blocked by: [#XXXXXX] (New issues on new lines)
Additional Collaborators: @username1, @username2
AI Tracker found here: https://www.drupalstarforge.ai/
--- END METADATA ---
Overview
Currently the building agent and the template agent pick components in a single one-shot pass. That pass looks at all available components, including their titles, descriptions, and prop and slot descriptions, to figure out which ones to pick and how to use them.
For the Driesnote we used this together with the Mercury theme, and it worked mostly because the theme has a limited number of components and because we could remove components that we knew would never be used (see #3549432: Make it possible to disable component for Canvas AI selection). Even so, this information alone took up around 13k tokens of the context window.
Before the Mercury theme we were working with Civic Theme, which closely follows atomic design, meaning many atoms that go into slots in different places. With the approach above, the resulting context would have been more or less impossible to work with unless components were disabled.
The problem with this is twofold:
- It is too much context for one-shotting, and much of the prop and slot information might not be needed.
- At the same time, it is too little context to decide whether a component is actually a good fit, how to use it, and how it relates to other components.
We need a process that resembles how a human would approach the same task:
1. Scan labels and descriptions for possible candidates for what we are trying to solve.
2. Use the potential candidates to figure out why and how they should be used.
3. Possibly redo step 1 if you see a slot that should be filled, or if you realize at step 2 that this was not the correct component.
Proposed resolution
- Add to the CanvasAiComponentDescriptionSettingsForm the possibility to add a textarea with markdown metadata, which can include information on how, why and where to use that component. Save it on submit.
- Add a function call that only provides the id, labels and descriptions of the components.
- Add a second function call that takes a list of ids, and provides the ids, label, description, props, slots and the above metadata as part of the response.
- Change the agents that use these so that the initial tool is in memory (default information tools), and instruct them to execute the second function call to get more information about the candidate components they could pick.
- Allow them to call the function in as many loops as they need.
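The two function calls proposed above could be sketched roughly as follows. This is a minimal illustration in Python rather than the module's PHP. The `Component` structure, the function names `list_component_summaries` and `get_component_details`, and the sample components are all hypothetical and not taken from the Canvas AI code.

```python
from dataclasses import dataclass, field


@dataclass
class Component:
    """Hypothetical in-memory stand-in for a Canvas component."""
    id: str
    label: str
    description: str
    props: dict = field(default_factory=dict)
    slots: dict = field(default_factory=dict)
    metadata: str = ""  # Markdown from the proposed textarea (assumed field).


# Invented sample library, for illustration only.
LIBRARY = {
    c.id: c
    for c in [
        Component(
            "sdc.demo.hero", "Hero", "Large banner at the top of a page.",
            props={"title": "Main heading text."},
            slots={"actions": "Buttons shown under the heading."},
            metadata="Use once per page. Put buttons in `actions`.",
        ),
        Component(
            "sdc.demo.button", "Button", "Clickable call to action.",
            props={"label": "Button text.", "url": "Link target."},
        ),
    ]
}


def list_component_summaries():
    """First tool: cheap overview with only id, label and description."""
    return [
        {"id": c.id, "label": c.label, "description": c.description}
        for c in LIBRARY.values()
    ]


def get_component_details(ids):
    """Second tool: full details, but only for the requested ids.

    Unknown ids are skipped here; that is a choice made for this sketch.
    """
    return [
        {
            "id": c.id,
            "label": c.label,
            "description": c.description,
            "props": c.props,
            "slots": c.slots,
            "metadata": c.metadata,
        }
        for c in (LIBRARY[i] for i in ids if i in LIBRARY)
    ]
```

The agent would keep the output of the first function in its default context and call the second one only for the candidates it wants to inspect.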
Improvements
Discuss whether there should be some way for a component creator to add this data in the component files.
Issue fork canvas-3545816
Comments
Comment #2
marcus_johansson commented
Comment #3
marcus_johansson commented
I do not have access to set contributors on this issue, but the idea should be credited to ahmedjabar, not me or Catia.
Comment #4
tedbow
Credited https://new.drupal.org/contribution-record/11423802
Comment #5
tedbow
I don't understand.
CanvasAiSettingsForm is an existing global settings form for the module that does not deal with individual components, but "that component" seems to imply it sets info for a specific component.
We have admin/config/system/canvas-ai-component-description-settings, which uses CanvasAiComponentDescriptionSettingsForm to provide a "description" text area for each available component. How is what is being proposed different from this?
Comment #6
tedbow
Comment #7
marcus_johansson commented
Sorry about that, I meant the CanvasAiComponentDescriptionSettingsForm and having a setting for each component. I'll edit the issue.
So the issue is the following: say that you have 100 components. Some of the components are atoms that belong in specific molecules, and most of the components have props and/or slots with further descriptions. Some of the components should only be used in certain use cases and not in others.
If you look at the extra schema Salsa Digital did specifically for AI where we did a lot of testing, you can see that one component has a lot of metadata for the AI to get it right. For instance: https://github.com/salsadigitalauorg/xb_metadata/blob/develop/web/themes...
The problem is that if you have hundreds of components with such extensive metadata, you suddenly have 100k input tokens, and the AI will hallucinate and produce bad results due to context drift, even on the first loop.
So the idea is instead to split this into two tools (or one tool with a setting). One tool exposes just the label and description. The other exposes the label, description, props, slots and this extra metadata field for specific components.
That way the agent can use the first tool on all 100 components with a low input token count (<1k tokens) and decide which candidates it should use to solve the user's request.
Then, in a second loop, it requests those candidates and gets more information on when and how to use them. This only adds input tokens for those components' information.
On a third loop it might fetch atoms that, according to the initial component information, should be used in slots, and possibly information on other components because the initial candidates were wrong for whatever reason.
The theory is that this should make it possible to have a larger component library and still pick the correct components. We tested this in the above repo and it seemed to work there at least, whereas the one-shot layouts had more problems.
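The loops described above can be sketched without a real model. In this hypothetical Python sketch, `choose_candidates` is only a stand-in for the LLM's decision (a naive keyword match), `SUMMARIES` and `DETAILS` are invented sample data, and the `uses` key that lists components expected in slots is an assumption made purely for illustration.

```python
# Invented sample data: cheap summaries for every component...
SUMMARIES = {
    "accordion": "Collapsible list of questions and answers.",
    "accordion_item": "Single row inside an accordion.",
    "hero": "Large banner at the top of a page.",
}

# ...and detailed information that is only fetched on demand.
DETAILS = {
    "accordion": {"slots": {"items": "Rows of the accordion."},
                  "uses": ["accordion_item"]},
    "accordion_item": {"props": {"title": "Row heading."}, "uses": []},
    "hero": {"props": {"title": "Main heading."}, "uses": []},
}


def choose_candidates(request, summaries):
    """Stand-in for the model: keep components sharing a word with the request."""
    words = set(request.lower().split())
    return [
        cid for cid, text in summaries.items()
        if words & set(text.lower().rstrip(".").split())
    ]


def gather_context(request, max_loops=5):
    """Loop 1 scans summaries; later loops fetch details for candidates only."""
    loaded = {}
    pending = choose_candidates(request, SUMMARIES)  # loop 1: labels/descriptions
    loops = 1
    while pending and loops < max_loops:
        loops += 1
        next_pending = []
        for cid in pending:
            if cid in loaded or cid not in DETAILS:
                continue
            loaded[cid] = DETAILS[cid]  # detail fetch for one candidate
            # Follow slot hints to atoms that were not in the first pick.
            next_pending.extend(
                u for u in DETAILS[cid]["uses"] if u not in loaded
            )
        pending = next_pending
    return loaded


context = gather_context("frequently asked questions")
print(sorted(context))
```

With this sample data the detailed information for the unrelated `hero` component is never loaded, which is the token saving the approach is after.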
You could also argue that it will save you money and remove unnecessary compute.
Hope that clarifies things?
Comment #8
akhil babu
Comment #9
akhil babu
Comment #10
yautja_cetanu commented
- It would be good to have a hook as well so that third party modules could override the advanced field.
- It would be good to have the AI Agents make use of it.
Comment #12
akhil babu
See #16
Comment #13
marcus_johansson commented
For get_metadata_of_components I think we will also need another (optional) textarea for a longer description. If you take space-accordion as an example, a complex metadata system would add metadata like:
So it's not just about token saving and context drift management, but also about making sure you can express how, why, when and when not to use a component. Since it's unclear exactly what the best practices for this information are, it should be a textarea for now, so we can test what works and what doesn't.
Comment #14
akhil babu
From #3558241: Canvas AI: CanvasAiComponentDescriptionSettingsForm redirects to 'canvas.api.config.list' after submission
Comment #15
akhil babu
Resuming the work
Comment #16
akhil babu
Here are the changes pushed in the last commit.
The get_component_context tool will output something like this
If the user has not used the CanvasAiComponentDescriptionSettingsForm to override the component, prop, or slot descriptions, then:
If the CanvasAiComponentDescriptionSettingsForm has been used to override the descriptions, then those overridden descriptions will be loaded for all components.
A new tool, get_metadata_of_components, has been created. It accepts an array of component IDs and returns a single YAML response containing detailed metadata for each component.
For example, if the input is [block.system_menu_block.footer, sdc.space_ds.space-button, js.hero_banner], then the output of the get_metadata_of_components tool would be
Comment #17
akhil babu
This issue builds on the work done in #3558241: Canvas AI: CanvasAiComponentDescriptionSettingsForm redirects to 'canvas.api.config.list' after submission and depends on it being merged first.
Comment #18
marcus_johansson commented
Comment #19
marcus_johansson commented
I've added some comments regarding the code.
When testing it, I get an unknown form error when I try to save without Blocks enabled, see
In this case I just ran it on a vanilla site, not Drupal CMS, so the descriptions there seem to be autogenerated. But it doesn't give me an error message at all, and there is nothing in the log either. After debugging, the error seems to be "At least one source must be enabled."; however, it's attached to the "component_context" element, so the message does not seem to show up.
Comment #20
rakhimandhania commented
Comment #21
akhil babu
Thanks for reviewing. I'm waiting for #3533079: Introduce AI Agents and tools to create entire page templates using available component entities to merge, since it may cause conflicts once it's merged.
Comment #22
rakhimandhania commented
Comment #23
akhil babu
Comment #26
rakhimandhania commented
Comment #27
alex ua commented
Following up on the metadata optimization discussion here. We’ve built a complementary approach: horizontal scoping that reduces which components the agent sees during edit operations, rather than reducing metadata per component.
The problem
When editing a single heading, the page builder agent receives the full page layout — every region, every section, every nested component with all props and slots. On a 15-component demo page, the full layout JSON is ~11.5K bytes (~2,900 tokens). The agent only needs the section containing the selected component.
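As a rough check of the figures above, the byte count can be converted to tokens with a simple heuristic. The ratio of about 4 bytes per token used here is an assumption (a common rule of thumb for English text and JSON), not a measurement from this prototype, and `estimate_tokens` is a hypothetical helper.

```python
def estimate_tokens(num_bytes, bytes_per_token=4.0):
    """Very rough token estimate; the bytes-per-token ratio is an assumption."""
    return round(num_bytes / bytes_per_token)


full_layout_bytes = 11_500  # ~11.5K bytes, as quoted above
print(estimate_tokens(full_layout_bytes))  # 2875, i.e. roughly 2,900 tokens
```

Real tokenizers vary by model, so this only serves as an order-of-magnitude estimate.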
Approach
A BuildSystemPromptEvent subscriber (priority -10, after ai_context at 0) that runs when active_component_uuid is set:
Fail-open design: if the selected component can’t be located in the layout, the subscriber falls through to the full layout. It never degrades the editing experience.
Known limitation
The subscriber replaces layout JSON in the system prompt via string matching. If the serialization format between the tempstore and the prompt differs (whitespace, key ordering), the match fails silently (falls through to full layout). This works but is fragile.
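The limitation can be illustrated with a small Python sketch. It is not the subscriber's actual PHP code: `scope_layout_in_prompt` and the sample layout are hypothetical, and the only point is to show how an exact string match falls back to the unchanged prompt when the serialization differs.

```python
import json


def scope_layout_in_prompt(prompt, full_layout, scoped_layout):
    """Replace the serialized full layout with a scoped one, failing open."""
    serialized_full = json.dumps(full_layout)
    if serialized_full not in prompt:
        # Fail open: no exact match, so leave the prompt untouched.
        return prompt
    return prompt.replace(serialized_full, json.dumps(scoped_layout), 1)


# Invented sample layout, for illustration only.
layout = {"regions": {"content": ["hero", "heading"], "footer": ["menu"]}}
scoped = {"regions": {"content": ["heading"]}}

# Same serialization on both sides: the replacement succeeds.
compact_prompt = "Layout: " + json.dumps(layout)
print(scope_layout_in_prompt(compact_prompt, layout, scoped))

# Different whitespace in the prompt: the match fails silently.
pretty_prompt = "Layout: " + json.dumps(layout, indent=2)
print(scope_layout_in_prompt(pretty_prompt, layout, scoped) == pretty_prompt)
```

The second call returns the original prompt, which is safe but gives no token saving and no signal that scoping was skipped.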
Would a structured layout accessor on the event be a cleaner path? Something like a layout_data token via the existing setToken()/getTokens() API, so subscribers can work with parsed data rather than doing string surgery on the prompt? We prototyped this using the token bag and it works without requiring changes to BuildSystemPromptEvent itself.
Measured results (N=1 heading edit, 15-component demo page)
Layout is approximately 10% of total per-loop cost — system prompt instructions and ai_context items dominate the other 90%. Layout scoping yields a modest reduction on its own but compounds with other optimizations:
How this complements this issue
Cross-region edits
Scoped layout preserves cross-region awareness via the region index but limits cross-region component detail. Operations requiring full cross-section context ("match the style of the hero section") would need the agent to request the full layout via existing tools, or would fall through to an unscoped prompt.
Prototype
Working LayoutScopingSubscriber in a custom module. Uses CanvasAiTempStore to read the current layout and BuildSystemPromptEvent to replace layout JSON in the system prompt. Unit tests covering region index generation, section scoping, nested components, and edge cases.
Happy to share the code or contribute a patch if this direction is useful.
Comment #28
alex ua commented
I had been carrying a local rebase of MR !719 for a demo site. Before posting it, I verified against current Canvas 1.x HEAD and found that the functionality from this issue has been implemented upstream, likely through other MRs, since the API details differ from !719's approach. MR !719 appears to no longer be needed.
Specifically, all of these are now in Canvas 1.2.0:
- CanvasAiComponentContextHelper (method renamed to getLessDetailedComponentContext)
- GetMetadataOfComponents plugin (present, slightly different error handling)
- CanvasAiPageBuilderHelper subrequest removal + array return type
- CanvasAiComponentDescriptionSettingsForm detailed_description field
- GetComponentContext switched to CanvasAiComponentContextHelper
- canvas_ai.services.yml context_helper registration
- Agent config with get_metadata_of_components tool
One minor difference: upstream getDetailedMetadataOfComponents() silently skips unrecognized component IDs, where !719 threw \InvalidArgumentException.
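The difference between the two behaviours can be sketched in Python as follows. Both functions and the `KNOWN` registry are hypothetical illustrations of "skip silently" versus "raise on unknown IDs"; they are not the upstream or MR !719 implementations.

```python
# Invented registry of component metadata, for illustration only.
KNOWN = {
    "sdc.demo.hero": {"label": "Hero"},
    "sdc.demo.button": {"label": "Button"},
}


def get_metadata_lenient(ids):
    """Skip unrecognized ids silently (the behaviour described for upstream)."""
    return {i: KNOWN[i] for i in ids if i in KNOWN}


def get_metadata_strict(ids):
    """Raise on unrecognized ids (the behaviour described for MR !719)."""
    unknown = [i for i in ids if i not in KNOWN]
    if unknown:
        raise ValueError(f"Unknown component ids: {', '.join(unknown)}")
    return {i: KNOWN[i] for i in ids}


requested = ["sdc.demo.hero", "sdc.demo.missing"]
print(get_metadata_lenient(requested))  # only the hero entry is returned
try:
    get_metadata_strict(requested)
except ValueError as error:
    print(error)
```

The lenient variant keeps an agent loop running when the model invents an ID, while the strict variant surfaces the mistake so the agent can be told to correct it.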
If the maintainers agree this is resolved, marking Fixed would unblock #3549232: Canvas AI: Updating page contents with agents and downstream #3583357: Deterministic edit controller: resolve simple property edits without LLM.