--- AI TRACKER METADATA ---
Update Summary: Iteration agent on created components
Check-in Date: MM/DD/YYYY (US format) [When we should see progress/get an update]
Due Date: MM/DD/YYYY (US format) [When the issue should be fully completed]
Blocked by: [#XXXXXX] (New issues on new lines)
Additional Collaborators: @username1, @username2
AI Tracker found here: https://www.drupalstarforge.ai/
--- END METADATA ---
Background
We have been experimenting manually with the Agent Explorer and the component creator agent to see whether a sub-agent that compares the original image with the current state of the component produces better results. After three tests, it's obvious that it does.
One example:
We start with this image and ask the agent to generate it:

The first iteration as the agent works at the moment generates this:

We then pass this screenshot and the original image to a sub-agent and ask it to describe what needs to change. It comes up with:
### Key Changes Needed for testcomponent
1. **Background**
- Add a blue abstract gradient with glowing/light overlay effects using custom CSS. Keep Tailwind's bg-black as a fallback.
2. **Typography & Layout**
- Split the large heading into two lines with stronger sizing and spacing.
- Add a glowing or outer shadow/text-shadow effect to the hero text for emphasis (custom CSS).
- Increase spacing between all major elements.
3. **Top Badge**
- Update the "NEW" badge with a more vivid blue and subtle shadow/glow (custom CSS).
- Increase badge and padding roundedness to look more pill-like.
4. **Button**
- Make the button more pill-shaped, larger, with a blue glow/inner shadow (custom CSS).
- Lighter button font color.
5. **Supporting Text**
- Subheadline should be slightly lighter, add subtle blue tint if possible via CSS.
We then feed this back into the agent together with the two images and tell it to make these changes. It comes up with:

Add another step and it's at:

So it's clear that this improves the results.
Overview
We should build this in so that the agent can automatically loop over its results until it's happy. There are some questions that need to be answered, though:
We can do this via the frontend, since the iframe that previews the component is on the same origin - a tool like html2canvas should work. This however means that the loop or instruction has to be started over or triggered via the frontend, instead of the normal agent loop.
One option is that the agent returns a score for how well the new component matches the screenshot. If the score is under a threshold and the maximum number of loops has not been reached, the component is sent back for another iteration.
Since each iteration costs money, another solution is that after you generate a component from a prompt or image, there is a button in the chatbot or on the preview that says something like "Improve quality" and runs another loop.
Tested and not a viable solution.
We can also do this via the backend, but then we need an endpoint where we can render the unsaved component. On top of that, we need headless Chrome or something similar on the backend that can take screenshots, or alternatively a service that can run a proxy so it can work against your local machine.
The advantage of backend generation is that you will be able to create components over MCP (and through it Claude Code, Roo Tools, Cursor etc.) and also create starterkit modules that can pre-generate 10-15 components for you when you start a new website.
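The threshold-and-max-loops option mentioned above can be sketched roughly as follows. This is a minimal illustration in Python, not the agent's actual implementation: `Review`, `refine`, the `review` and `improve` callables, and the default values are all hypothetical names and placeholders.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class Review:
    score: int      # similarity rating returned by the reviewing step
    feedback: str   # suggested changes for the next iteration


def refine(component: str,
           review: Callable[[str], Review],
           improve: Callable[[str, str], str],
           threshold: int = 6,
           max_loops: int = 3) -> Tuple[str, int, int]:
    """Iterate until the score reaches the threshold or the loops run out.

    Returns (component, last score, number of improvement loops used).
    """
    result = review(component)
    loops = 0
    while result.score < threshold and loops < max_loops:
        # Feed the reviewer's feedback back into the generating step.
        component = improve(component, result.feedback)
        loops += 1
        result = review(component)
    return component, result.score, loops
```

Capping the loop count keeps the cost bounded even when the score never reaches the threshold, which matters because every iteration is a paid model call.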
Proposed resolution
- Figure out which solution, if any, is feasible to deploy. In AI 1.3.0 we will also have a way for agents to get supplemental features via recipes, so it doesn't have to ship with XB AI core.
- ☑ Experiment with the html2canvas and XB to verify that it works. - does not work
User interface changes
TBD
| Comment | File | Size | Author |
|---|---|---|---|
| #17 | xb-ai-canvas.mp4 | 9.52 MB | yautja_cetanu |
| #2 | screenshot (2).png | 310.42 KB | marcus_johansson |
| #2 | actual_screenshot.jpg | 21.21 KB | marcus_johansson |
| | c_it_3.jpg | 34.26 KB | marcus_johansson |
| | c_it_2.jpg | 31 KB | marcus_johansson |
Issue fork experience_builder-3532296
Comments
Comment #2
marcus_johansson commented
html2canvas was tested with a manual button against the Experience Builder iframe and the results are not 1-to-1, so it's not a viable solution.
vs
It's also stated in the documentation.
I will do the headless Chrome tool, because that is needed for other reasons anyway. The question then is how it can access the component via localhost. Will try to figure this out.
Comment #3
marcus_johansson commented
Comment #4
marcus_johansson commented
Comment #6
yautja_cetanu commented
Comment #7
yautja_cetanu commented
Follow up
Comment #8
yautja_cetanu commented
Comment #9
marcus_johansson commented
So, the current setup for anyone taking over:
There are three components to this agent:
1. The preview controller.
This is a controller that can preview the latest version of a JS component rendered within the frontend theme. It takes width and height parameters, which can be set in percentages or pixels. The reasoning is that you can request the component rendered with the frontend theme at the exact same measurements, so that the screenshot can be compared with the original image.
2. The Rate XB Components tool.
Currently it takes the file ID and the component ID, uses Chromium to visit the controller above and take a screenshot, and then, inside the tool, asks for a rating from 1 to 10 of how similar the two images are and what improvements can be made. It returns the rating and the suggestions.
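The size parameters of the preview controller described in item 1 could be handled along the following lines. This is only a sketch in Python under assumptions: the accepted string formats (`"50%"`, `"800px"`, or a bare number treated as pixels) and the names `Dimension`, `parse_dimension` and `to_pixels` are hypothetical and are not taken from the actual controller.

```python
import re
from typing import NamedTuple


class Dimension(NamedTuple):
    value: float
    unit: str  # either "px" or "%"


# Assumed format: a positive number followed by an optional "px" or "%".
_PATTERN = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*(px|%)?\s*$")


def parse_dimension(raw: str) -> Dimension:
    """Parse a width/height parameter given in pixels or percent."""
    match = _PATTERN.match(raw)
    if match is None:
        raise ValueError(f"Unsupported dimension: {raw!r}")
    value = float(match.group(1))
    unit = match.group(2) or "px"  # bare numbers are treated as pixels
    if value <= 0 or (unit == "%" and value > 100):
        raise ValueError(f"Dimension out of range: {raw!r}")
    return Dimension(value, unit)


def to_pixels(dim: Dimension, viewport_px: int) -> int:
    """Resolve a dimension to whole pixels for the screenshot viewport."""
    if dim.unit == "%":
        return round(viewport_px * dim.value / 100)
    return round(dim.value)
```

Validating the values before they reach the renderer keeps malformed or oversized requests from being passed on to the headless browser.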
Pre-requirements:
This solution needs Chromium installed, and chrome-php.
What is missing:
On the controller we should add the possibility of one-off permissions. This means that the agent should be able to request and get an authentication code that is one-use only. Let me see if I can fix this.
Now that we can use images, it should be possible to move this tool into one that a sub-agent uses instead.
The last part is to iterate on the agent over and over until you get a result that is over N within A loops. A rating threshold of 6 worked quite well with the prompt above. I had that working in a long-timeout format, but with the iteration issue this should be possible to see over and over.
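The one-use authentication code mentioned above could behave roughly like the following sketch. It is written in Python purely for illustration; the class name `OneTimeCodes`, the expiry time and the in-memory store are assumptions, not part of the existing controller or of chrome-php.

```python
import secrets
import time
from typing import Dict, Optional


class OneTimeCodes:
    """Issues random codes that can each be redeemed exactly once."""

    def __init__(self, ttl_seconds: float = 60.0) -> None:
        self._ttl = ttl_seconds             # assumed expiry window
        self._codes: Dict[str, float] = {}  # code -> expiry timestamp

    def issue(self, now: Optional[float] = None) -> str:
        now = time.monotonic() if now is None else now
        code = secrets.token_urlsafe(32)    # unguessable random token
        self._codes[code] = now + self._ttl
        return code

    def redeem(self, code: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        # pop() removes the code, so a second attempt always fails.
        expiry = self._codes.pop(code, None)
        return expiry is not None and now <= expiry
```

The agent would request a code, append it to the preview URL that the headless browser visits, and the controller would redeem it on that single request. A short expiry limits the damage if a code leaks before it is used.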
Comment #10
kristen pol
Switching to the correct tag
Comment #11
akhil babu
Comment #12
akhil babu
Currently, the code component creation agent returns the component data (id, js, css etc.) in JSON format, and the actual component is then created from the UI based on that.
But to review the component on the fly before sending the final output, the component must exist. This means we should create it directly from the backend.
Comment #13
akhil babu
Comment #15
akhil babu
As mentioned in #12, component creation is currently happening from the UI. For the review to work, the component must be created from the backend. However, in that case, the CSS and JS will not be compiled properly, since the compilation happens in the frontend and the compiled CSS and JS are then sent to the backend. Because of this, the component will not be rendered in the preview controller.
Also, when Chromium is used, the component is rendered as seen by an anonymous user. XB does not apply styles for anonymous users. The Create/Edit XB page content permission is required for this. Even if the permission is granted, the issue described above still prevents the component from being previewed.
Also, currently the component agent doesn't actually 'see' the component image. The image is given only to the orchestrator agent, and the orchestrator 'describes' the image to the component agent. I think the output of the first iteration would be more similar to the uploaded image if the image were actually given to the component agent. A few options that come to mind are:
I tried option 2. The Drupal\ai\OperationType\GenericType\ImageFile class currently only contains the filename, mime type, and file binary data. Perhaps an issue could be created for the AI module to also include the file id for images.
Comment #16
yautja_cetanu commented
For people not into AI: basically, we need to find a way to automatically generate an accurate screenshot of a component.
- The component is built in React by typing JavaScript, CSS etc.
- We need some way to take the component we see rendered and automatically take a screenshot of it, without the user having to take the screenshot manually.
- We need to send that screenshot to a server-side API for processing (in this case an AI agent looking at it) whilst thinking carefully about any security implications.
Comment #17
yautja_cetanu commented
Andrew has created a video of the new htmltocanvas pro:
xb-ai-canvas.mp4
It shows a very simple way of this working, but it's not perfect: you can see the colour of "New Introducing Framer forms" isn't quite right.
Comment #18
andrewbelcher commented
This is with the following option:
I'm not sure of the implications for browser support. This is the description from the docs:
Comment #19
andrewbelcher commented
Here is another approach using the Screen Capture API.
There is an issue of blur in this. That may not affect the LLM's looping as it may not care about image resolution/quality. However, I believe it is to do with DPI of the source vs the target canvas, which I expect is resolvable (perhaps this article would help).
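If the blur does come from a pixel-density mismatch, a common remedy is to size the canvas backing store at the CSS size multiplied by the device pixel ratio, so that one canvas pixel maps to one device pixel. The arithmetic is sketched below in Python for brevity. The function name is hypothetical; in the browser the ratio would come from `window.devicePixelRatio`, and whether this resolves the blur seen here is untested.

```python
from typing import Tuple


def backing_store_size(css_width: float, css_height: float,
                       device_pixel_ratio: float) -> Tuple[int, int]:
    """Canvas pixel dimensions needed to match the device's pixel density."""
    if device_pixel_ratio <= 0:
        raise ValueError("device_pixel_ratio must be positive")
    # Scale the CSS size up to physical pixels and round to whole pixels.
    return (round(css_width * device_pixel_ratio),
            round(css_height * device_pixel_ratio))
```

For example, an 800 by 600 CSS-pixel iframe on a display with a ratio of 2 would need a 1600 by 1200 canvas.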
I think we'd also want to look at finding a way to resize the iframe to the "correct" size to avoid scrolling etc, which the canvas approach bypasses.
Again some rough and ready code:
Comment #21
andrewbelcher commented
xb-ai-review-2.mp4 is a quick video of the change in merge request !1462.
At the very least it would need a better icon, and a review of my noddy React code. If we want to try and get this merged, it might want a feature flag for now. We were thinking that we might try and support a few different approaches, giving scope for server side headless browser where available as a more reliable and direct alternative. So a setting with options which allow us to control what we do, with a default of "disabled", might be a good approach.
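A setting with a "disabled" default could be modelled along these lines. This is a sketch in Python rather than Drupal configuration. The option names are hypothetical labels for the approaches discussed in this thread, and the `review_mode` key is an assumption.

```python
from enum import Enum
from typing import Mapping


class ReviewMode(Enum):
    DISABLED = "disabled"                  # default: no automatic review
    FRONTEND_CANVAS = "frontend_canvas"    # html2canvas-style capture
    SCREEN_CAPTURE = "screen_capture"      # browser Screen Capture API
    HEADLESS_BROWSER = "headless_browser"  # server-side Chromium


def resolve_review_mode(settings: Mapping[str, str]) -> ReviewMode:
    """Read the configured mode, falling back to DISABLED when unset or invalid."""
    raw = settings.get("review_mode", ReviewMode.DISABLED.value)
    try:
        return ReviewMode(raw)
    except ValueError:
        # Unknown values must never switch the feature on by accident.
        return ReviewMode.DISABLED
```

Falling back to "disabled" for unknown values means a typo in configuration cannot silently enable an experimental capture path.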
Comment #22
lauriii
I'm wondering if we're approaching images and implementing designs the right way. Even with the latest changes, the design produced doesn't seem to match the original image closely enough.
I also don't think it makes sense that the image is visible in the chat. It's confusing to see a new image appear there that I didn't upload myself.
I'm also wondering if we should upload multiple images because some issues could happen on a specific breakpoint?
Comment #23
yautja_cetanu commented
I'll bring this up at the XB AI meetup, but I think we should create another module to experiment on for this. It could be a sandbox module or something else, and we could maybe bring it back in later. I think we need to find out how to make something that physically works before figuring out the ideal UI.
There are three approaches:
I think we need to make a module that allows us to switch between the 3 above so we can try prompt engineering to get it to work well. When testing a review agent using the ChatGPT playground, we got the design to look pretty good after one iteration. As you can see above, in the canvas one, it didn't look good after one iteration (but that might just be the prompt). When we have a better idea of what works, we can think about the different aspects of the UI we have to worry about:
So my proposal is:
We create a module that allows you to pick between 3 different approaches, and we try prompt engineering to figure out which approach is best. We can then use this information to answer the UX questions above (e.g. breakpoints etc.).
Comment #24
akhil babu
Comment #25
catia_penas commented
Comment #26
catia_penas commented
Comment #27
yautja_cetanu commented
Comment #29
rakhimandhania commented
Comment #30
rakhimandhania commented
Comment #31
rakhimandhania commented