[Tracker]
Update Summary: Buffer flushes mid-attribute on relative URLs, corrupting HTML in server-side consumers
Short Description: StreamedChatMessageIterator buffer corrupts HTML when consuming streamed responses server-side (relative URLs split mid-attribute)
Check-in Date: 03/18/2026
Metadata is used by the AI Tracker. Docs and additional fields here.
[/Tracker]

Problem/Motivation

When using the AI module's Chat operation type to process HTML-rich content server-side (e.g. via ai_integration_eca or any server-side batch consumer), the output HTML is consistently corrupted. Attributes are split mid-value, tags are broken, table structures are destroyed, and src/href URLs are truncated.

The root cause is in StreamedChatMessageIterator, where the buffer flushes when $maxBufferSize (100 characters) is reached or on every newline. This causes the buffer to flush mid-attribute when processing HTML content with relative URLs like src="/sites/default/files/...".
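The flush rule described above can be illustrated with a minimal sketch. This is not the AI module's code: the function simulateFlushes() and its character-by-character loop are hypothetical and only model "flush at 100 characters or on a newline" to show where the boundary can fall.

```php
<?php

/**
 * Hypothetical, simplified model of a size/newline-triggered buffer.
 * Illustrative only; the real iterator receives provider chunks.
 */
function simulateFlushes(string $stream, int $maxBufferSize = 100): array {
  $flushes = [];
  $buffer = '';
  foreach (str_split($stream) as $char) {
    $buffer .= $char;
    // Flush on every newline or once the size limit is reached.
    if ($char === "\n" || strlen($buffer) >= $maxBufferSize) {
      $flushes[] = $buffer;
      $buffer = '';
    }
  }
  // Emit whatever is left when the stream ends.
  if ($buffer !== '') {
    $flushes[] = $buffer;
  }
  return $flushes;
}

// 87 characters of markup, then an image tag with a relative URL.
$html = '<p>' . str_repeat('x', 80) . '</p>'
  . '<img src="/sites/default/files/inline-images/image.png" width="2816" height="1536">';

$flushes = simulateFlushes($html);
foreach ($flushes as $i => $chunk) {
  echo 'Flush ', $i + 1, ': ', $chunk, PHP_EOL;
}
```

In this model the size limit is hit inside the src value, so the first flush ends with an unclosed attribute. Concatenating the flushes still reproduces the input; the sketch only shows where the boundary lands.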

Note: as of Drupal 11.3, Fibers are supported and any AI provider call runs in streaming mode — including server-side consumers like ai_integration_eca. There is no way to avoid this code path.

Steps to reproduce (required for bugs, not feature requests)

  1. Install Drupal 11.3+ with the AI module and ai_integration_eca
  2. Configure an ECA Chat action that processes a node body field containing images with relative URLs (src="/sites/default/files/..."), HTML tables, and links
  3. Trigger the ECA model
  4. Observe the output — HTML attributes are split, tags are broken, table structure is destroyed

Example of corruption observed

Two consecutive flushes were observed while debugging with Xdebug inside flushInternal():

Flush 1 (buffer limit reached mid-attribute):
src="/sites/default"

Flush 2 (rest of the URL):
/files/inline-images/image.png" width="2816" height="1536">

The result is broken HTML, with the rest of the URL orphaned outside the tag.

Proposed resolution

  1. Fix the regex in shouldFlush() to protect relative URLs starting with /:
// Before — only protects absolute URLs
if (preg_match('/"(?:http|\/\/:)[^"]*$/', $this->buffer)...

// After — also protects relative URLs
if (preg_match('/"(?:http|\/)[^"]*$/', $this->buffer)...
  2. Consider increasing $maxBufferSize or making it configurable to reduce the risk of mid-attribute flushes on long attribute values.
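To see what the regex change in step 1 does, the two patterns can be run against a few buffer states. This is a sketch: the patterns are copied from the snippet above, while the helper endsInsideUrlAttribute() and the sample buffers are hypothetical. It assumes a match means the buffer currently ends inside an unclosed URL attribute, which is how the "protects" comments read; the rest of the shouldFlush() condition is elided in the source and is not reproduced here.

```php
<?php

// Patterns as quoted in the proposal; everything else is illustrative.
const PATTERN_BEFORE = '/"(?:http|\/\/:)[^"]*$/';
const PATTERN_AFTER = '/"(?:http|\/)[^"]*$/';

/**
 * Hypothetical helper: TRUE if the buffer ends inside an open quoted
 * value that the given pattern recognises as a URL.
 */
function endsInsideUrlAttribute(string $pattern, string $buffer): bool {
  return preg_match($pattern, $buffer) === 1;
}

$buffers = [
  'relative, open' => '<img src="/sites/default',
  'absolute, open' => '<a href="https://example.com/pa',
  'relative, closed' => '<img src="/a.png" width="28',
];

foreach ($buffers as $label => $buffer) {
  printf(
    "%-17s before=%s after=%s\n",
    $label,
    var_export(endsInsideUrlAttribute(PATTERN_BEFORE, $buffer), TRUE),
    var_export(endsInsideUrlAttribute(PATTERN_AFTER, $buffer), TRUE)
  );
}
```

Only the first buffer changes outcome between the two patterns: the original pattern does not match an open relative URL, the revised one does. Both patterns require a double quote before the URL, so unquoted or single-quoted values are outside what this check covers.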

Remaining tasks

  • Fix the regex in shouldFlush() to protect relative URLs
  • Add a kernel test that verifies HTML with relative URLs is not corrupted when consuming the iterator server-side

User interface changes

None.

API changes

None.

Data model changes

None.

Release notes snippet

Fixed HTML corruption in StreamedChatMessageIterator when processing content with relative URLs server-side (e.g. via ECA or batch processing).

Comments

increweb21 created an issue. See original summary.

marcus_johansson

This will affect 1.2.x as well, so it needs a backport. Will work on it right away.

marcus_johansson

Status: Active » Needs review

In theory the first fix you proposed should solve it. I have added it, along with the Kernel tests and a slight change to how the test provider streams. However, I can't actually replicate the underlying issue: even when the flush happens in the middle of a generation, the output is still OK.

marcus_johansson

As for chunk size, we should add a feature for it. The problem is that when you actually want streaming, anything over 100 characters looks unpleasant.

marcus_johansson

Assigned: marcus_johansson » Unassigned
arianraeesi

abhisekmazumdar

Assigned: Unassigned » abhisekmazumdar
abhisekmazumdar

Assigned: abhisekmazumdar » Unassigned
Status: Needs review » Reviewed & tested by the community

Marking RTBC.

Reviewed the patch and smoke tested the fix locally. Tested the following cases:

  • <img src="/sites/default/files/...">: relative src attribute, the original reported corruption case
  • <a href="/node/about-us/team/our-people/staff-directory">: long relative href
  • Markdown relative link: [link text](/long/relative/path)
  • HTML table containing a relative URL in a cell

The reconstructed output was intact across all tests.

Code review looks good. The regex change in shouldFlush() correctly extends protection to relative URLs starting with /, in addition to the existing absolute URL protection.

a.dmitriiev made their first commit to this issue’s fork.

a.dmitriiev

Status: Reviewed & tested by the community » Fixed

Rebased and merging

  • a.dmitriiev committed 6d5a4701 on 1.3.x
    Resolve #3579967 "Streamedchatmessageiterator buffer corrupts"
    

  • a.dmitriiev committed cdabc3df on 1.2.x
    Resolve #3579967 "Streamedchatmessageiterator buffer corrupts"
    

  • a.dmitriiev committed cc6ae3fa on 1.4.x
    Resolve #3579967 "Streamedchatmessageiterator buffer corrupts"
    
a.dmitriiev

No need to cherry-pick to 2.x as it would be handled with guardrails.

abhisekmazumdar

The credits for the respective contributors were missing from this issue. I have updated them.

arianraeesi

Issue tags: -AI Product Development