Problem/Motivation

Streaming doesn't work on Pantheon and similar hosts because they sit behind a CDN (Fastly) and proxy layers that buffer and cache full responses instead of letting PHP stream output incrementally.

We can approximate streaming by adding padding and a delay in the controller. These tweaks should be optional or overridable.

Issue fork agui-3584804

Comments

b_sharpe created an issue. See original summary.

b_sharpe commented:

Assigned: Unassigned » b_sharpe
Status: Active » Needs review

b_sharpe commented:

Assigned: b_sharpe » Unassigned

b_sharpe commented:

Title: Fix Streaming On Pantheon (or similar) » Fix Perceived Streaming On Proxied Hosts (Pantheon or similar)
Issue summary: View changes
ahmad-khalil-imagex commented:

Assigned: Unassigned » ahmad-khalil-imagex

ahmad-khalil-imagex commented:

Assigned: ahmad-khalil-imagex » Unassigned
Status: Needs review » Reviewed & tested by the community
Attachment (status: new, size: 32.44 MB)

Tested this MR on Drupal 10 with DDEV (nginx + PHP-FPM 8.3) against an AI Assistant routed through /agui/api/chat. Confirmed the patch behaves as designed:

Server-side wiring

  • StreamedResponse headers correctly include Content-Type: text/event-stream, Cache-Control: no-cache, Connection: keep-alive, and X-Accel-Buffering: no.
  • Transfer-Encoding: chunked (no Content-Length), TTFB ~57ms with content download spanning the full 11s of the run — true incremental streaming.
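
The header check above can be sketched as a small helper. This is a minimal Python illustration, not the module's PHP code: the header values come from the list above, while the names SSE_HEADERS and missing_streaming_headers are hypothetical.

```python
# Header values as listed in the review; the helper names are hypothetical.
SSE_HEADERS = {
    "Content-Type": "text/event-stream",
    "Cache-Control": "no-cache",
    "Connection": "keep-alive",
    "X-Accel-Buffering": "no",  # asks nginx not to buffer the proxied response
}


def missing_streaming_headers(response_headers: dict[str, str]) -> list[str]:
    """Return a list of problems that would hint at a buffered response."""
    # HTTP header names are case-insensitive, so normalise before comparing.
    seen = {name.lower(): value for name, value in response_headers.items()}
    problems = []
    for name, expected in SSE_HEADERS.items():
        # Ignore parameters such as "; charset=UTF-8" on Content-Type.
        actual = seen.get(name.lower(), "").split(";")[0].strip().lower()
        if actual != expected:
            problems.append(f"{name} should be {expected}")
    if "content-length" in seen:
        # A fixed length suggests the body was assembled before sending.
        problems.append("Content-Length present; body was likely buffered")
    return problems
```

An empty list means the response carries the headers the review lists and no Content-Length. It says nothing about what an intermediate proxy does with the body afterwards.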


Padding / delay knobs

  • The default initialPaddingSize=4096 emits a leading SSE comment line (":" followed by 4096 spaces).
  • The default chunkPaddingSize=1024 emits a ":" comment line between consecutive data: events, visible in the raw response body.
  • Setting chunkPadding=0 (via query string or POST body) removes the inter-event padding lines.
  • Setting chunkDelay=500 introduces a clearly visible per-chunk pause; the assistant message paints token-by-token with the configured delay.
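
To make the padding and delay behaviour concrete, here is a minimal Python sketch of the technique, not the module's actual PHP controller. The function name padded_sse_stream and its parameter names are hypothetical. It assumes each chunk is a single line of text and, as a simplification, emits padding after every event, including the last one.

```python
import time
from typing import Callable, Iterable, Iterator


def padded_sse_stream(
    chunks: Iterable[str],
    initial_padding: int = 4096,
    chunk_padding: int = 1024,
    chunk_delay_ms: int = 0,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[str]:
    """Yield SSE frames with comment padding and an optional per-chunk delay."""
    if initial_padding > 0:
        # A line starting with ':' is an SSE comment; clients ignore it,
        # but the extra bytes can push a proxy past its buffer threshold.
        yield ":" + " " * initial_padding + "\n\n"
    for chunk in chunks:
        # Assumption: chunk contains no newline characters.
        yield f"data: {chunk}\n\n"
        if chunk_padding > 0:
            yield ":" + " " * chunk_padding + "\n\n"
        if chunk_delay_ms > 0:
            # sleep is injectable so the pause can be replaced in tests.
            sleep(chunk_delay_ms / 1000)
```

Passing chunk_padding=0 drops the comment frames between events, and a non-zero chunk_delay_ms pauses after each chunk, which mirrors the two overrides exercised above.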


Abuse caps from the follow-up commit

  • chunkDelay=99999 is correctly clamped to 1000ms server-side.
  • chunkPadding=99999 is correctly clamped to 8192 bytes server-side.
  • Negative values clamp to 0.
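
The caps above amount to a simple clamp. The following Python sketch uses the limits reported in this comment (1000 ms and 8192 bytes). The constant and function names are hypothetical and are not taken from the merge request.

```python
# Limits as reported in the review; the names are illustrative only.
MAX_CHUNK_DELAY_MS = 1000
MAX_CHUNK_PADDING_BYTES = 8192


def clamp(value: int, lower: int, upper: int) -> int:
    """Restrict value to the inclusive range [lower, upper]."""
    return max(lower, min(value, upper))


def clamp_chunk_delay(delay_ms: int) -> int:
    """Cap the per-chunk delay so a client cannot stall the request."""
    return clamp(delay_ms, 0, MAX_CHUNK_DELAY_MS)


def clamp_chunk_padding(padding_bytes: int) -> int:
    """Cap the per-chunk padding so a client cannot inflate the response."""
    return clamp(padding_bytes, 0, MAX_CHUNK_PADDING_BYTES)
```

For example, clamp_chunk_delay(99999) gives 1000, clamp_chunk_padding(99999) gives 8192, and negative inputs give 0, matching the three cases tested above.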


Both transports

  • Confirmed the params are accepted from both the JSON request body ({"chunkDelay":..., "chunkPadding":...}) and the URL query string (?chunkDelay=...&chunkPadding=...).
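
Reading the overrides from either transport could look like the Python sketch below. The parameter names chunkDelay and chunkPadding come from this comment; the function name read_stream_params is hypothetical. Letting the JSON body win over the query string is an assumption, since the thread does not say which one takes precedence.

```python
import json
from urllib.parse import parse_qs

PARAM_NAMES = ("chunkDelay", "chunkPadding")


def read_stream_params(body: str, query_string: str) -> dict[str, int]:
    """Collect integer overrides from a JSON body and a URL query string."""
    try:
        payload = json.loads(body) if body else {}
    except json.JSONDecodeError:
        payload = {}
    if not isinstance(payload, dict):
        payload = {}
    query = parse_qs(query_string)

    params: dict[str, int] = {}
    for name in PARAM_NAMES:
        # Assumption: the JSON body takes precedence over the query string.
        raw = payload.get(name)
        if raw is None and name in query:
            raw = query[name][0]
        try:
            params[name] = int(raw)
        except (TypeError, ValueError):
            # Missing or non-numeric values are skipped so defaults apply.
            continue
    return params
```

The returned values are unvalidated, so a caller would still need to apply the server-side caps before using them.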


Caveat

  • The original buffering symptom only reproduces behind a proxy (Fastly/Pantheon). On local DDEV streaming works regardless of the new settings, so I could only validate the plumbing locally. The padding + delay overrides are exactly the levers needed to defeat Fastly's buffering on Pantheon, and the implementation is sound.


RTBC from my perspective.

b_sharpe commented:

Status: Reviewed & tested by the community » Fixed
