Add prompt caching UI and PDF input support [#3590963]

Summary

Add two native API features to the Anthropic provider:

Prompt caching admin UI — surface the CacheControlEphemeral wiring already shipped in #3572402; per-block cache breakpoints; 1-hour TTL toggle; report cache_creation_input_tokens alongside the existing cache_read_input_tokens.
PDF input — wire Base64PDFSource / URLPDFSource block params for models with the pdfInput capability (Opus 4.x, Sonnet 4.x, Haiku 4.x per the live capability API).

Features

1. Prompt caching UI

Phase 1 shipped CacheControlEphemeral::with() wiring on the top-level message block (toggled via the configuration['prompt_cache'] boolean) and mapped Usage->cacheReadInputTokens to TokenUsageDto->cached for reporting. Phase 2 adds the admin surface:

Admin toggle: "Enable prompt caching" checkbox in getModelSettings(), gated by the typed ModelCapabilities
TTL selector: dropdown for the 5-minute default vs the 1-hour TTL (Anthropic's extended cache tier). Both use the ephemeral cache_control type; the SDK selects the TTL via CacheControlEphemeral::with(ttl: ...)
Per-block markers: optional pattern for attaching cache_control to specific message blocks via a cache_breakpoint marker in ChatMessage metadata. Falls back to top-level caching when no marker is present.
Reporting parity: TokenUsageDto already has cached; add a parallel surface for cache_creation_input_tokens so AI Logging surfaces both "cache hit" and "cache write" tokens.
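
The per-block marker fallback described above could look roughly like the following. This is a minimal sketch using plain arrays in the shape of the Messages API content blocks, not the SDK's typed params. The function name and the cache_breakpoint key on each block are hypothetical, and the cap of four breakpoints is taken from the "4-breakpoint support" wording later in this issue.

```php
<?php

declare(strict_types=1);

/**
 * Attaches cache_control to marked content blocks (hypothetical helper).
 *
 * Each block is a content block array with an optional, illustrative
 * 'cache_breakpoint' flag. When no block is marked, the last block gets the
 * breakpoint, which mimics the top-level fallback.
 */
function apply_cache_breakpoints(array $blocks, string $ttl = '5m', int $max = 4): array {
  $cache_control = ['type' => 'ephemeral'];
  if ($ttl === '1h') {
    $cache_control['ttl'] = '1h';
  }

  // Collect the indexes of explicitly marked blocks.
  $marked = array_keys(array_filter(
    $blocks,
    static fn(array $block): bool => !empty($block['cache_breakpoint'])
  ));
  if ($marked === [] && $blocks !== []) {
    $marked = [array_key_last($blocks)];
  }
  // Keep at most $max breakpoints, preferring the latest ones.
  $marked = array_slice($marked, -$max);

  foreach ($blocks as $index => &$block) {
    // The marker is internal and must not be sent to the API.
    unset($block['cache_breakpoint']);
    if (in_array($index, $marked, TRUE)) {
      $block['cache_control'] = $cache_control;
    }
  }
  unset($block);

  return $blocks;
}
```

Preferring the latest markers when there are too many is a choice of this sketch; a real implementation might instead reject the request or log a warning.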

2. PDF input

Mirrors Phase 1's image-input wiring:

Capability gating: $client->models->retrieve($id)->capabilities->pdfInput->supported drives whether the operation is offered for a given model
buildMessageContent() extension: detect PDF attachments in ChatMessage; construct DocumentBlockParam with Base64PDFSource or URLPDFSource and append to the message content list
Capability declaration: add an AiModelCapability::ChatWithPdfInput filter in getConfiguredModels() mirroring the existing ChatWithImageVision branch
AI core coordination: if ChatWithPdfInput capability doesn't exist in AI core 1.3.x yet, propose it upstream (one-line enum case + interface contract)
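
For orientation, the two PDF source shapes can be sketched as plain arrays matching the Messages API document block. This is an illustration only: the provider itself uses the SDK's typed DocumentBlockParam, Base64PDFSource and URLPDFSource, and the three function names below are hypothetical.

```php
<?php

declare(strict_types=1);

/**
 * Builds a document block from raw PDF bytes (base64 source).
 */
function pdf_block_from_bytes(string $pdf_bytes): array {
  return [
    'type' => 'document',
    'source' => [
      'type' => 'base64',
      'media_type' => 'application/pdf',
      'data' => base64_encode($pdf_bytes),
    ],
  ];
}

/**
 * Builds a document block that references a PDF by URL.
 */
function pdf_block_from_url(string $url): array {
  return [
    'type' => 'document',
    'source' => ['type' => 'url', 'url' => $url],
  ];
}

/**
 * Builds one message's content list: the text, then the appended documents.
 */
function build_pdf_message_content(string $text, array $document_blocks): array {
  $content = [['type' => 'text', 'text' => $text]];
  foreach ($document_blocks as $block) {
    $content[] = $block;
  }
  return $content;
}
```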

Out of scope (later issues)

Compaction follows in Phase 3 alongside token counting. The cross-provider CompactionInterface in AI core was closed won't-fix (#3573087, April 2026), but Marcus's reasoning "compaction works differently for each provider ... most likely each provider solves this as they wish" is an explicit invitation to ship it provider-internal. Phase 3 wires MessageCreateParams::with(contextManagement: ...) through the same capability-driven UI pattern Phase 1 established, gated by ContextManagementCapability (clearThinking20251015, clearToolUses20250919, compact20260112).
Token counting (/v1/messages/count_tokens) — Phase 3
Citations — Phase 3
Data residency headers, web fetch, code execution — Phase 4

Implementation approach

TDD: tests for cache UI rendering (capability-gated), cache header attachment (per-block + top-level), PDF block construction, capability filtering
Live end-to-end: cache hit/miss demo (D1/D2 scenarios deferred from #3572402), PDF round-trip with both base64 and URL sources, cache_creation vs cache_read accounting in AI Log

Depends on

#3572402: Phase 1 (merged in d1e078a1)

Related

#3573087: Add Compaction OperationType (closed won't-fix; compaction lands in our Phase 3 as a provider-internal feature)
#3538499: Meta: Use Symfony AI (Initiative migration tracker)

Issue fork ai_provider_anthropic-3590963

Branch 3590963-add-prompt-caching, MR !28

Comments

Comment #1

19 May 2026 at 20:16

camoa created an issue.

Comment #2

camoa commented 19 May 2026 at 20:21

Updated the issue summary.

Comment #3

camoa commented 20 May 2026 at 14:33

MR pushed to 3590963-add-prompt-caching (7 commits, targets 1.3.x). Summary of what landed and how it was verified.

What was built

Prompt caching

Admin UI in the Anthropic provider settings form: an "Enable prompt caching" toggle, a TTL selector (5-minute default / 1-hour extended tier), and a cache-diagnostics opt-in. TTL and diagnostics fields are #states-gated on the toggle.
When caching is enabled and a system prompt is present, the system prompt is sent as a typed TextBlockParam with cache_control attached. Caching a bare string does nothing; the breakpoint has to sit on a content block. A top-level cache_control marker is also set so the SDK places a second breakpoint on the last message block.
1-hour TTL requires the extended-cache-ttl-2025-04-11 beta header; the provider attaches it per-request via RequestOptions::extraHeaders only when 1h is selected. The SDK does not auto-attach it.
cache_creation_input_tokens is surfaced through ChatOutput::getMetadata()['cache_creation_tokens'], complementing the cache_read_input_tokens → TokenUsageDto::cached reporting from #3572402.
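
The system-prompt and beta-header behaviour above can be sketched with plain arrays in the shape of the Messages API request. This is a hedged illustration, not the provider's code: the function name and return layout are hypothetical, and the real implementation goes through TextBlockParam and RequestOptions::extraHeaders. The anthropic-beta header name and the extended-cache-ttl-2025-04-11 value are the ones named in this comment.

```php
<?php

declare(strict_types=1);

/**
 * Builds the cache-related request parts (illustrative helper).
 *
 * Returns ['system' => string|array, 'headers' => array].
 */
function build_cached_system(string $system_prompt, bool $cache_enabled, string $ttl = '5m'): array {
  if (!$cache_enabled || $system_prompt === '') {
    // Without caching a bare string is enough and no extra header is needed.
    return ['system' => $system_prompt, 'headers' => []];
  }

  $cache_control = ['type' => 'ephemeral'];
  $headers = [];
  if ($ttl === '1h') {
    // The extended tier is only requested, and the beta header only sent,
    // when the 1-hour TTL is selected.
    $cache_control['ttl'] = '1h';
    $headers['anthropic-beta'] = 'extended-cache-ttl-2025-04-11';
  }

  return [
    // The breakpoint sits on a typed text block rather than a bare string.
    'system' => [[
      'type' => 'text',
      'text' => $system_prompt,
      'cache_control' => $cache_control,
    ]],
    'headers' => $headers,
  ];
}
```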

PDF input

buildMessageContent() detects application/pdf files on a ChatMessage and emits a typed DocumentBlockParam with Base64PDFSource — mirroring the existing image-input path.
getConfiguredModels() gains a provider-internal anthropic:pdf_input capability filter, gated on the typed ModelCapabilities->pdfInput->supported flag with a last-known-good regex fallback.
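
The "typed flag with regex fallback" gating might be sketched as follows. The pattern is an assumption based on the "Opus 4.x, Sonnet 4.x, Haiku 4.x" wording in the issue summary, not the module's actual last-known-good regex, and both function names are hypothetical. A NULL flag stands for "the capability API gave no answer".

```php
<?php

declare(strict_types=1);

// Illustrative pattern only; the module's real fallback regex may differ.
const PDF_FALLBACK_PATTERN = '/^claude-(opus|sonnet|haiku)-4([.-]|$)/';

/**
 * Decides PDF support from the reported flag, else from the model ID.
 */
function model_supports_pdf_input(?bool $reported, string $model_id): bool {
  if ($reported !== NULL) {
    // The live capability flag always wins when it is available.
    return $reported;
  }
  return preg_match(PDF_FALLBACK_PATTERN, $model_id) === 1;
}

/**
 * Filters a map of model ID => reported flag (or NULL) to PDF-capable IDs.
 */
function filter_pdf_models(array $models): array {
  return array_keys(array_filter(
    $models,
    static fn(?bool $reported, string $id): bool => model_supports_pdf_input($reported, $id),
    ARRAY_FILTER_USE_BOTH
  ));
}
```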

SDK maintenance

Composer constraint moved from ^0.16 to >=0.16,<1.0 so the rolling 0.x line is picked up. Tested against the current v0.23.0.
Dropped a now-redundant method_exists() guard (drupal/ai ^1.3 guarantees the method); PHPStan level 8 flagged it.
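
For context on the constraint change: Composer treats a caret constraint on a 0.x version as locked to that minor, so ^0.16 means >=0.16.0 <0.17.0 and never reaches 0.23.0. The sketch below approximates that with PHP's version_compare(); it is a simplified stand-in for Composer's own constraint matching, and the helper name is hypothetical.

```php
<?php

declare(strict_types=1);

/**
 * Checks a version against the half-open range [min, max).
 */
function version_in_range(string $version, string $min, string $max): bool {
  return version_compare($version, $min, '>=')
    && version_compare($version, $max, '<');
}

// ^0.16 behaves like >=0.16.0 <0.17.0 for a 0.x package.
$caret = version_in_range('0.23.0', '0.16.0', '0.17.0');

// The new constraint >=0.16,<1.0 covers the rest of the 0.x line.
$range = version_in_range('0.23.0', '0.16', '1.0');
```

Here $caret is FALSE and $range is TRUE, which is why v0.23.0 only became installable after the change.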

How it was tested

Same approach as #3572402 — unit tests for logic, then a live end-to-end pass against a real Anthropic key on DDEV, because unit tests verify logic but only real API calls verify contracts.

Unit: 82 tests / 148 assertions. Added a dedicated test for the streaming iterator (re-entrancy guard, typed event dispatch, tool-block accumulation) and for the new caching/PDF helpers.
Static analysis: PHPCS (Drupal + DrupalPractice) clean on src/ and tests/; PHPStan level 8 clean.
Coverage: a module-local phpunit.xml.dist was added (the drupalci template prefers it over core's) with a <source> block scoping coverage to ./src. Module line coverage 46.7%; the streaming iterator went from 0% to 97.6%. The admin form and the native-HTTP paths remain uncovered — those need Kernel/Functional tests and are tracked as follow-up.
Live end-to-end (Drush + a real key on DDEV):
- Cache write — first call with a long system prompt: cache_creation_input_tokens = 1262, cache_read = 0.
- Cache hit — immediate rerun: cache_read_input_tokens = 1262, cache_creation absent.
- 1-hour TTL — same scenario with the extended tier: no 400, beta header accepted, cache_creation = 1362.
- PDF round-trip — a test PDF correctly summarised on Sonnet 4.5, Opus 4.7, and Haiku 4.5.

One bug surfaced during the live pass and was fixed in the same branch: top-level cache_control alone does not cache the system prompt — the SDK's auto-placement targets the last message block, which is normally below Anthropic's 1024-token cache minimum. The fix sends the system prompt as a typed cacheable block.

Out of scope

PDF upload UI. This issue ships the provider plumbing — when a ChatMessage carries a PDF DocumentFile, the provider sends it correctly. It does not add a way for an end user to attach a PDF. The AI API Explorer's Chat Generator has a file field, but it hard-wraps uploads as ImageFile; DeepChat has no document-upload wiring. So PDF input is programmatic-only today (see the usage comment below). A PDF upload UI is an upstream change to ai_api_explorer / ai_chatbot, not to this provider.
Per-block cache breakpoints (4-breakpoint support) — needs an upstream way to mark a ChatMessage; deferred rather than shipped behind a magic-string convention.
Compaction / context management, token counting, citations — a later phase.

Comment #4

camoa commented 20 May 2026 at 14:33

How to use these features

Prompt caching

1. Go to Configuration → AI → AI Providers → Anthropic (/admin/config/ai/providers/anthropic).
2. Open the Prompt caching section and tick Enable prompt caching.
3. Pick a Cache TTL: 5 minutes (default) or 1 hour (extended tier), then save.

Once enabled, every native chat request marks the system prompt as a cache breakpoint. The first call with a given prefix writes the cache; subsequent calls within the TTL read from it at a reduced input-token cost. Caching pays off when the same large system prompt or context is reused across requests — short prompts below Anthropic's minimum cacheable size are simply not cached by the API.

A per-request override is available for code-level callers: set configuration['prompt_cache'] (and optionally configuration['prompt_cache_ttl']) on the provider instance — it takes precedence over the form setting.
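
The precedence rule can be sketched as below. Only the prompt_cache and prompt_cache_ttl configuration keys come from this comment; the function name, the assumption that the form stores its values under the same keys, and the '5m' / '1h' TTL values are illustrative.

```php
<?php

declare(strict_types=1);

/**
 * Resolves the effective caching settings (hypothetical helper).
 *
 * Per-request configuration wins over the saved form settings, including an
 * explicit FALSE that switches caching off for one request.
 */
function resolve_prompt_cache(array $form_settings, array $configuration): array {
  $enabled = array_key_exists('prompt_cache', $configuration)
    ? (bool) $configuration['prompt_cache']
    : (bool) ($form_settings['prompt_cache'] ?? FALSE);

  $ttl = (string) ($configuration['prompt_cache_ttl']
    ?? $form_settings['prompt_cache_ttl']
    ?? '5m');

  return ['enabled' => $enabled, 'ttl' => $ttl];
}
```

Using array_key_exists() rather than ?? for the toggle is deliberate in this sketch, so that a caller can pass FALSE to override a form setting of TRUE.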

Reading cache token usage

With AI Logging enabled, each logged response carries the cache accounting:

TokenUsageDto::cached — tokens served from cache (the saving on a hit).
ChatOutput::getMetadata()['cache_creation_tokens'] — tokens billed to write the cache (present on the first call of a cycle).
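
As a rough guide to reading the two numbers together, the sketch below labels a response from its cached and cache-creation token counts. The helper and its labels are hypothetical; the inputs are assumed to be the TokenUsageDto::cached value and the cache_creation_tokens metadata entry (NULL when absent).

```php
<?php

declare(strict_types=1);

/**
 * Labels a response by its cache accounting (illustrative helper).
 */
function describe_cache_usage(int $cached, ?int $cache_creation_tokens): string {
  $written = $cache_creation_tokens ?? 0;

  if ($cached > 0 && $written > 0) {
    // Part of the prefix was read from cache and a longer prefix was written.
    return 'hit+write';
  }
  if ($cached > 0) {
    return 'hit';
  }
  if ($written > 0) {
    return 'write';
  }
  // Caching disabled, or the prompt was below the minimum cacheable size.
  return 'none';
}
```

With the live-test numbers from the previous comment, describe_cache_usage(0, 1262) gives 'write' for the first call and describe_cache_usage(1262, NULL) gives 'hit' for the rerun.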

PDF input

PDF input is programmatic only. There is currently no UI in the AI module to attach a PDF to a chat — the AI API Explorer's file field wraps every upload as an ImageFile, and DeepChat has no document-upload wiring. A UI path is an upstream change to ai_api_explorer / ai_chatbot. What this issue ships is the provider plumbing: when a ChatMessage carries a PDF DocumentFile, the provider sends it correctly.

To use it from code (a custom module, an AI Agent, an automator), attach a DocumentFile to the ChatMessage — the same shape used for images, with the application/pdf MIME type:

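use Drupal\ai\OperationType\Chat\ChatInput;
use Drupal\ai\OperationType\Chat\ChatMessage;
use Drupal\ai\OperationType\GenericType\DocumentFile;

// Assumes the provider is loaded through the AI provider plugin manager.
$provider = \Drupal::service('ai.provider')->createInstance('anthropic');

// Read the PDF as raw bytes and fail early if the file is unreadable.
$pdf_bytes = file_get_contents('/path/to/report.pdf');
if ($pdf_bytes === FALSE) {
  throw new \RuntimeException('Could not read /path/to/report.pdf.');
}

// Attach the PDF the same way an image is attached, with the PDF MIME type.
$message = new ChatMessage('user', 'Summarise the attached document.');
$message->setFile(new DocumentFile($pdf_bytes, 'application/pdf', 'report.pdf'));

$input = new ChatInput([$message]);
$output = $provider->chat($input, 'claude-sonnet-4-5-20250929');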

The provider converts the file into a typed DocumentBlockParam and Claude reads both the text and the visual layout of the PDF. PDF input works on Claude 4.x models (Opus, Sonnet, Haiku); to restrict a model list to PDF-capable models, request the anthropic:pdf_input capability when calling getConfiguredModels().

Note: PDFs are passed to Anthropic byte-for-byte without inspection. Treat PDFs from untrusted sources as you would untrusted text — they can carry adversarial instructions. This caution is also shown on the provider settings form.

Comment #5

20 May 2026 at 16:17

camoa opened merge request !28

Comment #6

camoa commented 20 May 2026 at 16:17

Assigned:	camoa	» marcus_johansson
Status:	Active	» Needs review

Comment #7

marcus_johansson commented 20 May 2026 at 16:50

Status:	Needs review	» Needs work

Added some comments. The phpunit and cspell jobs also have failures.

Comment #8

marcus_johansson commented 20 May 2026 at 17:41

Assigned:	marcus_johansson	» Unassigned
