Summary

Add two native API features to the Anthropic provider:

  1. Prompt caching admin UI — surface the CacheControlEphemeral wiring already shipped in #3572402; per-block cache breakpoints; 1-hour TTL toggle; report cache_creation_input_tokens alongside the existing cache_read_input_tokens.
  2. PDF input — wire Base64PDFSource / URLPDFSource block params for models with the pdfInput capability (Opus 4.x, Sonnet 4.x, Haiku 4.x per the live capability API).

Features

1. Prompt caching UI

Phase 1 shipped CacheControlEphemeral::with() wiring on the top-level message block (toggled via configuration['prompt_cache'] boolean) plus Usage->cacheReadInputTokens to TokenUsageDto->cached reporting. Phase 2 adds the admin surface:

  • Admin toggle: "Enable prompt caching" checkbox in getModelSettings(), gated by the typed ModelCapabilities
  • TTL selector: dropdown for ephemeral (5-min, default) vs ephemeral_1h (1-hour TTL, Anthropic's extended cache tier). SDK supports both via CacheControlEphemeral::with(ttl: ...)
  • Per-block markers: optional pattern for attaching cache_control to specific message blocks via a cache_breakpoint marker in ChatMessage metadata. Falls back to top-level caching when no marker is present.
  • Reporting parity: TokenUsageDto already has cached; add a parallel surface for cache_creation_input_tokens so AI Logging surfaces both "cache hit" and "cache write" tokens.
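
The per-block marker idea above can be illustrated roughly as follows. This is a sketch only: it works on plain arrays shaped like the Messages API JSON rather than on ChatMessage metadata, the `cache_breakpoint` array key and the `applyCacheBreakpoints()` name are hypothetical, and the fallback approximates top-level caching by marking the last block. It also caps markers at four, matching the 4-breakpoint limit referred to later in this thread.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: turn per-block markers into cache_control entries.
 *
 * @param array<int, array<string, mixed>> $blocks
 *   Content blocks. A block opts in with 'cache_breakpoint' => TRUE, which is
 *   an illustrative convention and not an AI core or SDK API.
 * @param string $ttl
 *   Either '5m' (default tier) or '1h' (extended tier).
 *
 * @return array<int, array<string, mixed>>
 *   The blocks with markers removed and cache_control attached.
 */
function applyCacheBreakpoints(array $blocks, string $ttl = '5m'): array {
  $cacheControl = $ttl === '1h'
    ? ['type' => 'ephemeral', 'ttl' => '1h']
    : ['type' => 'ephemeral'];
  $marked = 0;
  foreach ($blocks as $i => $block) {
    $wanted = !empty($block['cache_breakpoint']);
    // The marker is internal and must not reach the API payload.
    unset($blocks[$i]['cache_breakpoint']);
    if ($wanted && $marked < 4) {
      $blocks[$i]['cache_control'] = $cacheControl;
      $marked++;
    }
  }
  // Fallback: nothing was marked, so place one breakpoint on the last block.
  if ($marked === 0 && $blocks !== []) {
    $blocks[array_key_last($blocks)]['cache_control'] = $cacheControl;
  }
  return $blocks;
}
```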

2. PDF input

Mirrors Phase 1's image-input wiring:

  • Capability gating: $client->models->retrieve($id)->capabilities->pdfInput->supported drives whether the operation is offered for a given model
  • buildMessageContent() extension: detect PDF attachments in ChatMessage; construct DocumentBlockParam with Base64PDFSource or URLPDFSource and append to the message content list
  • Capability declaration: add an AiModelCapability::ChatWithPdfInput filter in getConfiguredModels() mirroring the existing ChatWithImageVision branch
  • AI core coordination: if ChatWithPdfInput capability doesn't exist in AI core 1.3.x yet, propose it upstream (one-line enum case + interface contract)

Out of scope (later issues)

  • Compaction follows in Phase 3 alongside token counting. The cross-provider CompactionInterface in AI core was closed won't-fix (#3573087, April 2026), but Marcus's reasoning "compaction works differently for each provider ... most likely each provider solves this as they wish" is an explicit invitation to ship it provider-internal. Phase 3 wires MessageCreateParams::with(contextManagement: ...) through the same capability-driven UI pattern Phase 1 established, gated by ContextManagementCapability (clearThinking20251015, clearToolUses20250919, compact20260112).
  • Token counting (/v1/messages/count_tokens) — Phase 3
  • Citations — Phase 3
  • Data residency headers, web fetch, code execution — Phase 4

Implementation approach

  • TDD: tests for cache UI rendering (capability-gated), cache header attachment (per-block + top-level), PDF block construction, capability filtering
  • Live end-to-end: cache hit/miss demo (D1/D2 scenarios deferred from #3572402), PDF round-trip with both base64 and URL sources, cache_creation vs cache_read accounting in AI Log

Depends on

  • #3572402 — Phase 1 (merged in d1e078a1)

Related

  • #3573087 — Add Compaction OperationType (closed won't-fix; compaction lands in our Phase 3 as a provider-internal feature)
  • #3538499 — Meta: Use Symfony AI (Initiative migration tracker)

Comments

camoa created an issue. See original summary.

camoa’s picture

MR pushed to 3590963-add-prompt-caching (7 commits, targets 1.3.x). Below is a summary of what landed and how it was verified.

What was built

Prompt caching

  • Admin UI in the Anthropic provider settings form: an "Enable prompt caching" toggle, a TTL selector (5-minute default / 1-hour extended tier), and a cache-diagnostics opt-in. TTL and diagnostics fields are #states-gated on the toggle.
  • When caching is enabled and a system prompt is present, the system prompt is sent as a typed TextBlockParam with cache_control attached. Caching a bare string does nothing; the breakpoint has to sit on a content block. A top-level cache_control marker is also set so the SDK places a second breakpoint on the last message block.
  • 1-hour TTL requires the extended-cache-ttl-2025-04-11 beta header; the provider attaches it per-request via RequestOptions::extraHeaders only when 1h is selected. The SDK does not auto-attach it.
  • cache_creation_input_tokens is surfaced through ChatOutput::getMetadata()['cache_creation_tokens'], complementing the cache_read_input_tokens → TokenUsageDto::cached reporting from #3572402.
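
To make the system-prompt and beta-header behaviour above concrete, here is a minimal sketch of the request shape. It uses plain arrays that mirror the HTTP API's JSON instead of the SDK's typed params, and the function name and return layout are hypothetical; it is not the provider's actual code.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical sketch: build the cache-related parts of a Messages request.
 *
 * @return array{system: mixed, headers: array<string, string>}
 *   'system' is either the bare string or a list with one cacheable text
 *   block; 'headers' holds extra HTTP headers for this request only.
 */
function buildCachedSystemPrompt(string $systemPrompt, bool $cacheEnabled, string $ttl = '5m'): array {
  if (!$cacheEnabled || $systemPrompt === '') {
    // A bare string is valid input but cannot carry a cache breakpoint.
    return ['system' => $systemPrompt, 'headers' => []];
  }
  $cacheControl = ['type' => 'ephemeral'];
  $headers = [];
  if ($ttl === '1h') {
    $cacheControl['ttl'] = '1h';
    // Beta header named above; attached only when the 1-hour tier is chosen.
    $headers['anthropic-beta'] = 'extended-cache-ttl-2025-04-11';
  }
  return [
    'system' => [
      [
        'type' => 'text',
        'text' => $systemPrompt,
        'cache_control' => $cacheControl,
      ],
    ],
    'headers' => $headers,
  ];
}
```

With the default TTL the header list stays empty, so requests on the 5-minute tier are unaffected by the beta flag.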

PDF input

  • buildMessageContent() detects application/pdf files on a ChatMessage and emits a typed DocumentBlockParam with Base64PDFSource — mirroring the existing image-input path.
  • getConfiguredModels() gains a provider-internal anthropic:pdf_input capability filter, gated on the typed ModelCapabilities->pdfInput->supported flag with a last-known-good regex fallback.
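
The two PDF pieces above can be sketched like this. The document block is shown as the plain-array equivalent of the API JSON rather than the typed DocumentBlockParam, both function names are hypothetical, and the fallback regex is an illustrative guess, not the pattern the provider actually ships.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical sketch: the API-level shape of a base64 PDF document block.
 *
 * @return array<string, mixed>
 */
function buildPdfDocumentBlock(string $pdfBytes): array {
  return [
    'type' => 'document',
    'source' => [
      'type' => 'base64',
      'media_type' => 'application/pdf',
      'data' => base64_encode($pdfBytes),
    ],
  ];
}

/**
 * Hypothetical sketch: capability flag first, model-ID pattern as fallback.
 *
 * @param bool|null $capabilityFlag
 *   The typed pdfInput "supported" value, or NULL when it is unavailable.
 * @param string $modelId
 *   The model ID, e.g. 'claude-sonnet-4-5-20250929'.
 */
function modelSupportsPdf(?bool $capabilityFlag, string $modelId): bool {
  if ($capabilityFlag !== NULL) {
    return $capabilityFlag;
  }
  // Assumed pattern for illustration: Claude 4.x family model IDs.
  return preg_match('/^claude-(opus|sonnet|haiku)-4/', $modelId) === 1;
}
```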

SDK maintenance

  • Composer constraint moved from ^0.16 to >=0.16,<1.0 so the rolling 0.x line is picked up. Tested against the current v0.23.0.
  • Dropped a now-redundant method_exists() guard (drupal/ai ^1.3 guarantees the method); PHPStan level 8 flagged it.

How it was tested

Same approach as #3572402 — unit tests for logic, then a live end-to-end pass against a real Anthropic key on DDEV, because unit tests verify logic but only real API calls verify contracts.

  • Unit: 82 tests / 148 assertions. Added a dedicated test for the streaming iterator (re-entrancy guard, typed event dispatch, tool-block accumulation) and for the new caching/PDF helpers.
  • Static analysis: PHPCS (Drupal + DrupalPractice) clean on src/ and tests/; PHPStan level 8 clean.
  • Coverage: a module-local phpunit.xml.dist was added (the drupalci template prefers it over core's) with a <source> block scoping coverage to ./src. Module line coverage 46.7%; the streaming iterator went from 0% to 97.6%. The admin form and the native-HTTP paths remain uncovered — those need Kernel/Functional tests and are tracked as follow-up.
  • Live end-to-end (Drush + a real key on DDEV):
    • Cache write — first call with a long system prompt: cache_creation_input_tokens = 1262, cache_read = 0.
    • Cache hit — immediate rerun: cache_read_input_tokens = 1262, cache_creation absent.
    • 1-hour TTL — same scenario with the extended tier: no 400, beta header accepted, cache_creation = 1362.
    • PDF round-trip — a test PDF correctly summarised on Sonnet 4.5, Opus 4.7, and Haiku 4.5.

One bug surfaced during the live pass and was fixed in the same branch: top-level cache_control alone does not cache the system prompt — the SDK's auto-placement targets the last message block, which is normally below Anthropic's 1024-token cache minimum. The fix sends the system prompt as a typed cacheable block.
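
For anyone repeating the live pass, the write/hit distinction can be read straight off the usage counters. The sketch below assumes the usage data has been decoded into an array keyed by the API field names; the `classifyCacheUsage()` helper and its labels are hypothetical.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: label a response by its cache accounting.
 *
 * @param array<string, mixed> $usage
 *   Usage counters keyed as in the API response. Missing keys count as 0.
 *
 * @return string
 *   'hit', 'write', 'partial' (read and write in one call), or 'none'.
 */
function classifyCacheUsage(array $usage): string {
  $written = (int) ($usage['cache_creation_input_tokens'] ?? 0);
  $read = (int) ($usage['cache_read_input_tokens'] ?? 0);
  if ($read > 0 && $written > 0) {
    return 'partial';
  }
  if ($read > 0) {
    return 'hit';
  }
  if ($written > 0) {
    return 'write';
  }
  // Caching disabled, or the prompt was below the minimum cacheable size.
  return 'none';
}
```

A result of 'none' with caching enabled is the symptom of the bug described above: the breakpoint sat on a block too small to cache.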

Out of scope

  • PDF upload UI. This issue ships the provider plumbing — when a ChatMessage carries a PDF DocumentFile, the provider sends it correctly. It does not add a way for an end user to attach a PDF. The AI API Explorer's Chat Generator has a file field, but it hard-wraps uploads as ImageFile; DeepChat has no document-upload wiring. So PDF input is programmatic-only today (see the usage comment below). A PDF upload UI is an upstream change to ai_api_explorer / ai_chatbot, not to this provider.
  • Per-block cache breakpoints (4-breakpoint support) — needs an upstream way to mark a ChatMessage; deferred rather than shipped behind a magic-string convention.
  • Compaction / context management, token counting, citations — a later phase.
camoa’s picture

How to use these features

Prompt caching

  1. Go to Configuration → AI → AI Providers → Anthropic (/admin/config/ai/providers/anthropic).
  2. Open the Prompt caching section, tick Enable prompt caching.
  3. Pick a Cache TTL: 5 minutes (default) or 1 hour (extended tier). Save.

Once enabled, every native chat request marks the system prompt as a cache breakpoint. The first call with a given prefix writes the cache; subsequent calls within the TTL read from it at a reduced input-token cost. Caching pays off when the same large system prompt or context is reused across requests — short prompts below Anthropic's minimum cacheable size are simply not cached by the API.

A per-request override is available for code-level callers: set configuration['prompt_cache'] (and optionally configuration['prompt_cache_ttl']) on the provider instance — it takes precedence over the form setting.
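
The precedence rule can be pictured as a simple lookup in which the per-request configuration wins over the saved form values. In this sketch the `resolvePromptCache()` name, the form-setting keys, and the '5m' / '1h' TTL strings are assumptions; only the two configuration keys come from the description above.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical sketch: per-request configuration overrides form settings.
 *
 * @param array<string, mixed> $configuration
 *   Values set on the provider instance by calling code.
 * @param array<string, mixed> $formSettings
 *   Values saved from the settings form (key names assumed to match).
 *
 * @return array{enabled: bool, ttl: string}
 */
function resolvePromptCache(array $configuration, array $formSettings): array {
  return [
    'enabled' => (bool) ($configuration['prompt_cache']
      ?? $formSettings['prompt_cache']
      ?? FALSE),
    'ttl' => (string) ($configuration['prompt_cache_ttl']
      ?? $formSettings['prompt_cache_ttl']
      ?? '5m'),
  ];
}
```

Because the lookup uses `??`, an explicit `FALSE` in the per-request configuration still overrides an enabled form setting; only an absent key falls through.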

Reading cache token usage

With AI Logging enabled, each logged response carries the cache accounting:

  • TokenUsageDto::cached — tokens served from cache (the saving on a hit).
  • ChatOutput::getMetadata()['cache_creation_tokens'] — tokens billed to write the cache (present on the first call of a cycle).

PDF input

PDF input is programmatic only. There is currently no UI in the AI module to attach a PDF to a chat — the AI API Explorer's file field wraps every upload as an ImageFile, and DeepChat has no document-upload wiring. A UI path is an upstream change to ai_api_explorer / ai_chatbot. What this issue ships is the provider plumbing: when a ChatMessage carries a PDF DocumentFile, the provider sends it correctly.

To use it from code (a custom module, an AI Agent, an automator), attach a DocumentFile to the ChatMessage — the same shape used for images, with the application/pdf MIME type:

use Drupal\ai\OperationType\Chat\ChatInput;
use Drupal\ai\OperationType\Chat\ChatMessage;
use Drupal\ai\OperationType\GenericType\DocumentFile;

$message = new ChatMessage('user', 'Summarise the attached document.');
$message->setFile(new DocumentFile(
  file_get_contents('/path/to/report.pdf'),
  'application/pdf',
  'report.pdf',
));

$input = new ChatInput([$message]);
$output = $provider->chat($input, 'claude-sonnet-4-5-20250929');

The provider converts the file into a typed DocumentBlockParam and Claude reads both the text and the visual layout of the PDF. PDF input works on Claude 4.x models (Opus, Sonnet, Haiku); to restrict a model list to PDF-capable models, request the anthropic:pdf_input capability when calling getConfiguredModels().

Note: PDFs are passed to Anthropic byte-for-byte without inspection. Treat PDFs from untrusted sources as you would untrusted text — they can carry adversarial instructions. This caution is also shown on the provider settings form.

camoa’s picture

Assigned: camoa » marcus_johansson
Status: Active » Needs review
marcus_johansson’s picture

Status: Needs review » Needs work

Added some comments. The phpunit and cspell checks also have failures.

marcus_johansson’s picture

Assigned: marcus_johansson » Unassigned