Issue summary

Add retrieval tracing to the Drupal Langfuse module by modeling the RAG retrieval workflow as
span observations inside the existing per-request trace. Nest child spans only where needed; a single span is also acceptable.

Problem/Motivation

  • Today, the module creates one trace per Drupal request and appends LLM-related observations (generation, event).
  • Retrieval for RAG (vector store query + context preparation) is not captured. Logging it as a single event would lose timing and structure.
  • We want end-to-end observability of retrieval latency, inputs, filters, candidates, and context encoding stats.

Proposed resolution

  • Represent retrieval as a span named rag.retrieve.
  • Optionally, within that span, create child spans for each step:
    • vector-store-query
    • context-encoding (optional, since this already happens)
    • rerank (optional)
  • No breaking changes: this only adds spans; existing trace lifecycle and bulk-send remain unchanged.
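
The nesting proposed above can be sketched as follows. This is a minimal illustration in Python rather than the module's PHP. The `Trace` class, its `span()` helper, and the fake candidate data are all hypothetical stand-ins, not the Langfuse SDK or the module's API. Only the span names come from the proposal.

```python
import time
import uuid
from contextlib import contextmanager


class Trace:
    """Hypothetical in-memory stand-in for the per-request trace."""

    def __init__(self):
        self.observations = []  # flat list; each span links to its parent by id
        self._stack = []        # ids of the spans that are currently open

    @contextmanager
    def span(self, name, **metadata):
        obs = {
            "id": uuid.uuid4().hex,
            "type": "span",
            "name": name,
            # The innermost open span (if any) becomes the parent.
            "parent_id": self._stack[-1] if self._stack else None,
            "metadata": metadata,
            "start": time.perf_counter(),
            "end": None,
        }
        self.observations.append(obs)
        self._stack.append(obs["id"])
        try:
            yield obs
        finally:
            # Always close the span, even if the wrapped step raises.
            obs["end"] = time.perf_counter()
            self._stack.pop()


def retrieve(trace, query, top_k=3):
    """Fake retrieval that records one parent span and two child spans."""
    with trace.span("rag.retrieve", query=query, topK=top_k):
        with trace.span("vector-store-query"):
            # Placeholder for the real vector store lookup.
            candidates = [
                {"id": f"doc-{i}", "score": 1.0 - i * 0.1} for i in range(top_k)
            ]
        with trace.span("context-encoding"):
            # Placeholder for turning candidates into prompt context.
            context = " ".join(c["id"] for c in candidates)
    return context
```

Because each span records its own start and end time, per-step latency is available without extra bookkeeping, which a single event would not provide.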

Recommended metadata on retrieval spans

  • query, normalizedQuery, topK, filters
  • vectorStore (provider/index/namespace), similarityMetric
  • candidates (IDs + scores; redact or hash content when needed)
  • rerankerModel (if used), per-step latencyMs, usedContextTokenCount
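
As a rough illustration of the metadata list above, the following Python sketch assembles such a payload and hashes candidate content instead of storing it. The function name, the `contentSha256` key, and the normalization rule (lowercase plus collapsed whitespace) are assumptions made for this example. The other keys mirror the recommended field names.

```python
import hashlib


def build_retrieval_metadata(query, top_k, filters, vector_store,
                             similarity_metric, candidates, latency_ms,
                             used_context_tokens, redact=True):
    """Assemble span metadata; hypothetical helper, not module code."""

    def describe(candidate):
        entry = {"id": candidate["id"], "score": round(candidate["score"], 4)}
        if "content" in candidate:
            if redact:
                # Store only a digest so the raw text never leaves the site.
                digest = hashlib.sha256(candidate["content"].encode("utf-8"))
                entry["contentSha256"] = digest.hexdigest()
            else:
                entry["content"] = candidate["content"]
        return entry

    return {
        "query": query,
        # Assumed normalization: lowercase and collapse whitespace.
        "normalizedQuery": " ".join(query.lower().split()),
        "topK": top_k,
        "filters": filters,
        "vectorStore": vector_store,  # e.g. provider/index/namespace
        "similarityMetric": similarity_metric,
        "candidates": [describe(c) for c in candidates],
        "latencyMs": latency_ms,
        "usedContextTokenCount": used_context_tokens,
    }
```

Hashing keeps candidates comparable across traces (the same chunk yields the same digest) without exposing their text.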

Attached file (comment #19): Screenshot from 2025-12-01 23-40-27.png, 36.99 KB, uploaded by nikro


    Comments

    nikro created an issue.

    • nikro committed b7a98551 on main
      feat: #3557948 Track Search API queries - captures retrieval spans with...

    • nikro committed 76e50d80 on main
      feat: #3557948 Track AI agent tool executions - captures pre/post events...

    • nikro committed 36ef1251 on main
      feat: #3557948 Register tool and search span subscribers
      
      By: @Nikro
      

    • nikro committed 956d556a on main
      refactor: #3557948 Use trace manager in AI subscriber - delegates trace...

    • nikro committed 8ff01fc6 on main
      feat: #3557948 Register trace manager service
      
      Adds langfuse....

    • nikro committed dd8d38b0 on main
      feat: #3557948 Add trace manager service
      
      Introduces...
    Comment by nikro:

    Status: Active » Needs review

    Alright, I pushed some changes straight to main for now:

    + Add trace manager service - Core service and interface
    + Register trace manager service - Service definition
    + Use trace manager in AI subscriber - Refactored to use centralized trace logic
    + Register tool and search span subscribers - Service registrations
    + Track AI agent tool executions - Tool span subscriber
    + Track Search API queries - Search span subscriber for RAG

    TL;DR: we can now track any tool call hierarchically. We also added dedicated logic for ai_search_block, which doesn't use a tool: whenever an ai_search query runs, a span is created for it as well.

    • nikro committed 38afc69a on 1.x
      feat: #3557948 document runner context and span management architecture...

    • nikro committed 6fe711b7 on 1.x
      feat: #3557948 add agent lifecycle subscriber for runner hierarchy...

    • nikro committed 4b1f1004 on 1.x
      feat: #3557948 refactor search span subscriber to use named span...

    • nikro committed 5af67f15 on 1.x
      feat: #3557948 refactor tool span subscriber to claim parents and manage...

    • nikro committed 7fca5bc7 on 1.x
      feat: #3557948 refactor AI logging to use active generations and...

    • nikro committed ae1545b6 on 1.x
      feat: #3557948 implement runner context tracking with span stacks
      
      By: @...

    • nikro committed 657a89e9 on 1.x
      feat: #3557948 refactor trace manager interface with high-level runner...

    • nikro committed 305498da on 1.x
      feat: #3557948 while digging - fix the toolcalls - so they show up in...

    • nikro committed e948c8af on 1.x
      feat: #3557948 split submodule into multiple ones due to dependencies...
    Comment by nikro:

    More changes, since the previous ones missed a few things.

    I had to split the AI logging submodule into three separate submodules (general AI, AI agents, and search) because each depends on a different module. In detail:

    • The trace manager now exposes high-level lifecycle methods (scheduleToolParent, claim/releaseToolParent, start/endDelegation, enter/leave/currentSpanContext). Also nuked some span-related tracking that we implemented earlier.
    • LangFuseAiLoggingSubscriber registers active generations per thread ID, schedules tool parents when tool calls occur, and resolves parents via delegation/search contexts instead of scanning observations.
    • LangFuseToolSpanSubscriber claims the queued generation parent, starts delegation when another agent is invoked, pushes/pops AI Search contexts, and releases claims using releaseToolParent() rather than a custom handle class.
    • LangFuseSearchSpanSubscriber wraps Search API queries with span contexts so embeddings can attach to the retrieval span; contextual comments explain each step.
    • LangFuseAgentLifecycleSubscriber documents that it registers parent runner relationships for delegation.
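
    To make the runner-context idea above more concrete, here is a minimal Python sketch of a span stack combined with queued tool parents. It is not the module's PHP code. The `RunnerContext` class, keying scheduled parents by a tool call ID, and the snake_case method names are assumptions. Only the general operations (schedule, claim, release, enter, leave, current span context) come from the description.

```python
class RunnerContext:
    """Hypothetical per-runner state: a span stack plus queued tool parents."""

    def __init__(self):
        self._span_stack = []  # innermost active span is last
        self._scheduled = {}   # tool_call_id -> generation id awaiting a claim
        self._claimed = {}     # tool_call_id -> generation id currently in use

    def enter_span_context(self, span_id):
        self._span_stack.append(span_id)

    def leave_span_context(self, span_id):
        # Contexts must close in reverse order of opening.
        if not self._span_stack or self._span_stack[-1] != span_id:
            raise RuntimeError(f"span {span_id!r} is not the innermost context")
        self._span_stack.pop()

    def current_span_context(self):
        return self._span_stack[-1] if self._span_stack else None

    def schedule_tool_parent(self, tool_call_id, generation_id):
        # Called when a generation requests a tool call.
        self._scheduled[tool_call_id] = generation_id

    def claim_tool_parent(self, tool_call_id):
        # Called when the tool starts; returns None if nothing was queued.
        generation_id = self._scheduled.pop(tool_call_id, None)
        if generation_id is not None:
            self._claimed[tool_call_id] = generation_id
        return generation_id

    def release_tool_parent(self, tool_call_id):
        # Called when the tool finishes; safe to call more than once.
        self._claimed.pop(tool_call_id, None)
```

    With this shape, a tool span looks up its parent directly from the queue, and an embedding call made during a search attaches to whatever `current_span_context()` returns, so no scan over earlier observations is needed.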

    This structure keeps AI Agents and AI Search logging optional submodules (wiring only when those contrib modules are enabled) while preserving a single runner-context system the rest of the code can rely on.

    screenshot of new trace

    Comment by nikro:

    After analyzing this live with Frederik Wouters, we discovered that it misbehaves when used with DeepChat.

    More fixes done:

    Termination Flow Consolidation

    Removed duplicate KernelEvents::TERMINATE handler from langfuse_ai_logging. The root LangfuseSyncSubscriber in web/modules/custom/langfuse/src/EventSubscriber/LangfuseSyncSubscriber.php now exclusively finalizes traces, updates metadata (final timestamp/output), clears the active trace, and runs syncTraces().

    DeepChat-Friendly Trace Management

    • Introduced LangFuseTraceSessionStorageInterface with a session-backed implementation (LangFuseTraceSessionStorage). It tracks thread→trace IDs so multi-request workflows (DeepChat’s “submit” + “tool follow-up”) can re-use the same trace.
    • The AI logging subscriber now injects this storage instead of reaching directly into the request stack. Whenever a generation finishes without pending tool calls, it clears the thread mapping; otherwise, it leaves it in place.
    • LangfuseSyncSubscriber::finalizeActiveTrace() checks whether the current trace still has mapped threads. If so, it skips finalization and syncTraces(), allowing the next request to resume the same trace. Only when all thread mappings are cleared does it finalize and sync.
    • DeepChat’s two-step requests therefore stay within one trace: request #1 leaves the mapping (because tool calls were requested), the terminate event defers syncing, request #2 re-uses the trace and clears the mapping once complete, and only then does termination finalize and send the trace to LangFuse.
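
    The two-request flow described above can be sketched in Python as follows. This is an illustration only: the class and function names, the plain dict standing in for the Drupal session, and the `sent` list standing in for syncTraces() are all assumptions, not the module's implementation.

```python
class ThreadTraceStorage:
    """Hypothetical session-backed mapping of thread IDs to trace IDs."""

    def __init__(self, session):
        self._session = session  # dict-like stand-in for the user session

    def _threads(self):
        return self._session.setdefault("threads", {})

    def trace_for(self, thread_id):
        return self._threads().get(thread_id)

    def remember(self, thread_id, trace_id):
        self._threads()[thread_id] = trace_id

    def forget(self, thread_id):
        self._threads().pop(thread_id, None)

    def has_threads_for(self, trace_id):
        return trace_id in self._threads().values()


def on_generation_finished(storage, thread_id, trace_id, pending_tool_calls):
    """Keep the mapping while tool calls are pending, clear it otherwise."""
    if pending_tool_calls:
        storage.remember(thread_id, trace_id)
    else:
        storage.forget(thread_id)


def finalize_active_trace(storage, trace_id, sent):
    """Terminate-time hook: defer while any thread still maps to the trace."""
    if storage.has_threads_for(trace_id):
        return False  # a follow-up request is expected to resume this trace
    sent.append(trace_id)  # stand-in for finalizing and syncing the trace
    return True
```

    In request #1 the generation ends with pending tool calls, so `finalize_active_trace()` returns False and nothing is sent. In request #2 the mapping is cleared, and termination then finalizes and sends the trace once.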

    • nikro committed 6a52661b on 1.x
      feat: #3557948 minor comment and whitespace cleanups
      
      By: @Nikro
      

    • nikro committed e0fbfd65 on 1.x
      feat: #3557948 refactor AI logging to reuse traces across tool calls via...

    • nikro committed 9c7fea48 on 1.x
      feat: #3557948 move trace finalization to sync subscriber with...

    • nikro committed 3fd80c20 on 1.x
      feat: #3557948 register thread trace storage service
      
      By: @Nikro
      

    • nikro committed 07a7876f on 1.x
      feat: #3557948 implement session-based thread trace storage
      
      By: @Nikro
      

    • nikro committed 8759c06e on 1.x
      feat: #3557948 add thread trace storage interface
      
      By: @Nikro
      

    • nikro committed 7ede88bb on 1.x
      feat: #3557948 Remove thread storage and cross-request snapshot logic...

    • nikro committed 9186b64d on 1.x
      Revert "feat: #3557948 minor comment and whitespace cleanups"
      
      This...
    Comment by nikro:

    The latest updates were as follows:

    - Kept the unified terminate subscriber.
    - Removed the attempted DeepChat integration, since it requires far more changes than I expected, especially in the SDK. I'll leave it for a separate ticket.

    Comment by wouters_f:

    Status: Needs review » Reviewed & tested by the community
    Comment by nikro:

    Status: Reviewed & tested by the community » Fixed

    Done, made a new alpha1 release with this in it.
    Thanks!


    Status: Fixed » Closed (fixed)

    Automatically closed - issue fixed for 2 weeks with no activity.