Problem/Motivation

The SchemaPrinter is meant to provide a schema that clients can use. This excludes directives by design. See for example the discussion in https://github.com/webonyx/graphql-php/issues/552#issuecomment-541391741. The reference JavaScript implementation also doesn't output this.

The underlying reason is that server side directives in SDL can be used to alter the schema and thus may have no meaning once printed or can not be reliably reconstructed.

The current caching method in SdlSchemPluginBase which uses the printer causes directives to be thrown away. This can be problematic if downstream modules (e.g. graphql_oauth) rely on directive resolution.

Steps to reproduce

Proposed resolution

The following is a small proof of concept of how the schema and any extensions can be created which first creates ASTs that can be cached without printing to SDL.


$base = <<<EOF
type Query {
  _empty: String
}
EOF;

$extensions = [
  <<<EOF
"""
Allow access for bots (client credentials grant type) on fields or types
by required scopes.
"""
directive @allowBot(requiredScopes: [String!]!) on OBJECT | FIELD_DEFINITION
EOF,
  <<<EOF
  type User @allowBot(requiredScopes: ["foo"]) {
    id: String
  }
EOF,

  <<<EOF
  extend type Query {
    user: User @allowBot(requiredScopes: ["bar"])
  }
EOF,
];

$base = new Source($base);

$extensionAsts = array_map(fn (string $sdl) => Parser::parse($sdl, ['noLocation' => TRUE]), $extensions);

$baseAst = Parser::parse($base, ['noLocation' => TRUE]);

$extension = AST::concatAST($extensionAsts);

$schema = SchemaExtender::extend(
  BuildSchema::build($baseAst),
  $extension,
);

The base schema is still needed from the SdlSchemaPluginBase's SDL file because the Query type must be defined in a schema in GraphQL PHP before it can be extended. However, all other files (including those with new definitions from extension.base.graphqls files) can be treated as schema extensions.

I suspect (but have not measured) that there's a small performance improvement in not parsing the SDL when fetching from the cache but instead using the (deserialized) AST directly.

Remaining tasks

- Finish my internal PoC that allows us to remove a patch that undoes the caching
- Move the changes to the base schema file into the module

User interface changes

API changes

Data model changes

CommentFileSizeAuthor
#15 benchmark_get_schema.php_.txt5.86 KBklausi

Issue fork graphql-3572680

Command icon Show commands

Start within a Git clone of the project using the version control instructions.

Or, if you do not have SSH keys set up on git.drupalcode.org:

Comments

kingdutch created an issue. See original summary.

kingdutch’s picture

Version: 8.x-4.x-dev » 5.x-dev

This should be fixed in 5.x first, then we can debate a backport.

kingdutch changed the visibility of the branch 3572680-sdlschemapluginbase-should-cache to hidden.

kingdutch’s picture

While working on this I realized that there's a potential for a breaking change here.

In order to fix the caching we need to move from concatenating SDL to ensuring the GraphQL library merges ASTs; then also store the AST instead of the SDL. However the `AlterSchemaDataEvent` and `AlterSchemaExtensionDataEvent` currently expect an array of SDL fragments. Those fragments are technically no longer available because they're converted to AST in a single operation.

I expect that most alterations would be better off working with AST anyway and instead now parse the SDL and then print it again, causing us to parse it again (which can lead to data loss as we've seen in the caching issue originally).

Searching for the two events on Drupal's GitLab does not yield any results.

For now I have chosen to make the change and add proper type annotations to the events to make this clear; which could be a small breaking change for current consumers of those events.

If a breaking change is not acceptable then we should do a bit of extra work as part of this issue:

  • Split the current loop of read + parse for all extensions into multiple loops
  • Keep the current alter events in-between those
  • Deprecate the current events
  • Introduce new events that contain the parsed AST
kingdutch’s picture

There's also a choice for the revised event (or the new event) of whether we pass in the ASTs as an array or pass in the concatenated AST.

kingdutch’s picture

Assigned: kingdutch » Unassigned
Status: Active » Needs review

Here's the code that we're currently running in production at Open Social (although there we don't have the events implemented).

I see that GraphQL Compose is relying on those events to add their schema (although I think they could also implement this as extension): https://git.drupalcode.org/project/graphql_compose/-/blob/3.0.x/src/Even.... I think with that in mind it makes sense at this point to implement some sort of transition.

My suggestion at the moment is to introduce a new event AlterASTDataEvent which instead of having an array has the concatenated AST. That makes using GraphQL Visitors to walk the tree more efficient (and that's the preferred way to alter the schema).

As an upgrade path we'd leave the old event class in place so referencing it doesn't break, but we'd only fire the new event. That allows modules to implement both event listeners and provide a clean transition path for their users, without having to deal with an overlap of both events firing at the same time and having to deduplicate.

This would also allow GraphQL Compose to simplify their code since they currently create an AST representation; then print it; just for us to later parse it again.

alexpott’s picture

So this does slightly reduce memory usage but it requires far more calls per request so it is going to have an impact of performance and it does not offer the savings of #3576164: Improve efficiency of cache usage in \Drupal\graphql\Plugin\GraphQL\Schema\SdlSchemaPluginBase::getSchema()

Test scenario:

  1. Install graphql_example
  2. Create an article
  3. Create a graph server
  4. Profile the following query:
query MyQuery {
  article(id: 1) {
    id
    title
  }
}

HEAD

Memory usage: 2.71MB
Function calls: 12540

THIS MR

Memory usage: 2.69MB
Function calls: 14100

#3576164

Memory usage: 2.65MB
Function calls: 12440

Larger schemas (more usual for a complex graphql site are going to push these numbers way way up.

alexpott’s picture

So this code:

    // The base schema is built with the registry to ensure all resolvers are,
    // registered. SchemaExtender::extend must be used because in the PHP
    // implementation (contrary to the JS implementation) the BuildSchema::build
    // call does not properly process `extend` keywords in the SDL; and the
    // extender itself expects `Query`/`Mutation`/`Subscription` to be declared
    // exactly zero or one times.
    $schema = $this->buildSchema($document, $registry);
    $extensionAst = $this->getExtensionDocument($extensions);
    return $extensionAst
      ? SchemaExtender::extend($schema, $extensionAst)
      : $schema;

Means that we have to get both the extension and schema caches but I think the change to only cache the extension document and not the full document will be a good one. I think we we really need to address is the extra 1500 function calls. Thats a lot for the tiny schema exposed by the graphql_example and likely to spiral for larger schemas.

alexpott’s picture

I've tested this change on a client site using graphql 5. The site has a well defined (and small) schema. This change results in nearly doubling the number of function calls per request from 86270 to 138430 calls - this is mostly clearly seen in the number of calls to \GraphQL\Language\Visitor::extractVisitFn - from 20340 to 37080. On the other hand we do see a memory usage reduction from 13.50MB to 12.07MB which reflects not loading the full cached schema twice.

kingdutch’s picture

Good catch @alexpott! It turns out I forgot to disable schema validation in SchemaExtender::extend. This has been addressed in https://git.drupalcode.org/project/graphql/-/merge_requests/71/diffs?com...

In my own profiling I saw that this eliminated the extra visitor calls and now the only visitor calls are for validating the incoming GraphQL query, which is 'user input' and should thus be validated.

That should bring the number of function calls back down.

kingdutch’s picture

I've reverted the adoption of Source. I still think it's very useful to provide debugging information. However, it's not needed to fix AST caching and that's the most important issue for me.

We can debate how we can move to using Source so we can add file paths in schema errors in a separate issue. Implementation would be close to re-applying the inverse of https://git.drupalcode.org/project/graphql/-/merge_requests/71/diffs?com...

benstallings’s picture

Status: Needs review » Needs work

Claude Code has a lot to say about this MR:

Concerns

1. Breaking change in the event API, no deprecation. AlterSchemaDataEvent and AlterSchemaExtensionDataEvent previously carried array (raw SDL, numerically indexed — note $schemaData[0]). Now they carry array (ASTs, keyed by plugin ID). Every downstream subscriber that manipulated SDL via str_replace/regex (exactly the pattern the old test subscriber demonstrated) is silently broken at runtime — wrong types going through visitors that assume strings, or implode on objects. Since 5.x is still pre-release (the tip of 5.x is 5.0.0-beta2), a hard break may be acceptable, but it warrants an entry in the upgrade notes / CHANGELOG and ideally a mention in the commit message.
2. getSchemaDefinition() behavior change (SdlSchemaPluginBase.php:292–303). Previously returned NULL on file_get_contents failure or empty file (via ?: NULL). Now throws InvalidPluginDefinitionException with message "Failed to read schema file". Two nits:
- if (!$contents) conflates false (read error) with "" (empty but readable file). Prefer if ($contents === FALSE) and let an empty file parse (empty is already invalid SDL and will fail downstream with a clearer error) — or at least have the message reflect both possibilities.
- The return type hint is still string but old callers expecting nullable behavior are now getting exceptions. Fine in context (old ?: NULL was ignored by the caller anyway), but worth a line in the docblock noting the throw.
3. Stale docblock on getCacheId() (SdlSchemaPluginBase.php:310): "The cache type, e.g. 'schema' or 'full'." — 'full' no longer exists. Should be 'schema' or 'extension'.
4. Typo in the new comment (SdlSchemaPluginBase.php:137): "all resolvers are,\n // registered" — should be a period, not a comma.
5. Redundant array_filter (SdlSchemaPluginBase.php:208–210, 248–250): the inner array_map already returns NULL for empty/null schemas, so the outer array_filter(..., fn ($d) => !empty($d)) is redundant. Not wrong, just cruft.
6. Null caching for extensions (SdlSchemaPluginBase.php:260–264): when there are no extensions, $ast is NULL and still written to the cache. Works correctly on reads (if ($cache = $this->astCache->get(...)) yields the CacheItem regardless of $cache->data), but it's subtle. Either add a comment or skip the set() when $ast === NULL.
7. Test subscriber comment copy-paste (GraphQlSubscriber.php:103): says "Once we've visited the position field node we can stop" inside the alterSchemaData method that's actually matching id. Minor, test-only.
8. Visitor::stop() after first match: the test subscriber stops on first match of position/id. Fine for this fixture, but serves as an example for module authors — some callers might copy it and be surprised when a second matching node is skipped. Worth a sentence in the example explaining why stopping is safe here.
9. No user-facing documentation update. There's nothing in doc/ mentioning these events, and I'd expect a short migration note given #1. The test subscriber is now the canonical example — fine, but hard to discover.

What looks good

- Removing the print→parse round trip is a clean win; the new flow matches what graphql-php itself expects.
- Indexing the events by plugin ID is a genuine improvement — subscribers can now target a specific plugin's contribution rather than hunting through numerically indexed entries.
- Demonstrating AST visitors in the example subscriber is the right pedagogy.
- assumeValid gated on !$inDevelopment is consistent with the existing pattern for BuildSchema::build.
- Cache IDs and tags (graphql) are consistent, and the per-server server_id suffix is preserved for ConfigurableInterface schemas.

Suggested follow-ups before merge

- Fix the comment typo (SdlSchemaPluginBase.php:137) and the stale getCacheId docblock.
- Tighten if (!$contents) to if ($contents === FALSE) or adjust the error message.
- Drop the redundant outer array_filter calls.
- Add a CHANGELOG/upgrade note flagging the event payload type change.
- Optionally skip astCache->set() when the extension AST is NULL, or add a comment explaining it's intentional.

benstallings’s picture

Assigned: Unassigned » benstallings
klausi’s picture

StatusFileSize
new5.86 KB

I did some performance benchmarking and unforuntately this change introduces a performance regression. The getSchema() method takes 60% longer to complete.

5.x branch:
Results
- min: 26.770 ms
- avg: 41.025 ms
- median: 39.083 ms
- p95: 61.782 ms
- max: 67.663 ms
- total: 1230.755 ms

MR 71
Results
- min: 42.585 ms
- avg: 65.877 ms
- median: 65.930 ms
- p95: 73.875 ms
- max: 82.449 ms
- total: 1976.322 ms

Generated test script attached.

I need to instrument getSchema() in more detail to find the bottleneck.

klausi’s picture

Issue tags: +DevDaysAthens2026
klausi’s picture

I added some instrumenting lines and tracked it down to SchemaExtender::extend():

getSchema() step breakdown (avg / min / max over 30 measured runs)
Step                               avg ms     min ms     max ms
----------------------------------------------------------------
getExtensions                       0.238      0.197      0.324
getSchemaDocument                  12.884     10.740     14.122
registerResolvers                   2.952      1.351     29.227
buildSchema                         1.107      0.137     28.319
getExtensionDocument                5.012      4.248      5.515
SchemaExtender_extend              27.858      5.451     39.303

27ms just to extend the schema on every request is quite bad.

We really need to cache the full generated Schema object instead if possible. I think that is not possible right now because of anonymous function callbacks in the resolver registry. For that we have #3576071: Deprecate Callback Resolver.

Wild idea: could we automatically dump the anon callbacks into a php file? Probably not because they can interact with their surrounding environment? I'm thinking if we could wrap those resolvers any other way to be able to cache the schema.

klausi’s picture

Thanks, with this approach I see now a performance improvement of 30% now, testing a grapqhl request through the kernel on jobiqo:

5.x:
Results
- min: 25.066 ms
- avg: 35.403 ms
- median: 35.934 ms
- p95: 39.490 ms
- max: 41.056 ms
- total: 1062.103 ms

MR 71:
Results
- min: 13.302 ms
- avg: 24.005 ms
- median: 23.025 ms
- p95: 38.401 ms
- max: 40.128 ms
- total: 720.142 ms

I will run this through our full test suite now as well.

benstallings’s picture

Assigned: benstallings » Unassigned
Status: Needs work » Needs review
kingdutch’s picture

Status: Needs review » Fixed

Reviewed this together with Klausi, merging this for the upcoming beta release.

I've removed some of the docs that were AI generated from the MR again and went through the rest of the recent commits manually to pick out the function docblock changes and other coding standards issues. The auto-generated docs were too verbose and contained one-time upgrading information that's much better tackled by a change record.

Now that this issue is closed, review the contribution record.

As a contributor, attribute any organization that helped you, or if you volunteered your own time.

Maintainers, credit people who helped resolve this issue.

  • kingdutch committed 6b4f8ea9 on 5.x
    feat: #3572680 SdlSchemaPluginBase caches AST instead of SDL string...

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.