Active
Project:
Entity Blueprint
Version:
1.0.x-dev
Component:
Code
Priority:
Normal
Category:
Feature request
Assigned:
Unassigned
Reporter:
Created:
1 Jun 2026 at 19:02 UTC
Updated:
5 Jun 2026 at 23:20 UTC
Thanks for the module -- I've tried it out and I can see that it generates schemas for entities/bundles. I wonder if it's possible to add a JSON Schema mode to Drupal\entity_blueprint\BlueprintSchemaBuilder, since AI providers are converging on JSON Schema as their preferred schema language?
FWIW, our module's use case is telling LLMs how to restructure HTML into valid JSON that represents Drupal entities. (LLM API structured output calls all require JSON Schema.)
I realize this is no small feat, but the way that you're iterating through fields and adding entity references makes a lot of sense already.
I'd be curious to know what you thought. Thanks!
Comments
Comment #2
tim bozeman commented
Yes sir majorrobot sir!
🤖🫡
We should totally make the output configurable.
What's my thinking? I guess that one specific thing is inconsistent. Everything else is JSON. I was just optimizing for token cost there. It seems like Claude at least doesn't mind the YAML format for the schema. I've been using Haiku for testing and haven't seen any structural problems because of it.
Why optimize for token cost for that one thing? A landing page schema is potentially HUGE. The node, layout plugins, available blocks. Easily over 1000 lines. Granted, they are very sparse lines, but the JSON syntax is just more tokens. That was my initial thinking and I am prepared to double down on it.
One thing I've noticed is that with a new context window you can say "create me a landing page about ..." > it gets the schema, which includes guidance on what a good landing page would look like > it then does a reliably good job. Here's the issue: you start editing the page. Change this, make these look like that, make this an image of a giraffe on roller skates... You start to set a precedent. The AI just gets in a groove of making tiny edits. We can't expect the average Joe doing this page building to always clear the context window at this point (I use a rolling window, so if they don't clear it, it's a little more $, but it doesn't break). So with a really long context window and an unfortunate precedent set, you ask for a new landing page and the AI just makes a new page and does the equivalent of a tiny edit. You get an unimpressive new landing page when it is capable of doing so much more. It just needs a reframing, or to be shaken out of its little habit.
I am about to push a new skills system that works the same way as Claude Code to address this. The AI goes "Oh I am about to create a page" > calls create entity skill > "Oh cool, I see what a good landing page looks like again, I can get a good result."
So, all that to say: get schema is potentially going to be called a lot, and the JSON is just more $$$.
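A rough way to sanity-check the size argument above is to serialize the same structure both ways and compare lengths. This is only a Python sketch, not code from the module: the schema contents and the `to_yaml_like` helper are hypothetical, and character count is a crude stand-in for token count, which depends on the model's tokenizer.

```python
import json

# Hypothetical, heavily trimmed schema; the field names are illustrative only.
schema = {
    "entity_type": "node",
    "bundle": "landing_page",
    "fields": {
        "title": {"type": "string", "required": True},
        "field_hero": {"type": "entity_reference", "target": "media"},
        "field_sections": {"type": "entity_reference", "target": "paragraph"},
    },
}


def to_yaml_like(value, indent=0):
    """Render nested dicts of scalars as indented 'key: value' lines."""
    lines = []
    pad = "  " * indent
    for key, item in value.items():
        if isinstance(item, dict):
            lines.append(f"{pad}{key}:")
            lines.append(to_yaml_like(item, indent + 1))
        else:
            # Booleans are lowercased to match YAML's true/false.
            text = str(item).lower() if isinstance(item, bool) else str(item)
            lines.append(f"{pad}{key}: {text}")
    return "\n".join(lines)


pretty_json = json.dumps(schema, indent=2)
compact_json = json.dumps(schema, separators=(",", ":"))
yaml_like = to_yaml_like(schema)

# Character counts are only a crude proxy for tokens.
print("pretty JSON :", len(pretty_json))
print("compact JSON:", len(compact_json))
print("YAML-like   :", len(yaml_like))
```

The compact JSON line is included because minifying removes the indentation and line breaks, so the comparison that matters is against whatever form is actually sent to the model.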
What makes you say JSON is becoming the standard? Should I use JSON too?
Comment #3
majorrobot commented
At ease, soldier! :)
Thanks for the detailed response! I'm a little confused, though, because I think you're saying you're using yaml for the schema, but I don't see that in the code (or the module description)? What am I missing?
I think overall using JSON is a good idea. If you want the model to return valid JSON (I realize that may not be your use case), the gold standard is to use Structured Output mode when you query the model. Specifically, the JSON Schema standard (just to be clear, that's "Schema" with a capital "S") has become the standard for models/providers if you want to get back valid JSON that always matches your schema.
Here's a good write-up on it.
You're right -- it might be a lot of tokens. I think some of ours are around 10k tokens. But, if the module offered separate modes, at least the user would have choices, and would not have to choose JSON Schema mode. Some small amount of copy in the README or in the settings could mention the pros/cons of each mode? Just an idea.
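To make the distinction concrete, here is a minimal sketch of what a JSON Schema for one bundle might look like, as opposed to a descriptive JSON document. Everything in it is an assumption for illustration: the "article" bundle, the field names, and the `check_shape` helper, which only checks `required` and `additionalProperties` and is not a real JSON Schema validator. Providers tend to support only a subset of JSON Schema keywords, so their documentation is the authority on what a Structured Output call accepts.

```python
import json

# Hypothetical JSON Schema for an "article" node; the field names are made up.
ARTICLE_SCHEMA = {
    "type": "object",
    "properties": {
        "type": {"const": "article"},
        "title": {"type": "string"},
        "body": {"type": "string"},
        "field_tags": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["type", "title", "body", "field_tags"],
    "additionalProperties": False,
}


def check_shape(schema, data):
    """Check only 'required' and 'additionalProperties'; not a full validator."""
    problems = []
    allowed = set(schema["properties"])
    for name in schema["required"]:
        if name not in data:
            problems.append(f"missing: {name}")
    if schema.get("additionalProperties") is False:
        for name in data:
            if name not in allowed:
                problems.append(f"unexpected: {name}")
    return problems


# Stand-in for a model reply; in Structured Output mode the provider is the
# one enforcing the schema, so a check like this is only a safety net.
reply = json.loads(
    '{"type": "article", "title": "Giraffes", '
    '"body": "<p>Tall.</p>", "field_tags": ["zoo"]}'
)
print(check_shape(ARTICLE_SCHEMA, reply))
```

The point is that the schema constrains the shape of the reply, while a descriptive document only tells the model what the entity looks like.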
Comment #4
majorrobot commented
Another thought: what if the SchemaBuilder were pluggable? That way other modules could bring their own schema formats. Instead of building in "modes," folks could cook up plugins.
Comment #5
tim bozeman commented
The YAML part is just in the AI tool I'm using. The drush command does use JSON, but not JSON Schema.
I like the idea of making SchemaBuilder pluggable. My page builder uses the schema as reading material, which is nice for tokens and comprehension, but you want a Structured Output call, which has to be valid JSON Schema.
The tricky part is that right now BlueprintSchemaBuilder gathers the structure and formats it in the same pass. So step one is extracting a format-neutral intermediate representation, then letting formatter plugins render it.
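The split described above could be sketched roughly like this. It is written in Python for brevity rather than as Drupal plugin code, and every name in it (`FieldNode`, `BundleNode`, the two formatter functions, the `FORMATTERS` registry) is hypothetical; it also assumes entity references are emitted as string IDs, which may not match what the module does.

```python
import json
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FieldNode:
    """Format-neutral description of one field (hypothetical shape)."""
    name: str
    kind: str  # e.g. "string", "integer", "entity_reference"
    required: bool = False
    multiple: bool = False
    target_bundle: Optional[str] = None


@dataclass
class BundleNode:
    """Format-neutral description of one entity bundle."""
    entity_type: str
    bundle: str
    fields: List[FieldNode] = field(default_factory=list)


def format_descriptive(node):
    """Plain JSON description meant as reading material for a model."""
    fields = {}
    for f in node.fields:
        info = {"type": f.kind, "required": f.required}
        if f.target_bundle:
            info["target"] = f.target_bundle
        fields[f.name] = info
    return json.dumps(
        {"entity_type": node.entity_type, "bundle": node.bundle, "fields": fields},
        indent=2,
    )


JSON_TYPES = {"string": "string", "integer": "integer", "boolean": "boolean"}


def format_json_schema(node):
    """JSON Schema rendering for Structured Output style calls."""
    properties = {}
    for f in node.fields:
        # Assumption: unknown kinds, including references, become string IDs.
        item = {"type": JSON_TYPES.get(f.kind, "string")}
        properties[f.name] = {"type": "array", "items": item} if f.multiple else item
    return json.dumps(
        {
            "type": "object",
            "properties": properties,
            "required": [f.name for f in node.fields if f.required],
            "additionalProperties": False,
        },
        indent=2,
    )


# Stand-in for plugin discovery: other modules would register formats here.
FORMATTERS = {"descriptive": format_descriptive, "json_schema": format_json_schema}


def build(node, fmt):
    """Render an already-gathered structure with the chosen formatter."""
    return FORMATTERS[fmt](node)


page = BundleNode(
    "node",
    "landing_page",
    [
        FieldNode("title", "string", required=True),
        FieldNode("field_sections", "entity_reference", multiple=True,
                  target_bundle="section"),
    ],
)
print(build(page, "json_schema"))
```

Gathering happens once when the `BundleNode` tree is built, so adding a format means adding one function to the registry and leaves the field iteration untouched.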