Overview
Currently, the XB field type is a single-item field with two columns: tree and props, each defined as a JSON structure.
#3468272: Store the ComponentTreeStructure field property one row per component instance proposes to store the tree relationally, not as a JSON value.
Option #3 in the proposed resolution of #3440578: [PP-2] JSON-based data storage proposal for component-based page building proposes to associate a field_union type with each component instance's static prop values in order to facilitate code outside of XB being able to use the information in the corresponding field_union config entities to implement Views integrations, migration tooling, configuring search indexes or other special view modes, etc.
This issue proposes to combine the de-jsonifying of the tree and the addition of the field_union type reference into a single refactor.
Proposed resolution
- Change the field type from single-item to multi-valued. Each item would be for a single component instance.
- Order (the `delta` column of) the items the same as how they appear in the left sidebar's Layers panel. This corresponds to how the component instances are ordered in the HTML when the page is rendered except where components render their slots in a different order than the order in which those slots are defined in the SDC's YAML.
- Define the following columns (properties) in the field type:
  - `instance_id` (string)
  - `component` (string): Reference to the `component` config entity that defines the component that this is an instance of.
  - `parent` (string): The instance ID of the parent component in the tree. NULL for component instances that are at the top-level in the tree.
  - `slot` (string): The parent's slot that this component is in. NULL for component instances that are at the top-level in the tree or are in the default/unnamed slot of their parent (if in the future we decide to add support for default/unnamed slots).
  - `data_sources` (json): The `sourceType`s and `expression` portion of what's currently in the `props` column (prior to this proposed refactoring) for this component instance. For example:
    `{ "prop1": { "sourceType": "dynamic", "expression": "ℹ︎␜entity:node:article␝title␞␟value" }, "prop2": { "sourceType": "static:field_item:string", "expression": "ℹ︎string␟value" } }`
  - `static_values` (json): The `value` portion of what's currently in the `props` column (prior to this proposed refactoring) for this component instance. For example, given the above example of `data_sources`, this could be:
    `{ "prop2": "Hello, world!" }`
    The above example uses XB's current optimization of omitting the column/property name within the sourceType's field type if it's the sole property being used and it corresponds to the field type's `mainPropertyName()`. Given this issue's proposal to add a `field_union` reference (see below), we should evaluate if it would be better for people using that reference if we always explicitly included the column/property name, in which case the above would be:
    `{ "prop2": { "value": "Hello, world!" } }`
  - `field_union` (string): Reference to the `field_union` config entity that defines the union of field types for this component. Alternatively, we could omit this from here and instead add the field_union reference to the `component` config entity. Since the component instance references the component config entity, this would just then be one more hop to get to the field_union config entity, but denormalizing the field_union reference into here might help with querying since Drupal doesn't have great support for JOINing on config entities (though that support could be improved if Drupal core refactored the `config` table to store as JSON instead of serialized PHP).
    Whether the `field_union` reference is in this field type directly, or only indirectly via `component`, it can be NULL in cases where `static_values` can't be, or wouldn't benefit from being, conformed to a field_union definition. Note that a field_union can be a union of fields that are of any type, including a JSON field, so the "wouldn't benefit from being" is more likely to be the case than "can't be". An example of this might be the values for block settings that we either can't or choose not to define field_union types for.
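To make the proposed row-per-instance shape concrete, here is a minimal Python sketch (not XB code) that flattens a nested component tree into rows carrying `delta`, `instance_id`, `component`, `parent`, and `slot`. The nested input format, the `flatten` helper, and the sample component names are assumptions made purely for illustration; the sketch also assumes slots are walked in the order they are defined.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Row:
    delta: int
    instance_id: str
    component: str
    parent: Optional[str]  # None for top-level instances
    slot: Optional[str]    # None for top-level instances


def flatten(tree, parent=None, slot=None, rows=None):
    """Depth-first walk; delta follows the order a Layers-style panel would show."""
    if rows is None:
        rows = []
    for node in tree:
        rows.append(Row(len(rows), node["instance_id"], node["component"], parent, slot))
        # Hypothetical nested format: each node may carry a dict of slot name -> children.
        for slot_name, children in node.get("slots", {}).items():
            flatten(children, node["instance_id"], slot_name, rows)
    return rows


# Hypothetical sample tree: a two-column layout with one child per slot, then a footer.
tree = [
    {"instance_id": "a1", "component": "two_column", "slots": {
        "left": [{"instance_id": "b1", "component": "heading"}],
        "right": [{"instance_id": "c1", "component": "image"}],
    }},
    {"instance_id": "d1", "component": "footer"},
]
rows = flatten(tree)
for row in rows:
    print(row)
```

Each printed row corresponds to one field item; the tree can be rebuilt from `parent` and `slot` alone, with `delta` only providing the ordering among siblings.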
User interface changes
None
Caveats
- It would be nice to have the field_union module be an optional dependency rather than required for XB to function. See comment #5.
- @lauriii, @catch, @alexpott, @longwave, @Wim Leers, @tedbow, and I (@effulgentsia) met during DrupalCon Barcelona and there was disagreement on whether the `field_union` reference would be useful. @catch thinks it's needed to support #3462219: [META] Support alternative renderings of prop data added for the 'full' view mode such as for search indexing or newsletters, but @lauriii thinks those use cases could be solved in better ways by XB directly. The next step to try to resolve that disagreement would be to formulate some specific use cases and think through what a good UX would be for a site builder to implement them. However, regardless of whether the `field_union` reference ends up being useful in practice, I don't think it hurts anything to have it, especially if the reference is only in the `component` config entity and not denormalized into the field type.
- If some fields or field properties are marked as `required` within the field_union config entity, that will not be enforced when saving the `static_values` JSON. The reason is that XB will allow some of a component's props to be defined statically and some to come from the entity's fields or other dynamic or even remote sources. In other words, it's perfectly valid for `static_values` to only contain a subset of what's required by the field_union.
Risks
I ran this proposal by @Wim Leers before writing it up, and he pointed out that there could be nodes with hundreds of component instances on them, so this proposal creates the possibility of a multi-valued field item with more items in it than we're typically used to in Drupal, and we don't know what performance or memory issues FieldItemList and its related PHP objects will encounter at that scale. This is something we'll need to keep our eyes on, but I think there's two things that mitigate the risk:
- I hope people don't actually put hundreds of component instances on a node. That wouldn't lead to a good authoring experience. A good design system, even if it includes some small components (atoms), should also include larger components (molecules, organisms) that content authors work with, so that content authors aren't in practice putting every atom one-by-one on a page.
- If we need to, we can implement a `list_class` for the XB field type that's more suitable to very large lists than `FieldItemList` is.
| Comment | File | Size | Author |
|---|---|---|---|
| #16 | XB props token-aka-dynamic vs static.png | 861.04 KB | wim leers |
Comments
Comment #2
effulgentsia commented
Comment #4
effulgentsia commented
Crediting @joachim who proposed de-jsonifying the tree back in #3440578-51: [PP-2] JSON-based data storage proposal for component-based page building.
Comment #5
effulgentsia commented
A big advantage of this would be that it would allow the field_union module to be an optional dependency. XB could add the field_union config entities, and reference them from the component config entities, when the field_union module is enabled, and not do that when that module is not enabled, without that affecting the schema of the XB field type.
Given the advantage above of letting the field_union module be an optional dependency if we keep the field type normalized and only access an item's `field_union` config entity via its `component` config entity, I recommend doing that, and solving the querying use case by changing core's `config` table from serialized PHP to JSON.
Comment #6
effulgentsia commented
Added a Caveats section to the issue summary.
Comment #7
catch
Thanks for writing this up! I'm still digesting the proposed schema. Two minor things:
So I haven't actually used paragraphs, but I assume there's a single paragraph reference field that can have dozens if not hundreds of deltas in it referencing the paragraph entities. If so then we already have an equivalent example in the wild (except no extra entities for every row here). Also having written the last couple of sentences, I wonder if this starts to make an actual data migration from paragraphs more feasible.
We started discussing that in one of the JSON database support issues, it would allow us to remove the key value config stuff (which supports some limited querying now).
However, I think we could work around not having that yet, just by running extra queries. E.g. if we want to find out whether a field type is used, we can get a list of field unions that use it, then a list of components that use those field unions, then run an IN(). Given the main current use-case for that sort of querying is auditing, it should be OK.
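The "extra queries" workaround described above can be sketched roughly as follows, in Python with SQLite standing in for Drupal's database layer. The `xb_items` table, the two dictionaries that stand in for config entities, and both helper functions are hypothetical names used only for illustration.

```python
import sqlite3

# Hypothetical stand-ins for config entities, which cannot be JOINed in SQL.
field_unions = {"fu_heading": ["string", "list_integer"], "fu_card": ["string", "image"]}
components = {"heading": "fu_heading", "card": "fu_card", "spacer": None}


def components_using(field_type):
    """Steps 1 and 2: field type -> field unions -> components, resolved in application code."""
    unions = {name for name, types in field_unions.items() if field_type in types}
    return sorted(c for c, union in components.items() if union in unions)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE xb_items (entity_id INTEGER, delta INTEGER, instance_id TEXT, component TEXT)"
)
conn.executemany("INSERT INTO xb_items VALUES (?, ?, ?, ?)", [
    (1, 0, "a1", "heading"), (1, 1, "b1", "spacer"), (2, 0, "c1", "card"),
])


def entities_using(field_type):
    """Step 3: a single IN() query against the field table."""
    ids = components_using(field_type)
    if not ids:
        return []
    placeholders = ", ".join("?" for _ in ids)
    sql = (
        "SELECT DISTINCT entity_id FROM xb_items "
        f"WHERE component IN ({placeholders}) ORDER BY entity_id"
    )
    return [row[0] for row in conn.execute(sql, ids)]


print(entities_using("image"))
print(entities_using("string"))
```

This costs extra lookups compared to a JOIN, which seems acceptable for the auditing use case mentioned above.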
Comment #8
effulgentsia commented
I realized after writing this that the reason this is true is because the `component` config entities have essentially all of the information that would need to be in the `field_union` config entity, which is what would let us generate the field_union config entity from the component config entity at any time that we needed to.
Given that, I wonder if making the field_union module a hard dependency wouldn't actually be that bad. It would let us take a bunch of stuff out of the `component` config entity and instead move that information to the `field_union` config entity.
Comment #9
carlitus commented
Hi, I just wanted to comment on this:
We use a lot of low-level elements on a page, so we have a lot of freedom. Yes, we also have some elements like molecules, but we usually do that with templates that we can then modify. And these templates are groups of single atoms.
And a landing page, for example, can be very, very, very long.
So actually the hundreds of components that @Wim Leers was talking about can be real in a lot of cases.
Comment #10
catch
In general that sounds like a great idea, it would mean the component config entity only needs to hold the things that are unique to the concept.
I had wondered whether we actually need two config entity types at all - i.e. could field union directly use a component config entity type instead of using its own, or could XB directly use field unions without an extra entity type in-between, but... no idea whether that would even be desirable even if it's possible.
Comment #11
lauriii
Are there use cases outside of the use cases that have been already identified that this would help with? So far I've not heard compelling reasons to do this. I'm pretty strongly -1 to supporting the workflow proposed in #3462219: [META] Support alternative renderings of prop data added for the 'full' view mode such as for search indexing or newsletters out of the box because at least as I understand it, it would result in an extremely convoluted UX. As a fairly technical user, I'm having a hard time imagining working with several lists of components and figuring out myself how to build anything meaningful out of it. I believe there should be an easier way for managing the challenges related to the search indexing.
Unless we can define what's the value we get out of this, I don't see why we would prioritize working on this over other work, especially because it sounds like that there's risk associated to introducing this. If I also understand correctly, this also means that there's additional complexity going forward because we support multiple data models out of the box (one for config, one for content).
I checked a sample front page I had built on another page builder and I had 129 components/elements on that page. This was still a fairly simple page using a mix of atoms and organisms. I would have to do some more research to define what a reasonable upper bound would be, but it seems that the architecture should definitely be able to handle at least some hundreds of components.
If we move from JSON structure to a multi-valued field (where each delta represents a component), how do we handle scenarios where there are overrides on top of the desktop breakpoint (e.g. for the mobile breakpoint)? This is requirement #20 from the original product requirements for Experience Builder.
Example scenario would be that I want larger margin and padding on desktop than on mobile and I want to display a block recommending to install an app on mobile.
How would this be represented in this data model? Would this still all be stored in the single list or would we have separate lists for different breakpoints?
Comment #12
effulgentsia commented
Is the thinking here that any prop could be overridden by breakpoint? For example, text content such as the quote for a testimonial component could be changed by breakpoint? Or only certain props, identified by the SDC creator as "style" props as opposed to "content" props?
Comment #13
wim leers
RE: issue summary
It should definitely be present in the `Component` config entity, because that'll allow a config dependency on the `FieldUnion` 👍
RE: @catch in #7
Very good point! 👏
🤔 Does anybody in this issue have access to a Paragraphs-heavy complex site with good performance, so that we can get some statistics? 🤓🤞
@effulgentsia in #8
Exactly!
XB currently basically implements the functionality that Field Union provides (I wrote that previously ~4 months ago at the top of #3440578-52: [PP-2] JSON-based data storage proposal for component-based page building), but without another config entity; the metadata that defines what field types to use (just like a `FieldUnion` config entity does) is captured by the `Component` config entity type. See how similar these are (the most notable difference being the absence/presence of config validation):
- `FieldUnion` config entity
- `Component` config entity
(It's rather unlikely that two SDCs would point to the exact same `FieldUnion`, because even just naming the SDC's props differently would result in different `FieldUnion`s.)
@catch in #10
HAH! 😅
Maybe … maybe the existing `Component` config entity with its (validated) config schema (see links above) actually sufficiently addresses all those needs already then? 😄 Back when we were actively going back-and-forth on #3440578 (~4 months ago), a lot of this was much vaguer, less fleshed out. Now it is in a more complete (but nowhere near done!) state. If you look at the `<dl>` above … does that already do what you're suggesting/thinking here? 😊 🤞
@lauriii in #11
I imagine `41. Conditional display of components`? Hm, requirement #20's story says , which requires storing additional values, not merely conditionally displaying (which is yes/no based on some condition, whereas the example in the story is more nuanced) … not sure. None of that is specced out nor estimated though. It sounds more like a "CSS-per-component instance" thing than a "prop value per component instance" thing though.
That's why I doubt the delta/multi-value change this issue proposes would affect this product requirement; it'd likely be a new `style` or `css` field property on the field type, to allow per-component-instance (responsive aka media query) styles.
Comment #14
wim leers
In my prior comment I caught up on the issue and replied to things that stood out. @effulgentsia captured my first concern at the bottom of the issue summary ("what about hundreds of component instances"). @lauriii confirmed this must be supported. @catch suggested that Paragraphs likely already hits that scale. We still need numbers to gain confidence.
In this comment, I point out concerns that have not been raised before.
I'll use the same example in both concerns: suppose a `PROVIDER:heading` SDC that contains 2 props: a string (title) and a heading level (enum integer). Then a matching field union (first see the docs at https://git.drupalcode.org/project/field_union/blob/8.x-1.x/Readme.md) would be:

Concern 2: product requirement `7.1 Tokens` aka Reusing values in the host entity's base/bundle fields

What if I want to populate one of the SDC props using a value from a host entity base/bundle field?

(This is called a `dynamic prop source` in current XB terminology because its value is dynamic: the value changes when the host entity's field values change. This is in contrast with a `static prop source`, where the value is manually/explicitly entered by the Content Creator, where the value that was entered is static: it will always evaluate to the same result. See XB terminology docs.)

For example, I want to populate a component instance that uses the `heading` SDC in part with the label of the single-cardinality "Category" taxonomy term reference of my host entity type+bundle "News item". (Or, simpler example: the "News item" entity's "Title" field.)

So, my heading component instance would be claiming to be using this `xb_component_PROVIDER_heading` `FieldUnion`, but … actually only the `level` prop would be populated by the field union, the `text` SDC prop would be populated by the "Category" taxonomy term reference!

This is what I was referring to in #3440578-30: [PP-2] JSON-based data storage proposal for component-based page building. That's what product requirement `7.1 Tokens` refers to.

(The above interpretation AFAICT accurately/reasonably interprets the product requirement. @lauriii, please correct me if I'm wrong.)
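The "partial field union" situation in concern 2 can be illustrated with a small Python sketch. It is only a sketch under stated assumptions: real XB expressions are not parsed here, so a hypothetical `field` key stands in for the `expression`, and `resolve_props`, `unpopulated_union_fields`, and the `static:field_item:list_integer` source type are illustrative names rather than confirmed XB APIs.

```python
def resolve_props(data_sources, static_values, host_entity):
    """Resolve each prop from either the host entity (dynamic) or static_values (static)."""
    resolved = {}
    for prop, source in data_sources.items():
        if source["sourceType"] == "dynamic":
            # Hypothetical simplification: look up a host entity field by name
            # instead of evaluating a real XB expression.
            resolved[prop] = host_entity[source["field"]]
        else:
            resolved[prop] = static_values[prop]
    return resolved


def unpopulated_union_fields(union_fields, static_values):
    """Fields the field union defines but that hold no static data for this instance."""
    return [name for name in union_fields if name not in static_values]


# Heading component: `text` comes from the host entity, `level` is entered statically.
data_sources = {
    "text": {"sourceType": "dynamic", "field": "title"},
    "level": {"sourceType": "static:field_item:list_integer"},
}
static_values = {"level": 2}
host_entity = {"title": "Breaking news"}

print(resolve_props(data_sources, static_values, host_entity))
print(unpopulated_union_fields(["text", "level"], static_values))
```

The second call shows the concern: the field union would declare `text`, yet this instance stores no value for it.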
Concern 3: How will this work for SDC props that themselves are `type: object`-shaped?

An SDC's `props` is always `type: object`. But what if some prop `foo` also is `type: object`-shaped?
propsis alwaystype: object. But what if some propfooalso istype: object-shaped?This is not yet supported in XB yet (issue: #3467890: [later phase] Support `{type: object, …}` prop shapes with single level that require *multiple* field types: use `field_union`? — OUT OF SCOPE: nested components/component reuse), but I know/I'm confident it's possible.
This is a common need, and a number of SDCs in https://www.drupal.org/project/demo_design_system had to be refactored to not use that because #3467890 is not yet fixed! 😅
An example in the XB codebase itself is the `shoe_details` component, which contains:

I've found #3170831: Support nested union fields in the `field_union` issue queue. I have no idea yet how much work it'd be to support that. I bet @larowlan can speak to that 🤓

But this would make concern 2 above more complicated: what if it's the "title" of the `expand_icon` that you want to populate using a base/bundle field value? Then it'd be a token that needs to be resolved for one of a nested field union.
Comment #15
catch
I think to answer this, we would need to figure out what addressing #3462219: [META] Support alternative renderings of prop data added for the 'full' view mode such as for search indexing or newsletters looks like with field_unions + components vs. just components or at least enough direction and agreement on the use-cases to be able to talk about it with a common understanding of the need, currently that does not seem to be the case.
The big difference with a field_union is that it results in field data that can be used outside XB (i.e. regular manage display), or which could be configured as the optional source for 'dynamic props' inside XB, for a different view mode. Just this morning I tried to type up some thoughts on how a component-only solution to that issue theoretically possibly could work to have something to compare to.
But if we used field unions, then everything (or at least everything that looks like field data using field types) would be dynamic - the difference would instead be whether field deltas can be created/re-ordered/edited directly within the XB interface or not.
Comment #16
wim leers
I just saw @lauriii's DrupalCon presentation, where he walked through designs that show the `7.1 Token` functionality I mentioned in concern 2 in #14. He showed DrupalCon before everyone else 😄
That enables me to actually illustrate the problem:
👆 That shows the props form for a `PROVIDER:event_card` SDC. It has 9 props:
For each of the 9, I drew an arrow on the screenshot of the design:
How do you map that onto a Field Union? The Field Union metadata would still be relevant (allowing "regular manage display" as you say), but the field data itself would be empty for 6/9 ≈ 67% of the fields in the field union.
I don't understand this paragraph in two ways:
Could you rephrase that? 🙏
(Could be me: I've had a terrible night with our RCBO/GFCI interrupting twice in the middle of the night 😬, so I'm not at 100% brain capacity.)
Comment #17
catch
Currently dynamic is 'referenced from entity fields' and static is 'stored directly in the xb field'; with field union, everything becomes a field reference/dynamic.
We discussed this during the Barcelona meeting - let's say you have five images + description components backed by field unions; when re-ordering them, we might want to re-order the field union deltas too so that the XB order and the field order stay in sync. This is opposed to say a single standard description field which doesn't have a delta order.
Even with the diagram I still don't understand what's going on here unfortunately. Personally it seems odd to me that you would place a single component then have to individually map what comes from there in it. Why is the content editor making all those 7-9 decisions about each component they add? For entity references I can see you would want selection/create etc., but not for little bits of text.
Comment #18
wim leers
🤔 Aha! Maybe I have fundamentally misunderstood how Field Unions work. This is not mentioned in https://git.drupalcode.org/project/field_union/blob/8.x-1.x/Readme.md. I'll read the underlying code instead.
That is a very fair point! 👍
I wonder if this is solely because we don't have #3455629: [PP-1] [META] 7. Content Templates — aka "default layouts" — affects the tree+props data model built yet (where it'd indeed be a Site Builder decision), or whether @lauriii truly intends for Content Creators to (be able to) decide this.
That'd definitely change this conversation!
Comment #19
catch
Sorry I may not be doing a good job explaining what I mean, it should not be necessary to look at the current field union module.
If we store XB-entered data in Field Union, then XB will be writing that data to field union field values. This means that the field union data is field data, same as any other field (except the extra stuff it adds on top).
XB's static vs. dynamic distinction is for field API vs. non field API data.
If all (or all*) entity-content data entered via XB is field API data, then I would assume that XB would switch to treating the field union data as field data and referencing it the same way that it does other field types. Albeit a more complex field type but one nonetheless.
If that doesn't clarify things, we should grab each other on slack, figure out the disconnect, then report back here.
OK let's please try to clarify that asap. If content creators cannot do this manual mapping, (I agree it's something site builders might do with bundle-level fields similar to layout builder view mode config now), then each component added within XB by site editors will be 'coherent' in that its data will come from the same place and no need for 'partial field unions' which I agree would be weird.
Comment #20
wim leers
💯 Made @lauriii aware in a meeting an hour ago 👍
Comment #21
lauriii
The main goal isn't necessarily to make content creators do the mapping themselves because the task of mapping fields to properties would be quite challenging for most content creators to manage (as @catch is arguing in #17). That said, the aim has been to include this capability consistently across the system for site builders to utilize. This would also enable us to build capabilities where site builders could pre-configure mappings to components, which would make it easier for content creators to utilize this capability.
Something to note is that the field mappings are not conceptually supposed to be restricted to Drupal fields. The plan is to eventually allow site builders to connect components to data from external APIs and other pre-configured integrations (e.g., Shopify, Zapier, Airtable, etc).
Comment #22
catch
I think if I'm understanding #21 correctly, the use case that needs to be kept open is to have a component where the site builder has pre-configured mappings from different Drupal fields (or elsewhere), for say 3/5 of the sources, and then 2/5 are entered by the content editor. But then in that case, you could have a field union with two field types providing those two values and it would still be internally consistent. Can't think of a good concrete example but something along those lines?
The other case that maybe applies here is a component where the fields are mapped for everything (maybe with a hard-coded string value on the bundle layout level), but the content editor can provide overrides on a specific content item. For example a section heading that is the same on 90% of entities but can be customized for the other 10% of entities.
But in that case, the override can be an optional field value anyway, and the override just depends on whether it has content or not. And that would still be internally consistent too.
But it sounds like it's not necessary to support a use case where the site builder sets up a component, and the content creator can unilaterally remap where things come from arbitrarily.
Comment #23
lauriii
A content creator wouldn't necessarily map properties to fields but a site builder could do this even in the context of a single page. This is also one way in which the site builder could build these pre-defined mappings in the first place.
The goal is for the framework to behave similarly regardless if you're editing a component or a page. This way you get a consistent experience across the system, and can for example start building a component while you are building a page. This is a workflow that tools like Figma have popularized.
Comment #24
catch
OK but if you do that, then the component that you're building would eventually get saved as config, and then if there are mappings involved, the source for those mappings would eventually get saved as field values - so you would still not have a page-unique field mapping? Or would you?
Comment #25
tedbow
#19 @catch
So this is different from what is proposed in the summary of this issue, correct?
In the summary there is `static_values` with the example
But in what you wrote in #19 it seems like this would be written to the `field_union` field values instead. Otherwise you wouldn't get the benefit of using it in manage display or views.
Comment #26
catch
Yes what effulgentsia wrote in the issue summary is not the same as what I was suggesting in #3440578: [PP-2] JSON-based data storage proposal for component-based page building option 3. It would be useful if he could explain what he thinks the pros/cons are and why he diverged in this issue because for me I can't really see the benefit of the specifics here where the field union is an extra config entity but doesn't do anything else.
But also I really think sorting out the use cases and desired functionality in the view modes issue needs to happen alongside this - see the discussion over there about partial field unions vs not from yesterday.
Comment #27
effulgentsia commented
Did I misunderstand option 3 from that issue? In it @catch wrote:
Isn't that what this issue's summary is also suggesting, except renaming `values` to `static_values`?
For me, the key difference between this issue and how I understood option 3 from that issue is that in this issue I'm suggesting that in addition to `static_values`, we also have the other columns, in particular `parent` and `slot`, so that each item has all of the information about the component instance: both its "Field union JSON" value and its location in the tree.
Comment #28
catch
@effulgentsia I think it might stem from:
"The field table.." in that paragraph.
What I mean here is:
"The field union table" (as distinct from current field union which does not use JSON) would store the field values, distinct/independent from the XB storage.
Instead of:
"The XB field table" would store field unions.
I thought this was a brand new approach, not a misunderstanding!
Comment #29
effulgentsia commented
Would the following reconcile #28 with this issue's current summary?
What do you mean by the field union table? Currently, field_union defines a field type (via its deriver) for every field_union config entity. I imagine a "field union json" concept would be implemented as a new field type: `dynamic_field_union`*, where the properties/columns of this field type are: `type` and `values`, where `type` is a reference to the field_union config entity for that item, and `values` is the JSON.
So that's basically the same as this issue's proposed last 2 columns. However, for XB, we also need the first 5 columns. We could add those additional columns in one of two ways:
- Subclass `dynamic_field_union` and add the extra columns. Just like how in core FileItem subclasses EntityReferenceItem.
- Aggregate `dynamic_field_union`. This might actually be nice in terms of making the `component` column a full-fledged EntityReferenceItem (sub)field in its own right.
*Note: I'm using the term `dynamic_field_union` to convey the same distinction from `field_union` as dynamic_entity_reference has from entity_reference.
Comment #30
effulgentsia commented
Given the choice of subclassing or aggregating, I think aggregating would fit the desired mental model better. FileItem subclasses EntityReferenceItem but that's because conceptually a file item is an entity reference item, that also has a description. However, the mental model of a component instance should not be that it's a dynamic field union of static values plus also some other stuff; the mental model should be that a component instance is its own thing, where one of the things that it has is static values and those static values can be modeled as a dynamic field union.
Comment #31
catch
How I had it in my head is that the XB data would only reference the field union data (similar to how it does other fields on the entity), not incorporate it as such.
Or that the component instance is its own thing, and it can reference dynamic values which happen to be in a field union.
Comment #32
effulgentsia commented
If the XB field is a multi-valued field of component instances, and there's a separate multi-valued dynamic_field_union field for the component instances' static prop values, then how would each component instance reference its corresponding dynamic_field_union item? Doing it with a numeric `delta` that gets re-ordered would be fragile. Each dynamic_field_union item would need a stable ID, which could be the same as the `instance_id` of the component instance, or it could be its own separate ID.
Currently, a regular `field_union` doesn't have the concept of a stable item ID. Would we want to add that concept to `dynamic_field_union` without also adding it to `field_union`? Would we then be adding this to both `field_union` and `dynamic_field_union` solely for the XB use-case, or would a stable item ID serve other use cases as well? If it's only for the XB use case, then what makes this better than having the XB field either extend or aggregate the `dynamic_field_union` field?
Comment #33
catch
I think it could help for translation, diff, and conflict resolution potentially? e.g. it would help to detect when field unions have been re-ordered as opposed to edited. If so, it seems as applicable to dynamic field unions as non-dynamic field unions.
Field union for me is 'field collections or paragraphs (or custom blocks) without the extra entities', so if we give those a uuid or similar then it should cover any latent use cases where having an individual identifiable thing was relied upon.
Another possible use case is for things like the featured image in an image gallery - if there's a way to select the featured image, and this is done by 'field union uuid' then that would persist across re-orderings. I would normally try to persuade someone that instead of selecting they should just automatically select the first delta instead, but if the requirements are specific it would enable use cases like that. I haven't personally used paragraphs (at least not for site building, I've seen it in performance audits...) but given it's theoretically possible to reference an individual paragraph entity now, I imagine a 'field union uuid' could be used to do similar things when they come up.
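As a rough illustration of why a stable ID survives re-ordering while a numeric delta does not, consider the following Python sketch of the featured-image example. The `uuid` key, the sample gallery items, and the `by_uuid` helper are hypothetical; neither `field_union` nor XB is assumed to work this way today.

```python
# Hypothetical gallery items, each with a stable identifier.
items = [
    {"uuid": "u-1", "alt": "Sunrise"},
    {"uuid": "u-2", "alt": "Harbour"},
    {"uuid": "u-3", "alt": "Market"},
]

# Two ways of remembering that "Harbour" is the featured image.
featured_by_delta = 1
featured_by_uuid = "u-2"


def by_uuid(item_list, uuid):
    """Find an item by its stable ID, regardless of its current position."""
    return next(item for item in item_list if item["uuid"] == uuid)


# An editor drags the last image to the front.
reordered = [items[2], items[0], items[1]]

print(reordered[featured_by_delta]["alt"])             # delta now points at a different image
print(by_uuid(reordered, featured_by_uuid)["alt"])     # stable ID still finds the same image
```

The same comparison could also help tell a pure re-ordering apart from an edit, since the set of IDs and their contents stay unchanged while only positions differ.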
Comment #34
effulgentsia commented
We likely still won't get to this in the very short term, but I'm tagging it as a stable blocker so that it stays on the radar for that. There's a chance we decide to not do any of this, or only do a subset of it, so re-titling accordingly. But either way, we should make a conscious decision before considering XB stable.
Comment #35
catch
Comment #36
catch
Postponed #3511852: Experience Builder support (default content dependency support) on this issue.
Comment #37
kristen pol
Tagging for findability.
Comment #40
wim leers
Current issue summary:
data_sources + static_values + field_union
First, the simple part. This has evolved since the issue summary was created, but it fits in the existing proposal: data_sources must be able to store DefaultRelativeUrlPropSource. I don't see why it couldn't.
Then, the harder part. For a @ComponentSource=block component instance, we would use neither data_sources nor field_union. We'd be storing something like… because block plugins have their own explicit input ("block settings") UX + storage mechanism. Only the sdc and js ComponentSource plugins use the shape matching infrastructure and StaticPropSources. Which is why they're the two that subclass GeneratedFieldExplicitInputUxComponentSourceBase, which provides a "generated field-based explicit input UX" for any future component type as long as it has (JSON) schema information for each explicit input it accepts.
@effulgentsia: how do you propose that to be represented?
One row per component instance per revision 🤔
This would definitely easily result in millions of rows (one row per component instance per content entity revision) — because it's quite literally this excerpt from #3468272: Store the ComponentTreeStructure field property one row per component instance:
would grow to something like the following based on the current issue summary + #3457504: XB field type: calculate all dependencies, store them, surface in new Component "Audit" operation:
… which in turn means querying dependencies (happening in #3457504) would have to match many more rows. Not necessarily a problem, but definitely a consequence to keep in mind.
Or … is @effulgentsia's idea to use instance_id to actually end up with
and then a separate DB table with
to allow #3469082: Add way to "intern" large field item values to reduce database size by 10x to 100x for sites with many entity revisions and/or languages? 🤔
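For readers unfamiliar with the "interning" idea, the following is only a generic sketch of the technique, not the design proposed in #3469082. It assumes hypothetical row keys (revision_id, instance_id, inputs) and stores each distinct value once, keyed by a content hash, while the per-revision rows keep only the hash.

```python
import hashlib
import json


def intern_values(rows):
    """Split rows into slim per-revision rows plus a table of unique values.

    Hypothetical shape: each input row is a dict with "revision_id",
    "instance_id" and "inputs" (any JSON-serializable value).
    """
    value_table = {}   # content hash -> serialized value, stored once
    slim_rows = []     # one row per component instance per revision
    for row in rows:
        blob = json.dumps(row["inputs"], sort_keys=True)
        digest = hashlib.sha256(blob.encode("utf-8")).hexdigest()
        value_table.setdefault(digest, blob)
        slim_rows.append({
            "revision_id": row["revision_id"],
            "instance_id": row["instance_id"],
            "inputs_hash": digest,
        })
    return slim_rows, value_table


# Three revisions of one page with two component instances each.
# Only the card's inputs change, and only in revision 3.
rows = []
for revision_id in (1, 2, 3):
    rows.append({"revision_id": revision_id, "instance_id": "hero-1",
                 "inputs": {"heading": "Welcome"}})
    card_title = "Old title" if revision_id < 3 else "New title"
    rows.append({"revision_id": revision_id, "instance_id": "card-1",
                 "inputs": {"title": card_title}})

slim_rows, value_table = intern_values(rows)
```

Here six per-revision rows end up pointing at three stored values. The row count still grows with every revision, but the large values are only stored again when they actually change.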
Comment #41
wim leers
Also, when this issue was created, ContentTemplate config entities were a distant reality. Now they're becoming a close reality, with key pieces having already landed!
Thoughts on how you'd want #3519352: Content templates, part 3b: store exposed slot subtrees on individual entities to be reflected here?
Comment #42
catch
It's a benefit rather than a problem relative to all the other options in #3457504: XB field type: calculate all dependencies, store them, surface in new Component "Audit" operation because we know that relational databases can very efficiently query an indexed varchar even when there are millions of rows. Whereas even if there are fewer rows, we either have unknown performance (indexed JSON queries across the three core-supported database types), or performance known to be bad (LIKE '%foo%').
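The difference between the two query shapes can be sketched with SQLite's EXPLAIN QUERY PLAN. This is an illustration only: the table and column names (xb_items, xb_blob, component, tree) are hypothetical, the data set is tiny, and query plans on MySQL, MariaDB, or PostgreSQL will be reported differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical relational layout: one row per component instance,
# with the component name in its own indexed column.
conn.execute("CREATE TABLE xb_items (entity_id INTEGER, delta INTEGER, component TEXT)")
conn.execute("CREATE INDEX xb_items_component ON xb_items (component)")
conn.executemany("INSERT INTO xb_items VALUES (?, ?, ?)", [
    (1, 0, "sdc.hero"),
    (1, 1, "sdc.card"),
    (2, 0, "sdc.card"),
])

# Hypothetical blob layout: the whole tree serialized into one text column.
conn.execute("CREATE TABLE xb_blob (entity_id INTEGER, tree TEXT)")
conn.executemany("INSERT INTO xb_blob VALUES (?, ?)", [
    (1, '[{"component":"sdc.hero"},{"component":"sdc.card"}]'),
    (2, '[{"component":"sdc.card"}]'),
])


def plan(sql, params=()):
    """Return the planner's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)


indexed_sql = "SELECT entity_id FROM xb_items WHERE component = ?"
like_sql = "SELECT entity_id FROM xb_blob WHERE tree LIKE ?"

indexed_plan = plan(indexed_sql, ("sdc.card",))    # index lookup
like_plan = plan(like_sql, ("%sdc.card%",))        # full table scan

relational_ids = sorted({r[0] for r in conn.execute(indexed_sql, ("sdc.card",))})
blob_ids = sorted({r[0] for r in conn.execute(like_sql, ("%sdc.card%",))})
```

Both queries return the same entities, but the equality match can use the index on the component column, while the leading-wildcard LIKE has to read every row.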
Comment #43
larowlan
Comment #44
larowlan
Reviewed this and associated issues and created the following spikes to explore some of the options and be able to size and break this up
Comment #45
larowlan
Comment #46
daffie commented
At the moment we are using JSON storage for the Experience Builder. This was done because it is the best solution from a performance perspective. In this issue we want to change that to a standard relational database structure. The result will be that, just like with the paragraph module, the performance will drop when we have complicated pages and/or a lot of pages. With JSON storage the page is stored in a single row in a single table. With a standard relational database structure the same data is split up into a lot of pieces (1 piece is a single row from a single table). When you load or update a page, every piece of the page needs to be read from the database or updated in the database. The more pieces of data that you have, the slower it will get. The same goes for the total amount of data: the more there is, the slower it will get. So, yes, you can split up the page into a lot of pieces, just like was done in the paragraph module, and the result will be that you will get the same performance problems as with the paragraph module. The main difference with the paragraph module will be that the UI of the Experience Builder will be a lot better.
My educated guess is that we want to change the storage to a standard relational database structure because we want to do things with the JSON data in the database that are not supported by MySQL or MariaDB. If I am wrong then please say so.
Support for JSON storage in MySQL and MariaDB is, and how do I say this in a polite way, pretty basic. Drupal core supports another database, and that is PostgreSQL. PostgreSQL has far more advanced support for JSON storage. Without knowing what kind of functionality the Experience Builder needs for JSON objects in the database, I am pretty sure that PostgreSQL can do it. To be fully sure, I will need a list of all the things we need from the database for JSON storage to support the features that you want to add to the Experience Builder.
We have 2 options:
1. Change the storage to a standard relational database storage and keep support for MySQL and MariaDB. The main drawback will be the less than ideal performance.
2. Keep the JSON storage and have great performance. The main drawback here is that it only works with PostgreSQL.
As I am not a designer myself, I have asked a couple of designers about the importance of performance for the Experience Builder. When I start talking about the Experience Builder their faces light up, and they very much like the demos that they have seen. When I then ask how they would like it if the performance were the same as that of the paragraph module, they get a very disappointed look on their faces. For them the performance is super important. I am not the product owner or the project owner, but for me the option to go for is very much the PostgreSQL-only option.
I know that most of you have no or very little experience with PostgreSQL. That is fine. It is a change and people do not like change. From a strategic standpoint, MySQL and MariaDB are just not the right choice. The owner of MySQL is the Oracle corporation. As the owner for the last 16 years, they have done very little to improve MySQL. They would very much like you to change your database to OracleDB. MariaDB is also supported by a single company. The problem here is that the company hasn't made a profit in a lot of years. Therefore they have very little money to improve MariaDB.
PostgreSQL, however, is supported by a large community, just like the Drupal project. PostgreSQL and its community have been run well for many years now. From a technical standpoint, PostgreSQL is by far the superior database when compared to MySQL and MariaDB. PostgreSQL can be extended just like Drupal can. What Drupal calls modules, PostgreSQL calls extensions, and there are a lot of them. They offer a lot of functionality that is not available in MySQL or MariaDB.
A lot of what Drupal can do is the result of the database it is using. Yes, the PHP part is very important, but so is the database used. With PostgreSQL you can create far more advanced solutions with Drupal.
Comment #47
lauriii
@daffie Thank you for the detailed analysis! I totally agree that performance is critical for Experience Builder. Improving performance over the existing solutions has been one of the goals since the beginning.
Are you aware of the discussion in #3468272: Store the ComponentTreeStructure field property one row per component instance? The issue includes most of the intended changes. In #3468272-33: Store the ComponentTreeStructure field property one row per component instance there was an assessment from @catch where he stated that the expected performance impact would be largely neutral. It would be great if you could look at that issue and provide any further insights there!
As for relying on PostgreSQL, the challenge is that if we introduce changes to the hosting requirements, it will slow down adoption. Because of that, it seems unlikely that Experience Builder would introduce a dependency on PostgreSQL.
Comment #48
catch
As @lauriii mentioned, the current proposed new schema is in #3468272: Store the ComponentTreeStructure field property one row per component instance. It's very different from paragraphs though, so it feels like #46 is based on a misunderstanding.
Paragraphs stores each 'component' as a separate entity. This means each row has to be a separate entity save (and that will likely involve multiple tables for different fields too). This gets considerably worse with nested paragraphs too.
e.g. one node with 50 paragraphs:
node + node_field_data + node_field_revision
(paragraph + paragraph_field_data + paragraph_field_revision) * 50
= ~153 database queries / tables.
And there's not only the overhead of the queries themselves, but building the database queries in PHP, all the entity presave/update/insert hooks etc.
The proposed schema in that issue is for each 'component' to be a delta of a single field on the main entity, with the values themselves still in JSON. Field storage already writes all deltas of a field in a single query, so it's no more database queries than a single JSON blob to write or read each time. 1 entity save = 1 entity save still.
node + node_field_data + node_field_revision + node_field_xb = 4 queries / tables.
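The arithmetic behind the two totals can be written out as a small sketch. The function names are made up for illustration, and the counts follow the simplified model used in this comment (three base tables per entity, no extra per-field tables for paragraphs).

```python
def paragraphs_tables_touched(paragraph_count, tables_per_entity=3):
    """Host node plus one separate entity (3 tables each) per paragraph."""
    return tables_per_entity + tables_per_entity * paragraph_count


def xb_field_tables_touched(tables_per_entity=3, field_tables=1):
    """Host node plus one field table, regardless of the number of components."""
    return tables_per_entity + field_tables


paragraphs_total = paragraphs_tables_touched(50)   # 3 + 3 * 50
xb_total = xb_field_tables_touched()               # 3 + 1
```

The first total grows linearly with the number of components on the page. The second stays constant because all deltas of the field are written together.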
It should actually reduce write queries compared to the current schema by making #3521202: Store XB field type's "deps_*" columns in separate table to allow efficient querying partially or wholly redundant. That issue is mostly a workaround for lack of JSON index support in core. I personally don't think that #3468272: Store the ComponentTreeStructure field property one row per component instance is a workaround at all, though. It should also help with implementation of issues like #3469082: Add way to "intern" large field item values to reduce database size by 10x to 100x for sites with many entity revisions and/or languages.
Comment #49
catch
Also just to expand on:
This is what allows all the data to remain in deltas of a single field instead of separate tables.
We don't expect the field values of XB deltas to need to be queried on (by views or entity query) very often if at all - if/when they do, then once core supports JSON queries a bit better it will be doable.
What it does help with though, is querying on what the component for a specific delta is, or which components are in use across all entities and things like that - this will be just a regular varchar in its own column.
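As a rough sketch of what those two kinds of query could look like once the component is a plain column, the example below uses SQLite and a hypothetical table name (node__field_xb) with hypothetical columns. The real schema is defined in #3468272 and may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical field table: one row per delta, component name in its own column.
conn.execute(
    "CREATE TABLE node__field_xb ("
    "entity_id INTEGER, delta INTEGER, component TEXT, inputs TEXT)"
)
conn.executemany("INSERT INTO node__field_xb VALUES (?, ?, ?, ?)", [
    (1, 0, "sdc.hero", '{"heading": "Welcome"}'),
    (1, 1, "sdc.card", '{"title": "First"}'),
    (1, 2, "sdc.card", '{"title": "Second"}'),
    (2, 0, "sdc.card", '{"title": "Other page"}'),
    (3, 0, "sdc.hero", '{"heading": "About"}'),
])

# Which component is at a specific delta of a specific entity?
component_at_delta = conn.execute(
    "SELECT component FROM node__field_xb WHERE entity_id = ? AND delta = ?",
    (1, 2),
).fetchone()[0]

# Which components are in use, on how many entities, and how many instances?
usage = conn.execute(
    "SELECT component, COUNT(DISTINCT entity_id), COUNT(*) "
    "FROM node__field_xb GROUP BY component ORDER BY component"
).fetchall()
```

Neither query needs to open the JSON in the inputs column, which is the point being made: the values stay as JSON while the component reference is queryable like any other varchar.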
Comment #50
wim leers
#3468272: Store the ComponentTreeStructure field property one row per component instance landed. Which is what prompted @catch to RTBC #3440578 at #3440578-87: [PP-2] JSON-based data storage proposal for component-based page building.
That means the first half of this issue's title is done: . 🎉
Work is under way over at #3523841 that will perform the second half of this issue's title: — see @larowlan at #3523841-35: Versioned Component config entities (SDC, JS: prop_field_definitions, block: default_setting, all: slots for fallback) + component instances refer to versions ⇒ less data to store per XB field row.
AFAICT that means this issue should be postponed, to make sure that after #3523841 is done, no additional concerns are lingering.
@catch: agreed?
Comment #51
catch
@Wim Leers, yes! AFAICT that issue covers everything that's in here, so this is hopefully mostly a reference point until that one gets resolved.
Comment #52
wim leers
👍
Let's ensure that we don't forget to revisit this after #3523841: Versioned Component config entities (SDC, JS: prop_field_definitions, block: default_setting, all: slots for fallback) + component instances refer to versions ⇒ less data to store per XB field row lands; prefixing with PP-1 :)
Comment #53
wim leers
Captured in the meta as of #3520449-31: [META] Production-ready data storage. 👍
Comment #54
lauriii
This has been covered by the issues referenced here.