Problem/Motivation

Tracking paragraphs is really hard. A big reason for that is the poor handling of stale paragraphs in the ERR/paragraphs module (which is hard to get right).

The module currently tracks each paragraph entity on its own and then, at runtime when building the usage list, tries to find the parent and display it as a hierarchy based on the fields.

Proposed resolution

This is just a rough idea right now, but it might be worth exploring. Instead of directly tracking the paragraphs that are being saved, the module would only track the entities that have those paragraphs attached, e.g. nodes and paragraph library items. It would detect ERR fields pointing to paragraphs, transparently include anything those paragraphs are using, and attribute it to the host entity.

We are currently doing something like that simply to figure out all the media entities referenced by a node, and it would be way simpler if we could just look for usages of that node directly.

We would lose the hierarchy information, but I'd say that is not very useful anyway. In turn, if a paragraph is no longer used on a node, it would simply no longer be counted, as nothing would be saved that references it.
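The attribution described above can be sketched in a few lines. This is only an illustration in Python, not Drupal code: the `Entity` class, the `references` mapping and `collect_usages()` are hypothetical names invented for the example, and real paragraph handling (revisions, translations, library items) is ignored.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """Hypothetical stand-in for a content entity; not a Drupal class."""
    entity_type: str
    entity_id: int
    # Field name -> list of referenced entities (assumed structure).
    references: dict = field(default_factory=dict)


def collect_usages(host):
    """Return the (type, id) targets used by host, looking through paragraphs."""
    targets = set()
    stack = [host]
    while stack:
        current = stack.pop()
        for items in current.references.values():
            for referenced in items:
                if referenced.entity_type == "paragraph":
                    # Paragraphs are transparent: descend instead of recording them.
                    stack.append(referenced)
                else:
                    targets.add((referenced.entity_type, referenced.entity_id))
    return targets


media = Entity("media", 7)
inner = Entity("paragraph", 2, {"field_image": [media]})
outer = Entity("paragraph", 1, {"field_items": [inner]})
node = Entity(
    "node", 10,
    {"field_content": [outer], "field_related": [Entity("node", 11)]},
)

# Everything is attributed to the node; no paragraph appears as source or target.
usages = collect_usages(node)
```

Because only the host is ever a source, a paragraph that is removed from the node simply stops contributing targets the next time the node is saved.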

Remaining tasks

User interface changes

API changes

It might require some API changes, as \Drupal\entity_usage\EntityUpdateManager::trackUpdateOnEdition() and so on would need a way to tell plugins which entity the usage should be reported on.

Looking at that, it does seem rather inefficient right now: each usage does its own merge query plus event, which is pretty heavy. If you're building large sites with a lot of paragraphs, I imagine a considerable number of target entities can quickly become involved.

Maybe there could be something like an EntityUsages value object that the plugins could add usages to. It could then do an optimized bulk update similar to how field saving works (delete all rows for a specific source, then do a single multi-insert query) and also dispatch a single event for all those usages.

As a side effect of that, EntityUpdateManager would then have full control over what those usages are reported on and could, e.g., pass in N referenced paragraph entities and track everything on a single EntityUsages object.
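To make the bulk-update idea concrete, here is a minimal sketch in Python using an in-memory SQLite database. Everything in it is an assumption made for illustration: the `EntityUsages` class shape, the `save_usages()` helper, the `usage_sketch` table and its columns are hypothetical and do not mirror the module's real schema or API.

```python
import sqlite3


class EntityUsages:
    """Hypothetical value object collecting all usages of one source entity."""

    def __init__(self, source_type, source_id):
        self.source = (source_type, source_id)
        self._counts = {}

    def add(self, target_type, target_id, field_name, count=1):
        # Repeated references to the same target/field are merged in memory.
        key = (target_type, target_id, field_name)
        self._counts[key] = self._counts.get(key, 0) + count

    def rows(self):
        return [self.source + key + (n,) for key, n in sorted(self._counts.items())]


def save_usages(conn, usages, dispatch):
    """Replace every stored row for the source with the collected set."""
    rows = usages.rows()
    with conn:  # one transaction: commit on success, roll back on error
        conn.execute(
            "DELETE FROM usage_sketch WHERE source_type = ? AND source_id = ?",
            usages.source,
        )
        if rows:
            # One multi-row INSERT instead of one query per usage.
            placeholders = ", ".join(["(?, ?, ?, ?, ?, ?)"] * len(rows))
            params = [value for row in rows for value in row]
            conn.execute(f"INSERT INTO usage_sketch VALUES {placeholders}", params)
    dispatch(usages)  # a single event for the whole batch


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE usage_sketch (source_type TEXT, source_id INTEGER,"
    " target_type TEXT, target_id INTEGER, field_name TEXT, usage_count INTEGER)"
)
# A stale row left over from an earlier save of the same source.
conn.execute("INSERT INTO usage_sketch VALUES ('node', 10, 'media', 99, 'field_old', 1)")

events = []
usages = EntityUsages("node", 10)
usages.add("media", 7, "field_image")
usages.add("media", 7, "field_image")
usages.add("node", 11, "field_related")
save_usages(conn, usages, events.append)

stored = conn.execute(
    "SELECT target_type, target_id, usage_count FROM usage_sketch ORDER BY target_id"
).fetchall()
```

The delete-then-insert pattern also takes care of stale rows for free: anything not re-added during the current save disappears with the delete.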

Data model changes

Comment  File        Size       Author
#8       usage1.png  183.21 KB  marcoscano

Comments

Berdir created an issue.

marcoscano’s picture

Thanks for reporting this!

I totally understand the problematic use case of the composite entities being tracked on their own. In principle, I'm not against the idea of making paragraphs "transparent" to the API and only registering the relationship between non-paragraph entities. I agree that intermediate field names or the number of paragraphs in the chain aren't very relevant for most of the use cases this module is intended for, so it shouldn't be a big deal to lose that.

I would be curious to understand better what the implementation of this would look like. Originally, the idea was to have an API as unopinionated as possible, but we are already departing from that principle while building the list on the UI, special-casing paragraphs...

In an ideal world, it would be great if we could make that a configurable setting, so sites could individually set/unset paragraphs to become transparent or not. However, I'm afraid that may become quite hard to achieve with the current plugin implementation.

About the performance issue: so far we were under the assumption that a mitigating factor is that the usage is recorded during the save operation (i.e. an admin action), where the server response time is generally not so critical. And once it's done, everything is available in the database. That said, I'm interested in learning more about how it could become a problem, and in trying to improve that. Maybe we can open a separate issue for that, or do you think this change is intrinsically related to the composite change proposed on this issue?

Thanks!

berdir’s picture

> Maybe we can open a separate issue for that, or do you think this change is intrinsically related to the composite change proposed on this issue?

It's not directly connected. I brought it up because it might allow us to "kill two birds with one stone": performance improvements as well as taking the decision of what exactly to report against out of the plugins.

And yes, this isn't in the critical path, but that doesn't mean it doesn't need to be optimized at all. There's also the bulk update, which isn't something that is done often, but if you have to re-index everything and have a million nodes, it might make quite a difference in the end.

Agree that it might make sense to be configurable. On the other hand, you'd then need to keep the complexity of dealing with paragraphs in two places. The configurability could simply be whether or not paragraphs are tracked explicitly as a source entity.

I'll see if I can come up with a PoC or get someone from our team to work on it. It's not a high priority right now, mostly I wanted to write my thoughts down so I can focus on something else for now ;) On the other hand, the module is still in alpha now, and the longer we wait, the harder it would be to deal with the impact such a change would have on the plugins.

marcoscano’s picture

For visibility: we might end up marking this as won't-fix in the 2.x branch, since in a theoretical 3.x branch being cooked in #3060802 (Refactor module architecture in a simpler, opinionated and more performant approach) this wouldn't be an issue anymore.

rp7 made their first commit to this issue’s fork.

rp7’s picture

I've been taking a stab at this, trying to stay within the constraints of the 2.x version of this module. Not the finished product yet (some tests are failing - not yet sure why), but this is my progress so far.

The route I went is creating a new @EntityUsageTrack plugin (paragraph_inherited) that goes through all the paragraphs a (host) entity references and tracks those references on the host entity itself.

> We would lose the hierarchy information but I'd say that is not very useful anyway.

Well, actually it is for the editors on the project that I'm working on. They have pages with quite a complex nested paragraph structure. It helps them tremendously to immediately see in which nested paragraph the entity is referenced. This is the reason I expanded the field_name column so that it contains all the fields of the paragraph chain. It adds quite some complexity though, I'll give you that.
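As an illustration only (this is not the code from the merge request), the idea of recording the whole field chain on the host could look roughly like the Python sketch below. The dictionary layout, the function name `track_with_field_chain()` and the `.` separator used to join field names are all assumptions made for the example.

```python
def track_with_field_chain(entity, chain=()):
    """Yield (target_type, target_id, field_chain) for everything under entity."""
    for field_name, items in entity["references"].items():
        path = chain + (field_name,)
        for referenced in items:
            if referenced["type"] == "paragraph":
                # Keep descending, remembering which fields led here.
                yield from track_with_field_chain(referenced, path)
            else:
                # Hypothetical encoding: field names joined with a dot.
                yield referenced["type"], referenced["id"], ".".join(path)


media = {"type": "media", "id": 7, "references": {}}
inner = {"type": "paragraph", "id": 2, "references": {"field_image": [media]}}
outer = {"type": "paragraph", "id": 1, "references": {"field_items": [inner]}}
node = {"type": "node", "id": 10, "references": {"field_content": [outer]}}

# One row on the node, with the path through the nested paragraphs preserved.
rows = list(track_with_field_chain(node))
```

This keeps the nesting information editors rely on while still reporting the usage against the host entity rather than against each paragraph.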

Not battle-tested just yet, but our initial internal tests look good. We have custom logic in various places that uses entity usage information, and no longer having to deal with intermediary paragraphs makes things less complex.

Insights & help always welcome.

marcoscano’s picture

Status: new, file size: 183.21 KB

Thanks for working on this. I haven't reviewed the MR yet, but re-rolled it so we can test manually and see the results of the automated tests.

Functionally, I see that for each level of nesting, we are creating a new row in the usage table... Wouldn't that be confusing to editors?

[Image: usage]

gugalamaciek made their first commit to this issue’s fork.

marcoscano’s picture

Heads up: the work being done in #3547273 (Deleting paragraphs causes the entity usage tab to display confusing information) has a large overlap with this one, potentially even making it a duplicate.