[META] Plan entity relationships with metadata [#3187877]

Problem/Motivation

farmOS provides a "Plan" entity type, which is used as a way of organizing Asset and Log entities around a particular purpose. For example, imagine a "Crop Plan" (bundle) which provides features for organizing "Plant" assets and "Seeding", "Transplanting", and "Harvest" logs with the purpose of planning and executing a crop plan for the season.

In farmOS 1.x, we did not use Entity Reference fields for these relationships, because we needed to store additional metadata alongside each relationship. For example, in a "Crop plan", we want to link "Plant" assets to the plan, but we also want to store things like "Days to transplant", "Days to harvest", "Harvest window", etc alongside each relationship, so that we can render a timeline view of multiple plant assets within each plan (imagine a "2020 Lettuce crop plan" with 20 varieties of lettuce, with successions, etc. So a simple Entity Reference field is not enough.

One option is to create a custom field type that extends the Entity Reference field type and adds additional columns to the database table. This was more difficult in Drupal 7, so instead we just created our own database tables with hook_schema() to represent these relationships, and bypassed the use of fields entirely.

For context, here is the 7.x-1.x Crop plan module: https://github.com/mstenta/farm_crop_plan/tree/7.x-1.x

And specifically the table we created to represent those relationships: https://github.com/mstenta/farm_crop_plan/blob/7.x-1.x/farm_crop_plan.in...

Proposed resolution

In Drupal 9 / farmOS 2.x, I would like to reassess the possibility of extending Entity Reference fields for this purpose. Field types are classes in D9, which makes this a lot more feasible.

The benefit of using Entity Reference fields is we can leverage the Entity Reference Integrity module to prevent deletion of assets and logs that are associated with a plan. In farmOS 1.x we have a bunch of custom constraint logic to achieve this, which I would love to leave behind. However, we will have to investigate whether or not Entity Reference Integrity supports field types that extend from Entity Reference fields, or if a patch might be necessary to support that.

I asked about these ideas in Drupal Slack, and @kim.kennof pointed to this module: https://www.drupal.org/project/entity_reference_override - which appears to be a good example of what we are trying to achieve.

It is also worth noting that this level of complexity may not be necessary for all plan types. Some may work fine with a standard Entity Reference field. So it may be worth considering adding a simple Asset and Log reference field to all Plan bundles by default (using BundleFieldDefinitions per #3183372: Convert config fields to base/bundle fields), but allow them to be disabled on a per-bundle basis when a plan type provides their own more advanced reference field.

I'm also curious if we can create a standard base class to use for these kinds of things - to make it easier for contrib to provide their own plan types with these advanced reference fields.

Remaining tasks

- ~~Add generic asset and log reference fields to PlanTypeBase, which all other Plan bundle plugins inherit from (and allow unsetting/overriding them).~~
- Review Entity Reference Override module to understand how it achieves the same goals, and what considerations it has dealt with.
- Investigate whether or not Entity Reference Integrity supports custom field types that extend from the core Entity Reference field type.
- Experiment with providing a custom field type in the farm_crop_plan module's 2.x branch.
- Provide base class(es) to make it easier for other contrib modules to do similar things.
- Document how to build custom plan relationship fields for contrib plan types.

User interface changes

None.

API changes

Another benefit of this approach is that we would get JSON:API exposure of the plan relationships for free, which is not something we have in 1.x currently. So this would allow for Plans to be utilized via the API in even more ways.

Data model changes

Instead of bespoke database tables for representing Plan entity relationships, we would be using Drupal fields, and all the nice things that come with that.

Comments

Comment #1

11 December 2020 at 13:28

m.stenta created an issue. See original summary.

Comment #2

m.stenta

he/him

English

CreditAttribution: m.stenta commented 11 December 2020 at 16:07

Issue summary:

View changes

So it may be worth considering adding a simple Asset and Log reference field to all Plan bundles by default (using BundleFieldDefinitions per #3183372: Convert config fields to base/bundle fields), but allow them to be disabled on a per-bundle basis when a plan type provides their own more advanced reference field.

I'm going to go ahead and add these as part of the branch I'm working on for #3183372: Convert config fields to base/bundle fields.

Comment #3

14 December 2020 at 16:17

m.stenta committed a942ae3 on 2.x

Issue #3187877: Add generic asset and log reference bundle fields to...

Comment #4

paul121 CreditAttribution: paul121 commented 22 March 2021 at 17:25

Another good reference for custom compound fields: https://events.drupal.org/seattle2019/sessions/custom-compound-fields-dr...

And a good example I found was the Webform module. They have a custom field type that extends EntityReferenceItem to include a few additional string and date fields. The actual FieldType plugin is relatively simple - much of the complexity comes from the FieldFormatter and FieldWidget plugins that must be provided. The Webform module is a good example of these as well. Link to the field type: https://git.drupalcode.org/project/webform/-/blob/6.x/src/Plugin/Field/F...

I followed the Webform module's example to create a custom entity reference field that has two additional timestamp fields. The use case being to reference plant_type terms and specify a date range with the reference. Extending from the EntityReferenceItem class makes it easy to generalize this into a more generic entity_reference_daterange field that could potentially be used elsewhere.

Provide base class(es) to make it easier for other contrib modules to do similar things.

This would be great! Of course it's going to be hard to find universal solutions, but it does seem like there could be some common field types we could provide. Thinking along the lines of entity reference + notes/text, entity reference + timestamp(s), entity reference + list_string/category, etc... these will probably come up over time.

Comment #5

paul121 CreditAttribution: paul121 commented 3 June 2021 at 04:48

The benefit of using Entity Reference fields is we can leverage the Entity Reference Integrity module to prevent deletion of assets and logs that are associated with a plan.

I had erroneously assumed that the reverse of this was enforced too, but of course it is not! But I think this is something worth considering. I imagine that logs will usually be "created by a plan" rather than "added to a plan" after the fact (at least from the users perspective, not thinking technically here). Because of this, it seems reasonable to prevent the deletion of a plan IF it is referencing logs (and/or assets?). Or maybe at least provide another confirmation form before deleting. Not suggesting that this should be the default, but perhaps something to consider since we do add asset & log references to all plan bundles by default.

Makes me think about the idea of never deleting entities and only archiving instead. I'm not a huge fan of this as a blanket approach, but IMO it would make more sense for Plans. Maybe it could be implemented as configuration on the bundle level, similar to the new_revision option on bundles.

Comment #6

m.stenta

he/him

English

CreditAttribution: m.stenta commented 3 June 2021 at 19:35

Because of this, it seems reasonable to prevent the deletion of a plan IF it is referencing logs (and/or assets?).

I think Entity Reference Integrity will cover us. We only want to prevent deleting logs/assets that are referenced by a plan... but we don't want to prevent deleting plans. Otherwise it would be impossible to delete either.

You can kind of think of it as stacked dependencies. At the bottom are assets, which have no dependencies. Next are logs, which reference assets. This prevents the assets from being deleted if logs reference them (otherwise you end up with logs with broken references). And on top of both these layers are plans, which can reference both logs and assets. This prevents those logs and assets from being deleted (otherwise you end up with plans with broken references). Each layer "locks" the layer(s) below it so that they can't be deleted.

If that makes sense?

Comment #7

m.stenta

he/him

English

CreditAttribution: m.stenta commented 3 June 2021 at 19:41

We should also update this issue with some recent discussion we had in a call (sometime between the comment that 2 months ago and now)...

One of the biggest hurdles to using "compound" fields is how those end up getting represented in the API.

For example, consider an entity reference field with an additional text field. Does that go into JSON:API attributes? Or relationships? Or both?

It gets messy quick, especially if we start thinking about the potential for fields that contain multiple entity reference fields (not sure if we'll actually need that - but technically that is how the farm_crop_plan module models its own database table (with references to planting_id, seeding_id, and transplanting_id): https://github.com/mstenta/farm_crop_plan/blob/a4114bd80a4bfd3fab7847267... - maybe we'll rethink that, but worth pointing out how complicated that would make these ideas...

The other option is to create new entity types for these fields, which would allow any number of sub-fields to be added to them, and it would all "just work" in JSON:API. It's weird, though, to think about these as additional entities IMO. But given the way that JSON:API expects data to be structured it might be the "right" way to think about it.

Comment #8

paul121 CreditAttribution: paul121 commented 3 June 2021 at 20:19

Oh sorry, yeah I understand the technical reason of integrity to prevent broken references. My thought was more on a meta / UX level, not the data model..

I imagine that logs will usually be "created by a plan" rather than "added to a plan" after the fact (at least from the users perspective, not thinking technically here).

The idea is that these logs would have less meaning without the context of the plan. When logs are created from the context of a plan imagine how a "meta reference" is created for the user, and deleting a plan would break that reference. This is certainly an opinionated idea! As an example, consider a plan that tracks some kind of "progress" via its logs that reference assets. The plan has the logic & is responsible for displaying the "progress" via the UI. Without the plan, these logs are less meaningful.

Conversely, simply using a plan to create logs is interesting & totally valid! The plan could be seen as more ephemeral.. the assumption is that their association with a plan isn't as important. Very similar to our quick forms.. :-)

Comment #9

m.stenta

he/him

English

CreditAttribution: m.stenta commented 8 June 2021 at 18:38

The idea is that these logs would have less meaning without the context of the plan. When logs are created from the context of a plan imagine how a "meta reference" is created for the user, and deleting a plan would break that reference.

Oh yes! I understand you now.

This reminds me... in farmOS 1.x, we had some custom mechanisms to "remove" and/or "delete" records associated with plans. This provided two endpoints:

- /farm/plan/[plan-id]/[entity-type]/[entity-id]/remove
- /farm/plan/[plan-id]/[entity-type]/[entity-id]/delete

See https://github.com/farmOS/farmOS/blob/2b64bc8e7c6c307c2a32366a0636651528...

And https://github.com/farmOS/farmOS/blob/2b64bc8e7c6c307c2a32366a0636651528...

These were specifically added so that plans could add links to these paths in their UI, depending on what they needed. In some cases, you ALWAYS want to delete certain records that were created through the plan. And in other cases, you want to give the user the option to simply "remove" or "unlink" a record from the plan, but NOT delete it.

Now, those two paths were for deleting/removing individual records. But you make a good point that we should have something broader, when the plan itself is deleted.

Perhaps we take a similar approach, whereby when you delete a plan, we alter the delete confirm form to display a list of records associated with the plan - perhaps as a set of checkboxes - so the user can decide if they should all be deleted with the plan or if they should be kept and simply "unlinked".

The open question might be: should there be a way to officially distinguish what kinds of records SHOULD be deleted and which SHOULDN'T? Or maybe which shouldn't even appear in the list.

For example, we've talked about creating a plan type for "recurring tasks" (#2962368: Recurring tasks). In that scenario, it's conceivable that a plan might produce MANY logs. Should all of those be referenced by the plan in a log reference field? Maybe? Maybe not? Should they all be deleted if a plan is deleted?

Makes me think about the idea of never deleting entities and only archiving instead.

Agreed! And in the case described above, it's actually better to just archive the plan. Deleting plans should only really be necessary when you want to clean things up for some reason, and don't need the archived data. That *should* be rare. So maybe we just give users the option to delete records associated with plans, and leave it up to them as to whether or not archiving is a better option.

Comment #10

paul121 CreditAttribution: paul121 commented 8 June 2021 at 20:23

Perhaps we take a similar approach, whereby when you delete a plan, we alter the delete confirm form to display a list of records associated with the plan - perhaps as a set of checkboxes - so the user can decide if they should all be deleted with the plan or if they should be kept and simply "unlinked".

Oh interesting! Yea I like this idea. Of course custom entity reference field types would complicate this a bit... but maybe the plan bundle plugin class could have a getReferencedRecords that could optionally be implemented & provide a list of referenced records? Or provide a view that displays the referenced records? Hmmmmm.

This also makes me think, it would be interesting to show an "Archive" button on every delete confirmation form... maybe that would be confusing... but might also "save a click" when entity reference integrity prevents deletion, and allow us to suggest "archiving" instead.

Comment #11

m.stenta

he/him

English

CreditAttribution: m.stenta commented 24 June 2021 at 13:08

So I think the next big step in this issue is taking a closer look at how JSON:API works with the "compound field" idea. Personally, from a data modeling perspective I think I like the idea of a compound field over another entity type, but the JSON:API side is still a big question mark.

Comment #12

m.stenta

he/him

English

CreditAttribution: m.stenta commented 20 December 2022 at 18:45

The other option is to create new entity types for these fields, which would allow any number of sub-fields to be added to them, and it would all "just work" in JSON:API. It's weird, though, to think about these as additional entities IMO. But given the way that JSON:API expects data to be structured it might be the "right" way to think about it.

It's been a long time since we've updated this issue, but I want to say that during that time I've been coming around to this idea more and more. The main disadvantage is still "another entity type" - but the benefits and flexibility it would allow may outweigh that. And it completely side-steps the thorny questions around JSON:API compound field handling. It feels like we would end up with a lot more custom code and divergences from the "norm" if we go the compound field route. Whereas with an entity type, we need to explain how it works and why to anyone who wants to develop plan types, but after that we get a lot of the other functionality we need "for free" and can use generic Entity API, JSON:API, etc.

Comment #13

m.stenta

he/him

English

CreditAttribution: m.stenta at Farmier commented 21 April 2023 at 18:32

Talked with @paul121 about this today, and we're starting to refine some decisions... nothing in stone yet but wanted to capture it here:

"What should the entity type be called"?

We're leaning towards plan_record for entity type name. It is short and sweet, and implies a relationship between a "plan" and a "record". And then bundles could be created of that for specific plan<->record relationship types.

"Should the plan reference the plan_record entities via an entity reference field on the plan? Or should the plan_record reference the plan in an entity reference field on itself?"

The nice thing about the reference being stored on the plan is it can be done by a specific field, which can give the connection more meaning. For example, maybe a plan references two kinds of assets, and wants those references to be maintained in two separate fields.

This also relates to the next question...

"Should it have any base fields? Or just bundle fields?"

Still TBD. Might be good to include some basic hidden metadata base fields like created, changed, uid, etc.

"Should they be revisionable?"

This is a similar decision we went through for quantity entitites. It means that we need to use an entity_reference_revisions field instead of an entity_reference field.

Comment #14

m.stenta

he/him

English

CreditAttribution: m.stenta at Farmier commented 30 October 2023 at 17:10

Version:

2.x-dev

» 3.x-dev

Comment #15

m.stenta

he/him

English

CreditAttribution: m.stenta at Farmier commented 20 January 2024 at 18:07

Update: I opened a pull request with a proposed approach to this...

https://github.com/farmOS/farmOS/pull/781

"Should the plan reference the plan_record entities via an entity reference field on the plan? Or should the plan_record reference the plan in an entity reference field on itself?"

Worth noting... I went with a plan reference field on the plan_record entity type instead of a reference field on plan entity, for reasons described in the PR.

The nice thing about the reference being stored on the plan is it can be done by a specific field, which can give the connection more meaning. For example, maybe a plan references two kinds of assets, and wants those references to be maintained in two separate fields.

After giving it more thought, the same "meaning" can be achieved with multiple plan_record types, and the complexity of managing field references on the plan itself seemed unnecessarily burdensome.