Problem/Motivation
I noticed this a long time ago, but didn't dig into it because there were bigger problems to solve. But the mysterious revision_translation_affected
field that isn't really a field and is exposed by both core REST and JSON API is a huge WTF for developers. I noticed it again while working on #2930028: Comprehensive JSON API integration test coverage phase 1: for every entity type, individual resources only.
It was added in #2542748: Automatic entity updates can fail when there is existing content, leaving the site's schema in an unpredictable state, to the Node
entity type, back in August 2015. The Serialization/REST test coverage was so weak that not a single change to its test coverage was necessary: nobody even realized that this was affecting the normalization of Node
entities!
That field was then generalized to all revisionable entity types in #2896845: Provide the 'revision_translation_affected' base field by default for all revisionable and translatable entity types in August 2017.
It's only since November/December of 2017 (so, a few months ago), that REST integration test coverage is complete enough to detect (many, still not all) changes in normalizations/serializations. So only since then we've at least had some way to track most (but still not all!) changes to our normalizations, so we can ask ourselves whether they make sense for Decoupled Drupal developers.
Proposed resolution
To be determined. Options discussed so far:
- Omit that field from all normalizations by marking it internal? (Original IS by @Wim Leers)
- Keep it, but only allow
view
access (@tstoeckler in #12) - Keep it, but validate it when it's being modified using a validation constraint (@Wim Leers in #24)
- Keep it, but show it conditionally: only when an entity type actually has translations enabled (@Wim Leers in #24)
- Configure on each entity resource whether internal fields should be exposed (@hchonov in #22)
- …
Remaining tasks
Discuss.
API changes
TBD.
Data model changes
TBD
Comment | File | Size | Author |
---|---|---|---|
#11 | 2933518-11.patch | 2.94 KB | timmillwood |
#11 | interdiff-2933518-11.txt | 2.96 KB | timmillwood |
Comments
Comment #2
plachWhy isn't it really a field?
Comment #3
Wim LeersTo the best of my knowledge,
revision_translation_affected
is an "internal-use metadata" field, to ensure correct handling of revisions+translations. But … I cannot find a clear explanation anywhere. So I might very well be wrong.I know for a fact that nobody knows how to interpret this data as part of the
json
,xml
orhal_json
responses that core's REST module provides. And nobody knows whether they could or should send it inPATCH
orPOST
requests. I know I don't.Hence why I wrote:
It'd be really great if you could clarify this, and hopefully confirm that it's okay to not expose this in REST/JSON API/GraphQL representations :)
Comment #4
Wim LeersAnd today I just stumbled upon #2891215, which added a new
revision_default
field which is similarly mysterious :( See #2891215-107: Add a way to track whether a revision was default when originally created.Comment #5
timmillwoodCan we expand this issue to mark both revision_translation_affected and revision_default as @internal?
On a semi-related note, #2674518: Add revision log field to views is looking to expose revision metadata fields to Views, does marking things internal also hide them from Views?
Comment #6
timmillwoodThis will need an upgrade path and tests, but let's see what testbot thinks of it.
Comment #7
Wim LeersNo.
Comment #9
Wim Leers… you posted this based on #2891215-109: Add a way to track whether a revision was default when originally created. I just responded there: #2891215-110: Add a way to track whether a revision was default when originally created. Let me quote my comment in full here:
Comment #10
plachA brief note, I will post a more complete comment later, if needed.
The new
revision_default
field is not supposed to be populated (POST/PATCH), but it may definitely be useful to have it available in the response, as it's going to be a critical part of pending revision translation and likely many other workflow / workspaces use cases. There is no new documentation around this field because the API seemed to be self-documenting to me:Not sure what's so mysterious about it.
The case of
revision_translation_affected
is slightly different: it is populated automatically by the system, but only if a value is not explicitly set by the API consumer. Hence it's a regular field, which users may find it useful to read and write: for this reason I'm not sure it's correct to mark it as internal. I was hoping that the documentation we added lately was enough to clarify the meaning of this field.Comment #11
timmillwoodok, as @Wim Leers suggested, let's keep this issue for revision_translation_affected only for now.
In this patch I have removed setting revision_default to internal, and updated block_content, node, and media resource tests.
We'd still need an upgrade path to make existing revision_translation_affected fields internal, and decide if removing this field from serialized entities is a BC break or not, and how to get around that.
Comment #12
tstoecklerCan you give a reason why someone may want to set a value for this externally? Since it's very crucial to set this correctly to ensure the integrity of the content model I think it may indeed make sense to forbid setting this field externally. At the very least we should restrict it to people with elevated access.
Comment #13
plachDo we have a way to prevent write and allow read? Because if we don't, I'd say we have to bite the bullet and keep it around, as the RTA flag can be critical to know in multilingual scenarios.
Aside from that, I'm not sure it's correct to prevent write just because we can't come up with a reason, right now. I wound't be able to mention one a couple of years ago and now I had the need to set it explicitly in #2924724: Add an API to create a new revision correctly handling multilingual pending revisions.
I think most people just need to know that they can safely ignore this field unless they know what they are doing.
Comment #14
tstoecklerWell since Rest checks access on a field-level, we could prevent edit access on this field in the access control handler. Since this field never shows up in forms this should have no other consequences.
I disagree. You can completely botch up your content by e.g. setting this to FALSE where it shouldn't be. For example, revisions will be hidden in the UI with no way to recover them (without introspecting the DB). The reverse problem is also confusing, although it's not as bad: If people set this to TRUE where it shouldn't be, content editors will see revisions with no differences which is annoying if you have some sort of publishing workflow that and you want to see what changed, e.g. with the Diff module. So granting anyone who has access to fix a typo in a node title via Rest full access to this field is problematic. I realize that this is the status quo, but that doesn't mean we shouldn't fix it.
Comment #16
plachOf course you can break your content if you use it the wrong way (which means not ignoring it, btw ;), I was not questioning that. The problem I was bringing up was with the reduced flexibility. I guess we normally prefer to allow people to do stuff at their own risk rather the opposite. That said, if we go the access control handler way, it should be alterable, right? This would be enough to address my concerns.
Comment #17
tstoecklerYes, return
AccessResult::allowed()
from ahook_entity_field_access()
implementation should be able to override that. Although would be nice to check with someone who actually understands field access logic and/or test it.So the problem I see with that is - like I hinted at in #14 - with Rest "their own" is no longer really true. I.e. any content editor of a decoupled site can write this field which is not the case with normal Form API. I agree that a custom module should still be able to set this field value, but I think we can still do the access thing, as that module can then simply choose to ignore access-checking on that field.
Comment #18
plach+1
Comment #19
hchonovIf an entity sync is implemented through REST, then I guess that both
revision_translation_affected
andrevision_default
should be exposed. How could this be achieved if they are flagged as internal?Comment #20
amateescu CreditAttribution: amateescu for Pfizer, Inc. commented@hchonov, that's a very good point. Not trying to put more gas on this fire, but this is (or was?) the main purpose of the REST module :)
Comment #21
tstoecklerRe #19: You are talking about GETting entities, though, not POST/PATCHing them, right?
Comment #22
hchonov@amateescu, this makes me think of that we need something like a setting on each entity resource whether internal fields should be exposed, so that we could define an entity resource for entity sync including all fields and also an entity resource for retrieving entities for some other use case where the internal fields should be excluded.
@tstoeckler, I am talking about GETing and POST/PATCHing. If you are able to retrieve an internal field through REST, then you should be able to include that field when POST/PATCHing an entity through REST.
Comment #24
Wim Leers#10: That docs update was AWESOME! 👌 However, that documentation is only discoverable by Drupal developers (and as we all know, even then they often either don't try to find it or don't succeed in finding it).
HTTP API consumers don't even necessarily know they're talking to Drupal. Field names should be self-explanatory as much as possible
#13 + #14:
+1
#14 + #16:
😱
#16:
… but how does one know what's the right way? A validation constraint preventing to use this the wrong way would probably achieve this. But AFAICT there's no validation constraint on this at all:
git show 88d1f67ece9644c5b0b468863a046248cde823ee | grep constraint
returns nothing. Which means that the only reason that things currently can only happen correctly, is because we simply don't have any code in core trying to set this field manually.#19:
Note that the issue title is a question. I'm not at all certain that marking it internal is the correct solution. @plach's comments indicate that there is a use case for it. @tstoeckler's comments indicate that it currently is dangerous. See my suggestions above.
That's my point exactly: it is not clear how one should interpret this field, especially if you're not already a Drupal expert. Maybe the answer is that we should mark it internal conditionally: if you haven't enabled translations for an entity type (and perhaps don't even have the
language
installed, because your site is perhaps monolingual), then there is no point in exposing this field, right?#20:
I'd say this is perhaps the necessary gas on the fire :D However, rather than derailing this issue with that long overdue discussion, I created a new issue, just for that! Hope to see you in #2937376: Document what the REST and Serialization modules are for: decoupled Drupal sites or content syncing!
#22:
That's the wrong abstraction level to do that at. See #2937376: Document what the REST and Serialization modules are for: decoupled Drupal sites or content syncing for more detail :)
That is only true for the simplest of scenarios. It is okay to make certain fields read-only. That's why we have the Field Access API. See
\Drupal\comment\CommentAccessControlHandler::checkFieldAccess()
for a bunch of examples of both read-only and create-only fields!Comment #25
hchonovWe are on the way to start implementing some sort of content syncing and were considering to use REST and at first we want to be able to sync all kind of fields without restrictions - I am pretty sure that even if some fields are flagged somehow read-only or whatever we'll end up needing a solution for overwriting this.
Do I understand it correctly that it "might" be the wrong decision to consider REST at the moment for implementing entity sync?
Comment #26
Wim Leers#25: This is what I opened #2937376: Document what the REST and Serialization modules are for: decoupled Drupal sites or content syncing for. Also, please check https://www.drupal.org/project/relaxed + https://www.drupal.org/project/entity_pilot + https://www.drupal.org/project/entity_share before building your own! :)
Comment #27
Wim LeersUpdated IS with relevant bits from #2891215-128: Add a way to track whether a revision was default when originally created.
Comment #28
Wim LeersUpdated title + proposed solution.
Comment #30
Wim LeersWith #3023194: [PP-2] Add parallel revisioning support happening, hopefully this will move forward again.
Comment #36
bklineAs further evidence for the need to resolve this issue, we noticed that as we switched our site from D8.x to D9.x the kernel tests for our REST modules were breaking for the first time. So we wrote a script to find out what was happening under the covers. The script created a node and then in ten iterations of a loop created for each of two languages first an unpublished, then a published revision. Then we loaded each of the revisions and for each of the revision each of the translations and reported whether the revision was published for the language, whether the translation was affected for the revision, and whether it was the default revision. Running it under Drupal 8 if a revision was created for a given language,
isRevisionTranslationAffected()
returnedTRUE
and it returnedFALSE
for the other language (when the language was present, which of course it wasn't for the second language in the very first revision).The Drupal 8.9 documentation for
isRevisionTranslationAffected()
says:Then we ran the same script under Drupal 9 and
isRevisionTranslationAffected()
returnedTRUE
only for the very last revision, and for BOTH languages in that revision.So we looked at the Drupal 9 documentation for the method. It says:
So I think a reasonable person would expect that if the behavior of the method is going to change, there would some corresponding change in the documentation, and maybe even some elaboration about what "is affected" means, exactly.