Support for Drupal 7 is ending on 5 January 2025—it’s time to migrate to Drupal 10! Learn about the many benefits of Drupal 10 and find migration tools in our resource center.
Problem/Motivation
- When retrieving an entity in json format (e.g.,
/node/1?_format=json
), all entity fields that the user has permission to view are returned, including the local ID (nid
for node entities) and revision ID (vid
for node entities) fields. - When retrieving an entity in hal+json format (e.g.,
/node/1?_format=hal_json
), the local ID and revision fields are excluded. The code that performs this exclusion is
$exclude = array($entity->getEntityType()->getKey('id'), $entity->getEntityType()->getKey('revision'));
inDrupal\hal\Normalizer\ContentEntityNormalizer
. - The decision to exclude the local ID was made in the initial patch that added hal.module (#1924220: Support serialization in hal+json) with no clear explanation as to why. Perhaps it was believed that too many identifiers would be confusing and lead to misuse. For most use cases, the entity's URI and UUID are more reliable and semantically meaningful identifiers.
- The decision to exclude the revision ID was made in #2219795: Config entity references broken and other bugs also without a clear explanation as to why. Perhaps it was believed that because revision IDs are also local (rather than UUIDs), that if one is excluded, then so should the other one.
- However, there are use cases where you need the local ID. For example, suppose you have a View with a contextual filter (e.g., core provides several such as
taxonomy/term/{tid}
and/admin/content/files/usage/{fid}
). Given a term or file, a JS application may want to output a link to one of these Views, contextually filtered to that term or file. It currently has no way to do so, since neither the term or file's URI, nor its UUID can be used as the value within such a link. - There are also use cases where you need the revision ID. An Entity ID is not enough to know what state of the original entity was captured; as an API consumer you need to be able to identify when a revision switches. There's no revision UUID stored in Drupal's entity schema, so the revision ID is the only revision identifier available.
Proposed resolution
Revert the mentioned change from #3 that is remove the excludes thus adding the id and revision to the normal state properties as mentioned in #9
$exclude = array($entity->getEntityType()->getKey('id'), $entity->getEntityType()->getKey('revision'));
Remaining tasks
User interface changes
API changes
Original report by @webchick
I suppose this might be "by design" but I hope it's merely an oversight. :)
When you look at the hal+json output of a node, "nid" is not one of the top-level keys. In fact, the only way to get a node's nid seems to be by doing horrific string parsing on [_links][_self]. Yet, this seems like a really useful key, for example to construct a link to a node from a mobile app.
Comment | File | Size | Author |
---|---|---|---|
#46 | Screen Shot 2016-02-13 at 19.59.40.png | 168.5 KB | Arla |
#45 | add_nid_vid_as_part_of-2304849-45.interdiff.txt | 2.82 KB | Arla |
#45 | add_nid_vid_as_part_of-2304849-45.patch | 6.77 KB | Arla |
#44 | interdiff.txt | 3.01 KB | effulgentsia |
#44 | add_nid_vid_as_part_of-2304849-44.patch | 5.79 KB | effulgentsia |
Comments
Comment #1
webchickOutput from Postman to show what I mean:
Comment #2
vivekvpandya CreditAttribution: vivekvpandya commentedI was discussing same issue over IRC and there was a suggestion like "use hal+json" only for POST and PATCH " . And if mobile app has GET this hal+json result for a node then I think it should have nid for the node. but in other situation it may not be the case . I am currently developing iOS app and in that I have used a REST export for a view that gives only necessary information for all nodes belongs to same bundle type . From that json entry you can create a link to a node. you can check this rest end point http://tntfoss-vivekvpandya.rhcloud.com/fossTips/rest
But I think REST is not only limited to mobile apps so we might need nid in hal+json for other use cases So I think we should first discuss in which situation we need nid .
Comment #3
clemens.tolboomWhy is the link _self.href not useable? On could argue to only put in 'node/nid' instead of the full URL.
Reading http://tools.ietf.org/html/draft-kelly-json-hal-06#section-8.1 does not say anything about excluding nid or revision what is what we did in
Those lines where introduced in #2219795: Config entity references broken and other bugs
Comment #4
webchickIf we only had node/nid instead of the full URL, _self may have worked, though still not ideal.
What I was attempting to do is build a jQuery Mobile app that passed the nid as a query string in order to load a particular node from a list. See https://github.com/webchickenator/d8ws (note only the listing page works; the view page is hardcoded atm). Specifically:
You can't do "#view?id=http://d8ws.webchick.net/node/2" (well, I suppose you could, but it makes the code a huge PITA to parse).
Comment #5
clemens.tolboomI think your use case should use GET with json and not hal+json as json has the nid.
I totally agree with HAL is a PITA ... I myself started with https://github.com/build2be/drupal-rest-test to install + run tests against HAL and 'normal' rest to battle wtf's :(
Where is the HAL / REST team?
Comment #6
webchickI thought of that, but on the view screen I want images, which I thought were links and therefore needed HAL. But now it's been a few weeks and I don't remember WTF I was doing anymore. :)
Comment #7
clemens.tolboomThis is what json image looks like. Which is pretty lame for a mobile app to use. You only have to tap into the public:// stream wrapper :p
Maybe we should focus on the serialization of File as it's URI above should be 'external' URI and not internal.
(sorry for derailing this issue)
Comment #8
webchickYeah, I think a dedicated issue about that makes sense. I also still think there's no harm in including nid/vid in the default HAL output, so re-titling a bit.
Comment #9
clemens.tolboomChanged title and added summary.
http://tools.ietf.org/html/draft-kelly-json-hal-06#section-3
Comment #10
clemens.tolboomComment #11
clemens.tolboomPatch empties the $exclude. Let's see what crashes?
Now all fetched entities get their id + revision like node.nid, comment.cid, user.uid
Weird is I get nid, cid without patch too ... even after
drush @drupal.d8 cache-rebuild
:-/Comment #12
clemens.tolboom@webchick: the Hal response for file (full image URL) is done by #2277705: Files don't have URI nor href. I think we need something similar for plain json requesting an entity (node) having a file entity (image). I'm happy to fix that but currently don't know how :(
Comment #14
webchickSo looking at the failures, it seems like there's an explicit test in NormalizeTest.php line 160 for "Internal id is not exposed." Git blame points to #1924220: Support serialization in hal+json. It was in there since the first patch, maybe this is a HAL thing?
Comment #15
clemens.tolboomI reread both the formal and informal specs now listed on https://www.drupal.org/documentation/modules/hal but did not found reasons to exclude. My reasoning would be about DRY as the nid is already in _self link.
The entity test failures are from #2219795: Config entity references broken and other bugs
imho we can revert these test to make these tests pass again.
tbc
Comment #16
clemens.tolboomThe only 'prove' for not exposing the ID is in the test code
Comment #17
dawehnerIt would be great if the IS could describe why the entity IDs have been skipped previously. I guess this makes it much easier for usecases like content deployment as you don't have to drop those keys again?
Just imagine if you "export" a content entity after this patch and import is somewhere else, you end up again with having conflicting IDs, so should not internal code just always rely on the UUID?
Comment #18
clemens.tolboomI finally found some reference to not exchange id nid.
In core/modules/serialization/src/EntityResolver/EntityResolverInterface.php the documentation says
Using the entity ID is way easier for simple client apps. Server apps fetching ie a node get both nid and uuid so the receiving Drupal should decide on how to proceed.
OTOH maybe simple clients should use rest calls like
curl --user admin:admin --header "Accept: application/json" --request GET http://drupal.d8/node/1
which has both nid and uuid.Comment #19
clemens.tolboomThis issue probably become won't fix when #2353611: Make it possible to link to an entity by UUID lands.
Comment #20
webchickHonestly, I can't see a reason not to provide this property, regardless if that issue makes it in or not. It's a part of the node. There's no reason not to provide it. Yes, "1" can differ between foo.com and bar.com, but if that's your use case, use UUIDs (which are already provided in the JSON dump).
Comment #21
webchickAlso, even if that makes it in, node/1 is still the canonical URL. So if you don't have to worry about mismatches between domains (as in the "headless Drupal" case), why not link there directly in the first place, rather than suffer the extra HTTP request of a 301 redirect first?
Comment #22
kaa4ever CreditAttribution: kaa4ever commentedI´m currently working on a D8 site with Angular as frontend. Trying to make it an one page app, it seems rather important to have the ID of the content available.
I have updated the latest patch to work on beta7, fingers crossed for this feature to make it into the core.
Comment #24
clemens.tolboom@kaa4ever I hope you meant to work on latest DEV
You should not remove the field access part.
Comment #25
miro_dietikerI'm extending the scope here to nid and vid.
The fact that we miss vid is a severe problem.
The revision id is relevant data and tells us about which revision the entity currently represents. Having such an identifier is a common best practice in rest, including using it in http headers such as etag if revisions would be atomic. Revisions are supported through the whole stack of Entity API. hal+json need to be revision aware. In a perfect world, you might claim that a revision should have a UUID and use this as a reference. But we don't have that UUID, so we need to provide the internal identifier.
When you sync data, it's the job of the data consuming system to decide if it want to persist things and e.g. overwrite records based on the primary identifier or if you decide drop the keys and create a record with a new identifier. The server should just provide all the data that might be relevant.
A system consuming data also can not assume anything about an URI at all. It can only understand what the URI represents by looking into data. Each module can expose or alter an entity URI or it could even be missing at all. If missing, the Entity URI is "" (an empty string). Oops. We end up with no identifier at all?
Back to the nid: The nid is stable. So is the URI supposed to be. Then the URI suddenly changes due to the alias system, and it is not necessarily stable, but in real cases autoupdated through e.g. pathauto. This leads to crazy problems of moving resources...
I didn't find any standard / best practice document that proposes to remove internal identifiers in hal+json. Providing more data than needed is usually not wrong, except it opens security holes. The D8 design though is not at a point where we want to hide internal primary keys from public completely as a design decision.
If core does not address these issues, then the default services and format will just be an example of drupalism and require all real applications to write their own serializer and resources.
A first step would be to output the missing fields. As a followup, we need to discuss and decide what the goal of the core resources are, including consideration of a revision aware resource.
Comment #26
klausiSo there seem to be valid use cases why people want nid and vid in the HAL response, could you add those to the issue summary so that we know why we are doing this? "It makes it easier for headless builders to have ie the nid at hand." is not really an explanation, why does the headless builder need the nid?
With nid present we allow people to shoot themselves in the foot by using nid instead of UUID, but I guess that is ok in certain cases.
Comment #27
ArlaAlso the following comment needs modification (ContentEntityNormalizer):
Comment #28
clemens.tolboomAs #22 was based on beta7 and had no interdiff here's a new patch including #27
We still needs an issue summary update.
This is patch is AFAIK much wider as all content entities get their 'id' and 'revision' exposed.
Comment #30
ArlaRe-added access check introduced by #2365319: Entity normalization should check field access to avoid leaking data and accidentally removed in the #22 reroll.
Comment #31
webchickAdjusted the issue summary for my use case.
Comment #32
Wim LeersUpdated IS again, primarily based on #25. Bumping to major, also primarily based on #25.
Patch in #31 looks great, and updates test coverage.
This comment now should be removed. Can be done on commit.
Comment #33
effulgentsia CreditAttribution: effulgentsia at Acquia commentedI agree with outputting the revision ID. Since that was added to the exclude list in #2219795: Config entity references broken and other bugs, I left a comment there asking for people on that issue to comment here with reasons for why they thought it should be excluded.
I'm not sure about outputting the internal ID. On the one hand, I appreciate the argument in #25 that it's part of the entity's data, so in the absence of compelling reasons to exclude it, we should just let the client have it and make smart choices with how to use it. On the other hand, I also agree with #26 that the current issue summary does not make a good case for its need:
This makes no sense to me. If you want to link to a resource, use the appropriate info from
_links
.Hm, so passing an internal ID as a query parameter might have some use cases, but I'm still not clear on what those use cases are, and why using UUID for those use cases is inferior.
So tagging for another IS update, as I think the case for outputting the internal ID needs to be stronger. I also left a comment on #1924220: Support serialization in hal+json for people there to comment here with objections to that.
Also bumping this to 8.1. I don't see a compelling reason to do it for 8.0, and I think it's technically a (REST) API addition, which by semver rules, isn't appropriate for a patch-level release.
Comment #34
webchickIt's basically the same reason we still keep node/5 as the canonical URL in Drupal despite node/76s8df678s67d86sd78f68s76d8 (or whatever) being perfectly valid alternative: brevity. ease of typing. not triggering link shorteners in clients such as twitter. etc. Since it's the canonical URL on the PHP-backed version of the site, it makes sense to me for the headless app to use the same values when building out its separate URL structure.
Overall, the requirement for defending use cases though feels feels like needless nannying/babysitting/"I know better than you" to me. These are simply properties of the data you're exposing via REST. Excluding this data, esp. when the HAL spec offers no such requirement (according to #3) is just plain confusing to app authors.
Comment #35
effulgentsia CreditAttribution: effulgentsia at Acquia commentedI don't find it intrinsically obvious that a local ID is truly a property of the entity. I think identifiers are a different kind of thing than true properties. For example, PHP objects have memory addresses, which is the PHP engine's identifier to the object, but when you serialize a PHP object, its memory address is not output as part of that serialization. Therefore, I think a use case can help to clarify in what sense a local identifier in Drupal actually is a property of the entity data. However, I think Views provides such a use case, so I rewrote the Problem/Motivation part of the issue summary with it.
If there's agreement here on the updated IS, please re-RTBC.
Comment #36
ArlaThe new IS makes sense to me. The Views example is a good one.
Regarding use cases, I know we wanted the id and revision id available when we did #2462293: Support HAL in Collect. I cannot remember exactly why, but the goal there is simply to store "everything", and in regard to that, the absence of the id values is a deficiency.
Perhaps more understandably: With Collect, a site can send entities to another site. If the HAL serialization does not contain ids, they would have to be specified "on the side", which kinda defeats the purpose of the serialization and makes the transferred data more complex.
Comment #37
effulgentsia CreditAttribution: effulgentsia at Acquia commentedThanks Arla. I'm a bit confused though. If Collect is sending an entity to another site, its ID will end up being different on that other site. So for that kind of use case, why is the ID that the entity has on the source site needed? This seems similar to the PHP memory address example in #35: part of why it makes no sense for PHP to add that to the serialized data is that when you unserialize that data in a later request, it won't have the same memory address.
Comment #38
clemens.tolboom(tested on wrong D8.0.x)
As long as self contains a nid instead UUID it would be weird not to emit the nid as without nid we cannot calculate easily other links which require the nid.
Comment #39
miro_dietikerCollect knows (or wants to know) about everything relevant from the origin. Yes, a memory address wouldn't be. Reconstructing (a clone) is just a nice feature and definitively not related to the origin ID (and revision ID) problem.
An Entity ID is simply not enough to know what state of the original entity was captured. If you work with revisions, you always want to refer to a specific revisions. Even accross multiple systems. You want to know what exact revision has been captured or released. (And if you follow the principle of atomic revisions, you are pretty safe. [The Drupal content type revision flag is not, but that's subject of other issues...])
If we reduce it even further: At least as an API consumer i need to be able to identify when a revision switches, except you declare that the fact that we are working with revisions is an internal...
Comment #40
ArlaNice addition about revision id, I'm adding it to IS.
Comment #41
effulgentsia CreditAttribution: effulgentsia at Acquia commentedThanks. Setting back to RTBC. Will leave it here for a day though before committing to see if anyone raises objections.
Comment #42
ArlaUpdating a comment and rerunning tests against the right core.
Comment #43
ArlaWith that cosmetic fix passed, I'm setting back to RTBC.
Comment #44
effulgentsia CreditAttribution: effulgentsia at Acquia commentedWhat do you think of simplifying the test to this (see patch)?
Also, I wrote up https://www.drupal.org/node/2667938 as a change record. Any thoughts on that?
Comment #45
ArlaWhile we're at it, we can even compare the entire arrays.
Not actually sure if it's better. If another change makes them different, it will be difficult to tell how from reading the test results ("Value [eight lines of array] is equal to [eight more lines of slightly different array]"). Anyway it turns out there's an unnecessary unset of "The target field" (?) so that can go at least.
The change record looks good btw. But I do think that serialization (and/or normalization) should be mentioned somewhere in the text. Serialization/normalization is not used exclusively for REST.
Comment #46
ArlaEdited the change record:
Comment #47
effulgentsia CreditAttribution: effulgentsia at Acquia commented#45's interdiff and #46's CR update look good to me.
Comment #49
catchI think it's reasonable to not exclude these.
However do we have an issue to add uuids to revisions - that seems like it'd be more useful to provide over REST?
Committed/pushed to 8.1.x, thanks!
Comment #50
effulgentsia CreditAttribution: effulgentsia at Acquia commentedPublished the CR.
I found #1812202: Add UUID support for entity revisions. Not sure if there are others, but adding that one as a related issue.
Comment #52
xjm