Problem/Motivation

  • When retrieving an entity in json format (e.g., /node/1?_format=json), all entity fields that the user has permission to view are returned, including the local ID (nid for node entities) and revision ID (vid for node entities) fields.
  • When retrieving an entity in hal+json format (e.g., /node/1?_format=hal_json), the local ID and revision fields are excluded. The code that performs this exclusion is
    $exclude = array($entity->getEntityType()->getKey('id'), $entity->getEntityType()->getKey('revision'));
    in Drupal\hal\Normalizer\ContentEntityNormalizer.
  • The decision to exclude the local ID was made in the initial patch that added hal.module (#1924220: Support serialization in hal+json) with no clear explanation as to why. Perhaps it was believed that too many identifiers would be confusing and lead to misuse. For most use cases, the entity's URI and UUID are more reliable and semantically meaningful identifiers.
  • The decision to exclude the revision ID was made in #2219795: Config entity references broken and other bugs also without a clear explanation as to why. Perhaps it was believed that because revision IDs are also local (rather than UUIDs), that if one is excluded, then so should the other one.
  • However, there are use cases where you need the local ID. For example, suppose you have a View with a contextual filter (e.g., core provides several such as taxonomy/term/{tid} and /admin/content/files/usage/{fid}). Given a term or file, a JS application may want to output a link to one of these Views, contextually filtered to that term or file. It currently has no way to do so, since neither the term or file's URI, nor its UUID can be used as the value within such a link.
  • There are also use cases where you need the revision ID. An Entity ID is not enough to know what state of the original entity was captured; as an API consumer you need to be able to identify when a revision switches. There's no revision UUID stored in Drupal's entity schema, so the revision ID is the only revision identifier available.

Proposed resolution

Revert the mentioned change from #3 that is remove the excludes thus adding the id and revision to the normal state properties as mentioned in #9

    $exclude = array($entity->getEntityType()->getKey('id'), $entity->getEntityType()->getKey('revision'));

Remaining tasks

User interface changes

API changes

Original report by @webchick

I suppose this might be "by design" but I hope it's merely an oversight. :)

When you look at the hal+json output of a node, "nid" is not one of the top-level keys. In fact, the only way to get a node's nid seems to be by doing horrific string parsing on [_links][_self]. Yet, this seems like a really useful key, for example to construct a link to a node from a mobile app.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

webchick’s picture

Output from Postman to show what I mean:

Only place nid is shown is at the tail end of the _self link

vivekvpandya’s picture

I was discussing same issue over IRC and there was a suggestion like "use hal+json" only for POST and PATCH " . And if mobile app has GET this hal+json result for a node then I think it should have nid for the node. but in other situation it may not be the case . I am currently developing iOS app and in that I have used a REST export for a view that gives only necessary information for all nodes belongs to same bundle type . From that json entry you can create a link to a node. you can check this rest end point http://tntfoss-vivekvpandya.rhcloud.com/fossTips/rest

But I think REST is not only limited to mobile apps so we might need nid in hal+json for other use cases So I think we should first discuss in which situation we need nid .

clemens.tolboom’s picture

Why is the link _self.href not useable? On could argue to only put in 'node/nid' instead of the full URL.

Reading http://tools.ietf.org/html/draft-kelly-json-hal-06#section-8.1 does not say anything about excluding nid or revision what is what we did in

namespace Drupal\hal\Normalizer;
...
class ContentEntityNormalizer extends NormalizerBase {
...
  public function normalize($entity, $format = NULL, array $context = array()) {
...
    // Ignore the entity ID and revision ID.
    $exclude = array($entity->getEntityType()->getKey('id'), $entity->getEntityType()->getKey('revision'));
    foreach ($fields as $field) {
      if (in_array($field->getFieldDefinition()->getName(), $exclude)) {

Those lines where introduced in #2219795: Config entity references broken and other bugs

webchick’s picture

If we only had node/nid instead of the full URL, _self may have worked, though still not ideal.

What I was attempting to do is build a jQuery Mobile app that passed the nid as a query string in order to load a particular node from a list. See https://github.com/webchickenator/d8ws (note only the listing page works; the view page is hardcoded atm). Specifically:

      // Build an array of rows.
      items.push('<li><a href="#view?id=' + nid + '">' + title + '</a></li>');

You can't do "#view?id=http://d8ws.webchick.net/node/2" (well, I suppose you could, but it makes the code a huge PITA to parse).

clemens.tolboom’s picture

I think your use case should use GET with json and not hal+json as json has the nid.

I totally agree with HAL is a PITA ... I myself started with https://github.com/build2be/drupal-rest-test to install + run tests against HAL and 'normal' rest to battle wtf's :(

Where is the HAL / REST team?

webchick’s picture

I thought of that, but on the view screen I want images, which I thought were links and therefore needed HAL. But now it's been a few weeks and I don't remember WTF I was doing anymore. :)

clemens.tolboom’s picture

This is what json image looks like. Which is pretty lame for a mobile app to use. You only have to tap into the public:// stream wrapper :p

$ ./rest.sh rest node file
========================
curl --user admin:admin --header "Accept: application/json" --request GET http://drupal.d8/node/1
...
  "field_image" : [
      {
         "width" : "520",
         "alt" : "Autem camur ratis wisi.",
         "title" : "Diam dolus et imputo pneum tamen venio.",
         "height" : "439",
         "description" : null,
         "target_id" : "4",
         "display" : null
      }
   ],
...
========================
curl --user admin:admin --header "Accept: application/json" --request GET http://drupal.d8/entity/file/1
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   406    0   406    0     0    844      0 --:--:-- --:--:-- --:--:--   844
============ RESPONSE : entity/file/1 ============
{"fid":[{"value":"1"}],
"uuid":[{"value":"290206f4-a173-4b99-b91e-308308184bbf"}],
"langcode":[{"value":"und"}],
"uid":[{"target_id":"1"}],
"filename":[{"value":"imagefield_Kl3rSX.gif"}],
"uri":[{"value":"public:\/\/pictures\/imagefield_Kl3rSX.gif"}],
"filemime":[{"value":"image\/gif"}],
"filesize":[{"value":"660"}],
"status":[{"value":"1"}],
"created":[{"value":"1407330615"}],
"changed":[{"value":"1407330615"}]}
=========== END RESPONSE : entity/file/1 =========

Maybe we should focus on the serialization of File as it's URI above should be 'external' URI and not internal.

(sorry for derailing this issue)

webchick’s picture

Title: nid not part of REST output on content » nid not part of REST output on content when using HAL

Yeah, I think a dedicated issue about that makes sense. I also still think there's no harm in including nid/vid in the default HAL output, so re-titling a bit.

clemens.tolboom’s picture

Title: nid not part of REST output on content when using HAL » Add nid as part of it's state properties
Component: rest.module » hal.module
Category: Bug report » Feature request
Issue summary: View changes

Changed title and added summary.

http://tools.ietf.org/html/draft-kelly-json-hal-06#section-3

Here, we have a HAL document representing an order resource with the
URI "/orders/523". It has "warehouse" and "invoice" links, and its
own state in the form of "currency", "status", and "total"
properties.

clemens.tolboom’s picture

Issue summary: View changes
clemens.tolboom’s picture

Status: Active » Needs review
FileSize
864 bytes

Patch empties the $exclude. Let's see what crashes?

Now all fetched entities get their id + revision like node.nid, comment.cid, user.uid

Weird is I get nid, cid without patch too ... even after drush @drupal.d8 cache-rebuild :-/

clemens.tolboom’s picture

@webchick: the Hal response for file (full image URL) is done by #2277705: Files don't have URI nor href. I think we need something similar for plain json requesting an entity (node) having a file entity (image). I'm happy to fix that but currently don't know how :(

Status: Needs review » Needs work

The last submitted patch, 11: add_nid_as_part_of_it_s-2304849-11.patch, failed testing.

webchick’s picture

Title: Add nid as part of it's state properties » Add nid as part of a node's state properties

So looking at the failures, it seems like there's an explicit test in NormalizeTest.php line 160 for "Internal id is not exposed." Git blame points to #1924220: Support serialization in hal+json. It was in there since the first patch, maybe this is a HAL thing?

clemens.tolboom’s picture

I reread both the formal and informal specs now listed on https://www.drupal.org/documentation/modules/hal but did not found reasons to exclude. My reasoning would be about DRY as the nid is already in _self link.

The entity test failures are from #2219795: Config entity references broken and other bugs

imho we can revert these test to make these tests pass again.

tbc

clemens.tolboom’s picture

Status: Needs work » Needs review
FileSize
4.24 KB

The only 'prove' for not exposing the ID is in the test code

-    $this->assertFalse(isset($normalized['id']), 'Internal id is not exposed.');
+    $this->assertEqual($normalized['id'], $expected_array['id'], 'Internal id is exposed.');
dawehner’s picture

It would be great if the IS could describe why the entity IDs have been skipped previously. I guess this makes it much easier for usecases like content deployment as you don't have to drop those keys again?
Just imagine if you "export" a content entity after this patch and import is somewhere else, you end up again with having conflicting IDs, so should not internal code just always rely on the UUID?

clemens.tolboom’s picture

Issue summary: View changes

I finally found some reference to not exchange id nid.

In core/modules/serialization/src/EntityResolver/EntityResolverInterface.php the documentation says

* Drupal entities are loaded by and internally referenced by a local ID.
* Because different websites can use the same local ID to refer to different
* entities (e.g., node "1" can be a different node on foo.com and bar.com, or
* on example.com and staging.example.com), it is generally unsuitable for use
* in hypermedia data exchanges. Instead, UUIDs, URIs, or other globally
* unique IDs are preferred.

Using the entity ID is way easier for simple client apps. Server apps fetching ie a node get both nid and uuid so the receiving Drupal should decide on how to proceed.

OTOH maybe simple clients should use rest calls like curl --user admin:admin --header "Accept: application/json" --request GET http://drupal.d8/node/1 which has both nid and uuid.

clemens.tolboom’s picture

This issue probably become won't fix when #2353611: Make it possible to link to an entity by UUID lands.

webchick’s picture

Honestly, I can't see a reason not to provide this property, regardless if that issue makes it in or not. It's a part of the node. There's no reason not to provide it. Yes, "1" can differ between foo.com and bar.com, but if that's your use case, use UUIDs (which are already provided in the JSON dump).

webchick’s picture

Also, even if that makes it in, node/1 is still the canonical URL. So if you don't have to worry about mismatches between domains (as in the "headless Drupal" case), why not link there directly in the first place, rather than suffer the extra HTTP request of a 301 redirect first?

kaa4ever’s picture

I´m currently working on a D8 site with Angular as frontend. Trying to make it an one page app, it seems rather important to have the ID of the content available.
I have updated the latest patch to work on beta7, fingers crossed for this feature to make it into the core.

Status: Needs review » Needs work

The last submitted patch, 22: add_nid_as_part_of_a-2304849-22.patch, failed testing.

clemens.tolboom’s picture

@kaa4ever I hope you meant to work on latest DEV

+++ b/core/modules/hal/src/Normalizer/ContentEntityNormalizer.php
@@ -92,15 +92,7 @@ public function normalize($entity, $format = NULL, array $context = array()) {
-      if (in_array($field->getFieldDefinition()->getName(), $exclude) || !$field->access('view', $context['account'])) {
-        continue;
-      }

You should not remove the field access part.

     if (!$field->access('view', $context['account'])) {
       continue;
     }
miro_dietiker’s picture

Title: Add nid as part of a node's state properties » Add nid, vid as part of a node's state properties
Category: Feature request » Bug report

I'm extending the scope here to nid and vid.

The fact that we miss vid is a severe problem.
The revision id is relevant data and tells us about which revision the entity currently represents. Having such an identifier is a common best practice in rest, including using it in http headers such as etag if revisions would be atomic. Revisions are supported through the whole stack of Entity API. hal+json need to be revision aware. In a perfect world, you might claim that a revision should have a UUID and use this as a reference. But we don't have that UUID, so we need to provide the internal identifier.

When you sync data, it's the job of the data consuming system to decide if it want to persist things and e.g. overwrite records based on the primary identifier or if you decide drop the keys and create a record with a new identifier. The server should just provide all the data that might be relevant.

A system consuming data also can not assume anything about an URI at all. It can only understand what the URI represents by looking into data. Each module can expose or alter an entity URI or it could even be missing at all. If missing, the Entity URI is "" (an empty string). Oops. We end up with no identifier at all?

Back to the nid: The nid is stable. So is the URI supposed to be. Then the URI suddenly changes due to the alias system, and it is not necessarily stable, but in real cases autoupdated through e.g. pathauto. This leads to crazy problems of moving resources...

I didn't find any standard / best practice document that proposes to remove internal identifiers in hal+json. Providing more data than needed is usually not wrong, except it opens security holes. The D8 design though is not at a point where we want to hide internal primary keys from public completely as a design decision.

If core does not address these issues, then the default services and format will just be an example of drupalism and require all real applications to write their own serializer and resources.

A first step would be to output the missing fields. As a followup, we need to discuss and decide what the goal of the core resources are, including consideration of a revision aware resource.

klausi’s picture

So there seem to be valid use cases why people want nid and vid in the HAL response, could you add those to the issue summary so that we know why we are doing this? "It makes it easier for headless builders to have ie the nid at hand." is not really an explanation, why does the headless builder need the nid?

With nid present we allow people to shoot themselves in the foot by using nid instead of UUID, but I guess that is ok in certain cases.

Arla’s picture

Also the following comment needs modification (ContentEntityNormalizer):

    // Otherwise, output all field values except internal ID.
clemens.tolboom’s picture

Status: Needs work » Needs review
FileSize
4.83 KB

As #22 was based on beta7 and had no interdiff here's a new patch including #27

We still needs an issue summary update.

This is patch is AFAIK much wider as all content entities get their 'id' and 'revision' exposed.

Status: Needs review » Needs work

The last submitted patch, 28: add_nid_vid_as_part_of-2304849-28.patch, failed testing.

Arla’s picture

Status: Needs work » Needs review
FileSize
819 bytes
4.8 KB

Re-added access check introduced by #2365319: Entity normalization should check field access to avoid leaking data and accidentally removed in the #22 reroll.

webchick’s picture

Issue summary: View changes

Adjusted the issue summary for my use case.

Wim Leers’s picture

Title: Add nid, vid as part of a node's state properties » Add nid, vid as part of a entity's state properties
Priority: Normal » Major
Issue summary: View changes
Status: Needs review » Reviewed & tested by the community
Issue tags: -Needs issue summary update +API-First Initiative

Updated IS again, primarily based on #25. Bumping to major, also primarily based on #25.

Patch in #31 looks great, and updates test coverage.

+++ b/core/modules/hal/src/Tests/EntityTest.php
@@ -81,9 +81,8 @@ public function testNode() {
     // Verify that the ID and revision ID were skipped by the normalizer.

This comment now should be removed. Can be done on commit.

effulgentsia’s picture

Version: 8.0.x-dev » 8.1.x-dev
Status: Reviewed & tested by the community » Needs review
Issue tags: +Needs issue summary update

I agree with outputting the revision ID. Since that was added to the exclude list in #2219795: Config entity references broken and other bugs, I left a comment there asking for people on that issue to comment here with reasons for why they thought it should be excluded.

I'm not sure about outputting the internal ID. On the one hand, I appreciate the argument in #25 that it's part of the entity's data, so in the absence of compelling reasons to exclude it, we should just let the client have it and make smart choices with how to use it. On the other hand, I also agree with #26 that the current issue summary does not make a good case for its need:

It makes it easier for headless builders to have e.g. the nid at hand, for example to print links in a JS front-end directly to /node/5.

This makes no sense to me. If you want to link to a resource, use the appropriate info from _links.

items.push('<li><a href="#view?id=' + nid + '">' + title + '</a></li>');

Hm, so passing an internal ID as a query parameter might have some use cases, but I'm still not clear on what those use cases are, and why using UUID for those use cases is inferior.

So tagging for another IS update, as I think the case for outputting the internal ID needs to be stronger. I also left a comment on #1924220: Support serialization in hal+json for people there to comment here with objections to that.

Also bumping this to 8.1. I don't see a compelling reason to do it for 8.0, and I think it's technically a (REST) API addition, which by semver rules, isn't appropriate for a patch-level release.

webchick’s picture

Hm, so passing an internal ID as a query parameter might have some use cases, but I'm still not clear on what those use cases are, and why using UUID for those use cases is inferior.

It's basically the same reason we still keep node/5 as the canonical URL in Drupal despite node/76s8df678s67d86sd78f68s76d8 (or whatever) being perfectly valid alternative: brevity. ease of typing. not triggering link shorteners in clients such as twitter. etc. Since it's the canonical URL on the PHP-backed version of the site, it makes sense to me for the headless app to use the same values when building out its separate URL structure.

Overall, the requirement for defending use cases though feels feels like needless nannying/babysitting/"I know better than you" to me. These are simply properties of the data you're exposing via REST. Excluding this data, esp. when the HAL spec offers no such requirement (according to #3) is just plain confusing to app authors.

effulgentsia’s picture

Title: Add nid, vid as part of a entity's state properties » Stop excluding local ID and revision fields from HAL output
Issue summary: View changes
Issue tags: -Needs issue summary update

Overall, the requirement for defending use cases though feels feels like needless nannying/babysitting/"I know better than you" to me. These are simply properties of the data you're exposing via REST.

I don't find it intrinsically obvious that a local ID is truly a property of the entity. I think identifiers are a different kind of thing than true properties. For example, PHP objects have memory addresses, which is the PHP engine's identifier to the object, but when you serialize a PHP object, its memory address is not output as part of that serialization. Therefore, I think a use case can help to clarify in what sense a local identifier in Drupal actually is a property of the entity data. However, I think Views provides such a use case, so I rewrote the Problem/Motivation part of the issue summary with it.

If there's agreement here on the updated IS, please re-RTBC.

Arla’s picture

The new IS makes sense to me. The Views example is a good one.

Regarding use cases, I know we wanted the id and revision id available when we did #2462293: Support HAL in Collect. I cannot remember exactly why, but the goal there is simply to store "everything", and in regard to that, the absence of the id values is a deficiency.

Perhaps more understandably: With Collect, a site can send entities to another site. If the HAL serialization does not contain ids, they would have to be specified "on the side", which kinda defeats the purpose of the serialization and makes the transferred data more complex.

effulgentsia’s picture

Thanks Arla. I'm a bit confused though. If Collect is sending an entity to another site, its ID will end up being different on that other site. So for that kind of use case, why is the ID that the entity has on the source site needed? This seems similar to the PHP memory address example in #35: part of why it makes no sense for PHP to add that to the serialized data is that when you unserialize that data in a later request, it won't have the same memory address.

clemens.tolboom’s picture

(tested on wrong D8.0.x)

As long as self contains a nid instead UUID it would be weird not to emit the nid as without nid we cannot calculate easily other links which require the nid.

      "self" : {
         "href" : "http://drupal.d8/node/1?_format=hal_json"
      },
miro_dietiker’s picture

Collect knows (or wants to know) about everything relevant from the origin. Yes, a memory address wouldn't be. Reconstructing (a clone) is just a nice feature and definitively not related to the origin ID (and revision ID) problem.

An Entity ID is simply not enough to know what state of the original entity was captured. If you work with revisions, you always want to refer to a specific revisions. Even accross multiple systems. You want to know what exact revision has been captured or released. (And if you follow the principle of atomic revisions, you are pretty safe. [The Drupal content type revision flag is not, but that's subject of other issues...])
If we reduce it even further: At least as an API consumer i need to be able to identify when a revision switches, except you declare that the fact that we are working with revisions is an internal...

Arla’s picture

Issue summary: View changes

Nice addition about revision id, I'm adding it to IS.

effulgentsia’s picture

Status: Needs review » Reviewed & tested by the community

Thanks. Setting back to RTBC. Will leave it here for a day though before committing to see if anyone raises objections.

Arla’s picture

Status: Reviewed & tested by the community » Needs review
FileSize
4.88 KB
774 bytes

Updating a comment and rerunning tests against the right core.

Arla’s picture

Status: Needs review » Reviewed & tested by the community

With that cosmetic fix passed, I'm setting back to RTBC.

effulgentsia’s picture

Status: Reviewed & tested by the community » Needs review
FileSize
5.79 KB
3.01 KB

What do you think of simplifying the test to this (see patch)?

Also, I wrote up https://www.drupal.org/node/2667938 as a change record. Any thoughts on that?

Arla’s picture

While we're at it, we can even compare the entire arrays.

Not actually sure if it's better. If another change makes them different, it will be difficult to tell how from reading the test results ("Value [eight lines of array] is equal to [eight more lines of slightly different array]"). Anyway it turns out there's an unnecessary unset of "The target field" (?) so that can go at least.

The change record looks good btw. But I do think that serialization (and/or normalization) should be mentioned somewhere in the text. Serialization/normalization is not used exclusively for REST.

Arla’s picture

Edited the change record:

Diff screenshot

effulgentsia’s picture

Status: Needs review » Reviewed & tested by the community

#45's interdiff and #46's CR update look good to me.

  • catch committed 4b8acaa on 8.1.x
    Issue #2304849 by Arla, clemens.tolboom, effulgentsia, kaa4ever: Stop...
catch’s picture

Status: Reviewed & tested by the community » Fixed

I think it's reasonable to not exclude these.

However do we have an issue to add uuids to revisions - that seems like it'd be more useful to provide over REST?

Committed/pushed to 8.1.x, thanks!

effulgentsia’s picture

Published the CR.

do we have an issue to add uuids to revisions

I found #1812202: Add UUID support for entity revisions. Not sure if there are others, but adding that one as a related issue.

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.

xjm’s picture

Issue tags: +8.1.0 release notes