Problem/Motivation

In order to handle POST and PUT with JSON-LD, we need to have deserialize support.

Proposed resolution

Add a denormalize method to the JSON-LD normalizer.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

Anonymous’s picture

This will depend on #1832840: Enable fieldtype-specific JSON-LD normalization, so hopefully we can get some reviews of that.

Anonymous’s picture

Here is a proof of concept for discussion. There is lots of dirty, brittle, error prone code in there, and it does not work for Entity References either. I want to focus on discussions of the general approach.

It depends on #1832840: Enable fieldtype-specific JSON-LD normalization and #1831286: Provide machine-readable description of entity/field/property.

It uses the "@type" attribute's value (a URI) to retrieve a subset of bundle metadata. From that, it extracts the entity type and bundle machine names. This requires minting two new property URIs in the Drupal.org namespace:

  const DRUPAL_BUNDLE_NAME  = 'http://drupal.org/schema/bundleName';
  const DRUPAL_ENTITY_TYPE  = 'http://drupal.org/schema/entityType';

It currently just uses the short property names (instead of full URIs) for accessing the field values. This will only work if you have two sites that are using the exact same field names used in the same way.

The temp patch is provided for manual testing. If you create a new entity_test and then go to entity/post, it should duplicate that entity_test, except without the user_id.

klausi’s picture

Status: Active » Needs work
+++ b/core/modules/jsonld/lib/Drupal/jsonld/JsonldBundleRdfSchemaNormalizer.php
@@ -0,0 +1,48 @@
+ * @file
+ * Definition of Drupal\jsonld\JsonldBundleRdfSchemaNormalizer.

That should be "Contains Drupal\jsonld\JsonldBundleRdfSchemaNormalizer.". See http://drupal.org/coding-standards/docs#files

+++ b/core/modules/jsonld/lib/Drupal/jsonld/JsonldBundleRdfSchemaNormalizer.php
@@ -0,0 +1,48 @@
+  /**
+   * Implements \Symfony\Component\Serializer\Normalizer\NormalizerInterface::normalize()
+   */
+  public function denormalize($data, $class, $format = NULL) {
+  }

why is that method empty? Please add a comment.

+++ b/core/modules/jsonld/lib/Drupal/jsonld/JsonldEntityNormalizer.php
@@ -23,15 +24,60 @@ public function normalize($entity, $format = NULL) {
+    if (!isset($data['@type'])) {
+      // @todo Throw an exception?
+    }

Yes, probably re-use some exception from the serialization component?

+++ b/core/modules/jsonld/lib/Drupal/jsonld/JsonldEntityWrapper.php
@@ -20,7 +22,7 @@ class JsonldEntityWrapper {
-
+ ¶

trailing white space

+++ b/core/modules/jsonld/lib/Drupal/jsonld/JsonldNormalizerBase.php
@@ -31,4 +35,23 @@ public function supportsNormalization($data, $format = NULL) {
+  /**
+   * Runs a GET request against a URI.
+   */
+  protected function request($uri) {

We have to run a GET request? why? whatever you retrieve from there that should be available somewhere so that we do not have to pack it into json, return a subrequest, and unpack it from JSON. A lot of wasted CPU cycles?

+++ b/core/modules/jsonld/lib/Drupal/jsonld/JsonldNormalizerBase.php
@@ -31,4 +35,23 @@ public function supportsNormalization($data, $format = NULL) {
+    $response = $kernel->handle($schemaRequest, HttpKernelInterface::SUB_REQUEST, true);

"true" should be upper case.

Anonymous’s picture

As I mentioned in my comment above, this is a very rough proof of concept. There are a whole lot of things that are missing/wrong about the patch, it isn't even close to being ready for a real review.

What I want to focus on (the discussion of the general approach) is what you touch on here:

We have to run a GET request? why? whatever you retrieve from there that should be available somewhere so that we do not have to pack it into json, return a subrequest, and unpack it from JSON. A lot of wasted CPU cycles?

My understanding is that, based on our new Symfony underpinnings, the JSON available at this URI would be cached. If I understand the ESI stuff correctly, it's a similar concept. This means that we wouldn't have to "pack it into JSON". We would simply use a URI identifier (which remains opaque) to get a JSON object (which is basically served as a static file), and access a property of that object.

Anonymous’s picture

Issue tags: +WSCCI, +json-ld

tagging

moshe weitzman’s picture

A Symfony subrequest like this will does end up with any http communication. It is an internal dispatch, akin to menu_execute_active_handler($path) in D7. We discussed this technique in our Web Services call and folks were happy proceeding with the subrequest for now.

Anonymous’s picture

Status: Needs work » Needs review
FileSize
58.67 KB

Posting a patch for testing. Patch for review (with explanation) will come shortly.

Anonymous’s picture

This patch deserializes incoming JSON-LD into an entity. It uses the URI provided in JSON-LD's '@type' attribute to fetch a definition of the bundle metadata, from which it determine the entity type and bundle name. It then iterates through the keys in the JSON-LD and adds the values to the corresponding field on the entity. It does not support entity references yet.

The first 3 chunks should not be committed. They remove language management from the request cycle. This was necessary because a circular dependency was being detected when running the subrequest for the bundle metadata in the test environment. I didn't have this problem when I was testing manually.

Status: Needs review » Needs work

The last submitted patch, 1838596-07-deserialize__for-testing.patch, failed testing.

klausi’s picture

+++ b/core/modules/jsonld/lib/Drupal/jsonld/Tests/NormalizeDenormalizeTest.php
@@ -2,29 +2,46 @@
+class NormalizeDenormalizeTest extends WebTestBase {

why is that not a unit test? where do you need the database or a full new install?

+++ b/core/modules/jsonld/lib/Drupal/jsonld/Tests/NormalizeDenormalizeTest.php
@@ -35,38 +52,26 @@ public static function getInfo() {
     // Add German as a language.
     $language = new Language(array(
       'langcode' => 'de',
       'name' => 'Deutsch',
     ));
     language_save($language);

Aha, probably because of this. Do we really need to create a system language just to use language-specific stuff on the entity API?

+++ b/core/modules/jsonld/lib/Drupal/jsonld/Tests/NormalizeDenormalizeTest.php
@@ -141,4 +146,51 @@ public function testNormalize() {
+  /**
+   * Get the Entity ID.
+   *
+   * @param Drupal\Core\Entity\EntityNG $entity
+   *   Entity to get URI for.
+   *
+   * @return string
+   *   Return the entity URI.
+   */
+  protected function getEntityId($entity) {

Doc block and function name is wrong, this should be getEntityURI()?

+++ b/core/modules/jsonld/lib/Drupal/jsonld/Tests/SupportsSerializationTest.php
@@ -0,0 +1,105 @@
+class SupportsSerializationTest extends WebTestBase {

But this could be a unit test, right?

+++ b/core/modules/jsonld/tests/modules/jsonld_test/jsonld_test.module
@@ -0,0 +1,6 @@
+ */
\ No newline at end of file

new line missing.

Anonymous’s picture

why is that not a unit test? where do you need the database or a full new install?

I'm pretty sure that any test that relies on using the entity system can't be a UnitTest. I would be happy to be shown wrong on that... but I said the same thing the last time someone asked, and no one has shown me wrong yet.

Aha, probably because of this. Do we really need to create a system language just to use language-specific stuff on the entity API?

Yes, as described here.

Doc block and function name is wrong, this should be getEntityURI()?

The URI is also the JSON-LD @id. Since this will only be used to check the @id attribute, I use Id. I can change it if people want, though.

But this could be a unit test, right?

I was hoping that it could, but it seems that in order to instantiate FieldItem classes (such as StringItem) you need typed_data() to work, and it seems it doesn't work in a UnitTest.

klausi’s picture

I included this patch in #1839346: REST module: POST/create, but there were some rough edges so I had to hack around them.

Anonymous’s picture

Status: Needs work » Needs review
FileSize
47.41 KB

I can't reproduce the test failure on my local. There must not be a response from the subrequest to the schema, I'm not sure what would cause problems with subrequests on the testbot but not on my local.

To rerun the test on the testbots, I just rerolled the previous patch to account for #1832840: Enable fieldtype-specific JSON-LD normalization being in.

Status: Needs review » Needs work

The last submitted patch, 1838596-13-deserializer__for-testing.patch, failed testing.

Anonymous’s picture

Status: Needs work » Needs review
FileSize
45.29 KB

The hack I put in for language manager on subrequests in the test environment is no longer necessary, it seems. Taking those out to see which tests are fixed.

Status: Needs review » Needs work

The last submitted patch, 1838596-15-deserialize__for-testing.patch, failed testing.

Anonymous’s picture

So the test that is a problem right now is the NormalizeDenormalizeTest. The subrequest which gets the type information is failing on the test bots. It works on my local.

Once that is fixed, the test will still fail because there is a problem with multilingual field handling at the moment, which I still have to debug.

effulgentsia’s picture

Status: Needs work » Needs review
FileSize
1.13 KB
45.32 KB

Curious what bot thinks of this one.

Status: Needs review » Needs work

The last submitted patch, 1838596-18-deserialize__for-testing.patch, failed testing.

effulgentsia’s picture

Status: Needs work » Needs review
FileSize
1.23 KB
45.41 KB

Ok bot, how about this hack? Interdiff relative to #15.

Status: Needs review » Needs work

The last submitted patch, 1838596-20-deserialize__for-testing.patch, failed testing.

fago’s picture

This patch deserializes incoming JSON-LD into an entity. It uses the URI provided in JSON-LD's '@type' attribute to fetch a definition of the bundle metadata, from which it determine the entity type and bundle name.

I wonder why we are doing this? I mean, we know it's an entity that has been output by our system. We have all the metadata, so why do we parse it from the schema? Shouldn't it be possible to map back from @type to the entity type or determine it somehow via the route?

Anonymous’s picture

Since both you and klausi have raised this concern, we should fully discuss it.

As far as I see it, we have two options:

  • Run a subrequest on the @type URI
  • Extract route parameters from the @type URI

Here are the pros and cons I see.

Subrequest

Pro: We maintain the opacity of the URI, which means if we change the URI template between D8 and D9, it doesn't have an impact on the code required to use the @type. By not relying on the @type's URI template, we increase backwards compatibility and RESTful-ness IMHO.

Con: It does pose the challenge that if the subrequest for the schema fails, we don't have the data (as the testbot is currently demonstrating). Hopefully, issues with internal subrequests will be exposed and fixed before release, though.

Extract parameters from route

Pro: We don't have to worry about failing subrequests, and it's possible that performance is better. It also means we don't have to add the Drupal.org properties to the RDF Schema.

Con: As stated above, I believe this decreases BC and RESTfulness. In addition, I believe that extracting the route parameters using the Routing system would be more complicated (though Larray can correct me). The attributes aren't parsed on Request instantiation, it seems we would have to dispatch KernelEvents::REQUEST to extract the arguments we need. Alternately, we could parse it directly from knowledge we have of the URI template that RDF module uses, but that couples the two modules far too much.

effulgentsia’s picture

In addition to #23, a benefit of treating the URI opaquely and getting its content with a request is that we can later support that URI being to a different site. For example, in a staging to prod deployment, the staging site can pass to the prod site a $data object where $data['@type'] is a URI on the staging site.

effulgentsia’s picture

Status: Needs work » Needs review
FileSize
1.04 KB
45.52 KB

Going more extreme to unblock progress.

Status: Needs review » Needs work

The last submitted patch, 1838596-25-deserialize__for-testing.patch, failed testing.

Anonymous’s picture

Status: Needs work » Needs review
FileSize
6.88 KB
46.18 KB

The problem with the multilingual fields is as follows:

In order to assign multilingual values to a field, the entity's default language cannot equal LANGUAGE_NOT_SPECIFIED, as explained in the code comment below.

  public function getTranslation($langcode, $strict = TRUE) {
    // If the default language is LANGUAGE_NOT_SPECIFIED, the entity is not
    // translatable...

However, I don't think that we want to force our consumers to specify the default language of the entity. They should just be able to assign values in multiple languages to a field.

The attached patch uses the language if specified in the JSON-LD. If it isn't specified, it checks the field values to see if any are in a language other than the site default (i.e. if there is any translated content). It then sets the default language for the entity to the site's default language. It also fixes a fatal error in RdfBundleSchemaNormalizer.

Interdiff is from #15.

Status: Needs review » Needs work

The last submitted patch, 1838596-27-deserialize__for-testing.patch, failed testing.

Anonymous’s picture

This patch will clear up the SiteSchema test fail. The only remaining fail is because of the testbot issue.

Anonymous’s picture

Status: Needs work » Needs review

Status: Needs review » Needs work

The last submitted patch, 1838596-29-deserialize__for-testing.patch, failed testing.

klausi’s picture

+++ b/core/modules/jsonld/lib/Drupal/jsonld/JsonldEntityNormalizer.php
@@ -23,17 +31,99 @@ public function normalize($entity, $format = NULL) {
+    $schema = $this->request($typeUri);
+    $entity_type = $schema->{RdfConstants::DRUPAL_ENTITY_TYPE};
+    $values = array(
+      'type' => $schema->{RdfConstants::DRUPAL_BUNDLE_NAME},
+    );

You cannot rely on $schema here as it might be NULL when something invalid is passed. I'm currently experiencing this with the REST creation tests. And we should have a test case for invalid @type URIs.

klausi’s picture

Status: Needs work » Needs review
FileSize
1.33 KB
46.37 KB

New patch, contains some changes required for #1839346: REST module: POST/create. What do you think testbot?

Lin do you have the work on this issue pushed to a git repository somewhere? I pushed my work to my REST sandbox to the deserialize-lin-1838596 branch: http://drupal.org/sandbox/klausi/1815566

Status: Needs review » Needs work

The last submitted patch, deserialize-1838596-33-interdiff.patch, failed testing.

fago’s picture

For example, in a staging to prod deployment, the staging site can pass to the prod site a $data object where $data['@type'] is a URI on the staging site.

Interesting idea, but I fear it won't work anyway as the code behind will just be able to deal with storing entities that look like the outputted ones. Drupal does not have the flexibility of an RDF-store that supprots storing data with any metadata ;) Instead it has to be prepared to store it, meaning we need to have the metadata already. Thus, if we can handle staging entities, we need have their metadata already also and don't need to parse the remote schema.

So yes, I see being able to parse the schema and constructing an object based upon this information as a nice feature. But it's complex and raises more related questions. Do we trust that schema? What about validation constraints, will those be part of the schema?

Where I'd see metadata extraction out of the schema being very handy, is when doing remote entities. But you really need to be careful here if you are not able to fully trust your source...

Alternately, we could parse it directly from knowledge we have of the URI template that RDF module uses, but that couples the two modules far too much.

That sounds feasible to me, you could even let the RDF module do the extraction as it does the generation already.

Anonymous’s picture

So yes, I see being able to parse the schema and constructing an object based upon this information as a nice feature. But it's complex and raises more related questions. Do we trust that schema? What about validation constraints, will those be part of the schema?

You're right here, but that's why my post didn't bring up the issue of remote entities yet. I think there are benefits to this approach beyond the use of remote schemas, as listed in the pros/cons.

That sounds feasible to me, you could even let the RDF module do the extraction as it does the generation already.

Parsing URIs feels brittle to me... and it seems like rewriting part of the routing system. I prefer to handle URIs as opaque identifiers. However, if others agree with this approach, I'll give it a shot. Another option would be preparing an index, which I would prefer to parsing.

First, though, I would like someone to explain what the disadvantages of the subrequest approach are. If it is performance, I do believe we could consider an ESI approach (since the cache would only have to be purged when a new content type or field is added). If it is trust, we already agreed that we would only dereference routes within the same domain for now.

Anonymous’s picture

I've spun the discussion of subrequest/index/parsing out into #1852812: Use cache to get entity type/bundle metadata from JSON-LD @type URI.

fago’s picture

klausi’s picture

Title: Add deserialize for JSON-LD » Restrict data retrieving routes to media type
Component: base system » rest.module
Category: feature » bug
Status: Needs work » Needs review
Issue tags: -json-ld
FileSize
1.15 KB
46.45 KB

In order to move forward here I updated the patch to manually parse the entity type and bundle information from the @type URI. That should move the testbot failures out of the way and we can continue the more advanced solution in #1852812: Use cache to get entity type/bundle metadata from JSON-LD @type URI ass followup.

klausi’s picture

Title: Restrict data retrieving routes to media type » Add deserialize for JSON-LD
Component: rest.module » base system
Category: bug » feature
Issue tags: +json-ld

Sorry, dreditor destroyed the form values when I wanted to restore previously entered data.

Status: Needs review » Needs work
Issue tags: -WSCCI, -json-ld

The last submitted patch, deserialize-1838596-39.patch, failed testing.

klausi’s picture

Status: Needs work » Needs review

#39: deserialize-1838596-39.patch queued for re-testing.

Status: Needs review » Needs work

The last submitted patch, deserialize-1838596-39.patch, failed testing.

klausi’s picture

Status: Needs work » Needs review

#39: deserialize-1838596-39.patch queued for re-testing.

Status: Needs review » Needs work
Issue tags: +WSCCI, +json-ld

The last submitted patch, deserialize-1838596-39.patch, failed testing.

klausi’s picture

Status: Needs work » Needs review
FileSize
8.85 KB
25.33 KB

New patch, all dependencies to RDF kicked out to be sorted out in a separate issue. This should unblock the progress on general deserialization.

I have no idea why the testbot failed on the previous patch, since module enabling/disabling should have nothing to do with this patch.

Interdiff is against the original for-review-patch from Lin in #29.

Anonymous’s picture

Status: Needs review » Needs work
+++ b/core/modules/jsonld/lib/Drupal/jsonld/JsonldEntityNormalizer.phpundefined
@@ -23,17 +29,103 @@ public function normalize($entity, $format = NULL) {
+    // @todo use a sub request to the schema information instead of manually
+    // parsing the URI.
+    $parts = explode('/', $typeUri);

As discussed on the call, we decided not to do a subrequest for this. This should reference #1852812: Use cache to get entity type/bundle metadata from JSON-LD @type URI instead.

Other than that, this looks fine. We do have to recognize that there is a whole lot of work to be done to make this work in a shippable way, but as long as everyone is aware of that I'd say it's fine to commit. I have mostly completed the event-based type handling that we talked about on the call so I should post a patch to the other issue today.

klausi’s picture

Status: Needs work » Needs review
FileSize
950 bytes
25.35 KB

Yay, the tests are green! What a relief.

New patch that updates the comment as proposed by Lin.

moshe weitzman’s picture

Status: Needs review » Reviewed & tested by the community

test bot and lin are satisfied so rtbc

Crell’s picture

I mostly understand what's going on here. :-) My one concern are the static:: calls. Why do we have those? They're in objects that are getting instantiated, so why are those static? (Statics are almost-always-bad.)

Lars Toomre’s picture

Status: Reviewed & tested by the community » Needs review

If this patch gets re-rolled again, ...

+++ b/core/modules/jsonld/lib/Drupal/jsonld/Tests/SupportsSerializationTest.phpundefined
@@ -0,0 +1,105 @@
+    // Supported entity.
+    $this->assertTrue($this->normalizers['entity']->supportsNormalization($supportedEntity, static::$format), "Entity normalization is supported for $format on content entities.");
+    // Unsupported entity.

This string concatenation should use format_string(). Same in a couple of other test assertion messages below.

Lars Toomre’s picture

Status: Needs review » Reviewed & tested by the community

Whoops did not mean to change status...

klausi’s picture

@Lars: why? format_string() has no benefit on assertion messages? Could you point me to a documentation page where this is recommended?

Anonymous’s picture

Discussed this a bit with Crell in IRC... we decided to use static::$format in the serialize issue. Symfony's JsonEncoder used a constant, which it referenced as self::FORMAT. I had originally copied that style, but then reviewers said to change it to static.

I don't think we need it to be static though, so I would be fine with removing the static keyword.

Lars Toomre’s picture

@klausi - I am not sure where the use of format_string() is in our documentation standards. The use of format_string() was part of #500866: [META] remove t() from assert message and its various sub-issues.

The other thing done there where appropriate was to use single quotes around assertion messages instead of double quotes. My understanding that the latter change was done to follow best practice for string handling. That also could be changed in the current patch.

Anonymous’s picture

I can't tell whether this is the intended use of format_string or not

After looking again, yes, this is the intended use of format_string it seems.

klausi’s picture

Then we should provide input to that issue, because it is totally wrong to use format_string() in assertion messages. format_string() should be used when user provided input is involved as its purpose is to sanitize variables. Since 99.99% of all assertion messages don't need sanitization this is just a waste of function calls and CPU cycles.

So the patch should be fine regarding assertion messages.

Dries’s picture

Status: Reviewed & tested by the community » Fixed

Committed to 8.x. Thanks.

Automatically closed -- issue fixed for 2 weeks with no activity.