Problem/Motivation

Currently, enabling the HAL module breaks the recreation of captured entities.

Proposed resolution

The easy solution is to change 'hal_json' to 'json' in the normalize() calls when capturing, I think. But HAL introduces identification of entities by URI rather than ID, a concept which would be very useful in Collect as well. So possibly we should rather add support for HAL in the recreation flow.

Remaining tasks

User interface changes

API changes

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

mbovan’s picture

Component: Metadata Management » Storage
Status: Active » Needs review
FileSize
31.63 KB
6.63 KB

As hal module provides URIs of referenced entities (of captured entity) inside _embedded element and not as a separate values (like JSON does) it requires us to make changes in our plugins. We have to parse _embedded to get the data we previously had as elements in values container (e.g. uid...). Then, we need to create property definitions (with type string) for those fields that don't exist in values container (only in fields definition container) and set the data from parsed _embedded element. Doing this, we are able to use URIs instead of IDs as values of reference fields.

There is a problem with some fields (e.g Node ID, Revision ID) which are ignored by hal, so we don't have that data captured. Anyway, during recreation of entities we skip those values, so maybe we can think about skipping them even in the entity capture process.

Because we display the URIs instead of target IDs, I modified CollectDataFormatter to display a link to the captured referenced entity container if it exists / (if it is captured). In my opinion, that gives us a better UI for showing relationships between containers. Also, I added web tests for this UI feature.

miro_dietiker’s picture

This whole issue leads us to challenging problems and i'm not yet sure about the design implications and decisions to take.

We discussed that:
- We don't like that Hal removes values and we lose information through this. See also #2304849: Stop excluding local ID and revision fields from HAL output
- Field definitions provided by a client or anyone should expose the data representation and not the internal form. Thus entity_references are represented as URIs and need to be described as URIs/strings.

Hal and how it deals with references is strongly related to our support for relationships. We don't have that yet so we are at the beginning of the related learnings.

Berdir mentioned that we possibly need consider implementation of an own serializer to follow our rules / expectations.
I will create followup issues about the (many) identified related problems...

mbovan’s picture

- Changed creation of DataDefinition to type URI for fields that have a type entity_reference/file/image.
- Enabled hal for collect_client.
- Fixed output for Uri data definitions.
- Fixed FieldDefinitionSchema plugin output.
- Fixed tests.

The nid/rid values are skipped, as hal doesn't provide that kind of data.

As now, if we want to compare field definitions with the current system we will get different results (even if there are no missing fields/bundles) because we have different structure. The question is, do we want to convert current field definitions data to match the structure we are capturing or to leave like this so we can clearly see which fields are "different"?

Recreation of entities is half-broken. There are no visible errors, but entity reference fields are skipped.
But, creation of missing fields is broken in case we want to create those fields that we don't have data/settings for (e.g. field_image).

Arla’s picture

Status: Needs review » Needs work
  1. +++ b/client/src/Plugin/collect_client/EntityItemHandler.php
    @@ -69,12 +69,20 @@ class EntityItemHandler extends PluginBase implements ItemHandlerInterface {
    +      if (in_array($field_definition->getType(), ['entity_reference', 'file', 'image']) && \Drupal::entityManager()->getDefinition($field_definition->getSetting('target_type'))->isSubclassOf('\Drupal\Core\Entity\FieldableEntityInterface')) {
    
    +++ b/src/CaptureEntity.php
    @@ -138,6 +150,13 @@ class CaptureEntity {
    +      if (in_array($field_definition->getType(), ['entity_reference', 'file', 'image']) && $this->entityManager->getDefinition($field_definition->getSetting('target_type'))->isSubclassOf('\Drupal\Core\Entity\FieldableEntityInterface')) {
    

    This huge line could use some splitting and/or comments to clarify what is happening. Also, could we extract it to a collect_common function? Because we use it on both sides.

  2. +++ b/client/src/Plugin/collect_client/EntityItemHandler.php
    @@ -69,12 +69,20 @@ class EntityItemHandler extends PluginBase implements ItemHandlerInterface {
         $result->data = json_encode([
           'values' => $values,
           'fields' => $fields,
    -      'operation' => $item['operation']]
    +      'operation' => $item['operation'],
    +      ]
         );
    

    The ([ are on the same line, so the ]) should also be.

  3. +++ b/src/Plugin/Field/FieldFormatter/CollectDataFormatter.php
    @@ -20,7 +21,10 @@ use Drupal\Core\Field\FormatterBase;
    +use Drupal\Core\TypedData\Plugin\DataType\Uri;
    +use Drupal\Core\TypedData\Type\UriInterface;
    

    Unused

  4. +++ b/src/Plugin/Field/FieldFormatter/CollectDataFormatter.php
    @@ -170,13 +174,19 @@ class CollectDataFormatter extends FormatterBase implements ContainerFactoryPlug
    +    $uri = SafeMarkup::checkPlain($value->getString());
    +    if (UrlHelper::isValid($uri)) {
    +      $uri = $this->getContainerUrl($uri);
    +    }
    

    I think the checkPlain will encode some URLs, especially with & in them, so probably better to check UrlHelper::isValid() first, and if not a URL, checkPlain it. (As the comment says.)

  5. +++ b/src/Plugin/collect/Schema/CollectJsonSchema.php
    @@ -127,13 +127,25 @@ class CollectJsonSchema extends SchemaBase implements DynamicSchemaTypedDataInte
    +    // Extract embedded values and create elements of data array with the field
    +    // values which are not explicitly displayed in values container data.
    

    This comment was a little bit confusing to me. I think this at least needs a description of what kind of data is in _embedded.

  6. +++ b/src/Plugin/collect/Schema/CollectJsonSchema.php
    @@ -127,13 +127,25 @@ class CollectJsonSchema extends SchemaBase implements DynamicSchemaTypedDataInte
    +          $values[$field_name] = \Drupal::typedDataManager()->create($property_definition->getDataDefinition(), $data['values'][$field_name], $field_name, $container_entity_adapter->get('data')->first());
    

    Why the change of the create() params? If this causes problems, I think we can just remove the $container_entity_adapter and leave the 3rd and 4th parameters empty.

  7. +++ b/src/Plugin/collect/Schema/FieldDefinitionSchema.php
    @@ -155,9 +163,15 @@ class FieldDefinitionSchema extends SchemaBase implements SpecializedDisplaySche
    +          if ($field_definition['type'] == 'field_item') {
    +            $field_definitions[$field_name] = $this->serializer->denormalize($field_definition, 'Drupal\Core\Field\FieldDefinitionInterface');
    +          }
    +          else {
    +            $field_definitions[$field_name] = $this->serializer->denormalize($field_definition, 'Drupal\Core\TypedData\DataDefinitionInterface');
    +          }
    

    The only difference between the cases is the class name, maybe we could move the denormalize call outside the if-else

  8. As now, if we want to compare field definitions with the current system we will get different results (even if there are no missing fields/bundles) because we have different structure. The question is, do we want to convert current field definitions data to match the structure we are capturing or to leave like this so we can clearly see which fields are "different"?

    creation of missing fields is broken

    I'm also not really sure what we want. This is related to what @miro_dietiker said in #2: "We don't like that Hal removes values and we lose information through this." Let's discuss in a followup?

  9. Recreating entity reference fields can probably be done by resolving the URL (as long as it is local). In a follwoup.
mbovan’s picture

Status: Needs work » Needs review
FileSize
15.29 KB
6.67 KB

@Arla:

1. Tried to move it in collect_common module but it requires some changes in service calls. We need to decide what services we want to inject (and where), where to put the new service, or on the other side implement it as a new global function in collect_common. I guess this can be done in a follow-up. Until then, I added comments which explains this long line on both sides (client and server).

2. Changed.

3. Removed.

4. Changed.

5. Changed the comment.

6. It caused some errors because of missing name parameter, so we added both parameters. We don't have same problems if both parameters are empty, so I removed $container_entity_adapter.

7. Fixed.

mbovan’s picture

Renamed variables, removed parameters which are by default empty.

Arla’s picture

Status: Needs review » Fixed

Looks great :D

Committed and pushed!

So, followups needed to make sure recreation of ER fields works, to decide on what to diff, and resolving URIs when recreating ER values. To tackle some of this, @miro_dietiker suggested changing the field definition serialization (for creating FD containers) to provide the URI-list definitions in parallel with all field definitions. Thus we could use the field definitions for diffing and creating fields, and the URIs with other standard Collect usage.

  • Arla committed 28990d3 on 8.x-1.x authored by mbovan
    Issue #2462293 by mbovan: Support HAL
    
Arla’s picture

Created followups:

make sure recreation of ER fields works, to decide on what to diff

#2480109: Field definition container: include ER fields as well as URIs

and resolving URIs when recreating ER values

#2480103: Resolve URIs to recreate entity reference values

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.