Task

Let's convert the comment entity to the new entity field API as implemented in #1696640: Implement API to unify entity properties and fields.

Follow-ups

#1869562: Avoid instantiating EntityNG field value objects by default
#1869574: Support single valued Entity fields
#1869600: Refactor RDF mappings to be inline with the new Entity Field API
#1867888: Improve TypedData date handling
#1867880: Bring data type plugins closer to their PHP classes
#1868004: Improve the TypedData API usage of EntityNG

User interface changes

None.

API changes

All comment properties/fields have to be accessed using the new entity field API now, e.g. $comment->nid->value.
Also see the general change notice: http://drupal.org/node/1806650

Blocking

Original report by fago (deprecated)

Note:
Patches will depend on the latest patch from #1696640: Implement API to unify entity properties and fields unless communicated otherwise.

Code

Work is done in the entity API improvements sandbox. See the branches property-comment and property-comment-*. For contributing to the patch, please use the sandbox according to its guidelines. If you need access to it, just contact fago.

Branches:
- property-comment (now field-comment)
- property-comment-fago, property-comment-* (now field-comment-fago, field-comment-*)

CommentFileSizeAuthor
#153 comment-entityng-153.patch140.15 KBeffulgentsia
#153 interdiff.txt1.11 KBeffulgentsia
#149 comment-entityng-149.patch140.17 KBeffulgentsia
#149 interdiff.txt20.17 KBeffulgentsia
#147 d8_comment.patch140.43 KBfago
#144 d8_comment.patch140.66 KBfago
#142 d8_comment.patch140.67 KBfago
#142 d8_comments.patch.inderdiff.txt16.77 KBfago
#137 d8_comment.patch143.06 KBfago
#135 d8_comment.diff-to-improvements.txt143.06 KBfago
#135 d8_comment.patch257.56 KBfago
#130 d8_comment.patch-diff-to-improvements.txt143.06 KBfago
#130 d8_comment.patch257.59 KBfago
#130 d8_comments.patch.inderdiff.txt2.02 KBfago
#126 d8_comment.patch255.42 KBfago
#126 d8_comments.patch.inderdiff.txt140.79 KBfago
#128 d8_comment.patch255.76 KBfago
#121 d8_comment.patch250.33 KBfago
#121 d8_comments.patch.inderdiff.txt12.75 KBfago
#120 d8_comment.patch251.46 KBfago
#114 d8_comment.patch251.88 KBfago
#114 d8_comments.patch.inderdiff.txt20.83 KBfago
#112 d8_comments.patch.inderdiff.txt1.91 KBfago
#112 d8_comment.patch251.27 KBfago
#110 d8_comments.patch.inderdiff.txt4.11 KBfago
#110 d8_comment.patch250.65 KBfago
#108 d8_comment.patch247.91 KBfago
#108 d8_comments.patch.inderdiff.txt2.53 KBfago
#106 d8_comment.patch247.55 KBfago
#101 comment-entityng-1778178-101.patch248.75 KBBerdir
#101 comment-entityng-1778178-101-interdiff.txt630 bytesBerdir
#100 comment-entityng-1778178-100.patch248.65 KBBerdir
#99 d8_comment-1778178-99.patch246.82 KBdas-peter
#97 d8_comment.patch244.22 KBfago
#93 d8_comment.patch241.46 KBfago
#85 typed_data_factory.patch.interdiff.txt11.95 KBfago
#84 d8_comment.patch195.63 KBfago
#57 d8_comment.patch186.85 KBfago
#57 d8_comments.patch.inderdiff.txt1.89 KBfago
#56 comment_disable_fields.patch.txt2.4 KBfago
#56 comment_disable_fields_8x.patch.txt1.7 KBfago
#52 d8_comments.patch.inderdiff.txt18.72 KBfago
#52 d8_comment.patch186.69 KBfago
#51 d8_comment.patch180.85 KBfago
#51 d8_comments.patch.inderdiff.txt1.79 KBfago
#49 d8_comment.patch179.56 KBfago
#49 d8_comments.patch.inderdiff.txt9.79 KBfago
#47 d8_comment.patch172.56 KBfago
#43 d8_comments.patch.inderdiff.txt26.91 KBfago
#43 d8_comment.patch170.33 KBfago
#36 d8_comment.patch148.78 KBfago
#36 d8_comments.patch.inderdiff.txt549 bytesfago
#34 d8_comments.patch.inderdiff.txt3.04 KBfago
#34 d8_comment.patch148.73 KBfago
#32 d8_comment.patch148.61 KBfago
#30 d8_comments.patch.inderdiff.txt3 KBfago
#30 d8_comment.patch148.38 KBfago
#29 d8_comment.patch145.55 KBmradcliffe
#24 d8_comment.patch145.26 KBfago
#24 d8_comments.patch.inderdiff.txt6.73 KBfago
#22 d8_comment.patch143.5 KBfago
#22 d8_comments.patch.inderdiff.txt5.61 KBfago
#19 d8_comments.patch.inderdiff.txt1.78 KBfago
#19 d8_comment.patch139.31 KBfago
#18 d8_comment.patch138.14 KBfago
#14 comment-ng-1778178-14.patch88.35 KBBerdir
#8 comments.patch91.02 KBfago
#2 comments.patch87.44 KBfago
Support from Acquia helps fund testing for Drupal Acquia logo

Comments

fago’s picture

Status: Active » Needs work

Pushed some temporary and not yet working code property-comment-fago

fago’s picture

FileSize
87.44 KB

Here is diff of the comments branch against the entity-property branch.

andypost’s picture

Issue tags: +D8MI

For D8MI was chosen the contact module to test entity api for multilingual compatibility so:

Suppose better to try this on simpler entity #1588422: Convert contact categories to configuration system
Probably comment module could be re-factored in #1790764: [meta] Convert comment module to field_api

fago’s picture

Probably comment module could be re-factored in #1790764: [meta] Convert comment module to field_api

I guess so, but we need to convert comment module over to the new property api anyhow - regardless of the refactoring. So I don't think we should wait for this to happen.

Suppose better to try this on simpler entity #1588422: Convert contact categories to configuration system

Well, we don't have the simple entity yet. Also, I think comments are good example to see real-world implications of the property API on the code-level and also on the performance level. That said, we can also use it to run benchmarks on it.

fago’s picture

Project: D8 Entity API improvements » Drupal core
Version: » 8.x-dev
Component: Entity Property API » entity system
Assigned: Unassigned » fago

In the latest skype meeting we've discussed to this as follow-up of the initial patch though, but keep working on it now so we can still use it as real-world example when reviewing and for performance comparisons. But by keeping it separated patches stay smaller and reviewing things is simpler.

Thus, moving to the core queue.

fago’s picture

Title: Convert comments to the entity property API » Convert comments to the new entity property API
Status: Needs work » Postponed

Setting to postponed for now, on the initial patch (#1696640: Implement API to unify entity properties and fields) to land.

fago’s picture

Issue summary: View changes

updated issue summary

fago’s picture

Pushed some updates to the "property-comment" branch of the sandbox.

fago’s picture

FileSize
91.02 KB

Here is the latest diff against the entity-property branch. Manually creating and viewing comments works, comment tests need work.

fago’s picture

Status: Postponed » Active

Patch needs a re-roll to match the latest changes from #1696640: Implement API to unify entity properties and fields.

Also, apart from fixing tests todo is figuring out what should happen with the following fields:

  • new
  • signature
  • signature_format
  • picture
  • registered_name

Probably, most of them could be converted to a computed field that's loaded on demand.

larowlan’s picture

@fago, please ping me on irc as I'm porting comment to fieldapi and decoupling from node in #731724: Convert comment settings into a field to make them work with CMI and non-node entities
I want to ensure my work doesn't conflict with that going on here, as both are big changes.

Berdir’s picture

new sounds like a computed property to me, picture would go away in favor of a field if we can get #1292470: Convert user pictures to Image Field in and it's from user, as is the signature stuff. And registered_name is just a hack to get user.name from the joined user table. It's never accessed directly with one exception*, only used to set $comment->name->value. And this feels now even more bad than it already did:
$comment->name->value = $comment->uid->value ? $comment->registered_name : $comment->name->value;

* That exception is in the comment form controller and could possibly be removed as we should always have a name now.

These properties should IMHO not exist at all on comment and instead be loaded through $comment->uid->entity->x as we need to get rid of the join on the user table anyway, as this messes up persistent entity caching (Right now in 7.x, after changing the username or signature, the old value will still show up for all comments). Not sure if it should happen in this issue or in a follow-up.

If it's a follow-up, then we need to find a workaround...

fago’s picture

These properties should IMHO not exist at all on comment and instead be loaded through $comment->uid->entity->x as we need to get rid of the join on the user table anyway

Agreed. I think it might be easier to leave that to a follow-up. It should be fine to not convert them to the entity field API but keep reading/writing $comment->$name for now.

ad #10: thanks for noting, seen that. Not sure how to proceed best, let's dicuss.

fago’s picture

Title: Convert comments to the new entity property API » Convert comments to the new Entity Field API
Issue tags: +Entity Field API

adding tag

Berdir’s picture

Status: Active » Needs review
FileSize
88.35 KB

Attaching a re-roll based on the patch in #8. Checked the comment branch in the issue, but it was actually easier to re-roll the patch.

Haven't actually tested anything yet manually, no idea how this is going to do.

Status: Needs review » Needs work

The last submitted patch, comment-ng-1778178-14.patch, failed testing.

fago’s picture

Status: Needs work » Active

The patch needs some updates to match the committed Entity Field API code. I'm working on updating this and adding a computed field for 'new' now.

fago’s picture

Status: Active » Needs work
fago’s picture

Status: Needs work » Needs review
FileSize
138.14 KB

ok, I did so and finished converting everything. Let's see whether I've got all comment usages... ;-)

  • In more detail, attached patch
  • fixes some small, random bugs in the entity field API
  • improvescompatibility mode such that field API works with not yet converted fields on converted entities
  • removes the broken and now unnecessary "prepare entity view" check, which created troubles
  • adds 'new' as computed field. For that it improves the FieldItemBase class to allow for per-field-value definition overrides what eases providing computed fields a lot.
  • added a comment_prepare_author() helper which gets you a user account object for theming the comment author. This was necessary as we passing on $comment as (thankfully) $account to theme functions does not work any more. This does away with the need for all the $comment->signature properties. The only left-over user property that requires us to join on user during load is "registered_name". This can be easily removed in a follow-up.
fago’s picture

Looks like I got everything :-)

I realized the Comment classes misses some of the defined base fields (=properties), so I worked over that to ensure all defined fields are defined in the class as well. It's a bit awkward to have to copy around description texts. We could look into deriving base field definitions from the class via the annotation reader... - but that's another issue we can have a look at later on.

Berdir’s picture

Some comments, more questions than an actual review...

+++ b/core/lib/Drupal/Core/Entity/DatabaseStorageControllerNG.phpundefined
@@ -235,7 +235,15 @@ protected function invokeHook($hook, EntityInterface $entity) {
-      $record->$name = $entity->$name->value;
+      switch ($entity->$name->get('value')->getType()) {
+        // Store dates using timestamps.
+        case 'date':
+          $record->$name = $entity->$name->value->getTimestamp();
+          break;
+        default:
+          $record->$name = $entity->$name->value;
+          break;

This is exactly what stopped me so far from using the date_field type in the aggregator feed item entity issue.

I'm wondering who's responsible for this conversion. This approach means that the storage controller basically needs to know all data types and handle them accordingly.

I think having a getRaw() or something along that on those classes might make more sense and documented that as a value that can be stored in the corresponding schema type or something. Then it works out of the box for the default backend while a custom controller is still able to handle a specific type in a different way. That is not possible the other way round, I e.g. can't introduce a type in contrib that needs to be stored as a different value than displayed.

+++ b/core/lib/Drupal/Core/Entity/EntityRenderController.phpundefined
@@ -37,29 +37,15 @@ public function buildContent(array $entities = array(), $view_mode = 'full', $la
     // Prepare and build field content, grouped by view mode.
     foreach ($prepare as $view_mode => $prepare_entities) {
-      $call = array();
-      // To ensure hooks are only run once per entity, check for an
-      // entity_view_prepared flag and only process items without it.
-      foreach ($prepare_entities as $entity) {
-        if (empty($entity->entity_view_prepared)) {
-          // Add this entity to the items to be prepared.
-          $call[$entity->id()] = $entity;
-
-          // Mark this item as prepared.
-          $entity->entity_view_prepared = TRUE;
-        }
-      }
+      field_attach_prepare_view($this->entityType, $prepare_entities, $view_mode, $langcode);
+      module_invoke_all('entity_prepare_view', $prepare_entities, $this->entityType);
 
-      if (!empty($call)) {
-        field_attach_prepare_view($this->entityType, $call, $view_mode, $langcode);
-        module_invoke_all('entity_prepare_view', $call, $this->entityType);
-      }
-      foreach ($entities as $entity) {
+      foreach ($prepare_entities as $entity) {
         $entity->content += field_attach_view($this->entityType, $entity, $view_mode, $langcode);

Are you sure that this can be removed? I left it in there while finishing the render controller because I wasn't.

Yes, we no longer call it twice during a single view/render as we did before view now calls viewMultiple and not the other way round but I still wonder if there could be cases where things could go wrong. Like calling view() twice on the same entity.

+++ b/core/modules/comment/comment.admin.incundefined
@@ -300,11 +299,11 @@ function comment_confirm_delete($form, &$form_state, Comment $comment) {
-  comment_delete($comment->cid);
+  comment_delete($comment->cid->value);

Let's change that to $comment->delete() as we touch the line anyway?

+++ b/core/modules/comment/comment.api.phpundefined
@@ -75,7 +75,7 @@ function hook_comment_load(Drupal\comment\Comment $comments) {
-  $comment->time_ago = time() - $comment->changed;
+  $comment->time_ago->value = time() - $comment->changed->value->getTimestamp();

Is time ago really a defined field?

+++ b/core/modules/comment/comment.moduleundefined
@@ -1642,23 +1654,31 @@ function template_preprocess_comment(&$variables) {
+  $account = comment_prepare_author($comment);
+  $variables['author'] = theme('username', array('account' => $account));

Yay for stopping the misuse of comments as user objects :)

+++ b/core/modules/comment/comment.moduleundefined
@@ -1642,23 +1654,31 @@ function template_preprocess_comment(&$variables) {
+  $variables['created'] = format_date($comment->created->value->getTimestamp());
+  $variables['changed'] = format_date($comment->changed->value->getTimestamp());

I'm wondering how much logic the date type should contain? People are working on a proper date API/field in core with tons of functionality for parsing, calculating, formatting and so on.

I've been wondering if that should be merged with TypedData\Data but not sure if that actually makes sense. Would be kinda weird to have two different classes that represent dates on the other side...

Then this could be $comment->created->value->format() or something like that.

Of course not relevant for this conversion, just thinking out loud.

+++ b/core/modules/comment/comment.pages.incundefined
@@ -56,18 +56,15 @@ function comment_reply(Node $node, $pid = NULL) {
-        if ($comment->status == COMMENT_PUBLISHED) {
+        if ($comment->status->value == COMMENT_PUBLISHED) {

More thinking out loud, wondering if things like node and comment status should be more than a simple integer field.

+++ b/core/modules/comment/lib/Drupal/comment/Comment.phpundefined
@@ -86,21 +92,66 @@ class Comment extends Entity implements ContentEntityInterface {
+   */
+  protected function init() {
+    // We unset all defined properties, so magic getters apply.
+    unset($this->cid);
+    unset($this->uuid);
+    unset($this->pid);
+    unset($this->nid);
+    unset($this->langcode);
+    unset($this->subject);
+    unset($this->uid);
+    unset($this->name);
+    unset($this->mail);
+    unset($this->homepage);

I've been wondering if this is a temporary thing or if it will stay like this?

It's kinda weird :)

I guess the plan is to make those properties protected then the getters will actually always apply, except when using it inside the class?

We could make this method abstract and move the __wakeup() and construct stuff to EntityNG if it's going to stay for a while.

+++ b/core/modules/comment/lib/Drupal/comment/CommentStorageController.phpundefined
@@ -33,23 +34,47 @@ protected function buildQuery($ids, $revision_id = FALSE) {
+  protected function attachLoad(&$records, $load_revision = FALSE) {
+    // Prepare standard comment fields.
+    foreach ($records as $key => $record) {
+      $record->name = $record->uid ? $record->registered_name : $record->name;
+      $record->node_type = 'comment_node_' . $record->node_type;
+      $records[$key] = $record;

I'm wondering if we should call the node_type property differently, because it is not that, it's the comment bundle, which is derived from the node type but not the same.

fago’s picture

I'm wondering who's responsible for this conversion. This approach means that the storage controller basically needs to know all data types and handle them accordingly.

Yes, it needs to know all primitive data types which are pre-defined and cannot be extended. New data types can be defined to map to an existing primitive.

I think having a getRaw() or something along that on those classes might make more sense and documented that as a value that can be stored in the corresponding schema type or something.

But it should be up to the storage how to store e.g. a date. Use timestamps or an ISO representation? That's not our job to decide.

Then it works out of the box for the default backend while a custom controller is still able to handle a specific type in a different way. That is not possible the other way round, I e.g. can't introduce a type in contrib that needs to be stored as a different value than displayed.

With the mentioned mapping to primitives it works perfectly well. See hook_data_type_info().

I'm wondering how much logic the date type should contain? People are working on a proper date API/field in core with tons of functionality for parsing, calculating, formatting and so on.

Yep, I've been thinking about that as well. I don't think formatting dates is necessarily the job of the API, but at least having an access to e.g. a format() method on our date object would help improving DX a lot. Maybe we should open an issue for that.

Then this could be $comment->created->value->format() or something like that.

Exactly :-)

More thinking out loud, wondering if things like node and comment status should be more than a simple integer field.

I defined it as boolean even. I think we want to add options lists / allowed values for it though. #1758622: Provide the options list of an entity field

I guess the plan is to make those properties protected then the getters will actually always apply, except when using it inside the class?

Yeah, I don't like that code either, yet I've failed to come up with better alternatives yet. Having the docs at the class and auto-completion for the properties is very nice, so defining them is good. Would an IDE pick up protected properties - I fear not. Also, it might be confusing for DEVs as they don't look like being accessible.
Maybe we could do it automatically with reflection or so - not sure how performant that is though.

I'm wondering if we should call the node_type property differently, because it is not that, it's the comment bundle, which is derived from the node type but not the same.

Makes sense. Also we should store its value. But that should be another issue.

fago’s picture

I've updated the comment patch to account for the Views-Comment uses after the merge as well. Also, I moved the __wakeup() fix to EntityNG as suggested. I made init() not abstract though, but implemented it for the defined $langcode property - so you can use EntityNG without defining your properties if you want.

Updated patch + interdiff attached.

fubhy’s picture

+++ b/core/lib/Drupal/Core/Entity/Field/FieldItemBase.phpundefined
@@ -22,6 +22,12 @@
+ *     computed field to easily override 'class' key of single field value

'the' class key.

+++ b/core/modules/system/tests/modules/entity_test/lib/Drupal/entity_test/EntityTest.phpundefined
@@ -43,14 +43,12 @@ class EntityTest extends EntityNG {
     // We unset all defined properties, so magic getters apply.
-    unset($this->id);
     unset($this->langcode);
+    unset($this->id);

That swap looks essential! :P

Apart from the suggestion to add 'the' to the docblock this looks RTBC. I also tested it manually and everything seems to work nicely.

Let's get this in quickly in order to unblock #731724: Convert comment settings into a field to make them work with CMI and non-node entities.

fago’s picture

'the' class key.

fixed, see interdiff.

That swap looks essential! :P

Sure :D langcode is defined in the parent class, so I thought it should be listed first and the rest in the order as defined in the class ;-)

Let's get this in quickly in order to unblock #731724: Decouple comment.module from node.

Yep, and to avoid further conflicts!

fubhy’s picture

Status: Needs review » Reviewed & tested by the community

As promised... :)

fago’s picture

fago’s picture

Issue tags: -D8MI, -Entity Field API

#24: d8_comment.patch queued for re-testing.

Status: Reviewed & tested by the community » Needs work
Issue tags: +D8MI, +Entity Field API

The last submitted patch, d8_comment.patch, failed testing.

mradcliffe’s picture

Status: Needs work » Needs review
FileSize
145.55 KB

I checked out code from 10/24, applied patch, and then did a rebase to origin/8.x on my local branch. Git did the following merges and failed twice:

Applying: Issue #1778178 by fago: convert comments to new field api.
Using index info to reconstruct a base tree...
Falling back to patching base and 3-way merge...
Auto-merging core/lib/Drupal/Core/Entity/Field/Type/Field.php
CONFLICT (content): Merge conflict in core/lib/Drupal/Core/Entity/Field/Type/Field.php
Auto-merging core/modules/comment/comment.module
Auto-merging core/modules/comment/lib/Drupal/comment/Tests/CommentThreadingTest.php
CONFLICT (content): Merge conflict in core/modules/comment/lib/Drupal/comment/Tests/CommentThreadingTest.php
Auto-merging core/modules/field/field.attach.inc
Auto-merging core/modules/forum/forum.module
Auto-merging core/modules/tracker/tracker.module
Auto-merging core/modules/user/user.module
Failed to merge in the changes.

I kept the HEAD change in Field.php but kept the patch changes in CommentThreadingTest.php (needs review).

This patch is the result.

fago’s picture

Yep, there are merge conflicts. I've fixed the conflicts and updated the patch to also account for the recent addtions/improvements for comment threading. This patch merges CommentThreadingTest properly, i.e. takes over improvements done in head.

Updated patch and interdiff for the comment threading changes attached.

mradcliffe’s picture

Drupal 8 tests are failing so won't get any feedback for a while.

fago’s picture

FileSize
148.61 KB

re-rolled and fixed merge conflicts as #1763974: Convert entity type info into plugins landed.

Berdir’s picture

Status: Needs review » Needs work
+++ b/core/lib/Drupal/Core/Entity/DatabaseStorageControllerNG.php
@@ -80,7 +80,7 @@ public function create(array $values) {
 
     // Make sure to set the bundle first.
-    if ($this->bundleKey) {
+    if ($this->bundleKey && !empty($values[$this->bundleKey])) {
       $entity->{$this->bundleKey} = $values[$this->bundleKey];
       unset($values[$this->bundleKey]);

I need to check if the refactor comment issue is finally adding a bundle column to the comments table.

+++ b/core/lib/Drupal/Core/Entity/EntityNG.php
@@ -60,6 +60,28 @@ class EntityNG extends Entity {
+  /**
+   * Initialize the object. Invoked upon construction and wake up.
+   */
+  protected function init() {
+    // We unset all defined properties, so magic getters apply.
+    unset($this->langcode);
+  }

I'm wondering if we should add a todo here, or open a follow-up issue to discuss this unset()-business. Should not block this conversion.

We still have a revisit-before-release issue open to check if using e.g. Node instead of EntityInterface as type hints is the right thing. If we need this hack to actually make that useful, maybe we'll end up not doing it. But we'll see.

+++ b/core/lib/Drupal/Core/Entity/Field/FieldItemBase.php
@@ -96,8 +108,8 @@ public function getValue() {
-    // given and we only have one property.
-    if (!is_array($values) && count($this->properties) == 1) {
+    // given.
+    if (!is_array($values)) {
       $keys = array_keys($this->properties);
       $values = array($keys[0] => $values);
     }

Hm. Do we really want to support this? Before, the $keys[0] was obvious because we knew there was just one anyway. Now, we just assign the value to the first defined property.

Noticed below that we're using this for the comment_body/text field, right?

Not sure if there is a more predictable/reliable way to do this.

I'm fine with discussing this in a follow-up, I think.

+++ b/core/modules/comment/lib/Drupal/comment/CommentStorageController.php
@@ -33,23 +34,47 @@ protected function buildQuery($ids, $revision_id = FALSE) {
+      $record->node_type = 'comment_node_' . $record->node_type;

Still think that we shouldn't name this node_type, but this will change anyway in the refactor comments issue.

+++ b/core/modules/comment/lib/Drupal/comment/FieldNewValue.php
@@ -0,0 +1,87 @@
+namespace Drupal\comment;
+
+use Drupal\Core\TypedData\ContextAwareInterface;
+use Drupal\Core\TypedData\Type\Integer;
+use Drupal\Core\TypedData\ReadOnlyException;
+use InvalidArgumentException;
+
+/**
+ * A computed property for the integer value of the 'new' field.
+ *
+ * @todo: Declare the list of allowed values once supported.
+ */
+class FieldNewValue extends Integer implements ContextAwareInterface {

I think this new flag is used for nodes too, but we can just move it once we need it there too.

+++ b/core/modules/comment/lib/Drupal/comment/FieldNewValue.php
@@ -0,0 +1,87 @@
+  }
+
+  /**
+   * Implements ContextAwareInterface::setParent().
+   */
+  public function setParent($parent) {
+    $this->parent = $parent;
+  }
+
+  /**
+   * Implements TypedDataInterface::getValue().
+   */

Quite a few missing fully qualified references in this class.

+++ b/core/modules/comment/lib/Drupal/comment/Plugin/Core/Entity/Comment.php
@@ -115,21 +121,107 @@ class Comment extends Entity implements ContentEntityInterface {
+   *
+   * Define some default values used.
+   *

That sentence is a bit.. strange? Maybe just "Define default values." ?

+++ b/core/modules/comment/lib/Drupal/comment/Plugin/Core/Entity/Comment.php
@@ -115,21 +121,107 @@ class Comment extends Entity implements ContentEntityInterface {
+
+  /**
+   * Initialize the object. Invoked upon construction and wake up.
+   */
+  protected function init() {
+    // We unset all defined properties, so magic getters apply.
+    unset($this->cid);
+    unset($this->uuid);
+    unset($this->pid);
+    unset($this->nid);
+    unset($this->langcode);
+    unset($this->subject);

Should we call parent::init() here and leave out langcode?

+++ b/core/modules/field/field.attach.inc
@@ -1384,6 +1384,13 @@ function field_attach_prepare_view($entity_type, $entities, $view_mode, $langcod
+      // Enable compatibility mode if necessary.
+      // @todo: Remove this once all entity types have been converted to the
+      // new entity field API.
+      if ($entity instanceof \Drupal\Core\Entity\EntityNG) {
+        $entity->setCompatibilityMode(TRUE);
+      }

I think namespaced class should have a use statement, especially if used multiple times.

+++ b/core/modules/system/tests/modules/entity_test/lib/Drupal/entity_test/Plugin/Core/Entity/EntityTest.php
@@ -62,14 +62,12 @@ class EntityTest extends EntityNG {
+  protected function init() {
     // We unset all defined properties, so magic getters apply.
-    unset($this->id);
     unset($this->langcode);
+    unset($this->id);

Same here, should call parent::init()?

I'm fine with RTBC'ing this once we can fix those small things.

fago’s picture

Status: Needs work » Needs review
FileSize
148.73 KB
3.04 KB

Thanks for the review!

I'm wondering if we should add a todo here, or open a follow-up issue to discuss this unset()-business. Should not block this conversion.

Opened #1829374: Simplify unsetting entity class properties.

Hm. Do we really want to support this? Before, the $keys[0] was obvious because we knew there was just one anyway. Now, we just assign the value to the first defined property.

Noticed below that we're using this for the comment_body/text field, right?

Right, it's something that eases conversion. We could also default to 'value' and throw an exception else. But yep, let's discuss that in a follow-up. Opened #1829388: Default to value in entity field setters.

Still think that we shouldn't name this node_type, but this will change anyway in the refactor comments issue.

Right, but as said that shouldn't be part of this issue. We are just converting the existing node_type property.

I think this new flag is used for nodes too, but we can just move it once we need it there too.

That was my thinking as well.

Quite a few missing fully qualified references in this class.

That's the case right now for all typed-data type implementations and computed properties. Not sure whether we have already standards for that, but repeating all the namespaces for anyway imported classes here looks a bit cumbersome to me. Anyway, if we want to do so I think we should do so in a separate issue that cleans this up in all typed-data implementations at once.

Addressed the other minor things. Interdiff and new patch attached.

Berdir’s picture

I noticed while working on #1292470: Convert user pictures to Image Field that these two issues introduce different helpers and ways to deal with the user picture, this issue gets rid of most user properties completely but only on comments (obviously) while the user picture issue only needs to get rid of the picture but that needs to happen for both nodes and comments. Would be great if you could have a look at the code and provide feedback on what's the best way to bring these two approaches back in sync again...

In regards to the fully qualified properties, yes, it's very verbose but I think this is part of the minimal requirements to pass the documentation gate, see http://drupal.org/core-gates#documentation-block-requirements. We don't need to fix everything, but we should fix code that we introduce/touch here or we might have a hard time to get it in.

fago’s picture

Regarding the docs, I looked it up:
See http://drupal.org/node/1354#namespaces

Elsewhere in documentation, omit the namespace if the class/interface is within the use/namespace declaration context of the file.

So it's fine the omit once the interface is already used. The TypedDataInterface has not been used in that file though, so I added that as it appears in the docs twice - that way the docs stay slim and easier to read.

I'll have a look at the user picture issue.

fago’s picture

Issue summary: View changes

fixed summary

Berdir’s picture

Status: Needs review » Reviewed & tested by the community

Looks like namespace related documentation is still a bit a moving target, so let's not worry about that too much.

I've reviewed this multiple times and while there are still a few bits and pieces in there that I'm not 100% sure about, this is an important step towards converting our entity classes to the new API and several patches are blocked on this. Various things in here are temporary BC-layers and other workarounds for stuff that is not yet converted, which is I think ok.

This already was RTBC before, time to commit this :)

catch’s picture

Issue tags: +needs profiling

It'd be good to profile this with a lot of comments on a page before it goes in - this is the most likely conversion to cause a severe performance regression since you can have up to 300 comments per page, so would be good to know how we're doing.

tim.plunkett’s picture

What's the plan for the conflicts with #731724: Convert comment settings into a field to make them work with CMI and non-node entities? That patch is twice as big, and is also very close.

fago’s picture

The plan is to re-roll the other one based upon this once it got in - it should benefit from some of the conversion also.

@catch: We did some ab runs over at #1762258: Benchmark: Bypass magic to speed up the new entity field API , but I can do more. Let's talk how to do it best.

larowlan’s picture

@tim.plunkett, I'd already discussed this with fago and conceded this will simplify my patch (as too will the entity access work in #1696660: Add an entity access API for single entity access)
As much as it pains me, it makes sense to reroll #731724: Convert comment settings into a field to make them work with CMI and non-node entities when this is in

fago’s picture

Status: Reviewed & tested by the community » Needs work

ok, I ran some performance tests on a page with 300 comments. Turns out it's an unacceptable big regression due to compatibility mode, which is copying around values each time it gets enabled&disabled. I figured out we can do another compatibility mode with a decorator which is more efficient and enables to do #1762258: Benchmark: Bypass magic to speed up the new entity field API even while we have compatibility mode.

I discussed the way forward with catch: I'll work on this new compatibility mode and adding in performance improvements - so the interim performance regression should be in an acceptable size. Then we can try to benchmark this without compatibility mode by disabling fields in order to get an idea what the performance affect of this will be in the end (while leaving out the potential improvements it will bring or allows us to do).

fago’s picture

Status: Needs work » Needs review
FileSize
170.33 KB
26.91 KB

I've implemented the new BC mode and added some performance improvements - it's still not there yet thouh. The new BC mode is not as performant as I hoped as it still needs to clear field objects for every value that is accessed via it - still it's a big improvement to the old one and better separates out the code into the decorator.

Unfortunately, we won't be able to do the magic trick mentioned in #1762258: Benchmark: Bypass magic to speed up the new entity field API as we won't be able to react to updates to the field objects then any more.

I've updated the current patch. With the patch applied rendering 300 comments is still about 100% slower than without - keeping compatibility mode out (by disabling fields).

Once I'm back to Europe (wednesday) I'll work on adding in the ArrayObject performance trick the new template engine uses now for speeding up ArrayAccess in the Field class + do more profiling.

Status: Needs review » Needs work

The last submitted patch, d8_comment.patch, failed testing.

yched’s picture

The field_info_instances($entity_type) calls in EntityNG are probably a large performance hit. They would really need to be transformed to per-bundle calls - field_info_instances($entity_type, $bundle)

fago’s picture

edit: duplicated post

fago’s picture

Status: Needs work » Needs review
FileSize
172.56 KB

I checked #45 - doesn't seem to be a problem in that situation. I guess that's different once more fields are created, so yep we need to fix that as well.

Anyway, I worked more on that and found an issue with typed data object creation which right now creates array copies of all the definitions by apply type-definitions as default. The fix is rather easy, however I've not yet benchmarked that ... because I noticed that the updated code base has quite some test fails due to recent commits (date, entity translation ui) and the new compatibility layer.

So I've been working on fixing them, unfortunately that has been a pain and takes quite a while. Attached patch should have the compatibility layer doing well (again), but there are still test fails because of test and comment translation ui.

Let's see whether more tests fail.

Note: I've started tracking the code in the entity api sandbox under field-comment-fago now.

Status: Needs review » Needs work
Issue tags: +D8MI, +needs profiling, +Entity Field API

The last submitted patch, d8_comment.patch, failed testing.

fago’s picture

Status: Needs work » Needs review
FileSize
9.79 KB
179.56 KB

Updated patch, comment translation UI tests still fail.

Status: Needs review » Needs work

The last submitted patch, d8_comment.patch, failed testing.

fago’s picture

Status: Needs work » Needs review
FileSize
1.79 KB
180.85 KB

Thanks to plach's help I was able to figure out what's wrong. Fixed ET-UI tests to work with entities having some untranslatable fields + fixed the compatibility mode to work with field translations (from non 'und' languages).

Updated patch attached.

fago’s picture

Updated patches:

  • Optimizied EntityNG::__get() a bit.
  • Improved handling of per-entity field definitions by adding a static field definition cache and local cache for the bundle field value.
  • Small documentation improvments.
  • Optimized mapFromStorageRecords() to save memory.
  • Optimized entity construction a little.
  • Optimized the computed new property.
  • Optimized typed data plugin creation.

Still, needs more work.

Status: Needs review » Needs work

The last submitted patch, d8_comment.patch, failed testing.

yched’s picture

Status: Needs work » Needs review

So, much of this, like the rest of EntityNG API for now, is above my head, but a couple remarks :

The part around static $fieldDefinitions in EntityNG looks a bit ugly, storing a registry of entity meta-data per bundle sounds like it should be delegated to a separate broker object (similar to our CacheArray pattern) - which then would probably have to be merged somehow with Field API's own FieldInfo broker.
The current code is probably fine as a temporary state to see if this is a good move optimization-wise, but could we at least add a @todo ?

   public function getPropertyDefinition($name) {
-    // Try to read from static cache.
-    if (isset($this->fieldDefinitions)) {
-      return isset($this->fieldDefinitions[$name]) ? $this->fieldDefinitions[$name] : FALSE;
-    }
-
-    // First try getting property definitions which apply to all entities of
-    // this type. Then if this fails add in definitions of optional properties
-    // as well. That way we can use property definitions of base properties
-    // when determining the optional properties of an entity.
-    $definitions = entity_get_controller($this->entityType)->getFieldDefinitions(array());
-
-    if (isset($definitions[$name])) {
-      return $definitions[$name];
-    }
-    // Add in optional properties if any.
-    if ($definitions = $this->getPropertyDefinitions()) {
-      return isset($definitions[$name]) ? $definitions[$name] : FALSE;
-    }
+    $definitions = &static::$fieldDefinitions[$this->entityType][$this->bundle];
+    return isset($definitions[$name]) ? $definitions[$name] : FALSE;
   }

Why by reference ? (&static::$fieldDefinitions)

-    $type_definition = $this->discovery->getDefinition($plugin_id);
 
-    // Allow per-data definition overrides of the used classes and generally
-    // default to the data type definition.
-    $key = empty($configuration['list']) ? 'class' : 'list class';
+    if (!isset($this->definitions)) {
+      $this->definitions = $this->discovery->getDefinitions();
+    }

Statically caching discovery results in the factory sounds really off. This is the job of CacheDecorator, which holds its own static cache, which #1764232: CacheDecorator provides no way to clear cached definitions makes clearable, so if the data is duplicated somewhere else, cache clears are going to be a mess.
It's the job of CacheDecorator to make it so that $this->discovery->getDefinition($plugin_id) is fine performance-wise.
If it isn't, for instance because within CacheDecorator::getDefinition() the access to the statically cached data is still two method calls away (thus three func calls from the outside), then this should be the fixed in CacheDecorator.

fago’s picture

The part around static $fieldDefinitions in EntityNG looks a bit ugly, storing a registry of entity meta-data per bundle sounds like it should be delegated to a separate broker object (similar to our CacheArray pattern) - which then would probably have to be merged somehow with Field API's own FieldInfo broker.

I'm not sure about that. This static is for speeding up regular access to it, as it is accessed *very* often. Hiding it behind a broker object might be problematic performance-wise, but I'm happy to explore.

+    $definitions = &static::$fieldDefinitions[$this->entityType][$this->bundle];
+    return isset($definitions[$name]) ? $definitions[$name] : FALSE;
   }

Why by reference ? (&static::$fieldDefinitions)

We do not know whether is set, so a reference works (but will create a NULL value) there. So the reference helps to make code easier to read and that's it - I verified there is no measurable performance difference.

However, right now xhprof says that exactly this two lines cause 8MB of memory!? I'll investage that further.

If it isn't, for instance because within CacheDecorator::getDefinition() the access to the statically cached data is still two method calls away (thus three func calls from the outside), then this should be the fixed in CacheDecorator.

Yes. But as soon as we pass it on to CacheDecorator it will be another function call. One might be acceptable though. I agree that keeping the cacheing logic in the decorator would be better.

fago’s picture

Here is what I'm doing for testing:
- Add 300 comments to a node (authored by uid 1). Have some hierarchy in there (-> devel generate from d8 + execute this custom code to make it work: https://gist.github.com/a98ce4f20df7a75a24c9)
- Render the node with all comments (set paging to 300)

Then, I've been disabling fields so no compatibilty mode is benchmarked. We want to know what's the performance regression without compatibility mode (which will be even a bit slower). For that, I've been using the attached patches, one for head (8.x) and the other one on top of this one.

fago’s picture

Updated patch - should fix test fails.

yched’s picture

Yeah, adding more static caches will speed up things, but we do know by now that it also introduces cache clearing issues and hell-to-debug staleness edge cases.

Especially if, like in the case of the static cache in the factory, the data is already cached somewhere else. "Staleness issues + duplication of data in memory" doesn't sound like a good tradeoff for saving a couple function calls.
If those "couple function calls" have a measured performance impact, then let's open an issue on CacheDecorator to reduce this down to one func call (which is totally doable). But preemptively cramming a static cache in front of those calls without perf measurements sounds like a hammer where we're not sure there's a nail :-).
"The statically cached data is one function call away if you're outside the system that collects the data" is the best we can reasonably aim for if we want to keep sane data flow.

$definitions = &static::$fieldDefinitions[$this->entityType][$this->bundle];

Wow, this syntax to replace an if isset() isn't what I'd call easier to read !

fago’s picture

Wow, this syntax to replace an if isset() isn't what I'd call easier to read !

It's just to make the line below shorter. Anyway.... I think it won't stay is this still has bad memory consumption: I've been researching it together with das_peter and fubhy - turns out the static variable is copied internally for each instantiated entity (but not per method call). However, we've nothing in place that could trigger that (and removing the reference does not chang anythign either). We've no idea what's causing PHP to copy the variable, but as it does - we've to find another way:

The plan is to go with the suggested FieldInfo broker then. E.g. make an entity FieldInfo class similar to the field api one which can manage the definitions. Then make an instance of it accessible via the EntityManager and keep a reference to it in the static variable to allow for fast access. This will add 1 method call per definition access but should solve the memory issue.

Especially if, like in the case of the static cache in the factory, the data is already cached somewhere else. "Staleness issues + duplication of data in memory" doesn't sound like a good tradeoff for saving a couple function calls.

PHP does copy on write, so there is added memory. I agree that staleness issue is relevant though - for the moment I just did not care as it is an issue with the cachedecorator already. But yes, I agree that going with an optimized cachedecorator which then can handle cache flushing properly also makes sense. I'll update the patch to use the cachedecorator instead.

fago’s picture

ok. turns out TypedDataFactory::createInstance() has the same memory consumption problem as EntityNG::getPropertyDefinition(). It's fine (31kB) as long as you work with on entity only, e.g. access properties 10.000 times. As soon as you do the same on ~300 entities, memory consumption is about 9M!

:(

sun’s picture

turns out the static variable is copied internally for each instantiated entity (but not per method call). However, we've nothing in place that could trigger that (and removing the reference does not chang anythign either). We've no idea what's causing PHP to copy the variable

Try self:: instead of static::. Commonly seen statements along the lines of "Always use static:: instead of self::" are not exactly right; the two have different semantics, and from what you describe here, the LSB of static:: could quite potentially be the cause.

fubhy’s picture

@sun We already tried all the different combinations.

yched’s picture

turns out TypedDataFactory::createInstance() has the same memory consumption problem as EntityNG::getPropertyDefinition()

I can't really see what in TypedDataFactory itself could cause more memory consumption than the list of typed data definitions held in the discovery.
But - couldn't the problem come from :

// TypedDataFactory::createInstance() :
$definition = $configuration + $type_definition;
return new $plugin_class($definition, $plugin_id, $this->discovery);

// TypedData::__construct() :
$this->definition = $definition;

This means every instantiated property holds a modified copy of the data definition.
Having TypedData extend PluginBase would mean that they would gain "self-inspection" capabilities - that is, get access to their own plugin definition from the discovery, without copying (that's why you would want to pass $plugin_id and $this->discovery to the constructor - right now TypedDataFactory::createInstance() passes them but TypedData::__construct() ignores them)
Not sure how this "inspection / read $definition from the discovery" ability could play with "$configuration + $type_definition", though.

fago’s picture

yes, the code of #63 is already fixed in my latest code - see http://drupalcode.org/sandbox/fago/1497344.git/blob/refs/heads/field-com...

So this is actually quite odd. The problem seems to be that we are modifying field definitions on our way and pass them on to create e.g. a field item for a field. That obviously results in an array copy, what is not nice memory wise, but additionally seems to trigger that the field definition returned by $entity->getPropertyDefinition() results in actually real array copy. (I removed all the static variable stuff for testing that out).

Ok, so far this makes sense. But then, *once* this happens to become bad entities (meaning we have multiple instances), it triggers the same behavour for CacheDecorator::getDefinition() which results in consuming 4,5MB memory vs. close to 0 if you instantiate only 1 entity and use all of its properties. This makes absolutely no sense, as you can see in the code the $type_definition returned by it is only read and not passed on anywhere else.

So I think that the weird behaviour on entity field definitions must cause a PHP bug that triggers similar weird behavior for type definitions. We can fix entity field definitions such that we do not create defintion adaptions of them and hope this fixes it for type definitions also.

To do so, we need to keep track of created definitions. E.g. if we have create the field item definition of a field, we need to able to re-use it afterwards. One easy way to do that would be migrating field definitions to objects, so the field definition can have a method to generate the field item definition and does so only once. Another way would be to handle "master entity and field" instances that can be used to derive item definitions or even be cloned directly.

fago’s picture

Variant a)
- Have field definitions as classed objects. They are not plugin definitions so no problem here.
- Use $field_defintion->getItemDefintition() to derive the item definition and make sure there is only one instance.

Variant b)
Have a reference on a "prototype" field in each object instance. $this->field->prototype
Then use, $this->field->prototype->getItemDefinitinon() and make sure there is only one array copy.

fago’s picture

Variant c)
Statically cache the derived-defintiions in the field classes. As a field class might be used with multiple different definitions, it would have to be key the "cached" item definition using a hash of the field defintion. ouch. Or have similar functionality in a central defintition manager.

yched’s picture

Can't fully parse #64, a bit meta for me :-)

There's an unfortunate clash with the term "definition" being used for both:
- the plugin definition, aka the annotations data on the TypedData classes
- the property definition, (aka the equivalent of a Field API $field), which Plugin API calls $configuration, and which typically includes a 'type' that refers to one plugin definition above.
TypedData::getDefinition() currently returns the second thing, while PluginBase::getDefinition() returns the first thing. It so happens that TypedData doesn't extend PluginBase right now, but most other plugin types do, so that's a bit confusing.

Sorry, this doesn't really help - just to say that it makes an explanation like #64 hard to follow ;-)

yched’s picture

At any rate, I don't see where "we are modifying field definitions on our way and pass them on to create e.g. a field item for a field" (from #64). Where does this happen ?

fubhy’s picture

@yched: That's the problem. We (mainly fago) are debugging this since yesterday and don't really know where that happens. Seems like we are triggering some sort of PHP bug somewhere down the road.

fago’s picture

Thinking about it more, variant a) isn't too helpful as it would not be able to account for any kind of definition overrides very easily. E.g. we could have one method that changes the 'list' property, but then we also need to change the 'class' property or 'list class'. Supporting any kind of change and still ensuring the change-defintiion is only created once is hard - e.g. would require keying by a hash of changes.

Variant d) (similar to c)
Alternatively, what about pre-computing all definitions of the entity-tree once and saving/caching them? We can identify them by property path, e.g. entity.field.0.value. We can easily keep track of the property path for created objects, thus for getting property Defintions of contained properties one can easily load in the central, pre-computed copy.

I think variant d) is the sanest and can benefit from pre-cached field definitions on the whole level. We'd have to pre-cache by bundle though.

yched’s picture

FWIW, I'm leaning towards turning Field API's $field and $instance structs into classed objects, as part of "field types as plugins" (well, "move Field API to CMI" already turns them into ConfigEntities anyway).
Those objects would include a reference to a "field type" handler object, that is a plugin.
$field->getFieldTypeHandler() (or something) gives access to the "field type" plugin.
Current plan is one instanciated "field type" plugin object per field.

fago’s picture

There's an unfortunate clash with the term "definition" being used for both:
- the plugin definition, aka the annotations data on the TypedData classes
- the property definition, (aka the equivalent of a Field API $field), which Plugin API calls $configuration, and which typically includes a 'type' that refers to one plugin definition above.
TypedData::getDefinition() currently returns the second thing, while PluginBase::getDefinition() returns the first thing. It so happens that TypedData doesn't extend PluginBase right now, but most other plugin types do, so that's a bit confusing.

TypedDataManager::getDefinition is the type defintion, as the TypedData types are plugins. Thus, it's a regular plugin definition. But yes - the property/field definition is not a plugin definition and used as plugin configuration in TypedData type plugins.

At any rate, I don't see where "we are modifying field definitions on our way and pass them on to create e.g. a field item for a field" (from #64). Where does this happen ?

We have a field definition for the overall field, thus with list => TRUE. Then, when it comes to creating item obejcts it creates objects with the defintiton list => FALSE. Furthermore, this patch introduces the ability to override keys of definitions down the tree, e.g. comment.new.0.value['class'] - what is necessary for doing a computed property - as in the comment-new example. Without that you'd have to re-create the field item class just for being able to modify the defintiion of the contained value property.

yched’s picture

TypedData::getDefinition is the type defintion, as the TypedData types are plugins. Thus, it's a regular plugin definition

Really ? Not with the code I'm looking at.
The factory provides $configuration (the field definition) to the TypedData constructor, which stores it as $this->definition, and this is what
TypedData::getDefinition() returns.

We have a field definition for the overall field, thus with list => TRUE. Then, when it comes to creating item obejcts it creates objects with the defintiton list => FALSE. Furthermore, this patch introduces the ability to override keys of definitions down the tree

But all of this happens in TypedDataFactory::createInstance() in the couple lines below // Allow per-data definition overrides of the used classes., right ? Those just figure out a $class, and do not alter either $type_definition or $configuration, so I don't get where the array copy happens ?

fago’s picture

>Really ? Not with the code I'm looking at.
Sry, I confused TypedData::getDefinition with TypedDataManager::getDefintion - that I was actually talkin about (edited). So yep, once you have an instance $typed_data->getDefinition() does not return the plugin definition but the data definition. The same definition you pass to typed_data()::create().

So yep, this has indeed potential for confusion, in particular once people are used to the plugin system. We might want to work on this or at least differentiate it clearly.

But all of this happens in TypedDataFactory::createInstance() in the couple lines below // Allow per-data definition overrides of the used classes., right ? Those just figure out a $class, and do not alter either $type_definition or $configuration, so I don't get where the array copy happens ?

Nope, this happens before the typed data object is created. See Field::createItem() or FieldItemBase::__construct().

fago’s picture

I ran a simple test:

$definition = array('type' => 'string', 'label' => t('label'));
for ($i = 0; $i<10000; $i++) {
  $data = typed_data()->create($definition, 1);
}

10.000 objects: 393ms, 8,6MB
20.000 objects: 803ms, 17,2MB

Memory seems to stem from CacheDecorator::getDefinition only. So it's not related to copying field defintions, but generally applies to getting plugin defintions. Looks like PHP's internal copy on write fails here.

fago’s picture

What seems to happen is that the line
return isset($definitions[$plugin_id]) ? $definitions[$plugin_id] : NULL;
results in array copies being created.

I tested returning the definition by reference, and the memory problem goes away.

fago’s picture

Also, I benchmarked object construction vs cloning. Trying a simple typed data string object, constructing 10.000 objects takes 487ms, cloning only 9,5ms. Cloning 1000 *fully* populated comment objects (whole tree) takes 506ms, whereas creating 1000 comments and *all* nested properties took 3,7seconds.

We surely could work on optimizing the later, but I don't think we can reach the 506ms of the clone operation. Considering that cloning is unbeatable faster for non-complex data objects also I think we should leverage that.

yched’s picture

What seems to happen is that the line
return isset($definitions[$plugin_id]) ? $definitions[$plugin_id] : NULL;
results in array copies being created.

I tested returning the definition by reference, and the memory problem goes away.

Well, the definition array returned by CacheDecorator::getDefinition() is a copy of a part of the statically cached definitions array, that seems normal. But that copy only persists as far as the code calling getDefinition() keeps it in scope, after which it should be cleared, so if there's a memory issue, isn't that on the calling side ?
I don't think we'd want to return the definition by reference ? I mean, if definitions were classed objects, that would be a normal behavior, but it would be quite unexpected when they are currently arrays...

fago’s picture

Sure, I was just using the reference for testing purposes.

> so if there's a memory issue, isn't that on the calling side ?
That's the way it should be yep. But eg. the following code snippet leads to 8,3MB of memory being taken by CacheDecorator::getDefinition().

$test = new Drupal\Core\Plugin\Discovery\CacheDecorator(new Drupal\Core\Plugin\Discovery\HookDiscovery('data_type_info'), 'typed_data:types');
for ($i = 0; $i<10000; $i++) {
  $definition = $test->getDefinition('string');
}

At the end of the snippet the memory is cleared though, so overal usage seems to be ok then. Not sure whether it's just a xhprof problem, but I think das_peter has already verified the problem with xdebug also?

sun’s picture

Can you try to replace that ternary operator in the return of getDefinitions() with a if/else control structure?

According to various resources, "ternary always copies"; e.g., see:

http://fabien.potencier.org/article/48/the-php-ternary-operator-fast-or-not
https://bugs.php.net/bug.php?id=50894
http://www.mail-archive.com/internals@lists.php.net/msg51926.html

EDIT: Those findings and the PHP bug report are very concerning. As we're introducing quite a couple of ternaries for new state and config values (primarily leveraging the new PHP 5.3 ?: syntax), we probably want to create a new issue to rethink those. :-/

fago’s picture

I checked #79 again, and the 8,3MB of memory are xhprof display problem is the memory should be cleared immediately once the definition is returned. The problem still exists though, as once you hold a copy of the definitions (e.g. putting it into $definitions[$i]) memory stays allocated. This is what happens to our patch - but php should do copy on write!

So #80 gets it! By removing the ternary operator memory usage of CacheDecorator::getDefinition() is 41kB!! sun++

sun’s picture

Filed the necessary follow-up: #1838368: Do not blindly use the ternary operator; it always returns copies of non-object values

The directly affected things here (plugins/typeddata/etc) should probably be fixed as part of this patch.

fago’s picture

ok, hashed out with fubhy and klausi how we could go for cloning best:
- drop that "nice addition" to have simple class overrides for computed fields right now. That makes less complexity for now, compared to creating one extra class with close to no content. Also, that keep code-flow easier to follow, so let's figure that better out in another issue.
- introduce a new namespace and property path into the typed-data context:

   *   - namespace: The namespace of the current typed data tree, e.g. for the
   *     the tree of typed data objects of an entity it could be
   *     entity.entity_type.
   *   - property_path: The trail of property names relative to the root of the
   *     typed data tree, separated by dots. E.g. 'field_text.0.format'.

- extend the typed data manager with a method getPropertyInstance($object, $property, $value), which can now derive the namespace and property path from the object and care about cloneing, i.e. create a prototye object per $namespace.$property_path combination and just clone for any subsequent call. That way all $comment->field_foo objects will be cloned from one prototype as well as all $comment->field_foo->value objects. The nice thing is, that this factory works for any typed data api consumer.

fago’s picture

FileSize
195.63 KB

Before we do so, let's see whether the latest code still passes tests.

fago’s picture

I'm not done yet, but here is a first interdiff. The interface changes are applied and typed_data()->getPropertyInstance() is there, but of course untested and unused. WIP, but feel free to review interface changes.

Berdir’s picture

+++ b/core/includes/bootstrap.incundefined
@@ -2522,7 +2522,15 @@ function state() {
  */
 function typed_data() {
-  return drupal_container()->get('typed_data');
+  // Use the advanced drupal_static() pattern, since this is called very often.
+  static $drupal_static_fast;
+  if (!isset($drupal_static_fast)) {
+    $drupal_static_fast['manager'] = &drupal_static(__FUNCTION__);
+  }
+  if (!isset($drupal_static_fast['manager'])) {
+    $drupal_static_fast['manager'] = drupal_container()->get('typed_data');
+  }
+  return $drupal_static_fast['manager'];

We should be careful with adding static caches around the container. What exactly is "very often" and from where?

If we're e.g. calling it from the storage controller, we should instead inject it in the constructor and store it as a reference there. That's even faster than having to call this function.

fubhy’s picture

We should be careful with adding static caches around the container. What exactly is "very often" and from where?

If we're e.g. calling it from the storage controller, we should instead inject it in the constructor and store it as a reference there. That's even faster than having to call this function.

These wrappers should be entirely removed at some point anyways (all of them, hopefully). Agreed with the second argument though. Injecting that sounds like a fine idea.

Berdir’s picture

Did some profiling, here are a few things that I noticed. A few things are worrying, but I'm also seeing some potentatial for actual improvements...

  • Noticed that each comment with a parent loads that comment again in template_comment_preprocess(), because comments have static cache disabled. My comment generation script did a $comment->pid = $i % 10; So I had ~290 comment_loads(). Removing static_cache = FALSE in the entity type defnition cut the amount of queries from 12xx down to 7xx. Not a problem introduced by this patch but obviously even slower here.
  • With that fixed, I have 30k call to EntityNG::__get(), making up 17% of the call time.
    • 30% (but just 2k calls, so somehow expensive ones) of the time is in rdf_comment_load(), not exactly sure why. But we need to find a solution for that rdf_mapping and rdf_data stuff anyway IMHO. I don't think can stay on the actual entity. 4k more calls in rdf_preprocess_comment() but less expensive ones. Disabling rdf.module and we're down to 22k calls to __get() :(
    • Drupal\comment\FieldNewValue::getValue() is second-slowest caller, with just 600 calls. Both this and rdf_comment_load() call getTimestamp(), could that be related to it being slow?
    • Drupal\Core\Entity\EntityBCDecorator::__get() also produces a ton of calls to __get() and language(). Am I mistaken or does this access the $values/$fields properties through __get()? That would be.. not optimal :) Also, I guess we can extract $this->decorated->language()->language and make it a property of the decorator? This method makes up almost 6% of the whole call time and 80% of that is spent in language().
  • Many calls like this one: format_date($comment->created->value->getTimestamp()). We already discussed this before, but what happened in the meantime is that format_date() was converted to using DrupalDateTime, so it does internally a new DrupalDateTime($timestamp)->format() (conceptually). Which means we convert from DateTime to timestamp and then back into a DrupalDateTime object. Not necessarly here, but we really should change our TypeData Date class to *somehow* use DrupalDateTime() (extend or wrap, not sure) and avoid this.

Note: I did this on a fast desktop machine, which only took 2,2s with the original patch and 1,7s when enabling the comment static cache ( I'm wondering how we'll do persistent caching with EntityNG, just serialize the whole thing? that is quite an overhead...). The relative overhead of function calls and magic methods could be bigger on slower systems I think.

Obversations not directly related to this but wanted to write it down:

  • Drupal\Core\Template\Attribute::__toString() really worries me. seeing 6k calls to this and it's right on the second place, even before database queries. And offsetGet() follows on 4th place :(
  • Also seeing 328 calls to config(), 300 of them in template_comment_preprocess(). Every single of them resulting in a complete reload of the whole user.settings config from cache. We *SO* need to fix this.
  • 1200 calls to dumb rdf wrapper theme functions. I assume they will go away too...
  • 10k calls to language()...
chx’s picture

#43 added a drupal_static_fast to typed_data() claiming performance. Are you sure it's a real performance improvement when you have a compiled container?? There's not a lot of code in Container::get() you are saving there but drupal_static is uuuugly and i would love to deprecate it for D8 and remove for D9.

fago’s picture

Ad #89 and

We should be careful with adding static caches around the container. What exactly is "very often" and from where?

Very often, is about ~20.000 times on my test-page. So yes, this was a real improvement But I agree the fix was wrong, we should go for DI instead, yep.

Noticed that each comment with a parent loads that comment again in template_comment_preprocess(), because comments have static cache disabled. My comment generation script did a $comment->pid = $i % 10; So I had ~290 comment_loads(). Removing static_cache = FALSE in the entity type defnition cut the amount of queries from 12xx down to 7xx. Not a problem introduced by this patch but obviously even slower here.

Exactly, this makes use testing 500 entity loads instead of 300 on my test page - ouch. I do not think we shuold touch that in this patch as comments field cache has been disabled to avoid lots of comments filling up memory, i.e. I think this needs its own discussion.

Disabling rdf.module and we're down to 22k calls to __get() :(

Interesting, this is actually quite a difference. However, rdf_comment_load() being in particular slow is what happens, because it is invoked very early and thus causes instantantiation of not yet created typeddata objects for individiual fields (they are lazy-instantiated). Thus, it is slow because typed data object creation is too slow - I think. Let's see how cloning will change it.

Also, I guess we can extract $this->decorated->language()->language and make it a property of the decorator?

I guess so, although the decorator would not react to default language changes then. That should be acceptable for the BC decorator though. Anyway, sure - the backwards compatibility decorator adds unnecessary overhead and makes things even slower, but in the end that's not important as it will go away. We should concentrate on the rest, i.e. why I've been disabling fields for my tests - see #56.

Not necessarly here, but we really should change our TypeData Date class to *somehow* use DrupalDateTime() (extend or wrap, not sure) and avoid this.

Oh the value is actually a DrupalDateTime instance already, thus we should be able to convert this to $node->created->value->format() then - good catch.

Berdir’s picture

Handling comment static cache in a separate issue sounds totally fine to me, opened #1844146: comment_load() call in template_preprocess_comment() re-loads comment parent entities because of disabled static cache.

Oh the value is actually a DrupalDateTime instance already, thus we should be able to convert this to $node->created->value->format() then - good catch.

Ah, good. One note there. I think value being a date time object is a bit confusing. For everything else, value is a primitve type: integer, string, .... It was very confusing for me at first when reviewing patches and seeing value->getTimestamp(). It almost looks as if value is always an object, which it isn't.

My suggestion would be doing something similar to how we handle entity references:
- Keep value as a raw representation of the data, in this case a timestamp
- Have a computed property called date or so, that is a DrupalDateTime object.

The problem with that is that we couldn't really support real date properties with this, but we could rename this type to timestamp and allow contrib to define a real date type. Both could expose a date object and the different raw value. This would also make the hardcoded date stuff in the controller unecessary, which I don't like as you know :)

fago’s picture

My suggestion would be doing something similar to how we handle entity references:
- Keep value as a raw representation of the data, in this case a timestamp
- Have a computed property called date or so, that is a DrupalDateTime object.

So if I understand the idea correctly, it would be to
- have "value" being the actually stored valued, i.e. the timestamp.
- have a computed 'date' property which then is of type 'date' and thus a DrupalDateTime object

That's actually an interesting idea. However, I don't think it makes much sense to expose the timestamp as integer + as date, it's mapping storage-specific internals to the TypedData structure what should be done by the object implementation. So that doesn't feel right to me, however I gave this some more thought and I must say I dislike having a date object wrapped by another object also. So why not just do it that way: Make timestamps our default, plain representation for dates, but use DateTime objects as wrapping TypedData objects - i.e. extend it to fulfill the TypedDataInterface? That way getting/setting would just work for timestamps, while we could do $typed_date->format('foo')?
However, this would mean changing the existing typed data 'date' and 'duration' types, so I'd prefer to do it in its own issue, discuss all options there and for now just move on with what we've?

fago’s picture

FileSize
241.46 KB

Here is an updated patch implemented cloning. This required an overhaul of the ContextAwareInterface to intrdouce the namespace and propertypath as suggested above. I think those are useful anyway though.
Cloning brought some speed and memory improvements, but the improvements are not as big as we'd need them to be :( Regarding date formatting please see #1844956: Optimize date formatting performance, this adds about 250ms to the current patch.

Here my current the still (bad) results:

head 8.x with fields
Overall Summary
Total Incl. Wall Time (microsec): 4,201,617 microsecs
Total Incl. CPU (microsecs): 4,128,258 microsecs
Total Incl. MemUse (bytes): 22,654,368 bytes
Total Incl. PeakMemUse (bytes): 30,100,528 bytes
Number of Function Calls: 584,964

head 8.x without fields
Overall Summary
Total Incl. Wall Time (microsec): 3,571,089 microsecs
Total Incl. CPU (microsecs): 3,464,217 microsecs
Total Incl. MemUse (bytes): 17,011,136 bytes
Total Incl. PeakMemUse (bytes): 19,965,704 bytes
Number of Function Calls: 542,455

comment patch with fields and BC overhead
Overall Summary
Total Incl. Wall Time (microsec): 9,452,702 microsecs
Total Incl. CPU (microsecs): 9,240,577 microsecs
Total Incl. MemUse (bytes): 51,182,272 bytes
Total Incl. PeakMemUse (bytes): 58,751,920 bytes
Number of Function Calls: 1,117,148

Comparison
Number of Function Calls 584,964 1,117,148 532,184 91.0%
Incl. Wall Time (microsec) 4,201,617 9,452,702 5,251,085 125.0%
Incl. CPU (microsecs) 4,128,258 9,240,577 5,112,319 123.8%
Incl. MemUse (bytes) 22,654,368 51,182,272 28,527,904 125.9%
Incl. PeakMemUse (bytes) 30,100,528 58,751,920 28,651,392 95.2%

Patch without fields (and so no BC overhead)
Total Incl. Wall Time (microsec): 5,845,283 microsecs
Total Incl. CPU (microsecs): 5,796,362 microsecs
Total Incl. MemUse (bytes): 41,753,968 bytes
Total Incl. PeakMemUse (bytes): 44,736,848 bytes
Number of Function Calls: 792,011

Comparison without fields and BC
http://vdev.local/xhprof/xhprof_html/?run1=50ab9b24a7fd6&run2=50aba4695e...
Number of Function Calls 542,455 792,011 249,556 46.0%
Incl. Wall Time (microsec) 3,571,089 5,845,283 2,274,194 63.7%
Incl. CPU (microsecs) 3,464,217 5,796,362 2,332,145 67.3%
Incl. MemUse (bytes) 17,011,136 41,753,968 24,742,832 145.5%
Incl. PeakMemUse (bytes) 19,965,704 44,736,848 24,771,144 124.1%

Further possible optimizations:
- Avoid duplicated properties in EntityNG $values and $fields to save memory.
- See whether objects can be more light-weight to reduce time/memory when cloning
- Remove the language array layer from arrays, e.g. $values created in mapfromStorageRecords. Brings ~2MB saved memory per array. However, we cannot do this for $values as BC-mode relies on writing values in that structure. We can do it now for EntityNG::fields
- Optimize EntityNG::__get
- Clone whole entities with exactly the mostly used fields. Would need custom per-entity-type optimization then.

Overally, I don't see huge boost in those optimizations :/ It boils down that all the involved objects in 3 layers add weight compared to a simple $comment->cid. I think we could save quite some time/memory if'd make the list-layer optional as we'd save 1 intermediate object (and thus method calls) for each of the single-valued fields. Still, we could easily support $comment->cid->value to work in the multi-value and single-value case.

Status: Needs review » Needs work
Issue tags: -D8MI, -needs profiling, -Entity Field API

The last submitted patch, d8_comment.patch, failed testing.

fago’s picture

Status: Needs work » Needs review

#93: d8_comment.patch queued for re-testing.

Status: Needs review » Needs work
Issue tags: +D8MI, +needs profiling, +Entity Field API

The last submitted patch, d8_comment.patch, failed testing.

fago’s picture

Status: Needs work » Needs review
FileSize
244.22 KB

Yes, Views are entities ;-) So they have to fulfill the interface changes also, fixed that.

fago’s picture

Another thing that came to my mind is that we shuold be able to lazy-instantiate objects for the lowest level (i.e. individual field values/components) and have a field-item implementation that directly accesses the field item values in the magic get - if there is one - and falls back on going via the typed object of the value property only if there is no value in the item. That way, one could do $node->nid->value without having to instantiate the value object, it would be just done in case of other operations like validation?

It's an idea I'd have to think through in detail, but it could potentially save close to 1/3 of instantiated objects without changing the API.

das-peter’s picture

FileSize
246.82 KB

Here's a re-rolled patch against the latest 8.x.
Source is: http://drupalcode.org/sandbox/fago/1497344.git/shortlog/refs/heads/field...

Berdir’s picture

Another re-roll after removal of user picture and user data, code is in field-comment-berdir. That was ... surprisingly easy... :) Let's see if it works.

Berdir’s picture

Ok, so after figuring out that profiling considerably slows down the cost of function calls, I started developing a simple module that allows measures time and memory of a callback. Code is in http://drupal.org/sandbox/berdir/1853392, if you're interested, but it's really very basic :) Might be interesting to see some results for other machines.

Creation of 300 comments on a single node (perf/comments/generate/%node):

8.x

Calling perftest_comment_generate took 888.255 ms and used 526.2 KB memory.
Calling perftest_comment_generate took 1001.001 ms and used 442.4 KB memory.
Calling perftest_comment_generate took 983.893 ms and used 364.16 KB memory.

with patch:

Calling perftest_comment_generate took 1601.218 ms and used 5.09 MB memory.
Calling perftest_comment_generate took 1670.305 ms and used 5.09 MB memory.
Calling perftest_comment_generate took 1657.354 ms and used 5.1 MB memory.

(Note: Creation gets slower the more comments you have on a single node. So make sure to use a different node to have comparable results).

Viewing 300 comments of a node (perf/comments/view/%node, uses comment_node_page_additions())

8.x

Calling perftest_comment_view took 1602.681 ms and used 8.26 MB memory.
Calling perftest_comment_view took 1520.766 ms and used 8.26 MB memory.
Calling perftest_comment_view took 1567.494 ms and used 8.26 MB memory.

with patch

Calling perftest_comment_view took 2586.478 ms and used 22.15 MB memory.
Calling perftest_comment_view took 2543.187 ms and used 22.15 MB memory.
Calling perftest_comment_view took 2484.585 ms and used 22.15 MB memory.

Haven't checked how this relates to your latest profiling...

Also attaching an updated patch that removes the now unecessary user_attach_accounts() which causes things to blow up.

fubhy’s picture

Sorry, wrong issue :P

Berdir’s picture

Oh.

I'm seeing 4200 calls to config() on the comment view page due the recent regional config patch that went in. So those numbers are way off and probably not that useful. Will re-run once that is fixed.

Edit: Oh, "just" 2100 with 8.x. So we're doubling the config() calls with the patch currently.

Status: Needs review » Needs work

The last submitted patch, comment-entityng-1778178-101.patch, failed testing.

Berdir’s picture

With the patch from #1187726-52: Add caching for configuration / rework config object loading (Was: Memory usage and i/o from config objects), the numbers are 420ms vs 730ms. Haven't tried the no-fields patch yet.

I'm also seeing the exceptions from rdf, but I'm not really sure why they happen. Wrapping it in an if !empty() doesn't sound right to me.

fago’s picture

Status: Needs work » Needs review
FileSize
247.55 KB

Unfortunately I was not able to merge in your improvements, looks like the branch was accidentially re-based. I just merged 8.x and manually fixed merge conflicts - please check branch field-comment and do any further enhancements based upon this branch by using merging - not rebasing.

Also attaching an updated patch that removes the now unecessary user_attach_accounts() which causes things to blow up.

I think we should keep it for now, as it leverages multiple-loading. Attached patch does so.

Status: Needs review » Needs work

The last submitted patch, d8_comment.patch, failed testing.

fago’s picture

Status: Needs work » Needs review
FileSize
2.53 KB
247.91 KB

oh, I see user_attach_accounts() breaks things. Fixed that and the bc-decorator.

Status: Needs review » Needs work

The last submitted patch, d8_comment.patch, failed testing.

fago’s picture

Status: Needs work » Needs review
FileSize
250.65 KB
4.11 KB

and more test fixes... rdf module seems to mostly assume we are doing a full page view in entity-view-hooks, that's not valid any more for all the compact user account rendering. I think we really should do #1067126: Support rendering entities in page mode for being able to fix this properly.

Status: Needs review » Needs work

The last submitted patch, d8_comment.patch, failed testing.

fago’s picture

Status: Needs work » Needs review
FileSize
251.27 KB
1.91 KB
fago’s picture

I did some more research on this:

The additional time seems to stem mostly from the additional date objects created right now. As noted by berdir, config() calls a rather bad right now and the doubled date object creation doubles them also. Anyway, we need to work on typeData dates such that we avoid re-creating the objects and are able to re-use them during formatting. In order to move on with this and being able to properly focus on dates, I think we should move that to a follow-up and stay with "integer" dates for now.

Then, the additional memory seems to stem mostly from the objects itself. Definition cloning works well by now, but still the pure objects add ~25MB of memory. Looking at it closer, we should be able to save some memory by optimizing things to use less class variables. Still, I calculated that even with the simplest typed data objects (just having definition and value) we'd get about 12-13MB of memory for our 500 comment entities, so that's the theoretical optimum which we won't be able to reach in practice. Although we won't be able to reach that optimum, we should be able to get it a considerable improvement (hopefully below 20MB). For everything more, I think we have to consider changing the forced 3-level structure such that we get less involved objects. That's a considerable change I'd really prefer to in its own issue though.

fago’s picture

ok, I've done so. Changes:

8f2713d Shortened Field::setValue() a bit.
8f970b6 Micro-optimized EntityNG::__get().
ac3c7a0 Optimized Field __get().
993054b Fixed property path to not have a leading dot.
e7f2140 Revert "Leverage the ArrayObject class for the Field class." This is actually worse memory wise and not faster.
851d82d Leverage the ArrayObject class for the Field class.
ff3f0bd Improved handling of propertypath and namespace to use less memory and b
eeadefe Removed the namespace property for ContextAwareTypedData.
fcd29de Use integer_item fields instead of date_item fields for now.
yched’s picture

--- a/core/lib/Drupal/Core/Entity/EntityRenderController.php
+++ b/core/lib/Drupal/Core/Entity/EntityRenderController.php
@@ -37,29 +37,15 @@ public function buildContent(array $entities = array(), $view_mode = 'full', $la
-      $call = array();
-      // To ensure hooks are only run once per entity, check for an
-      // entity_view_prepared flag and only process items without it.
-      foreach ($prepare_entities as $entity) {
-        if (empty($entity->entity_view_prepared)) {
-          // Add this entity to the items to be prepared.
-          $call[$entity->id()] = $entity;
-
-          // Mark this item as prepared.
-          $entity->entity_view_prepared = TRUE;
-        }
-      }

I can't really say I'm fond of that $entity->entity_view_prepared thing, but not having it has a perf impact. Why are we removing it in this patch ?

catch’s picture

I've opened two comment issues, not directly from berdir's profiling but that reminded me about these two and I don't think they had their own issues before. They're not the fault of this patch at all but obviously making the impact worse since the time spent in entity rendering as a proportion of the request is much higher. Also bumped the comment_load($pid) issue to major.

#1857956: Comment update timestamp is prepared but not rendered by default
#1857946: Comment parent template variables are built twice

Just a note I'd be happy to commit this patch without all of these issues being fixed so that other conversions can happen in parallel to the optimization. We did the same for Field API in Drupal 7 and that worked well overall, and we might find other problems with taxonomy terms, node or user rendering compared to here. This patch is primarily finding issues with already-committed APIs (and just straight changes like the comment parent accessibility issue), so it's great that it's flushing all the issues out but it doesn't have to be responsible for fixing them, and it's mainly good to see these issues being found now instead of in 4 months or so.

Status: Needs review » Needs work
Issue tags: -D8MI, -needs profiling, -Entity Field API

The last submitted patch, d8_comment.patch, failed testing.

fago’s picture

Status: Needs work » Needs review
Issue tags: +D8MI, +needs profiling, +Entity Field API

#114: d8_comment.patch queued for re-testing.

fago’s picture

I can't really say I'm fond of that $entity->entity_view_prepared thing, but not having it has a perf impact. Why are we removing it in this patch ?

It's not removed because of performance, but as it created troubles with the BC mode during entity rendering. It turns out it's not needed any more as the render-controller do no have that strange view multiple vs buildcontent single flow we had in d7 any more - so some uglyness creating troubles that can go away!

fago’s picture

FileSize
251.46 KB

Re-rolled the patch and fixed merge conflicts.

fago’s picture

Considering all recent learnings I think we should/need do #1868004: Improve the TypedData API usage of EntityNG what renders the current ContextAwareInterface::getNamespace() introduction of this patch unnecessary. Thus, I'rerolled the patch accordingly (I just added getType() to the Entity class without interface for now) and replaced it.

Also, I created the following follow-ups:
#1867888: Improve TypedData date handling
#1867880: Bring data type plugins closer to their PHP classes
#1868004: Improve the TypedData API usage of EntityNG

Apart from that I think we should do avoid instantiating field value objects by default, as this should bring some memory-usage and speed improvements. Need to figure out whether it can happen in a follow-up or not.

fago’s picture

Did some new benchmarks on the 300 comments page without xhprof and xdebug enabled:

8.x with fields
Page execution time was 1517.9 ms. Memory used at: devel_boot()=1.55 MB, devel_shutdown()=18.78 MB, PHP peak=29 MB.
8.x without fields/BC
Page execution time was 1215.02 ms. Memory used at: devel_boot()=1.55 MB, devel_shutdown()=14.25 MB, PHP peak=20.25 MB.

patched with fields
Page execution time was 2618.05 ms. Memory used at: devel_boot()=1.55 MB, devel_shutdown()=42.78 MB, PHP peak=54.25 MB.
patched without fields/BC
Page execution time was 1733.24 ms. Memory used at: devel_boot()=1.55 MB, devel_shutdown()=34.99 MB, PHP peak=43.5 MB.

Performance with the temporary BC-Mode/fields compared to HEAD:
+72,47% time , +127,80% memory (shutdown)
Performance without the temporary BC-Mode/fields compared to HEAD:
+42,65% time , +145,54% memory (shutdown)

So yes, this still makes entity operations more costly :/ I think we'll get some of the performance back by converting RDF module to leverage this stuff, making field API leverage it or once we can start lazy-loading things.

As noted before, for further direct improvements of this I think we want to avoid instantiating field value objects by default, as this should bring some memory-usage and speed improvements.This can be done without changing the API.
If that's still not enough we can think about making the first level of objects (=the field objects, holding the list of field items) what results in one less additional object per comment/field tested in the above benchmark - thus should improve performance quite a bit, but comes with a change to the API: You cannot rely on the fixed structure any more, but have to consider single vs multiple fields. (We could still make $entity->field->value work always as it is now.)

fago’s picture

I've also started separating the typed data api / EntityNG improvements from this issue so they are easier to review, see #1869250: Various EntityNG and TypedData API improvements. Please review them over there.

fago’s picture

I noticed that the change record could need some more details and examples, thus I worked over it. Please review/improve http://drupal.org/node/1806650.

fago’s picture

Issue summary: View changes

Updated issue summary.

fago’s picture

Component: comment.module » entity system
FileSize
140.79 KB
255.42 KB

I've updated the separated TypedData API improvements to work again since it has been broken by #1868274: Improved EntityNG isset() test coverage - see #1869250: Various EntityNG and TypedData API improvements

Re-rolling the combined patch and interdiff to the other issue here. Let's commit the typed data API improvements on its own first, then re-roll this on top of it so things are properly separated.

Status: Needs review » Needs work

The last submitted patch, d8_comment.patch, failed testing.

fago’s picture

Component: entity system » comment.module
Status: Needs work » Needs review
FileSize
255.76 KB

Merged and fixed conflicts (due to #1827582: Load all nodes of comments at once).

Component: entity system » comment.module
Status: Needs review » Needs work

The last submitted patch, d8_comment.patch, failed testing.

fago’s picture

Status: Needs work » Needs review
FileSize
2.02 KB
257.59 KB
143.06 KB

Merged and fixed conflicts once again + converted the new comment rss plugins and tests to EntityNG (interdiff attached).

Status: Needs review » Needs work
Issue tags: -D8MI, -needs profiling, -Entity Field API

The last submitted patch, d8_comment.patch, failed testing.

fago’s picture

Status: Needs work » Needs review

#130: d8_comment.patch queued for re-testing.

Status: Needs review » Needs work

The last submitted patch, d8_comment.patch, failed testing.

fago’s picture

Status: Needs work » Needs review
Issue tags: +D8MI, +needs profiling, +Entity Field API

#130: d8_comment.patch queued for re-testing.

fago’s picture

#1869250: Various EntityNG and TypedData API improvements is ready, re-rolling on top of the latest changes over there. Attached one complete patch including both issues and a diff to the other issue.

Status: Needs review » Needs work

The last submitted patch, d8_comment.patch, failed testing.

fago’s picture

Status: Needs work » Needs review
FileSize
143.06 KB

#1869250: Various EntityNG and TypedData API improvements got committed - rerolled against HEAD.

fago’s picture

Postponed #1877638: Rename getOriginalEntity() and remove duplicated functions from ConfigEntityBase on this.

Another related follow-up from the TypedData API and EntityNG improvements issue:
http://drupal.org/node/1877632

moshe weitzman’s picture

Looks like we should redo those benchmarks now that dependency is in?

fago’s picture

Issue tags: -needs profiling

The benchmarks won't change - it's the same code that has been split in two issues. It's just that half of the benchmarked code is already committed. We agreed to do more performance improvements as follow-ups see comment #125.

Berdir’s picture

Status: Needs review » Needs work

Nitpick review. It's great that we're back to a simple conversion here and all the complex stuff has already been dealt with.

+++ b/core/modules/comment/comment.admin.incundefined
@@ -109,41 +109,40 @@ function comment_admin_overview($form, &$form_state, $arg) {
+    $options[$comment->cid->value] = array(
...
-      'href' => 'comment/' . $comment->cid . '/edit',
+      'href' => 'comment/' . $comment->cid->value . '/edit',
...
-        'href' => 'comment/' . $comment->cid . '/translations',
+        'href' => 'comment/' . $comment->id() . '/translations',

+++ b/core/modules/comment/comment.api.phpundefined
@@ -147,7 +147,7 @@ function hook_comment_unpublish(Drupal\comment\Comment $comment) {
-    ->condition('cid', $comment->cid)
+    ->condition('cid', $comment->cid->value)

+++ b/core/modules/comment/comment.moduleundefined
@@ -147,8 +147,8 @@ function comment_node_type_load($name) {
+    'path' => 'comment/' . $comment->cid->value,
+    'options' => array('fragment' => 'comment-' . $comment->cid->value),

Is there a reason for sometimes using cid->value and sometimes id()?

+++ b/core/modules/comment/comment.admin.incundefined
@@ -307,11 +306,11 @@ function comment_confirm_delete($form, &$form_state, Comment $comment) {
-  comment_delete($comment->cid);
+  comment_delete($comment->cid->value);

$comment->delete() :)

Trivial stuff, I know, but we are touching it and will have to touch again because that wrapper function will be get removed.

+++ b/core/modules/comment/comment.moduleundefined
@@ -1710,7 +1731,7 @@ function template_preprocess_comment(&$variables) {
 
   // Preprocess fields.
-  field_attach_preprocess('comment', $comment, $variables['elements'], $variables);
+  field_attach_preprocess('comment', $comment->getBCEntity(), $variables['elements'], $variables);

Unrelated but I noticed in #1874300: Remove $entity_type argument from field.module functions that receive a single $entity that field_attach_preprocess() and the whole preprocess thing is I think the last field_attach function call that we have not yet unified through the various controllers.. (and some special cases like preview). Wondering how we could do that...

+++ b/core/modules/comment/comment.moduleundefined
@@ -1902,9 +1923,9 @@ function comment_action_info() {
-  if (isset($comment->subject)) {
-    $subject = $comment->subject;
-    $comment->status = COMMENT_PUBLISHED;
+  if (isset($comment->subject->value)) {
+    $subject = $comment->subject->value;
+    $comment->status->value = COMMENT_PUBLISHED;
   }
   else {

Not visible here but the else case uses direct sql queries to load the comment subject and manipulate the status. We should change that to load and use the entity API but not here.

+++ b/core/modules/comment/lib/Drupal/comment/CommentFormController.phpundefined
@@ -266,51 +255,47 @@ public function validate(array $form, array &$form_state) {
+    $comment->changed->value = REQUEST_TIME;

This is a bit strange, node forcefully updates changed in preSave(), this does it here and the storage controller only changes it if it's empty...

+++ b/core/modules/comment/lib/Drupal/comment/Tests/CommentActionsTest.phpundefined
@@ -35,28 +35,28 @@ function testCommentPublishUnpublishActions() {
-    $comment = comment_load($comment->id);
+    $comment = comment_load($comment->id());

That looks pointless :)

+++ b/core/modules/comment/lib/Drupal/comment/Tests/CommentContentRebuildTest.phpundefined
@@ -34,7 +34,7 @@ function testCommentRebuild() {
-    $comment_loaded = comment_load($comment->id);
+    $comment_loaded = comment_load($comment->id());

Same.

~10 more below. Probably search for $this->postComment and then get rid of them is the easiest way.

+++ b/core/modules/comment/lib/Drupal/comment/Tests/CommentTestBase.phpundefined
@@ -145,7 +145,10 @@ function postComment($node, $comment, $subject = '', $contact = NULL) {
-      return entity_create('comment', array('id' => $match[1], 'subject' => $subject, 'comment' => $comment));
+      $entity = comment_load($match[1]);
+      $entity->subject->value = $subject;
+      $entity->comment_body->value = $comment;

What the heck was that old code doing... o.0

I don't think we need to assign this here, we just loaded the entity?

+++ b/core/modules/field/field.attach.incundefined
@@ -7,6 +7,8 @@
+use Drupal\Core\Entity\EntityNG;
+use Drupal\Core\Entity\EntityBCDecorator;

Why do we need to use these two here? Possibly old code?

+++ b/core/modules/system/lib/Drupal/system/Tests/Entity/EntityTranslationTest.phpundefined
@@ -131,7 +131,7 @@ function testEntityLanguageMethods() {
-    // Try to get an untranslatable value from a translation in strict mode.
+    // Try to get an unstranslatable value from a translation in strict mode.
     try {
       $field_name = 'field_test_text';
       $value = $entity->getTranslation($this->langcodes[1])->get($field_name);
@@ -141,7 +141,7 @@ function testEntityLanguageMethods() {

@@ -141,7 +141,7 @@ function testEntityLanguageMethods() {
       $this->pass('Getting an untranslatable value from a translation in strict mode throws an exception.');
     }
 
-    // Try to get an untranslatable value from a translation in non-strict
+    // Try to get an unstranslatable value from a translation in non-strict
     // mode.
     $entity->set($field_name, array(0 => array('value' => 'default value')));
     $value = $entity->getTranslation($this->langcodes[1], FALSE)->get($field_name)->value;
@@ -156,7 +156,7 @@ function testEntityLanguageMethods() {

@@ -156,7 +156,7 @@ function testEntityLanguageMethods() {
       $this->pass("Setting a translation for an invalid language throws an exception.");
     }
 
-    // Try to set an untranslatable value into a translation in strict mode.
+    // Try to set an unstranslatable value into a translation in strict mode.

Uh, no? :)

fago’s picture

Re-rolling a patch over months seems to result in some left-overs - thanks berdir for catching them!! Everything not mentioned below is fixed in the follow-up patch.

Is there a reason for sometimes using cid->value and sometimes id()?

No. Most replacements were done by (not so simple) regexes which turned ->cid usages into ->cid->value automatically, while some rare manual replacements might convert things to ->id(). We could try to convert all usages to ->id() but I think that would be better solved in its own issue?

Unrelated but I noticed in #1874300: Remove entity_type argument from field.module functions that receive a single $entity that field_attach_preprocess() and the whole preprocess thing is I think the last field_attach function call that we have not yet unified through the various controllers.. (and some special cases like preview). Wondering how we could do that...

Good question - do you want to create an issue for it? :)

This is a bit strange, node forcefully updates changed in preSave(), this does it here and the storage controller only changes it if it's empty...

True - but I tried to avoid baking in unrelated fixes that are not necessary to get this done. I think we should clean this up by migrating the property to a (ng)date-field with default value and logic in the storage controller only.

fago’s picture

Status: Needs work » Needs review
fago’s picture

FileSize
140.66 KB

re-rolled after blocks as plugins went in.

Berdir’s picture

Issue tags: -D8MI, -Entity Field API

#144: d8_comment.patch queued for re-testing.

Status: Needs review » Needs work
Issue tags: +D8MI, +Entity Field API

The last submitted patch, d8_comment.patch, failed testing.

fago’s picture

Status: Needs work » Needs review
FileSize
140.43 KB

Merged and fixed conflicts due to recent field API changes.

Berdir’s picture

Status: Needs review » Reviewed & tested by the community

Changes in the last interdiff look great to me. And it looks like it doesn't conflict with #1833334: EntityNG: integrate a dynamic property data table handling so we can safely RTBC them both at the same time.

Had another look at the performance with my perftest module and it renders a node with 300 comments (no threading) in 80-95ms, ~2,7MB memory usage. With the patch applied, it's 105-130ms and 5,5MB. That's still worse than before, but considerably better than the last patch I RTBC'd (*erm*) and we have some issues to further improve it. And we can hopefully get entity (render) caching in, which should help as well.

Also added the original Entity Field API issue to the bad guys list aka #1744302: [meta] Resolve known performance regressions in Drupal 8.

We need to get this in if we want to have a chance at finishing the conversion :)

effulgentsia’s picture

FileSize
20.17 KB
140.17 KB

We could try to convert all usages to ->id() but I think that would be better solved in its own issue?

I don't get why to punt that to a follow up. It's a simple search/replace that doesn't make the patch any bigger. Here it is. Trivial change, so leaving at RTBC.

Status: Reviewed & tested by the community » Needs work

The last submitted patch, comment-entityng-149.patch, failed testing.

effulgentsia’s picture

Status: Needs review » Reviewed & tested by the community

Mostly nits, which I think would be fine to address in a follow up, so leaving at RTBC.

+++ b/core/modules/comment/comment.admin.inc
@@ -109,41 +109,40 @@ function comment_admin_overview($form, &$form_state, $arg) {
     // Remove the first node title from the node_titles array and attach to
     // the comment.
-    $comment->node_title = array_shift($node_titles);
-    $comment_body = field_get_items($comment, 'comment_body');
-    $options[$comment->cid] = array(
+    $node_title = array_shift($node_titles);

"attach to the comment" seems incorrect now.

+++ b/core/modules/comment/comment.module
@@ -1596,6 +1592,24 @@ function comment_preprocess_block(&$variables) {
+  // The account has been pre-loaded by CommentRenderController::buildContent().
+  $account = $comment->uid->entity;

I don't get this comment. Why does it matter whether it was preloaded or not?

+++ b/core/modules/comment/comment.module
@@ -1605,52 +1619,58 @@ function template_preprocess_comment(&$variables) {
-  $variables['signature'] = $comment->signature;
+  if (config('user.settings')->get('signatures') && !empty($account->signature)) {
+    $variables['signature'] = check_markup($account->signature, $account->signature_format, '', TRUE) ;
+  }
+  else {
+    $variables['signature'] = '';
+  }

Should we be directly accessing user.module config values from comment.module like this, or should this move to user_preprocess_comment()?

+++ b/core/modules/comment/comment.tokens.inc
@@ -122,79 +122,75 @@ function comment_tokens($type, $tokens, array $data = array(), array $options =
-          if ($items = field_get_items($comment, 'comment_body', $langcode)) {
-            $instance = field_info_instance('comment', 'body', 'comment_body');
-            $field_langcode = field_language($comment, 'comment_body', $langcode);
-            $replacements[$original] = $sanitize ? _text_sanitize($instance, $field_langcode, $items[0], 'value') : $items[0]['value'];
-          }
+          $replacements[$original] = $sanitize ? $comment->comment_body->processed : $comment->comment_body->value;

Seems like we're ignoring $langcode in the new code.

+++ b/core/modules/comment/lib/Drupal/comment/CommentFormController.php
@@ -266,51 +255,47 @@ public function validate(array $form, array &$form_state) {
-      $field = field_info_field('comment_body');
-      $langcode = field_is_translatable('comment', $field) ? $this->getFormLangcode($form_state) : LANGUAGE_NOT_SPECIFIED;
-      $comment_body = $comment->comment_body[$langcode][0];
-      if (isset($comment_body['format'])) {
-        $comment_text = check_markup($comment_body['value'], $comment_body['format']);
-      }
-      else {
-        $comment_text = check_plain($comment_body['value']);
-      }
+      $comment_text = $comment->comment_body->processed;

If the form's langcode is different than the comment's langcode, does the new code use the form's langcode as the old code did?

+++ b/core/modules/comment/lib/Drupal/comment/CommentStorageController.php
@@ -33,23 +33,46 @@ protected function buildQuery($ids, $revision_id = FALSE) {
+      $records[$key] = $record;

Since $record is an object, is this line necessary?

+++ b/core/modules/comment/lib/Drupal/comment/CommentStorageController.php
@@ -33,23 +33,46 @@ protected function buildQuery($ids, $revision_id = FALSE) {
+    // We have to determine the bundle first.
+    if (empty($values['node_type']) && !empty($values['nid'])) {
+      $node = node_load($values['nid']);
+      $values['node_type'] = 'comment_node_' . $node->type;
+    }
+    $comment = new $this->entityClass(array(), $this->entityType, $values['node_type']);
+
+    // Set all other given values.
+    foreach ($values as $name => $value) {
+      $comment->$name = $value;
+    }
+
+    // Assign a new UUID if there is none yet.
+    if ($this->uuidKey && !isset($comment->{$this->uuidKey})) {
+      $uuid = new Uuid();
+      $comment->{$this->uuidKey}->value = $uuid->generate();
     }
-    parent::attachLoad($comments, $load_revision);
+    return $comment;

Instead of duplicating so much of the parent method's implementation, can we just set $values['node_type'] and call parent::create()?

+++ b/core/modules/comment/lib/Drupal/comment/Plugin/Core/Entity/Comment.php
@@ -44,45 +44,51 @@
+   * @todo Rename to 'id'.
+   *
+   * @var \Drupal\Core\Entity\Field\FieldInterface
    */
   public $cid;

Where was it decided that we should rename id fields to id?

+++ b/core/modules/comment/lib/Drupal/comment/Tests/CommentInterfaceTest.php
@@ -67,49 +65,47 @@ function testCommentInterface() {
     $reply = $this->postComment(NULL, $this->randomName(), '', TRUE);
-    $reply_loaded = comment_load($reply->id);
+    $reply_loaded = comment_load($reply->id());

Looks like this patch changes postComment() to return a loaded comment, and assertions above this one adjusted to that. Is there a reason we need to explicitly load in this test?

+++ b/core/modules/rdf/rdf.module
@@ -617,28 +617,30 @@ function rdf_preprocess_user(&$variables) {
+  // If we are on the user account page, add the relationship between the
+  // sioc:UserAccount and the foaf:Person who holds the account.
+  if (current_path() == $uri['path']) {

How is this relevant to this issue?

effulgentsia’s picture

x-posted with #150.

effulgentsia’s picture

Status: Needs work » Needs review
FileSize
1.11 KB
140.15 KB

Fixed the syntax error from #149. CNR for bot. If green, will RTBC.

larowlan’s picture

Status: Reviewed & tested by the community » Needs review

fingers crossed... this is the last blocker for #731724: Convert comment settings into a field to make them work with CMI and non-node entities although a hell of a re-roll awaits me...

effulgentsia’s picture

Status: Needs review » Reviewed & tested by the community
larowlan’s picture

yes yes yes, great work everyone
note this is kind of major too once a real entity is using DatabaseStorageControllerNG #1885542: DatabaseControllerNG does not rollback failed ::save() operations

fago’s picture

Yep, let's get this in and address the remaining bits as a follow-up (of course I'd take care of it).

Some answers to points raised:

I don't get why to punt that to a follow up. It's a simple search/replace that doesn't make the patch any bigger. Here it is. Trivial change, so leaving at RTBC.

It's not that simple as a method call doesn't work out with isset(), but great you took care of that. Also, I was thinking it deserves more thought and maybe some discussion as sometimes it can be tricky whether it's right to use a method vs. the property the method points to (e.g. $node->title vs $node->label() for the title form element). However that should be a non-issue for id() as it doesn't appear in forms, not sure about templates? But we can figure out anything left in a follow-up.

Looks like this patch changes postComment() to return a loaded comment, and assertions above this one adjusted to that. Is there a reason we need to explicitly load in this test?

We needed it to return a proper entity object having the posted field values set.

I don't get this comment. Why does it matter whether it was preloaded or not?

It matters as it improves speed. Those two code snippets belong together and if you change one you probably want to take care of the other as-well, thus I thought a pointer helps here.

Should we be directly accessing user.module config values from comment.module like this, or should this move to user_preprocess_comment()?

We already depend on user module by using its entity, so depending on its config also shouldn't be a problem. The other way round by doing it in user module, we'd add a circular dependency back to the comment module. Thus, I think having it in comment module and have a comment -> user dependency only is the better and cleaner way.

How is this relevant to this issue?

This issue broke this code some time ago as it was accessing $comment->name. I'm not sure it's still necessary for the latest patch, but anyway I think the fix is right.

Seems like we're ignoring $langcode in the new code.

True. We should fix the token implementation to generally use the EntityNG translation API, such that it will work with other base-fields also once they are translatable. Opened #1885962: Comment tokens should use entity translation API.

Berdir’s picture

Issue link points to this issue :)

Yay for taking care of the id() issue. I think this is safe, because, unlike title/label, id() is a read only property that can not be altered by anyone. The discussions we had around title/label was that when you edit e.g. a comment or user, you want to edit the field name/title, not the label that might have been altered in whatever way (e.g. realname.module for user names).

fago’s picture

Issue link points to this issue :)

ops :D fixed.

plach’s picture

Priority: Normal » Major

This is blocking other stuff, so I'd say it is major.

fago’s picture

#153: comment-entityng-153.patch queued for re-testing.

webchick’s picture

Assigned: fago » catch

Looking at the performance data in #148, it represents a pretty significant regression. Despite the fact that catch has previously approved this, it seems like it's still good to check in one more time and ensure that's still the case. Assigning to him.

We should be careful too on #916388: Convert menu links into entities because it will probably introduce a similar performance regression (if not worse, since menu links are displayed all over the place on *every* page, not just on very popular pages with lots of activity).

YesCT’s picture

Issue summary: View changes

Updated issue summary.

YesCT’s picture

Issue summary: View changes

Updated issue summary. with issues blocked on this.

YesCT’s picture

Issue summary: View changes

Updated issue summary. moved blocks out of original report section

YesCT’s picture

@berdir let me know #293318: Convert Aggregator feeds into entities is not blocked on this anymore. "(as it reverted the usage of date field and just uses integers again for timestamps)" I'll update the issue summary. (aggregator could use some reviews though) [edit: see #113 and #114 ]

catch’s picture

With the performance regression, we more or less have a choice of reverting the new Entity API, or pressing ahead with conversions and trying to optimize it.

While the raw performance is slower, I'm also hopeful it's going to help with issues like #1868406: Allow render caching to be enabled safely in core and #1237636: Entity lazy multiple front loading which could be dramatic improvements. The question really comes down to whether we can get these done before we ship (or conversely whether we can ship without getting those issues or similar done due to the current state of 8.x performance).

fago’s picture

Yeah, we certainly can optimize things once this is in also. E.g. if we do, #1869574: Support single valued Entity fields and #1869562: Avoid instantiating EntityNG field value objects by default we would be down to 1 object being instantiated for a regular entity field access ala $node->title->value instead of 3 objects with the current patch. This should help quite a bit to reduce the memory footprint. If we manage to get to lazy-loading translations, even more - but that's to be figured out.

catch’s picture

Status: Reviewed & tested by the community » Fixed

OK. Committed/pushed to 8.x. I think this is covered by the existing change notice, but please correct me if there's comment-specific changes.

fago’s picture

Thanks! Nop, there is nothing comment specific so I agree that the general change notice should be fine.

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

Gábor Hojtsy’s picture

Retroactively tagging more for D8MI.

Gábor Hojtsy’s picture

Issue summary: View changes

Updated issue summary because aggregator is no longer blocked on this.