Off-shot of #1233394-44: [Policy, no patch] Agree on a property naming pattern to discuss whether we should enforce the use of methods to access properties.

#1184944: Make entities classed objects, introduce CRUD support is about to establish an EntityInterface including some getter and setter methods to access properties. Still the properties are public accessible though, e.g. via $node->title.

Let's discuss here whether we should keep them public or not.

Quoting Crell from #1233394-44: [Policy, no patch] Agree on a property naming pattern:

No. It makes absolutely no sense to me to have BOTH an exposed id() method and an id property as a standard. If they're the same thing, then it's pure redundancy and will only serve to confuse people. If they're not the same thing, then that's confusing variable names and will only serve to confuse people. Plus, if for some reason id() does something more than return $this->id, then they will get out of sync. That is breaking encapsulation for no benefit.

Ibid for uuid, label, name, etc.

If we ever want to make even the slightest change to the internal implementation, that becomes an API break. Eg, with id() we can go back and forth between putting an "id" column, "nid" column, or "node_id" column in the database without breaking every module out there. If we expose that property as part of the API, then we cannot change that without either breaking the API or writing unnecessary glue code. (Really, we'd need to build in $this->id = $this->node_id? This is somehow better than just using the id() method in the first place? I think not.)

I see no convincing reason why developers can't be expected to use a self-documenting method. Yes, I *do* think we should enforce using the methods as there is a clear benefit (a real API interface) and a clear cost to not doing so.

Comments

fago’s picture

Personally, I don't think we should make the properties protected and forbid accessing them directly.

I don't think encapsulation via the methods brings us much, as it doesn't match to the database behind. If node.module changes title database column to title2 it would not suffice to adapt the "title getter" anyway, as there still might be quite some queries that make use of the property / db column. I think it's desirable to have as much as possible directly query-able properties, thus map the entity object to the database as seamless as possible.
Given that the property matches the storage, I see no point left for requiring the usage of a method.

So here is a proposal:

  • Poperties are public accessible, ala $node->title. If developers know with which entity type they are dealing and it's desirable, they can directly make use of that properties.
  • As already suggested in #1233394: [Policy, no patch] Agree on a property naming pattern entity properties should naturally map to db-columns as far as feasible.
  • As there is no enforcement to follow the property naming pattern defined in #1233394: [Policy, no patch] Agree on a property naming pattern: Developers generically dealing with entities have to use the methods defined by the interface to access common properties like ids or label, i.e. $entity->id(), $entity->label().
sun’s picture

Isn't the pattern "only make it protected if you absolutely want to ensure that no one touches this shit from the outside" ?

I thought we had learned that the hard way by now?

That would imply that $node->title is public. However, one could possibly make the case for $node->nid and $node->uuid to be protected, under the assumption that only the CRUD controller is able to write and assign an ID or UUID (i.e., setting an ID/UUID requires additional storage logic, and an already assigned ID/UUID shouldn't change under normal circumstances).

Aside from that, I can especially get behind the argument of entity-type-specific code -- not everything written for Drupal needs to be totally abstracted.

Crell’s picture

Entity-type-specific code doesn't necessitate public properties. There is nothing whatsoever that says that a node object cannot have a ->publish() method (to toggle the published property) or comment cannot have a ->getParentEntity() method (to get the node (or other entity) that it is attached to).

However, consider that one of the things we want to do for Drupal 8 is make it possible to have a standard export format for entities, and not have to write Drupal-specific and case-specific one-off code every frickin' time we want to export something in a standard, not-serialize()ed-PHP way. That is not possible unless we can guarantee an interface for entities that we can rely on for such serialization and deserialization.

Embracing public properties on entities encourages people to continue to do what we've been doing for years, which is "eh, screw the interface, I'll just throw some maybe-kinda-namespaced properties on the object in hook_load()". As soon as you do that, not only does node_save() break half the time (we've all run into that old bug) but any such data now cannot be reliably exported without custom code for every single case of that.

Given the choice between having all entities sufficiently uniform that we have one export mechanism or having to write entity*module potential extra "oh yeah, and this thing" hooks (a la deploy.module in Drupal 6), telling module developers that they need to use ->label() in place of ->title seems like a very minor price to pay.

Crell’s picture

Additionally, to expand on the quote of mine that fago included in the OP, public properties hurt backward compatibility.

With protected properties and defined methods, we have the potential (not guaranteed, but potential) to improve the implementation without breaking the API. With public properties, it's almost guaranteed that we'd break the API.

webchick’s picture

I'm a bit rusty with OO PHP. Remind me again how a themer is supposed to know that methods like $node->label() exist, when var_dump($node) will not reveal them?

exlin’s picture

Objects can be printed out by using devel modules dsm($object);
Also http://php.net/manual/en/function.print-r.php

What comes to api methods, they will propably be documented.

neclimdul’s picture

Well, you'd use the actual definition provided by api.drupal.org and the class/interface definition populating it. Also, you're IDE if you're using one is likely to help you complete that depending on how well formed the code is. Because of the extract magic we use for templates sadly that completion probably won't happen though.

David_Rothstein’s picture

You can also use get_class_methods($node) to get a list of the available methods.

That said, an important difference to keep in mind is that for properties, var_dump() or print_r() or get_object_vars() gives you all the data at once. With methods, you can get a list, but you actually have to call them one by one to see what data is actually stored in them.

Switching to methods everywhere would have several potential benefits, but I'm not sure how it would work in practice? There are a lot of modules that add data to other modules' entities. Even in core, Menu module adds $node->menu, Book module adds $node->book, etc. We'd have to come up with an alternative for all of those that didn't involve properties, and it would have to be something that contrib modules could be convinced to adopt also (frankly, as long as PHP allows you to add arbitrary properties to an object, I think contrib modules will keep doing it). In the end, I'm afraid that a typical site's entities would have some data that could only be accessed via properties and some data that can only be accessed via methods, which wouldn't be so good...

Crell’s picture

Re #5: I would not consider "documentation exists for a reason" to be a particularly high barrier of entry, especially given "and now it's finally consistent". It's certainly a lower barrier of entry than "here's an array that may change out from under you, have fun".

Crell’s picture

Re #8: True, there's lots of disorganized legacy crap on node objects. I'd much rather be looking for an alternative they can move to and making it good enough that they do so than saying "oh, well, people are throwing crap on objects at random so we can't improve the API".

That's the reason Deploy in Drupal 6 had to use drupal_execute(), because "oh well, people are forgetting about APIs and just using FAPI as a node save API." The answer to that needs to be "if you bypass the API, your crap ain't going to work. That's your problem." But that means we need to have a proper API they should be using instead.

See #3 for another example of why that is important.

David_Rothstein’s picture

I suppose that one simple thing to do, not so much to solve the issue but to sweep it under the rug and make progress here, could be to change $node->menu to $node->data['menu'], $node->book to $node->data['book'] and similarly make $entity->data the standard dumping ground; then it can be accessed with something like $entity->getData(), $entity->setData(), etc.

This would actually be nice, since fields attached to entities are also dynamic so there'd have to be an $entity->getFields() or similar anyway. Then you'd have a nice structure where $entity->getData() and $entity->getFields() are the only two places to look for random dynamic data, and everything that is a "fundamental" part of the entity itself has its own dedicated API function.

exlin’s picture

If $entity->getFields() returns field data, i prefer it still to be object instead of array. But that's just me ;)

catch’s picture

$node->book should probably be an entity reference field.

$node->menu could also be that if menu links become entities.

$user is the worst example of allowing arbitrary crap (we didn't even kill $user->data yet). Lots of work to do there and elsewhere regardless.

However these don't in themselves prevent having methods access stuff, but it'd mean a generic setter and getter for random stuff which I'm not very keen on but it's there as an option, as David points out.

Fields - we will likely have an EntityFieldSomething controller that is responsible for handling field stuff on entity objects. That doesn't exist yet but it's been discussed in many of the longer-term entity discussions. We shouldn't expose the field data structure since it's a very unhelpful one ( LANGUAGE_NONE etc.), and the field API needs to be able to be responsible for rendering fields same as now (unless we're able to move that all down a level and use it for properties too, but it'll still not really be entities).

I posted ages ago about having the entities class instances not holding any data themselves and calling back to a storage class, see #1237636: Entity lazy multiple front loading - mainly for lazy loading from storage, the possibility to have an LRU cache that 'empties out' entities if lots are loaded but leaving the entity class itself instantiated so it can be used again later in the request without having to deal with that manually. Not many people liked that idea (or were convinced by the use csae), but it would rely on roughly what's being discussed here even if it only happens in contrib.

For sun's $node->title example. If I want to set $node->title and save it, it can be protected, but I need to do $node->update('title', $title); or something. The advantage of having that, is that we can then keep track of which properties changed in the entity object when it's saved, and pass that information along to hook implementations.

So

hook_entity_update(EntityInterface $entity, array $changes);

Then if you don't care about the keys in $changes, you can just ignore it, and no messing about trying to load the old entity from the database or $entity->original_entity etc.). (might not have the API like that, but having the entity object know what changed would be useful).

sun’s picture

It would be very wrong to assume that each and every $entity->mymodule_addition can be converted into fields. There are various modules that add meta/processing properties to entities (of any type) and have good architectural reasons for doing so. Name-calling that "legacy crap" is inappropriate.

The suggestion/outlook for being able to track changes sounds interesting, but I'm not sure (or actually doubt) whether that is going to be reliable and feasible for any and all possible properties/fields/data on an entity in the long run.

We need to be very careful to not add too much abstraction. A hypothetical $entity->update('whateverName', $newValue) is way too much abstraction in my book.

catch’s picture

It would be very wrong to assume that each and every $entity->mymodule_addition can be converted into fields. There

No I wouldn't assume that, but those were the two examples that were given.

There are various modules that add meta/processing properties to entities (of any type) and have good architectural reasons for doing so. Name-calling that "legacy crap" is inappropriate.

Well the use case is not 'legacy crap', but hook_node_load() { // Do anything at all. }; is at this point - that's only enforced by convention but it's an issue for entitycache module etc. let alone Drupal 8 changes if a module really does go wild in there.

Ages ago we had a BOF about the field API which got into volatile fields and properties - http://groups.drupal.org/node/133579. Better examples that don't make a good field would be something like comment statistics that are updated completely out of band.

A hypothetical $entity->update('whateverName', $newValue) is way too much abstraction in my book.

How is that 'way too much abstraction' compared to node making a copy of itself and assigning it to a property?

function node_save($node) {
  // Load the stored entity, if any.
    if (!empty($node->nid) && !isset($node->original)) {
      $node->original = entity_load_unchanged('node', $node->nid);
    }
fmizzell’s picture

I believe using accessor methods open a lot of more interesting options, for example I personally would like to have the same experience when setting a property from the GUI or when dealing with the property directly on the entity object.
so that
$entity->title = "Hello";
or
$entity->title("Hello");
whether you are using magic or not, would go through the same validation and processing than when doing it from the GUI.

Same thing for getter methods.

I also think that keeping track of changes, as catch suggested, its an useful improvement.

Damien Tournoud’s picture

I am on @sun#14 side.

That said, I also firmly believe that every modification to an entity should be intermediated. It's necessary if we want one day to properly handle concurrent modifications and staging of entities.

catch’s picture

Just to clarify #15. There are definitely things which are neither entity properties (in the sense of id, title, type), nor fields, and we should not try to shoehorn them into one or the other because it'll be messy - something like $node->votes for example is neither.

However for various reasons I don't think we should leave them as they are either. Knowing something about those things (in the sense that field instances allows us to find out what's connected to what) makes things like denormalization or alternative storage considerably easier, it's not only about structure of the node object.

fago’s picture

I suppose that one simple thing to do, not so much to solve the issue but to sweep it under the rug and make progress here, could be to change $node->menu to $node->data['menu'], $node->book to $node->data['book'] and similarly make $entity->data the standard dumping ground; then it can be accessed with something like $entity->getData(), $entity->setData(), etc.

No reason to introduce such a dumping ground - we already had $entity->get('property') and $entity->set('property', $value) methods in the entity patch, thus we'd have methods for accessing all the data of the entity. The question is whether the usage of them should be enforced or not.

Just to clarify #15. There are definitely things which are neither entity properties (in the sense of id, title, type), nor fields, and we should not try to shoehorn them into one or the other because it'll be messy - something like $node->votes for example is neither.

Yep, I think we should find a way to support randomly calculated/retrieved data like that via get() or at least move it out of the way/entity.

For sun's $node->title example. If I want to set $node->title and save it, it can be protected, but I need to do $node->update('title', $title); or something. The advantage of having that, is that we can then keep track of which properties changed in the entity object when it's saved, and pass that information along to hook implementations.

I don't think this would buy us much. Knowing that a property value has been updated does not imply it has really changed, so modules would still have to check for that. Instead, we could just move $entity->original to a regular hook parameter, such that modules could do:

function mymodule_node_update($node, $node_original) {
  if ($node->title != $node_original->title) {
   // ...
  }
}

@crell:

However, consider that one of the things we want to do for Drupal 8 is make it possible to have a standard export format for entities, and not have to write Drupal-specific and case-specific one-off code every frickin' time we want to export something in a standard, not-serialize()ed-PHP way. That is not possible unless we can guarantee an interface for entities that we can rely on for such serialization and deserialization.

Agreed. Still, that does not require making properties protected?

Embracing public properties on entities encourages people to continue to do what we've been doing for years, which is "eh, screw the interface, I'll just throw some maybe-kinda-namespaced properties on the object in hook_load()". As soon as you do that, not only does node_save() break half the time (we've all run into that old bug) but any such data now cannot be reliably exported without custom code for every single case of that.

I don't see a problem with using hook_load()? All we need is to know which properties are there, thus something like #1346220: Add entity property metadata. That's what RestWS already embraces in d7 based upon the entity property metadata of the entity api module.
However yes, module provided properties will never be enforced by an interface. Still, module provided properties are not evil, but a very foundational part of what makes Drupal Drupal. We allow modules to extend each other, thus we allow modules to extend entity types.

Additionally, to expand on the quote of mine that fago included in the OP, public properties hurt backward compatibility.

With protected properties and defined methods, we have the potential (not guaranteed, but potential) to improve the implementation without breaking the API. With public properties, it's almost guaranteed that we'd break the API.

Using per property getters and setters to enforce BC would mean obfuscating the real property, e.g. $node->getBody() might in reality return $node->the_new_body. Personally, I don't want to see Drupal to evolve to a level of such abstraction, as I believe it's going to result in bad DX. We won't be able to do that and to achieve the desired simple matching between database columns and entity properties any more...

I agree though that interfaces could help us to increase BC for more high-level functionality like publishing a node. While we could achieve that by a simple staying node_publish() function too, $node->publish() based upon a interface would certainly make sense and better communicate the "function" as being part of an "official" API interface.

Crell’s picture

OK, let's take a step back a moment as I think we're getting stuck in weeds.

There are a number of different things that have traditionally been added to an entity, usually nodes:

Universal properties: id, label, uuid, entity type (and maybe others)
We have an interface already that has methods for these, and therefore consistent names for them. I do not see any reason to not use those methods. The runtime overhead difference is almost unmeasurable, and they're more self-documenting and consistent across different entities, which is good for DX. And yes, I do believe we should make the properties they represent protected to help encourage that.
Fields
Right now there's no good API for these other than ->field_foo[][][][][]. As catch notes in #13 that needs a better API, probably a getField() or getFields() method or something like that. I don't know at the moment what the best API is there, but we should move toward those being accessed via methods, for reasons catch explains.
Entity Type-specific properties
Things like $node->published, $user->pass, etc. Core currently calls these "properties" (c.f. EntityFieldQuery), and they're public object properties. IMO, these should be migrated to a more standardized API. Maybe that's $node['published'] (using ArrayAccess), maybe it's $node->property('published'), maybe it's something else. I'm not sure, but we should try to come up with something more consistent, predictable, documentable, discoverable, and iteratable. (My big interest right now is exportability without n^2 hooks being involved, for professional reasons, but the point is valid regardless.) fago has suggested elsewhere, I believe, that these should all turn into Fields. While many of them likely could and should, I don't believe they all should.
Other arbitrary enhancement stuff
Mostly these are loaded via hook_load() and hook_node_load(), plus of course $user->data *shudder*. This is the "legacy crap" I was referring to (admittedly an over-zealous word choice, since up through Drupal 7 there are some things that cannot be done otherwise even though those hooks still make broken assumptions about form structure). Many of these, as catch notes, really ought to be turned into Fields, but not all. Again, I'm not certain what the best API is for these but I do believe we need a better API than "throw maybe-namedspaced properties onto the object and hope for the best".

What I am suggesting is that for the first set we standardize on "the methods you have are the Right Way(tm) to do it", and then open up separate issues to work out good APIs for the other three. (Eg, the Field access methods should be their own interface; an entity type that is fieldable would need to implement that interface, but an entity type that is not would not need to do so. That makes detecting whether a given $entity object is fieldable a trivial instanceof operation with no info lookups necessary.)

To Angie's point about themers doing var_dump($entity) to see what they can print, frankly they shouldn't be doing that anyway because there's a huge risk of printing something that hasn't been sanitized. Rather, we should be preparing sanitized values for them pre-template-file and injecting them in "safe" form. And finding and sanitizing all such values will be much easier if we know that there's only 4 places they could live, and in a consistent form.

To wit (pseudocode):

$vars['entity_id'] = (int)$entity->id();
$vars['entity_label'] = check_plain($entity->label());
$vars['entity_uuid'] = check_plain($entity->uuid());

foreach ($entity->getProperties() as $name => $property) {
    $vars['property_' . $name] = format_this_thing($property);  
}

if ($entity instanceof Fieldable) {
  foreach ($entity->getFields() as $name => $field) {
    $vars['field_' . $name] = format_this_thing($field);
  }
}

if ($entity instanceof ExtraStuffInterface) {
  foreach ($entity->getExtraStuff() as $name => $stuff) {
    $vars['stuff_' . $name] = format_this_thing($stuff);
  }
}

Poof, everything's available to themers in a sanitized, predictable way without any additional work on the part of someone writing properties, fields, or ExtraStuff(tm).

@fago:

We won't be able to do that and to achieve the desired simple matching between database columns and entity properties any more

I don't consider such matching desirable at all; at least not at the developer-facing-interface level. Remember, there's no reason to expect that all entities will live in SQL tables using core's per-field storage mechanism. Alternate storage systems that don't have "columns" should be easier in Drupal 8, not harder, and may well change per-entity. Some entities may exist on the other end of a web services call, so the concept of columns doesn't even make sense. Developers who assume that $node->nid maps to {node}.nid will have a much harder time upgrading when we decide that we really want node_id as the property and column instead, as has been discussed elsewhere #1233394: [Policy, no patch] Agree on a property naming pattern.

catch’s picture

Right now there's no good API for these other than ->field_foo[][][][][]. As catch notes in #13 that needs a better API, probably a getField() or getFields() method or something like that.

Actually there is a decent API for these compared to the raw structure, it's http://api.drupal.org/api/drupal/modules--field--field.module/function/f... but it's not as well known as it should be.

That API function completely hides the language key on entities, and even if we changed the structure (at one point we were discussing [$delta][$langauge] instead of [$language][$delta]) the api for field_get_items() could stay the same.

I would not support a hypothetical bc layer for something like a renamed property (say $node->title to $node->name but still accessed via ->title()), but something like field_get_items() that doesn't need to change anyway, that's great. And I'm still not convinced we want to store data in the actual entity class itself, I still feel like that should be devolved to a storage controller or similar (so the actual data structure might be $entity->storage->title).

I don't think this would buy us much. Knowing that a property value has been updated does not imply it has really changed, so modules would still have to check for that.

Well the $changed array would contain both keys and values. So you could still check the value as well, but you wouldn't need to iterate over two sets of object properties to do that. Either way the $original_entity should be based on the one we actually loaded at the beginning of the request, not one freshly loaded from the db. At the moment we have no locking for entity writes, so another process might have updated it already, making it not really the $original_entity.

If we only change things that are explicitly set with ->update(), then even if you have two processes updating the same entity at different times, you'd avoid a race condition in the case they were updating different things, which we currently don't account for in the slightest. I think Damien was getting at this above.

Crell’s picture

catch: Ah, you are correct. I've even used field_get_items() before and I'm still always forgetting it. Still, IMO that belongs as a method on the object, not as a separate utility function. That makes it more discoverable so that it's no longer "not as well known as it should be."

And I'm still not convinced we want to store data in the actual entity class itself, I still feel like that should be devolved to a storage controller or similar (so the actual data structure might be $entity->storage->title).

Right. That sort of internal organization within $entity should be divorced from its interface that other developers use. Whether the data is front loaded, lazy loaded, passed through to an internal storage object, etc. should not be something a module developer just printing a list of node IDs and titles should care about.

yched’s picture

An issue with $entity->label() = some random runtime value instead of $node->title = {node}.title is we lose queryability.
Then, how do you do an autocomplete widget (or a filterable admin list) when the set of values to display (runtime label()) has no predictable relation to any queryable source ? See #1194086-42: Real Name not working for user references (second paragraph).

[edit: well, that might be more related to the existence of 'label callback' than to the strict question of accessors vs properties]

I'm also convinced that there should be OO accessors on field values to avoid [][] madness. Not sure right now of the correct shape either. I'm also wondering whether field values as classed objects with an interface would help here (i.e $entity->field_foo instanceof FieldValueInterface, <code>$entity->field_foo->getValue()...)

fago’s picture

fago has suggested elsewhere, I believe, that these should all turn into Fields. While many of them likely could and should, I don't believe they all should.

Well, kind of but not really, - see #1346214: [meta] Unified Entity Field API.
What I'm suggesting is to build out a clear API for entities and their properties including fields, so one doesn't have to care whether some entity data is a) a property, b) a field or c) something else. Right now, I think the distinction between a) and b) is a major developer pain point. As developer, I should not have to care whether X is a module provided property or a user-defined field. The APIs around it should be the same.

We already had getters/setters aimed to cover both in the EntityInterface established by #1184944: Make entities classed objects, introduce CRUD support, but we decided to remove it for now as its language handling need some more thought, see #1277776: Add generic field/property getters/setters (with optional language support) for entities.

That are the currently proposed methods:

+  /**
+   * Returns the value of an entity property.
+   *
+   * @param $property_name
+   *   The name of the property to return; e.g., 'title'.
+   * @param $langcode
+   *   (optional) If the property is translatable, the language code of the
+   *   language that should be used for getting the property. If set to NULL,
+   *   the entity's default language is being used.
+   *
+   * @return
+   *   The property value, or NULL if it is not defined.
+   *
+   * @see EntityInterface::language()
+   */
+  public function get($property_name, $langcode = NULL);
+
+  /**
+   * Sets the value of an entity property.
+   *
+   * @param $property_name
+   *   The name of the property to set; e.g., 'title'.
+   * @param $value
+   *   The value to set, or NULL to unset the property.
+   * @param $langcode
+   *   (optional) If the property is translatable, the language code of the
+   *   language that should be used for getting the property. If set to
+   *   NULL, the entity's default language is being used.
+   *
+   * @see EntityInterface::language()
+   */
+  public function set($property_name, $value, $langcode = NULL);

Given that, you can access any entity property or field the very same way:

   $title = $node->get('title');
   $node->set('title', 'new title');
   $text = $node->get('field_text');
   $node->set('field_text', 'new text');

  // That said something like the following should render all data belonging to an entity:
  foreach ($entity->getProperties() as $name => $property) {
    $content['property_' . $name] = format_this_thing($property); 
  }

Also given that, one could go and convert properties to fields and vice-versa without breaking code. BC++.

I don't consider such matching desirable at all; at least not at the developer-facing-interface level. Remember, there's no reason to expect that all entities will live in SQL tables using core's per-field storage mechanism. Alternate storage systems that don't have "columns" should be easier in Drupal 8, not harder, and may well change per-entity. Some entities may exist on the other end of a web services call, so the concept of columns doesn't even make sense. Developers who assume that $node->nid maps to {node}.nid will have a much harder time upgrading when we decide that we really want node_id as the property and column instead, as has been discussed elsewhere #1233394: Agree on a property naming pattern.

Yep, I was just referring to the database as example. Anyway as discussed over at #1233394: [Policy, no patch] Agree on a property naming pattern, for DX we want the entity properties to match the data source. Still, we'd have the methods like $entity->label() for generic access.

So yes, $node->id() is more future-safe than $node->nid or the suggested $node->id. We can't have a method for $node->module_data though, so instead we'll have something along the lines of $node->get('module_data').

As said in #1 , renaming module_data would so probably break both ways of accessing the data. And even if $node->get('module_data') is more safe to use e.g. due to some not-yet existing lazy-load functionality, I don't see a cause to forbid accessing $node->module_data. If e.g. the implementing module knows it's there, why forbid accessing it? E.g. the module could do $node->module_data['entry'] what itcan't with a method.

If we only change things that are explicitly set with ->update(), then even if you have two processes updating the same entity at different times, you'd avoid a race condition in the case they were updating different things, which we currently don't account for in the slightest. I think Damien was getting at this above.

Yep, solving that properly would be very desirable. However currently we load the original entity after opening the transaction, thus the original entity really reflects the state before the update (at least for db entities). Still, the to-be saved entity might revert any interim changes as it might be based on a older entity.

catch: Ah, you are correct. I've even used field_get_items() before and I'm still always forgetting it. Still, IMO that belongs as a method on the object, not as a separate utility function. That makes it more discoverable so that it's no longer "not as well known as it should be."

Yep, that what's EntityInterface::get() is supposed to do.

Anyway, classed entities are *very* young right now. I'm all in for introducing methods but I don't think we can go and plan for requiring them right now, considering the major major shift it would be. I don't see us re-writing all of Drupal to get that in the d8 circle, in particularly as I don't see the big requisite to justify such a major shift.

Still, if closer to the release we've established classed entities and methods everywhere, things might look different and the shift might have become smaller. If reasons to do that change pop up then, why not. Right now I cannot see those reasons though, so I don't think we should plan for it to happen as we can't know.

Michelle’s picture

A big +1 to #24 from me. I'm currently working heavily with entities in D7 in writing Artesian and have, in just the last few days, been dealing with programatically getting/setting field api values attached to the entity. I ended up using the contrib Entity API wrapper functionality to do it but that feels very hacky to me. I'd much prefer something built in to do it.

Michelle

sun’s picture

Apparently, the patch in #1361228: Make the user entity a classed object exposes a shitload of custom add-on properties and data on the user account entity alone. :)

Crell’s picture

[edit: well, that might be more related to the existence of 'label callback' than to the strict question of accessors vs properties]

Yes, the direct mapping is already a landmine in Drupal 7 anyway. So we lose nothing by formalizing that.

Anyway, classed entities are *very* young right now. I'm all in for introducing methods but I don't think we can go and plan for requiring them right now, considering the major major shift it would be. I don't see us re-writing all of Drupal to get that in the d8 circle, in particularly as I don't see the big requisite to justify such a major shift.

http://groups.drupal.org/node/197583

Anything that's not accessed via a standard API is not going to be supported, and I am of the mind that if you go outside the standard API the response will be "well, sucks to be you". That means we need to 1) Have a standard API that's up to the task and 2) Encourage/convince/cajole/force/whatever people to use it. "Standard web services form of any entity", I think, counts as a "big requisite". (Although I admit to being biased as WSCCI leader.)

catch’s picture

The original entity reflects the state before the update, but what it doesn't necessarily reflect is the state when the entity was loaded before being updated. Because plenty can happen in between. I think Drupal commerce had to start a transaction in entity_load() to get around this or similar.

fago’s picture

Anything that's not accessed via a standard API is not going to be supported, and I am of the mind that if you go outside the standard API the response will be "well, sucks to be you". That means we need to 1) Have a standard API that's up to the task and 2) Encourage/convince/cajole/force/whatever people to use it. "Standard web services form of any entity", I think, counts as a "big requisite". (Although I admit to being biased as WSCCI leader.)

Agreed, however "standard API" does not mean it has to enforce using interfaces or methods. We can allow doing
$node->title = 'new title';
as part of our standard API. That won't hinder us from doing a great canonical entity representation.

Crell’s picture

It does hinder it if I have to call $node->title and $comment->subject and $user->name rather than just $entity->label(), because then I need code somewhere that deals with every type of entity. When Ryan creates a Commerce Product entity, the "export an entity to XML/JSON" and "create a new entity from XML/JSON" logic should not have to change one bit... not even the creation of a new mapping hook, not even by Ryan.

That's the standard I'm shooting for.

fago’s picture

Me too :-) However, the possibility to use $node->title does not hinder you from going with $entity->label()?

Crell’s picture

In that particular case, arguably not... unless the availability of $node->title = "foo" means that we never come up with a $node->setLabel() equivalent (or whatever it is). The absence of a standard setter method *would* hinder that.

At the micro-level, I suppose the question is "if we see a patch that uses print $node->title instead of print $node->label(), do we mark it CNW?" My position is Yes, because:

1) The latter is more consistent.
2) The latter is more backward compatible.
3) The latter, if used consistently and universally, is easier to read and therefore improves DX.
4) If the latter is used universally, then we can make $node->title protected (which we would need to do for point 2).
5) The latter, as a convention, lends itself toward encouraging the use of methods for other things than the basic properties (See #20), which is necessary for the "universal treatment of an entity is an entity is an entity", which is necessary for really solid export/import/service/syndication/stuff.
6) It reminds us that we need to consider setter methods, too, for everything. (Or whatever the abstracted setting mechanism is.)

plach’s picture

Edit: it was the right issue after all :)

Quoting Crell from #1233394-53: [Policy, no patch] Agree on a property naming pattern:

2) Data objects: An entity object should be an abstracted API, with an emphasis on a clear and consistent interface rather than 1:1 mapping to the data storage. That is, column name => property name might happen, or might not, but that's not something someone calling entity_load() really cares or thinks about. In this case, using $entity->properties['property_name'] internally is totally fine, because someone calling the object won't be using the object properties anyway; they'll be using methods (id(), uuid(), etc.)

I also think we should go for #2, for the reasons Crell has explained way better that I'd do :)

Such an approach would let us, for instance, confine all column-mapped properties into a $storage property. Standardizing on this in turn would let us define a generic implementation of an interface providing methods for storage-related property handling in the base entity class once and for all. Obviously subclasses would be free to override them [basically what catch was saying in #21]. This would have the additional benfit of avoiding possible clashes between "reserved" property names and schema column names.

In such a scenario we might even consider to move every dynamic property into one (or more) appropriate "bucket" (something like $entity->storage, $entity->field, $entity->property, $entity->youNameIt) and leave only defined ones at the "root" level. For instance, this would probably make easier for us to define an EntityInterface having parametrized methods to access dynamic properties ($entity->get('property_name')) and regular accessors for defined properties or thingies exposed as such ($entity->label()), with all the related advantages in terms of overriding and self-documentation.

See also http://groups.drupal.org/node/197848#data-structures for some related considerations regarding multilingual support. Summary in a following comment.

Crell’s picture

plach: Yep. That's pretty much what I was going for in #20.

(These two issues really shouldn't be as separate as they are. *sigh*)

plach’s picture

@catch:

That API function completely hides the language key on entities, and even if we changed the structure (at one point we were discussing [$delta][$langauge] instead of [$language][$delta]) the api for field_get_items() could stay the same.

There are some non-trivial implications about the usage of field_get_items(), quoting from #1260640: Improve field language API DX:

Moreover people often need to access the raw field values, which currently is very unconfortable because one has to explicitly provide a language, whereas usually Field API functions do not require the user to provide any to behave correctly. As a workaround someone is using field_get_items(), which is intended to provide the field raw items in the language they would be currently displayed: this works wonders in a monolingual context but totally sucks in a multilingual one, since it may return the wrong values in the case the business logic required to operate on a language different from the current one, but the code were not aware of the very concept of language.

@yched:

An issue with $entity->label() = some random runtime value instead of $node->title = {node}.title is we lose queryability.

This is true, OTOH labels don't necessarily map to storage columns: an event entity might have a computed label, since its date would need to follow the date format settings; a sport match entity might have a computed label holding the name of the teams and the end result, etc.
Perhaps every entity should define a queryByLabel() method encapsulating an alterable query. Any module responsible for a difference between the stored value (if any) and the value returned by label() could kick in and peform the needed adjustments to the query.

Quoting from http://groups.drupal.org/node/197848#data-structures:

Related to this matter is also the ability of translating entity labels. In D7 this is done by replacing entity label properties with a (text) field. In D8 we need to define more clearly what an entity label actually is: depending on the context it may be used as an administrative item, a navigational item or pure content possibily containing markup. Not every entity type might need the last two, while the first one is probably required. The default label implementation could be backed by a (translatable) property, but we might introduce the possibilty to designate a field as a "label provider" instead of the plain property. The Entity interface should expose a method to get the entity label as a plain string, it should be responsibility of the label provider to implement a to-string functionality. We should retain the capability of having computed labels, which might not even need to be translated. For instance, an event date might just need to be localized on the fly, but a value shared among all the entity translations should be enough. This flexiblilty should free us from requiring an actual storage column for the label, which would be available only where it actually makes sense. In this scenario, the pure content behavior could be implemented on a per-entity type basis, if ever needed.

@fago:

E.g. the module could do $node->module_data['entry'] what itcan't with a method.

Well, if we were using an ArrayAccess this would become $node['module_data']['entry'].

Agreed, however "standard API" does not mean it has to enforce using interfaces or methods. We can allow doing $node->title = 'new title'; as part of our standard API.

Any label callback might alter the result of entity->label(), hence having a public property would expose the risk of property and exposed value going out of sync.

To sum up: IMO a proper interface would solve far too many issues (albeit possibly introducing some :) to let public properties undermine it with their existence.
Crell++

plach’s picture

@Crell:

plach: Yep. That's pretty much what I was going for in #20.

Yes, I wrote that post before reading this issue, I just moved it from the other one since it looked OT and I did not want to just throw it away ;)

webchick’s picture

My recommendation would be:

- Add the methods to access "raw" properties, recommend using them as best practice.
- Keep the ability to access properties in D8 as a way of easing module/theme porting, and people breaking bad habits.
- Remove the ability to access properties directly in D9.
- People who don't keep up with the curve end up with a royal pain in the ass upgrade path from D8 to D9. Too bad, so sad. For others following best practices, though, it should be pretty painless.

fago’s picture

I agree with the arguments #32, using $node->label() is better than $node->title. I'm not sure all the arguments hold for e.g. a field though as we don't benefit from the label <-> property mapping as for label(). Anyway, I agree that insisting on methods makes throwing arbitrary stuff on entities look much worse, what is really a good thing.

Still, I think we have to start with the "best-practices first" approach anyway, until we know the methods fit every use-case well. Thus, the suggestion of #37 sounds good to me. :-)

Perhaps every entity should define a queryByLabel() method encapsulating an alterable query. Any module responsible for a difference between the stored value (if any) and the value returned by label() could kick in and peform the needed adjustments to the query.

What about just adding a label property always? Then it's queryable and set-able just as every property. We would just have to denote which property represents the label in addition to label() - whereby usually the property should be just called 'label', but we should not require that nor the existence of the property at all to ease integrating remote-data.
So even if the label is calculated, let's calculate it on-save so it's queryable like any other property + it can be easily altered by modules upon save.

sun’s picture

I'm ok with formalizing on methods to access the bare essential / universal entity properties (id, uuid, label, etc) and would agree that we should require to use the methods everywhere.

For me, that ultimately implies that there also have to be setter methods for the universal properties. Mere getter methods would be a bit pointless.

Given both, I'd ultimately be ok with making the universal entity properties protected. ...although I don't think it would be strictly required, and also slightly contradicts what we've learned so far; properties should only be protected if code must not be allowed to access or change it directly, and I don't see that condition being true yet (aside from OO purists reasonings). The situation might or would change if there would indeed be additional logic in the getter/setter methods.

However, before we can consider methods for entity-type specific properties, we need to discuss an entity property API in core, which is a separate issue. Without a concrete result and possibly code coming out of that discussion, it's pointless to debate optional or required usage of methods, or public or protected properties.

The same more or less applies to fields, although I really dislike that we are special-casing fields. Field is a module like any other module. (Period. In my book. So...)

For any other properties, we first need to resolve the challenge that looks impossible to resolve. We can only design and do one way or the other, if the entity API properly accounts for Drupal's modularity; i.e., entities can carry additions from installed modules, and it does not matter whether those are essential for representing the entity or not. An addition may be required to fulfill the intended behavior or business logic of a particular entity, so if we are not able to properly get/set/export/import/serialize/unserialize the additions, something's gonna blow up somewhere.

Damien Tournoud’s picture

Given the state of PHP OO, we are going to have to have a trampoline to access entity data, or entities will never be horizontally extensible anymore. (We could decide that this is a good thing, but horizontal extensibility is the essence of Drupal. If a non-horizontally extensible model is what we want, there are better ORM frameworks out there.)

Given the need of a trampoline (a generic ->get() and ->set() methods that are a router to something else), why not using the standard PHP facility of __get and __set?

This way, we keep the API as it is, and we support:

$node->label = "toto";
echo $user->label;
fago’s picture

The same more or less applies to fields, although I really dislike that we are special-casing fields. Field is a module like any other module. (Period. In my book. So...)

I cannot agree more. That's the overall goal of #1346214: [meta] Unified Entity Field API.
We've already planned those getters/setters, see #24.

Given the need of a trampoline (a generic ->get() and ->set() methods that are a router to something else), why not using the standard PHP facility of __get and __set?

Yep, certainly we need to keep horizontally extensibility or it's not drupal anymore. I'm hesitate to go with magic methods though, as all the magic makes it hard to track code-flow. So if there is a method involved, let's do it explicitly.

catch’s picture

@sun I don't think we want to special case fields in the entity API anywhere. But the field api has enough going on that it might what a separate router for its own stuff. i.e. I could see field_entity_load() adding $entity->field, which would be a class with its own methods.

@Damien I suggested __get() and __set() in the issue linked above, I don't see the benefit of entirely generic get()/set() methods compared to the magic, especially considering the current api, but others weren't keen.

plach’s picture

@DamZ:

Given the need of a trampoline (a generic ->get() and ->set() methods that are a router to something else), why not using the standard PHP facility of __get and __set?

Some time ago I went even further and proposed to use __call to provide magic accessors which would be individually overridable and constitute a proper interface:

<?php
// simplified pseudocode
function __call($name, $value = NULL) {
  $op = strpos($name, 0, 3);
  $prop = strpos($name, 4);
  return $op == 'get' ? $this->get($prop) : $this->set($prop, $value);
}

$entity->setFoo('bar');
$foo = $entity->getFoo();
?>

I also remember Crell arguing against this beign somehow old fashioned and suggesting the use of the ArrayAccess interface, using which would make individual overrding harder. However I'm not not sure how much we would actually need this capability TBH,

pounard’s picture

@sun

if code must not be allowed to access or change it directly

It also can be for pure readability and better visibility: we can make such properties protected only to hide them (and not really protect them from change) when their existence should not matter to the developper comprehension and daily usage of the code. In that regard, it makes the code more readable since good IDE will hide those from the autocompletion features, for example.

Another side goal of relying on methods is also that those same editors will provide good completion with methods. It will too with explicitely declared object properties thought, which doesn't sound that bad to use sometime (but I agree that for update notification we need to be able to track their changes, so methods are good); Actual Drupal model heavily rely on the fact that everything is higly dynalic: there is no self-documentation or code autocompletion possible at all, and data you can find on object is not predictable at all.

In that regard, magic __get() and __set() also provide a non predictable and non introspectable, and non self-documenting data model; we need helpers aside to be able to introspect the objects.

fago’s picture

@sun I don't think we want to special case fields in the entity API anywhere. But the field api has enough going on that it might what a separate router for its own stuff. i.e. I could see field_entity_load() adding $entity->field, which would be a class with its own methods.

I think that's bad if that results in a need to special case field-data for developers. Why should developers have to treat a property different just because it is a field? That would not be only a pain point, but forces properties and fields in separated and not interopable APIs.

Michelle’s picture

I think that's bad if that results in a need to special case field-data for developers. Why should developers have to treat a property different just because it is a field? That would not be only a pain point, but forces properties and fields in separated and not interopable APIs.

This is a very important point and one I'm running into right now in D7. I have custom entities in Artesian and it's simple to, say, set the post title, which is a field in the post table, not a field api field. But setting/getting the message, which is a field api field is turning out to be a serious PITA. I tried using the contrib Entity API wrappers but they don't work right so now I'm trying to figure out if I should write something custom or if there's something else I'm missing. It's a mess.

I'm not looking to get my questions answered in this thread as that would be off topic but I wanted to toss a real life use case into this abstract discussion and put in my +1000 for a unified system where I can just simply set or get a value to or from the entity and not give a rat's behind whether it's a property or a field api field or whatever. When I'm interacting with the entity from the outside, it should Just Work.

Michelle

Crell’s picture

I can go with #37. To wit:

1) Methods are the preferred way to do everything (read, write, import, export, dance) to entities.
2) We leave the raw data structure accessible for now, with big red "you probably don't want to do this" signs. Internal data structures are still structured around making the method-based access the primary interface.
3) Where we find places that we have to access them directly, that automatically implies a D9 API improvement patch. That covers us in case we miss something in D8, which realistically is probably going to happen.
4) In D9, we lock down the raw data structures. If you are still using raw data structures generally at that point, "dude, we warned you!"

This means having effectively a @deprecated API model in core, and sticking with it to remove it in the next version. I am very much OK with that, as long as we don't do what PHP itself always does and go back and forth on whether or not something is really deprecated 5 times. :-)

pounard’s picture

#47 seems to be a good compromise to start with, balanced with backward compatibility and efficiency to determine the real-world use cases.

Michelle’s picture

#47 sounds great to me as well. If the contrib Entity API module will follow this path as well, then there's a big chunk of people that will already be doing this in D7 and so it will come naturally in D8. Assuming none of this is directly backportable to D7 core, which seems likely given that D7 core doesn't have a lot of Entity stuff in, yet.

Michelle

webchick’s picture

Title: Rely on methods to access entity properties » Rely on methods to access entity properties [policy, no patch]
Status: Active » Reviewed & tested by the community

Dare I do this?

catch’s picture

That seems like a good compromise for accessing properties.

For setting properties I'd still like us to enforce a single route in Drupal 8 - so that we can properly sort out the $original_entity issue for entity updates - that's currently an area of massive inconsistency and contrib workarounds, and some kind of generic $entity->update($key, $value); entity->save(); feels like a good way to do that.

This means that doing $node->title = 'foo'; $node->save() would cease to work though (although also not throw an error), which doesn't seem pretty.

fago’s picture

yep, #47 sounds like being a good way to proceed! :-)

plach’s picture

This means that doing $node->title = 'foo'; $node->save() would cease to work though (although also not throw an error), which doesn't seem pretty.

I've seen a lot this sh*t in the Title module, that's why I'm wary about going with any variant of #47. However, if the majority of us is ok with it, I'll be too...

Crell’s picture

plach: Well, sooner or later we're going to need a better alternative to that anyway. We should figure out what that is.

catch’s picture

Status: Reviewed & tested by the community » Postponed
Issue tags: +revisit before beta

I'm still quite concerned about #47, so I don't want to close this as fixed. Instead I'll mark 'postponed' and 'revisit before release' - if we get everything done we want to with entities in core, we may not want to (or be able to) leave the public properties there, but let's see how it goes the next few months.

Michelle’s picture

Status: Postponed » Active
Issue tags: -revisit before beta

Based on an IRC conversation with catch, I am re-activating this. There are two issues that need to be decided here:

1) Will the properties be protected so they cannot be used outside of the methods? This is what catch was postponing and it will still need to be postponed once #2 is settled.

2) What form will these methods take? Saying we're going to use methods is great but will it be get($property) or getPROPERTYNAME() or that magic stuff that was discussed above? Even if we don't have a patch in this issue, the policy needs to be clearer so we know what we're doing.

Michelle

Crell’s picture

I believe the idea behind #37 and #47 is that:

1) Public for now, but marked @deprecated with the conscious intent that we implement methods to do whatever we would otherwise/previously need to do with direct access. We would probably not move them to protected in Drupal 8, but would use Drupal 8 as a testbed to verify that we didn't miss any reasonable use cases and then make them protected in Drupal 9.

2) Excellent question! But now that we have agreement that we want to use methods instead, we can discuss what form those should take. Probably in other sub-issues of #1346204: [meta] Drupal 8 Entity API improvements. Vis, "what we're doing" is "OK, go figure out what that better method/API looks like now that we know we want one".

fago’s picture

Status: Active » Postponed
Issue tags: +revisit before beta

Exactly - thus restoring the state of this issue.

For 2) first basic getters/setters are already proposed as part from #1277776: Add generic field/property getters/setters (with optional language support) for entities, what means having basic, language-aware get($property)/set($property) methods as outlined in #24.
More long-term thinking, we should definitely have methods that take care of loading referenced entities for us. Of course, feel free to open a separate issue for discussing that :)

Michelle’s picture

Ok, since the title of that issue doesn't make it clear that it's defining the solution to this issue, I'm copying from the patch for reference. This isn't yet committed so subject to change but at least gives people trying to keep up an idea of where things are going. :)


 /**
+   * Returns the value of an entity property.
+   *
+   * @param $property_name
+   *   The name of the property to return; e.g., 'title'.
+   * @param $langcode
+   *   (optional) If the property is translatable, the language code of the
+   *   language that should be used for getting the property. If set to NULL,
+   *   the entity's default language is being used.
+   *
+   * @return
+   *   The property value, or NULL if it is not defined.
+   *
+   * @see EntityInterface::language()
+   */
+  public function get($property_name, $langcode = NULL);
+
+  /**
+   * Sets the value of an entity property.
+   *
+   * @param $property_name
+   *   The name of the property to set; e.g., 'title'.
+   * @param $value
+   *   The value to set, or NULL to unset the property.
+   * @param $langcode
+   *   (optional) If the property is translatable, the language code of the
+   *   language that should be used for getting the property. If set to
+   *   NULL, the entity's default language is being used.
+   *
+   * @see EntityInterface::language()
+   */
+  public function set($property_name, $value, $langcode = NULL);
David_Rothstein’s picture

A get() method by itself is not enough for the purposes of the current issue, since as discussed above there's no way to know everything that might have been set by various modules. If you can't use the API to get everything out of the entity, then you can't deprecate the public properties. So we would need to introduce at least one of these two things:

  • Methods like $entity->setRandomData() and $entity->getRandomData() (better named, of course!), which would be a defined dumping ground for modules, like I described in #11.
  • Something like $entity->getProperties(), which is assumed in fago's code snippet from #24 and which would return all properties regardless of where they came from. This would basically mean the defined dumping ground is mixed into the top level.

I'm not sure there's another issue appropriate to leave that comment in, so leaving it here for now.

andypost’s picture

Status: Postponed » Needs review

we need special kind of access to delete NodeType and Language entities, that should be done via the same interface, suppose that better to build this within service with implementation over state as node type have now. related #2084567: Implement a EntityLockedInterface and service to allow locking entities

tim.plunkett’s picture

tim.plunkett’s picture

Issue summary: View changes

added crell's post

catch’s picture

Switching tag.

xjm’s picture

Status: Postponed » Closed (duplicate)
Issue tags: -beta target

This issue was marked as a beta target for the 8.0.x beta, but is not applicable as an 8.1.x beta target, so untagging.

Also, I believe this policy was already adopted and mostly resolved in child issues of #2016679: Expand Entity Type interfaces to provide methods, protect the properties, so marking it as a duplicate of that.