Problem/Motivation

Agree on a property naming pattern, so we can start standardizing on it.

Proposed resolution

(From #34, #55 ):
- Defined properties of a classed object should generally use $lowerCamel (This is already in coding standards.), unless something else would be most natural (That is, if they map directly to a DB field, then use the name of the DB field.).
- Non-property variables use $underscore_names. (This is already in coding standards.)
- Dynamic property names should be avoided whenever possible in favour of defined properties as those are far easier to debug. (This is already in coding standards, in practice if not in letter.) Dynamic property names on objects should use whatever is most natural. (That is, if they map directly to a DB field, then use the name of the DB field.)

- Classed objects that have a numeric unique identifier should provide access to that identifier via an id() method. The internal property name used for it is irrelevant for API purposes.
- Defined properties of a classed object that refer to the numeric unique identifier of that entity should be called id, i.e. $entity->id.
- Bare variables that refer to the numeric unique identifier of an entity should be in the form $entity_id. That is, $node_id, $user_id, etc.
- Defined properties that refer to the string unique identifier (machine name) of an object should be called name, i.e. $view->name, $filter->name.
- Bare variables that refer to the string unique identifier (machine name) of an object should be in the form $type_name. That is, $view_name, $filter_name, etc.
- Defined properties that refer to an entity other than the the object to which that property belongs should use a $entity_id (i.e. user_id, node_id reference name if the relation to the referenced entity is clear. That is, the user to which the entity belongs should be always referenced using the property user_id. (This improves DX as one does not need to remember slightly different relation names (user vs. author vs. owner vs. ..).
- Defined properties that refer to an entity other than the the object to which that property belongs should use a describing name $relation_id (i.e. parent_id, issue_id) if the relation to the referenced entity would not be clear else.

- The use of old-style abbreviated variables is discouraged. Eg, nid, uid, vid, etc. While shorter to type, they are less self-documenting and at Drupal's current scale frequently run into namespace collisions.

And taken from #42 about how the pattern should be applied:

  • Entities handled by Drupal modules should use Drupal's convention for property names (see above)..
  • Entities originating from remote systems should not mess-up with the data to match the convention, but integrate the data as is. So developers used to that data structures get the data as they know it.

Note: See #1301106: Rely on methods to access entity properties [policy, no patch] for the related discussion whether properties should be public accessible or not.

Remaining tasks

Write documentation.

User interface changes

None.

API changes

For now none. The first step is to agree on a pattern we can use when building new stuff. Changing existing stuff to match the pattern is the second step and should be done in other issues.

Original report by fago

In #1216094: We call too many things 'language', clean that up the question of how we name referencing properties came up. While we have nid, uid, cid, tid and others, field API has started doing entity_id, field_id, ... (while I remember we had etid earlier).

from #96:

And yes, field API uses field_id and entitiy_id. Is that where Drupal is heading? Then why not $language_id? We are trying to have consistency here, right? Drupal used to be consistent in Xid's. That is clearly not 100% applied to some systems, but is our most common pattern for universally used identifiers. We want language to be a universal thing like that. It will be eeeeeverywhere. If our new pattern is in the field API, then are we going to ever have user_id and node_id in core? Is that the new consistency we are striving for or is entity/field API an exception, not a rule?

I think its time to stop doing "*id" pattern. That pattern ease writing db-queries (easy joins!) for the cost of a non-intuitive drupalism one has to get to know first + they quickly run into collisions. We even have already collisions in core (sid -> session id, search item id).
Who is hand-writing lots of sql-queries today? The minority. Given the entity/field api, efq and views this is not really an issue, and with nosql finally obsolete. Instead we should focus on today developers needs and make sure they can easily understand what the property is about by just reading its name!

So let's agree upon a good pattern and start adopting it! I know changing everything now is out of question, but still agreeing upon a patter for anything new now helps.


Patterns that come to my mind are:

1) The old uid, cid, nid, pid, tid, sid, .. pattern.

2) The mentioned Field-API pattern: "field_id", "user_id", "node_id", "language_id"..
"user_id" is used to refer to a user entity, while the user would be still $user->uid or $user->id

The names are describing, but rather verbose. Also by default, we would end up with lots of _id properties in objects, e.g. assuming "comment" would use the pattern:

$comment->node_id
$comment->parent_id
$comment->user_id (or $comment->author_id?)

3) Use the general property name without an '_id' suffix, i.e. "field", "user", "node".
So "user" is used to refer to a user entity, while the user would be still $user->uid or $user->id.

The names are describing, but not as verbose as 2). The names do not tell us the properties contain only ids though. However, I don't think that's an issue as long it's consistent. Developer's would know every referencing property contains the id, not the object. And in the DB level it's pretty much obvious "node" would have to hold an id either.
So applied to the comment example we'd have

// Properties are holding ids.
$comment->node
$comment->parent
$comment->user (or $comment->author ?)



Personally, I like option 3). It would be also consistent with manually created fields, e.g. $node->tags doesn't hold objects either. Thoughts?

Comments

Gábor Hojtsy’s picture

Why author and not user? The current pattern is uid to refer to the user.

fago’s picture

yes, but actually it's the comment's author. I think 'author' is more describing than just 'user' and helps differentiating between potentially multiple, referenced users. So ideally, the property describes the relationship between the two entities/objects.

Gábor Hojtsy’s picture

Well, that change you did not explain in the OP. That sounds like your goals are

(a) more descriptive longer names (user_id)
(b) even more descriptive names when the value is a foreign key (author, author_id)
(c) shorter local names (id in user table)

That is changing every occurance of the IDs and to entirely different names depending on their role in a table/object (on possibly different names per table/object). The watchdog uid would be the "actor" or "actor_id" then? Sounds like this can really be a better model but at the same time can be debated on a case by case basis indefinitely due to getting rid of our clear standards and putting more vague "a name that works best" kind of rules in place instead which sounds like will lead to more inconsistency then the level of consistency it would achieve.

fago’s picture

Yes, in particular I think we need to make object properties more sound. As new developer I could imagine what $log_entry->actor or $log_entry->user is about, $log_entry->uid not.

Of course, we still should use consistent terminology. So we certainly go with $log_entry->user too, if we think there is going to be only one user anyway and the context of the relationship is clear. I agree that this might be more consistent, while it isn't applicable in all situations. E.g. $comment->parent is the only comment reference by default, but still $comment->comment would make no sense.

So we'd have to different variants of 3), while I think the main question now is whether we want to do something like it at all. -> I've updated 3) to show both variants.

fago’s picture

Issue summary: View changes

worked in gabor's input to use consistent names when possible

Anonymous’s picture

I think this is a great idea for DX.

I personally prefer option 2). Even though object properties like $node->tags may never hold objects, I can see how someone who doesn't know that rule would assume that $node and $comment->node would be the same kind of thing... a node object.

catch’s picture

Initial reaction would be $node->id and $user->id then $node->user_id. Node author vs user feels like the sort of semantics that fields are better at handling. I might have an author field but the user reference column would still be user_id then. Sometimes the user who posts the node is not the author of it. So looks like an option #2 vote as well.

sun’s picture

Title: agree upon a property naming pattern » Agree on a property naming pattern

In an overall platform-level scope and an "ideal world" scenario, the answer to this question is pretty clear to me:

  1. $entity->id is the serial/unique ID of $entity. Common REST(ful) pattern.
  2. $entity->user, if set, is the fully loaded object/entity of type user. See #585838: Consider a generic $entity->user property
  3. $entity->userId, if set, is the ID of a referenced user resource.
  4. Partially denormalized objects like $comment->uid, $comment->name, $comment->homepage do not exist. Somewhat related is #347988: Move $user->data into own table
  5. Incomplete stub objects like $user do not exist. See #361471: Global $user object should be a complete entity

(Note that 3. is currently not possible without manual workaround due to a bug in Drupal core: #1171866: Enforced fetching of fields/columns in lowercase breaks third-party integration)

ronald_istos’s picture

I like #7 and in general would go for more verbose for the sake of clarity. For me anything that moves us away from uid, nid, etc is great as the latter quickly becoming confusing - especially as you start adding your own domain-specific properties

sun’s picture

To clarify #7-3.:

$entity->user_id, and actually any other OO property that is using underscores, is an anti-pattern and totally off commonly adopted OO standards in the industry (not limited to PHP), which immediately makes them Drupalisms.

The only exception to that rule would be properties that are derived from module names and which are attached to objects; e.g.,

$entity->project_issue = [mixed]

But that's by design.

It is in line with the proposal in #7 though, since "user" actually maps to a module:

$entity->user = [object, hopefully soon of class EntityUser[Interface]*]
Crell’s picture

I would agree with #7 as well. Moderately verbose and self-documenting is much better than compact and obscure. Unless of course you like having functions named ioctl like the kernel folks, but I don't. :-)

It also gives a natural hint that $foo_id can be passed to foo_load($foo_id) to get $foo.

sun’s picture

3. $entity->userId, if set, is the ID of a referenced user resource.

The only "problem" with that, if you will, is that ->userId is the preferred property name on an object, whereas $userId on its own would be "violating" the current Drupal coding standards.

If you'd ask me, using ->userId on objects and $user_id elsewhere would be yet another inconsistency, so mayhaps, I'd rather loosen/soften the standards to allow for $userId variable names that are derived from object/entity properties.

But then again, you'd very possibly end up with both styles, $userId and $what_ever_custom_var, in your code, so that might be an even bigger WTF in terms of consistency.

catch’s picture

Hmm. I wouldn't mind $this->userId vs $user_id too much to be honest. We already have that code style between oo and procedural code (although not always applied to class properties consistently) so of things that could go wrong it doesn't feel too bad, or that we'd be introducing a new issue.

Crell’s picture

Frankly I still wouldn't expose classic object properties directly in most cases, so the $userID vs. $user_id point is moot. Remember, though, we're talking about SQL fields as well. I don't think anyone is proposing having a nodeId column in an SQL table, just node_id, which I am fine with.

sun’s picture

As the other issue makes obvious, I'm currently using lowerCamel casing in code and in schema to hold and store some externally delivered properties. I was skeptical about that variable/schema naming style first, but after "getting over it", I'm pretty happy with it now. Hence, my proposal in #7 actually was to use $entity->userId properties, $userId variable names, as well as 'userId' db schema columns.

Secondly, the idea of #7 is also to have a plain 'id' schema column in an entity's primary table (matching $entity->id). But only in the primary table; wherever referenced, it's [entityType]Id.

fago’s picture

>It also gives a natural hint that $foo_id can be passed to foo_load($foo_id) to get $foo.

Indeed! #7 sounds great to me too.

Though I'd suggest doing
$entity->get('user') to get the fully loaded user object instead of trying to access the conditionally set $entity->user.
I'm not proposing the more natural $entity->user() or $entity->getUser() though, as that would not work with Drupal's modular approach.

But then again, you'd very possibly end up with both styles, $userId and $what_ever_custom_var, in your code, so that might be an even bigger WTF in terms of consistency..

Yep, I must say I'm a bit worried about that. If we go down the road and have more and more OO in drupal, we'll have to convert variable names between both naming schemes more and more.
So at least, let's try to avoid unnecessary conversions and have db-column names == entity property names. So +1 for 'userId'column names.

pwolanin’s picture

Status: Active » Needs review
Issue tags: +Coding standards

For ease of loading/saving to the database, I propose we shift the coding standard to specify underscore everywhere for all object properties in Drupal 8.

We already have the field API fields names with underscores, so we are in a confused mess right now. There would be a lot of extra work (performance hit) if we have to coerce all property names between camel case in the code to underscore to save to the DB.

catch’s picture

I was thinking the other way and using camelCase in the db. Either way we want the database schema and properties to be the same, so it's a question of inconsistency or changing coding style.

$node->userId;

$user_id = $node->userId; // Note the inconsistency.
// {node}.userId

vs.

$node->user_id // Breaks current coding style.


{node}.user_id

$user_id = $node->user_id // Now this is consistent though.

I'd personally rather the $node->userId = $user_id; inconsistency - that's something we already have in core in a couple of places I think. We could also consider changing code style to allow $userId.

pwolanin’s picture

Further discussion in person - the proposal is simply to have an exception or variant of the standard so that Entity objects will use underscores for property names, even as we move beyond stdClass.

fago’s picture

$node->user_id // Breaks current coding style.

If we are usually doing $entity->get('user') or $entity->get('user_id') it would not be an issue though.
Also field names are using underscores, what won't change as they are user-input. So we'll have properties (being fields) using underscores on the entity objects anyway.

catch’s picture

That seems fine to me. I don't have a particular objection to $foo->some_thing if we're using it for properties that map 1-1 to something outside the class (like db table names or configuration).

plach’s picture

Crell’s picture

$user_id = $node->userId is not at all inconsistent. It's actually the standard PHP convention. Doing anything else would be inconsistent. :-)

For defined properties on an object that has a class, we should absolutely stick to the existing PHP coding standard. Dynamic properties on a classed object are generally a bad idea, but if we have to use them then we can't really control their format since they are, well, dynamic, and in most cases would be accessed dynamically, too.

That said, the better solution is to give up on the assumption that there is a 1:1 mapping from public object property to database column. I was just talking about that sort of "inappropriate intimacy" this week in my DrupalCon London presentation. :-) Design a proper interface (actual PHP interface), use it, and leave the rest as an implementation detail so that we can vary it without breaking the API every time we sneeze.

pwolanin’s picture

@Crell - we need a strategy to move us from the current entities to a new system. I think it will be a waste of time to try to conform to camel case naming in that phase. If we assume the properties will in the end be manipulated via the getters and setters, then as above, it won't matter in any case and we will have saved ourselves a lot of work (and a bunch of performance cost for string manipulation) by using property names that map directly to the storage.

Crell’s picture

As I mentioned at the London sprint, I'm OK with treating current entity properties as dynamic properties (they are now) and therefore an exception to the coding standards as an interim step. That doesn't mean we change the coding standards, it means we allow ourselves time to transition. I'm on board with that.

pwolanin’s picture

Status: Needs review » Fixed

ok, seems we have agreement then. I'll cross-link this from the g.d.o post.

Stalski’s picture

Is there already a consensus on the user vs author?

Gábor Hojtsy’s picture

Wow, fixed already. Wow!

This issue was opened as a followup to the already monster discussion in #1216094: We call too many things 'language', clean that up. Above it seems like you concentrate on the application of this to entities, but there are so many more things in Drupal that are loaded and manipulated structures based on database tables. Like filter configuration, contact forms, aggregator feeds, languages, etc. What's the applicability of the decisions made here to those?

Also, what's "the" g.d.o post? Did I miss it or was it not mentioned above?

Crell’s picture

Status: Fixed » Needs review

I agree, this is not fully resolved.

I will propose the following convention, therefore, so we have something refocused to argue about:

- Defined properties of a classed object should use $lowerCamel. (This is already in coding standards.)
- Non-property variables use $underscore_names. (This is already in coding standards.)
- Dynamic property names on objects should use whatever is most natural. (That is, if they map directly to a DB field, then use the name of the DB field.) Dynamic property names should be avoided whenever possible in favor of defined properties as those are far easier to debug. (This is already in coding standards, in practice if not in letter.)

- Classed objects that have a numeric unique identifier should provide access to that identifier via an id() method. (This was discussed at the London Sprint and I think we all agreed on this approach.) The internal property name used for it is irrelevant for API purposes.
- Bare variables that refer to the numeric unique identifier of an entity should be in the form $entity_id. That is, $node_id, $user_id, etc.
- Defined properties of a classed object that refer to the numeric unique identifier of that entity should be in the form $entityId. That is, $nodeId, $userId, etc.
- Bare variables that refer to the string unique identifier (machine name) of an object should be in the form $type_name. That is, $view_name, $filter_name, etc.
- Defined properties that refer to the string unique identifier (machine name) of an object should be in the form $typeName. That is, $viewName, $filterName, etc.
- Defined properties that refer to an entity other than the the object to which that property belongs should use a reference name. Eg, the user that authored a node is referred to from the node object as $authorId.

- The use of old-style abbreviated variables is discouraged. Eg, nid, uid, vid, etc. While shorter to type, they are less self-documenting and at Drupal's current scale frequently run into namespace collisions.

plach’s picture

@Crell:

The problem we are facing with language variables is that currently the language identifier is sort of a machine name, but it is a code and not a name. To fit into your proposed pattern it should be called $languageCode or $languageTag. Would this work?

Crell’s picture

plach: I believe so, yes. Properties that are neither a numeric ID nor a UUID nor a user-entered machine name would have no hard-defined name, other than "follow the coding standards on underscores and capitalization". (Arguably you could just do $language->code and $language->tag, but I leave that as an implementation detail for the language folks.)

Gábor Hojtsy’s picture

@Crell, @plach: well, in case of language, it IS a machine name. The language identifier is the machine name that Drupal and the rest of the world use to communicate about language information. However, it does not fit your example of $language_name or languageName because, the name of the language logically is English, German, etc. It is not the "title" of the language like on other objects which take machine names and display names.

pwolanin’s picture

@Crell - #28 seems to have some internal inconsistencies. Also, I, at least, am not going to write code to make any entity property at all use camel case. $user_id = $node->user_id is going to be it if we have a DB column named user_id.

I thought we had agreement on this in London? Entities are data, hence all properties are dynamic. All classes conform to just the EntityInterface, hence cannot have a defined properties, only defined methods.

Crell’s picture

What inconsistencies? (It's certainly possible as I wrote it off the cuff, which is why I'm asking for feedback. :-) )

If they're assigned dynamically externally, yes, we cannot enforce a format on them. If they're defined explicitly (class Foo { protected $someVar; }, they should follow coding standards and be lowerCamel.

An object of type Entity will absolutely have defined properties on it. That may not include db-column-derived properties (which we want to go away eventually as part of the pseudo-API), but they will be there. I'm thinking a reference to the controller for the entity, for instance. Those should not be in variable_case.

(Incidentally, that's another reason to avoid public properties as part of the API. Namespace collision with management properties like that, as it effectively blacklists "controller" as a DB column name.)

fago’s picture

If they're assigned dynamically externally, yes, we cannot enforce a format on them. If they're defined explicitly (class Foo { protected $someVar; }, they should follow coding standards and be lowerCamel.

Yep, that's exactly what we've discussed in London. In case of entity properties this means we'd stay with underscore names as they are defined dynamically.

However, the id-handling part of #28 does not comply to what we've discussed previously here and in London, as we wanted to go with just "id" and use commonly named properties like "user_id" instead of "author_id" in case the relation is clear (see discussion above).

Here is a try to revise #28 based on that:

- Defined properties of a classed object should use $lowerCamel. (This is already in coding standards.)
- Non-property variables use $underscore_names. (This is already in coding standards.)
- Dynamic property names on objects should use whatever is most natural. (That is, if they map directly to a DB field, then use the name of the DB field.) Dynamic property names should be avoided whenever possible in favour of defined properties as those are far easier to debug. (This is already in coding standards, in practice if not in letter.)

- Classed objects that have a numeric unique identifier should provide access to that identifier via an id() method. The internal property name used for it is irrelevant for API purposes.
- Defined properties of a classed object that refer to the numeric unique identifier of that entity should be called id, i.e. $entity->id.
- Bare variables that refer to the numeric unique identifier of an entity should be in the form $entity_id. That is, $node_id, $user_id, etc.
- Defined properties that refer to the string unique identifier (machine name) of an object should be called name, i.e. $view->name, $filter->name.
- Bare variables that refer to the string unique identifier (machine name) of an object should be in the form $type_name. That is, $view_name, $filter_name, etc.
- Defined properties that refer to an entity other than the the object to which that property belongs should use a $entity_id (i.e. user_id, node_id reference name if the relation to the referenced entity is clear. That is, the user to which the entity belongs should be always referenced using the property user_id. (This improves DX as one does not need to remember slightly different relation names (user vs. author vs. owner vs. ..).
- Defined properties that refer to an entity other than the the object to which that property belongs should use a describing name $relation_id (i.e. parent_id, issue_id) if the relation to the referenced entity would not be clear else.

- The use of old-style abbreviated variables is discouraged. Eg, nid, uid, vid, etc. While shorter to type, they are less self-documenting and at Drupal's current scale frequently run into namespace collisions.

fago’s picture

@machine-names:
Machine names like $view->name could easily collide with human-readable names called 'name'. We could use 'machine_name' instead, though I'd prefer on agreeing upon a different called property for human readable labels, such as 'title' or 'label'. That way we can stay with shorther variable names in code (no $filter_machine_name) too.

Gábor Hojtsy’s picture

@fago: a title of a filter? a title of a language? What would be your answer to this question: "What's the name of this language?" Would the answer be German or "de"? :) Similarly, is "What's the title of this language?" a natural question?

catch’s picture

Label seems fine though?

fago’s picture

Indeed, title doesn't work generally - though "label" should.

>"What's the name of this language?"
If we go with the "label" - human readable and "name" - machine readable distinction everywhere, this should become easy to answer: "de". However if we'd go with "machine_name" and "name" for the human-readable property the term "name" would en up being ambiguous.

Thus, I'm in favour of going with "label" for the human-readable name and "name" for the machine-readable name. Also, this suggestion plays well with the already existing concepts of entity labels and machine names.

sun’s picture

I like #34.

Classed objects that have a numeric unique identifier should provide access to that identifier via an id() method. The internal property name used for it is irrelevant for API purposes.

Technical implementation details aside, I think we want to use an ->id property, since a method would add many function calls throughout core and contrib modules in a single request.

Label seems fine though?

I'd love to standardize on ->id (ID), ->uuid (UUID), ->name (machine name), and ->label (human-readable title/name).

That's going to be a lot of work, but to me it looks very worth it.

fago’s picture

I'd love to standardize on ->id (ID), ->uuid (UUID), ->name (machine name), and ->label (human-readable title/name).

While I like that too, this would make the system less flexible and create some extra headache when integrating with remote systems. If the remote system has another name for the identifier or uuid, it's odd to have to force it into our pattern and wrangle with the data on load.

Thus, I think it's best to not *enforce* this pattern, but to start standardizing on it for all custom entity types. So all the code relying on a certain entity-type could be safely using "id", while still integrating with remote systems remains easy (and doesn't require unnecessary CPU for fixing the data).

Crell’s picture

That's exactly why using methods rather than bare properties is better. If a 3rd party entity uses "irn" or "accession_num" as its unique ID (I've worked with systems that use exactly those property names), we can still wrap it into an object with an id() method without having to rename or duplicate variables. That way, all existing code that works with entities still works. We can also handle local generation of uuids that way, too, on a 3rd party entity that doesn't have them already.

In practice I don't think we'll get much cost out of adding mostly-simple method calls. We're not doing anything deeply recursive here, so I'd call that a micro-optimization. The flexibility of going for a properly-built interface, however, is absolutely worth it.

fago’s picture

I agree with #41, however I don't think we should *enforce* using the methods.

Re-posting the suggestion from http://groups.drupal.org/node/171554#comment-581234

  • Entities handled by Drupal modules should use Drupal's convention for property names: $entity->id, $entity->uuid, .. If developers know with which entity type they are dealing, they can directly make use of that properties.
  • Developers generically dealing with entities can still use the methods defined by the interface, i.e. $entity->id(), $entity->uuid().
  • Entities originating from remote systems should not mess-up with the data to match the convention, but integrate the data as is. So developers used to that data structures get the data as they know it.
fago’s picture

Issue summary: View changes

fixed 2) to match 3)

fago’s picture

Ok, I've put #34 and #42 in the issue summary. Can we agree upon that?
If so, I guess we should start putting this into a handbook page.

Crell’s picture

No. It makes absolutely no sense to me to have BOTH an exposed id() method and an id property as a standard. If they're the same thing, then it's pure redundancy and will only serve to confuse people. If they're not the same thing, then that's confusing variable names and will only serve to confuse people. Plus, if for some reason id() does something more than return $this->id, then they will get out of sync. That is breaking encapsulation for no benefit.

Ibid for uuid, label, name, etc.

If we ever want to make even the slightest change to the internal implementation, that becomes an API break. Eg, with id() we can go back and forth between putting an "id" column, "nid" column, or "node_id" column in the database without breaking every module out there. If we expose that property as part of the API, then we cannot change that without either breaking the API or writing unnecessary glue code. (Really, we'd need to build in $this->id = $this->node_id? This is somehow better than just using the id() method in the first place? I think not.)

I see no convincing reason why developers can't be expected to use a self-documenting method. Yes, I *do* think we should enforce using the methods as there is a clear benefit (a real API interface) and a clear cost to not doing so.

Crell’s picture

Issue summary: View changes

Updated issue summary.

fago’s picture

If we expose that property as part of the API, then we cannot change that without either breaking the API or writing unnecessary glue code. (Really, we'd need to build in $this->id = $this->node_id? This is somehow better than just using the id() method in the first place? I think not.)

The property names are not part of the Entity API or interface. It are standards or if you will good practices to be used when defining our entity types - but they are not enforced in any way. So if you want to work with any entity, you have to use the methods of the interface.

Still, underlying to the method there needs to be the real property which is accessed and is stored in the database. This issue tries to establish the naming pattern for that. Whether this property then is public accessible or not is another discussion, which I think deserves its own issue. Let's not postpone this one for that - it has already enough bikeshed potential on its own ;)
I apologize for having that mixed in #42, I've removed that part from the issue summary. Let's discuss it over at #1301106: Rely on methods to access entity properties [policy, no patch].

fago’s picture

Issue summary: View changes

Updated issue summary.

fgm’s picture

By the same logic as #44 , we should probably also standardize matching methods uuid(), name() and label(), it seems.

Crell’s picture

Agreed entirely on #46. We want to encourage use of standardized methods over properties. They should probably all be added to the interface.

fgm’s picture

...which raises an obvious question: if we define the standard getters, who gets to define the setters ?

Granted, in "normal" use, these properties are read-only, but when creating instances, they have to be defined by the code somewhere in the class lineage. Do we want to let each class do it its way, or provide a modicum of standards here too ?

IceCreamYou2’s picture

I recently wrote a module that needed to extract the ID and created time from a $data object without knowing what type of object the $data was (D6). I had to do this:

/**
 * Attempt to extract the ID of an object.
 */
function example_get_id($data) {
  // See if any of a list of pre-determined IDs exists as a property of $the object.
  foreach (array('tid', 'cid', 'nid', 'sid', 'vid', 'fid', 'rid', 'aid', 'eid', 'oid', 'uid') as $key) {
    if (isset($data->$key)) {
      return $data->$key;
    }
  }
  // Look for a property that seems like it could be an ID.
  foreach ($data as $key => $value) {
    if ((drupal_strlen($key) < 5 && drupal_substr($key, -2) == 'id') || drupal_substr($key, -3) == '_id') {
      return $value;
    }
  }
}

/**
 * Attempt to extract the timestamp of an object.
 *
 * @param $data
 *   The object for which a timestamp should be retrieved.
 * @return
 *   A best guess at the timestamp for the relevant object.
 */
function example_get_timestamp($data) {
  // Practically everything uses either "timestamp" or "created." Userpoints uses time_stamp.
  foreach (array('timestamp', 'created', 'time_stamp') as $key) {
    if (isset($data->$key)) {
      return $data->$key;
    }
  }
  // If we don't find anything, return the current time.
  return time();
}

That's messy and gross, so I'm glad to see there's an effort to standardize this kind of thing.

I do want to raise one question: what should the standard be when an object needs to have two user IDs (or similar)? For example, both private messages and Facebook-style status messages have both a sending user and a recipient (which is also typically a user, but doesn't have to be). The current pattern is e.g. $pm->author and $pm->recipient, and $status->sender and $status->recipient, where ->author, ->sender, and ->recipient are all IDs. So my interpretation of the current standard is that these would become ->author_id, ->sender_id, and ->recipient_id, respectively, even though they all (typically) represent user IDs. Does that seem right? Does it make sense for our coding standards to require e.g. a ->sender_id() method?

fago’s picture

For entities we'll have methods, see #1184944: Make entities classed objects, introduce CRUD support. In short the patch there adds get($property) and set($property, $value) methods - as modules providing properties can't define methods on the entity class.

Anyway, methods or not, enforce them or not - that are all separate questions. Can we try to agree upon the naming pattern of properties first? Thanks! @see issue summary.

sun’s picture

Issue tags: +Entity system
bojanz’s picture

Okay, so I just went through this issue twice, and I'm a bit confused.

The proposal says:

- Defined properties of a classed object should use $lowerCamel. (This is already in coding standards.)

But then later on we talk about mapping columns directly from the db.
So currently we have a Comment entity class, and it has the columns explicitly defined.
If I have a Bojan entity class, and it has a first_name column, for which I then want to define the class property (following the core example), it would then be

public $first_name;

which breaks the quoted rule above. Or does that rule not apply to entity classes at all? If it doesn't, then the proposal should be clarified.
Cause having some some properties in one style and others in another ($isNew vs $first_name) would be really ugly.

I'm also against forcing people to use "id" as the id column. I think methods should definitely be used instead (id(), uuid(), label(), etc...).
Plus, this:

- Defined properties of a classed object that refer to the numeric unique identifier of that entity should be called id, i.e. $entity->id.

goes against the sentence above it:

- Classed objects that have a numeric unique identifier should provide access to that identifier via an id() method. The internal property name used for it is irrelevant for API purposes.

I was actually pondering forcing column names to be camelCase for a few minutes, but it's a lot of work and there's still the question of what to do with Field API fields (they would need to go in a array like $entity->fields['field_body'] which is ugly).

In one of these related issues someone noted how you need to be careful about column names not overlapping with defined class properties (for example, a column named "controller" if we decide we'll have $this->controller). A possible solution would be to mandate that internal (non-data) properties be prefixed with an underscore, so $this->_controller. I know it stinks of PHP4, but it would make matters clear.

Crell’s picture

The confusion stems from the fact that this is a thread to try and agree on a property naming pattern, which we have not yet done. :-) There is, I think, two schools of thought here:

1) Active Record: An entity object should be approximately an exact map to the structure in the data store. That is, column name => property name. It's essentially what D6 and D7 have for basic properties, but with a couple methods thrown on as well if one is so inclined. (D6 and D7 had no classes, so Active Record style was the only real option.) Since database tables are all lower_case, that means property names are lower_case.

2) Data objects: An entity object should be an abstracted API, with an emphasis on a clear and consistent interface rather than 1:1 mapping to the data storage. That is, column name => property name might happen, or might not, but that's not something someone calling entity_load() really cares or thinks about. In this case, using $entity->properties['property_name'] internally is totally fine, because someone calling the object won't be using the object properties anyway; they'll be using methods (id(), uuid(), etc.)

We still have not resolved that question. Disclaimer: I am firmly in the latter camp for reasons explained here and in other related threads, so I admit to a bias.

plach’s picture

plach’s picture

Issue summary: View changes

Updated issue summary.

fago’s picture

Issue summary: View changes

Updated issue summary.

fago’s picture

Issue summary: View changes

Updated issue summary.

fago’s picture

Issue summary: View changes

Updated issue summary.

fago’s picture

@bojanz:
Indeed, the proposal summary is a bit confusing. I've tried to improve the proposal based on how I understood the proposal:

- Defined properties of a classed object should generally use $lowerCamel (This is already in coding standards.), unless something else would be most natural (That is, if they map directly to a DB field, then use the name of the DB field.).
- Non-property variables use $underscore_names. (This is already in coding standards.)
- Dynamic property names should be avoided whenever possible in favour of defined properties as those are far easier to debug. (This is already in coding standards, in practice if not in letter.) Dynamic property names on objects should use whatever is most natural. (That is, if they map directly to a DB field, then use the name of the DB field.)

Does that make more sense?

Classed objects that have a numeric unique identifier should provide access to that identifier via an id() method as it is the case for entities anyway. The internal property name used for it is irrelevant for API purposes

Well, the internal name is irrelevant for the API, but for DX it still makes sense to call the internal property id and not e.g. my_cat ;) Still, if that point is confusing I think we could just remove it as it is enforced by entities right now anyway?

@crell #53:
Well, I think we somehow try to achieve both. We certainly want to go for the data objects. However still, it makes a good DX to use a natural mapping of entity properties to the stored data. So having that should be best-practice and so part of our best-practice property naming pattern, while it is not in anyway enforced or part of the API.

Crell’s picture

I am fine with a by-convention internal naming pattern for DX purposes. That makes a lot of sense. My main interest is ensuring that the external-facing interface (which is the one that breaks stuff if it changes or is violated) is consistent, abstracted, and method-centric, and that we don't follow the PHP 4 style of "throw random stuff onto $this and hope for the best".

fago’s picture

I fully agree with #56. It's really time to do away with throwing arbitrary stuff on entities :-)

So let's make sure the summary reflects this in a way it's clear for everybody. @bojanz: Does #55 help there?

bojanz’s picture

Yes, #55 is clear.

The point of my id comment was that if we're going to mandate $entity->id(), then it doesn't matter what the column is called. Contrib already kinda settled on "order_id", "message_id", etc, so I se no point in telling them that's wrong. It would be adding a silly recommendation with no real purpose.

fago’s picture

Issue tags: +policy

ok. Sounds like we have an agreement here? So I'm going to mark this RTBC (see summary).

bojanz’s picture

I think we should remove this:

- Defined properties of a classed object that refer to the numeric unique identifier of that entity should be called id, i.e. $entity->id.

I understand that it isn't a requirement for naming sql columns, but it's still confusing.

In fact, at the top we need to define what a defined and what a dynamic property is. The current text is hard to parse because that is missing.

sun’s picture

Issue tags: +UUID
andypost’s picture

Suppose better to follow id() label() bindle() mostly over the code - api freeze is near
Commited patch introduces a mess of variations #1818560-88: Convert taxonomy entities to the new Entity Field API

andypost’s picture

according #1391694-93: Use type-hinting for entity-parameters we use Interfaces so methods

Berdir’s picture

Yes, could should the methods whenever possible and correct (see my response in the taxonomy term ng conversion issue when it's for example *not*). We still need properties and a naming patttern for them.

jibran’s picture

Title: Agree on a property naming pattern » [Policy, no patch] Agree on a property naming pattern

Is this still under discussion?

Berdir’s picture

Tagging.

Berdir’s picture

Issue summary: View changes

Updated issue summary.

mgifford’s picture

This issue has been open for almost 3 years. What else needs to be discussed?

@bojanz - has @andypost addressed your concern? Can this move back to RTBC?

jhodgdon’s picture

Project: Drupal core » Drupal Technical Working Group
Version: 8.0.x-dev »
Component: base system » Code

Coding standards changes are now part of the TWG

mgifford’s picture

Can we have a time estimate on this? Would be great to know when the TWG will be deciding this.

jhodgdon’s picture

Rumor has it that the TWG is starting to meet. One of their first priorities will be to establish a formal Procedure for how to make coding standards changes, which may or may not involve this issue queue.... Patience!

mgifford’s picture

Thanks @jhodgdon - a good reminder.

tizzo’s picture

Project: Drupal Technical Working Group » Coding Standards
Component: Code » Documentation
Issue tags: +needs announcement for final discussion

Moving this issue to the Coding Standards queue per the new workflow defined in #2428153: Create and document a process for updating coding standards.

Also, I'm marking for final discussion as it appears this is ready.

pfrenssen’s picture

Just had a look at this. This seems mostly good to go, but I think a small update is needed to bring it into line with the current best practices:

In the years we have been developing D8 we've gotten a lot better at OO practices. We used to populate things like DB fields directly as a property on an entity, but we recognized this as a bad practice. Currently these are stored in a protected $entity->fields property so they don't pollute the root of the object, and potentially cause collisions. Even further, we now generally discourage to access properties directly, we now encourage to use the API instead: $entity->get('field_name');.

With our current understanding, if any data with a non-standard naming scheme is stored on your object, then you can put it in an array property like $my_object->myNonConformingData and provide a getter and setter.

So there is no actual need to have an exception for properties with a non-standard naming scheme. This means that we can remove these disclaimers:

"[Properties should use lowerCamel format] unless something else would be most natural (That is, if they map directly to a DB field, then use the name of the DB field.)."

Not having these disclaimers would make our standard less ambiguous.

fgm’s picture

pfrenssen, do you have a pointer to where "we" (who ?) recomment using entity->get() over magic properties ? I just heard the opposite (I guess from yched ?).

andypost’s picture

@fgm magic methods historically works slower also has different implementation #2563981: ContentEntityBase get and __get() works differently

tizzo’s picture

Status: Needs review » Needs work
Issue tags: -needs announcement for final discussion +Needs issue summary update

The coding standards committee reviewed this issue and we feel that we need a concrete and specific proposal of exactly what language should change in exactly which sections of the current published coding standards.

Marking needs work.