Right now, the term "property" is still overloaded. We generally refer to an entity property as a property, but the things contained in TypedData structures are also called properties. That results in some confusing methods in the property item classes, e.g. get() there takes a $property_name. It means the name of one of the item values, but having "property" in there makes one think of entity properties. This is especially so because entity properties are the mental context: as a developer you are doing something with entity properties, so in that context "property" lets you think of an "entity property".

That said, I think we should find another term for the things contained in data structures.

Comments

effulgentsia’s picture

I had also shared this concern up until last week. However, I looked around at what terminology other ORM systems use, and found oData to map most closely to our model. Quoting from http://www.odata.org/documentation/overview#EntityDataModel:

The central concepts in the EDM are entities and associations. Entities are instances of Entity Types (for example, Customer, Employee, and so on) which are structured records consisting of named and typed properties and with a key. Complex Types are structured types also consisting of a list of properties but with no key, and thus can only exist as a property of a containing entity or as a temporary value.

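I'm currently leaning towards making TypedData API use the same terminology as oData's EDM: i.e., use "property" to refer to both properties of entities and properties of non-entity structures. Additional terminology changes that would bring TypedData closer to oData's EDM (at least to my understanding, correct me if I'm wrong):

WrapperInterface -> PropertyInterface (or TypeInterface?)
StructureInterface -> ComplexTypeInterface
ListInterface -> CollectionInterface (in prior oData proposals, this was called Bag, but got renamed to Collection (http://social.msdn.microsoft.com/Forums/en-US/adodotnetdataservices/thre..., http://www.odata.org/media/30002/OData.html#updateacollectionproperty))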

Additionally, oData currently uses SimpleType and Primitive interchangeably, but there's a proposal to differentiate them to allow new "SimpleType"s (scalars) to be made by extending (but not composing) primitives. Even though this isn't part of the oData standard yet, I think the concept (and terminology) is something we might want in TypedData API.

Anyway, not sure if all this is appropriate for this issue. Please move it to other issues as you see fit, but what do you think about using oData terminology?

As a side note, there's also an oData PHP library, but it's geared towards mapping specific business object classes to the oData model rather than what TypedData API and EntityProperty API are doing, which is creating standardized data object interfaces and base classes. However, there might be some code there that's of interest to us.

fago’s picture

I had also shared this concern up until last week. However, I looked around at what terminology other ORM systems use, and found oData to map most closely to our model. Quoting from http://www.odata.org/documentation/overview#EntityDataModel:

Interesting, thanks. I think this comes pretty close to what we do here.

As a side note, there's also an oData PHP library, but it's geared towards mapping specific business object classes to the oData model rather than what TypedData API and EntityProperty API are doing, which is creating standardized data object interfaces and base classes. However, there might be some code there that's of interest to us.

Yep, I agree it looks like it's more about mapping. A point that makes our implementation quite different, I think, is extensibility: we allow the extension of data types by implementing your own wrappers, while oData doesn't. For oData that's obviously a no-go as they need a portable standard, while we can leverage what PHP gives us.

Additional terminology changes that would bring TypedData closer to oData's EDM (at least to my understanding, correct me if I'm wrong):

WrapperInterface -> PropertyInterface (or TypeInterface?)
StructureInterface -> ComplexTypeInterface
ListInterface -> CollectionInterface (in prior oData proposals, this was called Bag, but got renamed to Collection (http://social.msdn.microsoft.com/Forums/en-US/adodotnetdataservices/thre..., http://www.odata.org/media/30002/OData.html#updateacollectionproperty))

I'm not sure that the comparison flies:

  • First off, PropertyInterface instead of WrapperInterface is where it all started, but in our case a wrapper may stand alone as well. oData seems to require a primitive to be wrapped in an entity; we don't want that restriction for the typed data API. Thus wrapper == property does not fly, as we could wrap a string on its own and we'd have no property. Use cases for that are e.g. describing menu arguments, or take Rules and describe some event variables, which could include strings.
  • Then, I'm not in favour of StructureInterface -> ComplexTypeInterface, as ComplexTypes are known from XML Schema, where they are anything except simple types (makes sense, no?). But this includes "lists", which are just a complex type with a restriction of sequence. A good overview of complex types is provided by http://www.w3schools.com/schema/schema_complex.asp.
  • ListInterface -> CollectionInterface could work, but at DrupalCon it seemed that List was generally well received and preferred over Collection, as it makes it clearer that it's not a structure. Another term also used by oData that would work is "Set", but I still think List is better understood.

@Structures: The following oData sentence makes me think that our term fits quite well:

Complex Types are structured types also consisting of a list of properties but with no key, and thus can only exist as a property of a containing entity or as a temporary value.

(with "key" meaning identifier in oData context).

So this makes me think we should improve our docs from

Interface for data structures that contain properties.

to

Interface for data structures containing a list of named properties.

?
and the List interface docs

Interface for a list of typed data.

to

Interface for an ordered list of unnamed items.

?

Additionally, oData currently uses SimpleType and Primitive interchangeably, but there's a proposal to differentiate them to allow new "SimpleType"s (scalars) to be made by extending (but not composing) primitives. Even though this isn't part of the oData standard yet, I think the concept (and terminology) is something we might want in TypedData API.

Yes, we already have the possibility to define primitives that map to pre-defined primitives. So we use "primitives" like "SimpleType" and pre-defined primitives for the built-in ones. I think it's fine to go that way and not use SimpleType as long as we do not use ComplexType as well. We could start using "ComplexType" for all non-primitives, but I don't see much value in introducing yet another term that people need to understand if we do not have a requirement and/or interface for it.

fago’s picture

Oh, Set does not really fit, as a set may not contain duplicates, but we allow them. (Think of an entity reference referencing the same node multiple times, or a multi-value integer field containing the same integer twice.)

fago’s picture

OK, I researched common terms further using Wikipedia.

Going by the Wikipedia data structure article, though, I think "structure" is the wrong term as well:

In computer science, a data structure is a particular way of storing and organizing data in a computer so that it can be used efficiently.[1][2]

and

An array data structure stores a number of elements of the same type in a specific order. They are accessed using an integer to specify which element is required (although the elements may be of almost any type). Arrays may be fixed-length or expandable.

This does not map at all to our notion that a list is not a data structure. Even while writing this it makes no sense to me, ouch.

Record (also called tuple or struct) Records are among the simplest data structures. A record is a value that contains other values, typically in fixed number and sequence and typically indexed by names. The elements of records are usually called fields or members.
A hash or dictionary or map is a more flexible variation on a record, in which name-value pairs can be added and deleted freely.

Yes, map or hash does not fit, as our name-value pairs cannot be added and deleted freely. Record or struct would fit.

Note this sentence:

The elements of records are usually called fields or members.

I find the usage of the term "element" there interesting as well, as I was already thinking about that one. Field is a no-go; member, I'm not sure.

A further interesting wikipedia article is this one:

Structure       Stable  Unique  Cells per Node
Bag (multiset)  no      no      1
Set             no      yes     1
List            yes     no      1
Map             no      yes     2

"Stable" means that input order is retained.

This table tells me that our usage of "List" fits perfectly.
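
As a rough illustration of the table's "Stable" and "Unique" columns, here is a minimal PHP sketch using plain arrays. The variable names and sample values are made up for illustration, and it says nothing about how ListInterface is actually implemented.

```php
<?php
// Hypothetical values of a multi-value integer field, in input order.
$items = [7, 3, 7];

// List semantics: input order is retained and duplicates are allowed.
$list = array_values($items);

// Set semantics: values are unique, so the second 7 is dropped.
$set = array_values(array_unique($items));

// Map semantics: keys are unique, so assigning to an existing key
// overwrites the earlier value instead of adding a second entry.
$map = [];
$map['a'] = 1;
$map['a'] = 2;

var_dump($list, $set, $map);
```

Only the list keeps both occurrences of 7, which is the behaviour we need for things like an entity reference pointing at the same node twice.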

Then, most interesting, note the list starting with "Composite types":

Composite types

Array
Record (also called tuple or struct)
Union
Tagged union (also called a variant, variant record, discriminated union, or disjoint union)
Plain old data structure

Looks like this fits our usage of "structures". The article reveals:

In computer science, a composite data type is any data type which can be constructed in a program using its programming language's primitive data types and other composite types. The act of constructing a composite type is known as composition.

Oh yes, that's what we are doing. The rest of the article is mostly about "struct"s. So would "composite type" be a term we could use? But composite type just describes the type, not the interface :/ Or better, just StructInterface?

fago’s picture

http://en.wikipedia.org/wiki/Record_%28computer_science%29 says

In computer science, records (also called tuples, structs, or compound data)[1][page needed] are among the simplest data structures. A record is a value that contains other values, typically in fixed number and sequence and typically indexed by names. The elements of records are usually called fields or members.

Note "A record is a value that contains other values". So I think going back to ContainerInterface and container would be a good move.

fago’s picture

Then, I'm not in favour of StructureInterface -> ComplexTypeInterface, as ComplexTypes are known from XML Schema, where they are anything except simple types (makes sense, no?). But this includes "lists", which are just a complex type with a restriction of sequence. A good overview of complex types is provided by http://www.w3schools.com/schema/schema_complex.asp.

Fortunately, further discussion and investigation revealed that this is wrong. Lists are simple types as well, so we could go with "complex types" instead of structures.

What stays though is that
$entity->property instanceof ComplexTypeInterface
does not really fit. ComplexDataInterface could be an option.

Also, the question remains what the elements contained in a ComplexTypeInterface should be called. oData uses properties; XML Schema uses elements, as XML already uses elements.

effulgentsia’s picture

oData uses "element" to refer to items in a Collection, as distinct from "properties" which are *named* items within a ComplexType.

#4 and #5 are good resources, but the problem is, depending on which schema language, ORM, language compiler, DBMS, etc. system you look at, you find slightly different terminology for these very foundational data structuring concepts. So, I suggested that we find a particular system that most closely overlaps our problem space, and use its terminology, rather than mixing and matching from different places.

I continue to think that oData's EDM is the system that most closely overlaps what we're trying to do, so unless someone finds a better matched model, I'll keep pushing us to adopt its terminology where it makes sense, including using the word "property" for both named items within ComplexTypes and named items within Entities. Really, an Entity is just a ComplexType that also has a key/id.

Where oData falls short though is that it's more of a schema / mapping model, not a set of interfaces and base classes. So, when it comes to naming the interface (e.g., ComplexTypeInterface), the terminology might not feel quite right, as in $node->body[0] instanceof ComplexTypeInterface might be awkward if we think of $node->body[0] as data that has a type, not data that is a type. This needs a bit more thought. Anyone with ideas on this?

fago’s picture

I tend to think that doing interfaces like "ComplexTypeInterface" would be wrong. E.g. we have EntityInterface, not EntityTypeInterface. So the class Node is an entity type, but the terms used to describe the classes and interfaces commonly refer to the instantiated nouns, not the types. That's why we use EntityInterface or StreamWrapperInterface, not StreamWrapperTypeInterface.

That said, I think we should find a term for the instance of a complex type and use that as interface name. ComplexDataInterface - it's data of a complex type.

I continue to think that oData's EDM is the system that most closely overlaps what we're trying to do, so unless someone finds a better matched model, I'll keep pushing us to adopt its terminology where it makes sense, including using the word "property" for both named items within ComplexTypes and named items within Entities. Really, an Entity is just a ComplexType that also has a key/id.

I agree that, from what I've seen, oData's EDM data model most closely overlaps with what we are doing. Maybe we should just get used to referring to the "fid" of $node->field_image->fid as a property of the image item, *and* so differentiate the term "property" in general from "entity property", which refers specifically to the properties of an entity.

Really, an Entity is just a ComplexType that also has a key/id.

Yeah, that's why the EntityInterface will extend the StructureInterface or however it will be called :-)

dixon_’s picture

I don't think we should differentiate the names of entity properties and properties contained in structures. It's an essential feature of the Typed Data API that we should rather embrace, IMO. I think that's important.

Regarding renaming StructureInterface, I don't see the point of semi-adopting the oData terminology since we are not using oData. I think StructureInterface is pretty clear already, and don't see an immediate need of renaming it.

ListInterface -> CollectionInterface could work, but at DrupalCon it seemed that List was generally well received and preferred over Collection, as it makes it clearer that it's not a structure.

I agree with the discussions we had at DrupalCon, I'm in favor of staying with ListInterface. It's important to make it clear it's not a structure.

fago’s picture

@StructureInterface:
Yeah, but at DrupalCon concerns were already raised that StructureInterface sounds like it includes Lists as well. After researching Wikipedia (see #4) I have to agree with that: StructureInterface sounds like data structures, which would include Lists. The C "struct" is really the only thing that matches, but that's "struct", not structure. I really think we should rename it to avoid confusion here.

Regarding Collection vs List, I could see Collection as a possible way to go too, but I also think that List is clearer. That was clear in the discussions at DrupalCon as well, and according to the definitions on Wikipedia it's exactly what we are talking about: an ordered, non-unique list of things (possibly containing duplicates).

fago’s picture

As discussed during the skype call, we want to go with the following terms for now:

TypedDataInterface and TypedData (no base)
ComplexDataInterface
ListInterface

fago’s picture

For the term "property" we concluded that keeping it makes sense, as this is what oData and others use as well. So we have ComplexData containing properties, such as an entity that contains properties.
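
A minimal sketch of what this terminology implies, assuming hypothetical example classes. Only the interface names TypedDataInterface and ComplexDataInterface and the get($property_name) method come from this issue; getValue(), id(), ExampleComplexData and ExampleEntity are assumptions made up for illustration and are not the code pushed to entity-property.

```php
<?php
// Sketch only: method names other than get() and both Example* classes
// are illustrative assumptions, not the actual API.

interface TypedDataInterface {
  public function getValue();
}

// Complex data: a value that contains named properties.
interface ComplexDataInterface extends TypedDataInterface {
  public function get($property_name);
}

class ExampleComplexData implements ComplexDataInterface {
  protected $properties;

  public function __construct(array $properties) {
    $this->properties = $properties;
  }

  public function get($property_name) {
    if (!array_key_exists($property_name, $this->properties)) {
      throw new InvalidArgumentException("Unknown property: $property_name");
    }
    return $this->properties[$property_name];
  }

  public function getValue() {
    return $this->properties;
  }
}

// An entity is complex data that additionally has an identifier.
class ExampleEntity extends ExampleComplexData {
  protected $id;

  public function __construct($id, array $properties) {
    parent::__construct($properties);
    $this->id = $id;
  }

  public function id() {
    return $this->id;
  }
}

// "field_image" is a property of the entity; "fid" is a property of the item.
$image_item = new ExampleComplexData(['fid' => 42, 'alt' => 'Logo']);
$node = new ExampleEntity(1, ['title' => 'Hello', 'field_image' => $image_item]);
$fid = $node->get('field_image')->get('fid');
```

Both the entity and the image item answer get() with a "property"; the entity only adds an id on top, which matches the oData view that an Entity is a ComplexType with a key.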

fago’s picture

Status: Active » Fixed

Implemented and pushed to entity-property. I've not renamed drupal_wrap_data() yet, as it will go away anyway while doing #1732724: Implement data types as plugins.

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.