Problem/Motivation

One of the main goals of the D8 Entity API is making entities storage-agnostic.

Proposed resolution

#1497374: Switch from Field-based storage to Entity-based storage is going to allow us to make entity storage swappable, as the Entity Storage controller will have full responsibility over field data. The big missing piece is entity querying: currently entity-defining modules use hardcoded SQL queries to retrieve sets of entities and often retrieve partial data. This does not make sense as this way the Entity API is completely bypassed. Instead we need to convert all entity SQL queries to the Entity Query API and fully load entities whenever we need their data.

This will probably have a performance impact, but we can leverage entity caching and lazy loading as mitigating strategies, once only the Entity API is used to access entity data. Morover these can be implemented as internal refactorings without breaking the public API.

As a side note, there is consensus that once the Entity Storage API is complete and core code has adopted it, we can still change the underlying SQL schema without affecting contrib modules relying on the API to access entity data. This will allow for further refactorings, such as improving multilingual support for entity types still missing it.

Remaining sub-tasks

None

Completed sub-tasks

Module Issue
aggregator #1957312: Use the entity storage controller in aggregator module
aggregator #2041083: Move database query out of \Drupal\aggregator\FeedFormController
forum #2205185: Split up ForumManager to create ForumIndexStorage
node #2068333: Convert node SQL queries to the Entity Query API
taxonomy #2068337: Convert taxonomy term SQL queries to the Entity Query API
file #2068343: Convert file SQL queries to the Entity Query API
user #2068329: Convert user SQL queries to the Entity Query API
menu #2068349: Convert menu link SQL queries to the Entity Query API
book #2225283: Make Book storage independent
comment #2068331: Convert comment SQL queries to the Entity Query API

Related tasks

Comments

plach’s picture

I marked this as critical as I think we cannot ship without it, but feel free to change the priority if this is not actually true :)

I am now creating the actual issues.

plach’s picture

Issue summary:View changes

Added actual issue links.

plach’s picture

Issue summary:View changes

Updated issue summary.

plach’s picture

Issue summary:View changes

Updated issue summary.

chx’s picture

I think we can skip the search extender -- it'd be rather complicated to convert that one and everyone who uses nonsql uses solr or sphinx or searchapi or something.

chx’s picture

Also note this is probably postponed on #2068655: Entity fields do not support case sensitive queries

catch’s picture

Priority:Critical» Major

This feels major in the sense I'd happily backport dbtng -> entity query conversions to point releases. On the other hand it'd be great to have 8.x-1.0 releases of alternative storage backends that actually work for all core entity queries - but downgrading until this blocks one of them.

Gábor Hojtsy’s picture

I think so long there are direct queries to entities in core like that, we cannot say the storage is swappable.

Gábor Hojtsy’s picture

Issue summary:View changes

Updated issue summary.

slashrsm’s picture

While working on comments conversion some questions appeared, which I think need to be discussed:

1. How do we decide which storage controller does something belong to. There are definitely many cases where this is not a question at all (most recent comments, most recent nodes, ...), but there are some "mixed" cases where I was not completely sure (example: get N most commented nodes). This kind of operation will return nodes, while it will definitely need to query against comment tables. It is a bit strange to return nodes from comment storage controllers and is also not very nice to query comment tables from node storage controller. I assume there will be some more cases like this and I think we should define general guidelines for this cases to prevent confusion and inconsistency in API.

2. Code that will use storage controllers to get some data sometimes expects entity ids and sometimes entirely loaded entities. We should probably define how we solve that, as it could also bring confusion. One option is to define naming conventions for storage controller functions that will reflect type of data we're returning. Another way to go would be to support both ways, which would be controlled with an argument.

slashrsm’s picture

While working on comments conversion some questions appeared, which I think need to be discussed:

1. How do we decide which storage controller does something belong to. There are definitely many cases where this is not a question at all (most recent comments, most recent nodes, ...), but there are some "mixed" cases where I was not completely sure (example: get N most commented nodes). This kind of operation will return nodes, while it will definitely need to query against comment tables. It is a bit strange to return nodes from comment storage controllers and is also not very nice to query comment tables from node storage controller. I assume there will be some more cases like this and I think we should define general guidelines for this cases to prevent confusion and inconsistency in API.

2. Code that will use storage controllers to get some data sometimes expects entity ids and sometimes entirely loaded entities. We should probably define how we solve that, as it could also bring confusion. One option is to define naming conventions for storage controller functions that will reflect type of data we're returning. Another way to go would be to support both ways, which would be controlled with an argument.

chx’s picture

> How do we decide which storage controller does something belong to. There are definitely many cases where this is not a question at all (most recent comments, most recent nodes, ...), but there are some "mixed" cases where I was not completely sure (example: get N most commented nodes).

That's definitely tricky but again it's clearly comment: SELECT nid FROM comments GROUP BY nid  ORDER BY COUNT(*) DESC -- it does return nodes but it's a comment query.

> Code that will use storage controllers to get some data sometimes expects entity ids and sometimes entirely loaded entities. We should probably define how we solve that, as it could also bring confusion.

Yes, there is some confusions around this. Given that we can always extract $ids from $entities with array_keys; I think we should standardize + fix those that pass $ids.

chx’s picture

Issue summary:View changes

Added one more aggregator issue

plach’s picture

Issue summary:View changes

Updated issue summary.

plach’s picture

Issue summary:View changes

Updated issue summary.

plach’s picture

Issue summary:View changes

Updated issue summary.

plach’s picture

plach’s picture

plach’s picture

vijaycs85’s picture

Issue summary:View changes

Updating issue summary with sub-tasks.

plach’s picture

Issue summary:View changes
sun’s picture

larowlan’s picture

Issue summary:View changes

Added #2205185: Split up ForumManager to create ForumIndexStorage at request of @andypost and @chx
This will move forum SQL queries from forum.module into ForumRepository

slashrsm’s picture

fgm’s picture

plach’s picture

Issue summary:View changes
plach’s picture

Issue summary:View changes
slashrsm’s picture

Issue summary:View changes
slashrsm’s picture

Issue summary:View changes
plach’s picture

Issue summary:View changes
plach’s picture

Issue summary:View changes
plach’s picture

Issue summary:View changes
plach’s picture

naveenvalecha’s picture

Issue summary:View changes

Updating issue summary.Moved node #2068333: Convert node SQL queries to the Entity Query API and forum #2205185: Split up ForumManager to create ForumIndexStorage under completed sub tasks list.

plach’s picture

Issue summary:View changes
andypost’s picture

looks all child issues are fixed, can we close this one

chx’s picture

Berdir’s picture

It is, but that's actually only related, not really a sub-issue. That is just moving a query that already is an entity query out of the storage class, because there is no reason for it to be there, as it is fully storage-agnostic.

So +1 to closing this.

plach’s picture

Issue summary:View changes
Status:Active» Fixed

Sounds good, thanks everybody!

Status:Fixed» Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.

Gábor Hojtsy’s picture

Issue tags:-sprint