Attempting to split #1498720: [meta] Make the entity storage system handle changes in the entity and field schema definitions into some logical chunks. Here's part 1.
Problem/Motivation
The Field SQL Storage API is currently marked as private. The rationale is that in D7 fields are storage-agnostic so using that API in contrib modules introduces storage-specific code that is going to break as soon as a different storage is used. However this does not account for cases where assuming a SQL storage is legitimate for various reasons. The most obvious example is Views, which has an architecture that supports alternative storages, but which is heavily "biased" toward SQL. Another typical situation is custom code perfectly knowing which storage is being used, and relying on that information to achieve better performance in certain cases by querying directly the database. Both of these cases rely on an API that is unlikely to change but isn't guaranteed not to.
Proposed resolution
Make the Field SQL Storage API a public API, that is confined to the SQL-specific part of the core entity storage API.
This moves the following methods to the (default) table mapping implementation, which is responsible for representing how fields are mapped to tables:
ContentEntityDatabaseStroage::_fieldColumnName()
-> TableMappingInterface::getFieldColumnName()
FieldStorageConfig::getReservedColumns()
-> TableMappingInterface::getReservedColumns()
ContentEntityDatabaseStorage::_fieldTableName()
-> DefaultTableMappingInterface::getDedicatedDataTableName()
ContentEntityDatabaseStroage::_fieldRevisionTableName()
-> DefaultTableMappingInterface::getDedicatedRevisionTableName()
A new DefaultTableMappingInterface is introduced to deal with the specific implementation for core storage: other storage classes might want to provide completely different ways to deal with SQL tables and implement completely different table layouts (see #2274017: Make SqlContentEntityStorage table mapping agnostic ). In that case the concepts of dedicated tables would be meaningless.
Remaining tasks
Reviews
User interface changes
None
API changes
None, just additions.
| Comment | File | Size | Author |
|---|---|---|---|
| #31 | entity-field_sql_api-2326719-31.patch | 68.5 KB | plach |
| #31 | entity-field_sql_api-2326719-31.interdiff.txt | 14.41 KB | plach |
| #27 | entity-field_sql_api-2326719-27.patch | 69.77 KB | plach |
| #7 | interdiff.txt | 544 bytes | effulgentsia |
| #7 | entity_schema-part1-table-mapping-6.patch | 69.69 KB | effulgentsia |
Comments
Comment #2
effulgentsia commentedReverted some changes that are out of scope for this issue.
Comment #3
effulgentsia commentedAnd a couple more.
Comment #4
effulgentsia commentedSorry for the noise. I think these are the final reversions. The rest of the patch looks completely in scope for this issue title to me.
Comment #5
effulgentsia commentedThat @private is on all 3 functions. It just got copy-pasted as part of the move, but if we're putting these methods into an interface, does that mean we should also remove the @private? If not, should we remove them from the interface? Tricky, since hook_views_data() implementations need to call these, but the comment is correct that most other code should use the entity API and not deal with these directly.
Comment #7
effulgentsia commentedComment #8
yched commentedSeparating those changes out of #2274017: Make SqlContentEntityStorage table mapping agnostic makes a lot of sense IMO.
#2183231: Make ContentEntityDatabaseStorage generate static database schemas for content entities comments #365 - #368 and #1498720: [meta] Make the entity storage system handle changes in the entity and field schema definitions had discussions about refactorings in the EntityStorage / SchemaHandler / TableMapping area, that back then were deferred to #2274017: Make SqlContentEntityStorage table mapping agnostic .
Since this issue here is now where the refactoring happens, it would be worth rechecking the items in the issues / comments mentioned above ?
Comment #9
yched commentedNot sure I get the relation / differences between TableMappingInterfaces and DefaultTableMappingInterface, and what makes us sure that the mapping we get here (and in other places in the patch that force-cast to DefaultTableMappingInterface) really is a "Default" one on which we can safely call the "Default" specific methods ?
More generally, this split issue could use a summary / overview of its own ;-)
Comment #10
effulgentsia commentedFrom plach in #1498720-160: [meta] Make the entity storage system handle changes in the entity and field schema definitions answering that question when I asked it there:
And my response in #1498720-161: [meta] Make the entity storage system handle changes in the entity and field schema definitions:
Now, I did not carry over into this issue the allowsSharedTableStorage() and requiresDedicatedTableStorage() methods from that issue, because those are new and should be evaluated there along with what needs to call them. I only carried into this issue: getDedicatedDataTableName(), getDedicatedRevisionTableName(), and the ones in the above block quote. With hook_views_data() implementations being the most common caller of those methods outside of the entity component itself. We have #2079019: Make Views use SqlEntityStorageInterface open for that, but AFAICT, not a whole lot of traction there. However, does Views need to care whether these tables are "dedicated" or not? Or does it only need to know the table that a field is in? In which case, should we move these methods from DefaultTableMappingInterface into TableMappingInterface, and rename them to
get(Field?)DataTableName()andget(Field?)RevisionTableName()? In which case, we wouldn't be introducing a DefaultTableMappingInterface in this issue, but may in the parent issue for code that needs to call allowsSharedTableStorage()/requiresDedicatedTableStorage(), at which point we could also come up with a better name than DefaultTableMappingInterface to convey that particular layout strategy concept.@plach: thoughts?
I can take a stab at that tomorrow, but wasn't involved in all of the history that led to this, so it would be great if someone who has can contribute to that as well.
Comment #11
plachThe optimal scenario would be Views being able to rely on the table mapping, without making assumptions on the implemented table layout. This is what I originally envisioned and what made me initially think to the table mapping data structure. And, yes, in this case Views would only need to know in which table a particular field is stored, as join information can probably be derived by PK/FK data, which for entities is known in advance.
We already have a method in the table mapping which lists all the fields stored in a particular table
TMI::getFieldNames(). I guess this could be enough, although an inverse method could be useful too. The two above, even renamed, still assume we have dedicated tables for fields, which might not be the case for table layouts different from the one adopted by core. So I wouldn't move them in the base interface.Better names for that interface (and class) are totally welcome, I just hadn't the time/will to think to something better. The idea is that those describe the default (core) table layout. There is nothing special/characteristic about it aside from the fact that it's our usual (default?) choice for SQL storage.
Comment #12
plachBtw, the issue where Views integration will be rewritten to exploit table mappings is probably #1740492: Implement a default entity views data handler.
Comment #13
plachUpdated the issue summary
Comment #14
plachI will perform these changes as soon as the new branches are set-up:
As @eff was pointing out, it makes sense to move these two to the parent class as they provide sufficiently general concepts.
As @eff was pointing out, we should remove the @private tags, but we should keep the warnings about this being SQL-specific, although text should be slightly adjusted to be less intimidating :)
Comment #15
plachComment #16
plachComment #17
plachComment #18
effulgentsia commentedThanks for the issue summary!
Why would they assume that? If the method was named TMI::getFieldDataTableName(), and it's passed a FSDI, can't the DTM implementation of the method figure out if it's dedicated or shared, and return the correct name accordingly, while some other implementation can do whatever is appropriate based on its something-unrelated-to-dedicated-vs-shared strategy? Why are we making the caller have to know the layout strategy?
Comment #19
plachOh, sorry, now I get what you mean. The Field part in the method name, combined with Data/Revision, somehow misleaded me. What you are proposing makes sense to me: although there might be table layouts where only revisions are stored, so we would have no distinction between data and revision tables. However the concept of revision is baked deep in enough in our API, that I guess it shouldn't be harmful if it went in the base interface: we could always return the same value for both
TMI::getField*TableName()methods.Comment #20
yched commented+1 on #18, that looks flawed.
[edit: crosspost with @plach - does #19 mean DefaultMappingInterface goes away ? Yay if so]
Comment #21
plachWe could be able to get rid of it here, but as @eff has mentioned, more methods exist on it in the parent issue, so I am not completely sure we will be able to generalize them.
An addition to #19:
An alternative could be having a single method returning an array of tables where a field is stored (
nidis stored in 4 tables by default) and leaving more specialized methods to sub-interfaces.Comment #22
plachWhat is concerning me about baking in the base interface concepts that are too close to our specific implementation, is that API consumers (including us) might forget there might be alternative implementations and write code that is not generic enough. This is why I initially felt the need for a sub-interface for our logic.
For example, if we implement #19 then someone might retrieve the two table names and not account for the fact that they might be equal, possibly causing all sorts of breakage in that scenario.
Comment #23
yched commentedHmm - isn't that exactly the problem with the current patch ? See #9.
Comment #24
effulgentsia commentedWhat if we address both of those by naming the methods
getPrimaryDataTableName()andgetPrimaryRevisionTableName()? Those are the ones I think we should have on the main interface since it's what Views needs. As to additional methods to return all the tables for a field, we already have in HEAD TMI::getTableNames() and TMI::getFieldNames($table_name), so edge case callers can do their own iteration if the really need to, so I suggest we hold off on adding getAllTableNames(FSDI) until we have a solid use case for it.That would be a bug with the caller and should be fixed there.
I think that's fine, because once we're clear on what method names really are strategy-specific and what example callers of them are, we can design and name accordingly.
Comment #25
plachDisclaimer: I am not trying to keep my implementation at all costs, I am just trying to make sure I express all the thoughts I put behind it. I am totally fine with going with @eff's proposal if there's consensus that it's better.
Well, all the current Field-Views integration code is currently assuming our core SQL storage is in place, so it's perfectly fine to assume the default table mapping is returned. This is going to change in #1740492: Implement a default entity views data handler, and probably in that class and sons we shouldn't rely on methods belonging to DTMI.
I am not sure the final version of the Entity–Views integration code will need even those methods. What about leaving them where they are for now, that is leaving the core-specific version on DTMI and see how needs evolve there? If it turns out we need them, moving them in the base interface can be done without API breaks with minimal BC code, but if instead we don't actually need them, we could avoid exposing the concept of revision table in the base interface, which still feels a bit wrong to me. AAMOF:
I agree that it would be the caller's fault, but IMHO we wouldn't be providing the most intuitive DX possible here.
Comment #26
effulgentsia commentedDisclaimer: I wrote this up before reading #25, but am posting it as-is, and we can reconcile in additional comments.
Ok, so let's maybe think through what kinds of implementation choices we want to allow vs. what we want callers to be able to rely on. Here's some of my initial thinking on that:
I'll post another comment with some method suggestions to represent the above, but posting this first in case anyone wants to give feedback on it in the meantime.
Comment #27
plach#26 looks promising :)
Posting a new patch to address #14 meanwhile.
Comment #28
effulgentsia commentedIgnoring #26.5, here's what I think TMI could look like:
And if we want to support #26.5, then we would need to change getPrimaryTableName() to something like:
Comment #29
effulgentsia commentedRe #25, I'd also be okay with punting #28 to a separate, post-beta issue, but if we think something like that is at least on the right track, then perhaps for this issue, we could ditch DefaultTableMappingInterface, and leave specific strategy methods on DefaultTableMapping, without interface definitions, and let callers continue to assume they have that specific implementation class to interact with, until we get TMI finished.
Comment #30
plach#28 looks correct and comprehensive to me, my only doubt is whether we would actually need all those methods.
#29 works for me :)
Comment #31
plachThis implements #29.
Comment #32
effulgentsia commented#31 looks great to me.
Comment #34
webchickLooks straight-forward to me. Lots of the same changes made lots of times. :)
Committed and pushed to 8.x. Thanks!
Comment #35
xjmRetroactively tagging as part of #1498720: [meta] Make the entity storage system handle changes in the entity and field schema definitions.
Comment #37
plachCreated #2401107: Consider generalizing DefaultTableMapping to discuss #26.