Updated: Comment #N
Tagged as critical because we can't ship beta with things how they are
Problem/Motivation
Entity ids no longer have to be numeric.
{comment} stores entity ids as integers.
Comment stores entity statistics in {comment_entity_statistics} table but entity_id in there is assumed to be integer.
comment_entity_load handles loading up data from {comment_entity_statistics} table and adding to the CommentItem field item when attached to a node (allowing $node->commentfield->comment_count etc. This is outside the normal field storage stack.
#2195915: Cannot save text filter config using PostgreSQL if Comment is enabled [blocks installation!] manifests this problem
Proposed resolution
Approach 1
Drop {comment_entity_statistics} and store the statistics in the field tables.
Comment's various entity hooks ensure that data is in sync (write/updates), delegated to CommentStatisticsInterface service.
comment_entity_load() is removed, field api handles loading for us.
Make entity id field in {comment} a varchar
Approach 2
Make separate tables for each entity type, comment_node, comment_user, comment_node_statistics, comment_user_statistics
Remove the entity_type field/column
Make schemas generate automagically.
Approach 3
Comment fields aren't supported for non-integer entity ids
Remaining tasks
Decide on best approach
Patch
Test
Review
etc
User interface changes
None
API changes
Not sure yet
| Comment | File | Size | Author |
|---|---|---|---|
| #51 | interdiff.txt | 3.06 KB | jessebeach |
| #51 | comment-string-ids-2205215-51.patch | 7.73 KB | jessebeach |
| #46 | comment-string-ids-2205215.07a7f8c.patch | 7.32 KB | larowlan |
| #46 | interdiff.txt | 1.12 KB | larowlan |
| #44 | 2205215-comment-string-ids-44.patch | 15.74 KB | dixon_ |
Comments
Comment #1
larowlanComment #2
andypostYeah! we can move
\Drupal::state()->get('comment.maintain_entity_statistics')on per field/instance basis with settings.The only question here are we going to reuse current comment field by simply extending its schema and definition?
Comment #3
larowlanYes and yes
Comment #4
larowlanThis will blow up because tracker/forum et-al hard-wired to use that table.
Lets see how it goes for comment module though.
This is based off of #2101183: Move {comment_entity_statistics} to proper service
do-not-test-patch is this issue only
main patch combines both.
Comment #6
larowlanWe don't need merge anymore, as record always exists, its just update now.
Not sure what to do with langcode...
Again patch includes #2101183: Move {comment_entity_statistics} to proper service, do-not-test patch is diff against that.
Comment #8
larowlanUpdated for #2195907: Search ranking for number of views fails in PostgreSQL
Comment #9
berdirHm. So if you're going to use the field data table storage, then you should just update the entity field and save it. Messing with the field database tables directly is not allowed :)
Which essentially means that all the comment statistics stuff would be doing is calculate the statistics, does it even still need to be a service or could it simply be a method on Comment?
Comment #10
andypost@berdir (un)posting a comment should cause a statistics calculation and affects commented entity so it is a field, the same time we have a comment field that actually displays this statistics (currently in node links) but supposed via own formatter #1920044: Move comment field settings that relate to rendering to formatter options
Comment #12
larowlan@berdir can I save a single field? if so yes that is the answer.
otherwise I can do the whole entity again, but seems overkill.
Comment #13
berdirNo you can not and yes, it is overkill. But if you use field storage, $entity->save() is the only API you're allowed to use.
Comment #14
catchNot updating the entity is one of the reasons this was never put into a field, and why it was moved out of the {node} table in 2003/4 or so.
Also the comment statistics aren't data attached to the field - it's denormalization/state, not field values as such.
If we had optimized entity updates, so that storage only gets written for fields that were explicitly changed, then this wouldn't be a problem (or at least less of a problem), but we don't have that yet.
I'm not really sure how we ended up at this issue, what's it blocking?
Comment #15
andypostThere's nothing blocked on it. Once #2101183: Move {comment_entity_statistics} to proper service is in we can implement calculated field a-la #2201051: Convert path.module form alters to a field widget
Comment #16
catchRight calculated field suits this better. So is the only issue #2195915: Cannot save text filter config using PostgreSQL if Comment is enabled [blocks installation!] then?
Comment #17
berdirSee #2195915: Cannot save text filter config using PostgreSQL if Comment is enabled [blocks installation!], which explains the need for per entity-type tables due to entity ID as string being supported now, which do not have to be field storage. As discussed, this would only solve half the problem anyway as we still have the main comments table.
Another thing is the entity cache, so what we should do at least is invalidate that cache when we change it. Right now I'm using the load_uncached() hook for this there.
Comment #18
catchI think uncached is OK for this sort of information - or at least it's a trade-off between cache invalidation and loading each time.
Or another option would be to not try to put the information onto the entity at all, and only access it when needed, which in the majority of cases is during rendering anyway.
Comment #19
andypost@catch Also I'd like to point what we discussed with @berdir - statistics like history needs a way to "load multiple" for node listings at least, currently only configurable fields have this ability.
Cache invalidation for entity is needed here, otherwise we should move history/comment_stats into some "out-of-bound"/"post render" stage because:
1) history is a per user data, so should not be cached
2) comment statistics should be invalidated when comment added/removed - so actually commented entity is changing, this is a serious improvement for forum and probably front page too
Comment #20
berdirhook_entity_load() is just as multiple as configurable fields are, they receive the same set of entities at the same time.
Comment #21
catchThe commented entity is only changing because we use hook_entity_load(). hook_entity_prepare_view() would attach the information for rendering, but not as part of the entity proper.
Comment #22
larowlanSo how to proceed here? (and on the general {comment}.entity_id only supports integers)
Another option is that we simply don't support comment-fields on entity types with non integer ids.
Comment #23
larowlanNew title and updated issue summary to outline two alternate approaches
Comment #24
larowlanadded proposed approach 3
Comment #25
larowlanAdding needs arch review tag to get some requests for feedback
Comment #26
larowlanPersonal preference is option 2, a similar approach will be needed for history module so some reusable schema building logic could be extracted from field module and used by all three.
Comment #27
andypost+1 to 2, but suppose better to remove 'entity' from table name so
{comment_node_statistics}PS Updated summary
Comment #28
moshe weitzman commentedThe prioritization criteria have changed. Downgrading accordingly. See https://drupal.org/core/issue-priority#critical-bugs
Comment #29
catchI think we need to take a conscious decision on this either way as to whether it's fixable or not at all in 8.x and whether we can change it safely before. Bumping back up until that's done.
Comment #30
xjmAlso, this sounds more like a bug to me, since elsewhere the entity system supports non-integer IDs.
Comment #31
xjmNote that the decision here will likely affect what Views can be built, among other things. See #1986606: Convert the comments administration screen to a view. In that issue, a separate Views bug was preventing comments on non-nodes from being listed, but once resolved, we can potentially create a view that lists all comments for moderation -- not just on nodes. However, that isn't a feature of D7 (see also #1040786: Include entities in taxonomy_index), so maybe we just let it go.
I am guessing a denormalization table of some manner is probably still important to have, though?
Comment #32
berdirI don't think that is related to this issue at all. I wanted to comment in that issue for a while that views should be doing the same as the current code is doing. a custom field formatter that does batch-loading of all referenced entities and displays it as a link, instead of using a relationship. We're already doing exactly that for the files view, see EntityLabel.
Comment #33
xjmIf we move things out into {comment_node} and {comment_user}, we no longer have the cross-entity-type, denormalized {comment} base table to query against. Same problem as Taxonomy has. It's not about the relationship or field formatting for results for each entity (which was the previous bug), it's about the listing of the entities to begin with in a performant query. But maybe best to move discussion over there. :)
Comment #34
xjmTo clarify #33 using the files example: There is one {file_managed} table. We don't have {file_node}, {file_user}, etc.
Comment #35
berdirYeah, you're right, sorry :)
file_usage.id is a string, though, but we're not doing joins on that...
Comment #36
andypostAs I said in #19 the history table has the same issue!
In ideal world, comments should be attached to "field instance record" not the entity itself.
But currently we duplicate this data into comment table (entity_type, field_name, entity_id)
So here's only a few solutions:
1) convert
entity_idfield to string2) split
commenttable intocomment_[entity_type]tables - preferred3) add intermediate table to store entity_type, field_name, entity_id (as string) and point comment here
4) merge "entity_type:entity_id" into one field
PS: related #2228763: Create a comment-type config entity and use that as comment bundles, require selection in field settings form does not affects this
Comment #37
catchIf we split into separate tables, I don't think we'd want the serial IDs for the comments to be split (i.e. duplicated), also Comment::load($cid) should still work.
So that seems to me that we'd need a {comment} table still, with the autoincrement ID and target entity type - if only to have unique numeric IDs and support Comment::load($cid);.
Comment #38
xjmWe discussed this issue with @larowlan, @dixon_, @alexpott, @effulgentsia, and myself, and decided that option 1 (simply not supporting string IDs for comment in core) is the best option at this point in the release cycle. If for some reason someone needs to support comments on an entity type with a string ID (@effulgentsia supplied the example of a remote entity), then they can swap out the service.
For this patch, we should:
Comment #39
larowlanSomething like this
Comment #41
larowlan39: comment-string-ids-2205215.16c2c23.patch queued for re-testing.
Comment #43
dixon_I'll take a look at this and fix the last tests there.
Comment #44
dixon_I've been a bit on and off this today, but this should fix a couple of tests.
There were a few hard coded assumptions around entity IDs being integers that I had to work around in the tests that I'm not very happy with.
Also, when an entity type with string IDs have field tables generated I ran into problems because there are assumptions around the revision_id data type as well.
I'm thinking that we should break out this new test entity type into a separate set of simpler tests...
Comment #46
larowlanWhat if we only use that type in route discovery?
Comment #47
larowlanWinner winner chicken dinner
Comment #48
dixon_That approach looks good and does the job!
Comment #49
dixon_Comment #50
catchThis exact code is repeated twice - can we add a helper? _comment_entity_uses_integer_id() will do.
Is it possible to do this check where the option is added, rather than removing it later?
Comment #51
jessebeach commentedFields are included in the "New field" dropdown list on the field overview page for a content type if they do not declare a
no_uiproperty in their@FieldTypeannotations.I don't know of an intervention point before
comment_form_field_ui_field_overview_form_alterto unset this value.I addressed the duplicated code from 50:1 in this patch.
Comment #52
larowlanI think the function belongs elsewhere, there is similar need in entity reference item's schema method.
Perhaps in entity module or on one of the entity storage classes?
Comment #53
dixon_Awesome.
Fair enough.
The entity module is about to be removed soon I think. @fago have plans for working on that I've heard.
And considering this is already procedural code I'm fine with keeping that code in
_comment_entity_uses_integer_id()as @catch already suggested.Comment #55
alexpottCommitted e05d622 and pushed to 8.x. Thanks!
Exception messages should not use t() or end with a full stop. Fixed during commit.
Comment #57
andypostComment #58
davidwbarratt commentedInteresting...
Approach #2 could have been implemented with #1995944: Remove entity_id and entity_type from the comment table and replace with relationship tables .
oh well, I think going with Approach #3 was better in terms of getting Drupal 8 out the door. But I think it's important to evaluate database design and determine the unintended consequences of whatever design we implement.
Thanks!
Comment #59
attiks commentedI created a follow up, because now we allow people to define comment types for all entities, but we hide it if the field is using a string id: #2496913: Don't expose entity types with string ids as a target option when creating comment types