Updated: Comment #5
Problem/Motivation
This issue has a root cause (see below), and two ways to reproduce (see below).
Steps to Reproduce #1
Attempt to install Drupal using PostgreSQL database and the Standard install profile, in a language other than English.
You will get a PDO error:
PDOException: SQLSTATE[22P02]: Invalid text representation: 7 ERROR: invalid input syntax for integer: "en" LINE 4: WHERE (entity_id = 'en') AND (entity_type = 'language_entit... ^: SELECT c.cid AS cid FROM {comment} c WHERE (entity_id = :db_condition_placeholder_0) AND (entity_type = :db_condition_placeholder_1) ; Array ( [:db_condition_placeholder_0] => en [:db_condition_placeholder_1] => language_entity ) in comment_entity_predelete() (line 971 of core/modules/comment/comment.module).
Steps to Reproduce #2
With the Editor and CKEditor modules installed, if you attempt to edit a text format that already had an Editor assigned to it, and you're running in PostgreSQL, you will get a PDO exception:
PDOException: SQLSTATE[22P02]: Invalid text representation: 7 ERROR: invalid input syntax for integer: "basic_html" LINE 4: WHERE (entity_id = 'basic_html') AND (entity_type = 'editor... ^ in PDOStatement->execute() (line 61 of core/lib/Drupal/Core/Database/Statement.php).
Enabling backtrace, you can see this is coming from:
PDOStatement->execute([Array]) Drupal\Core\Database\Statement->execute([Array], [Array]) Drupal\Core\Database\Driver\pgsql\Connection->query(SELECT c.cid AS cid FROM {comment} c WHERE (entity_id = :db_condition_placeholder_0) AND (entity_type = :db_condition_placeholder_1) , [Array], [Array]) Drupal\Core\Database\Query\Select->execute() comment_entity_predelete(Drupal\editor\Entity\Editor, editor) call_user_func_array(comment_entity_predelete, [Array]) Drupal\Core\Extension\ModuleHandler->invokeAll(entity_predelete, [Array]) module_invoke_all(entity_predelete, Drupal\editor\Entity\Editor, editor) Drupal\Core\Config\Entity\ConfigStorageController->invokeHook(predelete, Drupal\editor\Entity\Editor) Drupal\Core\Config\Entity\ConfigStorageController->delete([Array]) Drupal\Core\Entity\Entity->delete() editor_form_filter_admin_format_submit([Array], [Array]) [followed by a bunch of form/controller/handle page request stuff that is kind of irrelevant]
Basically, the submit of the edit is causing the 'basic_html' editor config object to be deleted and then recreated, which is then triggering the comment module "respond to entity delete" function, which has problems in this case.
Root cause
The real problem here is that:
a) Whenever an entity is being deleted, the Comment module wants to delete all comments that might belong to to the entity.
b) Currently, it tries to check this on every entity deletion (config or content entity of any type, fieldable or not), by seeing if any comments are recorded on that entity ID.
c) The {comment} table, which stores the correspondence between entities and comments, has an integer field for storing the entity ID.
d) Many entities use strings and not integers for their ID, including most config entities (but it is allowed now for content entities as well).
e) MySQL doesn't really care if you try a WHERE clause with a string value on an integer field, but PostgreSQL does. Therefore, the query used to find the comments on an entity being deleted fails in PostgreSQL with a PDO exception.
Proposed resolution
There are three ways that the root cause of this problem could be fixed:
a) Require all content entities to have integer IDs. This is undesirable -- they were changed recently to allow non-integer IDs and we don't want to revert it.
b) Change comment module to use string IDs in its db table. This is somewhat undesirable, because it would lead to inefficiencies.
c) Only allow comments for entities with integer IDs, and verify this is the case when checking for comments in the method that has the problem. This is probably the approach to take.
Remaining tasks
1. Figure out which fix is better.
2. Make a patch.
3. Add a test that has the Editor, CKEditor, and Comment modules installed and attempts to edit a text format that previously had an editor assigned... although SearchCommentTest in \Drupal\search\Tests does reproduce the problem, so we may not need a new test. It seems like an explicit test might be useful though, in case SearchCommentTest changes in the future? Or else maybe we should put comments into SearchCommentTest that these steps are crucial for testing another bug?
4. Verify manually that the patch fixes the PostgreSQL problem in the "steps to reproduce" above, that the new test (if one is added) fails in PostgreSQL without the patch, and that the new test and SearchCommentTest both pass with the patch.
User interface changes
None, except things will work better in PostgreSQL.
API changes
None.
Original report by @jhodgdon
Part of #2195815: [meta] Search module tests do not pass on PostgreSQL
I ran the Search module tests on PostgreSQL today, with the patch for #2158339: Search results page broken on PostgreSQL applied.
The test SearchCommentTest failed... The error showed up in a weird place on the test results page, but by looking at the verbose debugging output, I could see that it happened just after line 71, trying to do this:
// Enable check_plain() for 'Basic HTML' text format.
$basic_html_format_id = 'basic_html';
$edit = array(
'filters[filter_html_escape][status]' => TRUE,
);
$this->drupalPostForm('admin/config/content/formats/manage/' . $basic_html_format_id, $edit, t('Save configuration'));
The error on that page says:
PDOException: SQLSTATE[22P02]: Invalid text representation: 7 ERROR: invalid input syntax for integer: "basic_html" LINE 4: WHERE (entity_id = 'basic_html') AND (entity_type = 'editor... ^ in PDOStatement->execute() (line 61 of core/lib/Drupal/Core/Database/Statement.php).
I'm going to investigate now to see whether this is also a problem in the UI for the text formats, or a problem with the above SearchCommentTest code.
Comment | File | Size | Author |
---|---|---|---|
#36 | comment-delete-2195915.36.patch | 1.76 KB | larowlan |
#36 | interdiff.txt | 1.2 KB | larowlan |
#34 | comment-delete-2195915.34.patch | 1.57 KB | larowlan |
#34 | interdiff.txt | 1.24 KB | larowlan |
#29 | comment-delete-2195915.27.patch | 1.31 KB | andypost |
Comments
Comment #1
jhodgdonComment #2
jhodgdonAha, this is actually a bug in the text format page.
To reproduce:
a) Install with PostgreSQL and Standard install profile (if only so that the "basic HTML" format is installed).
b) Go to admin/config/content/formats/manage/basic_html (edit the Basic HTML text format).
c) Check the "Display any HTML as plain text" box.
d) Click "Save configuration".
You will get this error (or at least I do):
PDOException: SQLSTATE[22P02]: Invalid text representation: 7 ERROR: invalid input syntax for integer: "basic_html" LINE 4: WHERE (entity_id = 'basic_html') AND (entity_type = 'editor... ^ in PDOStatement->execute() (line 61 of core/lib/Drupal/Core/Database/Statement.php).
Unfortunately it is not giving me the full query here (I checked dblog too and it doesn't have it there).
Looks like this is a filter.module bug, and since I think it means you cannot save any changes to text filters in PostgreSQL, I think it's a "Major".
Comment #3
jhodgdonAs a reality check, I verified it works fine in a MySQL installation that is pretty recent. Plus I assume the tests would be failing all over the place if not.
As another reality check, since the error above seemed to say entity_type = 'editor...', I disabled the CKEditor and Text Editor modules. I am now able to save the text filter.
So the problem is apparently in the Text Editor module, some modification it is doing to the text filter configuration form, which is not working in PostgreSQL. Nothing jumped out at me when I looked at the code...
Comment #4
jhodgdonFurther debugging information:
a) You have to be editing a text format that already had an editor associated with it.
b) I turned on debugging with backtrace (newly possible today!) and have more information about this error:
Comment #5
jhodgdonDecided we needed an issue summary. :)
Comment #6
andypostAt least that should fix the bug, actually it makes sense to separate "entity hooks" for content and config entities
Comment #7
BerdirPatch looks fine. Looking at comment_entity_insert(), it does two different checks:
a) Should we support that for predelete() too? (disabling maintain statistics)
b) Is it actually save to do this, or could it be an issue with removing data from fields (the getFields() check)
Even with the check, this is going to execute queries for a lot of entities that don't have comment fields. Maybe we can at least switch to an $entity->getEntityType()->isFieldable() check?
Also, another problem to look into is that fieldable entities can now have string ID's. and comment statistics don't support that, so we either need to not support comments for those and fail nice and early for them or b), change to string and live with the performance regression.
Comment #8
andypostComment #9
bzrudi71 CreditAttribution: bzrudi71 commentedApplied patch makes tests go green :-) However, because of Berdirs comments in #7 this may need some discussion or a follow up. So not setting to RTBC for now.
Comment #10
jhodgdonI just updated the issue summary with a summary of the actual cause of the problem, and two proposed solutions.
I don't think the latest patch is a real long-term solution to the problem. I think there are really only two possible solutions that will fix this problem permanently:
a) Only allow comments on content entities with integer ID fields, and only check for comments upon entity deletion if the ID is an integer.
b) Allow comments on all content entities, making the entity ID field in the comment table be string-valued instead of integer-valued.
If we do (a), then I think the right check in the pre_delete function would be to do an is_integer() before running the query, in addition to probably (as in the current patch) checking that it is a ContentEntityInterface, and maybe also checking that it is fieldable?
If we do (b) then we need to change the database table, and we should still also probably check that it's a content entity and that it is fieldable.
I also suggested in the updated issue summary that we add a test that is explicitly for this behavior, rather than relying on some code in a Search module test, which could be deleted sometime in the future, to test this.
Comment #11
bzrudi71 CreditAttribution: bzrudi71 commentedNot sure if either a) or b) is the way to go. Sure, it will fix the issue in comments but we have the exactly same problem all over in core and many more tests fail currently because of this. AFIK entity_id is defined as varchar, as well as vocabulary id in taxonomy module and (maybe) many more, while field ids must be integers, see #1823494: Field API assumes serial/integer entity IDs, but the entity system does not.
IMO the most errors related to this are because of the loadMultiple() methods in:
DatabaseStorageController.php
ConfigStorageController.php
EntityStorageController.php
FieldableDatabaseStorageController.php
There are some ideas in the issue queue, starting from adding query tags up to silently filter out non numeric ids in loadMultiple(). While I would vote for make all ids integers there is for sure a reason while it's not the case in core, so we need to workaround ;-)
Comment #12
andypost$entity->getEntityType()->isFieldable()
makes sense same for pre deleteThe int|string PK needs follow-up
Comment #13
larowlanPatch at #6 is appropriate an inline with what we do in comment_entity_load().
Again I think there is cause for a hook_content_entity_load|predelete etc but an early exit, like patch #6 is the next best thing.
Lets discuss disabling maintaining statistics in predelete from #2101183: Move {comment_entity_statistics} to proper service
Comment #14
jhodgdonIsn't this still going to fail for some entities? I don't see why this is the right fix. The current patch is just checking to see that it's a ContentEntity, but there is no guarantee that all ContentEntity types have integer IDs. It may fix it for the text filters, but it doesn't really address the root problem at all.
Comment #15
BerdirThis is fine if we have a follow-up to think about entities with non-integer ID's. Until a few weeks ago, that was impossible for fieldable entities anyway to have non-integer, we only recently made that possible. There are zero entities like that in core, including test entities, the test that covers the new flexibility is a unit test that is faking the entity information.
I don't think it should block this commit.
Supporting that needs more discussions, I honestly don't see a way to do it right now so as a stop-gap, we simply might want to find a way to prevent it.
Comment #16
jhodgdonI had a quick discussion with webchick about this in IRC today.
Just to clarify... I just want to reiterate that the current patch is a "band-aid". It doesn't fix the real root cause of the problem here, which is:
- Comment module, whenever some entity is being deleted, checks to see if it might have comments that need deleting.
- Comment module's database tables have an integer-valued column for entity ID.
- Not all entities have integer IDs; according to #15, not even all Content entities are guaranteed to. (In Core currently, while we have many examples of Config entities that have string IDs, there are currently zero examples (except a test case) of Content entities that have string IDs. However, it is allowed.)
- The database queries that happen when Comment checks to see if there are comments on an entity try to match the entity ID with the integer column. This type of matching of a string on an integer database table column causes exceptions in PostgreSQL. This error is coming up when trying to delete a config entity from the Editor module, which happens whenever you save a text filter config if it previously had an Editor associated with.
The fix in the current patch is to only run this check if the entity is a Content entity. This is OK as long as Content entities all have integer IDs, but will end up failing down the road if a content entity is created with non-integer ID.
So, while the patch does allow the Search module tests to pass, it is not a very good fix for the root cause of the problem. Whether or not to commit this patch it is not my decision to make, but if it were my decision I think I would push for a real fix to be made rather than a temporary solution.
The only real fixes would be:
a) Require all content entities, after all, to have integer IDs.
b) Change comment module to use string IDs in its db table.
c) Only allow comments on entities with integer IDs, and verify this is the case when checking for comments.
I would also be more comfortable with the supplied patch if besides verifying it's a content ID it also made sure the ID it was checking was actually an integer.
Comment #17
webchickBased on #14+, I'd personally feel more comfortable to see a bit more discussion here, I think.... it seems like the only reason this workaround works is because core hasn't yet made use of the capability to use non-integer IDs on content entities. But since that's an artificial limitation, and not a real one, and it doesn't solve it for contrib if/when they take advantage of that capability. That means we'd end up closing one major bug and re-opening another major bug, which doesn't sound like a huge gain...
If I'm misunderstanding though, happy to hear counter-arguments. Otherwise, #16.b sounds like a fairly non-invasive change that we've done elsewhere, e.g. filter formats.
Comment #18
BerdirYes, it's an incomplete bandaid fix. But it fixes the real, practical problem of failing PostgreSQL tests. Content entities with non-integer ID's are a theoretical feature that core does not and will not (certainly not in 8.x) use, the only thing that exists is the schema generation support for it and only has "faked" (as in, no real entity type based) unit test coverage.
I'm still +1 on getting this in as it is.
a) would essentially mean a revert of the issue that introduced this.
b) Would be a performance regression for the 99.9% use case as it would make indexes bigger and joins slower. It's not just the statistics table (we'e opened #2205215: {comment} and {comment_entity_statistics} only support integer entity ids for that) but also the actual comment table.
c) This is the only practical solution but it's not something that Field API supports right now, I'm not sure how to do this.
Checking the field type of the primary key would be possible, or alternatively check if there are comment fields on that entity type. Both should at least prevent any failures until you actually try to add a field to a non-integer content entity and then it will fail much earlier.
Comment #19
jhodgdonI do not see why (c) could not at least be implemented in this patch.
The code that is crashing, after this patch, is:
Given that this will *still* fail in PostgreSQL if $entity->id() is anything but an integer, could we not instead do:
That seems like a simple change that does not add any huge performance problems (maybe it is even more efficient than the existing patch?), and at least gets closer to addressing the real problem, which is that you cannot pass every $entity->id() into this query because {comment}.entity_id is an integer field. I mean, the problem is not that $entity might not be a content entity -- the problem is that the db field is incompatible with non-integers and PostgreSQL is strict about that, right? So why not make the if() statement actually test for the exception-causing issue rather than testing for something that is only by coincidence related?
Comment #20
BerdirI don't really care, if that makes you accept the patch, fine with me :)
However, then let's do something like this:
> 0, because is_int('5') => FALSE, and I have no idea if the ID is really always an actual int and not a string with a number, when it's loaded from the database or something...
Comment #21
jhodgdonAgreed on isFieldable() as being a better test that "is it a content entity".
Agreed that the id() might be an integer masquerading as a string. I'm wondering in that case whether we may also need to cast to (int) in the query to avoid the PostgreSQL problems?
Also, we could use is_numeric() to test this, possibly in addition to the "> 0" (I had to run php -a to verify for myself that ('abc' > 0) for sure evaluated to FALSE and ('52' > 0) for sure evaluated to TRUE -- it does work though).
Comment #22
bzrudi71 CreditAttribution: bzrudi71 commentedFYI I Just tried to install D8 using non english language and install fails because of this issue :(
PDOException: SQLSTATE[22P02]: Invalid text representation: 7 ERROR: invalid input syntax for integer: "en" LINE 4: WHERE (entity_id = 'en') AND (entity_type = 'language_entit... ^: SELECT c.cid AS cid FROM {comment} c WHERE (entity_id = :db_condition_placeholder_0) AND (entity_type = :db_condition_placeholder_1) ; Array ( [:db_condition_placeholder_0] => en [:db_condition_placeholder_1] => language_entity ) in comment_entity_predelete() (line 971 of core/modules/comment/comment.module).
Berdirs approach in #20 fixes the problem. (didn't try the is_numeric() way as mentioned in #21 for now)
Comment #23
jhodgdonBlocking install with Standard profile in non-English using PostgreSQL brings this up to "critical" status, I think? Updated the issue summary
Comment #24
larowlanWorking on this shortly
Comment #25
larowlanpatch combines @jhodgdon and @berdir's approaches.
no interdiff as fresh start.
Comment #26
andypostI'd prefer to make sure that value is int (
is_int($entity->id())
), entities could have machine names like numbers ('000' for ex)EDIT also this kind of check prevents to useless queries when id is 0 but needs inline comment
Comment #27
jhodgdonandypost: The main goal here is to make sure this query does not fail with an exception, and a secondary goal is to not even run the query for entities that do not support comments (hence the isFieldable() check).
The query will fail with an exception if the entity ID is a non-integer.
The patch here is not quite right -- I think it needs to cast ->id() into int after checking is_numeric().
Comment #28
jhodgdonThis patch also needs a dedicated test. I do not think we should be relying on a search.module test that could go away at any time to make sure this stays fixed.
Comment #29
andypostThis is a temporary fix, finally we should split comment statistics into "table-per-entity type" like catch suggested in #2081585-79: [META] Modernize the History module
Also this should uncover bugs when entity with int ID somehow returns string data
Comment #31
andypostThe failed test exposes that
node->id()
is string type somehowComment #32
BerdirAs commented above, that is to be expected and is why I recommended to use > 0, not is_int(). If you for example load something from the database, it's always going to be a string.
Comment #33
jhodgdonOK then. I'm glad to see that we have confirmation via a failing test instead of a hypothethical "it probably works this way".
So, it does look like we should either use > 0 or is_numeric to test for this being an actual integer value, and then I think we also need to cast the value to (int) in the condition() statements in the queries.
I think that we should also put in a code comment, probably with an @todo, that explains why we are even testing this to see if it's numeric in the first place.
Comment #34
larowlanSomething like this?
Comment #35
jhodgdonThe code in the patch looks fine to me, and obviously passes the tests this time.
Could we change the comment from
to something more like:
Entities can have non-numeric IDs, but {comment} and {comment_entity_statistics} tables have integer columns for entity ID, and PostgreSQL throws exceptions if you attempt query conditions with mismatched types. So, we need to verify that the ID is numeric (even for an entity type that has an integer ID, $entity->id() might be a string containing a number), and then cast it to an integer when querying.
And maybe we should also add:
@todo Should we support non-integer entity IDs for comments?
Comment #36
larowlanupdated comment as advised.
left out @todo for comment with non-int ids, we already have another critical #2205215: {comment} and {comment_entity_statistics} only support integer entity ids where we're discussing that
Comment #37
jhodgdonThanks! I'm +1 on marking this RTBC now.
Comment #38
andypostSo let's do it!
Comment #39
webchickGreat work, folks! I'm glad we were able to come up with a solution here that satisfies everyone, and the docs are very thorough to explain why we're going through these kind of shenanigans.
Committed and pushed to 8.x. Thanks! 1 more small step for PostgreSQL kind! :)