Problem/Motivation
Follow-up from reviewing #3026290: PostgreSQL constraints are still not renamed properly on table renames.
The pgsql driver does not set the ownership of sequences it creates when changeField method is called. This happens if you specify a table without a serial field, then later change an integer field into one. This means that the preferred pg_get_serial_sequence function will not return the newly created sequence.
Sequences created automagically with CREATE TABLE are "owned" by the relation, but those created with CREATE SEQUENCE need the OWNED BY statement appended with the relation and the column.
Drupal 8 and 7 core currently do not have any database updates that change fields into serial columns, but contrib. or custom code might.
Proposed resolution
- Add
OWNED BY {table}.fieldto the create sequence statements. - Add a database update to check serial fields to make sure they exist via pg_get_serial_sequence, and if not, alter the sequence with the same statement above.
- Potentially refactor as a protected (or public?) function for ease-of-use?
- Write a test to assert that sequence ownership exists on change field.
- Write a test to assert that the database update works.
Remaining tasks
- Write a patch.
- Review
- Commit and unpostpone the 7.x issue
API changes
None
Release notes snippet
| Comment | File | Size | Author |
|---|---|---|---|
| #105 | interdiff_100-105.txt | 7.97 KB | arantxio |
| #105 | 3028706-105.patch | 18.78 KB | arantxio |
Comments
Comment #2
mradcliffeHere's a test only that adds an assertion into SchemaTest.
The code to fix this in an update hook is pretty database-specific. I think it does make sense to add some more public methods because otherwise the code is going to have
(bool) $connection->query("SELECT pg_get_serial_sequence('" . $connection->prefixTables($table) . '", '" . $field . "')")->fetchField()and$connection->query("ALTER SEQUENCE " . $connection->makeSequenceName($table, $field) . " OWNED BY " . $connection->prefixTables($table) . "." . $field)->execute().Core doesn't have any instances of the issue so a database update would only affect contrib.
I'm not sure what we should do when the sequence doesn't exist at all. Should try/catch and ignore?
Comment #3
mradcliffeFlipping back to needs work after the fail is there. Forgot to add the postgresql test first.
Comment #5
mradcliffeAdds an update hook test without an update. Interdiff had failed hunks because it was for 8.7.x, not 8.8.x and SchemaTest has changed.
I am not sure where to add the schema for dummy table so I chose the entity_test module. This is tricky because I need to make sure that the table is defined in some schema somewhere so that the actual update hook can check the installed database schema for serial fields and check that the sequence is returned appropriately, but I can't actually use the API to reproduce the problem. So I needed to provide a database fixture that creates the table and simulates an alter on the column to a sequence (that then becomes unowned).
The next step:
1. The schema fix.
2. The update hook.
Hopefully someone can give some feedback about the approach.
Keeping this at Needs work because the test is going to default to MySQL anyway and pass.
Comment #6
mradcliffeBad merge. This was removed at some point and I added it back.
Comment #7
mradcliffeNew patch without the bad merge.
Comment #8
mradcliffeHere's a start to a fix.
- I modified the test and moved the schema addition to a new pgsql_test module.
- I initially tried using invokeAll for schema, but that isn't used anywhere in core. Instead I needed to check enabled modules first.
Comment #9
mradcliffeI noticed #2926070: Deprecate ModuleHandlerInterface::getName() so probably need to change to using the extension.list.module service. That might help reduce the code that I added to check enabled modules.
Comment #10
mradcliffeLooks like I killed a bunch of other tests because the column for the revision key isn't right (maybe not the id key either?)
Local snippet of Drupal\Tests\contact\Functional\Update\ContactUpdateTest:
Comment #11
gnugetI'm working on #826552: Altered serial columns break sequences and it seems that this needs to be fixed first :-)
Comment #12
andypostComment #13
mradcliffeLet's try checking if tables exist. I'm not sure why simpletest_test_id and taxonomy_term_revision tables don't exist in the test runs. It should not do this if the module isn't enabled?
Also a re-roll.
Comment #14
mradcliffeWhat was I thinking adding this to the base Connection class?
Changed this to the postgresql connection class.
Comment #15
mradcliffePretend that last one didn't exist. Also, it seems checking that the table exists didn't work for the first part of the batch and maybe it no longer exists? I'm not sure why.
Comment #16
mradcliffeSelf-review. There are 2 coding standard issues with the patch in #15:
I didn't realize branch tests were failing for PostgreSQL since a few days ago. These seem to be related to #3008029: Migrate D7 i18n field option translations.
Comment #18
mradcliffeLet's try this patch. I re-rolled and moved the update test under system module, which is probably where it should be.
Then I looked at the config override storage and I think that it should be using order by. This seems to pass locally for me on my dev environment using postgresql.
It's been a while since I've tested this patch so hopefully it doesn't blow up spectacularly.
Comment #19
mradcliffeYep, messed that patch up. :-)
Comment #20
andypostBit of clean-up after re-roll
1) module handler should not be used in loop from \Drupal
2) .module file is optional
3) variables should be snake_case
Looks it still exists after https://www.drupal.org/node/3098322
Comment #21
mradcliffeYes, it looks like
drupal-8.6.0.bare.testing.php.gzis removed in 9.0.x, but not 8.9.x. So probably should change to 8.8.0.Comment #22
ravi.shankar commentedHere I have tried to address comment #21.
Comment #23
daffie commentedThe patch looks good.
Why are we adding the
ORDER BY? If it is not really needed, then please do not add it. It does not make Drupal faster.Maybe better: "Retrieves a sequence name that is owned by the table and column."
Why are we adding the join to this query? It is not being used.
Maybe better: "Checks if a sequence exists"
I think that
updateSequenceOwnershipis better. Change implies that you can do different things. This method does only one thing.Can we add some documentation about why this method exists and add a link to this issue/change record.
Why are these 2 lines inside the foreach loop?
What happens when there is no sequence by that name in the database?
I would rather that we do that in the method
changeSequenceOwnership.Comment #24
alphin commentedI'm working on it.
Comment #25
prabha1997 commented@ravi.shankar I am not able to apply patch.
Comment #26
ravi.shankar commentedHi prabha1997,
It Looks like needs a re-roll.
Comment #27
sashi.kiran commentedRe-rolled the patch.
The conflict was with the two functions:
function system_update_8901 and function system_update_8806
I have placed system_update_8806 function code next to system_update_8901 function code and merge went well.
Need to work on the suggestions by daffie.
Comment #28
ravi.shankar commentedComment #29
sivaji_ganesh_jojodae commentedLooks like merge conflict is not resolved fully.
Patch re-rolled to address the same.
Comment #30
daffie commentedThanks to all that worked on rerolling the patch for this issue.
Putting the issue bad to "needs work". My remarks from comment #23 still need to be addressed.
Comment #32
daffie commentedComment #33
mindbet commentedThe attached patch is a reroll for Drupal version 9.1
It addresses comments 1-7 in #23, with 8-10 TBD
Comment #34
daffie commentedComment #35
mindbet commentedAttached please find patch 35.
Patch 33 appears to have been "backwards" — my apologies.
In regard to 23.9 and 23.10 —
I having trouble getting my head around (colloquial English, means 'understanding') the need to scan the tables in the database
on system.install
if a custom entity didn't have a working sequence, won't inserts already be failing?
Comment #36
mradcliffeIf a custom entity didn't have a serial column, but then a column was altered later, then the sequence that is created, which is working, won't be owned by the table. These are orphans that won't be cleaned up by any further alter or delete. Finding the orphan sequences (which may never be perfect) is the complexity in system.install, particularly with entity definitions from Drupal 8 that may not be defined as serial (this was later changed in #2928906: The field schema incorrectly stores serial fields as int, mentioned above by @daffie). I think because of that change, the code that looks for entity serial columns may no longer be necessary.
This can be changed to 'serial' now I think.
And the comment corrected.
This needs to be changed now to `core_version_requirement: ^9`. Or maybe whatever the other test modules use.
That's why the tests are all failing at the moment.
Comment #37
mindbet commentedVersion 37 changes the core version requirement for the test module.
Also, I misinterpreted how you were doing the loop in the update (in system.install line 1538) so I changed it back to what you had originally.
Comment #40
daffie commentedComment #41
ankithashettyUpdated the patch in #37 to fix custom command failure errors, thanks!
Comment #42
alexpottHow come this is in a transaction? There's only a single statement in updateSequenceOwnership(). Also this code will eat database exceptions - should we log when this fails or ignore it?
Comment #43
mradcliffeMy memory is hazy, but without a transaction, if any of those statements fail, it will cause "current transaction is aborted, commands ignored until end of transaction block" and no other queries on the persistent connection will be run. We need to force the implicit commit for each sequence so it can be rolled back.
+1 on logging errors and also decrementing
$sandbox['fixed'].Comment #44
daffie commentedPatch looks good!
Using the function pg_get_serial_sequence() does return the sequence name that is used for the field. However it does not say anything about who owns the sequence. The query you need for that is:
The query is from the patch from #838992: Change the uid field from integer to serial by leveraging NO_AUTO_VALUE_ON_ZERO on MySQL. Maybe it is best to create a new method in the class Schema.
Maybe rename the method to
getSequenceName().This has missed the window for 9.1. Therefor change the name to
system_update_9301()Can we add type hinting to the new methods. Something like
public function getSequence(string $table, string $column): ?string {,public function sequenceExists(string $name): bool {andpublic function updateSequenceOwnership(string $table, string $column) {.Can we move these 2 methods to the class Schema. To me that is where they belong.
Comment #46
darklight90 commentedHello to everyone.
I can confirm that #41 works as designed for Postgres 12.
Thanks,
Marco
Comment #47
joshua1234511Rerolled the patch for 9.4.x-dev
Addressed Issues from #44
Comment #48
mradcliffeThank you for the re-roll, @joshua1234511. It looks like there is one spelling typo in the old patch and that we need to add some PostgreSQL terms to the cspell ignore comments:
Spelling typo, should be "Retrieves".
Need to add refobjid, refobjsubid, objid and classid to the CSpell ignore comment for this file.
Comment #49
joshua1234511Thank you @mradcliffe for the quick review.
Updated the patch with the review from #48
Comment #50
ankithashettyUpdated patch in #49 to make few corrections in
core/modules/pgsql/src/Driver/Database/pgsql/Schema.phpfile. Attached an interdiff.Thanks!
Comment #51
mradcliffeThank you @ankithashetty and @joshua1234511 for the fixes. It looks like the tests are failing due to recent changes in Drupal 9.4.x.
The sqlite driver fail doesn't seem to be related to the patch.
module_load_include is deprecated in 9.4.x.
Comment #52
daffie commentedThe function module_load_include() has been deprecated. Please replace it with
\Drupal::moduleHandler()->loadInclude().Let replace this with 9.0.0 version. Drupal 8 is already EOL.
Can we rename this from "drupal-8" to "drupal-9" and move it to the pgsql module.
Can we add the return type hint ": void".
Can we change "See issue https://www.drupal.org/project/drupal/issues/3028706" to "@see https://www.drupal.org/i/3028706".
Can we replace the variable $args with
[':name' => $name].Can we replace
$seq_owner =withreturn. We do not need the variable $seq_owner. Also the last empty line in this method can be removed.As this fixture is only used by pgsql, is it not necessary to add the if statement.
We need to move this to the start of the method and change it to:
We need to a list with tables names and column name to test if they owner their sequence.
These 2 lines can be removed.
What is it? Are we returning an array or an object? Can we also describe which keys are being returned.
Can we change this to:
Comment #53
mradcliffe+1 on all suggestions from @daffie. We can definitely now move most of this to the pgsql module.
I removed the Assigned property. We shouldn't use this in the Drupal core issue queue.
Comment #54
alexpottAre we sure that we should be adding all this public API to the Postgres schema class. We're only adding this for the update code. I think this could just as easily live in _ methods in the .install file.
This can also move to the postgres module now...
Shouldn't this test fail in $seq_owner is NULL?
Comment #55
ravi.shankar commentedFixed following points of comment #52:
1, 2, 4, 5, 6, 7, 8, 11, 13.
Still needs work for other points.
Comment #56
andregp commentedRemoved a random addition introduced by #55 (inside plugin.es6.js and plugin.js)
Addressed some more points from #52 and fixed some typos.
1. corrected wrong parameter being passed to
\Drupal::moduleHandler()->loadInclude()3. done
9. done
12. partially addressed (I don't know how to check which keys are being returned to indicate there.)
Now there only missing the points 10, 12, and 14 from #52, they are:
10. We need to a list with tables names and column name to test if they owner their sequence.
12. Can we describe which keys are being returned.
14. As the database driver have been moved to their own modules See: https://www.drupal.org/node/3129492. And because this an only PostgreSQL fix, we can move all files to the pgsql module.
The points from #54 still need to be addressed too.
Comment #57
andregp commentedI'm sorry, the patch wasn't uploaded.
Comment #58
mradcliffeThan you for the patch, @andregp.
Won't this will add the fixture to the current site?
This should be included as part of
PostgreSqlSequenceUpdateTest::setDatabaseDumpFiles()and not in the database driver module file.Comment #59
andregp commented@mradcliffe Sorry I may have misunderstood what @daffie meant when he said on #52.3
Should I have just renamed it and keept where it was at patch #50?
Comment #60
andregp commentedUpdated path in pgsql.module and renamed file
drupal-8.pgsql-orphan-sequence.phptodrupal-9.pgsql-orphan-sequence.phpstill not sure on what to do about #58.Comment #61
mradcliffeI think we can add that fixture as a second fixture here.
Do we have a policy about new database-driver specific tests being in system module or in their respective database driver modules, @daffie?
We may want to move this test into \Drupal\Tests\pgsql\Functional\Database namespace and directory too.
The drupal-9.0.0.bare.standard line would need to be changed a bit to account for the directory change. And we probably can move the postgresql fixture into pgsql module as well.
Does that help, @andregp?
Then we can remove the driver checks and markTestSkipped checks.
Comment #62
andregp commented@mradcliffe, thanks for clarifying.
I moved PostgreSqlSequenceUpdateTest.php and drupal-9.pgsql-orphan-sequence.php to the pgsql module,
added back the line __DIR__ . '/../../../fixtures/update/drupal-9.0.0.bare.standard.php.gz', to UpdatePathTestBase::setDatabaseDumpFiles and updated it according to the new file location,
removed the driver checks and markTestSkipped checks.
Still need to address #56
Comment #65
arantxioI have rerolled the patch for 10.1.
Comment #66
mondrakeThis is blocking #3337513: Fix batch process race conditions by making ‘bid (batch id)’ auto-increment, and is a broken API - to me should qualify as Major.
Comment #67
mondrakeReviewed, for me it's RTBC, but I was thinking it would be good to have a test to demonstrate
... then realized thatDriverSpecificSchemaTestBase::testChangePrimaryKeyToSerial()is actually meant to be doing that, and should be running for every driver. So now I do not understand why that is not failing for pgsql.EDIT: it does not seem that this is blocking #3337513: Fix batch process race conditions by making ‘bid (batch id)’ auto-increment
Comment #68
daffie commentedI agree with @mondrake in comment #67: that test is necessary.
For me just 1 other remark:
This if-statement can be removed. It should always be true.
Comment #69
mondrakeAdjusted tags to reflect #67 edited comment
Comment #70
andypostAddress #68 and next but calling hooks inside of update hook is better to rewrite with listing all sequences.
Moreover except
hook_schema()other services can provide schema (usingensureSchema()out of head) so leaving "Needs work" for upgrade path reworkthis else looks useless, sandbox init is executed only once and then for-loop can always iterate
This code should not appear in update hooks, modules could have pending updates for their schema (which possibly invalid ATM)
pending update hooks also can change storage
Comment #71
andypostas sandbox using batch size 50 - any reason for transaction?
Comment #72
daffie commentedJust being on the save side. AFAIK it does no harm.
Comment #73
arantxioAfter some consultation with @daffie I've created a new patch, which fixes the custom commands and also some changes regarding the update function name.
Comment #74
arantxioComment #75
daffie commentedAll code changes look good to me.
Testing has been added.
I have updated the IS and I have added a CR.
For me it is RTBC.
Comment #76
alexpottThat this will miss tables that are not part of hook_schema or an entity table. In core we have cache tables and batch storage (for example)... and there's an issue to convert batch storage to serial.
Is there another way to do this? Like can we get all the tables - find all the possibly serial colums and check for sequences that do not have the correct ownership and then fix them. We might just have to look for sequences based on integer column names as postgres does not record the fact it is a sequence via the column type.
We need an upstream issue in the drupal phpstan project to fix this. The error is incorrect. I opened https://github.com/mglaman/phpstan-drupal/issues/516
Comment #77
daffie commentedIf we do this than we can also change non Drupal tables. I am not sure that is something we should do.
Comment #78
alexpott#77 is a good point. Hmmm.... that's really really tricky. So this is a best effort situation. So let's leave the code as is but add a comment explaining the reasoning in the update function.
We also should address #54.1 / #76.1 - I'm not convinced the new API to upport the update function is worth it. We could inline the queries and be done.
Comment #79
andypostbtw Listing of tables and sequences looks more predictable (there's table prefix and if other tables has it they are doomed) vs calling hooks from update hook, I recall only sqlite issue with listing #2949229: SQLite findTables Returns Empty Array on External DB.
Comment #80
andypostBut if no prefix used then core can't list all schema service consumers for tables
Comment #81
arantxioI've addressed #54.1, but I've moved the code to .module instead of .install, as it wouldn't work for me when I placed it in the .install file as requested.
Comment #83
daffie commentedAll code changes look good to me.
The functionality has moved to functions in the module file and their names start with an underscore, therefor they are @internal. No addion to the public API. The code has been moved there, because to code is used in multiple instances and for better readability.
I have updated the IS.
There is testing being added.
For me it is RTBC.
For the committer: the CR must not be published.
Comment #84
alexpottLet's go with a slightly different approach here. Let's add a final @internal class Drupal\pgsql\Update10101 that's instantiated from the hook_update_N like this:
\Drupal::classResolver(Update10101::class)->update($sandbox- see \Drupal\Core\Config\Entity\ConfigEntityUpdater for something similar. That way this class can encapsulate all the functionality and be instantiated in the test so it can access the same functionality. Adding functions likefunction _updateSequenceOwnership(string $table, string $column): void {don't really work. This function should be called_pgsql_update_sequence_ownership()for example. But also all these functions end up in the global function namespace and will be harder to remove once we no longer need to maintain the update.Comment #85
arantxioI've created a patch based on your request @alexpott, please check if this is what you meant.
Comment #86
arantxioI forgot to add a comment to the update function in the Update10101 class.
Comment #87
daffie commentedAll points of @alexpott have been addressed.
Back to RTBC.
Comment #88
larowlanWe have a mix of dependency injection and the global singleton.
We should inject the DB I think
We can add a return type hint here ?TranslatableMarkup
We can use str_starts_with now
ubernit: >80
We can use camel case for these method names now, e.g. $this->getSequenceName() (no need for the pgsql prefix either)
we can use new \Drupal\Core\StringTranslation\PluralTranslatableMarkup instead of using the singleton here
Why was this needed?
This doesn't take into account hook_entity_base_field_info_alter
And I don't think we can rely on the entity-type manager during an update hook.
I think we need to use
\Drupal\Core\Entity\EntityLastInstalledSchemaRepositoryInterface::getLastInstalledFieldStorageDefinitionsinsteadComment #89
arantxio@larowlan Thank you for your input, I will look onto these suggestions soon and will post a new patch. Also regarding #88.7, Alexpott made a issue for this problem in phpstan stated in comment #76.2, the issue can be found through the following link: Github issue
Comment #90
arantxioIf I'm correct I've added all requested changes from @larowlan except for #88.7 as explained in my previous comment.
Comment #91
arantxioI've changed the return type for the update function to ?PluralTranslatableMarkup instead of PluralTranslatableMarkup|NULL.
Comment #93
andypostthere's
extension.list.moduleservice to get list of enabled modules, so moduleExists() could be removedComment #94
rohan-sinha commentedhi as @andypost extension.list.module is not yet stable, in D10, is it good to use it here?
https://api.drupal.org/api/drupal/core%21lib%21Drupal%21Core%21Extension...
Comment #95
arantxio@rckstr_rohan this service already gets used a lot in core so it is not that bad to use it in my opinion.
So as requested by @andypost, here is a updated patch with the service. I also altered some code so we get the services using the container instead of calling them in the function. Changes can be seen in the interdiff.
Comment #96
arantxioMy bad forgot to comment the new parameters in the __construct function and had to regenerate the baseline, because that change is not necessary anymore.
Based the interdiff on patch #91 instead of #95.
Comment #97
daffie commentedThe remarks of @andypost (comment #93) has been addressed.
Back to RTBC.
Comment #98
catchThis is covering hook_schema() and entities but does it also need to run for services with lazy table creation? I'm not sure how if it does...
Comment #99
catchDiscussed this in slack with @daffie.
The consequences of not updating a table are not that bad, so if we fix this for all new sites + entity and hook_schema() tables on new sites that will solve things for a lot of people.
We can then open a new issue to figure things out for on-demand tables.
Let's change this to 'Discovers all tables defined with hook_schema()' and add a @todo pointing to the follow-up issue.
Comment #100
daffie commentedCreated the requested followup (#3358777: New serial columns do not own sequences for on demand tables).
Comment #101
arantxioHere is the updated comment and added the todo.
Comment #102
daffie commentedAll remarks from @catch have been addressed.
Back to RTBC.
Comment #104
larowlanI realise this code has been raked over ad-nauseam, so apologies in advance.
As this is 10.1+ only now, let's use constructor property promotion and do away with a fair bit of boiler plate here
nit: we don't need this array, @count is already set by the first argument in the constructor for PluralTranslatableMarkup
Any reason not to pass this in as a parameter given we already calculated it on line 181?
This seems to be only used in the test, any reason we can't move this method to the test?
Will likely need to also change some cspell ignore entries on both this class and the test
Keen to get this in, so please ping me when this is RTBC again.
Comment #105
arantxioI think i've added all code requested in #104 by Larowlan. Here is a updated patch and interdiff, please review.
Comment #106
daffie commentedAll points of @larowlan have been addressed.
Back to RTBC.
Comment #107
larowlanFixed on commit
Committed 167a7c8 and pushed to 11.x. Thanks!
As there's a new update hook here, this isn't backport eligible so will not be in a tagged release until 10.2
Thanks all.
Comment #108
larowlanComment #110
andypostIt needs CR update for protected method which is not public now
Comment #112
quietone commentedPublished change record
Comment #113
boobaa