Make the postgresql driver independent of the schema [#349671]

Comment	File	Size	Author
#43	349671-postgresql-free-at-last_webchick-changes-41.patch	9.29 KB	josh waihi
#36	349671-postgresql-free-at-last-not-really.patch	7.44 KB	damien tournoud
#31	349671-postgresql-free-at-last_not-really.patch	8.93 KB	josh waihi
#24	349671-postgresql-free-at-last_3.patch	7.49 KB	josh waihi
#18	349671-postgresql-free-at-last_2.patch	7.24 KB	josh waihi
#15	349671-postgresql-free-at-last.patch	7.49 KB	damien tournoud
#11	pgsql_349671.patch	5.67 KB	drewish
#2	349671-postgresql-free-at-last.patch	5.31 KB	damien tournoud
#1	349671-postgresql-free-at-last.patch	5.33 KB	damien tournoud

Comment #1

damien tournoud commented 21 December 2008 at 02:43

Status: Active » Needs review

Status	File	Size
new	349671-postgresql-free-at-last.patch	5.33 KB

As an RFC, here is a first implementation.

Comment #2

damien tournoud commented 21 December 2008 at 02:48

Status	File	Size
new	349671-postgresql-free-at-last.patch	5.31 KB

New version, hopefully with fewer obvious mistakes.

Comment #3

Crell commented 21 December 2008 at 06:48

Status: Needs review » Closed (duplicate)

There's already an issue for this, which asks to put this functionality into the schema itself: #301038: Add a cross-compatible database schema introspection API. IMO that's where it belongs, not in the connection object.

Working to get that issue taken care of, though, would be awesome. :-)

Comment #4

damien tournoud commented 21 December 2008 at 11:57

Status: Closed (duplicate) » Needs review

@Crell: please stop kidding: we only need two pieces of information about the table, and we only need them when dealing with InsertQuery. There is zero need to load the full-blown schema object for this. Please review the patch.

Comment #5

Crell commented 21 December 2008 at 19:47

Who's kidding? As of right now we do not support ANY non-local databases (local = where Drupal is installed) aside from MySQL and SQLite, because of the requirement of the schema for field-type checking. If we make the schema object able to detect the schema as needed, even if it requires some DB-specific logic, we have enabled remote-DB access for all supported database types. This patch works only for Postgres, and puts information about the schema in the connection object rather than the schema object. That's the wrong place for it.

The schema object should be the abstraction through which all information about a given connection's tables is accessed, so that we can do whatever weirdness we need to do. Note, I'm talking about the DatabaseSchema class, not the drupal_get_schema() mega-array. IMO that mega-array should be split up by table and moved behind the schema object to begin with.

Also, we care about UpdateQuery, too, not just InsertQuery.

Comment #6

damien tournoud commented 21 December 2008 at 22:14

I never said we don't need to add introspection in schema.inc. But that is another issue entirely; for Insert and Update queries we only need some very specific information, and we neither need nor want to load the full schema.inc for this.

Comment #7

drewish commented 25 December 2008 at 01:08

Priority: Normal » Critical

Subscribing. chx pointed me to this issue after I posted #351002: Cannot update D6 to D7 on pgsql. This patch fixes that issue, and in light of that I'm marking this as critical.

Comment #8

drewish commented 25 December 2008 at 01:21

Just to follow up: I gave the patch a good look and it looked okay to me, but I'm not comfortable enough with the affected code to mark it RTBC.

Considering that #301038: Add a cross-compatible database schema introspection API has no patch at this point, I'd say we should fix this and then consider a heavier solution like that.

Comment #9

keith.smith commented 25 December 2008 at 01:24

Strictly just reading the code comments, "informations" has an extra "s".

Comment #10

dries commented 26 December 2008 at 21:41

Status: Needs review » Needs work

Let's fix the typo identified by Keith.

I'd like to get another update from Crell too.

Comment #11

drewish commented 26 December 2008 at 22:04

Status: Needs work » Needs review

Status	File	Size
new	pgsql_349671.patch	5.67 KB

Here's a re-roll that fixed the typo.

Comment #12

Crell commented 26 December 2008 at 22:04

Status: Needs review » Needs work

I am still not convinced that this belongs in the connection object. Information about the DB's schema belongs in the schema object. If the schema object is too clunky right now, then let's make it less clunky.

Comment #13

Crell commented 26 December 2008 at 22:05

Status: Needs work » Needs review

Cross posted. Setting back to CNR for the bot, even though I still do not agree with the approach.

Comment #14

drewish commented 2 January 2009 at 21:24

It'd be great to get it in even if it's imperfect, because without it you're unable to run updates on HEAD.

Comment #15

damien tournoud commented 2 January 2009 at 23:42

Status	File	Size
new	349671-postgresql-free-at-last.patch	7.49 KB

"Information about the DB's schema belongs in the schema object. If the schema object is too clunky right now, then let's make it less clunky."

I'm not worried about the API, I'm worried about the size of the code itself. I don't see why we should force loading a full schema.inc on every page request for PostgreSQL, while we need only a very specialized piece of information.

The attached patch also takes care of updating UpdateQuery_pgsql, and improves code comments and consistency.

Comment #16

drewish commented 3 January 2009 at 00:46

Status: Needs review » Needs work

I don't think @see tags are complete sentences, so they shouldn't end with a period.

Tried to do a clean install after applying the patch and got the following error:

PDOException: UPDATE batch SET token=:db_update_placeholder_0, batch=:db_update_placeholder_1 WHERE (bid = :db_condition_placeholder_13) - Array ( [target] => default [return] => 2 [already_prepared] => 1 ) SQLSTATE[08P01]: <<Unknown error>>: 7 ERROR: bind message supplies 2 parameters, but prepared statement "pdo_pgsql_stmt_0203b6ac" requires 3 in batch_process() (line 2633 of /Users/amorton/Sites/dh/includes/form.inc).

Comment #17

damien tournoud commented 4 January 2009 at 23:37

Issue tags: +PostgreSQL Surge

Testing the new tag system.

Comment #18

josh waihi commented 5 January 2009 at 03:50

Status	File	Size
new	349671-postgresql-free-at-last_2.patch	7.24 KB

Something like this will help satisfy both parties. I agree with Crell: I see no reason why that extra schema stuff should be in the connection. This patch attempts to put it in the schema, though it is untested and most likely won't work. But surely we can do something like this?

Comment #19

damien tournoud commented 5 January 2009 at 11:23

Title: PostgreSQL surge #15: Make the postgresql driver independent of the schema » Make the postgresql driver independent of the schema

@Josh: well, this is one of the worst solutions. queryTableInformation() is a very specialized function: it only fetches information about blobs and sequences used on a table. It is not a general schema query solution, and has no business in schema.inc. My reasons for doing it this way are:

(1) we simply don't need to query the whole table information for Insert and Update queries
(2) we don't need to load the full schema.inc file at every request (the PostgreSQL driver is slow enough)

I also removed the title prefix, which is of no use anymore now that we have proper tags :)
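
For illustration only, the specialized lookup described in this comment could look roughly like the sketch below. It assumes the column metadata has already been fetched (for example from PostgreSQL's information_schema.columns view) as plain arrays. The function name sketch_table_information() and the sample rows are hypothetical and are not taken from any of the attached patches.

```php
<?php
/**
 * Hypothetical helper (not from the patch): reduce column metadata to the
 * two facts insert and update queries need, namely blob fields and sequences.
 *
 * @param array $columns
 *   Rows with 'column_name', 'data_type' and 'column_default' keys, shaped
 *   like rows of information_schema.columns (an assumption of this sketch).
 */
function sketch_table_information(array $columns) {
  $info = new stdClass();
  $info->blob_fields = array();
  $info->sequences = array();
  foreach ($columns as $column) {
    if ($column['data_type'] === 'bytea') {
      // PostgreSQL stores binary data in bytea columns.
      $info->blob_fields[$column['column_name']] = TRUE;
    }
    elseif (isset($column['column_default']) && preg_match("/nextval\('([^']+)'/", $column['column_default'], $matches)) {
      // Serial columns default to nextval('<sequence name>'...).
      $info->sequences[] = $matches[1];
    }
  }
  return $info;
}

// Made-up sample rows for a table with one serial and one blob column.
$sample_columns = array(
  array('column_name' => 'bid', 'data_type' => 'integer', 'column_default' => "nextval('batch_bid_seq'::regclass)"),
  array('column_name' => 'token', 'data_type' => 'character varying', 'column_default' => NULL),
  array('column_name' => 'batch', 'data_type' => 'bytea', 'column_default' => NULL),
);
$info = sketch_table_information($sample_columns);
```

Nothing here needs the full schema definition: only columns that are blobs or that draw their default from a sequence are recorded, which is the narrow scope argued for above.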

Comment #20

Crell commented 5 January 2009 at 16:01

Just how much of a performance hit is there really to loading the schema.inc file? It's not an especially complicated class, so it should parse quickly, and loading the class itself doesn't automatically load the full schema out of the cache. (Side note: We really do need to break that up still. Bah.)

Comment #21

josh waihi commented 5 January 2009 at 22:27

Sorry, I was under the impression that schema.inc had already been included. I thought it was the drupal_get_schema() function calls you wanted to avoid. I still think that if it's schema-related then it should reside in schema.inc. I'm still not convinced that queryTableInformation() belongs in Database_pgsql.

I'm not sure how my suggestion loads all the table information:
$this->connection->schema()->queryTableInformation($this->table);
I assumed this did the same as $this->connection->queryTableInformation($this->table); but stored the function in the schema.

Comment #22

josh waihi commented 6 January 2009 at 05:10

In regard to my comment above: when schema() is called, it merely initializes an instance of DatabaseSchema_pgsql, which in turn assigns the connection to a variable within the instance. When schema() is called, it also assumes that schema.inc has been included. So I'm still a little confused as to how moving queryTableInformation() to schema.inc will slow things down.

Comment #23

damien tournoud commented 6 January 2009 at 08:20

@Josh: calling ->schema() will trigger a $this->schema = new $class_type($this);, which will trigger, via the SPL autoloader, the inclusion of schema.inc.
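
A rough, self-contained sketch of that lazy-loading behaviour follows. The class names SketchConnection and SketchSchema are hypothetical, and the class declared inside the autoloader callback merely stands in for including schema.inc. The closure assumes PHP 5.3 or later.

```php
<?php
// Records every class name the autoloader is asked for.
$autoloaded = array();

spl_autoload_register(function ($class) use (&$autoloaded) {
  $autoloaded[] = $class;
  if ($class === 'SketchSchema') {
    // Stand-in for including schema.inc the first time the class is needed.
    class SketchSchema {
      public $connection;
      public function __construct($connection) {
        $this->connection = $connection;
      }
    }
  }
});

class SketchConnection {
  protected $schema = NULL;

  public function schema() {
    if ($this->schema === NULL) {
      $class_type = 'SketchSchema';
      // Instantiating a not-yet-defined class is what triggers the autoloader.
      $this->schema = new $class_type($this);
    }
    return $this->schema;
  }
}

$connection = new SketchConnection();
$before = $autoloaded;   // Creating the connection loads nothing extra.
$connection->schema();
$after = $autoloaded;    // The first schema() call requests the schema class.
```

The cost under discussion is therefore paid on the first schema() call of a request, not when the connection object is created.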

Comment #24

josh waihi commented 6 January 2009 at 22:26

Status	File	Size
new	349671-postgresql-free-at-last_3.patch	7.49 KB

OK, fair enough, I'm convinced. This is Damien's patch with a few small changes and error clean-ups. I am currently testing it on a PostgreSQL testbed and will post results when done.

Comment #25

josh waihi commented 6 January 2009 at 23:08

Status: Needs work » Needs review

forgot to change status

Comment #26

josh waihi commented 6 January 2009 at 23:10

Status: Needs review » Needs work

also test results on postgres: Failed: 8138 passes, 5 fails, 0 exceptions <-- no different to CVS HEAD

Comment #27

josh waihi commented 6 January 2009 at 23:10

Status: Needs work » Needs review

stupid status defaults!!!

Comment #28

Crell commented 7 January 2009 at 07:47

I still disagree that this belongs in the connection object rather than the schema object. At absolute minimum, if the performance difference is enough to justify it we should document why we're doing this lookup in the "wrong" place.

Comment #29

8 January 2009 at 01:10

Status: Needs review » Needs work

The last submitted patch failed testing.

Comment #30

josh waihi commented 8 January 2009 at 02:31

Status: Needs work » Needs review

faulty slave, retesting

Comment #31

josh waihi commented 30 January 2009 at 03:47

Status	File	Size
new	349671-postgresql-free-at-last_not-really.patch	8.93 KB

OK, time trials: I selected the database unit tests (717 tests) and ran them through the web interface. Results:

HEAD 1/30/09: The tests finished in 14 min 34 sec.
Patch applied: The tests finished in 14 min 22 sec.
Patch below applied: The tests finished in 14 min 45 sec. (logic was stored in schema as Crell suggested)

In this instance we are talking seconds of difference, but there is no doubt that the patch above in #24 is the optimal one.
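
To put those timings side by side, here is a small sketch of the arithmetic. It assumes "Patch applied" refers to the connection-object version from #24 and "Patch below applied" to the schema version attached to this comment. The helper sketch_to_seconds() is a hypothetical name used only for this illustration.

```php
<?php
// Convert an "M min S sec" timing to seconds.
function sketch_to_seconds($minutes, $seconds) {
  return $minutes * 60 + $seconds;
}

$head = sketch_to_seconds(14, 34);              // HEAD
$connection_patch = sketch_to_seconds(14, 22);  // "Patch applied"
$schema_patch = sketch_to_seconds(14, 45);      // "Patch below applied"

// Differences relative to HEAD, in seconds and in percent of the HEAD run.
$connection_delta = $connection_patch - $head;
$schema_delta = $schema_patch - $head;
$connection_percent = round($connection_delta / $head * 100, 2);
$schema_percent = round($schema_delta / $head * 100, 2);

printf("connection version: %+d s (%+.2f%%)\n", $connection_delta, $connection_percent);
printf("schema version: %+d s (%+.2f%%)\n", $schema_delta, $schema_percent);
```

On a single run of each, that is 12 seconds faster and 11 seconds slower than HEAD, roughly 1.4% and 1.3% of the HEAD run time.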

Comment #32

catch commented 30 January 2009 at 11:21

Josh - I know it's annoying, but we probably need those run through again; I'm sure there's a few seconds' margin of error running tests on most machines anyway. Either that, or can you run ab on a database-heavy page?

Comment #33

josh waihi commented 31 January 2009 at 23:13

@catch, this was only over 717 tests out of 8000+. I'd imagine the results would amplify and become clearer if all 8000+ tests were run. Time is the issue here. Wish boombatower would get those postgres testbeds running ;)

Comment #34

bjaspan commented 4 February 2009 at 14:25

This patch seems to work to fix the problem with installation on pgsql identified by #361683: Field API initial patch. I did not review the code other than to try the patch.

Comment #35

josh waihi commented 17 February 2009 at 04:46

#24 fixes #375264: Field API failing on PostgreSQL

Crell, can we get this into HEAD? Because HEAD is broken on PostgreSQL.

Comment #36

damien tournoud commented 17 February 2009 at 07:09

Status: Needs review » Reviewed & tested by the community

Status	File	Size
new	349671-postgresql-free-at-last-not-really.patch	7.44 KB

This is a reroll of #24. It's about time to get this little one in.

Comment #37

webchick commented 18 February 2009 at 13:56

Hm. I want to commit this, I really do, because every second that we can't install D7 on pgsql means we are likely introducing more pgsql bugs in subsequent patches since we are effectively unable to test.

However, I still can't tell if Crell's concerns were addressed. :(

Crell, are you out there? Would you support committing this even as an interim fix before we re-work schema.inc in the way that you want?

Comment #38

webchick commented 18 February 2009 at 15:11

Status: Reviewed & tested by the community » Needs review

Comment #39

Crell commented 18 February 2009 at 17:36

I'll try to have a look at this patch tonight US time. Sorry, the past week and a half have been totally insane. :-(

Comment #40

Crell commented 22 February 2009 at 06:46

Status: Needs review » Reviewed & tested by the community

I still disagree that this information belongs in the connection object at all. Putting it there instead of in the schema object is a hack and I'm still not convinced that it's a necessary hack. However, it's a single-database-specific hack and we do need to make simpletest work in postgres. So for now RTBC, and let's try really hard to block out time to clean up schema so we can move this logic there later.

Comment #41

webchick commented 23 February 2009 at 06:29

Status: Reviewed & tested by the community » Needs work

Some minor issues:

+   * We introspect the database to collect that information required by insert
+   * and update queries.

Probably "the" information required...

+   *   An object with two member variables:
+   *     * 'blob_fields' that lists all the blob fields in the table.
+   *     * 'sequences' that lists the sequences used in that table.

use - rather than * for the bulleted list.

+          $table_information->sequences[] = $matches[1];

Just a question.. why are these not keyed like the blob_fields are above?

+          $blobs[$blob_cnt] = fopen('php://memory', 'a');

We don't abbreviate variable names. $blob_count. Also consistent with the lines at the bottom of the patch.

+          ++$blob_cnt;

$blob_count++ would be the norm. Could you place a comment there to explain why we pre-increment?

Comment #42

Crell commented 23 February 2009 at 06:40

Pre-increment is marginally faster than post-increment in PHP, for deep engine reasons I don't fully understand but have seen benchmarks for. In cases where either works equally well logically, pre-increment is therefore a (rather) slight performance benefit.

I don't know if intricacies of PHP engine semantics like that are worth a comment, or if they cross into the "we expect you to already know this" territory (as we do for a lot of other micro-optimizations).
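
As a minimal sketch of the "either works equally well logically" case (variable names other than $blob_count are made up for the example): used as standalone statements the two forms are interchangeable, and they differ only in the value the expression itself yields. The sketch says nothing about speed.

```php
<?php
// As standalone statements, both forms leave the counter in the same state.
$blob_count = 0;
++$blob_count;
$blob_count++;

// They differ only in the value of the expression itself.
$a = 5;
$pre = ++$a;   // Increments first, then yields the new value.
$b = 5;
$post = $b++;  // Yields the old value, then increments.
```

Since the patch uses the increment as a statement on its own line, swapping one form for the other would not change what the code does.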

Comment #43

josh waihi commented 23 February 2009 at 19:14

Status: Needs work » Reviewed & tested by the community

Status	File	Size
new	349671-postgresql-free-at-last_webchick-changes-41.patch	9.29 KB

Comment #44

webchick commented 24 February 2009 at 16:34

Status: Reviewed & tested by the community » Fixed

Committed to HEAD. Thanks!

Comment #45

josh waihi commented 2 March 2009 at 03:41

Status: Fixed » Needs review

Ah crap, the patch I posted contained the fixes made to the schema version. @Crell, I guess you got your wish.

Comment #46

2 March 2009 at 03:55

Status: Needs review » Needs work

The last submitted patch failed testing.

Comment #47

josh waihi commented 30 October 2009 at 03:37

Status: Needs work » Closed (fixed)

closing -
