See http://drupal.org/node/686196#comment-2742526

My own local testing shows that queries to information_schema take at least 0.6s each - so it's extremely likely that these calls are the cause of apache timeout issues on the installer.

This patch just removes the check, tries to make the table, and catches the exception afterwards. I do get a nice big red error with the correct error message, but I also get notices in drupal_add_http_header(). I'm exceptions stupid, so anyone wants to fix that would be great.

I expect this would have some improvement on simpletest run times as well, would be interesting to compare if anyone has time.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

andypost’s picture

db_create_table() already without checking, so maybe we remove all checks for existance in object creation functions? db_add_*

Also return value is FALSE when exception but ???NULL when success, I think this should be unified

catch’s picture

Status: Needs review » Closed (duplicate)
catch’s picture

Status: Closed (duplicate) » Needs review
FileSize
1.02 KB

Status: Needs review » Needs work

The last submitted patch, createtable.patch, failed testing.

aspilicious’s picture

Status: Needs work » Needs review

#3: createtable.patch queued for re-testing.

Crell’s picture

Status: Needs review » Needs work

Catching an exception just to throw it again without changes is useless. We want to catch it and then rethrow a DatabaseSchemaObjectExistsException for consistency with the rest of the Schema API.

Berdir’s picture

Status: Needs work » Needs review
FileSize
1.03 KB

Ok, what about this?

Crell’s picture

Status: Needs review » Needs work

That's more like it. We should probably document why we're doing it this way instead of a pre-check like everything else, though. Otherwise it's going to confuse the heck out of us in a year. :-)

Berdir’s picture

Status: Needs work » Needs review
FileSize
1.27 KB

Ok, added a comment.

catch’s picture

Looks good. Only issue here is I think that MySQL 5.0 is slow for all information_schema queries (this is apparently fixed in 5.1 for most things) - so should we look into making the same change elsewhere? It's less likely that we do many operations on tables apart from install, but it's still about 0.6 seconds per query extra.

Berdir’s picture

Oh, I did a short test for columns and it was quite fast (0.00something) and decided against doing the same for columns. But I'm using MySQL 5.1, obviously...

Can someone try the following query on MySQL 5.0?
SELECT SQL_NO_CACHE 1 FROM information_schema.columns WHERE table_schema = 'database_name' AND table_name = 'system' AND column_name = 'name'

And report the time...

Dries’s picture

I question the correctness of DatabaseSchemaObjectExistsException. I don't think it is guaranteed that the table already exists. When I look at the code, it feels like creating the table could fail because the schema is incorrect, or because the disk is full for that matter. Thoughts?

Berdir’s picture

Correct.

We could run tableExists() in case of an exception and simply re-throw the PDO exception in case the table does not exist....

catch’s picture

I still think we should just rethrow the PDO exception here (or let it bubble up), consistency is pointless if it's not correct.

bjaspan’s picture

re #11: On a mysql 5.0 server with 2000ish databases:

mysql> SELECT SQL_NO_CACHE 1 FROM information_schema.columns WHERE table_schema = 'g996' AND table_name = 'system' AND column_name = 'name';
+---+
| 1 |
+---+
| 1 | 
+---+
1 row in set (1.05 sec)
David_Rothstein’s picture

We could run tableExists() in case of an exception and simply re-throw the PDO exception in case the table does not exist....

This will work for MySQL but will still lead to false positives for the other database drivers, since their createTableSql() implementations run more than one SQL statement (beyond the initial table creation).

As I already commented in the other issue, I think the only way to fix that would be something like this:

... can we trust that the first statement returned by $this->createTableSql() is always the one that actually creates the table? If so, then it's possible to know when to do the tableExists() call - we could just do it only when we catch an exception on the first statement. If not, then it's not clear when we should check it without risking mistaken information.

On the other hand, even committing the patch as it is now would at least be trading a critical bug for a normal one, which is not such a bad tradeoff :)

****

Once we figure out the correct pattern here, we should definitely make all the others (columnExists, indexExists, etc) conform to it as well. If those queries are slow, there is no reason to do them except when necessary.

andypost’s picture

#15 If querying information schema is a hard and most shared hostings have a lot of databases maybe better to use

select 1 from {tablename} limit 1

and catch exception if table does not exists

Damien Tournoud’s picture

I actually like the SELECT 1 LIMIT 1 idea. Feels just hackish enough so that it could work :)

David_Rothstein’s picture

I like that too. Would probably work for columnExists() also, e.g: SELECT column_name FROM {table_name} LIMIT 1, right?

andypost’s picture

A bit hackish but it works, suppose this approach just require a comment.
Also this could help to catch and throw internal DatabaseSchemaObjectExistsException

bjaspan’s picture

re #17-20: I thought that's what D6 did. In fact the mysql driver for D6 did SHOW TABLES LIKE escape($tablename) or SHOW COLUMNS FROM escape($tablename) LIKE escape($columnname). So I'd say SELECT 1/column_name FROM tablename LIMIT 1 and catching an exception is (a) no more hacky than what D6 did, (b) likely to be faster since SHOW LIKE might end up scanning the whole list anyway, and (c) is what I thought we were already doing anyway. :)

catch’s picture

Status: Needs review » Needs work

Yeah I like SELECT 1 too, we can throw a specific exception without worrying if it's the right one, and it can go straight into createTable() and the others so we don't need to change the db_* functions around.

andypost’s picture

Status: Needs work » Needs review
FileSize
2.31 KB

Patch to start, this needs an extensive comments, so maybe some help with it

Crell’s picture

Status: Needs review » Needs work

Isn't this workaround only needed for MySQL? Unless Postgres and SQLite need the same hack, let's move that logic down into the MySQL driver rather than in the generic implementation. And then document why we're doing it (speed, with the possibility to remove it when we start requiring MySQL 5.1.)

I like this approach better than catching/decoding/rethrowing exceptions. It just feels cleaner.

andypost’s picture

@Crell I think this approach is universal for any database so it could live in base class

Crell’s picture

I don't mean if it would work, I mean if it's necessary. If the canonical approach works fast enough in Postgres and SQLite, I'd rather leave the canonical approach in there and just override MySQL. Either way it needs better documentation.

andypost’s picture

As I write before this needs good comments and benchmarks.

bcn’s picture

Reroll of andypost's patch from #23 with what (I presume) Crell had in mind with the comment in #24?

andypost’s picture

Status: Needs work » Needs review

Just a status change

David_Rothstein’s picture

We know that MySQL is a problem because a number of people have been trying Drupal 7 on servers that have thousands of MySQL databases on them. Has anyone tried the same for the other database drivers?

It seems like we don't know that it's a problem for those, but don't know that it's not a problem either :) In many ways it would be simpler to just make the change for all of them.

Either way, it still needs some documentation explaining why we do it this way.

catch’s picture

This isn't a problem only due to the number of databases, my localhost with < 20 databases it still takes about 600ms to query information_schema, the issue has been fixed in MySQL 5.1 I think.

One reason for overridding it in the MySQL driver is we could simply remove those hunks of code when we require MySQL 5.1 (so we should add a @todo for that).

Damien Tournoud’s picture

Status: Needs review » Reviewed & tested by the community

Let's do that.

Dries’s picture

Status: Reviewed & tested by the community » Needs work

We're still missing the @todo or add better documentation -- see #31.

Also, 'drupal' should be written 'Drupal'.

Crell’s picture

#28 needs:

1) Documentation as to why we're overriding those methods.

2) A @todo to see if we can remove it once we require MySQL 5.1.

3) It is not necessary to duplicate the docblocks from the parent class.

Once those are addressed, I'm good with this approach.

andypost’s picture

Suppose it's better to check mysql version internaly and make something like savepoints support for sqlite in #669794: Use savepoints for nested transactions

catch’s picture

I don't think we need to do that, this approach works equally well on either MySQL version and should be similar performance-wise, a @todo here is fine.

andypost’s picture

Title: MySQL-optimized tableExists() implementation » Don't check for existing tables in db_create_table()

Postgres also is not lightning fast on metadata, I think we need more statistics. Probably this change could make pgsql is much faster with tests

EDIT postgres tests on User

Without patch: Test run duration: 4 min 40 sec
Patch: Test run duration: 4 min 32 sec
Crell’s picture

Title: Don't check for existing tables in db_create_table() » Don't check for existing tables in db_create_table() for MySQL

The base implementation should do the SQL standard approach.

If we need to do something else in specific drivers, do that in those drivers.

If we need to do a Postgres version of this as well, let's open a new issue and do that in the Postgres driver.

Crell’s picture

Title: Don't check for existing tables in db_create_table() for MySQL » MySQL-optimized tableExists() implementation

Better title

andypost’s picture

Title: Don't check for existing tables in db_create_table() » MySQL-optimized tableExists() implementation
Status: Needs work » Needs review
FileSize
1.08 KB

This patch needs help with comment

Crell’s picture

Revised patch with expanded comments. I did a little googling and dropped in a link to the MySQL bug ticket, which is apparently still not fixed. I therefore took out the reference to MySQl 5.1 and instead said to remove the override when MySQL stops sucking.

Status: Needs review » Needs work

The last submitted patch, 748340-mysql-exists-d7.patch, failed testing.

Crell’s picture

Status: Needs work » Needs review

How is that possible when nothing changed except comments, and the details report says green? I call shenanigans...

andypost’s picture

Status: Needs review » Reviewed & tested by the community

@Crell I think that we should re-open #730220: One fragile test: ImageFieldDisplayTestCase because "Image field display tests" was broken so I ask to re-test and this makes green

Crell’s picture

Hm. Like I said, shenanigans. :-) Either way, it looks like it works now so yay!

Dries’s picture

Status: Reviewed & tested by the community » Fixed

Great job -- love the diligence. Committed to CVS HEAD.

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.