Many core tables have very poor indexes - for example {system}. At the very least we should do a basic audit of the core tables where queries run on every page load.

For a start, the {system} table. The most common queries are of the form SELECT * FROM {system} WHERE type = 'module' AND status = 1 and SELECT * FROM {system} WHERE type = 'module' AND name = '%s'

Right now, neither of these sorts of queries uses any index, and the existing index 'weight' is useless and will never be used.

mysql> explain select * FROM system where type = 'module' AND status = 1  ORDER by weight, filename; 
+----+-------------+--------+------+---------------+------+---------+------+------+-----------------------------+
| id | select_type | table  | type | possible_keys | key  | key_len | ref  | rows | Extra                       |
+----+-------------+--------+------+---------------+------+---------+------+------+-----------------------------+
|  1 | SIMPLE      | system | ALL  | NULL          | NULL |    NULL | NULL |   41 | Using where; Using filesort |
+----+-------------+--------+------+---------------+------+---------+------+------+-----------------------------+

If we add the key KEY `type_status` (`type`(8),`status`,`weight`,`filename`), we do much better (see module.inc around line 61 - both queries below have the same EXPLAIN result):

mysql> explain select * FROM system where type = 'module' AND status = 1  ORDER by weight, filename; 
mysql> explain select * FROM system where type = 'module' AND status = 1 AND bootstrap = 1 ORDER by weight, filename; 
+----+-------------+--------+------+---------------+-------------+---------+-------------+------+-------------+
| id | select_type | table  | type | possible_keys | key         | key_len | ref         | rows | Extra       |
+----+-------------+--------+------+---------------+-------------+---------+-------------+------+-------------+
|  1 | SIMPLE      | system | ref  | type_status   | type_status |      28 | const,const |   13 | Using where |
+----+-------------+--------+------+---------------+-------------+---------+-------------+------+-------------+

Adding an index like KEY `name_type` (`name`(16),`type`(8)) optimizes the other sort of query.
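In raw MySQL, the two proposed keys (and the removal of the useless weight index) would look roughly like this - a sketch only; the patch does this through the schema definition rather than hand-run DDL:

```sql
-- Composite index for the type/status query; prefix length 8 on type
-- keeps the key small, and weight/filename are appended so the
-- ORDER BY can be satisfied from the index.
ALTER TABLE system ADD KEY type_status (type(8), status, weight, filename);

-- Composite index for the type/name lookup.
ALTER TABLE system ADD KEY name_type (name(16), type(8));

-- The existing single-column weight index is never used and can go.
ALTER TABLE system DROP KEY weight;
```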

mysql> explain SELECT * FROM system where type = 'module' AND name = 'system';
+----+-------------+--------+------+-----------------------+-----------+---------+-------------+------+-------------+
| id | select_type | table  | type | possible_keys         | key       | key_len | ref         | rows | Extra       |
+----+-------------+--------+------+-----------------------+-----------+---------+-------------+------+-------------+
|  1 | SIMPLE      | system | ref  | type_status,name_type | name_type |      72 | const,const |    1 | Using where |
+----+-------------+--------+------+-----------------------+-----------+---------+-------------+------+-------------+

Starting patch attached just for system.schema.


Comments

pwolanin’s picture

FileSize
1.67 KB

a small start too on taxonomy suggested by David Strauss.

moshe weitzman’s picture

subscribe ... thanks for diving into this.

Zothos’s picture

Where else can indexes increase performance? It can't only be the taxonomy table... :P

BioALIEN’s picture

Finally someone has stepped up to this. Thanks pwolanin!

Last month I analysed the database of a 5.x install. Here are the comments I noted:
1. node table: phpMyAdmin spat out "PRIMARY & INDEX keys should not both be set on column 'nid'. More than one INDEX key was created for column 'status'."
2. blocks table, NO INDEX DEFINED! - added INDEX on delta or module columns?
3. flood table has no INDEX! - Create INDEX on timestamp?
4. All CCK generated tables require "nid" to be set as INDEX. Issue submitted to CCK devs: http://drupal.org/node/162535

The list goes on but it is related to contrib modules rather than core itself.

Hope this helps!

pwolanin’s picture

Thanks - there was an article recently (http://www.dave-cohen.com/node/1066) that pointed out that some tables such as blocks are missing a PRIMARY key (bad).

pwolanin’s picture

Priority: Normal » Critical
FileSize
5.06 KB

the frickin Actions module's table has a broken schema index declaration, no primary key, and no validation function to avoid inserting duplicates.

Also, the {locales_target} is missing a Primary key - should it be (lid, language, plid) or (lid, language, plural) or something else???

2 of 3 of the search module tables don't have a Primary key- is there another patch pending for these tables?

ChrisKennedy’s picture

http://drupal.org/node/147298 has some earlier work on this issue by David if you haven't seen it.

David Strauss’s picture

Subscribing.

pwolanin’s picture

wtf? Trying to double-check that the suggested patch by David Strauss for removing the bid column from {blocks} makes sense - there is obviously some confused code in block module:

$ grep -rn blocks.*bid *

modules/block/block.module:277:  return db_fetch_array(db_query("SELECT bx.*, bl.title FROM {boxes} bx INNER JOIN {blocks} bl ON bx.bid = bl.delta WHERE bl.module = 'block' AND bx.bid = %d", $bid));

modules/block/block.module:382:    $result = db_query(db_rewrite_sql("SELECT DISTINCT b.* FROM {blocks} b LEFT JOIN {blocks_roles} r ON b.module = r.module AND b.delta = r.delta WHERE b.theme = '%s' AND b.status = 1 AND (r.rid IN (%s) OR r.rid IS NULL) ORDER BY b.region, b.weight, b.module", 'b', 'bid'), $theme_key, implode(',', array_keys($user->roles)));

modules/block/block.admin.inc:296:  db_query("DELETE FROM {blocks} WHERE module = 'block' AND delta = %d", $form_state['values']['bid']);

modules/system/system.install:2497:    db_query("UPDATE {blocks} SET title = '%s' WHERE delta = %d and module = 'block'", $box->title, $box->bid);

So somehow the delta from {boxes} and the bid are the same in some cases?

David Strauss’s picture

@pwolanin My patch in the old issue isn't nearly complete, and I wouldn't make such a change on this issue because this issue is performance-focused.

The reason I removed the bid column on the old issue's patch is because it's redundant, at best. We already identify blocks by module and delta; it's unnecessary to introduce another column. Also, because we so casually delete and insert rows from/into the blocks table to refresh it, an auto_increment column isn't even a reliable primary key.

pwolanin’s picture

@David Strauss - yes - seems an autoincrement doesn't make much sense. Still, perhaps defining a unique index on (module, delta, theme) would be simpler than removing this column (at least for 6.x)?
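A unique index along those lines would look roughly like this in MySQL (the index name tmd is an assumption; note the key-length caveat in the comment):

```sql
-- Hypothetical unique key guaranteeing one row per block placement.
-- With theme at VARCHAR(255) under UTF-8 (3 bytes/char), the full key
-- is 3*(64+32+255) = 1053 bytes and exceeds MySQL's 1000-byte key
-- limit, so the theme column would need shortening (or a prefix
-- length, which defeats uniqueness) first.
ALTER TABLE blocks ADD UNIQUE KEY tmd (module, delta, theme);
```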

bjaspan’s picture

Subscribe.

m3avrck’s picture

This can still make it in Drupal 6 since it is both a bug and performance fix.

catch’s picture

Just a quick note: http://drupal.org/node/146688 was a duplicate, and has a different suggested index from the latest patch, doesn't it?

David Strauss' post from that issue here for ref:

ALTER TABLE {term_data} DROP INDEX vid;
ALTER TABLE {term_data} ADD INDEX taxonomy_tree (vid, weight, name);
catch’s picture

http://drupal.org/node/172724

was also duplicate.

coupet’s picture

subscribe... we need a comprehensive solution for ver. 6

jaydub’s picture

subscribe

dmitrig01’s picture

Bump. Needs to be re-rolled for actions -> triggers

pwolanin’s picture

split off the Trigger module portion here: http://drupal.org/node/178650

pwolanin’s picture

re-rolling patch with the rest of the changes in the above patch

catch’s picture

This patch: http://drupal.org/node/147298 also related although not sure if/which should be marked as duplicate.

catch’s picture

#171685 was duplicate.

Dries’s picture

I'd love to commit this patch but it is hard to verify whether the changes actually improve things or not. All the information seems to be spread across multiple issues. Is there a good way to test this, or do we just commit this and assume it does more good than harm?

JirkaRybka’s picture

Another issue is http://drupal.org/node/185126 (system table, containing a patch, with upgrade path and benchmark attempt too (seems to be 3.5% gain for cached pages)).

NancyDru’s picture

I can confirm (at least under 5.x) that adding theme, module, delta as a key to blocks makes a visible performance difference, although I have not benchmarked it.

NancyDru’s picture

Also note that the "nid" and "status" indexes on node are redundant.

catch’s picture

Is this needs work because there's no upgrade path? What else needs to be done to get it in?

#69111 was yet another duplicate.

catch’s picture

Dries: http://drupal.org/node/146688 states that taxonomy_get_tree (forum index) currently uses a filesort. This is a big issue on drupal.org and any site with a decent number of forums. The proposed index by David Strauss in that issue is exactly the same as the one in this patch.

However, the system changes look much better dealt with over at http://drupal.org/node/185126, so I've attached a patch with that section taken out so the system module index can be dealt with in its own right.

So what does this need to get in? Upgrade path?

catch’s picture

ack! schema files are gone of course.

So reroll to move into .install + upgrade path. Sorry for so many posts in a row.

catch’s picture

Status: Needs work » Needs review
FileSize
1.2 KB

This patch moves the schema changes back to .install.

This section:

  $schema['term_hierarchy'] = array(
@@ -35,7 +35,6 @@ function taxonomy_schema() {
     'indexes' => array(
       'nid' => array('nid'),
       'tid' => array('tid'),
-      'vid' => array('vid')

Appears to have already been changed, so I left that out. Marking back to review now. This still needs an upgrade path assuming agreement on the indexes themselves.

JirkaRybka’s picture

Related issue: http://drupal.org/node/85453 - Statistics tables.

catch’s picture

#147298 was closed in favour of this one, however David Strauss' patch has a lot of work that didn't make it over here (and is against .schema files). Deals with aggregator, block, filter, system and taxonomy. system is best dealt with on that other issue, but I'll try to roll anything that doesn't conflict into a new patch against HEAD today. We'll still need an upgrade path for this once that's done though.

catch’s picture

FileSize
6.82 KB

Ok this is a re-roll of the previous patches in this issue with the addition of David Strauss' suggested changes linked above.

There are three schema changes in this (the removal of id columns in block, taxonomy and filter module schemas and their replacement with better primary keys) - I'm not sure if that will get in at this stage with beta, but I've left them in. If they should be taken out I'm happy to re-roll without those sections and either revive or create separate issues for those particular changes rather than complicate this one.

Leaving at "needs review", but we still have zero upgrade path for any of these changes as yet, so more work to be done (and probably not by me, all I'm doing is reading issues and cut and pasting to get this presentable).

This is untested but I'll clear my dev copy + search through modules to see how much is broken by the schema changes a bit later.

catch’s picture

Thinking about it, we should absolutely not introduce any schema changes in this patch. I'll look for any relevant block/taxonomy/filter issues and split those changes over to there, probably for D7.

So this patch only deals with index changes, still needs review and upgrade path.

catch’s picture

block.module currently only has a primary key on bid, nothing else, so presumably there are index optimisations possible which we could deal with here.

Linked issues above removed bid (autoincrement) entirely, I've split these to #14158 with a re-rolled patch for D7, don't see changes of that size getting into D6 at this point.

catch’s picture

More reading around, sorry for the multiple bumps, every time I think I'm done I find another issue trying to deal with this.

node.install has these indexes on {node}.

    'indexes' => array(
      'node_changed'        => array('changed'),
      'node_created'        => array('created'),
      'node_moderate'       => array('moderate'),
      'node_promote_status' => array('promote', 'status'),
      'node_status_type'    => array('status', 'type', 'nid'),
      'node_title_type'     => array('title', array('type', 4)),
      'node_type'           => array(array('type', 4)),
      'status'              => array('status'),
      'uid'                 => array('uid'),
      'tnid'                => array('tnid'),
      'translate'           => array('translate'),
      ),
    'unique keys' => array(
      'nid_vid' => array('nid', 'vid'),
      'vid'     => array('vid')
      ),
    'primary key' => array('nid'),
    );

This leaves redundant indexes on nid and status; the attached patch removes the ('nid', 'vid') unique key and the 'status' index as reported by nancyw in #26 (and phpMyAdmin). There's still some redundancy even with this, but it'll require better eyes than mine to work out how to remove it cleanly without making something else worse.

David Strauss’s picture

At this stage, we probably can't remove important indexes and columns in a way that would require code refactoring.

NancyDru’s picture

I thought bid (in the boxes table) was how the box was linked to the block. If you remove it, how do they tie together?

David Strauss’s picture

@nancyw Blocks are uniquely identified by module and delta. bid is a redundant unique index.

JirkaRybka’s picture

On the boxes table, indeed bid is used, but the patch earlier in this issue removed bid on the blocks table, where it seems to be really unused. The relation (as seen in the code) seems to be blocks.delta - boxes.bid. But anyway, this is D7 stuff (schema changes), so let's proceed without this, as suggested above.

NancyDru’s picture

Ah, I see now. I was confused because there is no "bid" in the blocks table in D5. But I did manually add a primary on theme, module, and delta and saw a performance difference.

catch’s picture

primary on theme, module, and delta

This is the one main change I can think of that isn't in the patch as yet. I also looked through all D6 tables and I'm pretty sure the parent and tid indexes in {term_hierarchy} could go - primary key is ('tid', 'parent') anyway.

David Strauss’s picture

The (tid, parent) key is not usable for queries selecting on parent alone.

JirkaRybka’s picture

Yet more frequently performed queries are in the locale module. I didn't test them, but the indexes seem suboptimal to me too, and these tables are big (thousands of rows).

'locales_source' has primary key 'lid' and index 'source'(30), but almost all queries also use the new 'textgroup' column, so I guess it should be an index too. 'locales_target' has indexes 'lid', 'plid', 'plural', 'language' (which is fine), but no primary key (not sure whether this is a problem; the unique identifier here is lid+language FYI), and the upgrade path is missing the new 'language' index (!!!), which might be a separate issue.

The frequent runtime queries are:
Uncached string (>75 chars long):
SELECT s.lid, t.translation, s.version FROM {locales_source} s LEFT JOIN {locales_target} t ON s.lid = t.lid AND t.language = '%s' WHERE s.source = '%s' AND s.textgroup = 'default'
Cache rebuild (after new string added):
SELECT s.source, t.translation, t.language FROM {locales_source} s LEFT JOIN {locales_target} t ON s.lid = t.lid AND t.language = '%s' WHERE s.textgroup = 'default' AND s.version = '%s' AND LENGTH(s.source) < 75
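Assuming the two queries above are the hot paths, one possible (untested) set of index changes would be something like the following - the index name and prefix lengths are guesses, not part of any patch here:

```sql
-- Hypothetical: let the uncached-string lookup filter on source and
-- textgroup together instead of checking textgroup row by row after
-- the source prefix match. source is a text column, so it needs a
-- prefix length.
ALTER TABLE locales_source ADD KEY source_textgroup (source(30), textgroup(8));

-- Hypothetical: give locales_target the natural primary key discussed
-- above (lid + language uniquely identify a translation).
ALTER TABLE locales_target ADD PRIMARY KEY (lid, language);
```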

pwolanin’s picture

@David - for blocks is the unique key really (module, delta), or is it (module, delta, theme)?

David Strauss’s picture

@pwolanin Sorry, you're correct. You need the theme column to form a candidate key.

catch’s picture

Status: Needs review » Needs work

OK so block.module needs a key on theme, module, delta

@JirkaRybka - if the upgrade path is bad in locale, should we split it to its own issue or try to include it here?

Marking to needs work for those, and because there's still no upgrade path in my patch from #36

JirkaRybka’s picture

Status: Needs review » Needs work

Re: #47 - I guess we should create a new issue if it's only the locale upgrade path (which I will as time permits), but if we decide to change some keys for locale here, then it might go together.

--------------------------------------
Edit (2007-11-11): Issue created at http://drupal.org/node/191236

David Strauss’s picture

FileSize
16.49 KB

Here's my latest work. I'd like to have people agree on what we want before I spend time writing update code.

David Strauss’s picture

Status: Needs work » Needs review

I'd like people to review my proposed changes. Once we can agree on the indexes, I'll write the update code.

dmitrig01’s picture

Status: Needs review » Needs work

The index on comments.status is good - usually we select comments where status = 1
The order in statistics.install is odd

David Strauss’s picture

"The index on comments.status is good - usually we select comments where status = 1."

What queries did you have in mind? The mere existence of "status = 1" in the WHERE clause does not mean the index is useful.

chx’s picture

Status: Needs work » Needs review

If it's CNW (code needs work), no one will look into it. That's not what we want at this point.

catch’s picture

The {term_node} indexes should probably also be applied to {forum} as well - it's the same fields, many similar queries (more if this patch gets in).

NancyDru’s picture

As long as you're looking at clean up tasks, what is "tid" in the Permissions table? It is always 0 in all my sites.

scott.mclewin’s picture

I'm seeing slow queries show up on one of my sites in the cron-job cleanup of the watchdog logs. Reading the patch, I don't see this timestamp field indexed. Granted, this is not an eighteen-second query that my users explicitly sit through when browsing to a page, but it is tying up the database server, which slows down other requests.

# Query_time: 18 Lock_time: 0 Rows_sent: 0 Rows_examined: 0
DELETE FROM watchdog WHERE timestamp < 1191564006;

explain needs a select, so here's a variation on the delete:

mysql> explain select * FROM watchdog WHERE timestamp < 1191564006;
+----+-------------+----------+------+---------------+------+---------+------+--------+-------------+
| id | select_type | table    | type | possible_keys | key  | key_len | ref  | rows   | Extra       |
+----+-------------+----------+------+---------------+------+---------+------+--------+-------------+
|  1 | SIMPLE      | watchdog | ALL  | NULL          | NULL | NULL    | NULL | 120879 | Using where |
+----+-------------+----------+------+---------------+------+---------+------+--------+-------------+
1 row in set (0.00 sec)

Granted, I've kept too much watchdog log around (something I'm fixing), but the point for this thread is that the timestamp field in the watchdog table is a candidate for this patch.
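For reference, the index being proposed here would be a one-liner; whether the trade-off is worth it is debated just below:

```sql
-- Hypothetical: speed up the cron clean-out at the cost of extra
-- index maintenance on every watchdog INSERT.
ALTER TABLE watchdog ADD KEY `timestamp` (`timestamp`);
```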

David Strauss’s picture

I'm against adding an index to watchdog. It will slow down INSERTs, which do occur during user page loads. Plus, watchdog clean-outs are moving more toward fixed row limits anyway.

scott.mclewin’s picture

That's a good point, and I didn't know about the row limit approach being taken/considered for the watchdog logs.

catch’s picture

I ran this snippet, which identifies the worst-performing indexes, linked from here. I think David Strauss' patch already deals with some of these redundant indexes, but here are the results for reference (3 results removed because they were duplicates between Drupal installs).

table           indexname            column    seq  cols  card  estrows  sel%
cache_page      expire               expire    1    1     0     990      0.00
cache_views     expire               expire    1    1     0     28       0.00
locales_target  lang                 locale    1    1     2     16271    0.01
locales_target  plural               plural    1    1     1     16271    0.01
node            node_promote_status  promote   1    2     1     7089     0.01
node            node_moderate        moderate  1    1     1     7089     0.01
node            node_status_type     status    1    3     1     7089     0.01

This script shows the top 10 worst indexes (in terms of selectivity %) on the whole MySQL server instance. Selectivity is the percentage of distinct values in an indexed field relative to the number of records in the table.

same page also says:

If an index has 2 values, one of them occurring in 99% of rows and the other in 1%, it may well be a good idea to use the index to look up the second value - a status field is a typical example of such a field.

At least those node indexes are likely to fall under that case.
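Selectivity as defined above can also be estimated by hand for any candidate column; a sketch (table and column names taken from the output above):

```sql
-- Ratio of distinct values to total rows: values near 0 mean the
-- index rarely narrows the search, values near 1 mean it is highly
-- selective.
SELECT COUNT(DISTINCT status) / COUNT(*) AS selectivity
FROM node;
```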

klando’s picture

Status: Needs work » Needs review

PostgreSQL (>=8.2) users should enable statistics, add lots of indexes, and after a few days select from pg_stat* to see which indexes are used, never used, etc. It is usually a good starting point, as it provides very good data on what was read from disk or memory, via an index or not.

JirkaRybka’s picture

The locales_target lang/language/locale index (whatever it is named through the versions) will get much more selective once you have more languages set up. With only one language, all entries obviously have the same value.

catch’s picture

Well it looks like the only thing that needs doing with this patch is the statistics order and the upgrade path. None of the proposed changes have been challenged and additional indexes can always have their own issue later on.

edit: #49 still applies with offset. So it's a small reroll of that plus the upgrade path. Do we have a good example of an update which adds and/or drops indexes and works on pgsql?

Gábor Hojtsy’s picture

Anyone taking this up?

pwolanin’s picture

If there is consensus now on the changes required, I'll try to work on the remaining pieces

catch’s picture

pwolanin, nothing added in #49 has been disputed as yet, and a lot in there had been proposed in other, duplicate, issues of this one. If there's anything missing it's easy to do a followup patch as well.

David Strauss’s picture

If this patch causes any problems, there are only two possible causes:
1. We're trying to store invalid data into a table.
2. Or, we misunderstand what sort of data we should be storing in a table.

pwolanin’s picture

looking though the patch #49 - whose comment is this?

-        'description' => t('A comma-separated string of roles; references {role}.rid.'),
+        'description' => t('A comma-separated string of roles; references {role}.rid.'), // This is stupid.
pwolanin’s picture

Status: Needs review » Needs work

ok - patch applies cleanly, but install breaks:

Warning: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '0 (nid) ) /*!40100 DEFAULT CHARACTER SET UTF8 */' at line 6 query: CREATE TABLE history ( `uid` INT NOT NULL DEFAULT 0, `nid` INT NOT NULL DEFAULT 0, `timestamp` INT NOT NULL DEFAULT 0, PRIMARY KEY (uid, nid), INDEX 0 (nid) ) /*!40100 DEFAULT CHARACTER SET UTF8 */ in /Users/Shared/www/drupal6/includes/database.mysql.inc on line 167

Warning: Specified key was too long; max key length is 1000 bytes query: CREATE TABLE blocks ( `bid` INT NOT NULL auto_increment, `module` VARCHAR(64) NOT NULL DEFAULT '', `delta` VARCHAR(32) NOT NULL DEFAULT '0', `theme` VARCHAR(255) NOT NULL DEFAULT '', `status` TINYINT NOT NULL DEFAULT 0, `weight` TINYINT NOT NULL DEFAULT 0, `region` VARCHAR(64) NOT NULL DEFAULT '', `custom` TINYINT NOT NULL DEFAULT 0, `throttle` TINYINT NOT NULL DEFAULT 0, `visibility` TINYINT NOT NULL DEFAULT 0, `pages` TEXT NOT NULL, `title` VARCHAR(64) NOT NULL DEFAULT '', `cache` TINYINT NOT NULL DEFAULT 1, PRIMARY KEY (bid), UNIQUE KEY tmd (theme, module, delta), INDEX list (theme, status, region, weight, module) ) /*!40100 DEFAULT CHARACTER SET UTF8 */ in /Users/Shared/www/drupal6/includes/database.mysql.inc on line 167
NancyDru’s picture

I'm glad I'm not the only one to run into the length problem on blocks.

I have mine set to: theme(100), module(50), delta(6)

I can't imagine a theme name that long, but it worked and my performance improved, so I didn't change it.

If the theme name was changed to VARCHAR(64) like the module name, it might take care of the whole thing.

NancyDru’s picture

Just to confirm that, I just changed one of my test sites so that theme is defined as VARCHAR(64). Then I deleted and recreated the primary key without lengths and it worked fine.
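The manual change described above would look roughly like this in MySQL (index name tmd is an assumption, borrowed from the patch's naming):

```sql
-- Shrink theme so the full-length composite key fits under MySQL's
-- 1000-byte key limit under UTF-8, then rebuild the key without
-- prefix lengths so it can actually enforce uniqueness.
ALTER TABLE blocks MODIFY theme VARCHAR(64) NOT NULL DEFAULT '';
ALTER TABLE blocks ADD UNIQUE KEY tmd (theme, module, delta);
```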

pwolanin’s picture

Status: Needs work » Needs review
FileSize
17.99 KB

found this lovely error too, now that the filter index is unique:

Warning: Duplicate entry '1-filter-3' for key 2 query: INSERT INTO filters (format, module, delta, weight) VALUES (1, 'filter', 3, 10) in /Users/Shared/www/drupal6/includes/database.mysql.inc on line 167

This is due to a bug in system.install - I'll file a separate patch, since that's a trivial fix: http://drupal.org/node/194029

New patch attached. Limits the theme name to 64 chars for the index. The module field is already set to 64. Are theme names really allowed to be 255 chars?

Also, correct a defect in the {history} schema definition.

FiReaNGeL’s picture

You can limit the index size on one part (such as theme name) to like 10 characters instead of limiting the field itself. You really don't have to index the full string here; and I'm sure there's other places like this in already existing indexes.

pwolanin’s picture

FileSize
20.1 KB

ah - I cross-posted with nancyw - I think I'll agree that changing the column to 64 (rather than the index alone) is the better solution.

Here's a revised patch with a start on the update path.

pwolanin’s picture

@FiReaNG3L - yes, I also thought of that first, but it doesn't really make sense to limit the length of the indexed part if we are trying to achieve a unique index. You cannot be sure the index is unique unless you index the full length of the field.

FiReaNGeL’s picture

pwolanin: my bad, I missed the unique part. If we can't change the table substantially (add a primary key field or something), then I agree that reducing the size of the field would be the only option. 255 chars for such a field is overkill anyway - what's the theme with the longest name nowadays?

catch’s picture

Status: Needs review » Needs work

fresh install. enabled all modules. installed schema.module, all smooth so far.

catch’s picture

pwolanin’s picture

Status: Needs work » Needs review
FileSize
20.95 KB

ok - patch with (untested) update code. Code does not yet cover taxonomy, locale, or user.

locale needs bigger work that will be captured here: http://drupal.org/node/194310

taxonomy and user can be added to this patch (if anyone has time).

pwolanin’s picture

FileSize
22.53 KB

oops - last patch missed upload too. This one should have all except locale - taxonomy and user are included.

catch’s picture

Status: Needs review » Needs work

OK, I installed 5.x-dev and enabled all core modules, upgraded to D6, and installed schema module. The update went smoothly but the following errors were reported:


    * profile_values.uid is part of the primary key but is not specified to be 'not null'.
    * profile_values.fid is part of the primary key but is not specified to be 'not null'.

    *
      vocabulary
          o indexes list: missing in database

    *
      profile_values
          o column fid:
            declared: array('type' => 'int', 'unsigned' => 1, 'not null' => , 'default' => 0)
            actual: array('type' => 'int', 'unsigned' => 1, 'not null' => 1, 'default' => 0, 'disp-width' => '10')
          o column uid:
            declared: array('type' => 'int', 'unsigned' => 1, 'not null' => , 'default' => 0)
            actual: array('type' => 'int', 'unsigned' => 1, 'not null' => 1, 'default' => 0, 'disp-width' => '10')


    *
      aggregator_category_feed
          o indexes fid: missing in database
    *
      aggregator_category_item
          o indexes iid: missing in database


bjaspan’s picture

Regarding primary key fields not being 'not null', please read http://drupal.org/node/159328. Bottom line, add 'not null' => TRUE to all field definitions in the primary key.

When schema.module says an index is "missing in database" it means the index is declared by the module's hook_schema() but no index has been created in the database, presumably because the database has been upgraded from D5 and the update functions do not yet create that index.

pwolanin’s picture

Status: Needs work » Needs review
FileSize
23.57 KB

um, well - I'll have to run schema module on my other computer; apparently it needs PHP 5.1 (see http://drupal.org/node/194764).

Anyhow, here's another (as yet untested) patch that should correct the errors above.

catch’s picture

Status: Needs review » Needs work

quick 5.x-6.x-patched upgrade, I still get this in schema module. Everything else looks fixed though!

      vocabulary
          o indexes list: missing in database

fwiw, in the actual 6.x database itself, vocabulary has a primary key on vid and nothing else; it ought to have weight and name according to the schema definition.

pwolanin’s picture

Status: Needs work » Needs review
FileSize
23.68 KB

new patch catches the missing vocabulary index. I think everything else matches the schema definition.

catch’s picture

Status: Needs review » Reviewed & tested by the community

indeed it does.

bjaspan’s picture

Status: Reviewed & tested by the community » Needs work

Errors on pgsql.

* warning: pg_query() [function.pg-query]: Query failed: ERROR: constraint "upload_pkey" does not exist in C:\Documents and Settings\bjaspan\Home\WORK\drupal\head\includes\database.pgsql.inc on line 155.
* user warning: query: ALTER TABLE upload DROP CONSTRAINT upload_pkey in C:\Documents and Settings\bjaspan\Home\WORK\drupal\head\includes\database.pgsql.inc on line 790.
* warning: pg_query() [function.pg-query]: Query failed: ERROR: multiple primary keys for table "upload" are not allowed in C:\Documents and Settings\bjaspan\Home\WORK\drupal\head\includes\database.pgsql.inc on line 155.
* user warning: query: ALTER TABLE upload ADD PRIMARY KEY (vid,fid) in C:\Documents and Settings\bjaspan\Home\WORK\drupal\head\includes\database.pgsql.inc on line 778.

It's dinnertime, I'll try to look into this later.

pwolanin’s picture

I'm getting ready to try it on postres - did you get those errors on a 5.x upgrade, or on install?

pwolanin’s picture

A clean install with PostgreSQL 8.2 (obviously applying the patch before the install) doesn't give me any errors, and schema module reports only "Match" for all installed tables.

pwolanin’s picture

ok - I'm seeing about these same sorts of errors now on a D5 -> D6 update. It looks at first like the system update that's supposed to match the MySQL and PostgreSQL schemas needs to do something more with the upload table? The index I see is named file_revisions_pkey, not upload_pkey.

pwolanin’s picture

Status: Needs work » Needs review
FileSize
24.78 KB

attached patch straightens out the primary key and other indexes for {upload} by fixing up system_update_6022(). This prevents the errors described above.

as expected, schema module reports a mismatch for locale module:

* languages
indexes list: missing in database

the {languages} index probably needs to be fixed in: http://drupal.org/node/194310 since it's not clear if those update functions will come before or after.

bjaspan’s picture

Status: Needs review » Needs work

With patch #90 applied, I get

* warning: pg_query() [function.pg-query]: Query failed: ERROR: index "locales_target_language_idx" does not exist in C:\Documents and Settings\bjaspan\Home\WORK\drupal\head\includes\database.pgsql.inc on line 155.
* user warning: query: DROP INDEX locales_target_language_idx in C:\Documents and Settings\bjaspan\Home\WORK\drupal\head\includes\database.pgsql.inc on line 854.

along with the languages.list index warning, which sounds like is as expected. Note that I have not actually reviewed any of the code or new indexes for correctness, I just ran the D5 > D6 update for pgsql.

pwolanin’s picture

Status: Needs work » Needs review
FileSize
24.73 KB

ah - at the last minute I added a line to drop an index - turns out it doesn't exist in the 5.x schema. I did it because in the patch for locale.install this index is removed from the 6.x schema. Attached patch is tested and should update without errors on pgSQL.

jaydub’s picture

Testing results with both MySQL (5.0.38) and PostgreSQL (8.1):

Fresh Drupal 6 Beta 3
- Installed all modules
- Installed Schema module
- Ran schema compare. No mismatched tables
- Installed the patch. No errors in MySQL or PostgreSQL
- Ran schema compare. 3 mismatched tables in both MySQL and PostgreSQL:

upload:
upload
* primary key:
declared: array('vid', 'fid')
actual: array('fid', 'vid')
* indexes fid: missing in database
* indexes vid: unexpected (not an error)

locale:
languages
* indexes list: missing in database

book:
book
* unique keys nid: missing in database
* indexes nid: unexpected (not an error)

Re-ran test with fresh Drupal 5.3
- Installed all modules
- Installed Drupal 6 Beta 3 over Drupal 5 and ran upgrade. No errors in MySQL or PostgreSQL
- Installed Schema module
- Ran schema compare. No mismatched tables
- Installed the patch. No errors in MySQL or PostgreSQL
- Ran schema compare. 3 mismatched tables in both MySQL and PostgreSQL same as above.

catch’s picture

jaydub - please test with the latest cvs checkout, or at least the latest development tarball:
http://drupal.org/node/97368
http://drupal.org/node/320

Also, you'll need to apply the patch before you run update.php from 5.3 to 6.x since that's the main issue here.

catch’s picture

Status: Needs review » Needs work

needs a re-roll in light of this: http://drupal.org/node/194310

pwolanin’s picture

Status: Needs work » Needs review
FileSize
24.71 KB

simple re-roll for system.install function numbering, etc - not tested yet.

catch’s picture

#96 applies cleanly and passes schema testing on a D5-6 update (MySQL). Missing locale index is still there but was being dealt with elsewhere I think.

pwolanin’s picture

FileSize
28.12 KB

re-roll including putting the proper schema API stuff into locale.install based on http://drupal.org/node/194310#comment-637940

pwolanin’s picture

Status: Needs review » Needs work

ok - getting some errors on PG upgrade:

warning: pg_query() [function.pg-query]: Query failed: ERROR: column "Account settings" does not exist LINE 1: UPDATE profile_fields SET category = "Account settings" WHER... ^ in /var/www/drupal/includes/database.pgsql.inc on line 155.
user warning: query: UPDATE profile_fields SET category = "Account settings" WHERE LOWER(category) = "account" in /var/www/drupal/modules/system/system.install on line 2705.
warning: pg_query() [function.pg-query]: Query failed: ERROR: column "javascript" of relation "languages" already exists in /var/www/drupal/includes/database.pgsql.inc on line 155.
user warning: query: ALTER TABLE languages ADD COLUMN javascript varchar(32) NOT NULL default '' in /var/www/drupal/includes/database.pgsql.inc on line 700.

I'm not sure if the first set is due to this patch - second set certainly is

pwolanin’s picture

pwolanin’s picture

FileSize
29.8 KB

ok - first error is due to bad use of quotes in a different patch. See:

http://drupal.org/node/188498#comment-646495

new patch attached which fixes locale module error above (and renumbers locale's update functions).

With this patch schema module reports 100% match of schema to definition with 5.x->6.x update on postgres

pwolanin’s picture

Status: Needs work » Needs review
catch’s picture

Status: Needs review » Reviewed & tested by the community

Tested again on MySQL and this fully passes schema testing on 5-6 upgrade. Should be ready.

Gábor Hojtsy’s picture

Can we run benchmarks on some key pages with and without the patch? Dries had a concern up above. I read all the follow ups in this issue, and it seems like all the indexes added came from actual site experience, slow query logs, worst performing indexes, etc. It would be great to validate this with some benchmarks though.

David Strauss’s picture

"It would be great to validate this with some benchmarks though."

Just a friendly reminder that this patch isn't just about performance. The new and improved UNIQUE indexes improve data integrity.

Gábor Hojtsy’s picture

Also, the "stupid" comment on user roles should be toned down a bit IMHO. We know not everything in Drupal is best architecture wise, but we can document this more politely IMHO. (Of course this is a minor little nitpick which I can modify on commit even, so take your time with the benchmarks instead).

webchick’s picture

Title: Improve table indicies for common queries » GHOP #63: Improve table indicies for common queries
Gábor Hojtsy’s picture

Status: Reviewed & tested by the community » Needs review
pwolanin’s picture

no one seems to have claimed this as a GHOP task yet - but we do need it done soon...

webchick’s picture

I specified as part of the task that it expires when this issue is marked 'fixed'. So anyone is free to benchmark, regardless if a student signs on to do this or not.

catch’s picture

Status: Needs review » Needs work

system_update_6041() just got committed, so this needs a reroll.

CorniI’s picture

I claimed this task one minute ago in GHOP ;)
I'm in IRC Corni

Gábor Hojtsy’s picture

Great, awaiting your feedback.

pwolanin’s picture

FileSize
29.77 KB

here's a re-rolled patch - it applies cleanly, but not tested further.

pwolanin’s picture

Status: Needs work » Needs review
catch’s picture

Just a quick note to say Corni kindly let us know that he'll be unable to begin this task until at least the weekend and it's been reopened for now on the GHOP task list. So this is still open for anyone to bench.

pwolanin’s picture

ok - some very crude benchmarks - I installed all core modules that create tables, created some users and content (~1500 nodes) with devel_generate, and added a couple of feeds manually. Increased items on the front page to 30.

(note - this exercise makes me think poll module and aggregator have some bugs).

using ab (1.3) across my local net to the server (an old Dell running kubuntu with Apache 2.2, PHP 5.2, MySQL 5.0 - no query cache).

$ ab -n200 -c2 http://192.168.1.5/drupal/node      
This is ApacheBench, Version 1.3d <$Revision: 1.73 $> apache-1.3
Copyright (c) 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Copyright (c) 1998-2002 The Apache Software Foundation, http://www.apache.org/

Benchmarking 192.168.1.5 (be patient)
Completed 100 requests
Finished 200 requests
Server Software:        Apache/2.2.4                                       
Server Hostname:        192.168.1.5
Server Port:            80

Document Path:          /drupal/node
Document Length:        73281 bytes

Concurrency Level:      2
Time taken for tests:   216.135 seconds
Complete requests:      200
Failed requests:        0
Broken pipe errors:     0
Total transferred:      14758200 bytes
HTML transferred:       14656200 bytes
Requests per second:    0.93 [#/sec] (mean)
Time per request:       2161.35 [ms] (mean)
Time per request:       1080.68 [ms] (mean, across all concurrent requests)
Transfer rate:          68.28 [Kbytes/sec] received

Aggregator:


$ ab -n200 -c2 http://192.168.1.5/drupal/aggregator
This is ApacheBench, Version 1.3d <$Revision: 1.73 $> apache-1.3
Copyright (c) 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Copyright (c) 1998-2002 The Apache Software Foundation, http://www.apache.org/

Benchmarking 192.168.1.5 (be patient)
Completed 100 requests
Finished 200 requests
Server Software:        Apache/2.2.4                                       
Server Hostname:        192.168.1.5
Server Port:            80

Document Path:          /drupal/aggregator
Document Length:        25144 bytes

Concurrency Level:      2
Time taken for tests:   123.125 seconds
Complete requests:      200
Failed requests:        16
   (Connect: 0, Length: 16, Exceptions: 0)
Broken pipe errors:     0
Total transferred:      5130778 bytes
HTML transferred:       5028778 bytes
Requests per second:    1.62 [#/sec] (mean)
Time per request:       1231.25 [ms] (mean)
Time per request:       615.62 [ms] (mean, across all concurrent requests)
Transfer rate:          41.67 [Kbytes/sec] received

Tracker

$ ab -n200 -c2 http://192.168.1.5/drupal/tracker
This is ApacheBench, Version 1.3d <$Revision: 1.73 $> apache-1.3
Copyright (c) 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Copyright (c) 1998-2002 The Apache Software Foundation, http://www.apache.org/

Benchmarking 192.168.1.5 (be patient)
Completed 100 requests
Finished 200 requests
Server Software:        Apache/2.2.4                                       
Server Hostname:        192.168.1.5
Server Port:            80

Document Path:          /drupal/tracker
Document Length:        10735 bytes

Concurrency Level:      2
Time taken for tests:   107.285 seconds
Complete requests:      200
Failed requests:        76
   (Connect: 0, Length: 76, Exceptions: 0)
Broken pipe errors:     0
Total transferred:      2247514 bytes
HTML transferred:       2145514 bytes
Requests per second:    1.86 [#/sec] (mean)
Time per request:       1072.85 [ms] (mean)
Time per request:       536.42 [ms] (mean, across all concurrent requests)
Transfer rate:          20.95 [Kbytes/sec] received

did a mysql dump (omitting create table commands):

 mysqldump -uroot -p -c --create-options -q --disable-keys -t dp60 > dump.sql

dropped all tables. Applied patch from #114. Installed Drupal and enabled modules. Truncated all tables from the mysql prompt and then loaded the SQL dump back in.

Node

$ ab -n200 -c2 http://192.168.1.5/drupal/node
This is ApacheBench, Version 1.3d <$Revision: 1.73 $> apache-1.3
Copyright (c) 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Copyright (c) 1998-2002 The Apache Software Foundation, http://www.apache.org/

Benchmarking 192.168.1.5 (be patient)
Completed 100 requests
Finished 200 requests
Server Software:        Apache/2.2.4                                       
Server Hostname:        192.168.1.5
Server Port:            80

Document Path:          /drupal/node
Document Length:        73281 bytes

Concurrency Level:      2
Time taken for tests:   216.310 seconds
Complete requests:      200
Failed requests:        0
Broken pipe errors:     0
Total transferred:      14758200 bytes
HTML transferred:       14656200 bytes
Requests per second:    0.92 [#/sec] (mean)
Time per request:       2163.10 [ms] (mean)
Time per request:       1081.55 [ms] (mean, across all concurrent requests)
Transfer rate:          68.23 [Kbytes/sec] received

Aggregator

$ ab -n200 -c2 http://192.168.1.5/drupal/aggregator
This is ApacheBench, Version 1.3d <$Revision: 1.73 $> apache-1.3
Copyright (c) 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Copyright (c) 1998-2002 The Apache Software Foundation, http://www.apache.org/

Benchmarking 192.168.1.5 (be patient)
Completed 100 requests
Finished 200 requests
Server Software:        Apache/2.2.4                                       
Server Hostname:        192.168.1.5
Server Port:            80

Document Path:          /drupal/aggregator
Document Length:        25144 bytes

Concurrency Level:      2
Time taken for tests:   122.903 seconds
Complete requests:      200
Failed requests:        0
Broken pipe errors:     0
Total transferred:      5152520 bytes
HTML transferred:       5050010 bytes
Requests per second:    1.63 [#/sec] (mean)
Time per request:       1229.03 [ms] (mean)
Time per request:       614.52 [ms] (mean, across all concurrent requests)
Transfer rate:          41.92 [Kbytes/sec] received

Tracker

$ ab -n200 -c2 http://192.168.1.5/drupal/tracker
This is ApacheBench, Version 1.3d <$Revision: 1.73 $> apache-1.3
Copyright (c) 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Copyright (c) 1998-2002 The Apache Software Foundation, http://www.apache.org/

Benchmarking 192.168.1.5 (be patient)
Completed 100 requests
Finished 200 requests
Server Software:        Apache/2.2.4                                       
Server Hostname:        192.168.1.5
Server Port:            80

Document Path:          /drupal/tracker
Document Length:        10735 bytes

Concurrency Level:      2
Time taken for tests:   107.643 seconds
Complete requests:      200
Failed requests:        74
   (Connect: 0, Length: 74, Exceptions: 0)
Broken pipe errors:     0
Total transferred:      2247505 bytes
HTML transferred:       2145505 bytes
Requests per second:    1.86 [#/sec] (mean)
Time per request:       1076.43 [ms] (mean)
Time per request:       538.22 [ms] (mean, across all concurrent requests)
Transfer rate:          20.88 [Kbytes/sec] received

So - the benchmarks in this case look essentially unchanged. However, that's expected since many of the index changes relate to data integrity, not speed.

klando’s picture

I haven't parsed the whole patch, nor applied it.

But, my 2 cents:

on tables like that (only a few thousand rows), indices are not really used, and the change is invisible, because of the cache.

data integrity should also be enforced by using foreign keys.

I don't know what the benefits are of changing a primary key like that: PK(a,b) vs PK(b,a) - what do you really expect?

Anyway, the best thing is to look at the EXPLAIN and EXPLAIN ANALYZE results.

Benchmarks must be done on a bigger DB.

catch’s picture

@klando Reordering of primary keys (and other indexes) - MySQL reads indexes left to right, so (a,b) gives a free index on (a). This eliminates the overhead of some duplicate indexes which are removed in the patch.

MyISAM doesn't have proper (any?) support for foreign keys.
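To make the left-to-right point concrete, here's a quick sketch. It uses SQLite (which follows the same leftmost-prefix rule as MySQL) rather than MySQL itself, and the table and index names are made up:

```python
import sqlite3

# Made-up two-column table, standing in for any (a, b) composite index.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (a INTEGER, b INTEGER)")
con.execute("CREATE INDEX a_b ON t (a, b)")

# Filtering on the leftmost column alone can still use the (a, b) index,
# so a separate single-column index on (a) is redundant.
plan = " ".join(row[3] for row in con.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM t WHERE a = 1"))
print(plan)  # the plan shows the a_b index being searched
```

That's why the patch can drop some single-column indexes once a composite index starting with the same column exists.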

pwolanin’s picture

@klando - feel free to come up with and do better benchmarks. We all know ab is not the best tool.

Otherwise, I think this is RTBC....

catch’s picture

Status: Needs review » Reviewed & tested by the community

Anyone I 'know' is welcome to a 5.x dump with c. 19k nodes, 2.5k users and 250k comments for benchmarking if that helps. But yes, I'm moving this back to rtbc. Benchmarks are good, but this is about more than performance (and it would only be an issue if they showed a performance hit, which they don't, so we only win here).

David Strauss’s picture

"so (a,b) gives a free index on (a)"

That's 99% true. There is a touch more overhead for MySQL to use an (a, b) index for just (a) than a simple (a) index. But *generally*, that doesn't merit the disk space and processing time needed to create an index on (a) when you already have (a, b).

CorniI’s picture

OK, I saw that I'm able to claim a task and do it in a week, not in 2 days, so I'll do this Saturday. But is this already done now, or do you want me to benchmark it too (platform: Gentoo Linux, MySQL 5.0, PHP 5.2.5, Core2Duo E6320, 4GB RAM)?

catch’s picture

David Strauss, thanks for the clarification!

CorniI: until this gets committed it's not done yet. Alternative benchmarks (and with a bigger dataset) would be great. As would running some relevant queries through EXPLAIN like klando suggested, although that's outside the remit of the GHOP task of course.

pwolanin’s picture

@Corni - an additional round of benchmarking would be well worth while if you are set up to do it.

CorniI’s picture

OK, I've a db of ~580MB created (devel_generate is teh rul0r ;)). Is this enough, or shall I add another hundred or two megs? I'll post the benchmark the next day (Saturday), then you're able to commit and fix this issue as fast as possible :)

pwolanin’s picture

@CorniI - see if you can find someone to show you how to put a translation in too, plus some feeds, etc. Maybe also some user profile categories and data if possible.

Probably that's big enough - make sure that you can dump and re-load without the table structure (unless you can think of a better way to bench the same content with different indices). As above, make sure you have no query cache. Hit several of the relevant pages - such as: a listing of users with a profile category, one or more taxonomy pages, aggregator, and a page with lots of comments.

CorniI’s picture

This is my GHOP task, if anyone of the Maintainers can say that's ok then i'll post it at google's tracker, so you can mark my task as closed :)

Specs:
Core2Duo E6320 overclocked to 2.24GHz, 4GB RAM, unknown HDD
Preparations:
I prepared a database of ~585MB:
>102000 rows in comments
~84000 in node
~27000 in term_data
~16850000 in term_node
200000 in users
and 10000 in vocabulary
All modules are enabled except for search - the search index can't be built with that amount of data (at least, with 100 nodes/cron, cron stops executing with PHP errors; with fewer nodes/cron I get the message "cron failed" when running it manually).
I manually added one forum container + forum + one thread, and some randomly chosen heise news feeds. I enabled all blocks and spread them randomly over the page ;)
I ran cron directly before I started ab.
What I benched:
the aggregator page
the tracker page
the index
a heavily loaded, big node with >4200 pages of comments and a huge amount of terms
I exited X for the benchmarks, so there shouldn't be much load except from ab and apache/mysql.
After benching the unindexed tables, I dropped all tables, applied the patch, and reinstalled Drupal. I activated all modules, truncated all tables, imported my old SQL, and re-ran the benchmarks.
All benchmarks are done with
$ab2 -c1 -n500 url

Compared results:

/node

unpatched

This is ApacheBench, Version 2.0.40-dev <$Revision: 1.146 $> apache-2.0
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Copyright 2006 The Apache Software Foundation, http://www.apache.org/

Benchmarking localhost (be patient)


Server Software:        Apache
Server Hostname:        localhost
Server Port:            80

Document Path:          /head/node
Document Length:        266 bytes

Concurrency Level:      1
Time taken for tests:   0.75467 seconds
Complete requests:      500
Failed requests:        0
Write errors:           0
Non-2xx responses:      500
Total transferred:      215000 bytes
HTML transferred:       133000 bytes
Requests per second:    6625.41 [#/sec] (mean)
Time per request:       0.151 [ms] (mean)
Time per request:       0.151 [ms] (mean, across all concurrent requests)
Transfer rate:          2769.42 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.0      0       0
Processing:     0    0   0.0      0       0
Waiting:        0    0   0.0      0       0
Total:          0    0   0.0      0       0

Percentage of the requests served within a certain time (ms)
  50%      0
  66%      0
  75%      0
  80%      0
  90%      0
  95%      0
  98%      0
  99%      0
 100%      0 (longest request)

patched:

This is ApacheBench, Version 2.0.40-dev <$Revision: 1.146 $> apache-2.0
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Copyright 2006 The Apache Software Foundation, http://www.apache.org/

Benchmarking localhost (be patient)


Server Software:        Apache
Server Hostname:        localhost
Server Port:            80

Document Path:          /head/node
Document Length:        266 bytes

Concurrency Level:      1
Time taken for tests:   0.78039 seconds
Complete requests:      500
Failed requests:        0
Write errors:           0
Non-2xx responses:      500
Total transferred:      215000 bytes
HTML transferred:       133000 bytes
Requests per second:    6407.05 [#/sec] (mean)
Time per request:       0.156 [ms] (mean)
Time per request:       0.156 [ms] (mean, across all concurrent requests)
Transfer rate:          2678.15 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.0      0       0
Processing:     0    0   0.0      0       0
Waiting:        0    0   0.0      0       0
Total:          0    0   0.0      0       0

Percentage of the requests served within a certain time (ms)
  50%      0
  66%      0
  75%      0
  80%      0
  90%      0
  95%      0
  98%      0
  99%      0
 100%      0 (longest request)

Result:
for an unknown reason, the patched version is a bit slower than the unpatched Drupal

heavily overloaded node:

unpatched:

This is ApacheBench, Version 2.0.40-dev <$Revision: 1.146 $> apache-2.0
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Copyright 2006 The Apache Software Foundation, http://www.apache.org/

Benchmarking localhost (be patient)


Server Software:        Apache
Server Hostname:        localhost
Server Port:            80

Document Path:          /head/node/45
Document Length:        269 bytes

Concurrency Level:      1
Time taken for tests:   0.79884 seconds
Complete requests:      500
Failed requests:        0
Write errors:           0
Non-2xx responses:      500
Total transferred:      216500 bytes
HTML transferred:       134500 bytes
Requests per second:    6259.08 [#/sec] (mean)
Time per request:       0.160 [ms] (mean)
Time per request:       0.160 [ms] (mean, across all concurrent requests)
Transfer rate:          2641.33 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.0      0       0
Processing:     0    0   0.0      0       0
Waiting:        0    0   0.0      0       0
Total:          0    0   0.0      0       0

Percentage of the requests served within a certain time (ms)
  50%      0
  66%      0
  75%      0
  80%      0
  90%      0
  95%      0
  98%      0
  99%      0
 100%      0 (longest request)

patched:

This is ApacheBench, Version 2.0.40-dev <$Revision: 1.146 $> apache-2.0
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Copyright 2006 The Apache Software Foundation, http://www.apache.org/

Benchmarking localhost (be patient)


Server Software:        Apache
Server Hostname:        localhost
Server Port:            80

Document Path:          /head/node/45
Document Length:        269 bytes

Concurrency Level:      1
Time taken for tests:   0.76488 seconds
Complete requests:      500
Failed requests:        0
Write errors:           0
Non-2xx responses:      500
Total transferred:      216500 bytes
HTML transferred:       134500 bytes
Requests per second:    6536.97 [#/sec] (mean)
Time per request:       0.153 [ms] (mean)
Time per request:       0.153 [ms] (mean, across all concurrent requests)
Transfer rate:          2758.60 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.0      0       0
Processing:     0    0   0.0      0       0
Waiting:        0    0   0.0      0       0
Total:          0    0   0.0      0       0

Percentage of the requests served within a certain time (ms)
  50%      0
  66%      0
  75%      0
  80%      0
  90%      0
  95%      0
  98%      0
  99%      0
 100%      0 (longest request)

Result:

Here Drupal is a bit faster - as expected for the database-heavy stuff

/tracker

unpatched:

This is ApacheBench, Version 2.0.40-dev <$Revision: 1.146 $> apache-2.0
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Copyright 2006 The Apache Software Foundation, http://www.apache.org/

Benchmarking localhost (be patient)


Server Software:        Apache
Server Hostname:        localhost
Server Port:            80

Document Path:          /head/tracker
Document Length:        269 bytes

Concurrency Level:      1
Time taken for tests:   0.80013 seconds
Complete requests:      500
Failed requests:        0
Write errors:           0
Non-2xx responses:      500
Total transferred:      216500 bytes
HTML transferred:       134500 bytes
Requests per second:    6248.98 [#/sec] (mean)
Time per request:       0.160 [ms] (mean)
Time per request:       0.160 [ms] (mean, across all concurrent requests)
Transfer rate:          2637.07 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.0      0       0
Processing:     0    0   0.0      0       0
Waiting:        0    0   0.0      0       0
Total:          0    0   0.0      0       0

Percentage of the requests served within a certain time (ms)
  50%      0
  66%      0
  75%      0
  80%      0
  90%      0
  95%      0
  98%      0
  99%      0
 100%      0 (longest request)


patched:

This is ApacheBench, Version 2.0.40-dev <$Revision: 1.146 $> apache-2.0
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Copyright 2006 The Apache Software Foundation, http://www.apache.org/

Benchmarking localhost (be patient)


Server Software:        Apache
Server Hostname:        localhost
Server Port:            80

Document Path:          /head/tracker
Document Length:        269 bytes

Concurrency Level:      1
Time taken for tests:   0.92375 seconds
Complete requests:      500
Failed requests:        0
Write errors:           0
Non-2xx responses:      500
Total transferred:      216500 bytes
HTML transferred:       134500 bytes
Requests per second:    5412.72 [#/sec] (mean)
Time per request:       0.185 [ms] (mean)
Time per request:       0.185 [ms] (mean, across all concurrent requests)
Transfer rate:          2284.17 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.0      0       0
Processing:     0    0   0.6      0      13
Waiting:        0    0   0.6      0      13
Total:          0    0   0.6      0      13

Percentage of the requests served within a certain time (ms)
  50%      0
  66%      0
  75%      0
  80%      0
  90%      0
  95%      0
  98%      0
  99%      0
 100%     13 (longest request)


Result:

For an unknown reason, we take more time :(

/aggregator

unpatched:

This is ApacheBench, Version 2.0.40-dev <$Revision: 1.146 $> apache-2.0
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Copyright 2006 The Apache Software Foundation, http://www.apache.org/

Benchmarking localhost (be patient)


Server Software:        Apache
Server Hostname:        localhost
Server Port:            80

Document Path:          /head/aggregator
Document Length:        272 bytes

Concurrency Level:      1
Time taken for tests:   0.80175 seconds
Complete requests:      500
Failed requests:        0
Write errors:           0
Non-2xx responses:      500
Total transferred:      218000 bytes
HTML transferred:       136000 bytes
Requests per second:    6236.36 [#/sec] (mean)
Time per request:       0.160 [ms] (mean)
Time per request:       0.160 [ms] (mean, across all concurrent requests)
Transfer rate:          2644.22 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
              Connect:        0    0   0.0      0       0
              Processing:     0    0   0.0      0       0
              Waiting:        0    0   0.0      0       0
              Total:          0    0   0.0      0       0

              Percentage of the requests served within a certain time (ms)
                50%      0
                  66%      0
                    75%      0
                      80%      0
                        90%      0
                          95%      0
                            98%      0
                              99%      0
                               100%      0 (longest request)

patched:

This is ApacheBench, Version 2.0.40-dev <$Revision: 1.146 $> apache-2.0
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Copyright 2006 The Apache Software Foundation, http://www.apache.org/

Benchmarking localhost (be patient)


Server Software:        Apache
Server Hostname:        localhost
Server Port:            80

Document Path:          /head/aggregator
Document Length:        272 bytes

Concurrency Level:      1
Time taken for tests:   0.77273 seconds
Complete requests:      500
Failed requests:        0
Write errors:           0
Non-2xx responses:      500
Total transferred:      218000 bytes
HTML transferred:       136000 bytes
Requests per second:    6470.57 [#/sec] (mean)
Time per request:       0.155 [ms] (mean)
Time per request:       0.155 [ms] (mean, across all concurrent requests)
Transfer rate:          2743.52 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
              Connect:        0    0   0.0      0       0
              Processing:     0    0   0.0      0       0
              Waiting:        0    0   0.0      0       0
              Total:          0    0   0.0      0       0

              Percentage of the requests served within a certain time (ms)
                50%      0
                  66%      0
                    75%      0
                      80%      0
                        90%      0
                          95%      0
                            98%      0
                              99%      0
                               100%      0 (longest request)

Result:

Here we gain a bit of speed - not that much, but a bit :)

Overall conclusion: I can't recommend anything, sorry guys. But I'd commit the patch due to the better node page load time.

SteveR’s picture

SteveR’s picture

Just subscribing

David Strauss’s picture

These performance tests haven't been targeted at what this patch actually affects.

CorniI’s picture

And what *is* the patch for? And what should I have tested as part of GHOP #63?

hass’s picture

Status: Reviewed & tested by the community » Needs work

Needs a reroll for system update 6042 -> 6043

pwolanin’s picture

@DavidStrauss - please do share which benchmarks you think would be valuable - this issue has been held up on this.

Looking at the patch, I see changes to the aggregator tables - though probably not performance change from these. The changes for blocks and book are more about data integrity. There might be some performance gain from the changes to contact, maybe some gains when listing filters, perhaps a gain for poll when checking whether a user already voted, a significant gain when looking up a particular user's access in statistics modules, perhaps a gain when listing vocabularies, and perhaps a gain when looking up users by role. So benchmarking these last 3 seems useful.

@CorniI - looking at the number of bytes you got per request, I think there was some flaw in your process - it looks like you were just hitting the "Access denied" page. You have < 300 bytes, while I had, for example, 73281 bytes. Did you have access enabled for anonymous users (especially for aggregator, etc)?

I had this problem when I first tried benchmarking - try running a test with ab at a high verbosity (e.g. ab -v6 -n1) so that you can be sure you are getting a 200 response code. Also, you might want -c5 or some such.

pwolanin’s picture

Status: Needs work » Needs review
FileSize
29.77 KB

re-rolled patch - applies cleanly.

pwolanin’s picture

@CorniI - also you could run a set of EXPLAIN queries with the two sets of indices and show that we go from a filesort to using the index. See the very top of the issue for one example.
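The before/after EXPLAIN comparison at the top of the issue can be reproduced without a full Drupal install. This is a minimal sketch using SQLite via Python's sqlite3 rather than MySQL (so the plan output format differs from MySQL's EXPLAIN), with a pared-down, hypothetical version of the {system} table:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE system (
        filename TEXT,
        name TEXT,
        type TEXT,
        status INTEGER,
        weight INTEGER
    );
    INSERT INTO system VALUES ('modules/node/node.module', 'node', 'module', 1, 0);
    INSERT INTO system VALUES ('themes/garland/garland.info', 'garland', 'theme', 1, 0);
""")

query = "SELECT * FROM system WHERE type = 'module' AND status = 1 ORDER BY weight, filename"

# Without any usable index the planner falls back to a full table scan
# (the SQLite analogue of MySQL's type=ALL / "Using filesort").
before = con.execute("EXPLAIN QUERY PLAN " + query).fetchall()
print(before)  # detail contains e.g. "SCAN system"

# Composite index mirroring the proposed type_status key: equality columns
# first, then the ORDER BY columns, so the index also supplies the sort order.
con.execute("CREATE INDEX type_status ON system (type, status, weight, filename)")
after = con.execute("EXPLAIN QUERY PLAN " + query).fetchall()
print(after)  # detail contains e.g. "SEARCH system USING INDEX type_status"
```

The column order matters: equality predicates (type, status) lead, and the sort keys (weight, filename) follow, which is why the same index serves both the WHERE clause and the ORDER BY.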

catch’s picture

Status: Needs review » Reviewed & tested by the community

Still RTBC.

Dries’s picture

Status: Reviewed & tested by the community » Fixed

I've tested this patch and it seems to work. While the performance tests might not be extensive, I'll trust the community's judgement with regard to performance. I also value David's point about improved data integrity, although it might not be the best way to guarantee this. All in all, excellent job folks. :-)

catch’s picture

Nice to see this go in :)

Just realised that some of the reason for the inconclusive benchmarking may be that the biggest performance gain here was in the system table - already dealt with separately in #185126, so it didn't make it into the final patch despite being mentioned a lot early on.

CorniI’s picture

what do you expect me to do for GHOP now? Can david strauss give me a hint? I feel a bit lost ;)

AjK’s picture

Priority: Critical » Normal
Status: Fixed » Active

Just like to point out that making the users.mail field UNIQUE has now made it impossible to register a user without a unique email address. For the most part people would say "yes, and so....?". Well, the old drupal.module (moved from Core to Contrib as site_network.module) uses this. I.e., it would create a new user without an email address. So that's one "use case" for a user without an email address.

You can see this now by creating a d.o account and then using that to create a new login on g.d.o (you get a user without an email address which you can subsequently add/alter after a successful login).

Here's the error message:-

user warning: Duplicate entry '' for key 'mail' query: INSERT INTO users (name, pass, init, status, created) VALUES ('AjK@drupal.org', '********************************', 'AjK@drupal.org', 1, 1198842942) in /usr/home/ajk/WS/DRUPAL/HEAD/drupal/modules/user/user.module on line 314.

This is derived from line 314 of user.module:-

  db_query('INSERT INTO {users} ('. implode(', ', $fields) .') VALUES ('. implode(', ', $s) .')', $values);

It's worth noting that Drupal records "New external user" in the watchdog just fine, it just fails to create the user account as the mail field isn't present in that INSERT. Since the anonymous user already has an empty users.mail field the Db quite rightly rejects the SQL.

If we are going to start making fields "unique" should we not be testing for uniqueness and the required presence of the field in any INSERT?

As for site_network.module, I'm not entirely sure which way to go. I could make it return the email address but (a) I'm not sure that's wise in a world of "my data is private, don't give it away at all" and (b) it would make site_network.module not backward compatible with the older drupal.module.

Anyway, let's not discuss site_network.module on this issue. From my perspective this is "should we be making db fields UNIQUE without actually testing for uniqueness?" and can we have a generic query building line (as above) that's likely to fail if a unique field is missing? From the above, Drupal worked just fine and was unaware of a problem. It just failed to create the user.

Gábor Hojtsy’s picture

AjK: (offtopic) OpenID module is in a similar situation, so it requires that you provide your email address before saving you as a user (and it also even validates the email address with an activation mail sent).

(ontopic) Well, yes, if we have unique fields, the code should check whether there is an existing row with that value. I just encountered the same problem with taxonomy terms (on the forum admin interface) while testing. I accidentally tried to add two forum terms with the same name, and I got a PHP error, instead of a Drupal warning. That's the same unique field problem.

NancyDru’s picture

Andy, I don't know about your systems, but I've been unable to add users (aside from the drupal module) without a unique email address since 4.7.6 (where I started). I find it a bit annoying because I sign myself up as "admin" (user/1) and as "nancy" (usually user/2), so I have to use two separate email addresses. (Maybe this is another use case.)

I would be more concerned that drupal/site_network doesn't store an email address. It should be retrieving that and saving on my site.

AjK’s picture

Gábor (offtopic), yup, the first thing I did was to look at OpenID to see what it was doing. When transferred to the OpenID server you have to supply a "nickname" and an "email address". So there I thought that's how that was working. However, when I actually tried to do a full-cycle OpenID login all I got was a drupal_set_message("Login failed", "error") and nothing in the watchdog log as a clue to why it failed. I'm assuming at this point that it failed due to the way I create "test sites" on my local server, but I'm not entirely sure as yet. But I would have expected a failed login to have left a record in the watchdog.

Getting back on topic, having looked at the patch, it's entirely "database schema" related with no application code at all. I think if database schema changes are made then the application code also needs to change to cater for any specific instances (like those we are now facing). Imho this is one patch that should be rolled back.

AjK’s picture

Looking into this a little further, I tried to initially add in some simple application checks. For example, using db_affected_rows() to ensure the INSERT was successful was my first method. Another method could be to check that the UNIQUE fields exist and are in fact unique.

However, user_save() should always return a user object, so trying to return something different to indicate a failure (such as FALSE) breaks the API. And we're not in a position to start changing the API at this late stage of the release cycle.

So, in the case of user_save() there doesn't appear to be a bailout condition to handle errors of this type.

I've only looked at users.mail being made UNIQUE. As Gábor points out above, elsewhere he has hit a similar problem in taxonomy. Given that this patch "liberally sprinkled" database constraints around without any application code to support them, I'm still of the humble opinion this needs rolling back (even though I am a great fan of making the database work better for you!).

pwolanin’s picture

Title: GHOP #63: Improve table indicies for common queries » Improve table indicies for common queries
Priority: Normal » Critical

So - at the very least the users one probably needs to change now. Webernet also pointed out that the new unique indices may cause a failure during 5.x- > 6.x updates. For example, if someone already had two taxonomy terms with the same name, or two users with blank e-mail addresses.

pwolanin’s picture

Status: Active » Needs work
FileSize
1.44 KB

This makes the new user mail unique key a simple index instead.

The patch also needs to add taxonomy form validation, and checks in the update code to ensure that the unique conditions now imposed for the taxonomy tables are met before adding the unique keys.

chx’s picture

Adding unique keys is a feature, and past feature freeze they should never happen. I have not looked into this issue before, but the unique keys added here should be removed IMNSHO and re-added, one by one, in separate feature issues for D7. The reason? They could break updates.

pwolanin’s picture

FileSize
4.48 KB

a more complete rollback patch - untested

chx’s picture

Status: Needs work » Needs review

I do not want to RTBC yet but a CNR is in order.

David Strauss’s picture

Status: Needs review » Needs work

Removing the UNIQUE index on the mail column is *wrong*. We use mail addresses to look up accounts; we cannot have duplicates. If you want the value absent for the mail column, allow it to be NULL. Multiple NULL values aren't considered a violation of the UNIQUE constraint.
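David's point about NULL values can be demonstrated directly. This is a minimal sketch using SQLite via Python's sqlite3 (the same multiple-NULLs-allowed behavior he describes for MySQL and other databases), with a hypothetical two-column stand-in for the {users} table:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE users (uid INTEGER PRIMARY KEY, mail TEXT)")
con.execute("CREATE UNIQUE INDEX mail ON users (mail)")

# Two rows with NULL mail coexist: NULLs don't collide in a UNIQUE index.
con.execute("INSERT INTO users (mail) VALUES (NULL)")
con.execute("INSERT INTO users (mail) VALUES (NULL)")

# Two rows with the empty string DO collide -- this is the
# "Duplicate entry '' for key 'mail'" failure reported above.
con.execute("INSERT INTO users (mail) VALUES ('')")
try:
    con.execute("INSERT INTO users (mail) VALUES ('')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

null_rows = con.execute("SELECT COUNT(*) FROM users WHERE mail IS NULL").fetchone()[0]
print(null_rows, duplicate_rejected)  # 2 True
```

So the constraint itself is not the problem; the problem is that external user registration inserts '' rather than NULL for the missing address.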

pwolanin’s picture

@David - we all agree it is wrong, but user_save() does not do any checks to ensure that the query won't fail. In other words, the amount of work on the PHP code to support this properly could be substantial, and is probably a bad idea at RC1. Also, is the duplicate-NULL behavior true across the DBs we do support (or hope to support)?

David Strauss’s picture

@pwolanin How are we handling user_save() where the username is already taken? We could do the same thing with an email address that's already in use. And, yes, the duplicate NULL values in a UNIQUE index is a widespread feature in databases, not just a MySQL thing.

I just don't see how we can justify having things like password recovery use email lookups and not enforce uniqueness of email addresses for accounts.

What is it: are they unique or not? We should either enforce uniqueness across the board or remove the checks entirely.

pwolanin’s picture

@David - the user registration/edit form does validation checks to ensure that the name and e-mail are unique.

However, user_save() does not check. So right now if you are foolish enough to make an API call to user_save() with a duplicate username, I think you'll get a failed query as well. And worse, you may get back the uid for the previously inserted user, and overwrite their data in hook_user.

http://api.drupal.org/api/function/user_save/6

Opened this issue to attempt some modicum of error checking in user_save(): http://drupal.org/node/204705

catch’s picture

Status: Needs work » Needs review

Tested #150 on a clean install, and on a clean 5.x - 6.x upgrade.

No issues, passed schema testing etc.

Marking back to review. However, iirc this was committed before RC1 was released - do we need an upgrade path from RC1 to RC2?

coupet’s picture

Data integrity versus code logic?

pwolanin’s picture

@coupet - the problem is the lack of synchronization between the code and the DB constraints. If the code has not been updated to respect the new DB constraints, then it's a bug in the code.

Gábor Hojtsy’s picture

Status: Needs review » Needs work

Actually David had a reason to CNW this in #152. Along the same lines, we should check whether existing code validated uniqueness, and only remove the constraint where such validation is not in place and is not feasible to implement IMHO.

(I also got a note about an upgrade report with locale tables with unique indexes introduced in Drupal 6, which is also due to broken data: http://drupal.org/node/205613 but I am not sure at all that we should keep supporting broken data).

pwolanin’s picture

Status: Needs work » Needs review

@Gabor - in #152 "wrong" is more of a moral/idealist/DBA judgment than a PHP code judgment. In Drupal 5, the system works ok without this constraint. If you want to make this column unique for D6, it will require a new update function, at the very least, and perhaps a reworking of the schema to allow this column to be NULL, and a reworking of user_save() to allow it to save a NULL value to the mail column.

Gábor Hojtsy’s picture

Status: Needs review » Needs work

pwolanin: Well, understood.

Now, we need to think about RC1 users. Although we did not support upgrades from betas, we have a tradition to support upgrades from RC1 onward. After all this is an RC, so people are possibly using it as "this is nearly a release", and rightly so. So as catch noted in #156, we should think about providing an upgrade path. The tricky part is that we need to modify the existing update code as well, since we would not like to cause problems there with the unique indexes added on upgrades (from Drupal 5), but we'd like to support upgrades from D6RC1, where the unique indexes are in place.

catch’s picture

So we have:
5.x - RC1 - RC2
RC1 - RC2
5.x - RC2

System updates run sequentially, right? Is it off the wall to suggest a new one to drop the uniques and create the new indexes, leaving the existing update intact? That may be stupid, but if it's not then it'd mean all cases get the same upgrade path - just 5.x - RC2 (and onwards) gets it all in one go, with no special cases.

Gábor Hojtsy’s picture

The problem with the unique indexes is that you might have a "broken" data set (ie. multiple user accounts without email address), so you get an error of that index not being possible to apply. This issue was reopened in part because of this symptom. Simply put, we should not leave any of the later dropped unique indexes created on the update path.

catch’s picture

Of course, doh! I should've realised from reading the issue back a few posts. Thanks for the explanation.

Gábor Hojtsy’s picture

A few things we can do (all start with applying the changes to the existing update code, but continue on alternative ways):

1. Add a new update function which detects if there are unique indexes already added (RC1), and "demotes" them to plain indexes. If these unique indexes are not detected, do nothing.

2. Add a variable_set() to the existing, modified update function to store the info that we are running through it from a pre-RC1 update (those who already run the update will not have that variable). Add a new update function which demotes the unique indexes only if that variable is not set. If the variable is set, do nothing.

First is acting based on database schema detection, second is acting on a missing variable. The effect is the same, after the update, the database will be in a consistent state regardless of direct 5 => RC2 update or 5 => RC1 => RC2.

pwolanin’s picture

@Gabor - is there a core schema API function for determining the actual type of an index, or does that require schema module?

Gábor Hojtsy’s picture

Well, I don't know of such introspection functions in Drupal itself, and I looked into it now, but did not find anything, not even for checking whether an index exists at all or not, let alone the type of the index. What we have done previously is to communicate 'this update function ran' type of things with variable_set() and variable_get(), there are quite a few examples for this in (previous) update code.

Gábor Hojtsy’s picture

Just tested the update on a data set of a running Drupal site (before patch):

Failed: ALTER TABLE {users} ADD UNIQUE KEY mail (mail)
Failed: ALTER TABLE {term_data} ADD UNIQUE KEY vid_name (vid, name)
Failed: ALTER TABLE {term_node} ADD PRIMARY KEY (tid, vid)

By the way, from the error reports, all are the result of the same bad data, but fixing it would be too far-fetched for this "feature update", so I can validate the need not to add these indexes in a rush.

fractile81’s picture

I just wanted to chime in and say that the update fails for me when adding the UNIQUE to mail as well. Also, option 2 in #165 sounds like the path of least resistance for fixing this IMHO.

pwolanin’s picture

Status: Needs work » Needs review
FileSize
5.85 KB

here's a go at an update function per #2. Not tested - please have at it.

It leaves the one on blocks - is that one good? It's on ('theme', 'module', 'delta') - which seems safe. The module, delta combination must be unique for each block and you can't have two themes with the same name.

Also seems ok based on the way the form is built:
http://api.drupal.org/api/function/block_admin_display_form/6

Though if the blocks one is ok, maybe we should leave the one on filters too? ('format', 'module', 'delta')

David Strauss’s picture

We should try to leave the one on filters; it helped us find one bug already.

catch’s picture

#170 works smoothly for rc-1 to 6.x-dev upgrade and passes schema.module testing. Didn't test 5.x-rc1-6.x or straight 6.x yet.

Gábor Hojtsy’s picture

Agreed with pwolanin and David on #171. I also have an interesting update issue on my test data on the tid, vid index on term_node added here, but I need to investigate that more.

pwolanin’s picture

Status: Needs review » Needs work

@Gabor - that tid, vid index is just a conversion from the unique key (i.e. I didn't spend a lot of thought on it).

Setting to CNW to put back the filter unique key (unless I'm misreading #173?).

Gábor Hojtsy’s picture

pwolanin: yes, you understood it right.

I took some time to dig down to get insight on the tid, vid issue (without the patch):

- system_update_6001 adds the vid field to term_node so that term relations can be versioned
- it does UPDATE {term_node} t SET vid = (SELECT vid FROM {node} n WHERE t.nid = n.nid)
- my term_node table had nid, tid relations for nonexistent nodes, so their vid became 0
- because there were multiple (10) instances of this, the addition of the unique index on tid, vid failed

So to solve this, I added 'DELETE FROM {term_node} WHERE nid NOT IN (SELECT nid FROM {node})' to my usual pre-upgrade cleanup scripts, which makes the whole update work fine and "go green" (i.e. no failed queries) at the end.

Note that I have a very old data set which was migrated to Drupal from another system in 2003 and runs on Drupal since then, so these 10 faulty associations might have come in anytime. That said, these kinds of errors will probably be rare, but the above query should not hurt (it just removes term_node associations for nodes you have deleted already), so I'd suggest adding this in to the code before where we add the tid, vid unique index.

Gábor Hojtsy’s picture

BTW just to note, my update is all green now (without the patch) because I have added that DELETE and because I cleaned up the term_data and users table data, where the unique indexes caused problems earlier. I had two users with the same email address and a few terms with same names (tids in succession, so probably a symptom of PHP crashing and users resubmitting). So as said, my original data set validates the need for this patch, as others will not do these fine grained data set corrections I do, and the above note in #175 is just as addendum to add a safe data cleanup query into Drupal to help make a unique index work, which could otherwise fail as well.

pwolanin’s picture

@Gabor - will that query: 'DELETE FROM {term_node} WHERE nid NOT IN (SELECT nid FROM {node})'

scale to (e.g.) 200,000 nodes? Do we need to do that in batches?

catch’s picture

I have three rows returned by SELECT * FROM term_node WHERE nid NOT IN (SELECT nid FROM node);

This is a database from 4.5 which had a phpbb2drupal import run on it during 4.7. So rare, but not unheard of.

catch’s picture

If a site has 200,000 rows returned by that query, they have more serious problems than this update failing IMO.

Gábor Hojtsy’s picture

pwolanin: I have already quoted an UPDATE query above, which "joins" the term_node and node tables, so it will get to even more rows. It does not work with a list of 200,000 values though. I am not sure how all the databases resolve these IN subqueries, and I don't have a bigger data set to performance test it with.

pwolanin’s picture

FileSize
4.75 KB

how about: 'DELETE FROM {term_node} WHERE vid = 0'?

Here's a re-rolled patch which leaves the filter table alone and adds this extra cleanup query.

pwolanin’s picture

Status: Needs work » Needs review
David Strauss’s picture

Faster alternative to #178:

SELECT * FROM term_node tn
LEFT OUTER JOIN node n ON n.nid = tn.nid
WHERE n.vid IS NULL
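The NOT IN cleanup and the LEFT OUTER JOIN detection above find the same orphaned rows. A minimal sketch using SQLite via Python's sqlite3, with hypothetical cut-down {node} and {term_node} tables (the nids and tids are made up):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE node (nid INTEGER PRIMARY KEY, vid INTEGER);
    CREATE TABLE term_node (nid INTEGER, tid INTEGER, vid INTEGER);
    INSERT INTO node VALUES (1, 10), (2, 20);
    -- rows for nids 99 and 100 are orphans: those nodes were deleted,
    -- so system_update_6001's subquery would have set their vid to 0
    INSERT INTO term_node VALUES (1, 5, 10), (2, 5, 20), (99, 5, 0), (100, 6, 0);
""")

# The join-based detection finds the orphans without a big IN list...
orphans = con.execute("""
    SELECT tn.nid FROM term_node tn
    LEFT OUTER JOIN node n ON n.nid = tn.nid
    WHERE n.vid IS NULL
""").fetchall()
print(sorted(orphans))  # [(99,), (100,)]

# ...and the NOT IN cleanup removes exactly those rows, leaving the
# table clean for the tid, vid unique index.
con.execute("DELETE FROM term_node WHERE nid NOT IN (SELECT nid FROM node)")
remaining = con.execute("SELECT COUNT(*) FROM term_node").fetchone()[0]
print(remaining)  # 2
```

Either form leaves term_node free of associations to deleted nodes; the join form just gives the optimizer more room on large tables.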

pwolanin’s picture

ran both an update from 5.x and a clean install with this patch (w/ MySQL 5.0). Schema module reports no mismatches, so the patch code is at least correct.

yched’s picture

http://drupal.org/node/206941 looks like another occurrence of non code-backed unique indexes on {term_data} table

catch’s picture

Status: Needs review » Reviewed & tested by the community

I can confirm this passes schema.module with a clean issue 5.x - 6.x upgrade.

'DELETE FROM {term_node} WHERE vid = 0' ought to work fine since we already have the vid column by the time we get to system_update_6044(). It could be added to 6001 as well, but no need, since only sites which already upgraded to rc1 with bad entries in term_node (probably zero sites) will be affected by the unique index anyway. (I have to admit to running gabor's subquery on my actual site to clean it out, so I no longer have test data without restoring an older backup.).

So, RTBC.

catch’s picture

Status: Reviewed & tested by the community » Needs review

Hmm, does this need testing on pgsql as well?

Pancho’s picture

Title: Improve table indicies for common queries » Improve table indices for common queries
chx’s picture

Status: Needs review » Reviewed & tested by the community

No it does not need testing with postgresql. Schema API just works. Very nice job and cures my fears, thanks to all that participated (pwolanin, davidstrauss, catch and Gabor -- hope I have not forgotten anyone).

hswong3i’s picture

Rerolled patch from #181 against CVS HEAD. Tested with Debian etch PostgreSQL 8.1.9; passes system_update_6044() with no problems.

Gábor Hojtsy’s picture

Status: Reviewed & tested by the community » Fixed

Thanks for the hard work and all the testing folks, committed.

Anonymous’s picture

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for two weeks with no activity.

sammos’s picture

Hello I'm confused about something - I thought that installing 5.7 would pick up this patch, but the sitedoc module still recommends adding indexes. Also my 5.5 to 5.7 update script picked up no new updates. So do existing drupal installs need to manually add the indexes? And if so, has anybody made up an update script? thanks!

NancyDru’s picture

Well, Sitedoc tells you that there are no indexes, not that you have to have them.

This was not a back-ported fix. You may go in and find the applicable ones to back port manually.

sammos’s picture

yes I know - didn't mean to imply otherwise. But I'd like to have them and figured maybe somebody wrote an update script for v.5. I'll see about writing one and posting it. Any thoughts welcome about whether it's better to backtrace this thread or to look at the code in the sitedoc module as it did also make suggestions. Thanks!

NancyDru’s picture

Hmm, I don't remember what suggestions I would have made other than viewing this thread. I'll have to go look again.

sinmao’s picture

I've upgraded to 5.10. Do I still need this patch? Thx.

sinmao’s picture

Hi, so were you able to make an update script?

NancyDru’s picture

This is for 6.x, not 5.x.

sinmao’s picture

Um...What should I do with 5.10, then? Thx for the reply.

NancyDru’s picture

If you are having specific problems, manually apply the indexes as shown in this patch.