As discussed on the infra and devel mailing lists using mysql merge tables for log tabls is advantagious.

This patch only gives an idea on how it will work, the update path is missing and the moving of the entries from the insert to the storage table is missing.

The patch is untested, but should work on a new Drupal install. If your mysql install supports the MERGE engine, then you will have the two tables created for watchdog and accesslog log events. Inserts will go into the _insert table but will be selected from the merged log tables.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

killes@www.drop.org’s picture

FileSize
14.66 KB

the moving of the entries from one table to another is a bit difficult.

One could of course lock the tables during that operation, but locking is unpopular. So I have worked around this by first selecting the entries in the _insert table to get their watchdog IDs.
Then I move them and afterwards delete them. The question is if there are any limits for queries that use the IN () construct when it comes to the number of arguments. Depending on the popularity of your website and your cron intervall, this might be a lot of arguments. The query might also not be very fast, but that shouldn't matter much since it is run by cron.

Updated patch attached.

chx’s picture

Status: Needs review » Needs work

I would prefer SELECT an ID as a boundary and do an INSERT ... SELECT then a DELETE. Should be a lot faster... and this method will choke on mysql packet size i am afraid.

Also, why the strtr instead of str_replace?

killes@www.drop.org’s picture

Status: Needs work » Needs review
FileSize
14.28 KB

good idea, selecting a single ID is certainly faster. :)

The strtr is borrowed from db_prefix_tables.

killes@www.drop.org’s picture

FileSize
14.3 KB

Updated patch, with fixes.

I would be helpful if people on shared hosting could execute

SHOW ENGINES;

on mysql/phpmyadmin and post their result together with which hosting service the result is from. Note that the patch won't break either accesslog or watchdog if you don't have the MERGE engine. In that case the code works as before.

killes@www.drop.org’s picture

FileSize
16.42 KB

patch complete with update path. There is most likely an issue with autoincrement values that needs fixing.

The previous patch would fail since variable_set executed cach_clear_all which wasn't available.

m3avrck’s picture

From Site5.com (I know a lot of people host with them):

MyISAM  	DEFAULT  	Default engine as of MySQL 3.23 with great perform...
HEAP 	YES 	Alias for MEMORY
MEMORY 	YES 	Hash based, stored in memory, useful for temporary...
MERGE 	YES 	Collection of identical MyISAM tables
MRG_MYISAM 	YES 	Alias for MERGE
ISAM 	NO 	Obsolete storage engine, now replaced by MyISAM
MRG_ISAM 	NO 	Obsolete storage engine, now replaced by MERGE
InnoDB 	YES 	Supports transactions, row-level locking, and fore...
INNOBASE 	YES 	Alias for INNODB
BDB 	NO 	Supports transactions and page-level locking
BERKELEYDB 	NO 	Alias for BDB
NDBCLUSTER 	NO 	Clustered, fault-tolerant, memory-based tables
NDB 	NO 	Alias for NDBCLUSTER
EXAMPLE 	NO 	Example storage engine
ARCHIVE 	NO 	Archive storage engine
CSV 	NO 	CSV storage engine
BLACKHOLE 	NO 	Storage engine designed to act as null storage
m3avrck’s picture

From Totalchoicehosting.com (another very popular one):

 Engine  	 Support  	 Comment
MyISAM 	DEFAULT 	Default engine as of MySQL 3.23 with great perform...
HEAP 	YES 	Alias for MEMORY
MEMORY 	YES 	Hash based, stored in memory, useful for temporary...
MERGE 	YES 	Collection of identical MyISAM tables
MRG_MYISAM 	YES 	Alias for MERGE
ISAM 	NO 	Obsolete storage engine, now replaced by MyISAM
MRG_ISAM 	NO 	Obsolete storage engine, now replaced by MERGE
InnoDB 	YES 	Supports transactions, row-level locking, and fore...
INNOBASE 	YES 	Alias for INNODB
BDB 	NO 	Supports transactions and page-level locking
BERKELEYDB 	NO 	Alias for BDB
NDBCLUSTER 	NO 	Clustered, fault-tolerant, memory-based tables
NDB 	NO 	Alias for NDBCLUSTER
EXAMPLE 	NO 	Example storage engine
ARCHIVE 	NO 	Archive storage engine
CSV 	NO 	CSV storage engine
BLACKHOLE 	NO 	Storage engine designed to act as null storage

Looks like 2 solid major hosting providers will support this patch.

m3avrck’s picture

I've been following the devel thread but looking at the code now--what is the performance impact?

We have 2 tables: one for insert and one for longtime storage. Are writes going to speed up that much when the table is empty verse a large table? If we had lots of indexes on the table sure, that would have to be rebuilt on each insert, but we only have a primary key.

If that is the case, how does this approach compare to just making the log tables InnoDB? Is it faster?

Based on some simple hosting checks, seem either approach could be supported. Maybe we even have a check for InnoDB to make the log tables InnoDB by default if supported, then goto MERGE, then default.

However, that assumes InnoDB is faster. Maybe it's not?

beginner’s picture

When I try
SHOW ENGINES
or
SHOW TABLE TYPES
in phpmyadmin, I get a SQL syntax error message, both with mysql 4.1.10 and 4.0.
http://dev.mysql.com/doc/refman/4.1/en/show-engines.html

robertDouglass’s picture

tracking.

killes@www.drop.org’s picture

The assumption is that indeed inserts into a small table are faster than for a big one. After all you only need to open a smaller file. How this compares to Innodb, I don't know.

killes@www.drop.org’s picture

beginner: You maybe need to add a semicolon ; ?

killes@www.drop.org’s picture

FileSize
16.86 KB

updated patch, seems to work now.

sammys’s picture

Hi killes,

Going through with PostgreSQL stuff and i've decided to report a couple things before i'm done. Firstly, you've named the tables {watchdog_insert} and {watchdog_storage}. Would it not be better to name them {watchdog}_insert and {watchdog}_storage to make it easier when adding prefixes?

Secondly, the variable mysql_can_merge may be more aptly named db_can_merge. All versions of PostgreSQL can do it but I thought it might be better to generalise it.

I'm still going on the pgsql patch.

Cheers,

--
Sammy Spets
Synerger
http://www.synerger.com

killes@www.drop.org’s picture

Sammy: I'll incorprate your suggestions with the next re-roll

sammys’s picture

Status: Needs review » Needs work

There is a major barrier to doing this cleanly. The autoincrement fields are going to cause some issues. In PostgreSQL the sequence is shared between the two tables making it mandatory that the key field is in the insert query. Perhaps you could remove the autoincrement from the _storage table and modify the queries to include the key field. That and the two suggestions above are all that's required before it'll work on PostgreSQL (after you get my patch, of course).

Wheeeeeee!

sammys’s picture

Also missing are the {} around the table names in the MERGE statement. Here is a patch (rolled it all up in 20 more minutes) for all work so far on both MySQL and PostgreSQL.

killes@www.drop.org’s picture

Hmm, we have a problem here. For the merge thing to wrok, the merged tables need to be identical. I am not sure I can drop the auto_increment fromthe storage table, will need to try it. With "queries" I asumme you mean the ones done by the _cron hooks?

sammys’s picture

Status: Needs work » Needs review
FileSize
18.96 KB

ok... @killes has tested my theories on mysql and it all looks good. I've updated my patch with update code. Now i'm a little annoyed that there isn't a better way to go about it in PostgreSQL, but it is how it is.

It turns out two copy operations have to be performed on the data in order to preserve the original association between the table and the sequence. I.e if you choose to do DROP TABLE {watchdog} CASCADE in a future update, the sequence {watchdog}_wid_seq will be dropped automatically. If I was to simply rename the table, create children and copy the data across to the new one, this association would not be preserved. Bit of a PITA, but *shrug*.

I've also added ORDER BY clauses to the MySQL updates. MySQL stuff really needs to be tested as I don't use that platform.

Cheers.

RobRoy’s picture

Here is mine for my dedicated server:

mysql> SHOW ENGINES;
+------------+----------+------------------------------------------------------------+
| Engine     | Support  | Comment                                                    |
+------------+----------+------------------------------------------------------------+
| MyISAM     | DEFAULT  | Default engine as of MySQL 3.23 with great performance     |
| HEAP       | YES      | Alias for MEMORY                                           |
| MEMORY     | YES      | Hash based, stored in memory, useful for temporary tables  |
| MERGE      | YES      | Collection of identical MyISAM tables                      |
| MRG_MYISAM | YES      | Alias for MERGE                                            |
| ISAM       | NO       | Obsolete storage engine, now replaced by MyISAM            |
| MRG_ISAM   | NO       | Obsolete storage engine, now replaced by MERGE             |
| InnoDB     | DISABLED | Supports transactions, row-level locking, and foreign keys |
| INNOBASE   | DISABLED | Alias for INNODB                                           |
| BDB        | NO       | Supports transactions and page-level locking               |
| BERKELEYDB | NO       | Alias for BDB                                              |
| NDBCLUSTER | NO       | Clustered, fault-tolerant, memory-based tables             |
| NDB        | NO       | Alias for NDBCLUSTER                                       |
| EXAMPLE    | NO       | Example storage engine                                     |
| ARCHIVE    | NO       | Archive storage engine                                     |
| CSV        | NO       | CSV storage engine                                         |
| BLACKHOLE  | NO       | Storage engine designed to act as null storage             |
+------------+----------+------------------------------------------------------------+

Will test a bit more later.

m3avrck’s picture

Did some benchmarking, this method appears to run roughly 10% faster on a Drupal site, assuming log tables have 1M records in them, verse an table with 0 records.

I'll have more details tomorrow... too tired to make sense of those results anymore :-)

m3avrck’s picture

Ok so I ran a bunch of tests last night to see if this patch makes sense; it does in theory, but what about practice?

So I set out to test this in the following fashion: I loaded a full Drupal bootstrap, plus all of the logic in index.php. I then wrote some arbitrary code to insert 10 watchdog errors per call... this is simulating a pretty bad site that is generating 10 errors per load :-)

To test, I ran the following benchmark: ab -n 5000 -c 20 localhost:8888/drupalMerge/log_multiple.php

The results:

   ### Empty watchdog table, MyISAM
  
   Trial 1:
   Requests per second:    44.30 [#/sec] (mean)
   Time per request:       22.573 [ms] (mean, across all concurrent requests)

   Trial 2:
   Requests per second:    43.84 [#/sec] (mean)
   Time per request:       22.808 [ms] (mean, across all concurrent requests)


   ### 1M rows in watchdog table, MyISAM

   Trial 1:
   Requests per second:    39.56 [#/sec] (mean)
   Time per request:       25.276 [ms] (mean, across all concurrent requests)

   Trial 2:
   Requests per second:    39.15 [#/sec] (mean)
   Time per request:       25.541 [ms] (mean, across all concurrent requests)


   ### Empty Watchdog table, InnoDB

   Trial 1:
   Requests per second:    42.37 [#/sec] (mean)
   Time per request:       23.599 [ms] (mean, across all concurrent requests)

   Trial 2:
   Requests per second:    42.99 [#/sec] (mean)
   Time per request:       23.260 [ms] (mean, across all concurrent requests)

   Trial 3:
   Requests per second:    44.22 [#/sec] (mean)
   Time per request:       22.613 [ms] (mean, across all concurrent requests)


   ### 1M rows in watchdog table, InnoDB
  
   Trial 1:
   Requests per second:    39.01 [#/sec] (mean)
   Time per request:       25.636 [ms] (mean, across all concurrent requests)

   Trial 2:
   Requests per second:    40.51 [#/sec] (mean)
   Time per request:       24.687 [ms] (mean, across all concurrent requests)

m3avrck’s picture

Conclusions:

- InnoDB doesn't seem to be any faster, so keeping the mechanism as MyISAM will be ok
- Inserting records on an empty table is on average 12% faster than compared to one that has 1M records in it

Keeping your watchdog table as empty as possible will net the most gains. Keeping your site from generating lots of log messages will speed up yoursite--so cut down on the errors (can't control lots of simultaneous users and creation of content [that's a good thing])

Seems to be solid evidence for the support of this patch.

m3avrck’s picture

Just confirmed with Dries, InnoDB shouldn't be any faster, there are no table locks with the watchdog inserts. Also, he agrees that -n 5000 and -c 20 are good settings to test this patch.

Next up is to actually test this patch and make sure it still applies :-)

m3avrck’s picture

Status: Needs review » Needs work

No longer applies. Statistics.install needs to be tweaked, along with 1006 -> 1007 a few other minor issues. Unfort I have no time to reroll right now hopefully someone else can in the next day or so, this needs to be committed it's a nice performance tweak for heavy sites :-)

killes@www.drop.org’s picture

Status: Needs work » Needs review
FileSize
19.19 KB

petch updated

Dries’s picture

Status: Needs review » Needs work

Ted's test secenario is not a realistic one and does not allow us to value the usefulness of this patch.

Generating 10 watchdog entries per requested page obviously skews the results in favor of this patch. We should try the following: do a full bootstrap and generate one watchdog entry per 10 page requests. That is a more realistic scenario (but still too optimistic). If the performance gain is still significant, it is worth the added complexity.

killes@www.drop.org’s picture

Since this patch also applied to the accesslog table, one write per page would be realistic.

m3avrck’s picture

Ok I ran the tests again, but this time, each request only produced *one* watchdog entry. This is very realistic... on average 1 per user makes sense.

Here are the numbers:

ab -n 1000 -c 20 localhost:8888/drupalLog/log_multiple.php

## 1M Rows

Requests per second:    29.87 [#/sec] (mean)
Time per request:       33.479 [ms] (mean, across all concurrent requests)

Requests per second:    26.88 [#/sec] (mean)
Time per request:       37.208 [ms] (mean, across all concurrent requests)

Requests per second:    28.50 [#/sec] (mean)
Time per request:       35.086 [ms] (mean, across all concurrent requests)

Requests per second:    27.93 [#/sec] (mean)
Time per request:       35.810 [ms] (mean, across all concurrent requests)



## Empty

Requests per second:    30.99 [#/sec] (mean)
Time per request:       32.270 [ms] (mean, across all concurrent requests)

Requests per second:    31.16 [#/sec] (mean)
Time per request:       32.092 [ms] (mean, across all concurrent requests)

Requests per second:    32.34 [#/sec] (mean)
Time per request:       30.918 [ms] (mean, across all concurrent requests)

On average the empty table was still about 10% faster.

killes@www.drop.org’s picture

Version: x.y.z » 6.x-dev
FileSize
19.48 KB

updated for D6.

killes@www.drop.org’s picture

Status: Needs work » Needs review
FileSize
19.51 KB

patch had errors.

killes@www.drop.org’s picture

FileSize
19.51 KB

yes another bug squished

killes@www.drop.org’s picture

FileSize
19.64 KB

bug in upgrade path...

kbahey’s picture

This is from a cheap bargain priced shared host. Both InnoDB and merge are supported.

mysql> show engines;
+------------+---------+------------------------------------------------------------+
| Engine     | Support | Comment                                                    |
+------------+---------+------------------------------------------------------------+
| MyISAM     | DEFAULT | Default engine as of MySQL 3.23 with great performance     |
| HEAP       | YES     | Alias for MEMORY                                           |
| MEMORY     | YES     | Hash based, stored in memory, useful for temporary tables  |
| MERGE      | YES     | Collection of identical MyISAM tables                      |
| MRG_MYISAM | YES     | Alias for MERGE                                            |
| ISAM       | NO      | Obsolete storage engine, now replaced by MyISAM            |
| MRG_ISAM   | NO      | Obsolete storage engine, now replaced by MERGE             |
| InnoDB     | YES     | Supports transactions, row-level locking, and foreign keys |
| INNOBASE   | YES     | Alias for INNODB                                           |
| BDB        | NO      | Supports transactions and page-level locking               |
| BERKELEYDB | NO      | Alias for BDB                                              |
| NDBCLUSTER | NO      | Clustered, fault-tolerant, memory-based tables             |
| NDB        | NO      | Alias for NDBCLUSTER                                       |
| EXAMPLE    | NO      | Example storage engine                                     |
| ARCHIVE    | NO      | Archive storage engine                                     |
| CSV        | NO      | CSV storage engine                                         |
| BLACKHOLE  | NO      | Storage engine designed to act as null storage             |
+------------+---------+------------------------------------------------------------+
17 rows in set (0.00 sec)
kbahey’s picture

Status: Needs review » Needs work

After the watchdog function was made into a hook, and watchdog is renamed to dblog, this patch will need to be rerolled.

chx’s picture

Version: 6.x-dev » 7.x-dev
Priority: Normal » Critical
robertDouglass’s picture

Why critical? Because Drupal 6 runs slow like a dog who's bee shot in the leg. Every performance improvement should be marked critical. Drupal is slowly dying a performance death.

Crell’s picture

Version: 7.x-dev » 8.x-dev
catch’s picture

Priority: Critical » Major

Downgrading all D8 criticals to major per http://drupal.org/node/45111

dawehner’s picture

Version: 8.0.x-dev » 8.1.x-dev
Issue summary: View changes

This won't land any time soon, given that it requires the other issue (#566548: Add storage engine support to schema API.) first.

mgifford’s picture

Assigned: killes@www.drop.org » Unassigned

Just unassigning issues that haven't been developed for a bit in the D8 queue.

Version: 8.1.x-dev » 8.2.x-dev

Drupal 8.1.0-beta1 was released on March 2, 2016, which means new developments and disruptive changes should now be targeted against the 8.2.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.2.x-dev » 8.3.x-dev

Drupal 8.2.0-beta1 was released on August 3, 2016, which means new developments and disruptive changes should now be targeted against the 8.3.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.3.x-dev » 8.4.x-dev

Drupal 8.3.0-alpha1 will be released the week of January 30, 2017, which means new developments and disruptive changes should now be targeted against the 8.4.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.4.x-dev » 8.5.x-dev

Drupal 8.4.0-alpha1 will be released the week of July 31, 2017, which means new developments and disruptive changes should now be targeted against the 8.5.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.5.x-dev » 8.6.x-dev

Drupal 8.5.0-alpha1 will be released the week of January 17, 2018, which means new developments and disruptive changes should now be targeted against the 8.6.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.6.x-dev » 8.7.x-dev

Drupal 8.6.0-alpha1 will be released the week of July 16, 2018, which means new developments and disruptive changes should now be targeted against the 8.7.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.7.x-dev » 8.8.x-dev

Drupal 8.7.0-alpha1 will be released the week of March 11, 2019, which means new developments and disruptive changes should now be targeted against the 8.8.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.8.x-dev » 8.9.x-dev

Drupal 8.8.0-alpha1 will be released the week of October 14th, 2019, which means new developments and disruptive changes should now be targeted against the 8.9.x-dev branch. (Any changes to 8.9.x will also be committed to 9.0.x in preparation for Drupal 9’s release, but some changes like significant feature additions will be deferred to 9.1.x.). For more information see the Drupal 8 and 9 minor version schedule and the Allowed changes during the Drupal 8 and 9 release cycles.

Version: 8.9.x-dev » 9.1.x-dev

Drupal 8.9.0-beta1 was released on March 20, 2020. 8.9.x is the final, long-term support (LTS) minor release of Drupal 8, which means new developments and disruptive changes should now be targeted against the 9.1.x-dev branch. For more information see the Drupal 8 and 9 minor version schedule and the Allowed changes during the Drupal 8 and 9 release cycles.

Version: 9.1.x-dev » 9.2.x-dev

Drupal 9.1.0-alpha1 will be released the week of October 19, 2020, which means new developments and disruptive changes should now be targeted for the 9.2.x-dev branch. For more information see the Drupal 9 minor version schedule and the Allowed changes during the Drupal 9 release cycle.

Version: 9.2.x-dev » 9.3.x-dev

Drupal 9.2.0-alpha1 will be released the week of May 3, 2021, which means new developments and disruptive changes should now be targeted for the 9.3.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Version: 9.3.x-dev » 9.4.x-dev

Drupal 9.3.0-rc1 was released on November 26, 2021, which means new developments and disruptive changes should now be targeted for the 9.4.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Version: 9.4.x-dev » 9.5.x-dev

Drupal 9.4.0-alpha1 was released on May 6, 2022, which means new developments and disruptive changes should now be targeted for the 9.5.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Version: 9.5.x-dev » 10.1.x-dev

Drupal 9.5.0-beta2 and Drupal 10.0.0-beta2 were released on September 29, 2022, which means new developments and disruptive changes should now be targeted for the 10.1.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Version: 10.1.x-dev » 11.x-dev

Drupal core is moving towards using a “main” branch. As an interim step, a new 11.x branch has been opened, as Drupal.org infrastructure cannot currently fully support a branch named main. New developments and disruptive changes should now be targeted for the 11.x branch, which currently accepts only minor-version allowed changes. For more information, see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.