Support for Drupal 7 is ending on 5 January 2025—it’s time to migrate to Drupal 10! Learn about the many benefits of Drupal 10 and find migration tools in our resource center.
As discussed on the infra and devel mailing lists using mysql merge tables for log tabls is advantagious.
This patch only gives an idea on how it will work, the update path is missing and the moving of the entries from the insert to the storage table is missing.
The patch is untested, but should work on a new Drupal install. If your mysql install supports the MERGE engine, then you will have the two tables created for watchdog and accesslog log events. Inserts will go into the _insert table but will be selected from the merged log tables.
Comment | File | Size | Author |
---|---|---|---|
#33 | merge_2.patch | 19.64 KB | killes@www.drop.org |
#32 | merge_1.patch | 19.51 KB | killes@www.drop.org |
#31 | merge_0.patch | 19.51 KB | killes@www.drop.org |
#30 | merge.patch | 19.48 KB | killes@www.drop.org |
#26 | merge.patch_5.txt | 19.19 KB | killes@www.drop.org |
Comments
Comment #1
killes@www.drop.org CreditAttribution: killes@www.drop.org commentedthe moving of the entries from one table to another is a bit difficult.
One could of course lock the tables during that operation, but locking is unpopular. So I have worked around this by first selecting the entries in the _insert table to get their watchdog IDs.
Then I move them and afterwards delete them. The question is if there are any limits for queries that use the IN () construct when it comes to the number of arguments. Depending on the popularity of your website and your cron intervall, this might be a lot of arguments. The query might also not be very fast, but that shouldn't matter much since it is run by cron.
Updated patch attached.
Comment #2
chx CreditAttribution: chx commentedI would prefer SELECT an ID as a boundary and do an INSERT ... SELECT then a DELETE. Should be a lot faster... and this method will choke on mysql packet size i am afraid.
Also, why the strtr instead of str_replace?
Comment #3
killes@www.drop.org CreditAttribution: killes@www.drop.org commentedgood idea, selecting a single ID is certainly faster. :)
The strtr is borrowed from db_prefix_tables.
Comment #4
killes@www.drop.org CreditAttribution: killes@www.drop.org commentedUpdated patch, with fixes.
I would be helpful if people on shared hosting could execute
SHOW ENGINES;
on mysql/phpmyadmin and post their result together with which hosting service the result is from. Note that the patch won't break either accesslog or watchdog if you don't have the MERGE engine. In that case the code works as before.
Comment #5
killes@www.drop.org CreditAttribution: killes@www.drop.org commentedpatch complete with update path. There is most likely an issue with autoincrement values that needs fixing.
The previous patch would fail since variable_set executed cach_clear_all which wasn't available.
Comment #6
m3avrck CreditAttribution: m3avrck commentedFrom Site5.com (I know a lot of people host with them):
Comment #7
m3avrck CreditAttribution: m3avrck commentedFrom Totalchoicehosting.com (another very popular one):
Looks like 2 solid major hosting providers will support this patch.
Comment #8
m3avrck CreditAttribution: m3avrck commentedI've been following the devel thread but looking at the code now--what is the performance impact?
We have 2 tables: one for insert and one for longtime storage. Are writes going to speed up that much when the table is empty verse a large table? If we had lots of indexes on the table sure, that would have to be rebuilt on each insert, but we only have a primary key.
If that is the case, how does this approach compare to just making the log tables InnoDB? Is it faster?
Based on some simple hosting checks, seem either approach could be supported. Maybe we even have a check for InnoDB to make the log tables InnoDB by default if supported, then goto MERGE, then default.
However, that assumes InnoDB is faster. Maybe it's not?
Comment #9
beginner CreditAttribution: beginner commentedWhen I try
SHOW ENGINES
or
SHOW TABLE TYPES
in phpmyadmin, I get a SQL syntax error message, both with mysql 4.1.10 and 4.0.
http://dev.mysql.com/doc/refman/4.1/en/show-engines.html
Comment #10
robertDouglass CreditAttribution: robertDouglass commentedtracking.
Comment #11
killes@www.drop.org CreditAttribution: killes@www.drop.org commentedThe assumption is that indeed inserts into a small table are faster than for a big one. After all you only need to open a smaller file. How this compares to Innodb, I don't know.
Comment #12
killes@www.drop.org CreditAttribution: killes@www.drop.org commentedbeginner: You maybe need to add a semicolon ; ?
Comment #13
killes@www.drop.org CreditAttribution: killes@www.drop.org commentedupdated patch, seems to work now.
Comment #14
sammys CreditAttribution: sammys commentedHi killes,
Going through with PostgreSQL stuff and i've decided to report a couple things before i'm done. Firstly, you've named the tables {watchdog_insert} and {watchdog_storage}. Would it not be better to name them {watchdog}_insert and {watchdog}_storage to make it easier when adding prefixes?
Secondly, the variable mysql_can_merge may be more aptly named db_can_merge. All versions of PostgreSQL can do it but I thought it might be better to generalise it.
I'm still going on the pgsql patch.
Cheers,
--
Sammy Spets
Synerger
http://www.synerger.com
Comment #15
killes@www.drop.org CreditAttribution: killes@www.drop.org commentedSammy: I'll incorprate your suggestions with the next re-roll
Comment #16
sammys CreditAttribution: sammys commentedThere is a major barrier to doing this cleanly. The autoincrement fields are going to cause some issues. In PostgreSQL the sequence is shared between the two tables making it mandatory that the key field is in the insert query. Perhaps you could remove the autoincrement from the _storage table and modify the queries to include the key field. That and the two suggestions above are all that's required before it'll work on PostgreSQL (after you get my patch, of course).
Wheeeeeee!
Comment #17
sammys CreditAttribution: sammys commentedAlso missing are the {} around the table names in the MERGE statement. Here is a patch (rolled it all up in 20 more minutes) for all work so far on both MySQL and PostgreSQL.
Comment #18
killes@www.drop.org CreditAttribution: killes@www.drop.org commentedHmm, we have a problem here. For the merge thing to wrok, the merged tables need to be identical. I am not sure I can drop the auto_increment fromthe storage table, will need to try it. With "queries" I asumme you mean the ones done by the _cron hooks?
Comment #19
sammys CreditAttribution: sammys commentedok... @killes has tested my theories on mysql and it all looks good. I've updated my patch with update code. Now i'm a little annoyed that there isn't a better way to go about it in PostgreSQL, but it is how it is.
It turns out two copy operations have to be performed on the data in order to preserve the original association between the table and the sequence. I.e if you choose to do DROP TABLE {watchdog} CASCADE in a future update, the sequence {watchdog}_wid_seq will be dropped automatically. If I was to simply rename the table, create children and copy the data across to the new one, this association would not be preserved. Bit of a PITA, but *shrug*.
I've also added ORDER BY clauses to the MySQL updates. MySQL stuff really needs to be tested as I don't use that platform.
Cheers.
Comment #20
RobRoy CreditAttribution: RobRoy commentedHere is mine for my dedicated server:
Will test a bit more later.
Comment #21
m3avrck CreditAttribution: m3avrck commentedDid some benchmarking, this method appears to run roughly 10% faster on a Drupal site, assuming log tables have 1M records in them, verse an table with 0 records.
I'll have more details tomorrow... too tired to make sense of those results anymore :-)
Comment #22
m3avrck CreditAttribution: m3avrck commentedOk so I ran a bunch of tests last night to see if this patch makes sense; it does in theory, but what about practice?
So I set out to test this in the following fashion: I loaded a full Drupal bootstrap, plus all of the logic in index.php. I then wrote some arbitrary code to insert 10 watchdog errors per call... this is simulating a pretty bad site that is generating 10 errors per load :-)
To test, I ran the following benchmark:
ab -n 5000 -c 20 localhost:8888/drupalMerge/log_multiple.php
The results:
Comment #23
m3avrck CreditAttribution: m3avrck commentedConclusions:
- InnoDB doesn't seem to be any faster, so keeping the mechanism as MyISAM will be ok
- Inserting records on an empty table is on average 12% faster than compared to one that has 1M records in it
Keeping your watchdog table as empty as possible will net the most gains. Keeping your site from generating lots of log messages will speed up yoursite--so cut down on the errors (can't control lots of simultaneous users and creation of content [that's a good thing])
Seems to be solid evidence for the support of this patch.
Comment #24
m3avrck CreditAttribution: m3avrck commentedJust confirmed with Dries, InnoDB shouldn't be any faster, there are no table locks with the watchdog inserts. Also, he agrees that -n 5000 and -c 20 are good settings to test this patch.
Next up is to actually test this patch and make sure it still applies :-)
Comment #25
m3avrck CreditAttribution: m3avrck commentedNo longer applies. Statistics.install needs to be tweaked, along with 1006 -> 1007 a few other minor issues. Unfort I have no time to reroll right now hopefully someone else can in the next day or so, this needs to be committed it's a nice performance tweak for heavy sites :-)
Comment #26
killes@www.drop.org CreditAttribution: killes@www.drop.org commentedpetch updated
Comment #27
Dries CreditAttribution: Dries commentedTed's test secenario is not a realistic one and does not allow us to value the usefulness of this patch.
Generating 10 watchdog entries per requested page obviously skews the results in favor of this patch. We should try the following: do a full bootstrap and generate one watchdog entry per 10 page requests. That is a more realistic scenario (but still too optimistic). If the performance gain is still significant, it is worth the added complexity.
Comment #28
killes@www.drop.org CreditAttribution: killes@www.drop.org commentedSince this patch also applied to the accesslog table, one write per page would be realistic.
Comment #29
m3avrck CreditAttribution: m3avrck commentedOk I ran the tests again, but this time, each request only produced *one* watchdog entry. This is very realistic... on average 1 per user makes sense.
Here are the numbers:
On average the empty table was still about 10% faster.
Comment #30
killes@www.drop.org CreditAttribution: killes@www.drop.org commentedupdated for D6.
Comment #31
killes@www.drop.org CreditAttribution: killes@www.drop.org commentedpatch had errors.
Comment #32
killes@www.drop.org CreditAttribution: killes@www.drop.org commentedyes another bug squished
Comment #33
killes@www.drop.org CreditAttribution: killes@www.drop.org commentedbug in upgrade path...
Comment #34
kbahey CreditAttribution: kbahey commentedThis is from a cheap bargain priced shared host. Both InnoDB and merge are supported.
Comment #35
kbahey CreditAttribution: kbahey commentedAfter the watchdog function was made into a hook, and watchdog is renamed to dblog, this patch will need to be rerolled.
Comment #36
chx CreditAttribution: chx commentedComment #37
robertDouglass CreditAttribution: robertDouglass commentedWhy critical? Because Drupal 6 runs slow like a dog who's bee shot in the leg. Every performance improvement should be marked critical. Drupal is slowly dying a performance death.
Comment #38
Crell CreditAttribution: Crell commentedThis would be blocked by #566548: Add storage engine support to schema API. anyway.
Comment #39
catchDowngrading all D8 criticals to major per http://drupal.org/node/45111
Comment #40
dawehnerThis won't land any time soon, given that it requires the other issue (#566548: Add storage engine support to schema API.) first.
Comment #41
mgiffordJust unassigning issues that haven't been developed for a bit in the D8 queue.