Displaying statistics pages can be slow. The attached patch removes one of the "GROUP BY" columns to increase the speed of generating the "Top pages in the past n days" page.
(Grouping by 'title' does not work, as there are many different paths that can have the same title. Instead, 'path' is unique for each page, so it is a more logical column to group by. I tested this on my active webpage to verify that the resulting page was what I expected.)
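To illustrate the point about grouping, here is a minimal sketch using Python's built-in sqlite3 module. The table is a simplified, hypothetical stand-in for accesslog (only the title and path columns), and the rows are made-up sample data, not taken from the patch or a real site.

```python
import sqlite3

# Simplified, hypothetical stand-in for the accesslog table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accesslog (aid INTEGER PRIMARY KEY, title TEXT, path TEXT)"
)

# Made-up sample data: two different paths share the title "Search".
rows = [
    ("Home", "node"),
    ("Home", "node"),
    ("Search", "search/node"),
    ("Search", "search/user"),
    ("Search", "search/user"),
]
conn.executemany("INSERT INTO accesslog (title, path) VALUES (?, ?)", rows)

# Grouping by title merges every path that happens to share that title.
by_title = conn.execute(
    "SELECT title, COUNT(*) AS hits FROM accesslog "
    "GROUP BY title ORDER BY title"
).fetchall()

# Grouping by path keeps each page separate.
by_path = conn.execute(
    "SELECT path, COUNT(*) AS hits FROM accesslog "
    "GROUP BY path ORDER BY path"
).fetchall()

print(by_title)  # [('Home', 2), ('Search', 3)]
print(by_path)   # [('node', 2), ('search/node', 1), ('search/user', 2)]
```

With the title grouping, the two search pages collapse into one row with 3 hits; with the path grouping, each page keeps its own count.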
Comment | File | Size | Author |
---|---|---|---|
#6 | statistics_9.patch | 1.16 KB | Jeremy |
 | statistics.module-cvs_0.patch | 615 bytes | Jeremy |
Comments
Comment #1
drumm commented: How about a key on the path column?
Comment #2
Dries commented: Committed to HEAD. Marking this active until we clarify the path index question.
Comment #3
Dries commented (no text).
Comment #4
Jeremy commented: In which case several of the columns should have keys. I was going to add them to my earlier patch, but haven't had time.
Comment #5
Cvbge commented: The SQL code as in the patch won't work with PostgreSQL:
Comment #6
Jeremy commented: The attached patch adds three keys that I confirmed are used. The keys are on path, url and uid.
Before adding a key for "path":
After:
And another query that gains from the "path" key:
Before adding the key for "url":
And after:
Before adding the key for "uid":
After:
Comment #7
Jeremy commented (no text).
Comment #8
moshe weitzman commented: sure, these indices speed up the admin pages. but remember that every index needs to be maintained for every insert. since accesslog is inserted into on every view, this is potentially harmful to a lot more people than admins. i'm not sure how to measure this tradeoff.
one approach is to copy the access log table to a read-only table and do admin pages off of the copy. that means we have one copy dedicated to reading and a different one dedicated to writing.
Comment #9
drumm commented: MySQL's INSERT DELAYED extension would be perfect for this. Unfortunately it is not ANSI SQL.
The inserts are done in the exit hook so the extra time is not usually passed on to the user (when drupal_goto() is used the exit hook execution happens before the redirect is sent). Although I'm guessing the total index maintenance time may be larger than the savings on the statistics pages.
Comment #10
killes@www.drop.org commented: The postgres part is missing.
Comment #11
Cvbge commented: I think this needs to be benchmarked. I think adding 3 indexes that need to be updated for every page access just for the sake of 1 admin-visible page is dubious.
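One possible shape for such a benchmark, sketched with Python's sqlite3 module. This only shows the method: the numbers from an in-memory SQLite database will not match MySQL or PostgreSQL on a real site, and the function name, row count, and sample values are all assumptions made for illustration.

```python
import sqlite3
import time


def time_inserts(with_indexes, n_rows=20000):
    """Time n_rows INSERTs into a hypothetical accesslog-like table.

    Returns (elapsed_seconds, rows_in_table).
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accesslog "
        "(aid INTEGER PRIMARY KEY, path TEXT, url TEXT, uid INTEGER)"
    )
    if with_indexes:
        # The three keys proposed in the patch: path, url and uid.
        for col in ("path", "url", "uid"):
            conn.execute(f"CREATE INDEX accesslog_{col} ON accesslog ({col})")

    start = time.perf_counter()
    for i in range(n_rows):
        # Made-up sample values; a real test would replay real traffic.
        conn.execute(
            "INSERT INTO accesslog (path, url, uid) VALUES (?, ?, ?)",
            (f"node/{i % 500}", f"http://example.com/ref/{i % 50}", i % 100),
        )
    conn.commit()
    elapsed = time.perf_counter() - start

    count = conn.execute("SELECT COUNT(*) FROM accesslog").fetchone()[0]
    conn.close()
    return elapsed, count


plain_time, plain_count = time_inserts(with_indexes=False)
indexed_time, indexed_count = time_inserts(with_indexes=True)
print(f"no indexes:    {plain_time:.3f}s for {plain_count} rows")
print(f"three indexes: {indexed_time:.3f}s for {indexed_count} rows")
```

A fair comparison would also time the statistics queries with and without the keys, and weigh both against how often pages are viewed versus how often the admin pages are opened.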
Comment #12
Jaza commented: -1 from me. The accesslog table has a much heavier INSERT than SELECT rate on higher-traffic sites, and as such, we should have as few keys as possible on this table, in order to optimise it for fast INSERTing, not for fast SELECTing.
Closing issue (due to my personal -1, and due to extended inactivity).