Interesting finding mentioned by JohnAlbin elsewhere: http://www.cznp.com/blog/3/strtr-vs-str_replace-a-battle-for-speed-and-d...

HEAD:

# ab -c 1 -n 50 http://drupal.test/node/1
Document Path:          /node/1
Document Length:        9371 bytes

Concurrency Level:      1
Time taken for tests:   28.608 seconds
Complete requests:      50
Failed requests:        0
Write errors:           0
Total transferred:      489700 bytes
HTML transferred:       468550 bytes
Requests per second:    1.75 [#/sec] (mean)
Time per request:       572.158 [ms] (mean)
Time per request:       572.158 [ms] (mean, across all concurrent requests)
Transfer rate:          16.72 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0   11   8.4     16      31
Processing:   469  558  55.6    547     750
Waiting:      453  540  55.5    516     734
Total:        484  569  54.6    562     750

Percentage of the requests served within a certain time (ms)
  50%    562
  66%    562
  75%    562
  80%    578
  90%    672
  95%    719
  98%    750
  99%    750
 100%    750 (longest request)

Patch:

# ab -c 1 -n 50 http://drupal.test/node/1
Document Path:          /node/1
Document Length:        9371 bytes

Concurrency Level:      1
Time taken for tests:   26.436 seconds
Complete requests:      50
Failed requests:        0
Write errors:           0
Total transferred:      489700 bytes
HTML transferred:       468550 bytes
Requests per second:    1.89 [#/sec] (mean)
Time per request:       528.723 [ms] (mean)
Time per request:       528.723 [ms] (mean, across all concurrent requests)
Transfer rate:          18.09 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0   13   5.8     16      16
Processing:   469  515  46.0    500     703
Waiting:      453  497  46.1    484     687
Total:        484  528  47.3    516     719

Percentage of the requests served within a certain time (ms)
  50%    516
  66%    531
  75%    547
  80%    547
  90%    578
  95%    594
  98%    719
  99%    719
 100%    719 (longest request)

Didn't replace the line for easier benchmarking.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

mcrittenden’s picture

Sweet! Subscribe.

Gerhard Killesreiter’s picture

I've tried this with a minimal script and I am not convinced this is an improvement.

528 47.3

569 54.6

Lookat these values, they aren't even one std dev apart.

sun’s picture

FileSize
841 bytes
2.2 KB

strtr() seems to be much slower than str_replace() when the string subject exceeds a certain char limit, approx. ~100 chars. Perhaps also the amount of replacements plays a role.

1) The patch resulting in the best performance for me. Only t(), format_plural(), and DatabaseConnection converted.

2) A simple test script.

smk-ka’s picture

Status: Needs review » Needs work

Note that strtr and str_replace are NOT equivalent replacements. From the PHP manual:

strtr() will always look for the longest possible match first and will *NOT* try to replace stuff that it has already worked on.

This means, if a string contains both @foo and @foo-bar placeholders, strtr will replace the longest match, while str_replace will replace whichever replacement variable comes first, leading to unpredictable results.

Gerhard Killesreiter’s picture

Status: Needs work » Needs review
FileSize
2.3 KB

Ah too bad, I guess this pretty much kills it.

I've refined the above script a bit

# repl, # str, strlen, str_repl, strtr, str_replace < strtr 
1, 1, 14, 0.249, 0.135, 0 
1, 2, 29, 0.245, 0.178, 0 
1, 3, 29, 0.252, 0.256, 1 
1, 4, 35, 0.254, 0.277, 1 
1, 5, 40, 0.252, 0.308, 1 
1, 6, 41, 0.254, 0.253, 0 
1, 7, 66, 0.248, 0.43, 1 
1, 8, 77, 0.251, 0.511, 1 
1, 9, 73, 0.252, 0.515, 1 
1, 10, 88, 0.246, 0.631, 1 
1, 11, 89, 0.254, 0.555, 1 
1, 12, 97, 0.261, 0.679, 1 
1, 13, 116, 0.251, 0.813, 1 
1, 14, 112, 0.259, 0.814, 1 
1, 15, 120, 0.266, 0.84, 1 
1, 16, 121, 0.253, 0.846, 1 
1, 17, 132, 0.269, 0.92, 1 
1, 18, 146, 0.308, 1.037, 1 
1, 19, 151, 0.287, 0.884, 1 
1, 20, 154, 0.291, 1.079, 1 
2, 1, 21, 0.324, 0.155, 0 
2, 2, 34, 0.328, 0.335, 1 
2, 3, 47, 0.323, 0.328, 1 
2, 4, 47, 0.342, 0.442, 1 
2, 5, 56, 0.331, 0.988, 1 
2, 6, 64, 0.346, 0.932, 1 
2, 7, 69, 0.327, 1.371, 1 
2, 8, 83, 0.325, 0.773, 1 
2, 9, 82, 0.334, 0.561, 1 
2, 10, 98, 0.332, 1.567, 1 
2, 11, 94, 0.332, 1.352, 1 
2, 12, 100, 0.345, 0.643, 1 
2, 13, 119, 0.357, 0.791, 1 
2, 14, 125, 0.346, 1.234, 1 
2, 15, 126, 0.36, 1.6, 1 
2, 16, 135, 0.382, 2.073, 1 
2, 17, 139, 0.376, 1.753, 1 
2, 18, 152, 0.399, 1.886, 1 
2, 19, 159, 0.412, 3.599, 1 
2, 20, 163, 0.383, 2.598, 1 
3, 1, 37, 0.402, 0.533, 1 
3, 2, 44, 0.406, 0.475, 1 
3, 3, 41, 0.397, 0.512, 1 
3, 4, 58, 0.4, 1.003, 1 
3, 5, 57, 0.41, 0.908, 1 
3, 6, 78, 0.41, 0.987, 1 
3, 7, 88, 0.402, 1.205, 1 
3, 8, 87, 0.408, 1.828, 1 
3, 9, 103, 0.422, 2.318, 1 
3, 10, 100, 0.421, 2.25, 1 
3, 11, 106, 0.46, 2.289, 1 
3, 12, 117, 0.412, 2.796, 1 
3, 13, 127, 0.526, 2.832, 1 
3, 14, 131, 0.497, 2.052, 1 
3, 15, 144, 0.478, 2.422, 1 
3, 16, 147, 0.76, 2.62, 1 
3, 17, 154, 0.557, 3.807, 1 
3, 18, 158, 0.504, 4.117, 1 
3, 19, 170, 0.555, 2.941, 1 
3, 20, 178, 0.555, 3.004, 1 
4, 1, 45, 0.505, 0.388, 0 
4, 2, 54, 0.476, 0.623, 1 
4, 3, 62, 0.49, 1.003, 1 
4, 4, 61, 0.491, 0.891, 1 
4, 5, 73, 0.501, 0.803, 1 
4, 6, 85, 0.51, 1.942, 1 
4, 7, 89, 0.495, 2.096, 1 
4, 8, 93, 0.499, 1.937, 1 
4, 9, 113, 0.52, 1.646, 1 
4, 10, 103, 0.502, 1.37, 1 
4, 11, 126, 0.528, 3.037, 1 
4, 12, 121, 0.55, 3.1, 1 
4, 13, 140, 0.588, 3.64, 1 
4, 14, 135, 0.567, 3.314, 1 
4, 15, 156, 0.571, 2.5, 1 
4, 16, 158, 0.581, 4.239, 1 
4, 17, 160, 0.599, 3.806, 1 
4, 18, 168, 0.6, 4.138, 1 
4, 19, 178, 0.61, 5.139, 1 
4, 20, 186, 0.577, 4.956, 1 
5, 1, 55, 0.548, 0.683, 1 
5, 2, 63, 0.552, 1.021, 1 
5, 3, 72, 0.584, 1.072, 1 
5, 4, 71, 0.567, 1.258, 1 
5, 5, 83, 0.613, 1.65, 1 
5, 6, 96, 0.559, 2.029, 1 
5, 7, 104, 0.574, 2.137, 1 
5, 8, 106, 0.611, 2.449, 1 
5, 9, 106, 0.575, 2.3, 1 
5, 10, 123, 0.632, 2.794, 1 
5, 11, 129, 0.676, 3.08, 1 
5, 12, 129, 0.664, 3.056, 1 
5, 13, 137, 0.655, 3.224, 1 
5, 14, 148, 0.678, 3.72, 1 
5, 15, 154, 0.674, 3.952, 1 
5, 16, 172, 0.702, 4.656, 1 
5, 17, 173, 0.674, 4.844, 1 
5, 18, 178, 0.698, 4.769, 1 
5, 19, 189, 0.689, 4.972, 1 
5, 20, 195, 0.701, 5.266, 1 
Gerhard Killesreiter’s picture

Status: Needs review » Closed (won't fix)

oops

Gerhard Killesreiter’s picture

Status: Closed (won't fix) » Needs work

Maybe this was a bit premature.

In some cases strtr has no advantage over str_replace, ie for format_plural it should be ok.

Gerhard Killesreiter’s picture

in t() the change is too dangerous, I think, because if the problem outlined by smk-ka.

in format_plural() it would be safe, but I believe there is only a small gain, if it doesn' hurt. There is only one string to be replaced and the strings are usually short.

In the database class it should get more investigation: Due to the prefixes the change should be safe, the strings are typically long and there are often several replacement in a string.

sun’s picture

Status: Needs work » Needs review
FileSize
1.71 KB

So we need to benchmark this one.

catch’s picture

Subscribing. Don't think we'll get much from this, but profiling 1000 node inserts from a migration, strstr takes 1% of the overall request time. 14th most expensive function for exclusive CPU time. So this is worth looking into given it's such a small change.

retester2010’s picture

Issue tags: -Performance

#9: drupal.str-replace.9.patch queued for re-testing.

Status: Needs review » Needs work
Issue tags: +Performance

The last submitted patch, drupal.str-replace.9.patch, failed testing.

DaPooch’s picture

After profiling with XHProf it is quite clear that strtr takes a HUGE performance hit for me. Certain operations take 15-30s to complete and strtr is always listed as the culprit. In some instances it gets over a million calls and can take over 50% of the total operation time. I don't recall this being so performance hefty on other drupal installs... maybe it's a server config issue. Anyone else come across this before?

dalin’s picture

This kind of patch could help improving performance of the slowest 5% of page loads. For example the ones right after clearing all the caches where a lot of DB queries happen. In the benchmarks below I've enabled all core modules, all blocks, 10 nodes on the front page, and completely disabled the cache system (by returning FALSE and array() from cache_get() and cache_get_multiple() respectively).

Two things to note:

1) I tried to setup database prefixing to confirm that these variations don't break anything, but any database prefixing that contains a '.' breaks. Or am I misunderstanding how to use prefixing:

$databases = array (
  'default' => 
  array (
    'default' => 
    array (
      'driver' => 'mysql',
      'database' => 'drupal-head',
      'username' => **********,
      'password' => **********,
      'host' => 'localhost',
      'port' => '',
      'prefix' => array(
        'default'   => 'drupal-head.',
        'users'     => 'drupal-head.',
        'sessions'  => 'drupal-head.',
        'role'      => 'drupal-head.',
        'authmap'   => 'drupal-head.',
      ),
    ),
  ),
);

2) There doesn't seem to be enough variance between the options to make a firm judgement call. Though the difference should be greater if I could get prefixing working.

First up, the original:


  public function prefixTables($sql) {
    // Replace specific table prefixes first.
    foreach ($this->prefixes as $key => $val) {
      $sql = strtr($sql, array('{' . $key . '}' => $val . $key));
    }
    // Then replace remaining tables with the default prefix.
    return strtr($sql, array('{' => $this->defaultPrefix , '}' => ''));
  }
$ ab -n100 -c5 http://drupal-head.local/
This is ApacheBench, Version 2.3 <$Revision: 655654 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking drupal-head.local (be patient).....done


Server Software:        Apache/2.2.15
Server Hostname:        drupal-head.local
Server Port:            80

Document Path:          /
Document Length:        42247 bytes

Concurrency Level:      5
Time taken for tests:   30.796 seconds
Complete requests:      100
Failed requests:        0
Write errors:           0
Total transferred:      4269200 bytes
HTML transferred:       4224700 bytes
Requests per second:    3.25 [#/sec] (mean)
Time per request:       1539.795 [ms] (mean)
Time per request:       307.959 [ms] (mean, across all concurrent requests)
Transfer rate:          135.38 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.1      0       1
Processing:   485 1520 553.5   1422    4562
Waiting:      462 1481 555.1   1390    4539
Total:        485 1520 553.5   1422    4562

Percentage of the requests served within a certain time (ms)
  50%   1422
  66%   1492
  75%   1588
  80%   1614
  90%   2288
  95%   2505
  98%   3477
  99%   4562
 100%   4562 (longest request)
  public function prefixTables($sql) {
    // Using str_replace() instead of strtr() here, as str_replace() is faster
    // the longer the string is, the more replacements happen, and when the
    // needles are delimited.
    // @see http://php.net/manual/en/function.strtr.php

    // Replace specific table prefixes first.
    $search = $replace = array();
    foreach ($this->prefixes as $key => $val) {
      $search[] = '{' . $key . '}';
      $replace[] = $val . $key;
    }
    // Then replace remaining tables with the default prefix.
    $search[] = '{';
    $replace[] = $this->defaultPrefix;
    $search[] = '}';
    $replace[] = '';
    return str_replace($search, $replace, $sql);
  }
$ ab -n100 -c5 http://drupal-head.local/
This is ApacheBench, Version 2.3 <$Revision: 655654 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking drupal-head.local (be patient).....done


Server Software:        Apache/2.2.15
Server Hostname:        drupal-head.local
Server Port:            80

Document Path:          /
Document Length:        42247 bytes

Concurrency Level:      5
Time taken for tests:   30.660 seconds
Complete requests:      100
Failed requests:        0
Write errors:           0
Total transferred:      4269200 bytes
HTML transferred:       4224700 bytes
Requests per second:    3.26 [#/sec] (mean)
Time per request:       1532.980 [ms] (mean)
Time per request:       306.596 [ms] (mean, across all concurrent requests)
Transfer rate:          135.98 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.0      0       0
Processing:   567 1519 458.9   1498    4628
Waiting:      541 1483 448.4   1465    4504
Total:        567 1519 458.9   1498    4628

Percentage of the requests served within a certain time (ms)
  50%   1498
  66%   1558
  75%   1611
  80%   1663
  90%   1779
  95%   1873
  98%   3557
  99%   4628
 100%   4628 (longest request)

Lets try a static.

  public function prefixTables($sql) {
    static $search, $replace;

    // Using str_replace() instead of strtr() here, as str_replace() is faster
    // the longer the string is, the more replacements happen, and when the
    // needles are delimited.
    // @see http://php.net/manual/en/function.strtr.php

    if (!isset($search)) {
      // Replace specific table prefixes first.
      $search = $replace = array();
      foreach ($this->prefixes as $key => $val) {
        $search[] = '{' . $key . '}';
        $replace[] = $val . $key;
      }
      // Then replace remaining tables with the default prefix.
      $search[] = '{';
      $replace[] = $this->defaultPrefix;
      $search[] = '}';
      $replace[] = '';
    }
    return str_replace($search, $replace, $sql);
  }
e$ ab -n100 -c5 http://drupal-head.local/
This is ApacheBench, Version 2.3 <$Revision: 655654 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking drupal-head.local (be patient).....done


Server Software:        Apache/2.2.15
Server Hostname:        drupal-head.local
Server Port:            80

Document Path:          /
Document Length:        42247 bytes

Concurrency Level:      5
Time taken for tests:   29.962 seconds
Complete requests:      100
Failed requests:        0
Write errors:           0
Total transferred:      4269200 bytes
HTML transferred:       4224700 bytes
Requests per second:    3.34 [#/sec] (mean)
Time per request:       1498.106 [ms] (mean)
Time per request:       299.621 [ms] (mean, across all concurrent requests)
Transfer rate:          139.15 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.1      0       1
Processing:   853 1482 406.3   1460    4438
Waiting:      835 1451 407.4   1427    4403
Total:        853 1482 406.3   1460    4438

Percentage of the requests served within a certain time (ms)
  50%   1460
  66%   1512
  75%   1572
  80%   1588
  90%   1744
  95%   1917
  98%   2656
  99%   4438
 100%   4438 (longest request)
  public function prefixTables($sql) {
    static $replace;

    // Using str_replace() instead of strtr() here, as str_replace() is faster
    // the longer the string is, the more replacements happen, and when the
    // needles are delimited.
    // @see http://php.net/manual/en/function.strtr.php

    if (!isset($replace)) {
      // Replace specific table prefixes first.
      $replace = array();
      foreach ($this->prefixes as $key => $val) {
        $replace['{' . $key . '}'] = $val . $key;
      }
      // Then replace remaining tables with the default prefix.
      $replace['{'] = $this->defaultPrefix;
      $replace['}'] = '';
    }
    foreach ($replace as $from => $to) {
      $sql = strtr($sql, array($from => $to));
    }
    return $sql;
  }
$ ab -n100 -c5 http://drupal-head.local/
This is ApacheBench, Version 2.3 <$Revision: 655654 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking drupal-head.local (be patient).....done


Server Software:        Apache/2.2.15
Server Hostname:        drupal-head.local
Server Port:            80

Document Path:          /
Document Length:        42422 bytes

Concurrency Level:      5
Time taken for tests:   30.074 seconds
Complete requests:      100
Failed requests:        0
Write errors:           0
Total transferred:      4286700 bytes
HTML transferred:       4242200 bytes
Requests per second:    3.33 [#/sec] (mean)
Time per request:       1503.700 [ms] (mean)
Time per request:       300.740 [ms] (mean, across all concurrent requests)
Transfer rate:          139.20 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.1      0       1
Processing:   563 1488 423.2   1467    4793
Waiting:      546 1446 424.7   1427    4775
Total:        563 1488 423.2   1467    4794

Percentage of the requests served within a certain time (ms)
  50%   1467
  66%   1536
  75%   1590
  80%   1631
  90%   1685
  95%   1854
  98%   2415
  99%   4794
 100%   4794 (longest request)
dalin’s picture

So I figured out how to do the database table prefixing (had to enclose the DB name in ``) and setup prefixes on 19 tables + default. I then repeated benchmarks on the four scenarios that tested in #14 on the front page with 10 nodes, all modules and blocks enabled, anon has all perms, and completely disabled the cache system (by returning FALSE and array() from cache_get() and cache_get_multiple() respectively) to simulate a cache miss. There are 478 queries per page. I did three runs for each scenario. The numbers below are for requests-per-second and number of ms that 75% of pages were served.

no patch
2.21rps 2336ms
2.13rps 2437ms
2.01rps 2582ms
ave: 2.12rps 2451ms

single str_replace()
2.36rps 2164ms
2.36rps 2192ms
2.25rps 2358ms
ave: 2.32rps 2238ms
improvement: 9.4% 8.7%

single str_replace plus static search/replace arrays
2.44rps 2118ms
2.41rps 2135ms
2.37rps 2199ms
ave: 2.41rps 2151ms
improvement: 14% 12%

static replace array + multiple strtr()
2.17rps 2426ms
2.17rps 2383ms
2.18rps 2386ms
ave: 2.17rps 2398ms
improvement: 2.4% 2.2%

Based on these results plus those in #14 it looks like the change makes little improvement when there are no prefixes, but if there are we can see a good gain on high-query-count pages which seems to confirm @sun @Gerhard's research above. I'll throw together a patch using variation #3. Someone correct me if I'm wrong, but prefixes will not change during page generation so we should use plain old static over drupal_static.

dalin’s picture

Status: Needs work » Needs review
FileSize
1.62 KB

Status: Needs review » Needs work

The last submitted patch, 561422_strtr.diff, failed testing.

dalin’s picture

Status: Needs work » Needs review
FileSize
1.39 KB

Testbot answered my question about statics. Simpletest is an example where the prefix changes during the page request. So my first thought was to use the drupal_static_fast pattern, only to discover that it won't work here. To use a static we'd need a workflow something like:
- setup the static variable
- the database connection or prefixes change so reset the static
However we don't have a concept of a "current" connection like we did with db_set_active(). All connections are active and you just choose which one to act on at any given time. Therefore I don't think statics will work.

New patch using variation 2 from #14 and #15.

sun’s picture

Not sure whether str_replace() is really appropriate here. The last two replacements in the stack are not unique. Furthermore, the passed in $sql string is often a single table name only.

However, playing a bit around with strtr(), it seems like we could at least cut this down to a single strtr().

dalin’s picture

FileSize
10.41 KB

Not sure whether str_replace() is really appropriate here. The last two replacements in the stack are not unique. Furthermore, the passed in $sql string is often a single table name only.

The substitutions just need to have the default happen last. str_replace() does replacements in the order of the $search array, this works fine because we add the default to the array last. strtr() does replacements in order of longest to shortest, this works fine because '{' and '}' are always shorter than "{$key}". So either will work fine.

I did benchmarks of the patch in #18 vs. #19 vs. un-patched for 0 prefixes and 19 tables prefixed. 10 nodes on the front page, all modules and blocks enabled, anon has all permissions, the caching system is disabled, 477 queries on the page. No XDebug, APC is enabled. Results attached.

Neither patch makes a statistically significant effect when there aren't any tables prefixed. However with 19 tables prefixed sun's strtr() patch makes a ~3.9% improvement, while my str_replace() makes a ~11% improvement.

dalin’s picture

Component: base system » database system
FileSize
1.66 KB

I wanted this for a D6 site as well. Here's a patch.

This might actually be the way to do it for D7 as well since it uses strtr() if $db_prefix is not an array.

JamesSharpe’s picture

This issue becomes important when there are a large number of prefixes e.g. when using CiviCRM 4 in split database mode where all civicrm tables have a prefix. I was seeing 34% of the page generation time being spent in the prefixTable code. See the benchmarks in #1192620: Performance improvement in prefixTables

sun’s picture

Version: 7.x-dev » 8.x-dev
Priority: Normal » Major
Issue tags: +Needs backport to D7

#1192620: Performance improvement in prefixTables has been marked as duplicate, contains a patch that's 99% identical to #19.

Last comment over there:

the other [this] issue hasn't considered the case of large numbers of tables; in my case 34% of the page generation time was spent in strtr calls which almost makes Drupal 7/CiviCRM 4 sites unusable when configured with split databases.

lyricnz’s picture

Version: 8.x-dev » 7.x-dev
Priority: Major » Normal
Issue tags: -Needs backport to D7
FileSize
1.78 KB

Here's a patch for D7/D8 that uses str_replace() rather than strtr() for database prefixing, including pre-calculating the arguments. It uses instance variables rather than static variables, to support >1 DatabaseConnection, and to minimize the code in prefixTables().

lyricnz’s picture

Version: 7.x-dev » 8.x-dev
Priority: Normal » Major
Issue tags: +Needs backport to D7

Fixing crosspost.

pillarsdotnet’s picture

I see a repeated pattern of finessing code to gain a 1-5% performance improvement (per-function-call, not overall) based on weird, illogical quirks of the current version of PHP. So we invent silly rules that result in a patch being rejected because it uses (e.g.) if (empty($foo)) rather than the 5% faster if (!isset($foo)).

If we're not willing to work toward a 500% improvement in performance, why do we kill ourselves over these tiny incremental gains?

lyricnz’s picture

The change proposed here is not based on some illogical quirk - str_replace() is both logically and practically faster then strtr(). The optimization proposed here is more like 60% faster (typical case: medium length SQL, with single prefix) to 95% faster (medium length SQL, 100 prefixes), for a function that is called a few hundred time per page.

Who says we're not working towards hiphop?
http://fourkitchens.com/blog/2010/02/03/making-drupal-pressflow-more-mun...
http://php.webtutor.pl/en/2011/05/17/drupal-hiphop-for-php-vs-apc-benchm...

catch’s picture

I have never seen a patch rejected on isset vs empty based on performance, plenty of times because they were incorrectly used.We do avoid array_key_exists() unless it's absolutely necessary, function calls are slower than language constructs regardless of php version. Examples please?

DB table prefixing has been found responsible for 30% slow downs on installs running civicrm, micro-optimization is doing this to code that runs rarely, not that runs potentially hundreds of times on every request.

I'm typing from a phone but would very much question the 500% claim with hiphop. Benchmarks against D6 with hiphop didn't get anything like that vs APC. The main issue with hiphop and D7/8 is not that people don't want to support it, but the poor support in hiphop for pdo and stream wrappers that we rely on, which makes it tricky.

pillarsdotnet’s picture

@catch:

We do avoid array_key_exists() unless it's absolutely necessary

Why? Incremental performance gains. We avoid is_array() and a handful of other PHP functions for the same reason. I wish there was a page somewhere to document the exact subset of PHP that Drupal finds acceptable.

Examples please?

(Looking...) Okay, it was !isset() vs. is_null(). In this patch review, @sun authoritatively declared, "Always use !isset() instead of is_null()." Why? 5% performance gain is the only reason I can find. Conversely, coder recommends replacing every substr() by the much slower drupal_substr(), even when the result is guaranteed to contain no unicode byte sequences. So I think there is some inconsistency regarding these rules.

catch’s picture

If there's no meaningful difference between is_null() and isset() most of the time, why not use the one that's 5% faster? More to the point, why not pick one to standardize on (same as we standardize on print vs. echo despite print being nominally slower) so that there is consistency (same as we do for elseif vs. else if). I'm not sure what you think the problem is?

is_array() isn't avoided, but there are cases where it is called hundreds (or thousands) of times during a request and eats a bunch of cpu. drupal_attributes() in core, and unpack_options() in views are two examples. Sometimes things get solved by trying to optimize the function call, sometimes it's via major refactoring, sometimes both. Drupal's usage is not good, and got considerably worse between D6 and D7, so if there are quick wins for marginal gains they are better than nothing.

Coder is likely playing on the safe side for multibyte string compatibility, if you want to see what the actual 'rules' are look at http://drupal.org/coding-standards, if you think a coding standard is 'silly' open an issue to discuss changing it.

Also, http://drupal.org/sandbox/sun/1181434

sun’s picture

The primary reason we're using isset($foo) instead of is_null($foo) is that is_null($foo) throws a PHP notice if $foo is undefined. is_null() only works silently when $foo is actually defined (NULL or an actual value). It's not about performance, it's about preventing sloppy PHP notices. Lastly, is_null() is a negated condition in 99% of all cases - following positive condition logic is much easier for the human brain.

@pillarsdotnet: If you want to discuss a handbook page or whatever other documentation for these things, then open a new issue, please. While I can understand your rant, it's side-tracking the work on the actual issue here.

dalin’s picture

Status: Needs review » Needs work

And getting us back to the actual issue at hand...

- Great to see that this works now for more than one database connection.
- Other than that this is almost exactly the same as the patch in #18 except that it removes several of the inline comments. I think we should bring those back in to describe the logic behind what may not be immediately obvious.

Other than that it looks great.

lyricnz’s picture

Status: Needs work » Needs review
FileSize
1.89 KB

The latest patch is a bit different in that it pre-calculates the arguments for str_replace() rather than doing this for each invocation of prefixTables().

I think the long comment in #18 is redundant - once the change is committed, the use of strtr() becomes a historical legacy. Updated patch with slightly more verbosity about the table-specific replacement happening ahead of the default prefix.

sun’s picture

Status: Needs review » Reviewed & tested by the community

Looks ready to fly for me.

catch’s picture

Title: Performance: Replace strtr() with str_replace() » Replace strtr() with str_replace() for db prefixing
Issue tags: +Needs backport to D6

There is a D6-only issue at #42827: optimize/enhance db prefixing, looks like the new code is not as different from there as I thought when updating that issue in 2009, so marking this one for backport.

The patch here is simpler than the patch over there which is also good.

lyricnz’s picture

FWIW I just benchmarked another alternative - using str_replace() + creating $search/$replace on the fly, and only including elements that were actually in the string (using strpos()). As expected, performance-wise, it came in between the precalculated version, and the on-the-fly version. Here's the test harness, and results for various numbers of prefixes.

Here's the results (in ms) for 200 iterations of 100 prefixes (roughly realistic for Civicrm)

    [100] => Array
        (
            [prefixTables] => 485.5189
            [prefixTables_single_strtr] => 64.4381
            [prefixTables_precalc_singletr] => 53.4070
            [prefixTables_multiple_str_replace] => 38.5571
            [prefixTables_single_str_replace] => 37.7331
            [prefixTables_precalc_single_str_replace] => 23.6192
            [prefixTables_culled_single_str_replace] => 30.1590
        )

In order these are:

  1. the current code - multiple strtr()
  2. a single strtr() per invocation
  3. as above, but precalculating the arguments to strtr()
  4. straight replacement of current code - strtr() for str_replace() - i.e. multiple per invocation
  5. a single str_replace per invocation
  6. as above, but precalculating the arguments to str_replace()
  7. as #5, but checking if the {tablename} exists in $sql first with strpos() before adding it to the search/replace

So the fastest (in all cases, as it happens) is precalculating the arguments and using a single str_replace(). About 20 times faster than current code.

sun’s picture

Thank you for the additional benchmarks!

So the chosen RTBC'ed solution prefixTables_precalc_single_str_replace wins on all ends, and will even shave of some ms for the majority of sites that don't use prefixes, it seems. Very nice.

webchick’s picture

Status: Reviewed & tested by the community » Needs review

Awesome! This looks like a nice improvement.

Since this affects the database layer I'd like to see a DB maintainer sign off on this, ideally Crell.

catch’s picture

This may help with test run speed, do we have a way to find out?

johnp’s picture

Oh, nice to see that there is work going on the prefixing issue!

When I've been profiling drupal 6 sites I noticed that db_prefix_tables took surprisingly long time, so I whipped together my own version. I've used it for a Drupal 6 multisite, with partly shared database, for about a year. It is probably not as elegant, nor as fast as the solution you guys have up come with for D7, but it is about 5-7 times faster than the original when I've tested it . Perhaps someone can find it useful in the wait for a backport to D6.

Here is the dirty hack:

function db_prefix_tables($sql) {
global $db_prefix;

$pattern = '/\{(.+?)\}/';
preg_match_all($pattern, $sql, $tables);
$tables_len=sizeof($tables[0]);
$i=0;
$prefix_default = (array_key_exists("default",$db_prefix)) ? $db_prefix["default"] : "";

while ($tables_len > $i) {
$table = trim($tables[1][$i]);
$table_esc = trim($tables[0][$i]);
$prefix = (array_key_exists($table,$db_prefix)) ? $db_prefix[$table] : $prefix_default;
$sql = str_replace($table_esc,$prefix.$table,$sql);
$i++;
}
return $sql;
}

// John

Crell’s picture

Status: Needs review » Needs work
+++ b/includes/database/database.inc
@@ -288,6 +288,13 @@ abstract class DatabaseConnection extends PDO {
+  /**
+   * Pre-calculated parameters for use in prefixTables().
+   *
+   * @var array
+   */
+  protected $prefixSearch = array(), $prefixReplace = array();

These should be separate declarations with their own docblocks. I wasn't even aware that PHP allowed multiple properties to be declared together like that. :-) We don't do so anywhere else, so let's not start here.

+++ b/includes/database/database.inc
@@ -391,6 +398,19 @@ abstract class DatabaseConnection extends PDO {
+    $this->prefixSearch = $this->prefixReplace = array();

Same here. This is not a typical pattern in Drupal. Let's not use it.

Powered by Dreditor.

Other than those stylistic issues this looks like a great optimization. Thanks, lyricnz!

lyricnz’s picture

Status: Needs work » Needs review
FileSize
2.01 KB
Crell’s picture

Status: Needs review » Reviewed & tested by the community

I do love it when we can precompute things.

webchick’s picture

Version: 8.x-dev » 6.x-dev
Status: Reviewed & tested by the community » Patch (to be ported)
Issue tags: -Needs backport to D7

Awesome. :) Thanks for the sanity check, Crell!

Committed to 8.x and 7.x. Marking for backport to D6.

Damien Tournoud’s picture

Title: Replace strtr() with str_replace() for db prefixing » [7.x and 8.x broken] Replace strtr() with str_replace() for db prefixing
Version: 6.x-dev » 8.x-dev
Category: task » bug
Priority: Major » Critical
Status: Patch (to be ported) » Needs work

This completely broke SQLite on both branches.

Damien Tournoud’s picture

Status: Needs work » Needs review
Issue tags: +Needs backport to D7
FileSize
1.21 KB

Please commit the attached.

chx’s picture

Status: Needs review » Reviewed & tested by the community

So SQLite adds a dot for each prefix which is not reflected in the new prefixSearch and prefixReplace properties. The most important part of the patch is $this->setPrefix($prefixes); which makes this happen.

The other hunk, line by line: it's no longer necessary to mess with $this->prefixes because setPrefix will set it up for us, so the reference is removed. Then, instead of a numeric key we put the default prefix where setPrefix expects it to be. Finally, the foreach is just simplification, reuse the $prefixes variable.

lyricnz’s picture

My bad. But... perhaps we should be bot-testing in some non-MySQL databases?

Damien Tournoud’s picture

This was properly caught three days ago by our testing of secondary databases: https://drupaltesting.org/jenkins/job/test-drupal7-linux-sqlite/

lyricnz’s picture

Oh sweet, I never knew about that.

aspilicious’s picture

I trust you guys but isn't there a possibility to run the tests with this patch on the sql lite test system BEFORE we commit this?

Damien Tournoud’s picture

Status: Reviewed & tested by the community » Needs work
Issue tags: +API change

There is another fix needed for DatabaseConnection_sqlite::queryTemporary(). Also, this is an API change, it should be marked as such. Most of the contrib database drivers will likely have to be updated.

Damien Tournoud’s picture

Status: Needs work » Needs review
FileSize
4.01 KB

This issue introduced an incompatible API change to the Database Engine API. While we are at it, let's clean-up this and remove the defaultPrefix variable. It will make everyone's life easier.

Also fixed the temporary query handling in SQLite, which also needs to mess-up with prefixes.

Damien Tournoud’s picture

FileSize
4.01 KB

Fix typo.

Damien Tournoud’s picture

webchick’s picture

Issue tags: +Release blocker

Ack!! Good catch, Damien!

Tagging "release blocker". If this patch is ready to go in the next 24 hours or so (would love to see Crell chime in again, if possible), I'll commit it, else I'll roll back the earlier patch so we don't ship 7.3 with broken SQLite support.

Crell’s picture

I didn't realize we had an SQLite test bot now either. Do we have one for Postgres, as well? Does it approve of #54?

It seems sane to me, and I like moving the prefixes into an internal getter method. If Damien and the bots are happy with this on SQLite and Postgres, then I am too.

lyricnz’s picture

Status: Needs review » Needs work

setPrefix() should not include literal 'default' in the arrays used for the str_replace() - this is undesirable replacement. Also, this change makes 'default' an unusable table name for explicit prefixing - perhaps we should use a non-valid table name, like '#default'.

lyricnz’s picture

And the description of field $prefixes "The non-default prefixes used by this database connection" is no longer accurate.

lyricnz’s picture

Status: Needs work » Needs review
FileSize
4.25 KB

Here's an update of the patch in #54 that updates the field comment, and excludes 'default' from the search/replace arrays - it's still used at the end, of course. I haven't changed the name of the default prefix in the array.

lyricnz’s picture

Should function name be plural?

  /**
   * Return the list of prefixes used by this database connection.
   */
  public function getPrefix() {
    return $this->prefixes;
  }
Crell’s picture

Status: Needs review » Needs work

Hm. Yeah, it probably should be since it returns an array.

lyricnz’s picture

I suspect the reason for the name was so that it matched setPrefix() - which supports both $prefix and $prefixes style invocation. Thoughts?

pillarsdotnet’s picture

setPrefix() should then be likewise renamed.

lyricnz’s picture

Status: Needs work » Needs review
FileSize
4.8 KB

Renamed both.

webchick’s picture

We can't arbitrarily rename functions in a stable point release, so we'll need a D7 version of this patch as well.

lyricnz’s picture

So, rename or not rename? I'm not terribly bothered either way.

pillarsdotnet’s picture

lyricnz: no rename. Then suggest rename in a follow-up d8-only patch.

@webchick: See #60 for non-rename version.

chx’s picture

FileSize
1.25 KB
4.09 KB

No rename. Also, even the most rabid Java OOP fiend would not use a getter to get a property in the same object! I really enjoy (or not) the debate whether properties should have setters and getters and you know where I am on that but the same class? Gimme a break.

This is a reroll of #60 and the interdiff.

lyricnz’s picture

#69 looks good to me, but would like a DB maintainer to review before RTBC - since the patch is a little bit mine.

Damien Tournoud’s picture

Status: Needs review » Reviewed & tested by the community

Tested #69 on SQLite and it passed. I'm very happy with it.

Damien Tournoud’s picture

The patch applies cleanly to both to 7.x and 8.x.

lyricnz’s picture

Thanks team.

Crell’s picture

*stamps his approval*

chx’s picture

Note: you can not have a specific prefix for a table named default after this. However, default is a reserved SQL word for everything due to its being used in CREATE / ALTER table probably. I checked this in SQL 92, SQL 99, SQLite, MySQL, PostgreSQL, MS SQL and Oracle. So you can't have a table named default anyways.

Damien Tournoud’s picture

Note: you can not have a specific prefix for a table named default after this.

That was true before. The structure of the prefix array (from settings.php) is done that way. There are absolutely no functional change in this patch.

webchick’s picture

Status: Reviewed & tested by the community » Fixed

Ok great. Thanks a lot for rallying on this, folks!!

Committed to 7.x and 8.x!

webchick’s picture

Title: [7.x and 8.x broken] Replace strtr() with str_replace() for db prefixing » Replace strtr() with str_replace() for db prefixing
jhodgdon’s picture

This issue is tagged "API change". If it is actually an API change, can someone please provide a summary, if it needs to be added to the update docs in any way?

Also, just pointing out that it is marked "needs d6 port", so once the issue of update docs is resolved, it probably needs to be marked d6 / to be ported?

jhodgdon’s picture

Status: Fixed » Needs work
Issue tags: +Needs documentation updates

Changing status until doc question is resolved, so it doesn't get closed behind my back.

catch’s picture

Priority: Critical » Major

This is an API change for contrib database drivers, I don't know what the documentation would look like, maybe just a note and point to this issue.

And it does need backporting to D6, unless we shifted back to #42827: optimize/enhance db prefixing for that.

Downgrading to 'major' since that was original status prior to the regression.

jhodgdon’s picture

To document API changes, it's very helpful to have a before/after code example, and a short narrative summary of what changed. Neither of which is obvious to me at first glance on this issue. The issue title is all about strstr, but the patch doesn't seem to have anything to do with that, and seems to have removed part of some database class?

catch’s picture

There are two patches:

http://drupal.org/node/561422#comment-4640200 - this fixed the original bug but broke sqlite.

http://drupal.org/node/561422#comment-4659224 - this fixed sqlite. The code change example would be the hunk in that patch dealing with sqlite.

It is possible that contrib drivers aren't affected by this issue in the same way that mysql and postgres didn't need any specific changes.

Crell’s picture

I suspect drivers will only need to be updated if they have custom prefix handling of some kind. Damien obviously knows about it, but we should ping Andrea as well.

catch’s picture

Status: Needs work » Fixed
Issue tags: -Needs documentation updates

http://drupal.org/update/modules/6/7#driver_prefix - not amazing but should point people in the right direction.

pillarsdotnet’s picture

catch’s picture

that is what I mean, thanks.

lyricnz’s picture

Version: 8.x-dev » 6.x-dev
Status: Fixed » Active

Needs backport to D6

lyricnz’s picture

Status: Active » Needs review
FileSize
1.57 KB

str_replace() version using a static in db_prefix_tables()

pillarsdotnet’s picture

Note that "0 pass(es)" means that it didn't pass, despite the misleading green-tinted "PASSED" prefix.

EDIT: Or maybe that it simply didn't run any tests? At least the test details show that the patch applied properly. Hmm...

lyricnz’s picture

Does D6 simpletest fiddle with prefixes on the fly? Perhaps the static cache is preventing prefixes working.

dalin’s picture

Status: Needs review » Needs work
Issue tags: -Release blocker, -Needs backport to D7

"Does D6 simpletest fiddle with prefixes on the fly?"

Yes it does.

lyricnz’s picture

Status: Needs work » Needs review
FileSize
1.4 KB

With no explicit "set prefix" method, or object that contains enough context, we can't easily [1] cache the arrays used to perform the substitution. So, the closest we can get to a backport is doing a single str_replace on a dynamically generated array. This is option "prefixTables_single_str_replace" from #53 above - about 12x faster than the current code.

[1] Well.. if we kept a second copy of $db_prefix and compared for equality in each db_prefix_tables(), before regenerating the cache, we could still cache it. But that feels a bit ugly+expensive. Thoughts?

lyricnz’s picture

Status: Needs review » Needs work

0 passes. That's strange - simpletest works OK here. Is the testbot doing something wierd?

pillarsdotnet’s picture

Ask rfay to be sure, but I think that the d6 testbot *only* checks whether the patch applies; it doesn't actually run any tests.

catch’s picture

No unfortunately that is a genuine failure (or non-pass or whatever), there's 190 assertions in D6:

http://qa.drupal.org/pifr/test/16784

Very likely it is the static cache. If you look at the 6.x-1.x memcache branch there is some simpletest-specific code in there to handle the same problem, it's not pretty though.

lyricnz’s picture

Latest patch does not include any cache at all - still failed while accessing the DB tables.

andypost’s picture

Version: 6.x-dev » 8.x-dev
Issue tags: +Needs backport to D7

As I see it takes a lot of page request time http://pix.am/qGe6/

Currently str_replace() is used, also marked as duplicate #960798: Faster DatabaseConnection::prefixTables() when database prefix is an array
Test can be found at #960798-3: Faster DatabaseConnection::prefixTables() when database prefix is an array

catch’s picture

Version: 8.x-dev » 6.x-dev
Issue tags: -Needs backport to D7

@andypost, this was committed to 7.x and 8.x already?

andypost’s picture

Version: 6.x-dev » 8.x-dev
Issue tags: +Needs backport to D7

Results for script from #960798-3: Faster DatabaseConnection::prefixTables() when database prefix is an array

php 5.3.8 - str_replace is much faster! strtr 0.8406500816 str_replace 0.0321018696
php 5.2.17 results are looks the same 0.0041248798 0.0023090839

andypost’s picture

Version: 8.x-dev » 6.x-dev
Issue tags: -Needs backport to D7

Crossposted, suppose we should check this for upcoming PHP 5.4

pillarsdotnet’s picture

Issue tags: +Needs backport to D7

@andypost:

As I see it takes a lot of page request time http://pix.am/qGe6/

Can you provide a current patch? I don't see where the patch you reference could possibly be faster than the current code, but I would welcome benchmarks to prove otherwise.

At any rate, if you do have a solution that provides better performance than the current code, you should probably open (or re-open) a separate issue, as this one is currently focused on backporting the performance improvements to 6.x.

andypost’s picture

@pillarsdotnet I see no ability to improve performance for D7 but for D6 it's #93 probably makes sense

d7 & d8 implementation str_replace() d6 implementation strtr()

pillarsdotnet’s picture

@andypost:

suppose we should check this for upcoming PHP 5.4

I can assure you that the current 7.x and 8.x code runs fine on my PHP 5.5 site. I'm not certain the Drupal 6.x will ever fully support PHP 5.3 or later.

for D6 it's #93 probably makes sense

Well, that's why the issue version is currently at 6.x-dev. The problem with the patch in #93 is that it does not pass the 190 core tests. If you can produce a patch that passes tests, it would be greatly appreciated.

Off-topic:
Can anyone point to a consensus agreement on whether the needs backport to D7 tag should be removed for issues that have already been backported to D7? Sometimes it gets removed, and sometimes it stays; we should decide on one or another and document the decision.

EDIT: Got an answer from webchick and updated the docs.

donquixote’s picture

Status: Needs review » Needs work

Just discovered this problem myself, on a D6 site where the settings.php defines a list of 187 civicrm tables.
I had literally 149.200 calls to strtr() in one request.

I experimented with 5 different solutions:
- original D6 strtr() implementation.
- str_replace() without cache as suggested in #93 (lyricnz)
- str_replace() with static cache (home-made)
- preg_replace_callback() without cache (home-made)
- preg_replace_callback() with static cache in the callback (home-made)

I used array_slice() to experiment with $db_prefix arrays of different size.
I did benchmarking, and compared the result with the result from the original, on a normal page request.

Basically, in every db_prefix_tables() I let all 5 solutions run one after the other, track the time for each, and sum up.
After 500 calls (those that usually happen within one page request), I stop and print the numbers.
Then I do some spreadsheet magic, to calculate what you see here.

solution cache? 187 entries 9 entries 3 entries
str_replace() - 5.612 2.235 1.540
preg_replace_callback() - 10.087 2.360 1.400
str_replace() static 19.499 4.604 2.732
preg_replace_callback() static 31.431 2.899 1.585

higher number = faster
All numbers are relative to the original algorithm.
A number of 31.431 means, this method is around 31.431 times as fast as the original.
This is only for the time for db_prefix_tables(), not the full page generation time.

Conclusion:
preg_replace_callback() does pay off for large numbers, but it only would make sense with a case distinction logic. Probably not worth it (although I love it so much)
The static cache does usually pay off.
We see the biggest effect (not surprisingly) for large numbers.
The static cache could be implemented differently (as an instance variable, etc), which might yield different results.

dalin’s picture

@donquixote your research seems to confirm the D6 research that I made in #20 a year ago.

Anonymous’s picture

Status: Needs work » Needs review

#93: 561422-prefix-strreplace.patch queued for re-testing.

Owen Barton’s picture

Subscribe

fen’s picture

Subscribe

andypost’s picture

But whats about measures 5.x vs 5.3 as I posted at #561422-100: Replace strtr() with str_replace() for db prefixing

pillarsdotnet’s picture

@#110 by andypost on October 13, 2011 at 9:18pm:

But whats about ...

We need a patch that passes core tests. Nothing else matters.

andypost’s picture

As @donquixote pointed in #105 the faster results is static preg_replace_callback() also this looks reasonable to make it ported to upstream.

@pillarsdotnet Both patches are passes tests
current
#561422-93: Replace strtr() with str_replace() for db prefixing
and
#960798-3: Faster DatabaseConnection::prefixTables() when database prefix is an array

But also I'd like to make benchmarks on a small site instance shared 10-30 fields.

Can we make a test of real impact of prefixed tables? (I have no idea of scenario)

pillarsdotnet’s picture

Category: bug » task
Priority: Major » Minor

No, the patch in #93 does not pass tests, despite the green color. According to catch in #96:

No unfortunately that is a genuine failure (or non-pass or whatever), there's 190 assertions in D6:

http://qa.drupal.org/pifr/test/16784

So we are still waiting for a patch that passes tests.

Also, this is not a "bug report" because nothing is broken.

This is only an opportunity to improve performance for Drupal-6 websites that are running on PHP 5.3 or later.

Adjusting category, priority, and status accordingly.

donquixote’s picture

@pillarsdotnet
Are you aware how severe the performance penalty is on the typical CiviCRM use case we are talking about?
For me that's easily x2 for any request. As mentioned in #105, I saw 149.200 calls to strtr() on one request.

That's not just strtr vs str_replace, the real problem is that our strtr is sitting inside a foreach loop. Even if we stick with strtr, getting it out of foreach should result in a big improvement.

Of course, this only applies to those people with a reason for setting this prefix array. But that's a documented feature of Drupal, with no warning about performance.

pillarsdotnet’s picture

Of course, this only applies to those people with a reason for setting this prefix array. But that's a documented feature of Drupal, with no warning about performance.

Sounds like a documentation problem, and you strengthen the point I made earlier. There is an opportunity to improve performance for:

  1. Drupal-6 sites
  2. On PHP 5.3 or later
  3. With CiviCRM
  4. Using split prefixes.

The only time this issue was legitimately a "Critical" "Bug report" was in #45 through #77, where the changes made here "completely broke SQLite on both branches."

Anyhow, complaining is really pointless until someone submits a working patch.

donquixote’s picture

> Anyhow, complaining is really pointless until someone submits a working patch.

Yep.
I would love to.
Unfortunately, I have no idea why the test bot does not run any tests on #93. It seems this is rather a test bot problem, than a problem with the patch. What can we do?

donquixote’s picture

> On PHP 5.3 or later

And I still don't see how this is relevant. Unless you play these points as "OR".
In the civicrm case mentioned above, we could get an improvement even with strtr, just by getting it out of the foreach loop.

dalin’s picture

Priority: Minor » Normal

Since the testbot changes $db_prefix on the fly (to run the simpletests) I'm guessing that this is an error with the patch, not the bot. Can you get the tests to run locally?

Also, any thoughts on using the technique from the original D6 patch in #20 to use strtr() if $db_prefix is not an array?

Also when you do your benchmarking be sure that neither xDebug or XHProf are being loaded as PHP extensions. They cause a lot of additional overhead on internal PHP functions such as is_array(), and string functions which do not exist in a standard PHP installation (even if you have not enabled profiling).

Also your patch does not conform to coding standards (you are using camel case).

And this is not a minor issue since it can be summarized as "If you flip this switch your performance inexplicably drops by 25%".

lyricnz’s picture

Coding standards aside (the code is a backport, from an object-based solution), the patch in #93 doesn't cache anything, so shouldn't be impacted by simpletest changing the prefix. I'll look at it again now.

Mark Theunissen’s picture

Status: Needs work » Needs review

#93: 561422-prefix-strreplace.patch queued for re-testing.

Mark Theunissen’s picture

Here's a patch with what I think was the problem, basically the case where the $db_prefix is a simple string was not accounted for, it was setting the default_prefix to be '' instead of $db_prefix.

I also cleaned up the coding standards problems.

lyricnz’s picture

Yeah, that looks like the bug. Other than renaming the local variables, that's the same as #93 right?

Mark Theunissen’s picture

Yup.

So I guess we can't use a static cache because of the changing-prefixes-on-the-fly use case? We suspect Simpletest needs to change prefixes on the fly, but is there any other situation where this is required?

Mark Theunissen’s picture

Issue tags: -API change
FileSize
1.64 KB

Removing tag 'API change' because this does not break the API.

Here is a new patch that uses preg_replace_callback and doesn't precompute any arrays beforehand. It should work with PHP 4 and 5, as it doesn't use a closure for the callback but rather a real function.

My testing shows this is by far the fastest method, so we should go with this unless there is a problem with the approach?

My test was saving a view on a site with 151 entries in the db_prefixes array, using XHProf to determine the total inclusive wall time spent in the function db_prefix_tables.

There are 9225 total calls to db_prefix_tables on a View save page.

Inc. wall time:

15,503,778 microsecs (15 seconds) with Drupal 6 core default.
3,509,399 microsecs with patch in #121.
433,416 with this patch (using preg_replace_callback).

So with preg_replace_callback it's 12x as fast as the patch in #121.

Mark Theunissen’s picture

Actually, we don't need the case insensitive match in the regular expression which may speed it up and it certainly doesn't hurt to remove it.

Mark Theunissen’s picture

One more patch, just noticed it's probably safer to use !is_array(), instead of is_string().

lyricnz’s picture

Status: Needs review » Needs work

preg_replace_callback() is not universally faster - it depends very much on the number of substitutions and the number of prefixes - see comment 105 above. 151 entries is definitely atypical - any solution needs to be the same or better in the configuration used on 99% of sites: 0-1 prefix.

Also, you seem to have reverted to strtr() in the degenerate (and most common) case. Wasn't str_replace() already proven to be significantly faster?

Mark Theunissen’s picture

Status: Needs work » Needs review

For the configuration used on 99% of D6 sites, the 0-1 prefix, my patch leaves Drupal 6 code unchanged. This I believe is the safest option, as Drupal 6 has been out for many years now and optimizing this routine has never been a priority, as it's already very fast. I've just done some testing on a stock Drupal 6 install, and I was unable to produce a statistically significant difference between using str_replace() with two arrays and using strtr().

We also cannot use a static cache as this will break Simpletest, and it is also an API change in Drupal, which could potentially break hundreds of sites. So static cache of $search and $replace arrays are out, in my opinion. Removing those from the results, you'll see preg_replace_callback() is faster or not significantly slower.

In #105, donquixote doesn't actually benchmark the case where there are 0-1 prefixes. The benchmarks start by comparing 3 prefixes. Even then, the improvement is 1.5x vs 1.4x, not significant. As soon as you get to 9 items, preg_replace_callback is faster for the non-static cache.

Regardless of this, donquixote mentions using a static cache in his preg_replace_callback() method, which my patch does not need, as it does not precompute arrays, so comparing my patch vs donquixote's is apples vs. oranges. We need the exact patches used by donquixote and the benchmarking code to make a meaningful comparison.

In cases where there are hundreds of prefixes, preg_replace_callback() is much much faster.

I think the goal of this patch should be to drastically optimize the horrifically slow case, rather than create some complicated set of case statements just to eke out another 5 or 10ms from a page load for the majority of users that for the most part aren't complaining about it.

JacobSingh’s picture

Status: Needs review » Reviewed & tested by the community

I've been working on a special case where they have 400+ prefixes. We dropped from 256s to 60s on a full cache clear. So +1 to this patch.

lyricnz’s picture

I updated the benchmark code from #36 to include a preg_replace_callback() option as used in #126. For 100 prefixes, the various methods took:

 [100] => Array
        (
            [prefixTables] => 692.4798 // current method
            [prefixTables_single_strtr] => 101.6431
            [prefixTables_precalc_singletr] => 75.0070
            [prefixTables_multiple_str_replace] => 75.2752
            [prefixTables_single_str_replace] => 51.7621
            [prefixTables_precalc_single_str_replace] => 21.8170
            [prefixTables_culled_single_str_replace] => 67.8430
            [prefixTables_preg_callback] => 10.0579
        )

i.e. the preg_callback method is about 5 times faster than a single str_replace() and 10 times faster than a single strtr().

However the patch still uses strtr() for the single-prefix-as-string case, and str_replace() is ~20 times faster, so why wouldn't we use that? I know it's not terribly significant in overall page time, but it's a straight replacement in this case.

    [1] => Array
        (
            [prefixTables_single_strtr] => 65.0170
            [prefixTables_single_str_replace] => 3.8080
        )

FWIW, on my machine, the "crossover point", where preg_callback() becomes faster than str_replace() is about 5 prefixes (on a sample SQL with two replacements).

mgifford’s picture

Status: Reviewed & tested by the community » Closed (won't fix)

It's been RTBC for 2 years, but don't see this getting into Core for D6.

Fabianx’s picture

Status: Closed (won't fix) » Reviewed & tested by the community

I disagree, if anyone wants to take over this again for D6 maintenance, they could.

I think we should close this just when D6 is EOL.

donquixote’s picture

We need the exact patches used by donquixote and the benchmarking code to make a meaningful comparison.

Just saying sorry I did not upload the patches and did not follow up on #128.
And now I don't have this stuff available anymore (at least cannot find it atm), and it no longer applies to any active project I work on.
But #126 looks reasonable, so why not.
(But the rtbc should be renewed by someone still using D6)

Status: Reviewed & tested by the community » Closed (outdated)

Automatically closed because Drupal 6 is no longer supported. If the issue verifiably applies to later versions, please reopen with details and update the version.