PHP offers two different function groups for regular expressions. The ereg functions expect a POSIX-style pattern syntax, while the preg functions use a Perl-compatible syntax.
Drupal mostly uses the latter already. In addition, php.net states that the preg functions are faster, and it seems that ereg may be removed entirely in PHP 6. However, six places remain where core uses ereg. I don't see any problem with rewriting these expressions for preg instead, which would standardize core on a single form and may (negligibly) improve performance.
If there is a problem or a disadvantage with using preg in these places, it should probably be documented inline (it currently isn't).
Here's a grep.
$ grep ereg -R includes/ modules/
includes/file.inc: $regex = '/\.(' . ereg_replace(' +', '|', preg_quote($extensions)) . ')$/i';
includes/file.inc: elseif ($depth >= $min_depth && ereg($mask, $file)) {
includes/unicode.inc: if (!$bom && ereg('^<\?xml[^>]+encoding="([^"]+)"', $data, $match)) {
includes/unicode.inc: $data = ereg_replace('^(<\?xml[^>]+encoding)="([^"]+)"', '\\1="utf-8"', $out);
modules/blogapi/blogapi.module: if (eregi('<title>([^<]*)</title>', $contents, $title)) {
modules/blogapi/blogapi.module: $contents = ereg_replace('<title>[^<]*</title>', '', $contents);
Comments
Comment #1
damien tournoud commented
Ok, I'm responsible for one of them (the ereg_replace() in drupal_xml_parser_create()), it was a long time ago, while I was young and naive (oh, wait a minute, I'm still at least one of the two, goooood!).
Those can be changed without side effects (it already started in user_validate_name(), which got into D7 some weeks ago), no doubt about it. But don't forget split() and spliti(), which are hidden forms of ereg.
Comment #2
cburschka
Thanks, I forgot the functions that use POSIX but don't contain ereg in their names. spliti() is never used, but split() is.
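A minimal sketch of what replacing split() could look like. The input strings are made up for illustration and are not taken from core; the POSIX form is shown only in a comment.

```php
<?php
// Hypothetical input, not taken from core.
$extensions = 'jpg  png gif';

// POSIX form: split(' +', $extensions)
// PCRE form: the same pattern, wrapped in delimiters.
$parts = preg_split('/ +/', $extensions);

// When the separator is a literal string, explode() needs no
// regular expression at all.
$fields = explode(',', 'a,b,c');
```

Where the separator is a fixed string, explode() is probably the simpler choice; preg_split() is only needed when the separator really is a pattern.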
I'm out of time for rolling a patch tonight, but it's trivial really. For the most part, the only required changes to the patterns are adding delimiters (//) and escaping any slashes inside them.
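To illustrate the delimiter and escaping change, here is a rough sketch based on the patterns from the grep output. The sample input strings and the variable names $encoding, $converted and $title_text are assumptions made for this example; the POSIX calls appear only in comments.

```php
<?php
// Hypothetical sample data.
$data = '<?xml version="1.0" encoding="iso-8859-1"?><rss/>';

// POSIX: ereg('^<\?xml[^>]+encoding="([^"]+)"', $data, $match)
// PCRE: the same pattern wrapped in / delimiters.
if (preg_match('/^<\?xml[^>]+encoding="([^"]+)"/', $data, $match)) {
  $encoding = $match[1];
}

// POSIX: ereg_replace('^(<\?xml[^>]+encoding)="([^"]+)"', '\\1="utf-8"', $data)
// PCRE: again only the delimiters are new; the replacement is unchanged.
$converted = preg_replace('/^(<\?xml[^>]+encoding)="([^"]+)"/', '\\1="utf-8"', $data);

// Hypothetical sample data.
$contents = '<TITLE>Hello</TITLE><p>Body</p>';

// POSIX: eregi('<title>([^<]*)</title>', $contents, $title)
// PCRE: the slash in </title> is escaped, and the i modifier takes
// the place of the case-insensitive eregi().
if (preg_match('/<title>([^<]*)<\/title>/i', $contents, $title)) {
  $title_text = $title[1];
}
```

Choosing a delimiter that does not occur in the pattern (for example @) would avoid escaping the slash; the sketch keeps / only because that is the delimiter mentioned above.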
Comment #3
Anonymous (not verified) commented
Subscribing
Comment #4
cburschka
Note: file_scan_directory exposes its regular expression handling to contrib, so changing the function is an API change. So two things:
1.) That part cannot be ported to 6.x, unless we can somehow convert patterns from posix to perl on the fly (which is probably not worth it).
2.) Core uses file_scan_directory in a lot of places, making that part non-trivial (and I'm still short on time). This patch fixes only the 8 trivial lines.
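For reference, the on-the-fly conversion mentioned in 1.) could be sketched roughly as below. The function example_posix_to_preg() is hypothetical and not part of Drupal. It only wraps the mask in delimiters and escapes bare slashes; it makes no attempt to reconcile the semantic differences between the POSIX and Perl-compatible dialects, which is part of why such a conversion is probably not worth it.

```php
<?php
/**
 * Hypothetical helper, not part of Drupal: wraps a POSIX-style mask in
 * PCRE delimiters so that it can be passed to preg_match().
 */
function example_posix_to_preg($mask, $case_insensitive = FALSE) {
  // Walk the mask token by token: a backslash plus the character after
  // it is kept as-is, while a bare slash is escaped so that it cannot
  // terminate the pattern early.
  $escaped = preg_replace_callback('/\\\\.|\//s', function ($m) {
    return $m[0] === '/' ? '\/' : $m[0];
  }, $mask);
  return '/' . $escaped . '/' . ($case_insensitive ? 'i' : '');
}

// Usage with made-up masks:
$module_mask = example_posix_to_preg('\.(module|inc)$');
$path_mask = example_posix_to_preg('^sites/all');
$is_module = preg_match($module_mask, 'node.module');
$is_site_path = preg_match($path_mask, 'sites/all/modules');
```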
Comment #5
catch
This is a duplicate of http://drupal.org/node/64967.
I'll bump the other issue.