When trying to understand I got to

    $routes = $this->connection->query("SELECT name, route FROM {" . $this->connection->escapeTable($this->tableName) . "} WHERE pattern_outline IN (:patterns) ORDER BY fit", array(
      ':patterns' => $ancestors,
    ))
    ->fetchAllKeyed();

    $collection = new RouteCollection();
    foreach ($routes as $name => $route) {
      $route = unserialize($route);
      if (preg_match($route->compile()->getRegex(), $path, $matches)) {
        $collection->add($name, $route);
      }
    }

this part and asked "weird. Is it possible that preg_match doesn't match? Didn't we already do the matching in SQL? Where is that regexp?". It's dynamically created -- but at the end of the day it's just a string. It must be stored in the router. It's more performant for sure and also saves anyone's sanity who wants to debug a Drupal process from the beginning to the end.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

Crell’s picture

Component: base system » routing system
Issue tags: +WSCCI

Categorizing and tagging...

If we can pre-store data in the routing table to make the routing process faster (the regex or otherwise), I'm all for it. The name, route object, and fit are the required things for functionality. Anything else we want to put in there for performance I'm very open to discussing.

Wim Leers’s picture

Priority: Normal » Critical
Issue summary: View changes
Issue tags: +Performance

Needs investigation, but it looks like this has the potential to meaningfully reduce Drupal 8's "baseline" performance.

dawehner’s picture

Just to be clear, we don't assume that every performance improvement is critical right?

In order to fix it, we would have to store the compiled route on serializer/unserialize time, which we don't do at the moment. There is though
no way to do that without hacking symfony itself, ************ private **************** ****** ***** **********. Currently working on a PR to allow that, see https://github.com/symfony/symfony/pull/12012

catch’s picture

This will need an update, or at least a forced router rebuild, so marking D8 upgrade path.

Not changing status, but yes I don't think this is critical without some numbers.

dawehner’s picture

Status: Active » Needs review
FileSize
2.45 KB

Okay, so let's try it in Drupal, but yeah I doubt that the amount of code is really relevant, we just save a bunch of RouterCompiler code.

Status: Needs review » Needs work

The last submitted patch, 5: compiled_route-1964922-4.patch, failed testing.

Wim Leers’s picture

Priority: Critical » Major
Issue tags: +needs profiling

Let's make it major then. chx described it to me like a huge performance problem. But I guess we should first investigate.

chx’s picture

> Is it possible that preg_match doesn't match? Didn't we already do the matching in SQL?

This should not be a question. The routing system innards should have been documented before commit as the menu was. But it was not because noone knows not even those who forced this pile of crap upon us.

dawehner’s picture

This should not be a question. The routing system innards should have been documented before commit as the menu was. But it was not because noone knows not even those who forced this pile of crap upon us.

Just to be clear, talking about the history should not matter in this history. This part certainly always had documentation on http://symfony.com/doc/current/components/routing/introduction.html#usage

> Is it possible that preg_match doesn't match? Didn't we already do the matching in SQL?

You can use regex in sql to validate that a certain bit is a number? Could be possible but I am not sure it is worth trying out. In general though we could optimize here by storing a flag whether the regex stores
anything more than just the ancestors, because those we already check in the SQL query. I am 100% sure that we do have routes where we use \d for example.

chx’s picture

Hrm. What I am talking about is that we match already node/{node} to node/% in SQL. Can you have node/this-must-be-a-number and node/this-is-not-a-number-but-still-variable routes both?? How would that even look in a routing yaml?

dawehner’s picture

Hrm. What I am talking about is that we match already node/{node} to node/% in SQL. Can you have node/this-must-be-a-number and node/this-is-not-a-number-but-still-variable routes both?? How would that even look in a routing yaml?

Yes, for the SQL query we use node/%. Here is an example how this would look like in the YAML file.

node.view:
  path: /node/{node}
  defaults:
    ...
  requirements:
    node: \d
    _access: 'TRUE'

node.special:
  path: /node/{node_or_user}
  defaults:
    ...
  requirements:
    _access: 'TRUE'
Crell’s picture

Daniel is correct. The regex in UrlMatcher is more specific and restrictive than the SQL query precisely because we are not doing any regex matching in the SQL lookup. There's no SQL-agnostic way of doing so that I'm aware of, so we just ported forward the query algorithm from D7.

Another example: A path of /foo/bar/{baz} with baz having a requirement of a|b. /foo/bar/c will happily pass the SQL check but not match the regex, so it would be a 404 from UrlMatcher rather than RouteProvider.

I am the one that excluded the compiled route object from the serialized route object because it kept fataling on me. Rather than figure out why I just excluded it.

Now that we have the ability to have SQL-specific backends, another alternative is to have a MySQL-specific dumper/provider that leverages http://dev.mysql.com/doc/refman/5.5/en/regexp.html, and does the regexing directly in the query. UrlMatcher would still regenerate the regex if we get that far but we'd have fewer routes, and some 404s would get handled directly by RouteProvider.

That said! The number of routes that make it all the way to UrlMatcher should be very small. In fact most will have only a single route. So we're talking about saving, in the typical case, one regex. We need some stand-alone tests of how expensive that regex is to compile and run before we know if it's even worth bothering.

dawehner’s picture

Now that we have the ability to have SQL-specific backends, another alternative is to have a MySQL-specific dumper/provider that leverages http://dev.mysql.com/doc/refman/5.5/en/regexp.html, and does the regexing directly in the query. UrlMatcher would still regenerate the regex if we get that far but we'd have fewer routes, and some 404s would get handled directly by RouteProvider.

Are you sure it is worth (in terms of performance) to make the query more complex (because it maybe runs on every single entry)
instead of calling it on a quite limited subset of the routes.

dawehner’s picture

Status: Needs work » Needs review
FileSize
2.72 KB

Let's get it at least working.

Status: Needs review » Needs work

The last submitted patch, 14: 1964922-14.patch, failed testing.

Damien Tournoud’s picture

+            'compiled_route' => serialize($compiled),

Why would you store the whole compiled route, when you need just $compiled->getRegexp()? Serializing / deserializing is not exactly cheap.

dawehner’s picture

Why would you store the whole compiled route, when you need just $compiled->getRegexp()? Serializing / deserializing is not exactly cheap.

Well, fact is that we do use the compiled route, or rather stuff like $route->compile()->getVariables() in more than one place

dawehner’s picture

---

dawehner’s picture

Status: Needs work » Needs review
FileSize
11.38 KB

Let's get a little bit more radical.

Status: Needs review » Needs work

The last submitted patch, 19: 1964922-17.patch, failed testing.

dawehner’s picture

FileSize
12.22 KB
863 bytes
chx’s picture

Status: Needs work » Needs review

Status: Needs review » Needs work

The last submitted patch, 21: 1964922-21.patch, failed testing.

dawehner’s picture

Status: Needs work » Needs review
FileSize
14.16 KB
1.94 KB

Meh

Status: Needs review » Needs work

The last submitted patch, 24: 1964922-24.patch, failed testing.

dawehner’s picture

Status: Needs work » Needs review
FileSize
14.71 KB
1.08 KB

Maybe that ...

Status: Needs review » Needs work

The last submitted patch, 26: 1964922-26.patch, failed testing.

dawehner’s picture

Status: Needs work » Needs review
FileSize
17.09 KB
4.43 KB

Things seems to get better now.

Status: Needs review » Needs work

The last submitted patch, 28: 1964922-28.patch, failed testing.

Crell’s picture

https://github.com/symfony/symfony/pull/12329

At our request, Symfony just merged in code to allow the compiled route to be serialized along with the route object. Since we are serializing the route, that means after we next upgrade our Symfony version we should get a saved regex (and the rest of the compiled route) for free with no additional work. At most we would need to make sure the route is compiled before we save it so that we're sure the serialized version has the compiled object as well.

Shall we just wait for that and close this issue as redundant?

dawehner’s picture

I think it would be still be nice to know how much we gain by doing that.

dawehner’s picture

Status: Needs work » Fixed

Given that the symfony PR went in and we update to symfony 2.6, this issue is done. Nice!

chx’s picture

Does it mean we will automatically use the compiled regex and don't need to recompile it again?

dawehner’s picture

FileSize
82.67 KB

Does it mean we will automatically use the compiled regex and don't need to recompile it again?

Exactly, we compile the regex already on route dump time, so we don't have compile again on recieve time,
yes we still have the compile() call in there, but there is no other way to get to the compiled route atm. Manually checked to ensure,
that we don't compile again, see screenshot.

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.