This is essentially a sub-task of #1888424: Make Drupal's URL generation logic available to HttpKernel, and minimize code repetition/divergence, which we decided should be broken up into 2 issues - one to deal with incoming path processing (this one) and one to deal with the outgoing (url generation) side of things. The original issue will be kept for the outgoing side and will build on the work done here.

The goal here, then, is to turn the process for resolving the incoming path to a system path into something that can be mirrored on the outgoing side. The current set-up uses a series of request listeners to perform the necessary futzing:

  • one for urldecoding the path
  • one that resolves an empty path to the system path for the front page
  • one for stripping off the language prefix if there is one
  • one for converting an alias to a system path

This patch replaces the series of request listeners with a single listener that uses a PathProcessorManager to resolve the system path. Individual processors are registered to the manager with a priority indicating where in the sequence they should come, and the manager goes through them in sequence to return a fully processed path.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

Crell’s picture

+++ b/core/lib/Drupal/Core/EventSubscriber/PathSubscriber.php
@@ -20,24 +21,24 @@
+    $path = trim($request->getPathInfo(), '/');

We already went over this patch extensively in the parent issue, so there's not much to say. It all looks great. :-) The line above is really my only remaining question mark.

All of the Symfony logic assumes a path that has a / prefix. Drupal has always assumed a path without prefix. That means currently we have a number of these transition points around the code where we have to translate from path prefixed to not path prefixed.

At what point do we rip those out and just use a / prefix everywhere, and avoid this fun potential bug? I don't know if this is a good time to do that, but it's worth asking if we should just tell path processors to assume a /-prefixed path, rather than an unprefixed path as Drupal has done in the past.

Saving that for a later system-wide issue is fine, but this seemed a reasonable place to bring it up.

Other than that question this should be RTBC.

sun’s picture

+ * Contains Drupal\Core\PathProcessor\IncomingPathProcessorInterface.

+ * Contains Drupal\Core\PathProcessor\PathProcessorDecode.

+ * Contains Drupal\Core\PathProcessor\PathProcessorFront.

+ * Contains Drupal\Core\PathProcessor\PathProcessorManager.

Could we move these into a Drupal\Core\HttpKernel namespace? (with or without additional PathProcessor subnamespace)

I'm not sure I can make sense of an independent and own PathProcessor component — the facility always seems to be bound to the Request/Subrequest/Response processing stack.

+++ b/core/lib/Drupal/Core/PathProcessor/IncomingPathProcessorInterface.php

+ * Defines an interface for classes that manipulate the incoming path.
...
+interface IncomingPathProcessorInterface {
...
+   * Process the incoming path.
...
+  public function processIncoming($path, Request $request);

Can we rename "incoming" to "inbound"?

Inbound + outbound are slightly clearer and more direct opposite terms than Incoming and Outgoing (or whatever term you had in mind).

+++ b/core/lib/Drupal/Core/PathProcessor/PathProcessorDecode.php

+class PathProcessorDecode implements IncomingPathProcessorInterface {

Thinking ahead of time, would it make sense to put all the inbound path processors into a separate ./Inbound subdirectory?

So that we can possibly complement that later on with an ./Outbound subdirectory?

+ * Contains Drupal\language\PathProcessorLanguage.

Can we move this into a \HttpKernel subnamespace; i.e.: Drupal\language\HttpKernel\PathProcessorLanguage?

Crell’s picture

I don't believe we have a Drupal\Core\HttpKernel namespace yet, do we? I'm not against such a thing in concept, but I don't think this is the place to introduce it. If we were to add such a thing, there are other classes that might migrate there, too. I'd prefer to punt on that.

The rest of the comments from #2 seem reasonable.

katbailey’s picture

FileSize
14.37 KB
20.46 KB

Thinking ahead of time, would it make sense to put all the inbound path processors into a separate ./Inbound subdirectory?

It's not inconceivable that we might have classes that implement both the inbound and the outbound interface - I'd rather wait until we've figured out the outbound side and then we can see if this makes sense.

The only changes I've made here are 1) use "incoming" instead of "inbound", and 2) move the language path processor into a HttpKernel subnamespace.

katbailey’s picture

Oh, and about dealing with the leading "/" on paths, my inclination is to say let's leave that for a follow-up issue - but that may very well just be my laziness talking, so if anyone feels strongly that it should be done here I'll add in that change.

Crell’s picture

Status: Needs review » Needs work
+++ b/core/lib/Drupal/Core/PathProcessor/PathProcessorManager.php
@@ -0,0 +1,96 @@
+  /**
+   * Holds the array of processors, sorted by priority.
+   *
+   * @var array
+   *   An array of path processor objects.
+   */
+  protected $sortedIncoming = array();
+

This should get renamed to inbound, too, as should incomingProcessors.

+++ b/core/lib/Drupal/Core/PathProcessor/PathProcessorManager.php
@@ -0,0 +1,96 @@
+  protected function getIncoming() {

Ibid.

Other than that, this is good to go. I'm fine with punting on the / for now and making that a dedicated issue.

katbailey’s picture

Status: Needs work » Needs review
FileSize
20.44 KB

OK, this time I really do promise there are no more instances of "incoming" in the patch. Or "manipulate" for that matter.

Crell’s picture

Status: Needs review » Reviewed & tested by the community

Let's do. :-)

sun’s picture

+++ b/core/lib/Drupal/Core/CoreBundle.php
@@ -367,4 +371,27 @@ protected function registerTwig(ContainerBuilder $container) {
+      ->addTag('inbound_path_processor', array('priority' => 100));

s/inbound_path_processor/path_processor_inbound/g

Minor, so we can also do it in a follow-up patch/issue.

What concerns me with this is the resulting negative DX effect of "random string identifier that apparently denotes a service" + "random string identifier that apparently denotes a tag that is related to a service with a similarly random string identifier".

Only the original author understands that kind of API design.

+++ b/core/lib/Drupal/Core/PathProcessor/PathProcessorManager.php
@@ -0,0 +1,96 @@
+    if (empty($this->inboundProcessors[$priority])) {
+      $this->inboundProcessors[$priority] = array();
+    }

Unless I'm terribly mistaken, this sub-array initialization isn't needed in PHP5 anymore.

+++ b/core/lib/Drupal/Core/PathProcessor/PathProcessorManager.php
@@ -0,0 +1,96 @@
+  /**
+   * Sorts the processors according to priority.
+   *
+   * @param string $type
+   *   The processor type to sort, e.g. 'inboundProcessors'.
+   */
+  protected function sortProcessors($type) {
+    $sorted = array();
+    krsort($this->{$type});
+
+    foreach ($this->{$type} as $processors) {
+      $sorted = array_merge($sorted, $processors);
+    }
+    return $sorted;
+  }

That looks funky.

It actually exceeds the funky-cool level to an extent at which we should have dedicated a test for the expected result of this sorting algorithm. :)

I briefly looked around, but can't see why the foreach() even exists in the first place? $sorted is always empty?

+++ b/core/modules/language/lib/Drupal/language/LanguageBundle.php
@@ -0,0 +1,29 @@
+    // Register the language-based path processor.
+    $container->register('path_processor_language', 'Drupal\language\HttpKernel\PathProcessorLanguage')
...
+      ->addTag('inbound_path_processor', array('priority' => 250));

I already wondered about the priorities in registerPathProcessors() at first sight...

Can we move the decode processor to 1,000, and move this language futzer to 300? (replacing the current 300 of the decode processor)

katbailey’s picture

Status: Reviewed & tested by the community » Needs review
FileSize
14.17 KB
29.76 KB

Addressed sun's points, with the exception of:

I briefly looked around, but can't see why the foreach() even exists in the first place? $sorted is always empty?

I'm not sure what you mean. $sorted starts out empty but gets the processors merged into it on each iteration through the various priorities.

I've added a DrupalUnitTest to test the processor manager using incorrect and correct orderings of the processors.

sun’s picture

Status: Needs review » Reviewed & tested by the community

I still do not understand why the foreach() and array_merge() is needed. $sorted always starts out as an empty array, so the elements of $this->inboundProcessors are repetitively added to $sorted, since there is nothing else to merge. The entire effect of that sort method is the krsort() only.

Anyway...

sun’s picture

Oh. I'm sorry. You're not only sorting, but also flattening the second dimension into the first. Gotcha.

+    $this->inboundProcessors[$priority][] = $processor;
katbailey’s picture

#10: 1910318.path_futzers.10.patch queued for re-testing.

Crell’s picture

Priority: Normal » Major

Bumping priority since this is blocking the generator/outbound merging issue.

catch’s picture

Status: Reviewed & tested by the community » Fixed

Can't find anything to complain about, committed/pushed to 8.x.

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

mbaynton’s picture

Issue summary: View changes

This comment pertains to a @todo in PathProcessorDecode (which it looks like was inherited from Drupal\Core\EventSubscriber\PathSubscriber), but given that this issue is the origin of the @todo's current home, I think this issue is the best venue to begin a discussion:

@todo Revisit whether or not this logic is appropriate for here or if controllers should be required to implement this logic themselves. If we decide to keep this code, remove this TODO.

The logic being a urldecode() of the path. It should be kept, because controllers aren't the only thing benefiting from this path processor - later path processors do too. This is the reason URL aliases built out of the broader unicode charset work, not an uncommon thing since IRIs are received as RFC3986-encoded strings -- any internationalized URI is in need of this urldecode. Sure, you could go around remembering to decode it in any other current/future path processors as well, but in general I think the above illustrates how having this here will mostly just keep things working as expected for anyone dealing with paths. Thinking to urldecode paths in order to be internationalized is an unneeded gotcha.

If that makes sense to others, perhaps a new issue just to remove the comment?