Problem/Motivation

(Temporarily blocked on #3388563: Option to keep language prefix in URLs)

? TBD.

Something is messed up with internal URL handling (language prefixes, base://, ...)

Steps to reproduce

?

Proposed resolution

TBD

Remaining tasks

  1. Clarify the expected behavior and what the bug is.
  2. Update test coverage accordingly.
  3. Make sure the fix causes all the tests to pass.
  4. Reviews / refinements.
  5. RTBC.
  6. Commit.

User interface changes

API changes

Data model changes

Release notes snippet

Original issue report by @Berdir

The config schema and test is currently a mess, there seem to be multiple different ways of configurating the global and local settings.

This cleans it up a bit. There are still lots of test fails, but at least the test is now running through. I also fixed the incorrect getEditableConfigNames(), changed the test @group to pathologic (the standard is to have a group per project. This is also required for my d8 module status overview at http://d8ms.worldempire.ch/.

Replacing base:// with user-path: as I think this can be considered as user input, base: is still available too.

CommentFileSizeAuthor
#86 pathologic-language-prefix-2418369-86.diff31.21 KBtcrawford
#84 pathologic-language-prefix-2418369-84.diff29.08 KBtcrawford
#81 pathologic-language-prefix-2418369.diff11.64 KBayalon
#76 pathologic_2418369-76.patch3.98 KBmark.labrecque
#74 pathologic_2418369-74.patch4.9 KBmark.labrecque
#72 2418369-72.patch4.47 KBn.ghunaim
#62 2418369-62.patch16.55 KBdww
#61 2418369-61.patch16.57 KBdww
#61 2418369-61.test-only.patch11.38 KBdww
#59 2418369-59.patch4.88 KBn.ghunaim
#58 2418369-58.patch4.88 KBtcrawford
#53 internal_url_handling_2418369-70.patch4.52 KBbletch
#52 2418369-52.patch5.21 KBrandalv
#51 internal-url-handling-2418369-51.patch4.47 KBnickdjm
#50 interdiff-48-50.txt3.97 KBspadxiii
#50 2418369-50.patch4.47 KBspadxiii
#48 2418369-48.patch5.03 KBrandalv
#47 interdiff-47.txt891 byteschristophdemon
#47 pathologic-2418369-47.patch5.4 KBchristophdemon
#46 interdiff-46.txt2.06 KBpieter-e1
#46 pathologic-2418369-46.patch5.26 KBpieter-e1
#41 pathologic-2418369-41.patch4.67 KBrandalv
#37 internal_url_handling-2418369-37.patch3.69 KBspadxiii
#36 pathologic_language_prefixes_URL-2418369-36.patch591 bytessteveoriol
#33 2418369.15_33.rawdiff.txt1.68 KBdww
#33 2418369-33.patch3.31 KBdww
#1 pathologic-schema-fixes-2418369-1.patch5.9 KBberdir
#7 2418369-7.patch562 bytesleksat
#10 2418369-10.patch742 bytesjhodgdon
#11 2418369-11.patch800 bytesmaijs
#13 interdiff.txt1.87 KBderhasi
#13 config_schema_base-2418369-13.patch1.62 KBderhasi
#15 internal_url_handling-2418369-15.patch3.05 KBderhasi
#15 interdiff.txt1.65 KBderhasi
#21 internal_url_handling-2418369-20.patch3.36 KBollaankoodeis
#22 internal_url_handling-2418369-23.patch3.88 KBollaankoodeis
#25 internal_url_handling-2418369-25.patch3.5 KBguedressel
#25 interdiff_15-23.txt996 bytesguedressel
#27 internal_url_handling-2418369-27.patch3.92 KBdksdev01
#29 internal_url_handling-2418369-29.patch3.85 KBspadxiii
#29 interdiff-25-29.txt2.79 KBspadxiii

Issue fork pathologic-2418369

Command icon Show commands

Start within a Git clone of the project using the version control instructions.

Or, if you do not have SSH keys set up on git.drupalcode.org:

Comments

berdir’s picture

StatusFileSize
new5.9 KB

Status: Needs review » Needs work

The last submitted patch, 1: pathologic-schema-fixes-2418369-1.patch, failed testing.

Garrett Albright’s picture

Status: Needs work » Fixed

I've pushed two commits last night which should fix the config stuff in frankly a bit cleaner way than that patch. It kind of seems like I got distracted while working on that stuff somehow - I didn't realize how unfinished a lot of it was…

Test running completes now, all green except for good ol' problematic "files:" compatibility. I'll have to give that another look when I can (so maybe in another month and a half…).

Thanks.

Garrett Albright’s picture

Status: Fixed » Needs work

No, re-opeing because I need to look into that "user-path" thing you did - I'm not sure what that is. Is that a new D8 thing which will let me avoid the base:// hack?

berdir’s picture

Yes and no. They're not really a hack, not anymore. Both accept a bath, but it has a slightly different meaning:

base: points to something that does not exist as a route, like public files/assets. We don't bother with trying to match to a route there.

user-path: is used for a path that a user entered/could have entered, so it might point to a route that exists, and the link generator will try to find a matching route and do access checks.

So I think user-path: makes more sense for pathologic.

Also, my patch still contains a lot of fixes that you did not include. I don't really see how the tests can be passing for you, with the config schema that doesn't match, the settings form should be broken because of the incorrect getEditableConfigNames(), are you using an old core version?

The @group change would also be nice imho, so it runs the tests on http://d8ms.worldempire.ch/.

leksat’s picture

I should add that using the base:// hack, the language prefix is not added to the URL even if the 'language" option is set. At least on D8 beta9.

I've fixed this with:

   // Now to build the URL. Drumroll, please…
-  $url = Url::fromUri('base://' . $url_params['path'], $url_params['options'])->toString();
+  $url = Url::fromUri('internal:/' . $url_params['path'], $url_params['options'])->toString();
leksat’s picture

Status: Needs work » Needs review
StatusFileSize
new562 bytes

Converted #6 into a patch. Can we commit at least this, so that we don't have the language path prefix issue?

acbramley’s picture

Status: Needs review » Reviewed & tested by the community

Patch in #7 fixes the issue where node urls were not replaced with the aliased url.

jhodgdon’s picture

I had the same issue as #2692641: Convert href to the aliased URL if possible and was referred here. I agree that the patch in #7 fixes that issue. I haven't tested the language stuff yet.

jhodgdon’s picture

Status: Reviewed & tested by the community » Needs review
StatusFileSize
new742 bytes

Hm. Further testing... With internal: in there instead of base:, some file URLs fail.

I was trying to use it for image style URLs. For example, I had in my node body:

<img src="files:styles/mini_thumb/public/page_images/createtype.jpg" />

With the patch in #7, the output URL is:
(mysiteurl)/sites/default/files/styles/mini_thumb/public [Note that this is TRUNCATED!]
Without the patch in #7, the output URL is correct:
(mysiteurl)/sites/default/files/styles/mini_thumb/public/page_images/contentlist.jpg [full correct URL here]

So, the patch in #7, while it solves the path alias problem, breaks this type of file URL. I really have no idea why.

As a note, an A tag pointing to the unstyled image works fine for both:

<a href="files:page_images/createtype.jpg">

So... I got it to work with this code:

  // Now to build the URL. Drumroll, please…
  if ($parts['path'] == '<front>') {
    $url = Url::fromRoute('<front>', [], $url_params['options'])->toString();
  }
  elseif ($cached_settings['is_file']) {
    $url = Url::fromUri('base:/' . $url_params['path'], $url_params['options'])->toString();
  }
  else {
    $url = Url::fromUri('internal:/' . $url_params['path'], $url_params['options'])->toString();
  }

Here's a patch.

maijs’s picture

StatusFileSize
new800 bytes

Re-roll of the patch in #10.

Status: Needs review » Needs work

The last submitted patch, 11: 2418369-11.patch, failed testing. View results

derhasi’s picture

Title: Config schema, base:// and test cleanup » Internal URL handling (Aliases, language prefixes, base://, ...)
Status: Needs work » Needs review
Issue tags: +Needs tests
StatusFileSize
new1.87 KB
new1.62 KB

Building upon the findings of @jhodgdon I rerolled the patch with some additional changes (see interdiff.txt)
* Replaced the use of $cached_settings['is_file'] with $is_file, as I see no need of using the static settings variable for storing the information for the specific url - especially as it is allways explicitely set in line 195 or 198.
* Moved `base:/' ... call in the try catch part, so rare exceptions like in #2602312 cannot occur for file names.

Additionally, I adjusted the issue title to reflect the current state of the issue.

I think we would need some tests here, to test the expected behavior in the case of aliases being used or multilingual content. Right now I do not know how to write the specific test.

My expectation would be for a node (nid:1) having an english alias english-article and a german alias deutscher-artikel, with German using the de prefix:

URL in text input Current language URL Output
node/1 EN /english-article
node/1 DE /de/deutscher-artikel
de/node/1 DE / EN /de/deutscher-artikel
english-article DE / EN /english-article
de/deutscher-artikel DE / EN /de/deutscher-artikel
en/english-article DE / EN ?
deutscher-artikel DE / EN ?

I am unsure about the last two cases, I guess those would and should end up in a 404?

Can anyone give me hint on how to best test that? i guess I would not to set up node creation, path aliases and languages - but never did that in a test so far.

Status: Needs review » Needs work

The last submitted patch, 13: config_schema_base-2418369-13.patch, failed testing. View results
- codesniffer_fixes.patch Interdiff of automated coding standards fixes only.

derhasi’s picture

Status: Needs work » Needs review
StatusFileSize
new3.05 KB
new1.65 KB

I found another use case, that needed to be fixed: in my case we have a german Artikle (de/deutscher-artikel) but this one has no translation in English. As the language prefix was removed from the path, the internal path for that artikel could not be found and the path would be seen as a non routed path to /deutscher-artikel instead of a routed path to node/1.

When we keep the language prefix to the path when passing to Uri:fromUri(), the paths are recognized correctly, as Drupal knows about the language prefixes and can consider those to find the correct internal path. Additionally if some path would not be a routed path but would need the language prefix, like a redirect of Redirect module, this will keep the non-routed path intact (like /en/my-custom-redirect).

Attached is the patch and interdiff for this change. I still have not updated the tests, sorry :(

Status: Needs review » Needs work

The last submitted patch, 15: internal_url_handling-2418369-15.patch, failed testing. View results
- codesniffer_fixes.patch Interdiff of automated coding standards fixes only.

vurt’s picture

#15 fixed a bug with language prefixes for me. Thank you!

idflood’s picture

I confirm that the patch in #15 also fixed an issue I had with missing language prefix in link. Thanks!

I also quickly checked the code style and could not spot any error.

drcolossos’s picture

Status: Needs work » Reviewed & tested by the community

Also tested #15 and it works as expected.

Status: Reviewed & tested by the community » Needs work

The last submitted patch, 15: internal_url_handling-2418369-15.patch, failed testing. View results
- codesniffer_fixes.patch Interdiff of automated coding standards fixes only.

ollaankoodeis’s picture

StatusFileSize
new3.36 KB

We have had problems with image styles of new embedded images. New embedded images (added to ckeditor) were not identified as files. The style urls were not including style name folder and the image file. Created a new patch of this internal url handling without replacing any of previous changes.

ollaankoodeis’s picture

StatusFileSize
new3.88 KB

Reworked last patch to check if variable index exists.

ollaankoodeis’s picture

moshe weitzman’s picture

We tried the latest patch here and like #10, had issues with image style urls. I did not check if #2718473: Dynamic routes in image styles break under pathologic fixes it.

guedressel’s picture

StatusFileSize
new3.5 KB
new996 bytes

This patch integrates my solution from #2718473-19: Dynamic routes in image styles break under pathologic.
My solution differs from @KJankko in #23 as it does not rely on the path name ('[...]/files/styles/[...]') but resolves the path through the router. My approach seems more robust to me as it overcomes any custom/different/unexpected path configurations.

guedressel’s picture

dksdev01’s picture

StatusFileSize
new3.92 KB

Testing 23 with little variation, default lang code should be site's default instead of the current page.

guedressel’s picture

If the language option is omitted (as in the original code) it would be set to the requests current interface language in Url::fromRoute() - what IMHO is fine and follows language negotiation configuration of the current site.
@dksdev01 why would you explicitly set it to the sites default language?

spadxiii’s picture

StatusFileSize
new3.85 KB
new2.79 KB

Implemented the linked patch in #25, but updated it for use with flysystem (which uses 2 different routes for files)
Also reverted setting the default language initially, as added in #27.

dww’s picture

Category: Task » Bug report

Marked #3126584: Multilinguism : Pathologic remove lang code in url duplicate. Since the current code does very wrong things with language prefixes, I'm upgrading this to a bug. If all goes well, I should have time today to give this issue some lovin' and try to get it ready to commit.

dww’s picture

Okay, finally had a chance to read the whole issue history. Eeek. Sorry no maintainer has replied in 7 years. 😅 This issue seems to have gone off the rails a bit, and is currently including all kinds of semi-related fixes, code from other issues, etc. It's now hard to untangle what belongs here vs. #2718473: Dynamic routes in image styles break under pathologic. I just committed #2692641: Convert href to the aliased URL if possible since it was small and sweet (although I probably should have required tests for it). I also committed #3027696: hook_pathologic_alter external option doesn't work which is sort of related (at least touching very similar code).

I'm not exactly sure how to proceed. On the surface, it seems this issue should only be about making sure the language prefixes are right. I think I just fixed the alias problem. We've got #2718473 for making sure image styles work. The current patch there only supports image styles, although the title suggests other dynamic paths could work, too. I'd love to have a more general solution there that doesn't require hard-coding support for flysystem, too.

I think I need to go through the patch very carefully and try to pull out only the essential elements that aren't also trying to fix image styles, etc. Like, go back to #15, re-roll that for the current 8.x-1.x branch, add some tests, and commit it. Then we can correctly sort out everything else at #2718473.

Sound okay?

Thanks/sorry,
-Derek

dww’s picture

Title: Internal URL handling (Aliases, language prefixes, base://, ...) » Internal URL handling (language prefixes, base://, ...)
dww’s picture

StatusFileSize
new3.31 KB
new1.68 KB

Okay, here's #15 re-rolled and merged with the latest 8.x-1.x branch, after my recent flurry of commits.
Interdiff is confused, so here's a raw diff of the two patch files. Yes, all this could be written with lots of ternary operators and be more concise:

    // If we've been told this is already an external URL, leave it alone.
    if (!empty($url_params['options']['external'])) {
      $scheme = '';
    }
    // If it's a file, use 'base:' so that the path is not re-written.
    elseif ($is_file) {
      $scheme = 'base:/';
    }
    // For everything we did not recognize as external or files, we use the
    // internal scheme so aliases and language prefixes are set correctly.
    else {
      $scheme = 'internal:/';
    }
    $path = $scheme . $url_params['path'];

However, given all the bugs and trouble with this stuff and how complicated it all is to get right, I'd rather be hyper explicit and leave room for lots of good inline comments, instead of trying to be thusly fancy:

    $scheme = empty($url_params['options']['external']) ? ($is_file ? 'base:/' : 'internal:/') : '';

or something similar.

Meanwhile, PathologicTest now fails locally for me:

There was 1 failure:

1) Drupal\Tests\pathologic\Functional\PathologicTest::testPathologic
Simple paths. Clean URLs: Yes; protocol style: full.
Failed asserting that two strings are equal.
--- Expected
+++ Actual
@@ @@
-'<a href="http://localhost:8866/drupal-9_4/foo"><img src="http://localhost:8866/drupal-9_4/foo/bar" /></a>'
+'<a href="foo"><img src="foo/bar" /></a>'

The expected and actual have got to be reversed in the test code. Yup:

      $this->assertEqual(
        check_markup('<a href="foo"><img src="foo/bar" /></a>', $format_id),
        '<a href="' . $paths['foo'] . '"><img src="' . $paths['foo/bar'] . '" /></a>',
        t('Simple paths. Clean URLs: @clean; protocol style: @ps.', $t10ns)
      );

Honestly, this portion of PathologicTest.php sure is hard to follow. 😢 I'm not sure why we're not expecting absolute URL in this case. The test code is rather complex with 2 levels of looping, various arrays where we compute what we think we expect, etc. I need to get more up to speed with how this test works currently, or do #3157085: Convert tests/src/Functional/PathologicTest.php to a Kernel test first and re-write all this to use a @dataProvider. 😉

Anyway, uploading what I've got so far to get feedback on the general strategy for fixing this...

dww’s picture

Hah, somewhere along the way of DrupalCI, the HTML in the output is being stripped out:

https://www.drupal.org/pift-ci-job/2332237

There was 1 failure:

1) Drupal\Tests\pathologic\Functional\PathologicTest::testPathologic
Simple paths. Clean URLs: Yes; protocol style: full.
Failed asserting that two strings are equal.
--- Expected
+++ Actual
@@ @@
-''
+''

That makes it a little annoying to debug. But all the other context is the same ("Simple paths. Clean URLs: Yes; protocol style: full.", and the failure is on line 104), so that's what's happening.

I have no idea why that's what the test expects, and if that's a bug in the test or a bug in the code. But I'm also running out of time for today, so maybe I'll see if @Berdir has any sagely advice before I go much further.

Thanks/sorry,
-Derek

dww’s picture

Assigned: dww » Unassigned

Looking a little more closely, PathologicTest.php includes this procedural helper function:

function _pathologic_content_url($path, $options) {
  // If we should pretend this is a path to a file, make url() behave like clean
  // URLs are enabled.
  // @see _pathologic_replace()
  // @see http://drupal.org/node/1672430
  if (!empty($options['is_file'])) {
    $options['script_path'] = '';
  }

  if (parse_url($path, PHP_URL_SCHEME) === NULL) {
    if ($path == '<front>') {
      return Html::escape(Url::fromRoute('<front>', [], $options)->toString());
    }
    $path = 'base://' . $path;
  }
  return Html::escape(Url::fromUri(htmlspecialchars_decode($path), $options)->toString());
}

So a lot of this test seems pretty bogus. There's so much fancy logic in the test, I have no idea what it's really proving. Any bug fixes in the module's own URL handling need to be replicated in this helper? Huh? I'm really lost with this test...

steveoriol’s picture

Hello, I am far from understanding everything in this great module, but for the moment the only method I have found to make the urls work on for "images style" in the other languages of the site, is to add in "web/ modules/contrib/pathologic/pathologic.module" line 308 just before "// Search for prefix within added languages." in the table $parts['language_obj'] the default language of the site.

[...]
    $prefix = array_shift($path_args);

    //Set default language:
    $parts['language_obj'] = \Drupal::service('language.default')->get();

    // Search for prefix within added languages.
    foreach ($languages as $language) {
[...]

I don't know if this is good, but in any case it works for me after clearing the cache...

spadxiii’s picture

StatusFileSize
new3.69 KB

A quick re-roll of #29

ayalon’s picture

The latest patch breaks all inline media links with double language prefix and all absolute links that already have a language prefix (copied by the user). I cannot recommend it therefore.

jeslin.shaji’s picture

What is the solution for #38?

thomaswalther’s picture

I inserted an image with the image-insert button from the ckeditor on a multilanguage page.

The result I see in the code is:
<p><img alt="test" data-align="center" data-entity-type="" data-entity-uuid="" height="100" src="/sites/default/files/content/myimage.png" width="100" /></p>

In the backend textarea the image is correct, only frontend page has wrong path.

If I go on the "german" (main lang) site on /de the url of the img also have the /de prefix. On /en the image got the url with /en before.
The real entered absolute URL "/sites/default/files/content/myimage.png" should work for the core image (before having media).

randalv’s picture

StatusFileSize
new4.67 KB

@thomaswalther I've attached a little patch that refactors some of the previous patches.
I'm not entirely content with it though, I'm using regexes and trims to find out whether this is a file and to clean up the file path for internal processing, but for now it seems to work until we have a more stable solution.

daniel korte’s picture

I am using #41 with 8.x-1.0-alpha3 on a non-multilingual site and I am seeing image srcset values double-encoded causing images to 404. I have a custom patch to remove srcset processing as a quick fix.

guedressel’s picture

Using @dww's patch from #33 the language prefix get's added on image_style URLs.
I don't understand all the internals of the language prefix mechanic, but I assume that disabling PathProcessor->processOutbound() (by setting the url option "path_process" to false) prevents any further manipulation of file URLs.

[...]
    // If it's a file, use 'base:' and disable path processing so that the path is not re-written.
    elseif ($is_file) {
      $scheme = 'base:/';
      $url_params['options']['path_processing'] = FALSE;
    }
[...]

Edit: actually I'm not sure if this behavior of adding language prefixes is "new", since I'm experiencing this on a project we just migrated to Drupal 9.4. Cloud be that the patch from #33 worked well in Drupal 9.3 - you may consider this in your review.

szeidler’s picture

Edit: actually I'm not sure if this behavior of adding language prefixes is "new", since I'm experiencing this on a project we just migrated to Drupal 9.4. Cloud be that the patch from #33 worked well in Drupal 9.3 - you may consider this in your review.

I'm experiencing the same. The behavior changed between Drupal 9.3 and 9.4

guedressel’s picture

Regarding my finding from #43:

We just ran in another occurance of the "double language prefix" (../de/de/..) after upgrading a project to Drupal 9.4.
This time it's not on a image style link but a link to a node within a processed text.

In _pathologic_replace the $original_url is a clean absolute path (incl. language prefix) to the referenced node, which actually doesn't need any further processing.
The code block to detect $specific_language picks this language prefix cleanly up and the $url_params['language'] gets set accordingly.
But finally a second language prefix gets set in
$url = Url::fromUri($path, $url_params['options'])->toString();

Again: If the $url_params['path_processing'] would have been set to false no additional language prefix would be added.

I'm not certain if disabling path_processing could interfere with other (plugable / extandable) functionality.
Hence I propose to remove the original path prefix in the code block to detect $specific_language and let it be added again in the Url::fromUri() invocation.

This differs from my proposal for image style links (or general file links) since those links really should not be touched once they are set.

pieter-e1’s picture

StatusFileSize
new5.26 KB
new2.06 KB

Reroll of #41 to catch remote paths that point to a file. Before, any path ending in a file extension was considered local and transformed into a relative path.

christophdemon’s picture

StatusFileSize
new5.4 KB
new891 bytes

Added an extra check on the failing host part.

randalv’s picture

StatusFileSize
new5.03 KB

Refactored the patches a bit to go back to a simpler concept, it's been tested internally and seems to work stable.

guedressel’s picture

+++ b/pathologic.module
@@ -192,10 +199,18 @@ function _pathologic_replace($matches) {
+  // file url starting with 'sites/default/files' (or with a leading slash).
...
+    preg_match('/sites\/(.*?)\/files\//', $parts['path'], $files_matches);

The public file path is configurable - see https://www.drupal.org/node/22241

Please check on actual path, instead of hard-coding patterns for the default path.

spadxiii’s picture

StatusFileSize
new4.47 KB
new3.97 KB

I found that the last patch didn't work for our platform anymore. We use flysystem (and s3/cloudfront) so our 'file public path' isn't anywhere near /sites//files and had to find a different solution.

I removed the whole part of trying to rewrite the file-path, because the path is fine already.
I've added the 'is_file' boolean to the $parts so that the alter can change it later on. (We added an alter to set is_file to true for flysystem-urls)

nickdjm’s picture

StatusFileSize
new4.47 KB

Updated to work with latest release.

randalv’s picture

StatusFileSize
new5.21 KB

I've refactored my patch from #48 to use the dynamic path retrieved from PublicStream::basePath().
Now this should work with non-standard public file paths.

As well as applying the changes from the latest version, testing it against 2.0.x.

I've taken #48 instead of the other patches because I do think we need to step away from the hardcoded list of file extensions, even if I was the one (to my regret) to introduce that into the line of patches.

bletch’s picture

Version: 8.x-1.x-dev » 2.0.0-alpha1
StatusFileSize
new4.52 KB

Patch against 2.0-alpha IGNORE THIS, I did not see the comment above mine. 52 is a better patch.

ronaldmulero’s picture

Patch #52 fixes the following problem.

1. Create a Spanish translation of an English page.
2. Hard-code a link to an existing pdf on the same site.
<a href="/files/whatever/file.pdf">same-site.pdf</a>
3. Save
4. Before patch, the same-site link is incorrectly rendered with "/es" at the start of the path: <a href="http://example.com/es/files/whatever/file.pdf">same-site.pdf</a>
5. After patch, same-site link is correctly rendered as: <a href="http://example.com/files/whatever/file.pdf">same-site.pdf</a>

Thanks @RandalV

Pathologic: 8.x-1.0-alpha4
Drupal core 9.5.9

dww’s picture

Issue summary: View changes
Status: Needs work » Postponed (maintainer needs more info)
Issue tags: +Needs issue summary update, +Needs steps to reproduce

Now that I'm starting to get the tests for this module a bit more under control, I'm turning my attention to trying to fix this. However, even after re-reviewing the issue, I'm not clear what bugs we're actually fixing here. 😅 I'm not sure how to flesh out the test coverage we need since I don't know what should be working once the patch is applied.

Can someone who's experiencing a problem here clarify what bug you're actually hitting that this patch would solve? What kinds of links are being broken and why?

randalv’s picture

Hi @dww,

Thank you for turning your attention to this issue.
One of the main problems is the module adding language prefixes to inline file URLs.

If an editor adds a file inline, for example: `/sites/default/files/2023-05/a-picture-of-a-cat.jpg`
Then the module would refactor this to, for example: `/en-GB/sites/default/files/2023-05/a-picture-of-a-cat.jpg`

I've been messing around to find the right solution but even in my last patch I'm not quite sure that it's what we want.
I do think it's the most stable patch so far, with dynamic public file paths recognition etc.

TL;DR: I think we need to check if language path prefixes are added (or not added) correctly in all situations, this doesn't just include files but also other URLs like translated nodes etc

svendecabooter’s picture

FYI: patch #52 breaks our specific use case, where we have relative paths to files, that we want to turn into a full url ("Full URL" selected for "Processed URL format" setting).

This change gets ignored in the following blob of code:

elseif (str_contains($parts['path'], PublicStream::basePath())) {
    // This url should not be turned into a URL object, because we don't want
    // language handling for this path.
    $dont_rewrite = TRUE;
  }

In the initial if statement, the clause $parts['scheme'] === 'files' is FALSE by the way.
Perhaps because these are images uploaded inline via a WYSIWYG editor?

tcrawford’s picture

StatusFileSize
new4.88 KB

We are using #51 on a project. I have re-rolled patch #51 so that it applies on 2.0.x, which is required for an upgrade to D10.0.x. As per previous concerns, there is no test coverage for the patch, but I hope it helps someone who is upgrading.

n.ghunaim’s picture

StatusFileSize
new4.88 KB

Adding "webp" extension for file extensions.

dww’s picture

I started working on fleshing out test coverage for this, and ran into #3393800: Kernel tests can't use path aliases on entities. Maybe we'll have to do this in a Functional test until that's fixed and backported. 😞

dww’s picture

Assigned: Unassigned » dww
Status: Postponed (maintainer needs more info) » Needs work
Issue tags: -Needs tests
StatusFileSize
new11.38 KB
new16.57 KB

Re: #60: No, we can override that behavior in specific Kernel tests if we want to. But I still couldn't get Kernel tests to run the Pathologic filter and get the same behavior as when doing it in Functional tests. 😢 It's a mess I don't really have time to fully untangle.

I'm adding a Functional test for this, which starts to flesh out the expected behavior, and the need for something like the patches in here. Moving the code for creating and using a filter format out of PathologicKernelTestBase into a Trait for reuse.

However, some of the expected cases I added are still failing. I think the way the filter is stripping country codes is broken, and it's doing it far more often than it should. See also #3388563: Option to keep language prefix in URLs. For now, I commented those out.

Anyway, I'm actively working on this now, so I'm going to assign myself. Stay tuned for more.

dww’s picture

StatusFileSize
new16.55 KB

This is with all the assertions active. Fails locally like so:

There was 1 failure:

1) Drupal\Tests\pathologic\Functional\PathologicLanguageAliasTest::testContentTranslation
en: fr/reference-fr link uses the FR alias
Failed asserting that two strings are identical.
--- Expected
+++ Actual
@@ @@
-'<a href="/drupal-10/fr/reference-fr">Test node link</a>'
+'<a href="/drupal-10/reference-fr">Test node link</a>'

/Applications/MAMP-common/htdocs/drupal-10/vendor/phpunit/phpunit/src/Framework/Constraint/Constraint.php:122
/Applications/MAMP-common/htdocs/drupal-10/vendor/phpunit/phpunit/src/Framework/Constraint/IsIdentical.php:79
/.../pathologic/tests/src/Functional/PathologicLanguageAliasTest.php:145
/Applications/MAMP-common/htdocs/drupal-10/vendor/phpunit/phpunit/src/Framework/TestResult.php:728

FAILURES!
Tests: 4, Assertions: 94, Failures: 1.

dww’s picture

Assigned: dww » Unassigned
Status: Needs work » Needs review

Moved to https://git.drupalcode.org/project/pathologic/-/merge_requests/12

Would love to get feedback on https://git.drupalcode.org/project/pathologic/-/merge_requests/12/diffs?...

When do we ever want to be stripping off the langcodes in the URLs we're processing? That just always seems wrong. All existing tests are now passing, but I don't trust them that well to believe this isn't going to break things for someone.

Should we proceed with #3388563: Option to keep language prefix in URLs where we "fix" the language handling with a setting? That also doesn't seem right.

Status: Needs review » Needs work

The last submitted patch, 62: 2418369-62.patch, failed testing. View results
- codesniffer_fixes.patch Interdiff of automated coding standards fixes only.

dww’s picture

Status: Needs work » Needs review

Bot is confused, the good code is in the MR now.

dww’s picture

To minimize disruption, should we add filter settings that control any / all of these behaviors? Seems like a mess, but I’m open to input from others.

Thanks,
-Derek

mark_fullmer’s picture

To minimize disruption, should we add filter settings that control any / all of these behaviors? Seems like a mess, but I’m open to input from others.

My initial instinct is similiar -- adding settings would complicate the permutations of what this module can do, requiring additional overhead to maintain and add test coverage. That said, given that it appears that there are multiple, valid ways that sites may want to handle language prefixes in URLs, I don't think there is a way to have a one-size-fits-all solution.

This feeling is based on conversations coming from the Linkit module, which like Pathologic, provides a text format filter that manipulates URLs:

I think this should be configurable, whether the link should switch to the language of the referenced entity or if it should stay in the current.

-- Berdir, https://www.drupal.org/project/linkit/issues/3041045#comment-13038924

A far safer solution would be to instead set hreflang if the user chose a specific translation that does not match the current language.

-- Wim Leers, https://www.drupal.org/project/linkit/issues/2886455#comment-14982870

Agreed this probably something that needs to be configurable given the maturity of the module. Hard to sell a change like this without a way to revert it to how it works now.

-- bkosborne, https://www.drupal.org/project/linkit/issues/3041045#comment-13875365

(That last link provides an example of where different sites would want different URL outputs for translated node URLs)

So, with that in mind, I'm almost leaning toward thinking we *should* put #3388563: Option to keep language prefix in URLs in place as a first step, and maybe even expand the settings options.

Following along on this issue now and helping where I can...

dww’s picture

Title: Internal URL handling (language prefixes, base://, ...) » [PP-1] Internal URL handling (language prefixes, base://, ...)
Issue summary: View changes
Status: Needs review » Postponed

I'm still not clear when the current behavior is ever correct. 😅 But I picked up #3388563: Option to keep language prefix in URLs, moved it to 2.0.x, added tests, etc. I'll be working on an upgrade path next to maintain current behavior for existing sites.

Meanwhile, I guess I'll postpone this one until that's merged...

dww’s picture

Title: [PP-1] Internal URL handling (language prefixes, base://, ...) » Internal URL handling (language prefixes, base://, ...)
Status: Postponed » Needs review

#3388563: Option to keep language prefix in URLs is merged. This needed to be rebased. Probably still needs work, but the MR is reviewable again. Thanks!

dww’s picture

Issue tags: +Needs tests

The full logic and tests here are now merged the latest changes in 2.0.x, namely #3388563: Option to keep language prefix in URLs.

Not sure if we now want to add yet another setting here to control if the filter should allow URL aliases and language processing to happen at all or not. Basically, do we make the final fallback case where we're now going to try to use internal: check a config setting for if we should continue to use base: like we do now. 😬 🤔

Also, we really need to flesh out the test coverage for the "is_file" logic. Tagging for that, but leaving NR since I'd love more eyes on what we should do here now.

n.ghunaim’s picture

StatusFileSize
new4.47 KB

Leave internal links alone without changing them.

astutonet’s picture

The patch in #59 solved my issues with language prefixes in the URL. But we need a definitive and fast alternative, as I have seen many users abandon Drupal due to problems like this.

mark.labrecque’s picture

StatusFileSize
new4.9 KB

Patch reroll from #59 to account for upstream updates. Still no tests, unfortunately, but it will work with the latest module update.

Status: Needs review » Needs work

The last submitted patch, 74: pathologic_2418369-74.patch, failed testing. View results
- codesniffer_fixes.patch Interdiff of automated coding standards fixes only.

mark.labrecque’s picture

Status: Needs work » Needs review
StatusFileSize
new3.98 KB

Let's try that again....

Status: Needs review » Needs work

The last submitted patch, 76: pathologic_2418369-76.patch, failed testing. View results
- codesniffer_fixes.patch Interdiff of automated coding standards fixes only.

dww’s picture

Thanks. However, please drop patch #59 or anything based on it. The code is now in the merge request. That’s where the action needs to happen now.

dww’s picture

Also note that language prefixes in the URL should be mostly solved in alpha2 thanks to #3388563: Option to keep language prefix in URLs

twod’s picture

The PR works well for us. Had issues with canonical node paths being marked as unrouted and bypassing the outbound path processing. Having them processed with the internal:// prefix where possible, makes more sense.

ayalon’s picture

StatusFileSize
new11.64 KB

Here is the latest MR as a diff for composer patches

fox mulder’s picture

I'm trying the MR !12 on a multilingual site where the language negotiation based on domain. Pathologic outputs the /node/N link on hungarian aliased, but not on english.
I'm sorry, but I don't have time to come up with an improved amendment proposal right now
It was my fault. There was no English translation of the linked node

tcrawford’s picture

I tested this (#81) and it worked for me in the case of linking from a German node (default language) with an anchor to a node in another language (e.g. https://mysite.ch/fr/node/123 ).

With the patch the German node could link to the French node without issue (i.e. the alias was resolved to the French alias). Without the patch the link is resolved to the German node ( https://mysite.ch/node/123) if the preserve language option is not selected (#3388563) and to the French node (https://mysite.ch/fr/node/123) if it is.

However, for the case of system files (e.g. https://mysite.ch/system/files/media/2025-06/some.pdf ) the patch results in the URL being mangled. I debugged this and found the cause.

System private files are now processed with the internal: scheme, whereas without the patch they are processed with the base: scheme. This (internal:) seems semantically correct, because they are routed. However, this has a nasty side-effect.

The _pathologic_replace callback passes options to fromUri(), with options that contain an empty query param.
$url = Url::fromUri($path, $url_params['options'])->toString();

fromUri() then calls from static::fromInternalUri(). Here the options with an empty query param array are passed to the URL.

This overrides the file query param that is added by the PathProcessorFiles (inbound path processor), resulting in the files query param being lost and the URL being rendered as https://mysite.ch/system/files

We likely need to treat system files as base: (does not seem semantically correct, but works) or unset the the query key from the options before calling Url:fromUri. I tested both approaches and they work. I will push this change to the MR and try to add a test for this case.

tcrawford’s picture

StatusFileSize
new29.08 KB

Added a test for the fix for system files.
Here is the patch to the current latest commit on MR 12: pathologic-language-prefix-2418369-84.diff

tcrawford’s picture

It turns out that it is better to process system files with the base scheme instead of the internal scheme. Otherwise URLs will be ouput as /system/files?file=...... I will fix this and push to the MR shortly and update the patch.

tcrawford’s picture

StatusFileSize
new31.21 KB

I have updated the patch file for the alternative approach to handling system files.

jay.dansand made their first commit to this issue’s fork.

jay.dansand’s picture

Version: 2.0.0-alpha1 » 2.0.x-dev
Status: Needs work » Needs review

Re-rolled against 2.0.x and passes tests now. Is it ready for review?

mark_fullmer’s picture

Status: Needs review » Needs work

Re-rolled against 2.0.x and passes tests now. Is it ready for review?

Thanks for the re-roll. Setting this to Needs work, as I feel the following tasks still need to be performed before further review by the maintainers can happen:

1. The current code changes fail Drupal coding standards, per the PHPCS test failure.
2. The maintainer has left multiple comments on the merge request. These need to be responded to in some fashion, as there are open questions about backwards compatibility for older versions of PHP and test coverage.
3. The issue description does not yet sufficiently state a problem or an enhancement, nor does it state clearly what the intended resolution is. This makes evaluating whether or not the code change is sufficient, or in scope for the proposed goals, a guessing game.

Thanks for people who can contribute to resolving any of the above!

pfrenssen’s picture

Also the additional test coverage and fix from #84 and #86 should be integrated in the MR.