Problem/Motivation

Drupal 7 accepts protocol-relative URLs, that start with double slash and doesn't have an explicit protocol part. Example: //example.com.
Drupal 8 does not support such format yet: #2744729: Link field should accept protocol-relative URLs.

Currently, FieldLink process plugin treats such URLs as internal, so the resulted links are not correct.
To be more exact, they do not pass conditions for internal and externals URLs:
1) if (!preg_match($internal_pattern . $end, $uri)) {

2) if (preg_match($external_pattern . $end, $uri)) {

But then fallback to internal at the end of the canonicalizeUri method.

Steps to reproduce

1. Migrate Drupal 7 link field with //example.com/whatever URL value;
2. Observe internal:/example.com/whatever uri value after the migration.

Proposed resolution

In Drupal 7 Link module, it's being handled by appending a protocol: https://git.drupalcode.org/project/link/blob/7.x-1.5-beta2/link.module#L...

There is also a condition in UrlHelper::isExternal() responsible for such cases:
https://git.drupalcode.org/project/drupal/-/blob/9.3.x/core/lib/Drupal/C...

In both cases, it's handled early before making any further processing/validation. The proposed change is the same here - let's replace a double slash with a configured protocol as early as possible.

Note: if/when #2744729: Link field should accept protocol-relative URLs lands, it would be possible to leave such links as-is. But for now, we should append a protocol and the start.

Remaining tasks

Review, commit.

Issue fork drupal-3232681

Command icon Show commands

Start within a Git clone of the project using the version control instructions.

Or, if you do not have SSH keys set up on git.drupalcode.org:

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

Matroskeen created an issue. See original summary.

Matroskeen’s picture

Status: Active » Needs review
FileSize
793 bytes
1.66 KB
Matroskeen’s picture

Title: FieldLink treats protocol-relative external URLs as internal ones » FieldLink process plugin treats protocol-relative external URLs as internal ones
Priority: Normal » Major
Issue summary: View changes

I'm not sure, but I think it might be even Major. Currently, such links are not correct after migration, so it's some sort of data loss.

The last submitted patch, 2: 3232681-2-test-only.patch, failed testing. View results

danflanagan8’s picture

Status: Needs review » Needs work

This looks great. I went in to the drupal7 fixture and changed

'field_link_url' => 'http://google.com',

to

'field_link_url' => '//google.com',

for node 1. Before the patch, the migrated node linked to [my-drupal-8-site]/google.com. After applying the patch, I re-ran the upgrade on a fresh install. This time, node 1 correctly linked to http://google.com.

I like the test too.

The only suggestion is that we might want to add a @todo comment that references #2744729: Link field should accept protocol-relative URLs. Something like:

// @todo: Remove when https://www.drupal.org/node/2744729 lands.

Matroskeen’s picture

Status: Needs work » Needs review

Thanks for the suggestion! It was done and ready for review again.
I'm also hiding initial patches and keeping a single MR.

danflanagan8’s picture

Status: Needs review » Reviewed & tested by the community

MR looks good!

kpaxman’s picture

Status: Reviewed & tested by the community » Needs review

I just had a look at the merge request and I note that it doesn't check that the "//" is at the front when doing the replace - so in theory, if someone's link was "//google.com//" for some reason (typo? would there ever be a proper URL with multiple slashes?) this would create "http://google.comhttp://"

It may be an edge case, but it feels like for safety's sake it should only replace the first one. Since we've already verified that the string starts with "//" this may be as simple as setting the count parameter on str_replace to 1.

danflanagan8’s picture

Good catch, @kpaxman. Might as well be extra careful.

quietone’s picture

Status: Needs review » Needs work

Setting to NW for #9

Matroskeen’s picture

Status: Needs work » Needs review

Yeah, good catch! If I read correctly https://www.php.net/manual/en/function.str-replace.php, the $count variable is supposed to keep the number of replacements. So if we pass "//google.com//" string, it'll receive the value of 2.

The fix is a bit different and ready for review.

danflanagan8’s picture

Status: Needs review » Reviewed & tested by the community

@Matroskeen, looks like you're correct about the $count parameter in str_replace. Thanks for the link. I like your fix. Setting back to RTCB.

alexpott’s picture

Version: 9.3.x-dev » 9.2.x-dev
Status: Reviewed & tested by the community » Fixed

Seems a pragmatic fix for now.

Committed and pushed e57ef27b2d to 9.3.x and 63b8c894ac to 9.2.x. Thanks!

  • alexpott committed e57ef27 on 9.3.x
    Issue #3232681 by Matroskeen, danflanagan8, kpaxman, quietone: FieldLink...

  • alexpott committed 63b8c89 on 9.2.x
    Issue #3232681 by Matroskeen, danflanagan8, kpaxman, quietone: FieldLink...

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.