Current Output:

<meta property="og:image" content="/example.cdndomain.com/cdn/farfuture/mZRxNljyhBA6tIH8ozCr-uQC1ybvCaBRrUW94h7O0gg/1482159252/sites/example.com/files/styles/social_share/public/product/image.png" />

Desired Output:

<meta property="og:image" content="//example.cdndomain.com/cdn/farfuture/mZRxNljyhBA6tIH8ozCr-uQC1ybvCaBRrUW94h7O0gg/1482159252/sites/example.com/files/styles/social_share/public/product/image.png" />

The issue is that the double // is being stripped from the beginning of the URL and is replaced with a single/. I discovered this issue when I enabled the CDN module, however this would be an issue anytime there were protocol agnostic URLs.

However, even with the correct URL, these protocol relative URLs do not work when constructing social share image paths as the resulting image URL in the social share query string won't work since the request has no protocol to be relative to. So perhaps this issue is invalid and we don't ever want to support the output of protocol relative paths in meta tag token values, but even then, it was very confusing to have the URL mysteriously have one of the / removed.

If we do want to address this, I worked up a solution below, which I can turn into a patch if there is support for the approach.

I'm not a regex expert, but it seems that we need to alter the exclusion list to not just ignore double // with a : before them, but also ignore // when it is at the beginning of a string. The change below seems to be working great for me.

Previous code:

// Ensure that there are no double-slash sequences due to empty token
// values.
$replaced = preg_replace('/(?<!:)\/+\//', '/', $replaced);

Suggested code:

// Ensure that there are no double-slash sequences due to empty token
// values.
$replaced = preg_replace('/(?<!:)(?<!^)\/+\//', '/', $replaced);

FWIW I'm filing an issue in the CDN issue queue as well, as this would never be an issue if we could exclude meta tag token values from CDN rewriting. I'll include a link here once that issue is created.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

adamzimmermann created an issue. See original summary.

adamzimmermann’s picture

Issue summary: View changes
adamzimmermann’s picture

Also, this is very much related to OG Tags don't work - double slashes get replaced with one slash, but I felt it was unique enough to open a new issue.

Wim Leers’s picture

Note the D7 version of the Metatag module had pretty much exactly the same bug: #2791963: Protocol Relative URLs not handled properly. It was only discovered and fixed in August 2016.

Wim Leers’s picture

Issue tags: +Needs tests
DamienMcKenna’s picture

Version: 8.x-1.0-beta11 » 8.x-1.x-dev
Wim Leers’s picture

Title: Protocol relative paths are broken » Protocol-relative URLs are broken
ElegguaDP’s picture

Regexp solution from adamzimmermann works well and fixes my problem with CDN.

DamienMcKenna’s picture

Status: Active » Needs review

@ElgguaDP: Thanks for the patch, I appreciate it! Lets have the testbot take a look.

ocastle’s picture

I've tried testing this today for the https://www.drupal.org/node/2867742 Issue which is directly related.

Applying the patch doesn't work for tokens which return img tags as the new regex is only testing for initial first character:

E.g

<img src="//cdn.fullbundle.com/sites/fullbundle/files/..." typeof="foaf:Image" />

This can be worked around by using this patch and changing the token to point directly to the path, we should be adding support for the base token.

[node:field_image:entity:url] rather than [node:field_image]

DamienMcKenna’s picture

What tokens are you using?

ocastle’s picture

Using [node:field_image] (Image Field) causes the wrong url.

https://www.fullbundle.com/cdn.fullbundle.com/sites/...

However this can be fixed for the time being by using [node:field_image:entity:url]

DamienMcKenna’s picture

Title: Protocol-relative URLs are broken » Protocol-relative URLs are broken in image handling

This is in the custom logic for image field handling.

DamienMcKenna’s picture

The relevant code is in MetaNameBase::parseImageURL().

ocastle’s picture

Title: Protocol-relative URLs are broken in image handling » Protocol-relative URLs are broken

Well the Issue isn't limited to just image field handling its relevant to all Protocol Relative URLS. - I'll change the title back.

The patch #9 currently attempts to fix Simple URLs (and i believe this works) but doesn't address image field logic.

DamienMcKenna’s picture

This sounds like it's a problem elsewhere then, maybe in CDN itself.

ocastle’s picture

I've added a new patch which solves the issue, but it changes the new regex added in patch #9 to not limit '//' to just the beginning.

I'm not sure how to test the functionality of "Ensure that there are no double-slash sequences due to empty token values." to see if this has been affected by these adjustments.

The new additions added to the attached patch follow a similar fix to that of the patch committed in D7.

DamienMcKenna’s picture

There are tests in #2628472: Write tests for image handling that we can build upon, so those need to be fixed first.

keopx’s picture

@ocastle This solution is not valid for Twitter. Twitter required a full url, including http(s).

<meta property="og:image" content="http://example.com/ogp.jpg"/>

Twitter documentation example:

<meta name="twitter:card" content="summary_large_image">
<meta name="twitter:site" content="@nytimes">
<meta name="twitter:creator" content="@SarahMaslinNir">
<meta name="twitter:title" content="Parade of Fans for Houston’s Funeral">
<meta name="twitter:description" content="NEWARK - The guest list and parade of limousines with celebrities emerging from them seemed more suited to a red carpet event in Hollywood or New York than than a gritty stretch of Sussex Avenue near the former site of the James M. Baxter Terrace public housing project here.">
<meta name="twitter:image" content="http://graphics8.nytimes.com/images/2012/02/19/us/19whitney-span/19whitney-span-articleLarge.jpg">

Here more: #2867742: Interaction with CDN module

ocastle’s picture

@keopx I think the patch addresses the issue when the url is protocol relative.

The business logic of twitter requiring the full non-relative protocol urls should be handled within the metatag_twitter_cards sub module not here.

The existing code is only handling the token value its given not altering urls.

ocastle’s picture

@keopx I'm assuming Open Graph is also going to required full urls with their scheme.

Wim Leers’s picture

#20: those URLs/docs don't say "full URLs" are required. But sadly, other developers report this same bug in Twitter's API:

So, created #2925819: Ability to force using a scheme instead of scheme-relative URLs as a potential solution. Looking forward to your feedback. Although I agree very much with @ocastle:

The business logic of twitter requiring the full non-relative protocol urls should be handled within the metatag_twitter_cards sub module not here.

In fact, he posted that after I created #2925819: Ability to force using a scheme instead of scheme-relative URLs. Tempted to close #2925819 already. Anyway, still looking forward to your feedback.

ocastle’s picture

We do need a separate issue for the twitter/OG urls.

The question is where does the responsibility lie?

IMO as the CDN module is outputting the protocol relative urls only because there doesn't seem like a full proof way of determining a 'https/http' scenario its should be handled there.

The other option would be that it should be considered an extension inside the metatag module in favour of adding compatibility to the CDN module 'as its just not possible to determine the scheme' there.

Thoughts anyone?

keopx’s picture

Thanks @winleers for added links (I read the same).

I think the same as @ocastle, CDN module is "removing" protocol.

Wim Leers’s picture

IMO as the CDN module is outputting the protocol relative urls only because there doesn't seem like a full proof way of determining a 'https/http' scenario its should be handled there.

That misses the point, I'm afraid. The point is that the CDN module is completely confirming to the HTTP spec, it works in all browsers. It follows the robustness principle: https://en.wikipedia.org/wiki/Robustness_principle.

The actual root cause is that the Twitter Card API is broken (or rather, Twitter's server-side code that processes the metadata that websites provide for it).

Since the Metatag module aims to provide integration with Twitter Card (among others), it should be up to the Metatag module to make the output meet the expectations of the APIs processing it, including working around bugs in those APIs. That's why I agreed with

The business logic of twitter requiring the full non-relative protocol urls should be handled within the metatag_twitter_cards sub module not here.

I proposed #2925819: Ability to force using a scheme instead of scheme-relative URLs not to solve this issue in particular alone, but to make it possible for sites to be explicitly HTTPS-only. If your site were explicitly HTTPS-only, and you'd be using Metatag and you'd be generating Twitter Card markup, then #2925819 would solve the problem reported here.

ocastle’s picture

After a second thought I have to agree with @wim-leers. The CDN module does what it needs to and works, it doesn't have the requirement of having to have the url scheme.

As its Twitter / OG that requires http/https on the URL it should be these modules that validate the inputted value and adjust accordingly by prepending the relevant scheme. I've created an additional task as its not related to this one. - https://www.drupal.org/project/metatag/issues/2925974

It would be good to add support for this issue/bug here.

keopx’s picture

Thanks @ocastle and @winleers, excuse the mess

@ocastle this patch #2840751-18: Protocol-relative URLs are broken works fine!

I continue on #2925974: Twitter Cards require absolute URLs

Wim Leers’s picture

keopx’s picture

Status: Needs review » Reviewed & tested by the community

@ocastle patch works fine for me #18

Wim Leers’s picture

Wait, doesn't this conflict with #2925974: Twitter Cards require absolute URLs?

keopx’s picture

@WinLeers I applied both patches without any conflict :)

Both patches are necessary to works fine (with twitter).

But this patch does its job well

Wim Leers’s picture

But don't both fix protocol-relative URLs?

ocastle’s picture

They don't do the same no.
The patch here on comment #18 allows for protocol relative URLs
The other patch on https://www.drupal.org/project/metatag/issues/2925974 Forces the urls to NOT be protocol relative.

mangy.fox’s picture

Patch #18 applied and fixed the issue for me.

DamienMcKenna’s picture

Status: Reviewed & tested by the community » Needs work

I really appreciate the effort resolving this problem. Would someone mind adding a test for this? Thanks.

DamienMcKenna’s picture

Status: Needs work » Reviewed & tested by the community

Lets move the test coverage to #2956996: Add test coverage for relative URL handling and deal with it later.

DamienMcKenna’s picture

Issue tags: -Needs tests

  • DamienMcKenna committed 81c507b on 8.x-1.x authored by ocastle
    Issue #2840751 by ocastle, ElegguaDP, Wim Leers, keopx, adamzimmermann:...
DamienMcKenna’s picture

Status: Reviewed & tested by the community » Fixed
Parent issue: » #2932455: Plan for Metatag 8.x-1.5

Committed, thank you all!

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.