Current Output:
<meta property="og:image" content="/example.cdndomain.com/cdn/farfuture/mZRxNljyhBA6tIH8ozCr-uQC1ybvCaBRrUW94h7O0gg/1482159252/sites/example.com/files/styles/social_share/public/product/image.png" />
Desired Output:
<meta property="og:image" content="//example.cdndomain.com/cdn/farfuture/mZRxNljyhBA6tIH8ozCr-uQC1ybvCaBRrUW94h7O0gg/1482159252/sites/example.com/files/styles/social_share/public/product/image.png" />
The issue is that the double //
is being stripped from the beginning of the URL and is replaced with a single/
. I discovered this issue when I enabled the CDN module, however this would be an issue anytime there were protocol agnostic URLs.
However, even with the correct URL, these protocol relative URLs do not work when constructing social share image paths as the resulting image URL in the social share query string won't work since the request has no protocol to be relative to. So perhaps this issue is invalid and we don't ever want to support the output of protocol relative paths in meta tag token values, but even then, it was very confusing to have the URL mysteriously have one of the /
removed.
If we do want to address this, I worked up a solution below, which I can turn into a patch if there is support for the approach.
I'm not a regex expert, but it seems that we need to alter the exclusion list to not just ignore double //
with a :
before them, but also ignore //
when it is at the beginning of a string. The change below seems to be working great for me.
Previous code:
// Ensure that there are no double-slash sequences due to empty token
// values.
$replaced = preg_replace('/(?<!:)\/+\//', '/', $replaced);
Suggested code:
// Ensure that there are no double-slash sequences due to empty token
// values.
$replaced = preg_replace('/(?<!:)(?<!^)\/+\//', '/', $replaced);
FWIW I'm filing an issue in the CDN issue queue as well, as this would never be an issue if we could exclude meta tag token values from CDN rewriting. I'll include a link here once that issue is created.
Comment | File | Size | Author |
---|---|---|---|
#18 | protocol-relative-urls-2840751-17.patch | 1.22 KB | ocastle |
| |||
#9 | protocol-relative-urls-2840751-9.patch | 488 bytes | ElegguaDP |
|
Comments
Comment #2
adamzimmermann CreditAttribution: adamzimmermann at Chromatic for Meredith Corporation commentedComment #3
adamzimmermann CreditAttribution: adamzimmermann at Chromatic for Meredith Corporation commentedAlso, this is very much related to OG Tags don't work - double slashes get replaced with one slash, but I felt it was unique enough to open a new issue.
Comment #4
Wim Leers.
Comment #5
Wim LeersNote the D7 version of the Metatag module had pretty much exactly the same bug: #2791963: Protocol Relative URLs not handled properly. It was only discovered and fixed in August 2016.
Comment #6
Wim LeersComment #7
DamienMcKennaComment #8
Wim LeersComment #9
ElegguaDP CreditAttribution: ElegguaDP commentedRegexp solution from adamzimmermann works well and fixes my problem with CDN.
Comment #10
DamienMcKenna@ElgguaDP: Thanks for the patch, I appreciate it! Lets have the testbot take a look.
Comment #11
ocastle CreditAttribution: ocastle at Full Bundle commentedI've tried testing this today for the https://www.drupal.org/node/2867742 Issue which is directly related.
Applying the patch doesn't work for tokens which return img tags as the new regex is only testing for initial first character:
E.g
This can be worked around by using this patch and changing the token to point directly to the path, we should be adding support for the base token.
[node:field_image:entity:url] rather than [node:field_image]
Comment #12
DamienMcKennaWhat tokens are you using?
Comment #13
ocastle CreditAttribution: ocastle at Full Bundle commentedUsing
[node:field_image]
(Image Field) causes the wrong url.https://www.fullbundle.com/cdn.fullbundle.com/sites/...
However this can be fixed for the time being by using
[node:field_image:entity:url]
Comment #14
DamienMcKennaThis is in the custom logic for image field handling.
Comment #15
DamienMcKennaThe relevant code is in MetaNameBase::parseImageURL().
Comment #16
ocastle CreditAttribution: ocastle at Full Bundle commentedWell the Issue isn't limited to just image field handling its relevant to all Protocol Relative URLS. - I'll change the title back.
The patch #9 currently attempts to fix Simple URLs (and i believe this works) but doesn't address image field logic.
Comment #17
DamienMcKennaThis sounds like it's a problem elsewhere then, maybe in CDN itself.
Comment #18
ocastle CreditAttribution: ocastle at Full Bundle commentedI've added a new patch which solves the issue, but it changes the new regex added in patch #9 to not limit '//' to just the beginning.
I'm not sure how to test the functionality of "Ensure that there are no double-slash sequences due to empty token values." to see if this has been affected by these adjustments.
The new additions added to the attached patch follow a similar fix to that of the patch committed in D7.
Comment #19
DamienMcKennaThere are tests in #2628472: Write tests for image handling that we can build upon, so those need to be fixed first.
Comment #20
keopx@ocastle This solution is not valid for Twitter. Twitter required a full url, including http(s).
<meta property="og:image" content="http://example.com/ogp.jpg"/>
Twitter documentation example:
Here more: #2867742: Interaction with CDN module
Comment #21
ocastle CreditAttribution: ocastle at Full Bundle commented@keopx I think the patch addresses the issue when the url is protocol relative.
The business logic of twitter requiring the full non-relative protocol urls should be handled within the metatag_twitter_cards sub module not here.
The existing code is only handling the token value its given not altering urls.
Comment #22
ocastle CreditAttribution: ocastle at Full Bundle commented@keopx I'm assuming Open Graph is also going to required full urls with their scheme.
Comment #23
Wim Leers#20: those URLs/docs don't say "full URLs" are required. But sadly, other developers report this same bug in Twitter's API:
So, created #2925819: Ability to force using a scheme instead of scheme-relative URLs as a potential solution. Looking forward to your feedback. Although I agree very much with @ocastle:
In fact, he posted that after I created #2925819: Ability to force using a scheme instead of scheme-relative URLs. Tempted to close #2925819 already. Anyway, still looking forward to your feedback.
Comment #24
ocastle CreditAttribution: ocastle at Full Bundle commentedWe do need a separate issue for the twitter/OG urls.
The question is where does the responsibility lie?
IMO as the CDN module is outputting the protocol relative urls only because there doesn't seem like a full proof way of determining a 'https/http' scenario its should be handled there.
The other option would be that it should be considered an extension inside the metatag module in favour of adding compatibility to the CDN module 'as its just not possible to determine the scheme' there.
Thoughts anyone?
Comment #25
keopxThanks @winleers for added links (I read the same).
I think the same as @ocastle, CDN module is "removing" protocol.
Comment #26
Wim LeersThat misses the point, I'm afraid. The point is that the CDN module is completely confirming to the HTTP spec, it works in all browsers. It follows the robustness principle: https://en.wikipedia.org/wiki/Robustness_principle.
The actual root cause is that the Twitter Card API is broken (or rather, Twitter's server-side code that processes the metadata that websites provide for it).
Since the Metatag module aims to provide integration with Twitter Card (among others), it should be up to the Metatag module to make the output meet the expectations of the APIs processing it, including working around bugs in those APIs. That's why I agreed with
I proposed #2925819: Ability to force using a scheme instead of scheme-relative URLs not to solve this issue in particular alone, but to make it possible for sites to be explicitly HTTPS-only. If your site were explicitly HTTPS-only, and you'd be using Metatag and you'd be generating Twitter Card markup, then #2925819 would solve the problem reported here.
Comment #27
ocastle CreditAttribution: ocastle at Full Bundle commentedAfter a second thought I have to agree with @wim-leers. The CDN module does what it needs to and works, it doesn't have the requirement of having to have the url scheme.
As its Twitter / OG that requires http/https on the URL it should be these modules that validate the inputted value and adjust accordingly by prepending the relevant scheme. I've created an additional task as its not related to this one. - https://www.drupal.org/project/metatag/issues/2925974
It would be good to add support for this issue/bug here.
Comment #28
keopxThanks @ocastle and @winleers, excuse the mess
@ocastle this patch #2840751-18: Protocol-relative URLs are broken works fine!
I continue on #2925974: Twitter Cards require absolute URLs
Comment #29
Wim LeersComment #30
keopx@ocastle patch works fine for me #18
Comment #31
Wim LeersWait, doesn't this conflict with #2925974: Twitter Cards require absolute URLs?
Comment #32
keopx@WinLeers I applied both patches without any conflict :)
Both patches are necessary to works fine (with twitter).
But this patch does its job well
Comment #33
Wim LeersBut don't both fix protocol-relative URLs?
Comment #34
ocastle CreditAttribution: ocastle at Full Bundle commentedThey don't do the same no.
The patch here on comment #18 allows for protocol relative URLs
The other patch on https://www.drupal.org/project/metatag/issues/2925974 Forces the urls to NOT be protocol relative.
Comment #35
mangy.fox CreditAttribution: mangy.fox at Investis Digital commentedPatch #18 applied and fixed the issue for me.
Comment #36
DamienMcKennaI really appreciate the effort resolving this problem. Would someone mind adding a test for this? Thanks.
Comment #37
DamienMcKennaLets move the test coverage to #2956996: Add test coverage for relative URL handling and deal with it later.
Comment #38
DamienMcKennaComment #40
DamienMcKennaCommitted, thank you all!