Problem/Motivation

According to https://ampbench.appspot.com/, which the google engineers used to validate our site, the canonical and amphtml links must be absolute.

canonical is provided by core, and I will open an issue there as well, not sure if you want to work around that in the meantime.

I did apply a quickfix to get ready for the launch in the amp module, I can share that if you're interested.

If not, then we can also close this as a duplicate of the core issue that I will open later.

Side note 1: Our google contact also told us that the ? must be encoded somehow, but that might have been a misunderstanding. Currently waiting for more feedback on that.

Side note 2: We have a site that has some custom requirements for canonical and will use the canonical sometimes to point to external originating sources. I'm not sure yet how the amp module should behave properly. Maybe the checks should use \Drupal\Component\Utility\UrlHelper::isExternal() and externalIsLocal() and only expose amphtml if canonical is the current site. Or if it's not, then build the amphtml url ourself by pointing to the current node.

Proposed resolution

Remaining tasks

User interface changes

API changes

Data model changes

Comments

Berdir created an issue. See original summary.

berdir’s picture

Issue summary: View changes
berdir’s picture

Version: 7.x-1.x-dev » 8.x-1.x-dev
mtift’s picture

Issue tags: +1.0 blocker

Yes, if you have a fix then I think we should use that until core has the same functionality. Thanks!

berdir’s picture

Opened https://www.drupal.org/node/2738373

Still not 100% sure about what exactly is required. I'm seeing sites that are valid that don't have it absolute on the normal page, only on the AMP page itself.

So one could argue that it is only a requirement for AMP and not for core/normal pages.

berdir’s picture

Status: Active » Needs review
StatusFileSize
new2.87 KB

Ok, first patch.

I included a check for the side note above about external canonical URLs, I think the module shouldn't touch those. Kind of unrelated, feel free to ignore that part if you prefer a separate issue.

berdir’s picture

I suggest we get your test in first, then we will see the change here nicely and can adopt the test for the url change.

rainbowarray’s picture

Just a note that in the latest patch on #2717641: Add support for 'Top Stories with AMP' I'm also changing the canonical URL to be absolute, as I believe that's needed for the JSON schema output.

Slight differences between the two, glad to make modifications as necessary.

mtift’s picture

Issue summary: View changes
Status: Needs review » Patch (to be ported)
StatusFileSize
new14.89 KB

This looks great. Code makes sense. Tests fail, as expected. And now the URLs look like this:

Committed to 8.x-1.x. I opened a follow-up for the test: #2741217: Test for absolute URLs

mtift’s picture

Version: 8.x-1.x-dev » 7.x-1.x-dev
Status: Patch (to be ported) » Needs review
StatusFileSize
new1.38 KB

I'm not sure if there's an easier way to alter canonical links, but I think we can do the Drupal 7 version like this.

mtift’s picture

Assigned: Unassigned » mtift
rainbowarray’s picture

Status: Needs review » Needs work
+++ b/amp.module
@@ -1192,3 +1193,24 @@ function amp_page_delivery_callback_alter(&$deliver_callback) {
+          $relative_url = $tag['#attributes']['href'];
+          // Double check that we are converting relative URLs to absolute URLs.
+          if ($relative_url == drupal_get_normal_path($relative_url)) {
+            $absolute_url = url($relative_url, array('absolute' => TRUE));
+          }
+          $tag['#attributes']['href'] = $absolute_url;

In theory if this URL somehow is already absolute, then $absolute_url might not exist and could cause an error.

Could be worth either using just one variable ($canonical_url or something like that) or moving to:

if ($relative_url == drupal_get_normal_path($relative_url)) {
  $absolute_url = url($relative_url, array('absolute' => TRUE));
  $tag['#attributes']['href'] = $absolute_url;
}
else {
  $tag['#attributes']['href'] = $relative_url;

}

I think using $canonical_url might be the better option, as otherwise it looks like you're setting the tag href to a relative URL, while with the test you would have determined the URL was actually an absolute URL.

Tested this, and it does successfully turn the amphtml link and canonical links on amp pages into absolute URLs.

mtift’s picture

Status: Needs work » Needs review
StatusFileSize
new1.02 KB
new1.35 KB

drupal_get_normal_path() returns an internal path, so we're already checking for relative URLs. But I see how the variable names could be confusing. This should be more clear.

rainbowarray’s picture

Status: Needs review » Reviewed & tested by the community

Tested, this still works. Good changes. RTBC.

  • mtift committed 558f13b on 7.x-1.x
    Issue #2737729 by mtift: Canonical and AMPHTML must be absolute
    
mtift’s picture

Status: Reviewed & tested by the community » Fixed

Committed to 7.x-1.x

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.

mtift’s picture