Problem/Motivation
content_translation_page_attachments() adds alternate links to the html head. However these links also end up in the response headers. The large header size crashes the webserver.
The HTTP Link entity-header field provides a means for serializing one or more links in HTTP headers. It is semantically equivalent to the HTML
element.
https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Link
The first bytes in a response are important from a performance perspective. Link response headers are semantically equivalent to the HTML <link> element, so these links should not be available in both the response headers and HTML <link> elements
Proposed resolution
Remove rel="alternate" links from the response headers.
User interface changes
None.
API changes
None.
Data model changes
None.
Release notes snippet
rel="alternate" links provided by the Content Translation module were added as both link HTTP headers and link HTML tags, but are now added only as link HTML tags.
Original report by swentel
content_translation_page_attachments() adds alternate links to the html head, which is fine. However these links also end up in the response headers (I honestly don't really understand why we do that). Now, when there are too many languages (in our case, we're working with 47), varnish will crash with a 503.
- BogoHeader Too many headers: Vary: Accept-Encodin
- HttpGarbage "HTTP/1.1%00200%00OK%00%0aDate: Fri, 28 Apr 2017 12:54:20 GMT%00%0aServer: Apache/2.4.12 (Ubuntu)%00%0aX-Powered-By: PHP/5.6.20%00%0aCache-Control: must-revalidate, no-cache, private%00%0aLink: <http://127.0.0.1:6081/en/handbook>; rel="alternate"; hreflang="en"%00%0aLink: <http://12%00"
- BerespStatus 503
- BerespReason Service Unavailable
- FetchError http format error
- BackendClose 17 default(127.0.0.1,,80)
- Timestamp Beresp: 1493384064.185137 3.703384 3.703163
- Timestamp Error: 1493384064.185146 3.703393 0.000010
- BerespProtocol HTTP/1.1
- BerespStatus 503
I've altered out these alternate links for now with hook_module_implements_alter(). But we shouldn't send those to varnish IMO.
/**
* Implements hook_module_implements_alter().
*/
function module_module_implements_alter(&$implementations, $hook) {
if ($hook == 'page_attachments') {
unset($implementations['content_translation']);
}
}
| Comment | File | Size | Author |
|---|---|---|---|
| #32 | 2873648-32.patch | 514 bytes | idebr |
Comments
Comment #3
swentel commentedActually this not related to content translation alone - I think we should really stop sending any rel link along with the headers.
Moving to another component for now to get more discussion.
Comment #4
ekes commentedNoticed Acquia suffering this one https://library.acquia.com/article/varnish-header-limits as will everyone who can't change their Varnish runtime parameters https://www.varnish-cache.org/docs/5.0/reference/varnishd.html#run-time-...
Comment #5
andypostthat affects nginx as well, requiring to increase buffers
Comment #8
raphaeltbm commentedComment #10
aangel commentedIt's blowing our Varnish, too, with 37 languages. This article says that it's valid to put in the HTTP header but it seems the
<head>or site map are the better places for it to avoid this problem.Comment #13
leolandotan commentedWe are also having this issue and we have 90 plus translations enabled. Is that safe for SEO or for whatever functionality to unset is as suggested (`unset($implementations['content_translation']);`)?
Comment #17
idebr commentedhttps://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Link
The first bytes in a response are important from a performance perspective. If Link response headers are semantically equivalent to the HTML
<link>element, why should these links be available in both the response headers and HTML<link>elements?Attached patch from the links from the response headers.
Comment #19
mfbMaybe just keep the canonical link header for now - although it should be fine to remove (as far as I can tell), it's not what this issue is about and is not causing nearly as many problems as Content Translation's link headers.
Comment #20
idebr commentedAttached patch updates test coverage for the response headers.
Comment #21
mfbCan you explain the rationale for removing the canonical link header while keeping the shortlink header? Is shortlink somehow more useful while canonical is not?
Comment #22
idebr commented#21 The
rel="shortlink"did not show up in my project's response headers since it was already being removed by the Metatag moduleAttached patch also removes the duplicate response header for
rel="shortlink"Comment #23
mfbLooks like unused use statement is triggering a failure
Comment #24
mfbCreated a draft change record: https://www.drupal.org/node/3266120
Comment #25
andregp commentedRemoved unused use statement.
Comment #26
mfbGreat!
(It didn't even make sense to Html::escape() the URL, as HTTP headers are not HTML - issue for that is #3245895: Link HTTP header should not be HTML-encoded.)
I'm not aware of any use for these link HTTP headers so marking RTBC.
If anyone does have a use case for the canonical and shortlink HTTP headers, we can always fall back on just getting rid of Content Translation's HTTP headers.
Comment #27
alexpottI think we need to be a little bit careful here. The varnish crash is bad and we need to fix that but we've been emitting these link headers for a long time so surely someone somewhere is depending on them. We added this in #552478: Restrict "self-closing" tags to only void elements in drupal_pre_render_html_tag. I think falling back to remove the translations is probably for the best because the unlimited nature of that is what is causing the problem.
Comment #28
catchLet's open a follow-up to remove the other links - but that could happen as a 10.x-only behaviour change maybe just in case someone actually is relying on them somewhere.
Comment #29
alexpottTrying to work out who uses this. Apparently Google added support for this to there search engine indexing in 2011. I think the primary use-case was for
Source: https://moz.com/blog/how-to-advanced-relcanonical-http-headers
I think we should open a follow to discuss remove the canonical and shortlink because as someone said on the internet:
Source: https://searchengineland.com/the-ultimate-guide-to-http-status-codes-and...
Comment #30
idebr commentedFollow-up: #3266589: Remove redundant Link canonical/shortlink response headers
Comment #31
idebr commentedComment #32
idebr commentedAttached patch removes
rel="alternate"links from the response headers.Comment #33
wim leers— @idebr in #17
This 100% convinces me, no matter the hypermedia impact.
A change record already exists.
Comment #34
alexpottCommitted and pushed 7af29a9e52 to 10.0.x and 3c6394a483 to 9.4.x. Thanks!
I think it is telling that the addition of the translation links to the HTTP response is completely untested.
Going to ping @catch because I think we should consider backporting this to 9.3.x
Comment #37
catchGiven this fixes a 5** error, it's a one-liner, and I don't personally see any reason anyone would rely on these in any way that would break by removing them, I think this is a good candidate for backport. Although maybe better after this Wednesday's release just to give us time to change our minds.
Comment #38
wim leersGiven that
9.4.0-beta1will be released today, marking this as and moving back to9.4.x.