This issue is the hub for the reports that have kept popping up over the last couple of days. The feeds are perfectly valid, and Aggregator does not seem to choke but merely throws a warning (although this might just be the module lying to me). Currently "infected":
Marzee Labs
Gizra
Darren Mothersele
Lin Clark
Stéphane Corlosquet
For now, those feeds remain unsuspended unless they actually make Aggregator choke, but I would really like to know the cause here.
Update: Scor's feed mysteriously self-repaired so only the original four remain on the list. Looking through the error logs it appears that no other feeds have shown this behavior in the last weeks. Sadly the problem seems to be more than just annoying as Gizra's latest posts aren't showing up on the Planet. The "infected" feeds validate fine btw. #3 contains a debugging idea.
Comments
Comment #1
amitaibu commented:
At least you have Lin Clark in the "infected" list -- probably one of the few people who really understand what the RSS schema is ;)
Comment #2
dddave commented:
Better title, and here comes the error message:
The feed from XYXYXY.com seems to be broken, due to "-1002 missing schema".
Comment #3
linclark commented:
I'm pretty sure that this error isn't actually an RSS schema issue, but is instead a poorly worded error related to the HTTP scheme.
It comes from drupal_http_request in common.inc. Could you add a line there to log the URIs which result in this error?
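A minimal sketch of the kind of check and logging suggested here. It is written in Python purely for illustration and is not Drupal's actual PHP code; the function name check_feed_url and the logger name feed_debug are hypothetical. The error string and the -1002 code are taken from the message quoted in #2.

```python
import logging
from urllib.parse import urlsplit

# Hypothetical logger name, used only for this illustration.
log = logging.getLogger("feed_debug")


def check_feed_url(url):
    """Return (code, error) for a URL, logging any URL that lacks a scheme.

    Illustration only: it mimics the error reported in #2 and is not
    Drupal's implementation.
    """
    parts = urlsplit(url)
    if not parts.scheme:
        # Log the offending URI so the broken input can be traced later.
        log.warning("missing schema for URI %r", url)
        return (-1002, "missing schema")
    return (0, None)
```

With this sketch, a URL such as "/feed.xml" or an empty string is reported as (-1002, "missing schema") and logged, while a full http:// URL passes the check.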
Comment #4
dddave commented:
Comment #5
dddave commented:
I'll suggest that debugging advice to tvn. Might take a while.
Added a new feed btw. I wonder what triggered this as I cannot recall seeing this issue before.
Comment #6
linclark commented:
It looks like Marzee's, Darren's, and mine all had XML elements with href attributes. I checked Wunderkraut's and Wim Leers', and neither of them had XML elements with href attributes as far as I could see.
Might it be that Aggregator doesn't know how to handle XML elements with href attributes and thus passes an empty value into drupal_http_request?
I've changed my feed so that it no longer uses an href on an element. If there is no error for my blog on the next pass, then that's likely the issue.
Comment #7
linclark commented:
I see that you added scor's blog. I'm pretty sure that he uses Drupal's (or Views') default RSS output, so I don't know what could be causing it.
Comment #8
dddave commented:
re #6: The error message persists. ;(
Comment #9
dddave commented:
It seems this does create some real trouble as Gizra's latest content isn't showing up. Going to talk to tvn today.
Comment #10
dddave commented:
Comment #11
dddave commented:
Comment #12
dddave commented:
Updated and clarified the summary.
Comment #13
darrenmothersele commented:
Perhaps this is something to do with the HTTP headers? Perhaps the content type?
For example, these are the HTTP headers from the Get Pantheon blog (picked randomly from the working feeds on Drupal Planet)...
But these are the HTTP headers from my feed:
And these are the HTTP headers from Gizra's feed:
As you can see, the failing feeds (Gizra's and mine) are both hosted on GitHub Pages, which doesn't serve them with the correct content type of 'application/rss+xml'.
Comment #14
darrenmothersele commented:
I just checked Marzee Labs and Lin Clark's feed and they're both hosted on GitHub pages too.
Comment #15
linclark commented:
It doesn't seem to get in the way of processing. I posted something yesterday and it is showing up on the Planet without any problems.
Comment #16
dddave commented:
mmh, Gizra's new Planet content is not showing up though. But at least we have narrowed it down a bit.
Comment #17
amitaibu commented:
@linclark,
Are you using Jekyll as well? If so, can you share the format of your RSS file? Mine is this
Comment #18
linclark commented:
I'm using Middleman with the Builder gem. Here's a version of the code I'm using. I technically don't need to make the link absolute in the item because I have the base namespace set, but I do it anyway.
Comment #19
amitaibu commented:
I was able to reproduce the error locally.
In the aggregator, I changed the URL from
http://www.gizra.com/taxonomy/term/1/all/feed/
to
http://www.gizra.com/taxonomy/term/1/all/feed/index.html
(i.e. added /index.html) and the error was gone.
@dddave, can you please try this change on d.o as well?
Comment #20
amitaibu commented:
Bump. I have a blog post in the pipe -- would love it to reach Drupal Planet ;)
Comment #21
dddave commented:
First off: sorry I missed this in the first place. But I am sorry to report that this change did not solve the issue.
Comment #22
amitaibu commented:
> First off: sorry I missed this in the first place
No problem, you are probably swamped with issues :)
> But I am sorry to report that this change did not solve the issue
Hmm, I hoped it would "just" work. So it seems it's not just the Aggregator module in the way. Is there a dev server I can get admin access to so I can try to debug it there?
Comment #23
dddave commented:
Best catch tvn on #drupalorg for such requests.
Comment #24
tvn commented:
Hi Amitai, you can use the following dev site:
http://links-drupal.redesign.devdrupal.org/
(I created it for this issue https://drupal.org/node/2125757, but no one used it yet and the 2 issues should not interfere with each other anyway).
Here are some instructions on how to work on our dev server: https://drupal.org/node/1018084
I added both of your SSH keys already.
Comment #25
amitaibu commented:
@tvn, thank you for the dev site.
Some insights:
I've changed the RSS link to an Atom link, now served from http://www.gizra.com/atom-drupal.xml. Clicking on update items gives me an error, and re-clicking works fine. So it seems that the URL is valid, but sometimes, for some reason, it chokes.
I've done the same test on Lin's link and got the same behavior -- sometimes it errors, sometimes it works.
Comment #26
amitaibu commented:
@dddave
Can you please try the change to http://www.gizra.com/atom-drupal.xml (with the known issues as mentioned in #25)
Comment #27
dddave commented:
I am sad to report that I still get this error consistently even after trying multiple times.
Comment #28
amitaibu commented:
@dddave,
When you try to re-import Lin's feeds -- does it work ok?
Comment #29
amitaibu commented:
Debugging more locally, I think I've spotted the problem:
Occasionally we are getting a 302 response, and the new location is extracted from
$location = $result->headers['location'];
However, the location returned by GitHub is /drupal-atom.xml.
So on the next call to drupal_http_request() the URI isn't correct -- it has a path but no scheme or host.
Comment #30
amitaibu commented:
Here's the related core issue #164365: drupal_http_request() does handle (invalid) non-absolute redirects (RFC 7231)
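One way to cope with the relative redirect described in #29 is to resolve the Location value against the URL that was just requested before following it. The following is a minimal Python sketch for illustration only; it is not the patch from the core issue, and resolve_redirect is a hypothetical helper name.

```python
from urllib.parse import urljoin, urlsplit


def resolve_redirect(request_url, location):
    """Turn a possibly relative Location header into an absolute URL."""
    if urlsplit(location).scheme:
        # Already absolute: use it unchanged.
        return location
    # Relative: inherit the scheme and host (and base path) of the request URL.
    return urljoin(request_url, location)
```

For the case in #29, resolve_redirect("http://www.gizra.com/atom-drupal.xml", "/drupal-atom.xml") yields "http://www.gizra.com/drupal-atom.xml", which has the scheme and host that the next request needs.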
Comment #31
amitaibu commented:
And here's a blog post about the 302 response from Github
Comment #32
dddave commented:
Lin's feed was still throwing the error yet had content from January fetched. I emptied and refetched which threw the error and didn't catch the content. #meh
I am on vacation until next week so I won't be able to help out here for a while.
Comment #33
amitaibu commented:
I have followed https://help.github.com/articles/setting-up-a-custom-domain-with-pages so now my Github page is using CDN.
This means the issue should be solved, as Drupal shouldn't be getting a 302. I have tried it on the dev site, and got no error.
@tvn can you please re-add Gizra to the Drupal planet?
Comment #34
dddave commented:
You are on, were the whole time. The issue is that you didn't get aggregated correctly. Which is the feed url we should use btw. Currently it uses the one provided in #26. It also seems your images are broken upon aggregation: https://drupal.org/aggregator/sources/552
edit: The error is indeed gone. Just hard refreshed your feed.
Comment #35
amitaibu commented:
Hi @dddave, how was the vacation? :)
> Which is the feed url we should use btw.
Can you change it back to http://www.gizra.com/taxonomy/term/1/all/feed/ please
Comment #36
dddave commented:
I had a very good time.
The feed is working fine now; only the images are broken, but #2030877: Review Planet posts for relative URLs could be the cause. Let's discuss any other issues directly in the issue regarding your feed. I'll link to this issue on the Planet docs btw.
Comment #37
amitaibu commented:
Thanks! I've pushed a fix for the images.
Comment #38
darrenmothersele commented:
I updated my feed a) so it doesn't redirect, and b) to add the .xml extension so it serves with the correct content type from GitHub pages.
Yesterday it was serving without a redirect, but today I noticed this:
curl -I http://darrenmothersele.com/drupal-planet.xml
Then I tried, for comparison:
curl -I http://www.gizra.com/taxonomy/term/1/all/feed/
Then I tried again,
curl -I http://darrenmothersele.com/drupal-planet.xml
Second time running it I always get a 200 direct reply. But sometimes I get 302?
Does fixing the DNS to use ALIAS always reply with a 200? If so then I'll have to move to a different DNS provider as it's not supported where I am now.
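To check whether the 302 is intermittent, the manual curl -I runs above could be repeated and tallied. This is a rough Python sketch under stated assumptions: head_status and tally_statuses are hypothetical helper names, the HEAD request is sent without following redirects, and nothing here is taken from Drupal or GitHub tooling.

```python
import http.client
from collections import Counter
from urllib.parse import urlsplit


def head_status(url, timeout=10):
    """Send one HEAD request without following redirects; return the status code."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        conn.request("HEAD", parts.path or "/")
        return conn.getresponse().status
    finally:
        conn.close()


def tally_statuses(url, attempts=10, fetch=head_status):
    """Count how often each status code comes back over several attempts."""
    return Counter(fetch(url) for _ in range(attempts))
```

Calling tally_statuses("http://darrenmothersele.com/drupal-planet.xml") a few times over a day would show how often a 302 appears next to the 200 responses. The fetch parameter exists only so the tallying can be exercised without network access.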
Comment #39
amitaibu commented:
> Does fixing the DNS to use ALIAS always reply with a 200?
On Github we actually removed the ALIAS and use CNAME instead - and it fixed the problem.
Comment #40
darrenmothersele commented:
This could be an issue?
http://joshstrange.com/why-its-a-bad-idea-to-put-a-cname-record-on-your-...
Comment #41
darrenmothersele commented:
I decided to move off GitHub Pages. As a side effect, hopefully that fixes this issue.
Comment #43
kostajh at Savas Labs commented:
Quick note for anyone else who has a blog on GitHub pages that they want on Drupal Planet and is running into the "missing schema" error. We resolved the issue by proxying the RSS feed on GitHub pages through FeedBurner. Not ideal but it works. See https://www.drupal.org/node/2553551#comment-10252437 for more info.