When trying to add the http://www.youtube.com/rss/global/top_viewed_today.rss feed through the leech form, no result is loaded.
When trying the same URL from a browser, the normal content is loaded. I checked the return value of leech_connection() in this case and found that it was NULL.
(no blacklist entries)

Comments

and-1’s picture

Title: leech_connection() returns NULL. But browser returns normal feed. » mime=text/plain

Just found out that the result's mime type was "text/plain", which isn't in the check list in leech_connection().
Is there any reason why "text/plain" is omitted?

and-1’s picture

Title: mime=text/plain » mime=text/plain||application/octet-stream

The same goes for application/octet-stream feeds...
This raises a question: is it worth checking the file's mime type at all, and if so, why?

alex_b’s picture

and,

You are talking about this check in leech_connection(), right?

if (!in_array($mime, array('text/xml', 'application/xml', 'text/html', 'application/rss+xml', 'application/atom+xml', 'application/rdf+xml', 'application/opml+xml'))) {
  return;
}
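
To make the reported behaviour concrete, here is a minimal sketch of the same whitelist check as a standalone function. The name example_mime_is_accepted() is hypothetical and not part of leech, and stripping parameters such as "; charset=utf-8" from the header value is an assumption about what the caller passes in, not something confirmed in this thread.

```php
<?php
// Hypothetical helper, not part of leech: mirrors the whitelist quoted above.
function example_mime_is_accepted($content_type) {
  $accepted = array(
    'text/xml',
    'application/xml',
    'text/html',
    'application/rss+xml',
    'application/atom+xml',
    'application/rdf+xml',
    'application/opml+xml',
  );
  // Assumption: the raw Content-Type header value is passed in, so drop any
  // parameters (e.g. "; charset=utf-8") and normalise the case first.
  $parts = explode(';', $content_type);
  $mime = strtolower(trim($parts[0]));
  return in_array($mime, $accepted);
}

var_dump(example_mime_is_accepted('application/rss+xml; charset=utf-8')); // bool(true)
var_dump(example_mime_is_accepted('text/plain'));                         // bool(false)
```

With this list, a feed served as text/plain is rejected, which would match the NULL result described in the report.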

I assume that you just added text/plain and application/octet-stream to the list - what was your experience with that? Did you run any tests?

We are looking at the problem of complying with standards vs. accepting as many kinds of feeds as possible.

My gut tells me that it will be fine to add text/plain, but I am kind of reluctant to allow application/octet-stream - do you have an example of a feed with this mime type declared?

Also, I would like to see what Aron thinks about this issue.

Aron Novak’s picture

As you may have noticed, the -dev versions of leech already accept the text/plain mime type.
My opinion in this case:
http://inamidst.com/rss1.1/
On this site I read the following:

RSS 1.1 documents SHOULD be served with a media type of "application/rss+xml", or MAY be served with a media type of "application/rdf+xml". Other values MUST NOT be used.

For example, RSS 1.1 feeds may only use these two mime types, so leech is already very permissive. But leech won't accept feeds that claim to contain binary data! (http://www.freesoft.org/CIE/RFC/1521/32.htm).

The "application" Content-Type is to be used for data which do not fit in any of the other categories, and particularly for data to be processed by mail-based uses of application programs.

So the RFCs are very clear about how the various mime types are to be used.
To be honest, I should drop support for the text/plain mime type.

alex_b’s picture

Status: Active » Fixed

The current solution seems to be adequate:

* add text/plain for a more resilient behaviour of leech
* don't add binary formats - unless somebody can come up here with a good list of binary feeds - then we could think about adding an option on the settings page
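
The two points above could be sketched roughly as follows. This is only an illustration: example_allowed_mime_types() and its $extra_types parameter are hypothetical, standing in for a value that a settings page might store, and no such option exists in leech as far as this thread shows.

```php
<?php
// Hypothetical sketch, not leech code: the default list plus text/plain,
// with optional extra types that a site admin would have to opt into.
function example_allowed_mime_types(array $extra_types = array()) {
  $defaults = array(
    'text/xml',
    'application/xml',
    'text/html',
    'application/rss+xml',
    'application/atom+xml',
    'application/rdf+xml',
    'application/opml+xml',
    'text/plain', // added for more resilient behaviour
  );
  // Normalise whatever was entered in the (assumed) setting.
  $extra = array();
  foreach ($extra_types as $type) {
    $type = strtolower(trim($type));
    if ($type !== '') {
      $extra[] = $type;
    }
  }
  return array_values(array_unique(array_merge($defaults, $extra)));
}

$strict = example_allowed_mime_types();
$lenient = example_allowed_mime_types(array('application/octet-stream'));

var_dump(in_array('text/plain', $strict));                // bool(true)
var_dump(in_array('application/octet-stream', $strict));  // bool(false)
var_dump(in_array('application/octet-stream', $lenient)); // bool(true)
```

Keeping binary types out of the defaults means they are only accepted when someone deliberately enables them.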

Issue is fixed in 4.7.x and 5.x versions.

alex_b’s picture

Status: Fixed » Closed (fixed)

in 5.x-1.7