Support from Acquia helps fund testing for Drupal Acquia logo

Comments

Dries’s picture

1. This patch does not conform Drupal's coding style at all.

2. The error handling of aggregator_import() is bare bones (not helpful).

3. Use drupal_xml_parser_create() instead of xml_parser_create().

4. Drop the "sorry"'s.

5. form_set_error() is used incorrectly.

Please refine and resubmit.

Dries’s picture

Matt_T_hat’s picture

Category: feature » bug
Priority: Normal » Critical

No doubt I've just irritated some one but after spending the day finding out how to run diff/patch on win32 and getting that done - guess what!

That's right the form for OPML is tantalisingly there and yet it is beyond me to get it to anything (like submit). The browse button gets the page to hang for a few moments but that is all.

robertDouglass’s picture

Would be nice if someone could revive this code.

Matt_T_hat’s picture

A-Ha! I got it to work.

The OPML list must have been downloaded and can then be uploaded (via browse) and then it will work. Very well.

arnabdotorg’s picture

Assigned: arnabdotorg » Unassigned

Too lazy|busy to handle this, it would be nice to see this come through.

Robin Monks’s picture

Assigned: Unassigned » Robin Monks
FileSize
3.25 KB

I'm cleaning this up now, here is what I've got so far:
- Now Uses drupal_xml_phraser_create
- Applies Aganist HEAD
- Some Sorry's removed.

To be done:
- Conform to coding style
- Don't give error on existing entry
- Fix form error code

Robin

Robin Monks’s picture

Category: bug » feature
Priority: Critical » Normal

I haven't given up, it's just taking a while.

Robin

Uwe Hermann’s picture

Status: Active » Needs review

I have rerolled the patch, it applies more cleanly to HEAD now. Also, I fixed a bug (I think), $feed['TITLE'] should have been $feed['TEXT'], right? The patch as is works for me. Please lets try to get this into 4.7.

Uwe Hermann’s picture

FileSize
3.35 KB

Forgot the patch.

Robin Monks’s picture

Assigned: Robin Monks » Unassigned

Please make sure this patch checks that the site is not already listed in the .

Also, please reassign this patch to yourself.

Robin

Uwe Hermann’s picture

Assigned: Unassigned » Uwe Hermann

"not listed in the ......"? What is missing?

Prometheus6’s picture

I think he's talking about this:

+        //Adding '@' to bypass the problem of duplicate INSERTS
+        @aggregator_save_feed($edit);
drumm’s picture

Status: Needs review » Needs work

No longer applies.

dayre’s picture

Not a lot of activity on this lately. I'm looking for OPML import functionality for a project i'm working on. The state was changed to "patch (code needs work)" with the last statement by drumm being "No longer applies.".

What no longer applies ? What work still needs to be done with this ? If Uwe doesn't have time, i might as well pick this up, a small TODO list would be helpful too.

Boris Mann’s picture

David -- this means the patch no longer applies to HEAD. Generally, we prepare patches for that, and then "backport" to the current release. This is a feature, so it may no longer get into 5.0, but shouldn't be too many variants.

The "To Do's" seem to be:
* Conform to coding style
* Don't give error on existing entry
* Fix form error code

Egon Bianchet’s picture

Subscribing ...

Egon Bianchet’s picture

Version: x.y.z » 6.x-dev

Maybe in Drupal 6? ;-)

budda’s picture

Thanks to the original patch XML processing code, I've just worked OPML import/export functionality in to FeedParser for 4.7.

Pancho’s picture

Move this to the D7 queue.

cburschka’s picture

Ugh. This is 4.6.x code - form_file(), form_submit(), etc. It will be fun to get it up to date.

cburschka’s picture

Version: 6.x-dev » 7.x-dev
Assigned: Uwe Hermann » cburschka
Status: Needs work » Needs review
FileSize
5.27 KB

... and by fun, I apparently mean easy. Seriously, that was all?

I'm pretty sure that the XML parsing is not up to Drupal standards, however.

The attached prototype patch allows both a local file upload and a remote URL (to be downloaded only once). It does not properly validate, however, and it also does not protect against duplicates.

It works like a charm, though - I imported ~400 feeds from a single OPML file to test. It uses drupal_execute to programmatically submit the feed addition form for each feed in the file.

TODO:
- Protect against those duplicate key inserts by checking for duplicate items.
- Validate the file, make sure that at least one of the two (upload or remote URL) are entered
- Make sure that the XML parsing does not go against ordinary Drupal style

cburschka’s picture

The new patch checks for duplicates before adding.

Robin Monks’s picture

Aside from the dupe checking, this worked great!

Robin

Robin Monks’s picture

Please note, in the previous post I tested Patch #22.

Robin

cburschka’s picture

Mh... could you also test #23, then? :)

cburschka’s picture

Here is a new patch.

cburschka’s picture

Forgot to add: The patch renames some variables, validates that the user either enters a URL or uploads a file, and improves error handling. An OPML outline can be valid without containing anx XML feeds, which will be reflected in the messages.

I have been able to add 208 feeds in one go with this patch in HEAD.

Alex UA’s picture

I tried this patch on my local machine (XP runnign XAMP) and I received the following error:

* notice: Undefined index: aggregator_form_opml in C:\xampp\htdocs\drupal7\includes\form.inc on line 345.
* warning: call_user_func_array() [function.call-user-func-array]: First argument is expected to be a valid callback, 'aggregator_form_opml' was given in C:\xampp\htdocs\drupal7\includes\form.inc on line 360.
cburschka’s picture

In Drupal 7, you need to rebuild the code registry if a new form is added. Disable and re-enable the aggregator module.

Alex UA’s picture

That worked, thanks! This is a really great feature- I hope it makes it into this release...

Dries’s picture

Status: Needs review » Needs work

I think this looks good. I've tested the patch and managed to import ~200 feeds in one swoop. Very nice.

A couple of requests though:

- There are a couple of coding style violations but nothing I can't fix when committing this patch. I'd recommend that we run it through the coder module.

- It would be useful if you could add a one paragraph context-sensitive help text on ?q=admin/content/aggregator/add/opml. I bet you that most people are not familiar with OPML so I think it is important to clue them in. The help text can be short and to the point.

- I'd really like to see us ship some tests with this. Do you think you can write a couple of SimpleTests? It would make this patch fly straight in. :-)

Thanks!

cburschka’s picture

Status: Needs work » Needs review
FileSize
7.99 KB

I think I've managed to get all the code-style issues. The D6 coder is unfortunately a bit hit-and-miss there - it has the old concatenation policy.

Also, there's some help text:

OPML is an XML format used to to exchange multiple newsfeeds between aggregators. A single OPML document may contain a collection of many feeds. Drupal can parse such a file and import all feeds at once, saving you the effort of adding them manually. You may either upload a local file from your computer or enter a URL where Drupal can download it.

There are some other changes:
- renamed _aggregator_parse_opml to _aggregator_parse_opml_feeds to clarify that the function specifically extracts newsfeeds (OPML can contain much more data than that)
- doxygen on _aggregator_parse_opml_feeds
- added an "or" between the upload and URL fields to mark them as alternative
- if no categories are created yet, note that all imported feeds could be added to the same category (it's annoying to find this out later, because afterward they have to be classified manually)
- got rid of the fallback method of prepending <?xml version="1.0" ?>. Seems like the parser works without this line now (perhaps PHP5 only? But D7 is PHP5+ anyway).

The only thing remaining are the simpletests. I've never worked with unit tests before, so I'll need to learn a bit.

I assume these tests will be a single OPML test case. I'll open the form, submit it empty, with an uploaded file and with a URL, check that all feeds get added, and make sure that duplicates are removed. Would that be all?

Here's the patch with everything but the tests.

keith.smith’s picture

Status: Needs review » Needs work
+  $form['upload'] = array(
+    '#type' => 'file',
+    '#title' => t('OPML File'),
+    '#description' => t('Upload an OPML file containing a list of newsfeeds to be imported.'),
+  );

Everywhere else (AFAIK) we just refer to feeds (rather than newsfeeds).

+  $form['remote'] = array(
+    '#type' => 'textfield',
+    '#title' => t('OPML Remote URL'),
+    '#description' => t('Enter the URL of an OPML file on the web. This file will be downloaded only once.'),
+  );

If its a URL it kinda' implies that it exists on the web. Is the second part true or is is downloaded every time this form is submitted?

+
+  $form['refresh'] = array(
+    '#type' => 'select',
+    '#title' => t('Update interval'),
+    '#default_value' => 3600,
+    '#options' => $period,
+    '#description' => t('The length of time between feed updates. (Requires a correctly configured <a href="@cron">cron maintenance task</a>.)', array('@cron' => url('admin/reports/status'))),
+  );

Thanks for following the D6 precedent on the cron maintenance task stuff!

+    $form['category'] = array(
+      '#type' => 'checkboxes',
+      '#title' => t('Categorize news items'),
+      '#options' => $options,
+      '#description' => t('New feed items are automatically filed in the checked categories.'),
+    );

This is the other example of "news items" in one place and "feed items" in another.

+    $form['category'] = array(
+      '#type' => 'item',
+      '#value' => t('If you had created any <a href="@category">categories</a>, you could choose to assign all imported feeds to one of them.', array('@category' => url('admin/content/aggregator/add/category'))),
+    );

I'm uncertain on the tense of this one but it seems like it would be "If you have created any..." or just "Imported feeds can be assigned to categories."

+ * Parses an OPML file

Needs a period at the end.

+ * @return
+ *   An array of feeds, each an associative array with a title and 
+ *   a url element, or NULL if the OPML document failed to be parsed. 

url -> URL

+      return '<p>' . t('<acronym title="Outline Processor Markup Language">OPML</acronym> is an XML format used to to exchange multiple newsfeeds between aggregators. A single OPML document may contain a collection of many feeds. Drupal can parse such a file and import all feeds at once, saving you the effort of adding them manually. You may either upload a local file from your computer or enter a URL where Drupal can download it.') . '</p>';

I'm not sure we use acronym anywhere else but I like it. There is a "to to" in here. Again, it talks about news feeds rather than just feeds.

Regardless of these minor style issues, I like OPML a lot!

cburschka’s picture

Status: Needs work » Needs review
FileSize
8.19 KB

I've fixed "newsfeeds" in 1) and the redundancy/ambiguity in 2).

4) merely repeats text from the normal feed addition form, so I suggest either fixing that now, or both after this is added, but in a separate patch either way. That would be a pretty large-scale issue, incidentally:

$ grep 'feed items' -R *|wc -l
23
$ grep 'news items' -R *|wc -l 
7

Clarified the description for 5) (it's specifically advising the user to create categories before importing the feeds, because this is the one time that >200 feeds can be mass-categorized) and also added some more detail to the doxygen description besides a period in 6).

7) I'm not sure about 'URL' vs 'url' in the doxygen. Note that we're talking about a literal key in an array, ie $item['url'], not $item['URL']. I've emphasized these literals to make that more clear.

8) "to to" and "newsfeeds" fixed.

Dries’s picture

The only thing remaining are the simpletests. I've never worked with unit tests before, so I'll need to learn a bit. I assume these tests will be a single OPML test case. I'll open the form, submit it empty, with an uploaded file and with a URL, check that all feeds get added, and make sure that duplicates are removed. Would that be all?

That's correct. The goal of the test cases is to make sure that all of the functionality works, and that all of the functionality keeps working going forward. You've written enough tests for it, when they give you that level of confidence. Writing tests is fun -- I really encourage you to give it a try.

keith.smith’s picture

Status: Needs review » Needs work
$ grep 'feed items' -R *|wc -l
23
$ grep 'news items' -R *|wc -l
7

Good note, which led me to do the same grep (the last one) without the wc -l

'news items' occurs

  1. once in the CHANGELOG.txt (old notes about changes to historical versions)
  2. in a title in aggregator.admin.inc
  3. in a code comment in aggregator.module
  4. in a title in aggregator.module
  5. in a drupal_set_message in aggregator.module
  6. in a message in aggregator.test
  7. in a message in what was an outdated file on my system (.#aggregator.test.1.3)

I'm happy to roll a patch for 2, 3, 4, 5, and 6, but there really aren't that many instances of this in user-facing text now, once the instances are fixed in this patch. There is also the permission which is now arguably misnamed but that certainly is another issue.

+      '#value' => t('No <a href="@category">feed categories</a> exit. You may want to create one now, because this is the only time all new feeds can be categorized at once.', array('@category' => url('admin/content/aggregator/add/category'))),

Regarding the current patch, exit should be exist. Other than that, everything else looks good on a quick read-through.

cburschka’s picture

Status: Needs review » Needs work

I'm not quite up to the kind of empirical coding required to make a new test case, I'm afraid. I've made the tests and managed to get some of them to work. But for some reason, the tests started consistently crashing, and all HTTP requests started returning 0-length responses after the first few succeeded. Here's my test class; perhaps a few errors can be pointed out.

[test class edited out as it is now in the patch below]

cburschka’s picture

Status: Needs work » Needs review
FileSize
15.94 KB

After hours of work, I'm beginning to get convinced that the fatal error is a fault of simpletest. Here's the patch, and I'm done debugging that test case. :(

catch’s picture

Status: Needs work » Needs review

I don't get the fatal error when runing simpletest.

Looks to me like the example OPML files should be created by the patch in /modules/aggregator somewhere instead of in the patch itself (save those /tmp errors).

cburschka’s picture

The content of the sample OPML files are generated dynamically, including random names. Besides, the only reason the file_save_data fails is that I forgot the TRUE in the file_check_directory above, which means the folder is not created.

I still can't get the tests to run, but the file saving errors are gone.

catch’s picture

This takes me down to 8 fails and 1 exception - tests run fine for me still (and no file_save_data errors this time).

First error is

POST to admin/content/aggregator/add/opml, response is 0 bytes.

Likely that's the cause of the others. No immediate ideas why it's failing though.

cburschka’s picture

Interestingly, that is the same part that is failing for me - only it's more than one request. I can only speculate whether the curl timeout is set too low.

Dries’s picture

For me, the tests come to halt when running them. Is that the behavior you are seeing?

catch’s picture

Tests run all the way though for me (with above failures) - this on ubuntu.

cburschka’s picture

For me, the tests come to halt when running them. Is that the behavior you are seeing?

Yes, the batch operation fails after about five minutes with a HTTP 500 error. The test does not run through.

If this is a timeout issue, perhaps I can fix it by dividing the test case up into two classes. After all, every class runs through in one batch step.

Dries’s picture

Arancayter, I don't think a single test should run for 5 minutes. I means something is broken. Time-permitting, I'll try to install xdebug (on MAMP) to see where the time is spent or where we are being blocked. If someone beats me to it, great. I'd really like to see this patch land.

Thanks for your persistence and your willingness to learn SimpleTest.

cburschka’s picture

Well, I started splitting he test case up anyway to narrow down the problem, and found the request that causes the test to crash:

    $badfields = t('You must <em>either</em> upload a file or enter a URL.');
    $badurl    = t('This URL is not valid.');

    $form = array(
      'files[upload]' => '',
      'remote'        => '',
    );

    $this->drupalPost('admin/content/aggregator/add/opml', $form, t('Import'));
    $this->assertRaw($badfields, t('Error if no fields are filled.'));
    $form['files[upload]'] = $this->path . '/goodopml.xml';

Commenting out the POST will let this function finish normally. There are a lot of failures, but the test runs.

I'm wondering if an empty post array may be the culprit. I could submit another non-required field to make sure something is being sent...

Edit: Even after removing this, I'm still getting 0-length curl responses. However, I tend to get those on other tests in core, so it's possible that that's an issue on my server only.

I would really love it if drupal_http_request could take over the request handling in SimpleTest. Perhaps it's because I use PHP as CGI, but curl is pretty unusable on my site.

catch’s picture

I'm sure there's plenty of tests with empty post arrays already, so it shouldn't be that. I get weird issues with drupalPost sometimes, but they're hard to pin down, and I don't seem to get this specific one.

mustafau’s picture

Updated the patch.

Simplified some of the code in aggregator.admin.inc.
Refactored tests. There are two test cases and both are successful now.

mustafau’s picture

Small improvements over the last patch.

Test cases merged.
XML parser now calls xml_parser_free().

Dries’s picture

Status: Needs review » Needs work

With the latest patch, the tests no longer come to halt. However, there is a notice: "Undefined variable: cat_title".

Note that we don't abbreviate words in Drupal code -- 'cat' should probably be 'category' unless you mean a 'cat'? ;-) Same with '$res'.

Some messages are too cryptic (i.e. ''Category field if categories exist.'). It would be great if we could make these a bit more human-friendly.

Is 'NATURAL JOIN' ANSI SQL? Will it work on PostgreSQL?

url='%s' should be url = '%s' (spaces).

We don't capitalize each word in a sentence (i.e. ""Test Category" should be "Test category").

mustafau’s picture

Status: Needs work » Needs review
FileSize
15.19 KB

Updated.

"Category field if categories exist." is now "Looking for category field."
I am using PostgreSQL and NATURAL JOIN did work. Anyway I have changed it to "LEFT JOIN".

Dries’s picture

Status: Needs review » Fixed

Looks good and the tests pass. Committed to CVS HEAD. Thanks.

mustafau’s picture

Status: Fixed » Needs review
FileSize
1.56 KB

I have added a redirect destination to the OPML import form and made small improvements to the code.

mustafau’s picture

FileSize
1.83 KB

We should validate URLs too.

Dries’s picture

Status: Needs review » Fixed

Committed to CVS HEAD.

Anonymous’s picture

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for two weeks with no activity.