twistor [twistor] [#835276]

I have been developing with Drupal for a couple of years now, but I have never contributed anything back. Today, I wrote a plugin for the Feeds module that lets you run XPath queries against an HTML or XML document. My primary motivation is to scrape web pages in a sane manner and create nodes from the data. The Feeds module provides such an elegant solution for this that I just have to share it. Regex support is also planned for this module, if anyone feels like being masochistic.
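As a rough illustration of the kind of query this is about, here is a minimal sketch using PHP's built-in DOM extension. It is not the module's code; the sample markup and the `post` class name are assumptions made up for the example.

```php
<?php
// Hypothetical sample markup; any HTML string would do.
$html = '<html><body>'
      . '<div class="post"><h2>First post</h2><a href="/node/1">Read more</a></div>'
      . '<div class="post"><h2>Second post</h2><a href="/node/2">Read more</a></div>'
      . '</body></html>';

$doc = new DOMDocument();
// Real-world HTML is rarely well formed, so collect libxml warnings quietly.
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// Select the heading of every element marked with the assumed "post" class.
$titles = array();
foreach ($xpath->query('//div[@class="post"]/h2') as $node) {
  $titles[] = trim($node->textContent);
}

print implode("\n", $titles) . "\n";
```

Each value collected this way could then be mapped onto a node field by whatever processor sits downstream.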
In the near future, I have two other modules in the works. One is an API module that wraps different message queue APIs, such as STOMP, ZeroMQ, ActiveMQ, etc., in a generic API. I'm currently looking at both the Queue module and the Messaging module to see whether it would make more sense to build on top of one of those, but the application is different enough that I doubt it. I'm watching Pipe Dream with bated breath.
The second is what I would call a Comet API. I am working on one that wraps different Comet implementations such as Orbited and APE. I would also like to use it to implement WebSockets and SSE, should servers for those become stable. The goal is to make it as invisible as possible by tying into the existing AJAX and AHAH APIs and simply circumventing the normal request process. These realtime technologies are happening, and it seems to me that Drupal is missing out.

Comment	File	Size	Author
#9	feeds_xpath_parser-0.3.tgz	2.07 KB	twistor
#7	feeds_xpath_parser-0.2.tar_.gz	1.87 KB	twistor
#1	feeds_xpath_parser.tar_.gz	1.49 KB	twistor

Comments

Comment #1

twistor commented 23 June 2010 at 02:54

Status	File	Size
new	feeds_xpath_parser.tar_.gz	1.49 KB

Here's the module.

Comment #2

twistor commented 23 June 2010 at 02:59

Note: I am aware of the Scraper module, but that seems to be a bit of a dud.

Comment #3

twistor commented 23 June 2010 at 03:11

Status:	Postponed (maintainer needs more info)	» Needs review

Comment #4

avpaderno commented 23 June 2010 at 11:49

Status:	Needs review	» Needs work
Issue tags:		+Module review

Hello, and thanks for applying for a CVS account.

As stated in the CVS application requirements, the proposed module must not duplicate the work done for an existing project. Could you describe the differences between the proposed module and the existing project? (The Drupal version for which the module is created is not a difference we are interested in.)

Comment #5

twistor commented 23 June 2010 at 15:12

The scraper module allows you to specify a URL and a beginning and end point of a page. It then takes that section of a page and puts it into a block.
First off, my module can query any XML/HTML document, not just web pages. The source could be a local file, or anything else that the Feeds module can provide. Second, you can specify an arbitrary number of parts to pull from a page and put them into fields. Finally, this module is a plugin: it's built on top of existing modules for maximum code reuse.
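To make the "arbitrary number of parts" point concrete, here is a small sketch of mapping field names to XPath expressions, each evaluated relative to a context node. The helper name `example_xpath_extract()`, the sample XML, and the field names are all hypothetical and not taken from the module.

```php
<?php
/**
 * Applies a map of field name => XPath expression to every context node.
 * Illustrative helper only; not taken from the module.
 */
function example_xpath_extract($xml, $context_query, array $field_queries) {
  $doc = new DOMDocument();
  $doc->loadXML($xml);
  $xpath = new DOMXPath($doc);

  $items = array();
  foreach ($xpath->query($context_query) as $context) {
    $item = array();
    foreach ($field_queries as $field => $query) {
      // Evaluate each query relative to the current context node.
      $result = $xpath->query($query, $context);
      // A query that matches nothing leaves the field as an empty string.
      $item[$field] = $result->length ? trim($result->item(0)->textContent) : '';
    }
    $items[] = $item;
  }
  return $items;
}

// Assumed sample document: the second book has no price element.
$xml = '<catalog>'
     . '<book id="b1"><title>Alpha</title><price>10</price></book>'
     . '<book id="b2"><title>Beta</title></book>'
     . '</catalog>';

$items = example_xpath_extract($xml, '/catalog/book', array(
  'guid'  => '@id',
  'title' => 'title',
  'price' => 'price',
));
print_r($items);
```

In this sketch every context node becomes one item, so adding another field only means adding another entry to the map.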

Comment #6

avpaderno commented 23 June 2010 at 15:41

Status:	Needs work	» Needs review

Comment #7

twistor commented 23 June 2010 at 23:37

Status:	Needs review	» Needs work

Status	File	Size
new	feeds_xpath_parser-0.2.tar_.gz	1.87 KB

Updated module. It is not quite as easy to break now. Added an option to choose between XML and HTML.

Comment #8

twistor commented 23 June 2010 at 23:37

Status:	Needs work	» Needs review

Comment #9

twistor commented 24 June 2010 at 03:01

Status	File	Size
new	feeds_xpath_parser-0.3.tgz	2.07 KB

Now you can choose which fields output raw XML/HTML. More error checking. Support for leaving a field blank.
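The difference between a plain-text field and a raw XML/HTML field can be illustrated with PHP's DOM extension. This is only a sketch with assumed sample markup; how the module itself implements the option is not shown in this thread.

```php
<?php
$doc = new DOMDocument();
// Assumed sample document with inline markup inside the matched element.
$doc->loadXML('<item><body><p>Hello <b>world</b></p></body></item>');
$xpath = new DOMXPath($doc);

$node = $xpath->query('/item/body/p')->item(0);

// Text value: child markup is dropped, only character data remains.
$text = $node->textContent;
// Raw value: the node is serialized together with its markup.
$raw = $doc->saveXML($node);

print $text . "\n" . $raw . "\n";
```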

Comment #10

meatbag commented 27 June 2010 at 09:05

http://drupal.org/project/feeds_xmlparser

Here is an existing module which seems to do the same thing.
I suggest that you contribute there.

Comment #11

avpaderno commented 27 June 2010 at 11:39

http://drupal.org/project/feeds_xmlparser is not hosted on Drupal.org; it would be more difficult for twistor to contribute to that project.

Comment #12

avpaderno commented 27 June 2010 at 12:19

Status:	Needs review	» Fixed

  $info['FeedsXPathParser'] = array(
    'name'        => 'XPath parser',
    'description' => 'Queries an XML or HTML document using XPath.',

Those strings are not translated; all strings that appear in the user interface should be. The description should be "Queries a XML or HTML document using XPath."

Thank you for your contribution! I am going to update your account.
These are some recommended readings to help with excellent maintainership:

You can find more contributors chatting on the IRC #drupal-contribute channel. So, come hang out and stay involved.
Thank you, also, for your patience with the review process.
Anyone is welcome to participate in the review process. Please consider reviewing other projects that are pending review. I encourage you to learn more about that process and join the group of reviewers.

I thank all the dedicated reviewers as well.

Comment #13

tobbe_s commented 4 July 2010 at 13:32

The description should be "Queries a XML or HTML document using XPath".

No, the description was correct to begin with. It should be "Queries an XML or HTML document using XPath." The article "a" is used only before consonant sounds, whereas "an" is used before vowel sounds. Compare "an hour", "a European" and "an X-ray".

Comment #14

18 July 2010 at 13:40

Status:	Fixed	» Closed (fixed)
Issue tags:	-Module review

Automatically closed -- issue fixed for 2 weeks with no activity.

Comment #15

18 July 2010 at 13:40

Issue tags:	+Module review

Restoring issue tags, see #2125755: System messages removed all issue tags during D7 upgrade.

Comment #16

avpaderno commented 3 November 2018 at 19:22

Component:	Miscellaneous	» new project application
Assigned:	Unassigned	» avpaderno