CVS edit link for twistor
I have been developing with Drupal for a couple of years now, but I have never contributed anything back. Today, I wrote a plugin for the Feeds module that allows someone to run XPath queries against an HTML or XML document. My primary motivation is to scrape webpages in a sane manner and create nodes from the data. The Feeds module provides such an elegant solution to this that I just have to share it. Regex support is also planned for this module, if anyone feels like being masochistic.
In the near future, I have two other modules in the works. One is an API module that wraps different message queue APIs, such as STOMP, zeromq, activemq, etc., into a generic API. I'm currently looking at both the Queue module and the Messaging module to see if it would make more sense to implement it on top of one of those, but the application is different enough that I doubt it. I'm watching Pipe Dream with bated breath.
The second is what I would call a Comet API. I am working on one that wraps different Comet implementations, such as Orbited and APE. I would also like to use this to implement WebSockets and SSE, should servers for those become stable. The goal is to make it as invisible as possible by tying into the existing AJAX and AHAH APIs and simply circumventing the normal request process. These realtime technologies are happening, and it seems to me that Drupal is missing out.
Comments
Comment #1
twistor commented
Here's the module.
Comment #2
twistor commented
Note: I am aware of the Scraper module, but that seems to be a bit of a dud.
Comment #3
twistor commented
Comment #4
avpaderno
Hello, and thanks for applying for a CVS account.
As stated in the CVS application requirements, the proposed module must not duplicate the work done for an existing project. Could you describe the differences between the proposed module and the existing project? (The Drupal version for which the module is created is not a difference we are interested in.)
Comment #5
twistor commented
The Scraper module allows you to specify a URL and a beginning and end point of a page. It then takes that section of the page and puts it into a block.
First off, my module allows for querying any XML/HTML document, not just webpages. They could be local files as well, or anything that the Feeds module can provide. Second, you can specify an arbitrary number of parts to pull from a page and put those into fields. Finally, this module is a plugin; it's built on top of existing modules for maximum code reuse.
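To give a rough idea of what "an arbitrary number of parts put into fields" means, here is a minimal sketch. It is not the module's code (the module is a Feeds plugin written for Drupal); it uses Python's standard-library `xml.etree.ElementTree`, which supports only a subset of XPath and requires well-formed markup. The function name `extract_fields`, the sample document, and the field names are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; any well-formed XML/XHTML source would do.
SAMPLE = """
<html>
  <body>
    <div class="article">
      <h1>First post</h1>
      <p class="author">alice</p>
    </div>
    <div class="article">
      <h1>Second post</h1>
      <p class="author">bob</p>
    </div>
  </body>
</html>
"""


def extract_fields(document, context_query, field_queries):
    """Run one XPath query per field against each item matched by context_query."""
    root = ET.fromstring(document)
    items = []
    for node in root.findall(context_query):
        item = {}
        for field, query in field_queries.items():
            match = node.find(query)
            # Fields whose query matches nothing are left as None.
            item[field] = match.text if match is not None else None
        items.append(item)
    return items


# Each matched <div class="article"> becomes one item with two fields.
rows = extract_fields(
    SAMPLE,
    ".//div[@class='article']",
    {"title": "h1", "author": "p[@class='author']"},
)
```

Each entry in `rows` is a dictionary keyed by field name, which is roughly the shape of data that would then be mapped onto node fields.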
Comment #6
avpaderno
Comment #7
twistor commented
Updated module. Not quite as easy to break things. Added support to choose between XML and HTML.
Comment #8
twistor commented
Comment #9
twistor commented
Now you can choose which fields output raw XML/HTML. More error checking. Support for leaving a field blank.
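For illustration only, the difference between a raw field, a plain-text field, and a blank field could be sketched as follows. This is an assumption about the behavior described, not the module's implementation; it uses Python's standard-library `xml.etree.ElementTree`, and the names `extract`, `raw_fields`, and `DOC` are hypothetical.

```python
import xml.etree.ElementTree as ET


def extract(document, queries, raw_fields=()):
    """Return one value per field: raw markup, plain text, or an empty string."""
    root = ET.fromstring(document)
    result = {}
    for field, query in queries.items():
        if not query:
            # A blank query means the field is intentionally left empty.
            result[field] = ""
            continue
        match = root.find(query)
        if match is None:
            # Nothing matched: treat it as empty rather than failing.
            result[field] = ""
        elif field in raw_fields:
            # Serialize the matched element with its markup intact.
            result[field] = ET.tostring(match, encoding="unicode").strip()
        else:
            # Plain text only: concatenate all text inside the element.
            result[field] = "".join(match.itertext()).strip()
    return result


# Hypothetical input document.
DOC = "<item><title>Hello</title><body><p>Some <b>bold</b> text</p></body></item>"

out = extract(
    DOC,
    {"title": "title", "body": "body/p", "summary": ""},
    raw_fields={"body"},
)
```

Here `out["body"]` keeps its markup, `out["title"]` is plain text, and `out["summary"]` stays empty because its query was left blank.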
Comment #10
meatbag commented
http://drupal.org/project/feeds_xmlparser
Here is an existing module which seems to do the same thing.
I suggest that you contribute there.
Comment #11
avpaderno
http://drupal.org/project/feeds_xmlparser is not hosted on Drupal.org; it would be more difficult for twistor to contribute to that project.
Comment #12
avpaderno
Those strings are not translated, as all the strings appearing on the user interface should be. The description should be .
Thank you for your contribution! I am going to update your account.
These are some recommended readings to help with excellent maintainership:
You can find more contributors chatting on the IRC #drupal-contribute channel. So, come hang out and stay involved.
Thank you, also, for your patience with the review process.
Anyone is welcome to participate in the review process. Please consider reviewing other projects that are pending review. I encourage you to learn more about that process and join the group of reviewers.
I thank all the dedicated reviewers as well.
Comment #13
tobbe_s commented
No, the description was correct to begin with. It should be "Queries an XML or HTML document using XPath.". "a" is only used before consonant sounds, whereas "an" is used before vowel sounds. Compare "an hour", "a European" and "an X-ray".
Comment #16
avpaderno