Is it possible to import the content (HTML and table data) of an external HTML page on a completely different server (a cross-domain URL) with Sheetnode?

An example of a page to import would be:
http://finance.yahoo.com/currency-investing
The idea is to import the table on this page into Sheetnode; importing the entire page would also be fine.

Is this possible? If so, how?
Could you give some tips on how to go about it?

Any help greatly appreciated,
regards

Comments

infojunkie’s picture

The module "Sheetnode HTML" can import a table from an external HTML page. Is this what you're looking for? You need to formulate a QueryPath query (similar to jQuery) to find the right table to import.

ah0’s picture

Thanks a lot for the quick support.
I went through the QueryPath site and tutorials, but I am still unsure how to formulate a QueryPath query for a table such as the one on the example page above: http://finance.yahoo.com/currency-investing

Could you please give an example of such a query?
Thanks,

infojunkie’s picture

Status: Active » Fixed

I tried your example and found that the path was not obvious. I fixed the code to make the path simpler: with the latest dev (get it 12 hours from now or directly from git) you can now use table#flat-rates-table.
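
For anyone unsure what a selector like table#flat-rates-table actually picks out: it matches the <table> element whose id attribute is "flat-rates-table". Below is a minimal sketch of that idea in Python, using only the standard library. It is not QueryPath and not how Sheetnode does the import; the class name, the helper function, and the sample HTML (including the placeholder rows) are all made up for illustration.

```python
from html.parser import HTMLParser


class TableById(HTMLParser):
    """Collect the cell text of the <table> whose id matches target_id."""

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.depth = 0      # 0 = outside the matched table, 1 = directly inside it
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # row currently being built
        self._cell = None   # text fragments of the cell currently being built

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            if self.depth:
                self.depth += 1  # nested table inside the match: skip its cells
            elif dict(attrs).get("id") == self.target_id:
                self.depth = 1   # this is the table the selector would match
        elif self.depth == 1:
            if tag == "tr":
                self._row = []
            elif tag in ("td", "th") and self._row is not None:
                self._cell = []

    def handle_endtag(self, tag):
        if not self.depth:
            return
        if tag == "table":
            self.depth -= 1
        elif self.depth == 1:
            if tag in ("td", "th") and self._cell is not None:
                self._row.append("".join(self._cell).strip())
                self._cell = None
            elif tag == "tr" and self._row is not None:
                self.rows.append(self._row)
                self._row = None

    def handle_data(self, data):
        if self.depth == 1 and self._cell is not None:
            self._cell.append(data)


def extract_table(html, table_id):
    """Return the rows of the table with the given id, or [] if none matches."""
    parser = TableById(table_id)
    parser.feed(html)
    parser.close()
    return parser.rows


# Hypothetical sample page; the values are placeholders, not real rates.
SAMPLE_HTML = """
<html><head><title>Sample page</title></head><body>
<table id="other"><tr><td>ignored</td></tr></table>
<table id="flat-rates-table">
  <tr><th>Pair</th><th>Rate</th></tr>
  <tr><td>AAA/BBB</td><td>1.00</td></tr>
</table>
</body></html>
"""

rows = extract_table(SAMPLE_HTML, "flat-rates-table")
print(rows)
```

If the id does not exist on the page, the function returns an empty list, which is roughly the situation where an import would have a title but no table data.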

ah0’s picture

Thanks again for the help and the effort,
much much appreciated.

Please see attached screen captures. (.gif)

To test it out, this is what I did:
1)
Site Configuration > Sheetnode > General
and set the "View mode" to: HTML Table
and checked: "Show toolbar in view mode"
2)
Content Management > Create Content > Sheetnode Import From HTML Page
and set:
URL: http://finance.yahoo.com/currency-investing
QueryPath: table#flat-rates-table

A sheetnode node then gets created, but all it shows is the title of that page, which is:
"Currency, Currencies & Forex Currency Trading - Yahoo! Finance"

No table is imported.
Am I missing a step or doing something wrong?

Cheers,

infojunkie’s picture

That's weird :-) What version of the QueryPath library are you using? I've been using 2.1.0.

ah0’s picture

I was actually using QueryPath version 2.0, but I have just replaced the files with the new version 2.1.0.
The same thing still happens: only the title of the page is shown.

I can't think of anything else... The structure of the QueryPath folder/files is in the attached screen capture.
Any ideas?

Thanks a lot

ah0’s picture

- Installed the QueryPath module ( http://drupal.org/project/querypath )
- Downloaded the mini version 2.1.0 library from ( https://github.com/downloads/technosophos/querypath/QueryPath-2.1-minima... ) and placed the files under domain.com/modules/querypath/QueryPath (see the screenshot in post #6)

Are these the correct file locations?
Any ideas would be really helpful,
thank you,

infojunkie’s picture

Yes, your installation is correct, since the title is actually retrieved from the page (using QueryPath).

Can you check:
a) whether there is an entry in the {sheetnode} table that corresponds to your newly created node, and if so, what its contents are?

b) whether a JavaScript error prevents the sheet from showing on screen when you view the node?

c) whether the import process also fails on other URLs containing tables?

ah0’s picture

Thanks very much for the help,
It was all my fault: the path to PHPExcel wasn't properly set.
All the best,
thank you,

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.