Support for Drupal 7 is ending on 5 January 2025—it’s time to migrate to Drupal 10! Learn about the many benefits of Drupal 10 and find migration tools in our resource center.
One of the mappers that never quite made it to FeedAPI was a Imagefield/Filefield mapper.
The clostest patch that almost made it is: #319538: Mapper for FileField / ImageField
Especially with the ability to use Feeds as an import engine, a mapper like this would be a lifesaver for someone trying to import many images or files into a drupal site.
Comment | File | Size | Author |
---|---|---|---|
#36 | 624088-36-feeds_file.patch | 23.95 KB | alex_b |
#35 | 624088-35-feeds_file.patch | 6.66 KB | alex_b |
#34 | 624088-34-feeds_file.patch | 23.52 KB | David Goode |
#28 | 624088-27_filefield.patch | 23.56 KB | alex_b |
#19 | 624088-19_filefield.patch | 18.93 KB | alex_b |
Comments
Comment #1
Gerard McGarry CreditAttribution: Gerard McGarry commentedCan I just add my voice to this request? I've been experimenting with FeedAPI to create a news aggregation site using Drupal. In order to create an even remotely interesting layout, importing images from the feed and refactoring them for Imagefield would be a critical part of the functionality.
How difficult would it be to port the ImageGrabber module across to work with Feeds? I'd be happy to help testing this against one of my test servers if that's of any use.
Comment #2
pbuyle CreditAttribution: pbuyle commentedHere is a patch with a filefield mapper and its test case.
The mappers assumes a source that provides values as array of array of array of enclosures as done by FeedsSimplePieParser. It is more flexible when it comes to the what en enclosure is as it will accept objects, arrays and strings. For an object, it assume the link properties is the URL of the enclosure. For an array, it does the same with the link key. It assume a string to be the direct URL. Relative URLs are assumed to be local files and are not fetched using HTTP (or other where ini_get('allow_url_fopen') is true).
The test case needs more work since it doesn't cover all those possibilities.
Comment #3
pbuyle CreditAttribution: pbuyle commentedHere is a more complete and fixed patch.
- the test case check for filename (without extension) in created node edit pages.
- dpm() and tabbed indents have been removed
- Support both SimplePie_Enclosure and FeedResultEnclosure (see #641522: Consolidate import stage results and provide generalized enclosure class).
Comment #4
mylos CreditAttribution: mylos commentedI downloaded this patch and I tested it on my site. I have CSV file with few columns and one for images (image_url) so I have URL of image. So tell me please is possible to get those picture from that URL, download it and upload to node's CCK image_field ?
regards
Milos
PS. I got this error:
warning: Invalid argument supplied for foreach() in /home/pinefactory/pine-chest-of-drawers.com/sites/all/modules/feeds/mappers/filefield.inc on line 36.
Comment #5
pbuyle CreditAttribution: pbuyle commentedIt is not currently possible. The mapper expect the source's value to be an array of arrays of arrays. I don't think the CSV Parser can produce such an array. AFAIK, the only compatible parser is the SimplePie one.
It shouldn't be hard to fix it for your case. I would like to work on it, but I don't have enough time at the moment. An existing test case (using SimpleTest) would help.
Comment #6
jerdavisI needed to get this working with CSV data, so I took the example and made a couple of quick tweaks to allow it to work with a non-array as well. This probably needs a bit of cleanup, but works. It'll also likely need the tests updated.
Comment #7
pbuyle CreditAttribution: pbuyle commentedHere is a version that should accept yet more structures for $value in filefield_feeds_set_target. It first try to resolve it as an enclosure. If it fails but if $value is an array, it the same process is applied on all its items. A enclosure can be an object, an array or a string.
Comment #8
branislav.mikic CreditAttribution: branislav.mikic commentednice work mongolito404,
thank you a thousand times!
Comment #9
mylos CreditAttribution: mylos commentedThanks man,
You are my hero. That import it's working like charm with image cache and CCK.
Look at example: http://www.pine-chest-of-drawers.com
regards and many thanks
Milos
Comment #10
Jordi Bufí CreditAttribution: Jordi Bufí commentedHi everybody,
I've downloaded the patch and applied it to the necessary files, but there isn't a "filefield" option on the select list for sources (i'm on the "mapping for node processor" page.
I have also created an image filefield on the "feed" node type but it isn't shown on the target select list.
How is it supposed to work?
Comment #11
BWPanda CreditAttribution: BWPanda commentedWorking for me too.
Comment #12
rjbrown99 CreditAttribution: rjbrown99 commentedThis is working for me. I'm doing CSV imports of URLs to the field, and it is successfully going out and grabbing the remote image and saving them into the field. I'm able to manipulate via Imagecache. Works well. Thanks.
Comment #13
jarea CreditAttribution: jarea commentedForgive me if I am being a little thick, but assuming 624088_7_mapper_filefield.patch is the one to run with, to what do I apply this patch? And should it reside in the feeds/mappers directory?
Thanks for the clarification.
Comment #14
rjbrown99 CreditAttribution: rjbrown99 commentedDrop it into the feeds root directory, then patch -p0 < mypatchfile
Comment #15
Jordi Bufí CreditAttribution: Jordi Bufí commentedHi again,
I partly fixed my problem: I had created the image filefield on the wrong content type ("feed", not "feed item").
Now I can select the target "Image" but I don't know how to configure it for getting an image from the html of an RSS element (i.e. the first one). It's possible?
Comment #16
pbuyle CreditAttribution: pbuyle commentedThe mapper handles assigning files from enclosures found in the item to a FileField field. File linked from HTML in the item 'description' are not supported.
PS: IMHO, the best way to handle this is the create a subclass of the SimplePie or Common syndication parser that will extract the URL first linked file from the each item 'description' and add it to its 'enclosures'. It would be even better to fix the imported RSS to use enclosures instead of html link in description.
Comment #17
Jordi Bufí CreditAttribution: Jordi Bufí commentedMany thanks mongolito404!
Comment #18
rjbrown99 CreditAttribution: rjbrown99 commentedFWIW, I'm doing my imports via CSV and it works well. My imported CSV field is just a URL with the absolute path to an image.
http://my.server.com/path/to/IMAGE.JPG
Comment #19
alex_b CreditAttribution: alex_b commented- Replaced tabs indents with spaces
- Some code style fixes
Looking at this patch better, we should be able to get away with none of the enclosure object encapsulation and merely accept URLs and arrays of URLs as a mapping source. We could patch FeedsSimplePieParser to expose enclosures as arrays of URLs. This would make the patch much simpler by maintaining the same level of functionality. More powerful enclosure handling can be always added in when needed.
Comment #20
rjbrown99 CreditAttribution: rjbrown99 commentedInteresting potential issue.
I'm doing an import of CSV fields, and one field is an imagefield with a URL like:
http://somesite.com/images/IMAGE_CORRECT_NAME.jpg
Well, one of the imported image fields has the wrong URL. Instead of IMAGE_CORRECT_NAME.jpg they are just feeding me IMAGE_NAME.jpg, which results in a 404 when you visit that URL.
The result is that the feed import never completes. I have to manually stop and kill the import process. If I fix the name of the JPG to the correct name, the feed completes normally.
This is using the patch from #7.
Comment #21
pbuyle CreditAttribution: pbuyle commentedSome formats provide more than an URL for enclosures. RSS enclosures have a required length and mime-type. Other formats may provides a description of the enclousre as a text. Reducing enclosure to URL means that this information is lost on parsing and cannot be used by mapper. My current code for the mapper in #624088 doesn't include handling for this information but I see two immediate uses:
* The mime-type of an enclosure mapped to an ImageField (which is a FileField) can be used to drop any non-image enclosure
* The description of an enclosure can assigned to the $info['data']['description']
I was waiting for #641522 to be resolved (and some freetime) to work on this. The idea is to rewrite _enclosure_resolve to always return an instance of the enclosure interface defined there. Either because the passed in $enclosure is such an object or because it is a string that can be used to create an new, temporary, enclosure object for an url of a local file.
Comment #22
rjbrown99 CreditAttribution: rjbrown99 commentedOK, I have more information on the 404 issue from my previous post. In a nutshell, Feeds stops importing when I encounter an imagefield URL that results in a 404.
1) I added a few statements to filefield.inc to see which of the 'if' paths I was following down the Resolve URL trail in the _enclosure_resolve function. In my case, I fall into the "allow_url_fopen" camp. The current code in this module looks like this:
Based on reading the PHP website for the copy function, perhaps it should be enclosed in an if statement to handle the condition where it fails to copy? This is their demonstration code:
2) I noticed that if I 'cancel' the import, and then re-import, it picks up at the item AFTER the 404 item that stops it. So at least for me, if I encounter a stuck import I can cancel, re-start, and it will keep going.
Comment #23
pbuyle CreditAttribution: pbuyle commentedis (roughly) equivalent to
So copy failures should be handled. You should have an error message in the watchdog. But the import shouldn't freeze. So obviously something is wrong.
Comment #24
rjbrown99 CreditAttribution: rjbrown99 commentedHm good point. I realized that after I read about what ? does :)
Here's my log entry:
Line 100 is return copy($url, $filepath) ? $filepath : 0;
That looks about right as a watchdog error. I'm not sure why the import stops. I left it for hours and it was completely frozen. I'm going to work on a few more imports over the next few days and will report back.
Comment #25
q0rban CreditAttribution: q0rban commentedSubscribe
Comment #26
alex_b CreditAttribution: alex_b commented#24: I can't confirm that a 404 would result in a stuck import process. I'm also "falling in the allow_furl_open" camp.
I did see though that some of the files are HUGE. I've will change them to a smaller size.
This is the error I get when hitting a 404 resource:
Comment #27
rjbrown99 CreditAttribution: rjbrown99 commentedI can reproduce this if I delete the nodes that correspond to the imported items that have the broken link and then re-process the import file. Can I add any additional debug details or statements that could help to shed more light on it? I can also grab packet captures off of the wire if it would be useful.
My CSV import files aren't huge - on the order of ~1000-2000 rows/items.
Comment #28
alex_b CreditAttribution: alex_b commentedI'm trying to refocus the patch after #641522: Consolidate import stage results and provide generalized enclosure class got larger in scope (my fault - sorry mongolito404). However, I think we can get strong enclosure support (mongolito404 explained in #21 why we need it) and filefield mapping with one patch - see attachment.
This patch builds on #7/#19, all tests are passing. The most important modifications are:
class FeedsElement
that encapsulates any single mapping source element. A source element can now be either a simple data type (string, number) or aFeedsElement
or an array of either of the former two.class FeedsEnclosure extends FeedsElement
that supports additional methods like getContent() that are particular to enclosures.FeedsSimplePieEnclosure
is a simple adapter derived fromFeedsEnclosure
that encapsulatesSimplePie_Enclosure
object.These changes simplify the code in filefield.inc substantially while creating a straightforward foundation that future special source elements can extend.
Open tasks/questions:
$enclosure->link
which I didn't take over in this patch - which parser produces this format? Am I missing something on this thread?FeedsElement
we should update existing mappers to check for it.field_file_save_file()
which forces us to download a resource to a file and then copy it again because it asks for a local file?file_save_data()
fallback which kicks in ifallow_furl_open
is FALSE inFeedsEnclosure::getFile()
is broken. I think it was also broken in the previous patches. It essentially downloads empty files.Comment #29
alex_b CreditAttribution: alex_b commentedNot working on this atm.
Comment #30
alex_b CreditAttribution: alex_b commentedMore comprehensive title as #28 formalizes feed source elements better.
Comment #31
JayCally CreditAttribution: JayCally commentedHow can you get this to work with imagecache? It works great but doesn't create the imagecache files. So I get a broken image link for the thumbnail and small in story image. Anyone know how to implement imagecache?
Comment #32
BenK CreditAttribution: BenK commentedSubscribing....
Comment #33
rjbrown99 CreditAttribution: rjbrown99 commentedUpdate: I don't think my 'hung' feed import is due to a 404'd item. I think it's due to a different issue I just created: #681160: Scalability - PHP Memory exhausted stops the Feed import.
Comment #34
David Goode CreditAttribution: David Goode commentedFixed problem number 5 in alex's #28 post and rerolled. The issue was just a minor misuse of the drupal api calls.
David
Comment #35
alex_b CreditAttribution: alex_b commented#34: David - thanks for the review and the fix. I have done some minor cleanup of the patch and added the __toString() method to FeedsElement which should allow us to treat FeedsElement objects as strings.
Will commit now in
#28 1. deferred - not critical
#28 2. This is fixed by using __toString() method in FeedsElement.
#28 3. deferred #682508: FeedsSyndicationParser: support enclosures
#28 4. deferred - not critical
#28 5. fixed in #34
#28 6. deferred - not critical
Comment #36
alex_b CreditAttribution: alex_b commentedAdding missing files. Disregard patch in #35.
Comment #37
alex_b CreditAttribution: alex_b commentedCommitted, thank you everybody for your work on this patch.
http://drupal.org/cvs?commit=313192
Comment #38
arski CreditAttribution: arski commentedHi,
I just patched my feeds module and everything seems to be in place. However, when I try to run an import from a .csv file to create some nodes I get the following error:
Fatal error: Class 'FeedsEnclosure' not found in .../httpdocs/sites/all/modules/feeds/mappers/filefield.inc on line 56
I can find this class where it was supposed to be added by the patch but something seems to be failing.. do I need to do something special to make Drupal catch up or something?
Thanks a lot,
Martin
Comment #39
rjbrown99 CreditAttribution: rjbrown99 commentedTry using HEAD. I think Alex moved that class to a location that can be found by the other supporting modules.
See this issue: #686462: Move FeedsEnclosure class to FeedsParser.inc
Comment #40
arski CreditAttribution: arski commentedHi,
Just tried with HEAD.. same result.. maybe its not loaded somehow by the file mapper? :(
Thanks,
Martin
Comment #41
arski CreditAttribution: arski commentedHi again,
I guess the latest patch wasn't in HEAD yet, I just applied the patch in that other thread and it works.
Thanks a lot.
Comment #43
netentropy CreditAttribution: netentropy commentedwill this make it into a -dev verison in the future?
Comment #44
publicmind CreditAttribution: publicmind commentedJFYI, FeedAPI ImageGrabber has been ported for Feeds module, (http://drupal.org/project/feeds_imagegrabber). have a look, and if you can, please provide feedback.
regards,
publicmind
Comment #45
netentropy CreditAttribution: netentropy commentedi will check it out but i remember from the older version that it doesn't work if someone loads an image from their pc without spaces, then imagegrabber won't read the cck html code and grab the image
Comment #46
publicmind CreditAttribution: publicmind commentedits not known to me, please describe it in the issue queue of FIG, so that proper action can be taken to resolve it.
regards,