Hello,
I'm working on a use case where file entities exist on their own. That is, you can upload files (e.g. videos, docs, images) directly using the Media module without attaching them to any node. The problem is that these files are not indexed. Apparently it's a viable but never-requested feature (ref).
Does anyone know of a hack to index these files?
Comments
Comment #1
David_Rothstein commented: The Apache Solr File module does this, but it isn't compatible with the Apache Solr Attachments module.
I think I have a case where I'll need both features on the same site (i.e., some files indexed as attachments and linked in the search results to the entity they're attached to, and others - which aren't necessarily attached to anything - indexed as their own entities), so I'm interested in being able to do this with Apache Solr Attachments alone.
Given that the word "attachments" appears in the name of the module, I'm not sure the feature really fits here. But I'd still like it (and might have time to work on it), and it's encouraging to read in that blog post that it might be accepted as a patch :) Anyone know if it would have a chance of making it in?
Comment #2
David_Rothstein commented: OK, I needed this, so I wound up writing a patch.
I tried to keep it as minimal as possible so it has a shot at the stable 7.x-1.x branch (in reality, I think a larger rearchitecting might make sense for this feature). So I use the same storage methods that the module currently uses for file attachments, and basically the same user interface also. It's still a large patch, but a lot of it is moving code around.
Features:
In order to work correctly, this patch requires that #2014067: File types provided by the File Entity module are not always recognized by Apache Solr be applied to the Apache Solr module.
Comment #3
David_Rothstein commented: Note that if you're using the Apache Solr Access module, these unattached files might not show up in search results for non-administrative users (a similar bug already exists in this module for attached files: #1782936: Index file entities be with access grants from parent node if the apachesolr_access module is enabled).
Since these unattached files have no parent entities, the correct fix here probably involves something similar to the patches in #1665350: Only users with the "Bypass content access control" permission are able to search for users when Apache Solr Access is enabled for the Apachesolr User module.
Comment #3.0
David_Rothstein commented: minor modifications
Comment #4
Pedja Grujić commented: Not able to apply the patch; it fails:
patching file apachesolr_attachments.admin.inc
patching file apachesolr_attachments.index.inc
Hunk #2 succeeded at 211 (offset 5 lines).
Hunk #3 succeeded at 246 (offset 5 lines).
Hunk #4 FAILED at 249.
1 out of 4 hunks FAILED -- saving rejects to file apachesolr_attachments.index.inc.rej
patching file apachesolr_attachments.module
Hunk #4 succeeded at 252 (offset 4 lines).
Hunk #5 succeeded at 354 (offset 4 lines).
Hunk #6 succeeded at 371 (offset 4 lines).
Hunk #7 succeeded at 407 (offset 4 lines).
Hunk #8 succeeded at 447 (offset 4 lines).
Hunk #9 succeeded at 456 (offset 4 lines).
Hunk #10 FAILED at 744.
Hunk #11 succeeded at 783 (offset 24 lines).
Hunk #12 FAILED at 847.
2 out of 12 hunks FAILED -- saving rejects to file apachesolr_attachments.module.rej
Comment #5
undertext commented: I rerolled the patch.
It applies to the dev branch and to the current stable version 7.3.
Thanks to @David_Rothstein for saving me hours of work :)
Comment #6
David_Rothstein commented: The reroll looks good, but it was missing one or two things (most notably the code that exposes facets for the fields attached to file entities). So I fixed that in the attached patch.
However, in the interim there have been many commits to the 7.x-1.x-dev branch, so neither this patch nor the one above applies to that branch anymore (both still work against the 7.x-1.3 release). So, marking "needs work" for that.
Comment #7
ryantollefson commented: Thanks for this. I ran into a small bug: cron was crashing (white screen) on me.
I checked the server logs and found:
PHP Fatal error: Call to a member function getExternalUrl() on a non-object in [path]\apachesolr_attachments.module on line 242, referer: [URL]
I don't really know PHP, so I'm not sure how to fix it, but here is line 242 from my file:
Comment #8
undertext commented: I need two patches on my site: this one and the one from https://www.drupal.org/node/1854088.
They conflict, so I combined the patches to use in my make file. (https://www.drupal.org/files/issues/apachesolr-attachments-bypass-deadlo... + https://www.drupal.org/files/issues/apachesolr-attachments-index-unattac...)
Comment #9
nmillin commented: Patch #8 didn't apply cleanly to the latest dev. I tweaked the patch to work with the current dev.
Comment #10
vitalie commented: Thanks all. Patch #9 works OK if I add:
after line 357 after applying the patch (the line is
foreach ($parent_entities as $parent_entity_info) {
Comment #11
vitalie commented: If anyone needs it, here is the patch from #9, which includes the code from #10 and is applied against the 7.x-1.4 version of the module.
Comment #12
jenlampton commented: Rerolled from the correct location.
Comment #13
jenlampton commented: The previous patch removed the check for file size, which was causing me some headaches. Rerolled here (it needed a reroll anyway) and added back the file size check. Note that the file size check has moved to
apachesolr_attachments_get_attachment_text()
so that the rest of the file entity will still be properly indexed.
Comment #14
pwolanin (as a volunteer and at SciShield) commented: Hmm, it's not clear to me why the function apachesolr_attachments_apachesolr_file_excluded() is removed, or the reason for some of the other changes.
You are also changing the cleanup function in a way that might leave some orphan files?
Also - looks like I have a test fixture but no tests.
Comment #15
mausolos commented: Any particular reason you're setting the filesize limit before AND after the logic check? Was this unintended?
Comment #16
nixar commented: I can't speak to #14 or #15, but I've been using the patch in #13 on a site with several hundred files and it's been working well.