Problem/Motivation

The search_api module allows a search index to be set up for file entities.

File entities don't have a published/unpublished state, so they all get indexed regardless of whether they are private or public files.

For example, a client has a webform. The search does not index webform submissions, but it does index the file entities created when a file is uploaded to a webform submission (stored under the private path).

Proposed resolution

I wrote a data alteration plugin to address this issue.

Please review and test it so I can improve it. Thanks!

Comments

pandaski created an issue.

pandaski commented:

Status: Active » Needs review
drunken monkey commented:

Thanks a lot for posting, looks great!
Just a few formatting changes, plus this:

+++ b/includes/callback_file_entity_public.inc
@@ -0,0 +1,56 @@
+      if (empty($file->uri) || strpos($file->uri, 'private://') !== FALSE) {

This should only match at the beginning of the URI, right? Then I think substr($file->uri, 0, 10) === 'private://' will be both faster and more accurate. (Or, alternatively, strpos($file->uri, 'private://') === 0.)
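
To illustrate the difference between the two checks, here is a minimal sketch in plain PHP (no Drupal bootstrap needed). The helper function names and the sample URIs are hypothetical and are not taken from the patch; they only show how a "contains" test and a "starts with" test can disagree.

```php
<?php

// Hypothetical helpers for illustration only; not part of the patch.

// Original check: true if 'private://' appears anywhere in the URI.
function uri_contains_private($uri) {
  return strpos($uri, 'private://') !== FALSE;
}

// Suggested check: true only if the URI starts with 'private://'
// (the scheme prefix is 10 characters long).
function uri_starts_with_private($uri) {
  return substr($uri, 0, 10) === 'private://';
}

// Alternative mentioned above: strpos() must report offset 0.
function uri_starts_with_private_strpos($uri) {
  return strpos($uri, 'private://') === 0;
}

// Made-up sample URIs; the second one is a public file whose path
// happens to contain the string 'private://'.
$uris = array(
  'private://webform/cv.pdf',
  'public://docs/private://notes.txt',
  'public://images/logo.png',
);

foreach ($uris as $uri) {
  printf(
    "%s: contains=%s, prefix=%s\n",
    $uri,
    var_export(uri_contains_private($uri), TRUE),
    var_export(uri_starts_with_private($uri), TRUE)
  );
}
```

Only the second sample URI separates the checks: the "contains" test would treat it as private, while both prefix tests treat it as public.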

Otherwise, I think this is RTBC. Please test/review the attached patch to make sure it still works correctly!

pandaski commented:

Status: Needs review » Reviewed & tested by the community
+++ b/includes/callback_file_entity_public.inc
@@ -0,0 +1,42 @@
+      if (empty($file->uri) || substr($file->uri, 0, 10) === 'private://') {

Love this way :)

@drunken monkey thanks for your kind review. I think it is ready now, and we have tested it on our distribution.

drunken monkey commented:

Version: 7.x-1.x-dev » 8.x-1.x-dev
Status: Reviewed & tested by the community » Needs work
Issue tags: -needs port to 8.x-1.x +Needs tests
Attached file size: 1.98 KB

Great, thanks a lot for the feedback!
Committed. Thanks again!

For D8, the attached patch would probably be the most sensible way to implement this. (Or would this be too disruptive for people with search indexes containing files? Should we maybe start introducing config for that processor, to easily exclude it for a specific type, now that it covers a wider and wider range of types?)
Anyway, I don’t have private files or a file index set up, so it would be great if someone with real use for this could give it a try.
Also, it will need automatic test coverage, too, in any case.