Hi,

I found out that the module does not currently seem to support search indexes that hold multiple entity types. Is this correct?

This is what I found out:

My setup

Versions:

  • Drupal 7.39
  • Apache SOLR 5.2.1
  • search_api 7.x-1.16
  • search_api_solr 7.x-1.9
  • search_api_attachments 7.x-1.6
  • Results are being rendered by search_api_views

I have 1 custom search index with the item type selected as "multiple types" (thus using multiple entity types). Next, under Datasource options -> Entity Types I selected "node" and 1 custom ECK entity.

For search_api_attachments I have followed the readme. I am using the SOLR extraction method, and SOLR is properly configured.
In the "Filters" tab of the index, I selected the File attachments with no special settings.

The problem

When I index, the contents of attached files are not being indexed. When I run a query in SOLR, the file contents aren't picked up either.

After some searching and debugging I found this:

In callback_attachments_settings.inc, when the entityType of the index != 'file' (my case), you start traversing the $items array at line 44. When a normal index on a single entity type is used, every item in the $items array contains the entity object (node etc.). But when using multiple entity types in your index, every item in the $items array is itself an array containing information about the entity, and its first element is the entity object.

Because the entity is one level deeper in the $items array, the code stops in the second foreach, where it says if (isset($item->$name)) { (line 46).
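To make the mismatch concrete, here is a minimal standalone sketch. The item shapes are assumptions based on the description above, not copied from search_api, and has_file_field() is a hypothetical helper that only mirrors the isset() check.

```php
<?php
// Hypothetical stand-ins: these shapes follow the description above and are
// not taken from search_api itself.
$entity = new stdClass();
$entity->field_file = array('und' => array(array('fid' => 12)));

// Single-type index: each item is the entity object itself.
$single_items = array(1 => $entity);

// "Multiple types" index: each item wraps the entity, which comes first.
$multi_items = array(
  'node/1' => array('node' => $entity, 'item_type' => 'node'),
);

// Same kind of check as the inner foreach in the callback.
function has_file_field($item, $name) {
  return isset($item->$name);
}

// True: the field is found directly on the entity.
$single_found = has_file_field($single_items[1], 'field_file');
// False: the item is a wrapper, so the field is one level deeper.
$multi_found = has_file_field($multi_items['node/1'], 'field_file');
```

Calling reset() on the wrapper returns its first element, which is the entity in this sketch, so the same check then succeeds.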

The solution?

I applied a quick fix to see if I would get any results. Right before the first foreach, where you start traversing the $items array in the big else{}, I added:

if ($this->index->item_type == 'multiple') {
  foreach ($items as $key => $item) {
    $items[$key] = reset($item);
  }
}

I just move the entity object to the current $item, so that when you start looping through $items, the entity object is directly available.

It doesn't seem like the right solution, but it works for now: I can index file contents and successfully search within files.

Diff:

--- a/htdocs/sites/all/modules/contrib/search_api_attachments/includes/callback_attachments_settings.inc
+++ b/htdocs/sites/all/modules/contrib/search_api_attachments/includes/callback_attachments_settings.inc
@@ -41,6 +41,13 @@ class SearchApiAttachmentsAlterSettings extends SearchApiAbstractAlterCallback {
     }
     else {
       $fields = $this->getFileFields();
+
+      if ($this->index->item_type == 'multiple') {
+        foreach ($items as $key => $item) {
+          $items[$key] = reset($item);
+        }
+      }
+
       foreach ($items as $id => &$item) {
         foreach ($fields as $name => $field) {
           if (isset($item->$name)) {

Comments

screon created an issue. See original summary.

screon’s picture

I forgot about another problem:

  protected function getFileFields() {
    $ret = array();
    foreach (field_info_fields() as $name => $field) {
      if ($field['type'] == 'file' && array_key_exists($this->index->getEntityType(), $field['bundles'])) {
        $ret[$name] = $field;
      }
    }
    return $ret;
  }

Since an index with item_type = "multiple" doesn't return anything when getEntityType() is called, no file fields are returned and the whole functionality doesn't work. When I comment out the array_key_exists part, it does work.
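One possible direction, as a rough sketch only: match each file field against every entity type configured in the index instead of a single type. The function name example_get_file_fields() and the sample data are hypothetical. I believe search_api keeps the selected types in the index's datasource options, but treat that as an assumption.

```php
<?php
/**
 * Hypothetical variant of getFileFields() that accepts several entity types.
 *
 * $fields mimics the shape returned by field_info_fields(). $entity_types is
 * assumed to come from the index's datasource settings.
 */
function example_get_file_fields(array $fields, array $entity_types) {
  $ret = array();
  foreach ($fields as $name => $field) {
    if ($field['type'] != 'file') {
      continue;
    }
    // Keep the field if it is attached to at least one indexed entity type.
    if (array_intersect($entity_types, array_keys($field['bundles']))) {
      $ret[$name] = $field;
    }
  }
  return $ret;
}

// Sample data standing in for field_info_fields().
$fields = array(
  'field_document' => array('type' => 'file', 'bundles' => array('node' => array('article'))),
  'field_manual' => array('type' => 'file', 'bundles' => array('my_eck_type' => array('default'))),
  'field_avatar' => array('type' => 'file', 'bundles' => array('user' => array('user'))),
  'body' => array('type' => 'text_with_summary', 'bundles' => array('node' => array('article'))),
);

$file_fields = example_get_file_fields($fields, array('node', 'my_eck_type'));
```

With this sample data only field_document and field_manual are kept. Unlike commenting out the array_key_exists part, this still skips file fields on entity types that are not in the index.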

I'll think about a possible solution.

izus’s picture

Category: Bug report » Support request
Status: Active » Postponed (maintainer needs more info)

hi,
actually this module supports these entity types: node, file, field collections, references and entity reference (and there is an open issue to support comments).

the 'multiple' item_type doesn't seem to be a default option for me (do you use another module that adds it?)

apart from that, as each entity type handles file information differently, we provide support for each entity type in a submodule (look at the examples for entityreference or field_collections...)
if you have another entity type, you can add support for it and contribute it (if the entity type is provided by a contrib module) or just create a custom module for it.

if your concern is to search through different indexes that hold extracted attachment data, this can be done thanks to https://www.drupal.org/project/search_api_multi.

screon’s picture

The 'multiple' option is default in search_api, but I don't know from which version. I set up a sandbox on simplytest with search_api 1.16 as a test: https://dfyi9.ply.st/node#overlay=admin/config/search/search_api/add_index.

So I guess the best solution would be to create a submodule to support the 'multiple' item type like you suggest? I'll look into it with my colleagues, and will post my progress here.

screon’s picture

Here is a first attempt. I'm not very experienced at this, so bear with me.

Basically I created a new submodule that does almost the same as the base attachments module, but I added one more foreach loop in alterItems() so we can pick up the file contents of the entity in the $items array. I then write the attachments content to the $item object (for some reason, when I write it to the subitem, the search doesn't pick it up). I also changed the logic of the getFileFields() function.
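The extra loop can be sketched roughly as follows. This is not the code from the patch: example_extract_text() is a hypothetical stand-in for the real extraction step, the wrapper shape and the simplified field value are assumptions, and the 'attachments_' prefix on the property name is assumed as well.

```php
<?php
// Hypothetical stand-in for the real extraction step (Tika or Solr).
function example_extract_text(array $file) {
  return 'text of file ' . $file['fid'];
}

// Sketch of an alterItems()-style loop for wrapped items.
function example_alter_items(array $items, array $field_names) {
  foreach ($items as $item) {
    // get_object_vars() returns a copy, so adding properties to $item
    // inside the loop does not affect the iteration.
    foreach (get_object_vars($item) as $sub_item) {
      if (!is_object($sub_item)) {
        continue;
      }
      foreach ($field_names as $name) {
        if (empty($sub_item->$name)) {
          continue;
        }
        $texts = array();
        foreach ($sub_item->$name as $file) {
          $texts[] = example_extract_text($file);
        }
        // Write to the outer item, not to the wrapped entity.
        $item->{'attachments_' . $name} = implode(' ', $texts);
      }
    }
  }
}

// Simplified field value: a flat list of files, without the language level.
$node = new stdClass();
$node->field_document = array(array('fid' => 7));

$item = new stdClass();
$item->node = $node;
$item->item_type = 'node';

$items = array('node/1' => $item);
example_alter_items($items, array('field_document'));
```

After the call, the extracted text sits on the outer item as attachments_field_document, and the wrapped node is left untouched.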

screon’s picture

Status: Postponed (maintainer needs more info) » Needs review

Grimreaper’s picture

Hello,

Thanks for the patch.

To test it, would you please tell us which module you use to index multiple entity types into one index? Like izus, I can't see the "multiple" option.

screon’s picture

Hi,

As I said before, this seems to be the default in search_api (at least in version 7.x-1.16). You can see it in this simplytest sandbox: https://dmrr9.ply.st/

izus’s picture

The code in #5 seems OK, but I'll wait for some feedback from people who use it.

kbrinner’s picture

For those who don't see the multiple option: when you create a new index at admin/config/search/search_api/add_index, 'multiple types' is the last option in the 'item type' select list. See the screenshot in comment #8.

rovo’s picture

Patch in #5 worked for me.

Prior to the patch, when creating a new index: I could select multiple types, pick fields, and choose File attachments from the Filters tab; but I was not able to extract the content from files that were attached to my custom entities.

After the patch, when creating a new index: I could select multiple types, pick fields, choose File attachments from the Filters tab; then I was able to select fields 'Attachment content: FIELD_NAME' as Fulltext types.

Thank you Screon!

frob’s picture

Status: Needs review » Needs work

This patch was made from the sites directory; it needs to be remade from the search_api_attachments module's root directory.

frob’s picture

Category: Support request » Bug report

I have rerolled the patch to just add this submodule to the existing contrib module. I haven't done a full review of the module, but I can verify that it works when using multiple entity types; in my case I am using nodes and users.

Really, I don't think a submodule should be necessary for this. It should just be part of the normal functionality, and as such I am changing this to a bug report instead of a support request.

frob’s picture

I have noticed that on multivalue fields only the last file is actually indexed, the rest are not.

frob’s picture

Status: Needs work » Needs review

I fixed the issue where only the last file's content was getting indexed.
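The patch itself is not shown here, so the following is only a guess at the kind of bug involved: assigning inside the loop over a multivalue field keeps just the last file, while appending keeps them all. example_file_text() and the sample data are hypothetical.

```php
<?php
// Hypothetical stand-in for the real extraction step.
function example_file_text(array $file) {
  return 'text-' . $file['fid'];
}

// A multivalue file field with three files.
$files = array(array('fid' => 1), array('fid' => 2), array('fid' => 3));

// Overwriting: each pass replaces the previous value, so only the last
// file's text survives.
$overwritten = '';
foreach ($files as $file) {
  $overwritten = example_file_text($file);
}

// Appending: every file's text is kept.
$combined = '';
foreach ($files as $file) {
  $combined .= ' ' . example_file_text($file);
}
$combined = trim($combined);
```

Here $overwritten ends up as 'text-3', while $combined holds the text of all three files.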

rovo’s picture

frob, good catch on the multivalue. I've applied it and it's resolved that aspect for me.

Tested patch in #15

frob’s picture

@rovo, is the issue rtbc then?

izus’s picture

Status: Needs review » Reviewed & tested by the community

as of #16

  • izus committed 1150531 on 7.x-1.x authored by frob
    Issue #2596283 by frob, screon, izus, rovo, Grimreaper, kbrinner: No...
  • izus committed 1afd465 on 7.x-1.x
    Issue #2596283 by izus: List search_api_attachments_multiple_entities in...

izus’s picture

Status: Reviewed & tested by the community » Fixed

Thanks all
this is now merged

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.