I see the module has an option to limit the number of bytes that will be indexed: "Number of first N bytes to index in the extracted string" (see https://www.drupal.org/project/search_api_attachments/issues/2888827). However, this is limited to 99999 bytes.

I'd like to submit a patch increasing this limit. In our case this would let us index as much data as possible while staying under the request size (~1 MB) at which we start getting 413 errors when sending to our Solr containers.
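To illustrate what a "first N bytes" limit means in practice, here is a minimal standalone sketch. It is not the module's code: the function name `limit_extracted_bytes_sketch` is hypothetical, treating 0 as "no limit" is an assumption, and it assumes the PHP mbstring extension is available so that a multi-byte UTF-8 character is never cut in half.

```php
<?php

/**
 * Hypothetical helper: keep at most $max_bytes bytes of the extracted text.
 *
 * Assumption: a limit of 0 (or less) means "do not truncate".
 */
function limit_extracted_bytes_sketch(string $extracted, int $max_bytes): string {
  if ($max_bytes <= 0) {
    return $extracted;
  }
  // mb_strcut() counts bytes, but backs off to the previous character
  // boundary instead of splitting a multi-byte character.
  return mb_strcut($extracted, 0, $max_bytes, 'UTF-8');
}
```

With a limit just under 1 MB, the text sent to the search backend stays below the request size that triggers the 413 response, whatever the size of the original file.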

Comments

weemondo created an issue. See original summary.

izus’s picture

I'm OK with increasing the limit.
Please submit the patch.
Thanks

izus’s picture

Increased it to 999999, that's almost 1 MB.

  • izus committed 8efa8cc on 8.x-1.x
    Issue #3015359 by weemondo, izus: Increase maximum size limit for...
izus’s picture

Status: Active » Fixed
weemondo’s picture

Status: Fixed » Active

Apologies for reopening, and apologies for not submitting my patch sooner. I think we should make the limit more flexible: in my case the limit was 1 MB, but others may have a different limit. The patch here makes the extracted-string limit behave more like the upload limit, so that it can be specified as text, e.g. 100, 10 KB, 10 MB.
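As a rough illustration of that kind of text limit (not the attached patch), the sketch below converts values such as 100, 10 KB or 10 MB into a byte count. The function name `parse_size_sketch` is hypothetical, and the use of 1 KB = 1024 bytes is an assumption. Drupal core ships a Bytes utility for this sort of conversion, which a real patch would more likely rely on; the sketch avoids it only so that it runs standalone.

```php
<?php

/**
 * Hypothetical helper: convert a size string to a number of bytes.
 *
 * Accepts a bare number (bytes) or a number followed by B, KB, MB or GB
 * (K, M and G are accepted as shorthands). Returns NULL for anything else.
 */
function parse_size_sketch(string $size): ?int {
  $pattern = '/^\s*(\d+(?:\.\d+)?)\s*(B|KB|MB|GB|K|M|G)?\s*$/i';
  if (!preg_match($pattern, $size, $matches)) {
    return NULL;
  }
  $number = (float) $matches[1];
  // The unit group is absent from $matches when no unit was given.
  $unit = strtoupper($matches[2] ?? '');
  $exponents = [
    '' => 0, 'B' => 0,
    'K' => 1, 'KB' => 1,
    'M' => 2, 'MB' => 2,
    'G' => 3, 'GB' => 3,
  ];
  // Assumption: binary units, i.e. 1 KB = 1024 bytes.
  return (int) round($number * (1024 ** $exponents[$unit]));
}
```

A bare number is read as bytes, so existing numeric settings would keep their meaning under this scheme.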

weemondo’s picture

Attaching patch this time

izus’s picture

Hi,
I didn't test it or do a full code review, but here is one quick thing I noticed:

+++ b/src/Plugin/search_api/processor/FilesExtractor.php
@@ -460,13 +460,13 @@ class FilesExtractor extends ProcessorPluginBase implements PluginFormInterface
+      '#title' => $this->t('Limit size of the extracted string before indexing in solr.'),

This is not true: it's available for all extraction methods, not only Solr.

weemondo’s picture

Aha, good point! Updated to remove reference to solr.

  • izus committed 682abdf on 8.x-1.x
    Issue #3015359 by izus: Add a commun validator method for...
  • izus committed 7b73b78 on 8.x-1.x authored by weemondo
    Issue #3015359 by weemondo, izus: Increase maximum size limit for...
izus’s picture

Status: Active » Fixed

Hi,
It is now merged.
I also refactored the code to use the same validator helper method for the number_first_bytes and max_filesize settings.
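For readers wondering what sharing one validator between the two settings might look like, here is a minimal sketch. It is not the committed code: both function names are hypothetical, the accepted format is an assumption, and treating an empty value as "no limit" is also an assumption. Only the setting keys number_first_bytes and max_filesize come from this issue.

```php
<?php

/**
 * Hypothetical helper: return an error message for an invalid size value,
 * or NULL when the value is acceptable.
 */
function validate_size_setting_sketch(string $value): ?string {
  $value = trim($value);
  // Assumption: an empty value means "no limit" and is always valid.
  if ($value === '') {
    return NULL;
  }
  if (!preg_match('/^\d+(\.\d+)?\s*(B|KB|MB|GB)?$/i', $value)) {
    return sprintf('"%s" is not a valid size. Use a value such as 100, 10 KB or 10 MB.', $value);
  }
  return NULL;
}

/**
 * Run the same check over both settings and collect errors by setting key.
 */
function validate_size_settings_sketch(array $settings): array {
  $errors = [];
  foreach (['number_first_bytes', 'max_filesize'] as $key) {
    $error = validate_size_setting_sketch((string) ($settings[$key] ?? ''));
    if ($error !== NULL) {
      $errors[$key] = $error;
    }
  }
  return $errors;
}
```

Keeping the check in one place means both fields accept exactly the same formats and report errors with the same wording.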

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.