Problem/Motivation

Currently, there is no limit on the size of the extracted text stored in the key_value table in the database. This can lead to extremely large table rows and MySQL errors for users who have not tuned the processor correctly (which is most users).

See https://www.pixelite.co.nz/article/search-api-attachments-and-storing-re... for a more comprehensive write-up of the issue.

Proposed resolution

Instead of defaulting to 'no limit', how about setting the limit to something reasonable, e.g. 1 MB? Roughly 1 million characters should be enough to get the sense of the document without indexing the entire thing.
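To make the idea concrete, here is a minimal sketch of what capping the extracted text could look like. It is not the module's actual code (the module is PHP). The function name and the constant are hypothetical, and treating 0 as "no limit" is an assumption based on the current default discussed in this issue. The limit is counted in bytes of UTF-8, so text with multi-byte characters will yield fewer than 1 million characters.

```python
# Hypothetical default of roughly 1 MB; the value actually committed may differ.
DEFAULT_LIMIT_BYTES = 1_000_000


def truncate_extracted_text(text: str, limit_bytes: int = DEFAULT_LIMIT_BYTES) -> str:
    """Keep at most limit_bytes of UTF-8 encoded text.

    Assumption: a limit of 0 (or less) means "no limit".
    """
    if limit_bytes <= 0:
        return text

    encoded = text.encode("utf-8")
    if len(encoded) <= limit_bytes:
        return text

    # Cutting on a byte boundary can split a multi-byte character;
    # errors="ignore" drops the incomplete trailing bytes.
    return encoded[:limit_bytes].decode("utf-8", errors="ignore")
```

With a cap like this in place, each stored value stays below a known size no matter how large the source file is, which is what avoids the oversized key_value rows.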

Remaining tasks

  • Ensure this is a good idea
  • Write patch
  • Update docs

User interface changes

The processor will have a default value for users that have not tuned it previously.

API changes

None

Data model changes

None

Release notes snippet

Comment  File             Size       Author
#9       3095538-9.patch  1.92 KB    wiifm
#7       3095538-7.patch  1.92 KB    wiifm
#4       3095538-4.patch  864 bytes  wiifm

Comments

wiifm created an issue. See original summary.

ressa’s picture

That sounds to me like a sensible approach, to avoid an error which might take a long time to debug and figure out, so thanks for blogging about it as well as creating an issue @wiifm.

wiifm’s picture

Status: Active » Needs review
FileSize
864 bytes

Attached is a patch to show my thoughts here.

focus13’s picture

+1 for this patch.

If you uninstall the module, you need to optimize the key_value table to free physical space. See my patch at https://www.drupal.org/node/2981539
and thanks for the blog post.

With this option we don't need to optimize table.

izus’s picture

Status: Needs review » Needs work

Hi,
thanks for spotting this.
+1 too :)
I think that we should change the default value of number_first_bytes in the form too:
that's the ('0') here: https://git.drupalcode.org/project/search_api_attachments/blob/8.x-1.x/s...

wiifm’s picture

Status: Needs work » Needs review
FileSize
1.92 KB

New patch attached with changes needed.

klonos’s picture

"Default the configuration to a sensible about of text to extract"

Should that be "amount" instead of "about" ^^ ?

wiifm’s picture

FileSize
1.92 KB

Nice catch.

izus’s picture

  • wiifm authored 84ca55f on 8.x-1.x
    Issue #3095538 by wiifm, izus, ressa, focus13, klonos: Set a sensible...
izus’s picture

Status: Needs review » Fixed

This is now merged
Thanks all

wiifm’s picture

🎉 Thanks @izus! I was going to ask for a new beta, and you have even beaten me to the punch there as well. Appreciate it.

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.