Problem/Motivation
Currently, there is no limit on the size of the extracted text stored in the key_value table in the database. This can lead to extremely large table rows and MySQL errors for users who have not tuned the processor (most users).
See https://www.pixelite.co.nz/article/search-api-attachments-and-storing-re... for a more comprehensive write-up of the issue.
Proposed resolution
Instead of defaulting to 'no limit', set the default limit to something reasonable, e.g. 1 MB. Roughly 1 million characters should be enough to get the sense of a document without indexing the entire thing.
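To make the proposal concrete, here is a minimal sketch of what a byte limit on extracted text could look like. It is written in Python purely for illustration and is not the module's PHP code. The names `DEFAULT_LIMIT_BYTES` and `limit_extracted_text` are hypothetical, and the convention that `0` means "no limit" is an assumption based on the current default of '0' mentioned in this issue.

```python
# Hypothetical default mirroring the proposed 1 MB limit (assumption, not the module's constant).
DEFAULT_LIMIT_BYTES = 1_000_000


def limit_extracted_text(text: str, limit_bytes: int = DEFAULT_LIMIT_BYTES) -> str:
    """Return text cut to at most limit_bytes of UTF-8.

    A limit of 0 (or less) is treated as "no limit", which is assumed
    to match the behaviour of the previous default.
    """
    if limit_bytes <= 0:
        return text
    encoded = text.encode("utf-8")
    if len(encoded) <= limit_bytes:
        return text
    # Cutting on a byte boundary can split a multi-byte character;
    # errors="ignore" drops the incomplete trailing bytes.
    return encoded[:limit_bytes].decode("utf-8", errors="ignore")
```

Note that 1 MB equals about 1 million characters only for single-byte text; documents with many multi-byte characters would keep fewer characters under the same byte limit.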
Remaining tasks
- Ensure this is a good idea
- Write patch
- Update docs
User interface changes
The processor will have a default value for users that have not tuned it previously.
API changes
None
Data model changes
None
Release notes snippet
Comment | File | Size | Author |
---|---|---|---|
#9 | 3095538-9.patch | 1.92 KB | wiifm |
Comments
Comment #2
wiifm
Comment #3
ressa (Ardea)
That sounds like a sensible approach to me, since it avoids an error that might take a long time to debug and figure out. Thanks for blogging about it as well as creating an issue, @wiifm.
Comment #4
wiifm
Attached is a patch to show my thoughts here.
Comment #5
focus13
+1 for this patch.
Currently, if you uninstall the module you need to optimize the key_value table to free the physical space. See my patch at https://www.drupal.org/node/2981539
Thanks for the blog post as well.
With this option we won't need to optimize the table.
Comment #6
izus
Hi,
Thanks for spotting this.
+1 too :)
I think we should also change the default value of number_first_bytes in the form:
that's the ('0') here: https://git.drupalcode.org/project/search_api_attachments/blob/8.x-1.x/s...
Comment #7
wiifm
New patch attached with the changes needed.
Comment #8
klonos
"Default the configuration to a sensible about of text to extract"
Should that be "amount" instead of "about" ^^ ?
Comment #9
wiifm
Nice catch.
Comment #10
izus
Comment #12
izus
This is now merged
Thanks all
Comment #13
wiifm
🎉 Thanks @izus! I was going to ask for a new beta, and you have even beaten me to the punch there as well. Appreciate it.