The following error occurred when using the "Database search" module in combination with the content index.

Drupal\search_api\SearchApiException: SQLSTATE[42000]: Syntax error or access violation: 1118 Row size too large. The maximum row size for the used table type, not counting BLOBs, is 65535. This includes storage overhead, check the manual. You have to change some columns to TEXT or BLOBs: ALTER TABLE {search_api_db_content_index} ADD `alt_13` VARCHAR(255) DEFAULT NULL COMMENT 'The field\'s value for this item'; Array ( ) in Drupal\search_api_db\Plugin\search_api\backend\Database->fieldsUpdated()

This seems to be caused by the allowed size of the varchar columns in the "search_api_db_content_index" table.
I created a patch that stores the value in a text-type column instead of a varchar column. This is only a temporary workaround, though, and does not seem like a permanent fix when we consider performance.
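For reference, a rough sketch of the difference at the database level, based on the failing statement from the error above (the "alt_13" column name is just taken from that error, and note that MySQL does not allow a DEFAULT value on TEXT columns):

  -- What the failing ADD would look like with the patch's TEXT column type
  -- instead of VARCHAR(255); column names will differ per index field.
  ALTER TABLE search_api_db_content_index
    ADD alt_13 TEXT NULL COMMENT 'The field''s value for this item';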

Can someone provide some more information about this issue and a possible solution? I'm willing to create a new and better patch based on the feedback.

For everyone who experiences this issue and wants to use the current patch (not recommended for performance reasons on large websites):

  • Remove the current fields from your content index and re-add them, so the new database column type is applied.
  • Alternatively, without having to delete the fields, you can drop the table "search_api_db_content_index" and then press Save on the Fields tab of the content index; after that, clear the current search index and re-index (see the sketch after this list).
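If you go with the second option, the manual part amounts to something like this (a sketch only; the table name depends on your index's machine name, and any table prefix your site uses still applies):

  -- Drop the denormalized table so the backend recreates it with the new
  -- column types the next time the index's fields are saved.
  DROP TABLE search_api_db_content_index;
  -- Then press Save on the index's Fields tab, clear the search index,
  -- and re-index the content.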

[EDIT]: With more information now available, I removed the previous subscription and changed the title, since it gave an inaccurate description of the issue at hand.


Comments

hoebekewim created an issue. See original summary.

hoebekewim’s picture

Issue summary: View changes
hoebekewim’s picture

Title: Character length of 30 breaks the Database Search on large input » Row size too large. The maximum row size for the used table type, not counting BLOBs, is 65535
Issue summary: View changes
FileSize
680 bytes
drunken monkey’s picture

Title: Row size too large. The maximum row size for the used table type, not counting BLOBs, is 65535 » Database backend hits "maximum row size" when adding a field
Status: Active » Needs review
Issue tags: +Needs tests
FileSize
2.16 KB

Oh, damn, that's really a problem! Thanks a lot for reporting this.
A lot of the problem stems from us wanting to handle UTF-8 completely correctly – and thus using the 4-byte variant for text fields. That means you "only" need 65 such fields to reach the limit – and the complaints we already received about problems with the "maximum of 64 indexes" limit show that there are at least some people with such large indexes.
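(As a rough back-of-the-envelope calculation, ignoring the per-column length bytes and other overhead: a VARCHAR(255) column in utf8mb4 counts for up to 255 × 4 = 1,020 bytes towards the row size, and 65,535 ÷ 1,020 ≈ 64 – so roughly 64–65 such columns already hit the limit.)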

One thing I immediately see here, though, is that the column for "Fulltext" fields is much too large: in the denormalized table, we only store a prefix of the field (a maximum of 30 characters) for sorting purposes, so having 255 characters of space in the column is unnecessary. I thought this didn't have any influence (though I now realize the code also doesn't correctly restrict the index on that column to those 30 characters, so there's a performance problem there as well), but for reaching the max row size it is of course relevant.
The attached patch would fix this, and even contains an update function to apply the fix to existing indexes. It would be great if you could test this, and maybe also review it.
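To make the change concrete, for an existing index it boils down to something like the following (the column name is purely illustrative, and the update function presumably goes through the schema API rather than raw SQL):

  -- Shrink a fulltext prefix column: only a 30-character prefix is stored
  -- in the denormalized table anyway, so no data is lost.
  ALTER TABLE search_api_db_content_index
    MODIFY body VARCHAR(30) NULL;
  -- (Re-add the original column COMMENT in the MODIFY if you want to keep it.)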

However, this just alleviates the problem somewhat, especially if you have many fulltext fields. If you have lots of string fields, on the other hand (which is probably more common), it's still possible to reach the limit. Fixing this, however, would be more complicated. Since we can't place an upper bound on the number of fields a search index can have, the only actual solution for this would be to detect this case somehow and, once it happens, create a second (or third) denormalized table for storing additional fields. Thinking of this, it might not even be that hard to implement, though it would surely require some effort, and we'd have to change the internal structure for the indexes' DB information. But all in all, very doable.
Once we do it, we should just add good test coverage for this, to ensure it a) works correctly and b) actually solves this problem.
I don't really see any other reliable solution.
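Just to sketch the idea (all names here are hypothetical, not necessarily what an eventual patch would use): once the first denormalized table is full, further fields would go into a second table keyed by the same item ID, for example:

  -- Hypothetical second denormalized table, created once the first one
  -- cannot take any more columns.
  CREATE TABLE search_api_db_content_index_2 (
    -- The item_id definition must match the main table's item_id column.
    item_id VARCHAR(150) NOT NULL,
    field_65 VARCHAR(255) NULL,
    field_66 VARCHAR(255) NULL,
    PRIMARY KEY (item_id)
  );
  -- Queries touching fields from both tables would JOIN on item_id.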

For a custom site running into this, just manually reducing the length for some fields, as appropriate, is a simple solution. Normally, the DB server shouldn't recreate those columns, so the fix should be pretty permanent. For example, you know that the node type field will never need 255 characters.
(Come to think of it, detecting the max length automatically and adapting the DB structure accordingly should actually also be possible. However, since it also just alleviates the problem and doesn't resolve it, we might as well go with the "real" solution above right away. (Or maybe do both, for performance reasons.))
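As a concrete example of that manual workaround (again just a sketch – pick lengths that match your own data): content type machine names in Drupal are limited to 32 characters, so a "type" column never needs VARCHAR(255):

  -- Shrink a string column whose values are known to be short.
  ALTER TABLE search_api_db_content_index
    MODIFY `type` VARCHAR(32) NULL;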

The "Needs tests" tag is just for the planned "proper solution" – if you agree, I would be fine with committing the attached patch, as a quick workaround, as-is.

borisson_’s picture

Status: Needs review » Reviewed & tested by the community

I think we can commit this patch as-is and open a follow-up for the multi-table solution described in #4 as a more permanent solution.

drunken monkey’s picture

Status: Reviewed & tested by the community » Fixed
Issue tags: -Needs tests

OK, thanks for reviewing and your feedback!
Created #2876237: Enable spreading of the denormalized index table over multiple database tables to circumvent problems with the row size limit as the follow-up and committed here.
Thanks again, everyone!

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.