When using the processors "Transliteration" or "Ignore characters" to replace/filter out emojis, the Type of the Field doesn't matter.
Steps to reproduce
- Install Drupal with search_api and search_api_db (I've used Core 8.7.6 and search_api 8.x-1.14)
- Add a Search Server (all default options, name Server)
- Add a Search Index (name Index, tick Datasources - Content, select Server Server)
- Go to [site_url]/admin/config/search/search-api/index/Index/fields and add field.
- Click 'Add' button on Content - Title, click 'Done'
- Click 'Save changes', leaving the Type as the default String.
- Create a Basic page with Emojis in the title.
- Run Cron to trigger indexing of the new page.
- Go to watchdog log and see this SQL-Error:
SQLSTATE[22007]: Invalid datetime format: 1366 Incorrect string value: '\xF0\x9F\x92\x96 ...' for column 'value' at row 1: INSERT INTO @search_api_db_b_title (item_id, value) VALUES (:db_insert_placeholder_0, :db_insert_placeholder_1); Array ( [:db_insert_placeholder_0] => entity:node/1:en [:db_insert_placeholder_1] => 💖 💎, Daisy Meadows 👑💎 )
Comment | File | Size | Author |
---|---|---|---|
#11 | non_text_typed_field_break_with_emojis_2022_09.patch | 1.18 KB | VasyOK |
#7 | non_text_typed_field_break_with_emojis_3060442-7.patch | 1.07 KB | Spokje |
 | Screen Shot 2019-06-08 at 6.13.53 PM.png | 151.96 KB | destinationsound |
 | Screen Shot 2019-06-08 at 6.13.41 PM.png | 404.15 KB | destinationsound |
Comments
Comment #2
drunken monkey
This should already be supported since a few years ago. And we even have tests for it, I’m pretty sure.
Which database system are you using?
Comment #3
borisson_
Yes, make sure your database supports utf8mb4 and was created with it, and then it should just work :)
Comment #4
Spokje
I re-activated this one since I'm also experiencing problems with Emojis.
It all works fine if you use Type "Text" for your Fields. If using any other type: Nastiness.
Steps to reproduce
- Install Drupal with search_api and search_api_db (I've used Core 8.7.6 and search_api 8.x-1.14)
- Add a Search Server (all default options, name Server)
- Add a Search Index (name Index, tick Datasources - Content, select Server Server)
- Go to [site_url]/admin/config/search/search-api/index/Index/fields and add field.
- Click 'Add' button on Content - Title, click 'Done'
- Click 'Save changes', leaving the Type as the default String.
- Create a Basic page with Emojis in the title.
- Run Cron to trigger indexing of the new page.
- Go to watchdog log and see this SQL-Error:
SQLSTATE[22007]: Invalid datetime format: 1366 Incorrect string value: '\xF0\x9F\x92\x96 ...' for column 'value' at row 1: INSERT INTO @search_api_db_b_title (item_id, value) VALUES (:db_insert_placeholder_0, :db_insert_placeholder_1); Array ( [:db_insert_placeholder_0] => entity:node/1:en [:db_insert_placeholder_1] => 💖 💎, Daisy Meadows 👑💎 )
Note: Your mileage on the exact error will vary with your chosen title for the Basic Page.
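For anyone wondering what the '\xF0\x9F\x92\x96' in the error is: those are the four UTF-8 bytes of the first emoji in the title. MySQL/MariaDB's utf8 charset stores at most three bytes per character, which is why the insert is rejected. A minimal Python sketch to check a title for such characters (the helper names are made up for illustration, this is not search_api code):

```python
def utf8_width(ch: str) -> int:
    """Number of bytes a single character occupies in UTF-8."""
    return len(ch.encode("utf-8"))


def four_byte_chars(text: str) -> list:
    """Characters that need 4 bytes in UTF-8, i.e. too wide for 3-byte utf8."""
    return [ch for ch in text if utf8_width(ch) == 4]


# U+1F496 is the sparkling heart from the error message above.
heart = "\U0001F496"
print(heart.encode("utf-8"))  # b'\xf0\x9f\x92\x96'

# Plain letters and punctuation pass, the emoji is flagged.
print(four_byte_chars(heart + " Daisy Meadows"))
```

Any title for which four_byte_chars() returns a non-empty list should trigger the error on a utf8/utf8_general_ci column.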
Comment #5
Spokje
Updated Title and Issue Summary
Comment #6
Spokje
Changed Category from Feature request to Bug report
Comment #7
Spokje
It looks like we're shooting ourselves in the foot in class Drupal\search_api_db\DatabaseCompatibility\MySql, where we set tables to the Emoji-unfriendly utf8/utf8_general_ci whenever we do _not_ use Type Text: https://git.drupalcode.org/project/search_api/blob/8.x-1.x/modules/searc...
According to the comment above this code, this is used to be able to use the full 255 characters as a primary key.
My patch will probably completely ignore that, but I'm curious to see what current tests will break.
I think at the very least we would need something filtering out non-utf8 characters when not using utf8mb4 to prevent the mentioned MySQL error. And also, of course, a Test that proves my current reproduction steps are valid.
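To illustrate what such a filter could look like, here is a rough Python sketch (the module itself is PHP, and the function name is hypothetical). It assumes "non-utf8" here means characters above U+FFFF, which need four bytes in UTF-8 and so do not fit MySQL's 3-byte utf8:

```python
def strip_four_byte_chars(text: str, replacement: str = "") -> str:
    """Drop (or replace) characters above U+FFFF.

    Assumption: the target column uses MySQL's 3-byte utf8 charset,
    so only code points up to U+FFFF can be stored.
    """
    return "".join(ch if ord(ch) <= 0xFFFF else replacement for ch in text)


title = "\U0001F496 Daisy Meadows \U0001F451"
print(strip_four_byte_chars(title))       # emojis removed, spaces kept
print(strip_four_byte_chars(title, "?"))  # emojis replaced by '?'
```

Whether to drop the characters silently or replace them with a placeholder is a design choice; dropping keeps the indexed value clean but changes its length.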
Comment #8
Spokje
Comment #9
Spokje
Right, that didn't go well...
On my setup (D8.7.6, search_api 1.14, locally with
Ver 15.1 Distrib 10.3.9-MariaDB, for Win64 (AMD64)
and on the server with
Ver 15.1 Distrib 10.1.41-MariaDB, for debian-linux-gnu (x86_64) using readline 5.2
) I only see primary keys with a maximum length of 150, so for my case I think I could get away with my patch (and drush pmu search_api_db, drush en search_api_db, drush sapi-c and drush sapi-i).
Maybe a forced processor 'Ignore characters' so when using Field Type String?
Anyway, we need a failing test to reproduce first. I'll try to create one tomorrow, please feel free to create one before I can.
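The arithmetic behind the 255 vs. 150 character question can be sketched as follows. This assumes the 767-byte index key limit of older InnoDB row formats; the thread does not say which limit the code comment refers to, so treat that number as an assumption:

```python
# Assumption: 767-byte index key limit (older InnoDB row formats).
INDEX_LIMIT_BYTES = 767


def max_indexable_chars(bytes_per_char: int, limit: int = INDEX_LIMIT_BYTES) -> int:
    """Longest key, in characters, that fits the byte limit."""
    return limit // bytes_per_char


def key_fits(length_chars: int, bytes_per_char: int, limit: int = INDEX_LIMIT_BYTES) -> bool:
    """True if a key of length_chars characters stays within the limit."""
    return length_chars * bytes_per_char <= limit


print(max_indexable_chars(3))  # utf8 (3 bytes per character)
print(max_indexable_chars(4))  # utf8mb4 (4 bytes per character)
print(key_fits(255, 4))        # full 255 characters with utf8mb4
print(key_fits(150, 4))        # the 150-character keys seen above
```

Under that assumption a 255-character key only fits with 3-byte utf8, while keys of 150 characters still fit with utf8mb4, which would match the observation above.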
Comment #10
Spokje
Comment #11
VasyOK
Thanx Spokje! 🤝
Your solution works.
While patching with Composer, it writes:
Could not apply patch! Skipping.
In the current version of search_api, something in this fragment has changed:
Actual patch attached.
Yes, after uninstalling search_api_db some settings are gone.
So this is needed:
It's convenient with drush: