I can't set an score as 0.5 with PostgreSQL
The errror.
Drupal\search_api\SearchApiException: SQLSTATE[22P02]: Invalid text representation: 7 ERROR: invalid input syntax for integer: "0.5" [error]
LINE 1: ...ATE search_api_db_default_index_text SET score=score * '0.5'
^: UPDATE search_api_db_default_index_text SET score=score * :mult
WHERE (field_name = :db_condition_placeholder_0) ; Array
(
)
in Drupal\search_api_db\Plugin\search_api\backend\Database->fieldsUpdated() (line 1034 of
/mnt/persist/www/visitlinkoping_prod/web/modules/contrib/search_api/modules/search_api_db/src/Plugin/search_api/backend/Database.php).
Way to replicate it.
Install a Drupal Standard profile with PostgreSQL the stable one from Ubuntu 16.04. Install Search API Database.
Create the defaults server, and index, add a field like body, put it with Score 0.5, try to index, you'll get the error.
Comments
Comment #2
killua99 commentedComment #3
drunken monkeyThanks for reporting this issue! Once again, MySQL is just way laxer than Postgres regarding these things.
Well, nothing for it. Let's see if the attached SQL syntax works for all three DMBSs we support.
Comment #4
drunken monkeyOops.
Comment #7
drunken monkeyNo idea what's happening there in Postgres. This should help debugging.
Comment #9
drunken monkeyAnd this has a chance of fixing, or at least helping, with the MySQL fails.
Comment #11
drunken monkeyComment #13
drunken monkeyThis is getting exhausting …
Comment #14
drunken monkeyComment #15
killua99 commentedLet me help you.
The test is failing with this patch or is a test fail from the current branch ?
Comment #16
killua99 commentedAlso All the test are against Drupal 8.4.x, that .... is way into the future, any reason for this? what about 8.2.x ? or stable one? 8.2.5? I guess using Drupal 8.4.x does not affect the test failure.
Also I mange to control the issue with the unserialized I guess, I'm not 100% sure, but PostgreSQL do not use UNSIGNED, in that case is SERIAL, but I do not believe we're gonna have negative numbers.
The next issue is when I set a field with score 3. If I leave AS INTEGER, the postgresql has an error about "Error numeric out of range" ... Still if I set to BIGINT still there are numbers, or Scores that are way to big, I guess this is because the operation is at the end, (score * 15), that cause a really big number. I was trying to delete de index and reindex everything but I can, should I TRUNCATE the table manually?
Comment #17
killua99 commentedThis patch might work for all databases. I guess
Comment #18
killua99 commentedMeh, it fails to apply. Reroll.
Comment #21
killua99 commentedWell, when you try to edit too fast on notepad.
Comment #23
killua99 commentedI find out the root of the unserialized.
The Task.php Content Entity create the table.
On PostgreSQL that is equal to = text. That mean really short character. This need to be
It require an update.
Comment #24
drunken monkeyThanks for trying to help!
However, as you can see from the test results, #14 is already pretty good. There's just some pretty inexplicable fail on Postgres, with an illegal serialized value being stored, which I don't really know how to debug. Can you reproduce that problem/test fail locally? Debugging that would really help – I can neither reproduce nor explain it.
Using type "BIGINT" shouldn't be necessary as the database column itself uses normal "INT" as well. And I really don't know what the purpose of all these additional function calls would be.
Comment #25
drunken monkeyOh, sorry, wrote that comment before seeing your last post. Awesome work, thanks for finding that out!
Not sure how this is caused/triggered by this patch, though. But if your conclusion is true, it's a big problem in any case, I guess.
Do you know where that SQL is created, and how we can fix this on our side?
However if Postgres really creates a field with a max length of (apparently) 47 characters for a type called
string_long, this sounds to me like a Core bug.Comment #26
killua99 commentedSince patch #4 only test you could see the serialize is broken.
With PostgreSQL the way the task is safe cause in broken data. That is the process we should look first, then, the integer issue.
Comment #27
killua99 commentedIs not a core problem tho, I'm concluding that type field might work too, the thing is, the task when is saved is broken.
Comment #28
drunken monkeySorry, I don't understand this. Why do you say that this is not a Core bug?
Comment #29
killua99 commentedThe object that try to serialize is half, or less. Others objects serialized are complete. Just the one trying to change the score fails. The field string_long might work too.
Comment #30
killua99 commentedComment #31
borisson_Not sure if we should do this, but I did see a couple of things that can be improved in the patch.
We already cast to an integer, so why do it again here?
Coding standards: There's too many blank lines here, it should be like this, I think.
Does this work? It doesn't look like it should. Don't we need to, at least, do:
throw new \Exception($e->getMessage());?The task manager uses constructor DI to inject needed services, let's do that for this service as well.
Comment #32
killua99 commentedActually I see now that, my entire comment was strip away from this patch. No idea why.
I thing is, I'm agree with these changes, the really issue is, the update process of score have to be rewritten. The update score multiples for what do you have on the database, and I was trying to catch the bug, I see at the indexItem the score was rounded and checked to don't be so big. If you have a score of 200, and the multipler will be 10 the next score will be 2000, if we change the score again trying to figure why the score does not work, it will do another multiplier against the current score instead look at the token and then do the math.
Mainly this issue has to mayor for my point of view, the Class serialization, that's currently handle by the php function serialize that for Class and mostly for Entity get a broken serialize result, that's the main error on the Test. Later the score, when you update any field.
My thought is, the way the Task Entity store data should be changed to \Serialize. That will be more solid the way we rebuild an Entity from the serialize string.
This patch might not be taken for commit, this require work. and maybe a META issue to attend all this issues.
Comment #33
drunken monkeyI'm very sorry, but I really don't understand what you're trying to say, and your patch mostly looks like temporary hacks and nothing that could be committed to the module. And, again, most of the changes look like they're completely unnecessary seeing as #14 already almost passes.
Please try to explain again what the problem is that causes the test fails for #14. If it's a truncated field value – how would using a different serializer help?
If you have identified the problem and know how to fix it (and it's really not a problem in Core), then please use my patch in #14 as a base and only add the changes that are necessary to make the tests pass on Postgres.
Comment #34
killua99 commentedThe patch #14 does not work neither.
The error is clear, Drupal\Tests\search_api_db\Kernel\BackendTest::testBackend
Exception: could not unserialize: O:30:"Drupal\search_api\Entity\Index":27:{s:5:"
the taskmanager.php when serialize the Class Index get broken and with weird characters and save only the half of it. on MySQL it saved with those weird characters, but it should use the Serialize class instead the function.
On PHP documentation they mention about this with Class, that might get broken. The patch #14 to able to work with PostgreSQL that is this issue, the Serialize process has to be change.
I did an interfiff against you patch #14 to see the differences. The patch require more work, but is the only way to guarantee a good Serialize way. Also the UPDATE process of the Index is completely wrong. It do the score against the Score getting bigger and bigger the INTEGER.
Comment #35
drunken monkeySorry, but it's really almost impossible for me to tell what you're trying to say. Do you maybe have a colleague with better English who can translate? I think this is working for neither of us.
Anyways, I think you're now saying that Postgres can't handle the special characters introduced by serializing an entity? In that case, we can just serialize it's values, can't we? Using the
serializerservice would a) introduce a dependency on the "Serialization" module and b) apparently require us to know beforehand what class the serialized data should have (which we don't).Patch attached (interdiff compared to #3, which is apparently the only serious patch in this issue, without any debugging code).
If I interpret this correctly, I don't think you're reading the code right there. Or how do you think it should look?
Comment #36
borisson_So, if we want to improve the postgres handling, we should make sure that this passes on postgres?
Comment #37
drunken monkeyYes, of course! (All patches should pass on Postgres.)
So, seems like it's time for another debugging patch!
Comment #39
drunken monkeyOh. Right.
Comment #41
drunken monkeyMore debugging!
Tried sqlfiddle.com as an alternative, but that doesn't really seem to work.
Maybe I should try again to set up Postgres locally? Can't really be more tedious than this (he said, unconvinced).
Comment #42
drunken monkeyForgot the Postgres test above, but might not matter, actually: I managed to install Postgres this time and found a cast syntax that seems to work for all three DBMSs! (Although I couldn't actually reproduce the initial bug reported here with my Postgres installation, so not completely sure.)
Let's give it a try! (Interdiff compared to #39, of course.)
Comment #44
drunken monkeyWhat?
Comment #45
drunken monkeyHm, OK, seems the cast doesn't really get us anything – Postgres refuses to multiply an integer with a float literal in any case, it doesn't even get to the surrounding cast.
This workaround should work in any case, though – no idea why I didn't think of that before.
Comment #47
drunken monkeyLooks pretty good in my local tests, but to avoid problems with very large numbers (as we already saw in a different issue) I guess we should only use that if necessary, and with as small a scaling factor as possible (maximum 1000, though – who really needs more than three digits of precision there?).
Comment #48
borisson_Tests are green, I like the workaroud. It's not perfect but does sound like a good idea. Should we protect against too large numbers by using
max(PHP_MAX_INT, pow(10, min(3, $after_point_digits)));?If you don't think that's needed, we can commit this I think.
Comment #49
drunken monkeyNeeds a re-roll. Some lunatic committed a patch changing all arrays to short syntax …
No, such a check is definitely not necessary, especially for the line you quote – that's at most 1000. And the total multiplier can't be much larger. The only integer overflow that might happen is in the database table, and I don't think we can really guard against that. The problem of the DBMS, I'd say. (At least until enough people complain.)
So, if this passes, I'll commit it for RC 2.
Comment #50
borisson_Thanks for explaining! Looks good!
Comment #52
drunken monkeyOK, good to hear.
Committed.
Thanks again for everyone's work here!