Issue originally opened at #2297817: Do not attempt field storage write when field content did not change and kept hostage for a year because of Drupal 8. Now that the Drupal 8 patch does not seems as obvious as it was and discussion is going further, I open a Drupal-7 only patch as per the backport policy states (actually it should be the other way around but the original issue is now too deep in Drupal 8 to change back).
This patch introduces a check in field SQL storage when updating or saving entities which prevents it from doing useless DELETE/INSERT when saving unchanged content. See details in the other issue, but this is a huge performance win when dealing with a high-frequency writes sites which has many fields (many start to less than 10 to see a difference).
Comments
Comment #1
pounardComment #2
Fabianx CreditAttribution: Fabianx commentedRTBC per RTBC of the simple patch in #2297817-60: Do not attempt field storage write when field content did not change.
Comment #3
Fabianx CreditAttribution: Fabianx commentedAnd this is at least major for performance reasons.
Comment #4
Fabianx CreditAttribution: Fabianx commented.
Comment #5
marcingy CreditAttribution: marcingy commentedThis can't go in until fixed in d8.
Comment #6
marcingy CreditAttribution: marcingy commentedComment #7
amateescu CreditAttribution: amateescu for Drupal Association commentedIt's fixed in D8 now :)
Comment #10
joelpittetOk so this broke a commerce site:(
Try this patch and try to save a commerce_price field with a
0
int value for the amount.Throws this as it seems to mutate the the
0
value toNULL
:Also a nitpik:
Capitalize the boolean vars for coding standards.
Comment #11
joelpittetI think this is somehow mutating the value but I don't understand why and can't create a MVC to reproduce:(
Comment #12
joelpittetAll I know is that is where the value gets dropped, (to be clear it's dropped after exiting
_field_sql_storage_write_compare()
on the$field_name = $field['field_name'];
line after the comparison.The
amount
is removed.Comment #13
joelpittetSo you don't think I'm crazy(this may not stop you from thinking that):
Here's the script I used:
https://gist.github.com/joelpittet/233b866eddbd1c806784
I have a taxable_price field that and was updating an existing field, I'm sure that could be modified into a test somehow.
Comment #14
pounardThis is because of the array_filter() call, a callback must be provided as second argument that only exclude
=== ''
and=== null
values, then your site will work again! Will fix that, thanks for finding this bug.Comment #15
joelpittet@pounard you likely have a better grasp of what's going on but why is the $entity->field's values changing as they are passed by value and not by reference to the comparison function, no?
That's what's freaking me out a bit and I can't come up with a MVC code in 3v4l to reproduce what I'm seeing as I step through the code.
Actually haha, you can see I just realized in my screenshot above that I tried cloning the entity before passing it in to subvert this bug. (Same thing happened with or without that hack)
Comment #16
Fabianx CreditAttribution: Fabianx commented#14: Is there anything that changed in D8 that we need to port back? I remember that the filtering was also moved somewhere else IIRC in D8 ...
Comment #17
pounard#16 I'm not sure, it's just a matter of '0' values being filtered out, maybe can you have a look to what's been commit to D8 and look for potential array_filter() calls ?
Comment #18
Fabianx CreditAttribution: Fabianx commented#2297817-71: Do not attempt field storage write when field content did not change had the same observation on the filtering in equals, it was fixed in #72.
I think we may need to do another straight port of the D8 committed patch to get the remaining fixes after the fork ...
Comment #19
pounardStraight backport seems a very difficult task, since the API is very different. I did some minor modifications to my original patch due to problems with Commerce, here is a SVN patch version (sorry can't use git from here). Posting here so that I can find it back when I'll have a real internet connection :)
Comment #20
Jordan Samouh CreditAttribution: Jordan Samouh commentedHello Pounard,
We have this kind of problem
Thanks for this patch.
Did you see some others problems by appliyng this patch?
I see the status Needs work
Thanks.
Comment #21
pounardWe actually use this patch on 2 production sites for now quite some time, one commerce and another one more standard, still did not experience any bugs. The commerce one was prone to the "0" value error, but now seem to be working fine with the latest patch.
Comment #22
discipolo CreditAttribution: discipolo commentedjust wanted to verify whether the expected result of this patch are less sql queries when saving nodes without changing fields? (as long as the node is not a new revision)
Comment #23
pounardThat's pretty much it, indeed.
Comment #24
boyan.borisov CreditAttribution: boyan.borisov at FFW for LUSH Digital commentedPatch #19 works great for me but in some corner cases if I have broken/incomplete entity I got an error like this:
array_intersect_key(): Argument #1 is not an array [error]
File modules/field/modules/field_sql_storage/field_sql_storage.module, line 461, in class="placeholder">sites/all/modules/contrib/migrate/includes/base.inc:707
I've created a patch where I do an additional check just before to use array_intersect_key()
Comment #25
mikeytown2 CreditAttribution: mikeytown2 commentedLet's see what the testbot does
Comment #31
pounard#19 patch does not apply because it's not a GIT patch, but an SVN-like patch, it needs to be applied manually using the
patch -p0 < PATCHFILE
command, then re-done usinggit diff > NEWPATCHFILE
then the testbot will be able to apply it.Comment #32
pounardProper patch that should work. I added #24 minor modification.
Comment #33
pounardYay ! Reviews or RTBC please !
Comment #34
DamienMcKennaDoes this need additional tests, or is the fact that all tests pass sufficient to say it's ok?
Comment #35
pounardHum good question, I have to admit I would be more comfortable myself with a few additional tests, but if core did it right, it supposed to be okay. Nonetheless this patch still did encountered a few bugs without tests failing, such as not updating
0
values which were considered as empty by array_filter() hence the_field_sql_storage_write_compare_filter_callback()
function, this means that core tests are not complete.Comment #36
DamienMcKennaIt also needed a few minor tweaks for coding standards compliance ('null' vs 'NULL', 'false' vs 'FALSE').
Comment #37
DamienMcKennaPutting this back to "needs work" as it needs to be more precise in its comparisons, and needs tests to check edge cases.
Comment #38
pounardConsidering
I think it's ok as it is, but I do agree with tests.
Comment #39
ndobromirov CreditAttribution: ndobromirov commentedHi,
Is there any progress on this? This is quite the huge performance optimization, as sparing the write to a field table will spare also the query cache entries for it, this greatly improving MySQL's performance. Note that query cache is enabled by default in MySQL and depending on the traffic patterns it might cause more harm than help if there are more writes than not. This is still environment / project dependent, but still relevant note that affects scalability.
BR, Nick.
Comment #40
pounardReal status is no one seems to care since Drupal 8 branch was created, but I'm patient :) I don't loose hope about this patch going in.
Comment #41
das-peter CreditAttribution: das-peter at Cando commentedWe're using this patch for quite a while and I just came across the first issue.
If a "broken" entity is handled you might get a notice like: array_intersect_key(): Argument #1 is not an array.
File /var/www/html/modules/field/modules/field_sql_storage/field_sql_storage.module, line 461.
_field_sql_storage_write_compare_filter()
, fairly, assumes that$item
is an array - which not necessarily is the case if the field / entity / something is broken.In my current case
$item
isNULL
- now the question is how to resolve such a situation.I suggest we simply skip the cleanup if
$item
isn't an array.Comment #42
dreamer777 CreditAttribution: dreamer777 commentedDoes this solution work with field collection fields and items?
Comment #43
hosef CreditAttribution: hosef at Hook 42 commentedWhat kind of things need to have tests written for this to be ready? I can write the tests, but I don't know what tests need to be written.
Comment #44
hosef CreditAttribution: hosef at Hook 42 commentedIt appears that the problem reported in #41 was fixed already in #24
Comment #45
Fabianx CreditAttribution: Fabianx commented#44:
The most important thing is that D8 changed some things about the approach and those need to be backported to D7 first.
Then we should backport the additional tests that D8 provided as well.
And then it can go in - possibly as opt-in experimental feature first.
Comment #46
hosef CreditAttribution: hosef at Hook 42 commentedI ran a quick test, and found that Drupal 8 also runs a delete/insert for every field/property even if only 1 thing was changed, so this needs to be fixed in Drupal 8 as well.I, also, found that the field module in Drupal 8 does not handle the actual saving of the data at all. That is handled in Drupal\Core\Entity\Sql\SqlContentEntityStorage, and the way it works is so different from Drupal 7 that backporting the Drupal 8 change does not look like a good idea.
For the Drupal 7 patch: The test coverage in the field_sql_storage module looks pretty complete already, so I would just need to write some tests for the comparison functions to make sure it compares complex fields correctly.
EDIT: I took a look at the code, and it does appear to check the field data before saving. The large number of queries I saw was because "create new revision" defaulted to true and I forgot to turn it off.
Comment #47
pounardAgree, I don't think that D8 backport would be doable in any way.
Comment #48
hosef CreditAttribution: hosef at Hook 42 commentedchanging back to Drupal 7 because I was mistaken when examining if it was an issue in Drupal 8. :)
Comment #49
hosef CreditAttribution: hosef at Hook 42 commentedHere is a patch that fixes a few more coding standards things and also adds a new test in field_sql_storage.test.
Most of the scenarios that I check for in the test should probably not happen unless a module is really mangling the data, but we should handle them gracefully anyway.
Comment #51
hosef CreditAttribution: hosef at Hook 42 commentedComment #53
hosef CreditAttribution: hosef at Hook 42 commentedI'm not sure what is going on, but there are lots of test failures that are unrelated to fields and when I re-queue it different tests fail.
Comment #54
hosef CreditAttribution: hosef at Hook 42 commentedWell, I re-queued it again and they all passed, so something weird must have been going on with one of the testbots.
Comment #55
hosef CreditAttribution: hosef at Hook 42 commentedComment #56
btully CreditAttribution: btully commentedIs this the latest working patch for D7? It does not apply cleanly against Drupal 7.54. Any advice on how to apply it?
Comment #57
oadaeh CreditAttribution: oadaeh at Hook 42 commented@btully, I just applied the patch cleanly to a current dev branch of D7 checked out with Git, so it still does apply cleanly for the situation in which it was created.
If you are applying the patch with the patch utility and/or to a stable release packaged file downloaded from this site, maybe try executing this command:
patch -p1 --merge < drupal-stop-saving-unmodified-field-data-2470619-49.patch
I found it here: https://stackoverflow.com/a/30860581
Comment #58
oadaeh CreditAttribution: oadaeh at Hook 42 commentedI think the variables being checked should be before the value they are checked against.
I think this setting could just be set before the conditional check. Then the variable will be set regardless of what happens and the else block will be unnecessary.
However, those are just nit picks that do not affect the operation, and the other changes look fine to me.
The code changes have been in production for us since mid-December without any problems.
Comment #59
David_Rothstein CreditAttribution: David_Rothstein as a volunteer commentedComment #60
MustangGB CreditAttribution: MustangGB commentedThis is so nice, reduced my loading times on a badly performing page (ajax updating multiple hundreds of nodes) by something like 80%.
Comment #61
oadaeh CreditAttribution: oadaeh at Hook 42 commentedThe patch still applies cleanly to the code changes for 7.55 & 7.56, and I just re-queued the tests to verify they all still work.
Comment #62
mxr576Okay, so how we can give a boost to this patch to be committed to the core? What is remaining?
Comment #63
joseph.olstadFor those tired of waiting, I am going to make an install profile for D7 including this gem and all the other rtbc patches for 7.70 and possibly some extra performance patches, check my sandboxes in early to mid February. **UPDATE** the media_dev distribution has this patch on the latest 7.x-dev core branch and many others. ** END UPDATE **
Comment #64
hosef CreditAttribution: hosef at Hook 42 commentedI just tested it on my local machine against the latest D7 code base. It still applies, and all the tests still pass.
Comment #65
joseph.olstadBumping to 7.70. This didn't make it into 7.60
Comment #66
Fabianx CreditAttribution: Fabianx at Tag1 Consulting for The Linux Foundation commentedThis should be: != and not !== according to https://www.drupal.org/project/drupal/issues/2475237#comment-9879243
Comment #67
heddnPorting #2475237: Method FieldItemList::equals() uses type safe comparison limiting usefulness into this patch.
Comment #68
heddnComment #69
quietone CreditAttribution: quietone at Acro Commerce commentedOnly two items on wording.
Unfortunately, not confident with Drupal 7 to set this RTBC (and haven't read the full history because of that) :-(
s/sometime/sometimes/, I think.
'may stale'?
Comment #70
joshmillerWe recently deployed a Drupal 6 to Drupal 7 migration where we used this patch and #2994212: field_sql_storage_field_storage_load does use an unnecessary sort in the DB leading to a filesort to speed up a 72+ hour migration to under an hour on the same hardware. For certain data intensive work, this is a must-have and works quite well at speeding up Drupal Commerce on the day-to-day work of a site as well.
Comment #71
MustangGB CreditAttribution: MustangGB commentedSeems like the whole major/minor release idea went out the window a long time ago.
Comment #73
MustangGB CreditAttribution: MustangGB commentedComment #74
joshmillerA re-run of the test shows this is working. Back to RTBC.
Comment #76
izmeez CreditAttribution: izmeez commented@joshmiller Are you referring to the patch in comment #49 despite the comments in #58 or are you referring to the patch in #67 ? That one appears to be failing tests on php 7.2Looks like tests on patch in #67 are passing.Comment #77
Ronino CreditAttribution: Ronino as a volunteer commentedNow all tests for #67 succeeded.
I don't get the improvements like in #70, but patch #67 speeds up some of my migrations by 50 %. Great!
Comment #78
MustangGB CreditAttribution: MustangGB commentedComment #79
potop CreditAttribution: potop commented#67 works fine in our production environment more then 6 months now
Comment #80
mcdruidChanges to comments; including fixing a couple of things that didn't quite make sense (e.g. #69) and re-flowing for the line length limit in places.
I'm curious why one of these values is cast to a string in the tests:
...although I don't think it's particularly significant.
I'm also not quite sure what this assertion achieves:
...but again I don't think it's causing a problem as such.
I've looked at #2297817: Do not attempt field storage write when field content did not change to see if anything else should be backported and not spotted anything obvious.
Leaving at RTBC as I've not changed anything other than comments.
Comment #81
mcdruidIn #45 @Fabianx suggested:
This introduces a new variable whereby sites could opt out of this optimization if they need to.
NR for that change.
Comment #82
joseph.olstadJudging from the interdiff, totally makes sense and looks like similar opt out strategy that was recently put into D7 core by McDruid, since the previous patch was RTBC, I will move this back to RTBC
Comment #83
Fabianx CreditAttribution: Fabianx at Tag1 Consulting for The Linux Foundation commentedRTBC + 1 - looks great, but should be:
- opt in for old sites
- automatic on for new sites and can be disabled
This minimizes risk of data loss and optimizes potential usage.
Comment #84
joseph.olstad-1 for the opt in
I think we've already mitigated the risks enough, automated test coverage on all major versions of mysql and even automated tests on sqlite
I'd prefer to set this into the wild and we just deal with whatever comes up like we do with any other major/minor change in core/contrib.
Comment #85
izmeez CreditAttribution: izmeez commentedThis should be an opt-out. The performance difference has been reported as significant and especially important for migrations. It makes no sense to perpetuate reduced performance as the default.
Comment #86
mcdruidAs the consequence of this going wrong is data loss, I agree with making it opt-in for existing sites.
We need a Change Record (/ Notice), in which we can extol the performance virtues of this and encourage existing sites to test and opt-in to the optimisation.
Comment #87
joseph.olstadRecently the date module release caused data corruption, despite automated tests and years of work on the release, someone reported the issues immediately and it's getting worked on. yes bad stuff can happen but in the name of progress and performance, we should continue to take risks because the payoff in this case is great and we've done a best effort of mitigating the risks.
Comment #88
izmeez CreditAttribution: izmeez commentedComments #70 and #77 may be useful for a Change Record.
Comment #89
joseph.olstadQuote from comment #70
IMHO, the performance gains are worth the risk, I mean, up to 7200% performance gains, this is more than significant, it is a revolutionary improvement! Just this alone probably merits a fork of Drupal 7 to a Super 7 fork.
just keeps getting better all the time! Great work everyone
Comment #90
pounardGlad that after 6 years, it finally gets in, I wish I could go back in time :) I sadly don't maintain any Drupal 7 site anymore, but I sure hope it will help people ! Thanks everybody.
Comment #91
izmeez CreditAttribution: izmeez commentedAlmost in, keep your fingers crossed.
Comment #92
Fabianx CreditAttribution: Fabianx at Tag1 Consulting for The Linux Foundation commentedRTBC + 1, let's get this in. Thanks for making this opt-in.
For new sites it will be enabled automatically so any new migrations will be able to profit (presuming they are on new sites).
Still needs a Change Record.
Comment #93
joseph.olstadOn fresh installs this will be enabled by default, for existing sites, it's an opt-in. Oh yes the change record! Hoping someone can do that for us. If someone doesn't do it in March I will do it in April. A little busy atm.
Comment #95
mcdruidCommitted (yay!) but still needs the CR, so not marking as Fixed just yet.
Comment #96
joseph.olstadI added a draft change record
https://www.drupal.org/node/3205293
Comment #97
izmeez CreditAttribution: izmeez commented@joseph.olstad Thanks for drafting a CR. I have modified it slightly to clarify one line and added a bit on the performance with migrations.
Comment #98
joseph.olstadYour CR additions look great, thanks izmeez.
Comment #99
izmeez CreditAttribution: izmeez commentedIt's good to see this committed. There are a lot of people who made this happen including @mcdruid and @fabianx.
It may be frowned upon but I can't resist putting in a plug for #2994212: field_sql_storage_field_storage_load does use an unnecessary sort in the DB leading to a filesort with a simple patch from @fabianx. I don't have enough knowledge to comment on the SQL but it still applies to the current core and looks like a companion to this issue when it comes to improving performance while not as much as this one.
Comment #100
mcdruidThanks for the work on the CR, and thank you to everyone that contributed to this!
Comment #101
mforbes CreditAttribution: mforbes commentedI've only skimmed through the comments, so I apologize if this has already been answered. Is there any guidance on how to decide whether an existing site can opt in safely, or is the whole opt-in idea just mitigating the unknown such that safety comes only from waiting for a potential regression issue/release?
Comment #102
MustangGB CreditAttribution: MustangGB commentedBasically this, I don't believe there are any known issues, that said personally I'd implement it first on a staging/test site copy for a quick check everything is still working as expected.
Comment #103
gbirch CreditAttribution: gbirch commentedI'm coming to this very late, but a quick review of _field_sql_storage_write_compare($field, $entity1, $entity2)
makes it appear that the following code
will predictably return the wrong answer when $entity1->$field_name === NULL and $entity2->$field_name == 0 or '0' (or vice versa).
In many cases NULL is really not the same as a zero value, especially in things like questionnaires, where you need to distinguish between a boolean NO (0 by default), and a question that has simply not been answered.
In short, you can't "safely assume" that because two things are empty() they are the same.
Unless of course there is some kind of implicit assumption that field values will always be stored in an array under the field name.
Comment #104
MustangGB CreditAttribution: MustangGB commentedIf it's not then something else has gone badly wrong, as I understand it this is not the value of the field being checked, just whether the field has has data (array) or has no data (empty).
Presumably
empty()
is faster thanis_array()
also.Comment #105
gbirch CreditAttribution: gbirch commented@MustangGB Yes, you're almost certainly right about that. Sorry for the false alarm.
Comment #106
Phizes CreditAttribution: Phizes at Wunderman Thompson Technology commentedI'm sorry to have missed the progress of this ticket into core. A major warning for anyone looking to enable this functionality. As it stands, if you change the bundle of an entity, this change is not detected by this code. This will only raise it's head when the entities' old bundle type is deleted, as Drupal cleans up old field values, you will have data loss.
Please opt in with care.
We do use this in production as based on the patch from 67 with an extra check for the entity type. I did not get back to posting a patch as I was having trouble with writing tests for this situation. I can attach a patch without tests later on with the check we have in place for this as we do have a use case where we change the entity type.
While it is an edge case, it was somewhat tricky to pick up on what exactly was going on as there is a delay from the action to the point that the data loss occurs.
In short, you will have data loss if you opt in to the above behaviour in the following scenario:
Comment #107
MustangGB CreditAttribution: MustangGB commentedYes please do! Might make sense to use a new issue though as this is already pretty long.
This does seem like an edge case scenario indeed, I've never heard of anyone changing the bundle of an entity before, but even if it wasn't officially supported I don't see why it should be explicitly broken, and it seems like a pretty reasonable extra check to add.
Comment #108
izmeez CreditAttribution: izmeez commentedIf you add patch in a different queue please be sure to add a link to the new issue in this issue so others can follow. Thanks.
Comment #109
Phizes CreditAttribution: Phizes at Wunderman Thompson Technology commentedI've opened a new issue for this, with a new patch, and no tests: https://www.drupal.org/project/drupal/issues/3210388