Summary / Motivation
Steps to reproduce:
Given I have an entity. In early revisions of the entity, I had data in a field. In the live version, that field is empty.
When I go to field UI and edit that field, even to make a harmless change, Field API decides that field has no data, and erases the history.
Proposed resolution
Let field_has_data use the revisioning data instead of the "current" data.
This works, because there can be data in field_revission tables AND no data in the field_data tables, but not the other way round.
If we have data in field_data, we always have the current data duplicated in the field_revision table.
Therefore following patch fixes the issue:
diff --git a/modules/field/field.module b/modules/field/field.module
--- a/modules/field/field.module
+++ b/modules/field/field.module
@@ -948,6 +948,7 @@ function field_get_items($entity_type, $entity, $field_name, $langcode = NULL) {
function field_has_data($field) {
$query = new EntityFieldQuery();
return (bool) $query
+ ->age(FIELD_LOAD_REVISION)
->fieldCondition($field)
->range(0, 1)
->count()
Comment | File | Size | Author |
---|---|---|---|
#39 | interdiff-10-39.txt | 901 bytes | nlisgo |
#39 | field_has_data_looks_at-2278583-39.patch | 3.24 KB | nlisgo |
#30 | field_has_data_looks_at-test-only-2278583-30.patch | 3.9 KB | Berdir |
#22 | field_has_data_looks_at-test-only-2278583-22.patch | 2.82 KB | nlisgo |
#10 | 2278583-field-has-data-test-only-10.patch | 2.38 KB | nlisgo |
Comments
Comment #1
Fabianx CreditAttribution: Fabianx commentedTagging Novice to create a proper patch
Comment #2
Fabianx CreditAttribution: Fabianx commentedThe novice part includes creating tests - though that is more intermediate novice:
What needs to happen as part of the test:
* Create an entity with a text field that is populated
* Create revisions of this entity with different contents of this text field
* Create a new revision of this entity where the text field is empty
* Run field_update_field() on the field base.
* Ensure the database still has the revision data.
Comment #3
joshi.rohit100Please review the submitted patch.
Comment #4
dcam CreditAttribution: dcam commentedBumping to critical due to data loss.
This issue doesn't seem to affect D8. I used the same test procedure outlined in the "Steps to reproduce" in D8 and D7. I was only able to confirm it for D7.
Since this situation probably doesn't have any test coverage it may have to be added for D8 as well as D7.
Comment #5
dcam CreditAttribution: dcam commentedComment #6
nlisgo CreditAttribution: nlisgo commentedI am going to attempt to write some tests for this. I'll give a progress update in the next couple of days.
Comment #7
nlisgo CreditAttribution: nlisgo commentedComment #8
nlisgo CreditAttribution: nlisgo commentedI have a patch that I will upload shortly. I will upload a test-only patch and then a patch with the fix.
I know the default behaviour of Drupal is to store a duplicate entry of the current field data in the revisions table. But when writing tests I would expect field_has_data to return TRUE if there is a current value for the field but no entries in the revision table.
I will supply a 3rd patch with a fix that addresses this issue.
This would mean that the sites that seek to by-pass the revisions table can still expect the field_has_data function to return TRUE as is the behaviour when they enable and configure https://www.drupal.org/project/field_sql_norevisions.
Comment #9
nlisgo CreditAttribution: nlisgo commentedThis solution covers the fix provided in comment 3. But I have written a test that is expected to return FALSE if data is only found in the field table but not in the revision table and that simply doesn't feel right.
I will upload the next set of patches in another comment.
Comment #10
nlisgo CreditAttribution: nlisgo commentedThis patch extends the fix to check for values in the revision table first and if none are found then it will look in the field table. The same number of tests are run but the final test is now expected to be TRUE instead of FALSE.
Please review.
Comment #11
Fabianx CreditAttribution: Fabianx commentedOh, and that is a very good catch.
Because IIRC D8 now even supports that case to not even create the revision tables - maybe could also check the table definition? (but needs to be different from D7 to D8)
Not yet reviewed the patch, though.
Comment #14
Fabianx CreditAttribution: Fabianx commentedRTBC, looks wonderful to me.
@David Rothstein might want us to port the test coverage to D8 first though.
Comment #15
nlisgo CreditAttribution: nlisgo commentedWe are going to need a D8 test for FieldStorageConfig::hasData() that covers this behaviour.
This test is not necessarily expected to fail as it was for D7 before the introduction of the patch but we need to protect against regression. The bug has not been reported in D8.
In D8 there are tests for the FieldStorageConfig::hasData() in Drupal\field\Tests\FieldDataCountTest::testEntityCountAndHasData() but they do not cover the behaviour described in this issue. This seems the natural place to put the new tests.
I am setting to 'Unassigned' and 'Needs work' because I lack experience with D8 but would be happy to review a fix.
Comment #16
Fabianx CreditAttribution: Fabianx commentedComment #17
nlisgo CreditAttribution: nlisgo commentedThis is a test only patch for D8. I am creating a revision of an entity with a field value then creating a new revision of the entity with no value for that field and hasData() returns FALSE when they expected behaviour may need to be TRUE. However, if this is not an issue in D8 t may be because the data is not deleted when a field is updated (field_update_field in D7) based upon the return value of hasData().
Comment #19
BerdirCan we set it to an empty array instead? unsetting fields is doing weird stuff atm internally.
Looks like the data isn't lost in 8.x, but it probably still makes sense to keep the behavior of hasData() in sync between 8.x and 7.x and check the revision table here as well? Possibly dynamically based on whether the entity type is revisionable, there's still the hope that we can avoid creating field revision tables for non-revisionable entity types in 8.x
Comment #20
nlisgo CreditAttribution: nlisgo commentedImplemented the amend suggested in #19.
Comment #21
nlisgo CreditAttribution: nlisgo commentedComment #22
nlisgo CreditAttribution: nlisgo commentedIgnore patch in #20, this is the same but I have renamed the file to indicate that it is test-only.
Comment #25
nlisgo CreditAttribution: nlisgo commentedThe reason that we can not recreate this is D8 is because an entry is created in the field table even if a value was not supplied. If you follow the steps to recreate in D8 and then delete the entries in the field table and ensure there are still entries in the field revision table then updating the field will case the revision data to be deleted and the hasData method to return FALSE.
I'm sure that is a bug. I'm just trying to determine whether it is a known issue already or I should open a new issue.
Comment #26
swentel CreditAttribution: swentel commentedre: #25, maybe related with #2279405: ContentEntityDatabaseStorage inserts bogus NULL values for new entities and/or #2232477: Fatal when adding new fields with NOT NULL constraints in a base table that contains existing entities ?
Comment #27
nlisgo CreditAttribution: nlisgo commented@swentel, I think you're spot on. The good news about this task is I believe it can progress without awaiting resolutions on those issues.
We need to provide a patch for D8 to return TRUE for FieldStorageConfig::hasData() if there are entries in database for revisions or current field data.
Comment #28
Fabianx CreditAttribution: Fabianx commented@nlisgo: That means we could still reproduce it by deleting values from the DB manually?
Nice that we have that test :) - It would make the other issues fail.
Lets continue with same patch as in D7 here and then move it finally back to there :).
Thanks for all the progress!
Comment #29
nlisgo CreditAttribution: nlisgo commented@Fabianx: that is correct. I confirm that you can reproduce this issue in D8 if you delete the entries in the field table manually. Even though there are entries in the field revision table FieldStorageConfig::hasData() will return FALSE. If you update the field storage settings when there are no entries in the field table then all of the data in the field revision table will be deleted.
When the issues referenced in #26 are resolved this bug will be very reproducible.
Comment #30
BerdirI am a bit confused..
The test is failing right now in 8.x,which kind of proofs that the same problem exists? Without manually deleting anything? I did run the test manually and debugged in and the field data table was empty as expected, so that is why the query returns FALSE, otherwise it would not?
The attached change fixes the test and seems to be the right thing to do here?
Comment #31
nlisgo CreditAttribution: nlisgo commented@Berdir, would you be able to post up an interdiff?
Or maybe I am misunderstanding you. Did you have to change the test at all?
Comment #32
BerdirNot worth it, the only difference is the fix, the test is identical to what you posted. That's why I didn't bother with an interdiff or test-only patch.
Comment #33
nlisgo CreditAttribution: nlisgo commentedThanks, I just misread your message. I will review this tomorrow. Looks good on first glance. Thanks a ton for your help on this.
Comment #34
Fabianx CreditAttribution: Fabianx commentedThe fix looks great, RTBC + 1. Final RTBC can com from nlisgo tomorrow.
Comment #35
swentel CreditAttribution: swentel commented+1 indeed
Comment #36
nlisgo CreditAttribution: nlisgo commentedRTBC + 1
I am going to move to RTBC for the D8 patch.
I think this is ready. I've read and reviewed the code change by the patch in #30. FieldStorageConfig::hasData() now returns the value we expect. In future this code may be revisited for optimisation would make that improvement non-critical and future development would be helped by these tests in place.
What has become apparent is that not only was FieldStorageConfig::hasData() not returning the expected value but neither was SqlContentEntityStorage::countFieldData().
The patch to the countFieldData() does not break any other existing tests on that method. I have no idea what direction we will be going in with allowing revisions to be disabled for content. Is this a setting that will only be permitted on a content type level or will we be able to decide that a specific entity will not store revisions. If we ever move in that direction and we have entities that use a single field, some with only entries in the data tables and some with only entries in the revision table,s then this code would need to be refactored. But that's ok :)
I would like to resupply the D7 patch before it's committed because of a typo in an annotation. Shall I wait for the D8 patch to be committed first?
Comment #37
alexpottCommitted 03662aa and pushed to 8.0.x. Thanks!
Comment #39
nlisgo CreditAttribution: nlisgo commentedJust changes to the annotation of the patch supplied in #10.
Comment #40
BerdirNote that my testing in 8.x showed that we already create the revision tables only conditionally, and already implements it dynamically based on the revisionable flag of the host entity type. Would not even work otherwise. So it doesn't need any follow-up or further improvements.
Comment #41
Fabianx CreditAttribution: Fabianx commentedRTBC for D7 as well :) - Nice!
I think our only choice for D7 is to check both tables as done here.
Looks great to me!
Comment #42
David_Rothstein CreditAttribution: David_Rothstein commentedCommitted to 7.x - thanks!
The Drupal 7 tests might be improved a bit by having them actually use the API (or user interface) to get into a realistic situation in which the revision table has data but the field table doesn't, rather than inserting into the field tables directly? But it works fine as is, and that could certainly be a followup if desired.