If you create a field, delete it, and try to create another one with the same name, it will fail, because the original field is still in the table with the 'deleted' column set.

We can *probably* resolve this. Create another column for "original name" in field_config and remove the original name from the real column. Also, we'll have to rename the field data table so when a new field gets create the field data table name is available.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

yched’s picture

Hm - but what if that new field gets deleted as well before the GC is done ?

bjaspan’s picture

Right. I thought of that but forgot to mention it. At delete, we need to save original_name, and rename the table to a new *unique* name whose name we also store.

KarenS’s picture

And keep in mind that we don't want to create a new name for the deleted item that they might want to use for another field, it should be a nonsense name that would never actually be needed, maybe rename it to the timestamp value or something like that.

bjaspan’s picture

Priority: Normal » Critical
bjaspan’s picture

Assigned: Unassigned » bjaspan

I'm working on this. Unfortunately, like so many other things, it is not simple.

This may be the issue that forces the Field API to stop using field name as the implicit "primary key" of fields.

bjaspan’s picture

Status: Active » Needs review
FileSize
18.9 KB

This probably isn't done but is far enough along for a review.

The basic ideas:

* Field data tables are named based on their id, not just their name (good thing we have pkey ids, eh?). This eliminates all problems with data table names.
* When a field is deleted, fc.field_name is set to NULL but fc.original_name is preserved. I suspect this is a hack and in fact fc.field_name should just become non-unique, with the API-enforced rule that at most one row per field name w/ deleted=0 can exist. But we'll have to check/fix some other code for that to work.
* fci.field_name is no longer part of a unique index; fci.field_id is.
* field_read_fields() returns deleted fields keyed by field_name . field_id. This is also a hack, but I think there is no code that cares.

The tests prove it works. :-)

As a reminder, this issue is a total release blocker. D7 can't ship if field names can't be reused; we'll be roasted. Actually writing the bulk-delete cleanup code is much more optional.

bjaspan’s picture

It turned out not to be as bad as I thought in #5. When we pass a $field_name to the Field API, we can just interpret it to mean "the non-deleted field/instance with this field name, if one exists." Then only the functions that deal with deleted fields/instances need to be more complex.

The hack in field_read_fields() could be fixed by having its return array keyed by field id instead of field name, or just having it be numerically indexed like field_read_instances().

bjaspan’s picture

New patch. I changed fc.field_name not to have a unique index on it, so the fc.original_name column is no longer needed. We enforce the uniqueness of non-deleted field names in field_create_field() so we do not need the database to do it. I think the only function that needs to be fixed for this change is field_read_fields(), and I changed it so that if it is possibly returning deleted fields it keys the return array by field id, otherwise it keys the return array by field name as it previously did.

yched’s picture

Looks right to me.

- About field ids in the name of field data table names : do you think it would work if we did that only for deleted fields ? Usual name remains 'field_data_[field_name]', without any numeric id that jumps at you in PHPMyAdmin or when debugging a Views query, and the name remains constant across site instances (dev, staging, prod...), no matter what field create/delete ops actually happened on each instance. On field delete, we rename the table to 'field_data_deleted_[field_name]_[field_id]' or something (maybe we don't even actually need the field name in there...). Numeric ids in the names of those tables are harmless.

- Proposed solution for the keys of the field_read_fields() result does the trick. As you point above, having the function just return a numerically indexed array of $field structs, like field_read_instances() does, is acceptable as well IMO.

Other remarks, from 'detail' to plain 'painful nitpicking':
- Double spacing - die-hard double-click habit on that space bar, hm ? :-)
- typo : "to control which one will you get"
- do we need the explanation in field_read_field / field_read_instance PHPdoc, actually ? Can't we just say "This function will not return deleted fields, use field_read_fields() instead for this purpose.". I feel the explanation actually makes it unclear whether it will return unpredictable results.
- $skip_deleted: the function param is 'positive' (include deleted), so internally using a 'negative' boolean reverses the logic.
- + // used by field_delete_field() among others. We should fix the comments just below, with function names without trailing ().
- there seems to be a +[spaces] line in hunk @@ -239,10 +240,10 ?

(+ that ? modules/field/modules/combo line at the top of your patches will get you into troubles ;-) )

yched’s picture

Another minor point :
- should we change _field_sql_storage_schema() so that the table 'description' for a deleted field is something else than ''Data storage for field [field_name]' ?

bjaspan’s picture

I started off with a plan to rename field_data_fieldname to field_data_deleted_fieldname_id. I changed to the current approach for these reasons:

* I don't know *for sure* that renaming a table is a nearly zero-cost operation on all database systems and I do not want to add an unnecessary slow step to deleting a field.
* I seemed unnecessarily complex to have the schema change by having a table renamed. Granted we should be sure that the schema is updated and its cache cleared exactly when it should be, but why worry about it? It just seems straightforward to have the table name be consistent for its whole lifetime.

I do get that this means table names will not be consistent across staged sites... but then people should not be making assumptions about SQL table names, anyway. These are not absolute reasons, though, and we could switch to table renaming.

I made all the other changes you suggested.

Status: Needs review » Needs work

The last submitted patch failed testing.

bjaspan’s picture

Status: Needs work » Needs review
FileSize
21.41 KB

Rerolled.

yched’s picture

According to DamZ on IRC, table rename should be a snap on the db backends we currently support. I have no factual evidence, though. And also there's no guarantee that it will be the case in db backends we might support in the future.

bjaspan’s picture

I'm not sure I care. What is the advantage of field_data_fieldname vs. field_data_fieldname_id (except I guess a shorter table name)? No one should be hard-coding table names in any case and it makes it very explicit what field the table is for.

yched’s picture

Status: Needs review » Reviewed & tested by the community

It's rather that we're introducing a so far unknown pattern of 'table name containing a serial number'. Well, why not, there's a first time for everything.

Worst actual use case I can think of is : you cannot manually take a (Views or other)-generated query from dev to staging, for instance to debug or profile it. No real biggie I guess. OTOH, deployment with drupal is already a complex area, so if there's a way we can not add another gotcha...

Bumping to RTBC to move it forward.

bjaspan’s picture

Bump. This issue is getting stale. FYI, whichever of this or #415044: Indexes for field storage. gets committed first, the other is going to need a re-roll.

Angie, grab me on IRC if you'd like a walk-through of this.

Dries’s picture

Status: Reviewed & tested by the community » Needs work

I committed #415044: Indexes for field storage. so this will need a re-roll.

bjaspan’s picture

Status: Needs work » Reviewed & tested by the community
FileSize
21.86 KB

Rerolled.

Status: Reviewed & tested by the community » Needs work

The last submitted patch failed testing.

bjaspan’s picture

Status: Needs work » Reviewed & tested by the community
FileSize
21.86 KB

Reroll.

Dries’s picture

Status: Reviewed & tested by the community » Fixed

Looks good. Committed to CVS HEAD. Thanks!

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.