Problem/Motivation

The dynamically generated schemas for entity and field storage are currently locked behind protected methods, with no supported way to access them. One reason to access those schemas is illustrated in #2529554: Upgrade path for 1923406 (in the Head 2 Head module). The current workaround is to hack core to make the methods public. If they remain protected, anybody needing to write similar updates in the future will have to resort to the same acrobatics.

Proposed resolution

Make the following methods public (a usage sketch follows the list):

  • SqlContentEntityStorage::getEntitySchema()
  • SqlContentEntityStorage::getDedicatedTableSchema()
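
For illustration, here is a hedged sketch (not the actual patch) of how an update function could use these methods once public. The entity type ('node'), the field name ('body'), and the function name are placeholders, and the method signatures are assumed to match the protected ones in core:

<?php
// Hedged sketch only: assumes the two methods above have been made public
// with their existing signatures. Entity type, field, and function names
// are placeholders.
function mymodule_update_8001() {
  $entity_type_manager = \Drupal::entityTypeManager();
  /** @var \Drupal\Core\Entity\Sql\SqlContentEntityStorage $storage */
  $storage = $entity_type_manager->getStorage('node');
  $entity_type = $entity_type_manager->getDefinition('node');

  // Full generated schema for the entity type's shared tables.
  $entity_schema = $storage->getEntitySchema($entity_type);

  // Generated schema for one field's dedicated table, assuming 'body' is a
  // field using dedicated table storage on this site.
  $definitions = \Drupal::service('entity_field.manager')
    ->getFieldStorageDefinitions('node');
  $table_schema = $storage->getDedicatedTableSchema($definitions['body']);
}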

Remaining tasks

User interface changes

API changes

Yes: two protected methods on SqlContentEntityStorage become public. There are no interface changes.

Data model changes


Comments

jhedstrom’s picture

Issue summary: View changes
Status: Active » Needs review
FileSize
4.87 KB

Status: Needs review » Needs work

The last submitted patch, 1: entity-schema-visibility-2533282-01.patch, failed testing.

jhedstrom’s picture

Status: Needs work » Needs review
FileSize
801 bytes
5.65 KB

This should address the failing tests.

Berdir’s picture

Assigned: Unassigned » plach

Making them public for update functions makes sense I think.

Assigning to @plach for his opinion.

Do we need some docblock updates to clarify when they should be used and when not?

Update functions are DB-specific anyway; there's nothing we can do about that. However, we should probably make sure that we don't fail hard if the storage class in use isn't the one we expect. Then e.g. MongoDB can provide its own update functions.
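
A minimal sketch of that guard, assuming a hypothetical mymodule_update_8002() in an .install file:

<?php
// Hedged sketch of the guard described above: bail out gracefully when the
// entity type is not backed by the SQL storage class, so that e.g. a
// MongoDB backend can ship its own update functions instead.
use Drupal\Core\Entity\Sql\SqlContentEntityStorage;

function mymodule_update_8002() {
  $storage = \Drupal::entityTypeManager()->getStorage('node');
  if (!$storage instanceof SqlContentEntityStorage) {
    // Another storage backend is in charge; nothing to do here.
    return;
  }
  // SQL-specific schema work can safely happen below this point.
}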

plach’s picture

Assigned: plach » Unassigned

Not really a fan of this: the idea is that the storage handler is the only one manipulating the schema. Making the handler return it would allow anyone to do things we do not want to happen, like manually manipulating the schema.

Can you foresee any legitimate usage aside from Head 2 Head?

Berdir’s picture

Well, it's not just head2head but also possibly every other update function that is going to be written.

That said, thinking about it, this might actually be a very bad idea.

It's the classic flaw in update functions: you must not rely on the current state of Drupal, since you might be an update function from beta 7 to 8, but the code/schema you are changing might have changed again in beta 11 to 12, and the site is doing a direct update to beta 12.

Given that, we should possibly do the same as we've always done in update functions: hardcode the target schema changes.

And yes, that is a big conceptual problem with entity updates: they don't know about versions or dependencies or anything. They just update from old state to new state.
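
In concrete terms, hardcoding might look like this hedged sketch (table, column, and spec are all made up; the point is that the target spec is a literal snapshot, not derived from current code):

<?php
// Hedged sketch of "hardcode the target schema changes": the spec below is
// written out literally as the intended end state rather than computed from
// the (possibly newer) entity schema. All names are illustrative.
function mymodule_update_8003() {
  $spec = [
    'type' => 'varchar',
    'length' => 255,
    'not null' => TRUE,
    'default' => '',
  ];
  \Drupal::database()->schema()
    ->changeField('mymodule_data', 'label', 'label', $spec);
}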

plach’s picture

Well, it's not just head2head but also possibly every other update function that is going to be written.

Honestly I was hoping the Entity update system + Migrate would be enough, but I guess that's because I'm not living in the real world yet :)

Given that, we should possibly do the same as we've always done in update functions: hardcode the target schema changes.

Yep, I'd say this is the correct approach exactly for the reason you stated.

And yes, that is a big conceptual problem with entity updates: they don't know about versions or dependencies or anything. They just update from old state to new state.

Initially we were talking about introducing versioning to definitions added to state, but that was deemed unnecessarily complex. Maybe we should think about that in the future...

Berdir’s picture

Yeah. You might be interested in #2532356: Provide a wrapper to core entity.definition_update_manager to temporarily disable updates as well, which is how head2head is trying to work around the problem that nothing can run before automated entity updates.

As I wrote there, I'm wondering if we could improve it a lot simply by doing the automated entity updates after manually defined update functions. Then they can do whatever they want to fix the things that automated updates can't do (yet) like renaming fields, changing schema and so on.

plach’s picture

As I wrote there, I'm wondering if we could improve it a lot simply by doing the automated entity updates after manually defined update functions. Then they can do whatever they want to fix the things that automated updates can't do (yet) like renaming fields, changing schema and so on.

That's tricky; besides, the current order has legitimate use cases: for instance, you may want to pre-fill a base field column after creating it. I'm wondering whether there's any way to integrate with the update dependency system so that updates can choose whether to run before or after entity updates.
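
For example (a hedged sketch of the pre-fill case; table and column names are hypothetical):

<?php
// Hedged illustration of the pre-fill use case: an update function running
// after automated entity updates back-fills a base field column that the
// entity update just created. Table and column names are hypothetical.
function mymodule_update_8004() {
  \Drupal::database()->update('mymodule_entity')
    ->fields(['new_base_field' => 'initial value'])
    ->isNull('new_base_field')
    ->execute();
}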

plach’s picture

Status: Needs review » Postponed

jhedstrom’s picture

Status: Postponed » Needs review

#2535082: Allow hook_update_N() implementations to run before the automated entity updates is in now, so un-postponing for further discussion.

Being able to access schema definitions would have drastically simplified both #2507201: Upgrade path for MySQL switch to multibyte UTF8 (which is still a work in progress) and #2529554: Upgrade path for 1923406 (which is committed, but since it focuses on hand-picked columns, I don't have confidence that the fix is complete).

Berdir’s picture

It would have, but the problem is that those functions will return the schema for the final state.

Let's assume that Node adds another base field, renames one, or makes some similar change (Node won't do that, but contrib definitely will).

getSchema() will then already contain that field, but the table you're attempting to update might not yet. This is the same problem as with hook_schema(): an update function must never rely on it.

I think my idea in #2535082: Allow hook_update_N() implementations to run before the automated entity updates about having callbacks that are invoked if an automated schema change is not possible would allow for a more reliable implementation. You'd then simply get the old field schema/definition and the new one and can implement the code to go from old to new. head2head shows that maybe this needs to be a global event and not just a callback on entities/field types, but still I think that could work.
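
No such callback exists in core; purely as a sketch of the shape described here, it might look like this hypothetical hook:

<?php
// Purely hypothetical: this hook does not exist. It only sketches the idea
// of a callback invoked when an automated schema change cannot be applied,
// receiving the old and new stored field definitions.
use Drupal\Core\Field\FieldStorageDefinitionInterface;

function hook_entity_schema_update_unsupported(FieldStorageDefinitionInterface $old_definition, FieldStorageDefinitionInterface $new_definition) {
  // Module-specific logic to migrate data and schema from $old_definition
  // to $new_definition.
}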

jhedstrom’s picture

@berdir, in the callback idea, would this account for schema changes to tables/fields that have data? Or would the callbacks be generic and then fire an event that modules could listen for?

I've changed the approach for #2535082: Allow hook_update_N() implementations to run before the automated entity updates over in #2540990: Need to update stored schema data, since without that, core thinks the old schemas are still installed and repeatedly tries to run failing schema updates (they fail due to existing data). The approach there could be easier :) Given that approach, I'm not sure how useful this change would be.

jhedstrom’s picture

Status: Needs review » Closed (duplicate)