[meta] Make the entity storage system handle changes in the entity and field schema definitions [#1498720]

Comment	File	Size	Author
#169	et-entity_schema_handling-1498720-169-review-do-not-test.patch	232.57 KB	effulgentsia
#164	et-entity_schema_handling-1498720-164.patch	319.9 KB	plach
#164	et-entity_schema_handling-1498720-164.interdiff.txt	9.76 KB	plach
#149	et-entity_schema_handling-1498720-149.patch	256.46 KB	plach
#149	et-entity_schema_handling-1498720-149.interdiff.txt	3.63 KB	plach
#149	et-entity_schema_handling-1498720-149.interdiff-142_145.txt	16.9 KB	plach
#145	Drupal_module_updates___Drupal_8_x_-_DEV.png	188.43 KB	plach
#145	Drupal_entity_schema_updates___Drupal_8_x_-_DEV.png	224.87 KB	plach
#145	Status_report___Drupal_8_x_-_DEV.png	90.94 KB	plach
#145	et-entity_schema_handling-1498720-145-part2.txt	46.29 KB	plach
#145	et-entity_schema_handling-1498720-145-part1.txt	255.37 KB	plach
#142	d8_storage.patch	248.01 KB	fago
#142	d8_storage.interdiff.txt	10.23 KB	fago
#113	et-entity_schema_handling-1498720-113.patch	248.26 KB	plach
#107	1498720.107.patch	141.33 KB	alexpott
#107	103-107-interdiff.txt	4.08 KB	alexpott
#103	1498720.103.patch	139.84 KB	alexpott
#103	101-103-interdiff.txt	13.82 KB	alexpott
#101	et-entity_schema_handling-1498720-101.patch	136.81 KB	plach
#100	et-entity_schema_handling-1498720-100.patch	197.83 KB	plach
#100	et-entity_schema_handling-1498720-100.interdiff.txt	884 bytes	plach
#98	et-entity_schema_handling-1498720-98.patch	197.6 KB	plach
#98	et-entity_schema_handling-1498720-98.interdiff.txt	1.48 KB	plach
#96	et-entity_schema_handling-1498720-96.interdiff.txt	19.84 KB	plach
#96	et-entity_schema_handling-1498720-96.patch	196.93 KB	plach
#96	et-entity_schema_handling-1498720-96-review.do_not_test.patch	134.13 KB	plach
#94	et-entity_schema_handling-1498720-94.patch	186.04 KB	plach
#94	et-entity_schema_handling-1498720-94.interdiff.txt	3.1 KB	plach
#92	et-entity_schema_handling-1498720-92.patch	186.33 KB	plach
#92	et-entity_schema_handling-1498720-92.interdiff.txt	16.21 KB	plach
#90	et-entity_schema_handling-1498720-90.interdiff.txt	3.91 KB	plach
#90	et-entity_schema_handling-1498720-90.patch	179.24 KB	plach
#88	et-entity_schema_handling-1498720-88.patch	178.37 KB	plach
#6	entity-schema-1498720-6-do-not-test.patch	18.49 KB	das-peter
#4	entity-schema-1498720-4-do-not-test.patch	13.41 KB	das-peter
#151	et-entity_schema_handling-1498720-151-part2.txt	46.55 KB	plach
#151	et-entity_schema_handling-1498720-151.patch	299.37 KB	plach

Comment #1

plach commented 24 March 2012 at 21:18

Status:	Active	» Postponed

We need at least one translation schema created to go on with this.

Comment #2

plach commented 5 November 2012 at 03:34

Title:	Introduce automatic translation schema creation for the default SQL controller	» Make the default SQL storage controller automatically generate tables for every defined entity type

Comment #3

catch commented 11 January 2013 at 12:03

Status:	Postponed	» Active

Un-postponing.

Comment #3.0

plach commented 11 January 2013 at 14:19

Issue summary:	View changes

Updated issue summary.

Comment #3.1

plach commented 11 January 2013 at 14:20

Issue summary:	View changes

Updated issue summary.

Comment #3.2

plach commented 11 January 2013 at 14:20

Issue summary:	View changes

Updated issue summary.

Comment #4

das-peter commented 10 March 2013 at 12:24

Status	File	Size
new	entity-schema-1498720-4-do-not-test.patch	13.41 KB

Here's a first POC of how I could imagine this could be done.

Comment #5

das-peter commented 10 March 2013 at 14:34

Summary of discussion in IRC with fago:

Use the generated schema in hook_schema().
Replace the schema_settings with information from constraints and new data types where necessary.

Comment #6

das-peter commented 24 March 2013 at 14:46

Status	File	Size
new	entity-schema-1498720-6-do-not-test.patch	18.49 KB

Here's a next attempt.

Use the generated schema in hook_schema().

This is quite hard because at the point when hook_schema() is invoked the entity information isn't yet updated. Thus we can't access the storage handler.
I don't know if we can or should change the installation order.

Replace the schema_settings with information from constraints and new data types where necessary.

Looks doable, however there are some pain-points:

Where do default values come from? There's no such constraint, and it makes no sense to have one.
How can the default constraints of a field type be changed / extended? Currently I use the same approach as before for schema_settings. The downside is that this has no effect on validation.

Comment #7

damien tournoud commented 24 March 2013 at 15:45

Indexes are going to be a whole world of pain.

This is quite hard because at the point when hook_schema() is invoked the entity information isn't yet updated. Thus we can't access the storage handler.

Isn't this basically just #1416558: hook_entity_info(), hook_schema(), and the field system are strongly bound to each other? One way of solving this as part of this patch is to remove the whole 'schema_fields_sql' thingie. We don't need that anymore if we enforce that the schema is generated directly from the field definitions.

Comment #8

yched commented 24 March 2013 at 16:24

Just FYI :

Yeah, exposing dynamic schemas for field_sql_storage tables in its hook_schema(), based on field definitions stored in ConfigEntities, has been an absolute can of worms in #1735118: Convert Field API to CMI.
The current patch does it by reading directly from the underlying config() files, skipping entity_load() - entity_load() within hook_schema() is a road to hell currently.

But : there have been discussions as to whether field_sql_storage really needs to expose its field data tables to hook_schema() to begin with. That's a *lot* of information that isn't used anywhere in practice.
The people I asked so far (@chx, @fago, @effulgentsia) were all of the opinion that we probably shouldn't bother.

The pros & cons might be different when it comes to entity base tables, dunno for sure, but the two (base table, field data tables) should probably be treated the same ?

Comment #9

damien tournoud commented 24 March 2013 at 19:20

This information has actually been very important in the past (in Drupal 7, for example, ctools is able to create relationship context automatically from the foreign keys). If tomorrow the SQL schema doesn't bring more information than the TypedData schema, of course we don't necessarily need to bother.

Comment #10

berdir commented 25 March 2013 at 11:39

Issue tags:	+Entity Field API

In regards to default values, see #1777956: Provide a way to define default values for entity fields, but I'm not sure if we actually want to define that on the schema because the defaults could be dynamic I guess and you're not supposed to create database records yourself anyway.

Comment #11

damien tournoud commented 25 March 2013 at 18:55

Yes, I think it's fine not to define defaults on the SQL layer.

Comment #12

yched commented 29 March 2013 at 18:23

This information has actually been very important in the past (in Drupal 7, for example, ctools is able to create relationship context automatically from the foreign keys)

Right - though double checking, ctools does use drupal_schema() to get the foreign keys on entity types base tables, but reads foreign keys on field data tables through hook_field_schema(), not drupal_schema() / hook_schema().

Asked for feedback on "do we want to expose schemas of dynamic field data tables in hook_schema() ?" in #1735118-181: Convert Field API to CMI.

Comment #12.0

yched commented 29 March 2013 at 18:23

Issue summary:	View changes

Updated issue summary.

Comment #13

plach commented 1 September 2013 at 11:08

Title:	Make the default SQL storage controller automatically generate tables for every defined entity type	» Make DatabaseStorageController automatically generate tables for every defined entity type

Comment #13.0

plach commented 1 September 2013 at 11:08

Issue summary:	View changes

Update issue summary

Comment #14

yesct commented 8 October 2013 at 14:19

Issue tags:	+d8dx

while discussing #2057401: Make the node entity database schema sensible in irc, was pointed here while I was figuring out how the baseFieldDefinitions() in Node related to node_schema in node.install and the DatabaseStorageController and the EntityManager. (tagging d8dx since this might help with that confusion, maybe another tag would be better)

updated the issue summary.

Comment #14.0

yesct commented 8 October 2013 at 14:19

Issue summary:	View changes

summarized, and used template.

Comment #14.1

yesct commented 8 October 2013 at 14:20

Issue summary:	View changes

oops. space.

Comment #15

amateescu commented 26 November 2013 at 14:01

Issue summary:	View changes
Status:	Active	» Postponed

#2144327: Make all field types provide a schema() is a prerequisite for this.

Comment #16

effulgentsia commented 4 December 2013 at 01:31

What I like about this issue is that if you, say, swap in a MongoDb controller, then you don't need pointless tables in the MySQL db. However:

Moreover any entity can natively have multilingual properties and fields.

If I'm reading that right, then that means a different MySQL controller (say, one that wants to make menu link titles multilingual) would generate a different table. So, then, what happens when a table is made with one controller, and then a different controller is swapped in? Is the new controller responsible for altering the tables from any arbitrary schema into the one that it needs?

Comment #17

amateescu commented 4 December 2013 at 01:45

My initial reaction would be to say that we can't (or shouldn't?) support swapping controllers if the current tables are not empty..

Comment #18

plach commented 4 December 2013 at 02:06

Yep, a contrib module switching-in a different storage controller might try to write a migration to move data around, but I think that in the core context we should assume an empty storage. I was planning to write all the code here in the assumption there is no entity data yet. We can try to write a migration to switch between the table layouts supported by core but that would totally be a follow-up issue (and might also be contrib material).

Comment #19

plach commented 4 December 2013 at 18:03

Priority:	Normal	» Major
Related issues:		+#2144327: Make all field types provide a schema()

This is at least major as (together with #2144263: Decouple entity field storage from configurable fields) it will allow us to implement the unified storage discussed in Prague.

Comment #20

gábor hojtsy commented 5 December 2013 at 12:43

Priority:	Major	» Critical

Critical as per #2047633-116: Move definition of menu links to hook_menu_link_defaults(), decouple key name from path, and make 'parent' explicit and following comments.

Comment #21

effulgentsia commented 28 December 2013 at 08:29

Issue tags:	+beta blocker

Given that catch approved this issue being critical in the issue linked in #20, and that this is clearly data model/schema related, I'm tagging it "beta blocker" despite not having explicit approval to do so. Someone can correct me if they disagree.

Comment #22

plach commented 3 January 2014 at 22:43

Comment #23

chx commented 8 January 2014 at 10:06

> since all the data retrieval is handled by the controller itself. This may allow us to kill hook_entity_load() and hook_ENTITY_TYPE_load() altogether.

Erm. Nope? What if two modules want to add data? Both can't replace the controller.

Comment #24

plach commented 8 January 2014 at 23:25

Well, the original idea was that everything would be a field, so any addition would be loaded by the storage controller, which would allow for fields to specify a custom storage engine if needed (see the "Unify Field Storage" section of the related Prague Notes). Computed data would just be computed fields. That said, I am no longer sure this would extinguish the need for hook_entity_load() (or an equivalent event) so removing it in this stage of D8 development would probably be a bad idea.

Comment #25

plach commented 9 January 2014 at 11:50

To clarify: we were not proposing to support per-field storage again, just the ability to specify a different table layout than the default one.

Comment #26

sylvain lecoy commented 24 January 2014 at 23:35

Doctrine #1817778: Consider using Doctrine ORM for Entities has this feature already, plus it supports schema version (e.g. adding or removing fields, changing the nature of the relationship, etc.).

Comment #27

plach commented 25 January 2014 at 01:47

That's a D9 issue, is it?

Comment #28

sylvain lecoy commented 25 January 2014 at 19:23

Yes it is, for the record it was originally against drupal8, then postponed.

As I see this issue postponed too now, it might be interesting to not re-invent the wheel if Doctrine (like the Symfony2 folks use) is chosen.

I am pretty sure Drupal will use an ORM in the end and provide tools to integrate these ORMs, just like it has become the Java standard with Spring/Hibernate.

Just look at what the Symfony folks did; it will become the common standard in PHP in a few years.

Comment #29

sylvain lecoy commented 25 January 2014 at 18:10

Then everybody will be amazed by ORM and DataModel visualisation and modeling through graphical IDEs and @nnotations, but this is a feature we have had for 10 years in Java.

We'll also be able to remove the need for 'EntityInterface' and every developer will thank us for this, with the ORM taking responsibility for creating tables, but also for maintaining them, by deleting, updating and creating relationships (through foreign keys or not) as the conceptual data model evolves in the code through annotations.

We were not ready for this when I argued in that issue that it would be our future, but one day we'll have no choice. We don't want to rewrite code that has been tested for years, performs well, and is widely accepted as a common pattern in other technologies such as persistent web servers like J2EE.

Comment #30

plach commented 25 January 2014 at 23:39

I agree this option should be seriously taken into consideration in the future, but I fail to see how we could make such a big change in this phase of the D8 life-cycle, honestly.

Comment #31

sylvain lecoy commented 26 January 2014 at 01:05

Sorry for my intrusion in the conversation, just wanted to tell you I am really excited about this field, and wanted to point you to some work in progress for drupal 9. I have also a working copy (generates entity on drupal 7 from Schema API, not yet Field API) here: https://drupal.org/project/doctrine.

Please continue with your original idea; for now it's almost impossible to introduce such a big change. That's what I was told by people in 2012 for D8, so I'm not surprised to hear the same thing now in 2014 :D You've got a point.

Comment #32

plach commented 26 January 2014 at 01:39

Well, it's great that you are already exploring this idea: by the time it can be seriously considered for core inclusion, we will have some real-life data to evaluate :)

Comment #33

andypost commented 26 January 2014 at 20:24

Status:	Postponed	» Active

There are no issues blocking this.

Comment #34

plach commented 26 January 2014 at 21:50

Actually #2068325: [META] Convert entity SQL queries to the Entity Query API is more or less a prerequisite. However I am planning to start working on this in a few days.

Comment #35

plach commented 26 January 2014 at 23:19

2 files were hidden/shown/deleted

Status	File	Size
hidden	entity-schema-1498720-4-do-not-test.patch	13.41 KB
hidden	entity-schema-1498720-6-do-not-test.patch	18.49 KB

Comment #36

plach commented 26 January 2014 at 23:20

Comment #37

plach commented 26 January 2014 at 23:24

Comment #38

tstoeckler commented 28 January 2014 at 04:18

Re #34: Hey there, I recently started some initial work which I just posted in #2183231: Make ContentEntityDatabaseStorage generate static database schemas for content entities. Maybe we can coordinate efforts.

Comment #39

plach commented 28 January 2014 at 10:15

It looks like a duplicate to me :(

Comment #40

tstoeckler commented 28 January 2014 at 11:43

We can certainly continue here, but the patch over there is not really in a reviewable state. I posted it only because I wanted to *avoid* any duplication if you wanted to get started on this soon per #34. The only patch in this issue was very much outdated so I don't think I duplicated much work?!

Comment #41

plach commented 28 January 2014 at 11:51

No problem for the patch, but I think keeping these two issues open would be misleading. Also, it would be good to discuss the approach before diving into coding: there are some non-trivial aspects to figure out.

Btw, did you have a look at the Prague notes ("Unify Field storage" section)?

Comment #42

andypost commented 6 February 2014 at 19:20

This needs to unify field and data type plugins in #2150511: [meta] Deduplicate the set of available field types

For example node.title defines itself as 'text' but has no 'format' associated.

Comment #43

sun commented 12 February 2014 at 19:33

Parent issue:	» #2194785: [meta] Stop relying on database schema info at runtime

Comment #44

tstoeckler commented 4 March 2014 at 13:06

Posted a new, now working patch in #2183231: Make ContentEntityDatabaseStorage generate static database schemas for content entities. I kept this as a separate issue for now, as in theory we could handle the actual schema generation there, but still install the schema manually. This has turned out to be a pretty massive issue on its own. Then once that is in place, at least partly, we can figure out the dynamic creation/deletion here. Do you think that makes sense @plach?

Comment #45

tstoeckler commented 4 March 2014 at 14:22

Title:	Make DatabaseStorageController automatically generate tables for every defined entity type	» Make FieldableDatabaseStorageController automatically generate tables for every defined entity type

Also, it's now called FieldableDatabaseStorageController.

DatabaseStorageController is the legacy one that's only being used by menu link now.

Comment #46

andypost commented 4 March 2014 at 14:38

Comment #47

plach commented 4 March 2014 at 21:13

I've been sketching an implementation plan in the last couple of days. I should be able to post it soon for review...

Comment #48

plach commented 7 March 2014 at 20:49

Aside from the aspects already laid out in Prague, below is a description of how I'd implement this.

Goals

Supporting changes to the following entity type properties: revisionable, translatable
Supporting schema changes to store additional custom fields
Supporting schema changes to drop storage for removed custom fields
Avoiding the need of loading the full schema at runtime
Providing a standardized lightweight field -> table.column mapping for modules needing to perform raw SQL queries (-> Views)
Supporting schema versions and tracking changes
Supporting schema regeneration without a UI (ideally just a cache clear)
Supporting UI-driven schema regeneration (see next)
Supporting data migration between the previous schema version and the current one

Implementation overview

In the current proposal there is a new entity controller/handler (an instance of EntitySchemaHandlerInterface, actual names TBD), that is responsible for performing all the schema-related operations. The schema handler:

(re)builds the entity schema definition based on entity type/field definitions
stores the schema definition
tracks changes and schema versions
(re)installs the schema
detects and reports conflicts (typical example: a change in the definition that cannot be performed when data is already stored)
handles field purging

Additionally a service (implementing EntitySchemaBuilderInterface) is used to trigger the actual actions performed by the schema handler. The core implementation is owned by the Entity module, which implements hook_rebuild() to react the proper way when entity and field info are regenerated. This is the main (only?) core way to rebuild/reinstall the entity schema definition. Contrib modules could swap in a more advanced schema builder, that could provide an actual UI to trigger schema changes, show schema version diffs and start data migrations. The core schema builder refuses to install a new schema version if data is available and the required action cannot be performed. Instead it reports the inconsistency in Drupal's status report as an error.

Entity schema is initially generated and installed during the Drupal installation process. We need to check whether switching storage controller is possible during installation, so we do not need to always install the default SQL schema before swapping it out. This is definitely something we should try to avoid, as users are created during the installation process, so any alternative storage controller would need to provide a migration and do some magic to move users off their SQL table.

The entity schema definition is stored through the State service. The stored data includes:

the schema version, an integer incremented at each change;
the hash of the current schema definition, used to quickly determine whether there are schema changes
the handler class, which might be useful if the storage controller is switched on an existing installation that already has a schema change history
the schema definition data structure (the schema API array for core storage)
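
To make the stored record above concrete, here is a minimal sketch in Python (chosen only for illustration, this is not Drupal's PHP API). It assumes the State service behaves like a key/value store, modelled here as a plain dict, and the key name `entity.schema.<entity_type>` is hypothetical. The hash is used to decide cheaply whether the definition changed, and the version is incremented only when it did.

```python
import hashlib
import json


def schema_hash(definition):
    """Return a stable hash of a schema definition (a nested dict)."""
    # sort_keys makes the hash independent of dict insertion order.
    encoded = json.dumps(definition, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def record_schema(state, entity_type, handler_class, definition):
    """Store a new schema record only if the definition changed.

    `state` is a plain dict standing in for the State service (assumption).
    Returns True when a new version was recorded, False otherwise.
    """
    key = "entity.schema." + entity_type  # hypothetical key name
    previous = state.get(key)
    new_hash = schema_hash(definition)
    if previous is not None and previous["hash"] == new_hash:
        return False
    state[key] = {
        "version": 1 if previous is None else previous["version"] + 1,
        "hash": new_hash,
        "handler": handler_class,
        "definition": definition,
    }
    return True
```

Comparing hashes avoids a deep comparison of two full schema arrays on every rebuild; the full definition is still stored so that a diff between versions remains possible.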

Default SQL implementation details

The core SqlSchemaHandler (as well as the related storage controller) supports 4 table layouts, depending on whether the entity type supports revisions, translations, none or both. When building the schema definition it loops through the field definitions and assigns each field to a "base" table ({entity}, {entity_revision}, {entity_field_data}, {entity_field_revision}) or a field table. The actual column schema definitions are picked from the field schema. By default indexes are generated based on entity keys and field schema definitions, but every entity type can provide its own specialized version of the schema handler to optimize index definitions (or any other part of it). I think we should add a new revisionable property on field definitions: this would allow the schema handler to avoid storing unnecessary/unwanted data in the revision tables.

The storage controller applies the same logic to tell in which table to store each field, so it does not need to rely on the fully generated schema. This is not necessarily a big gain as field schema definitions might be needed to perform the actual queries. However one of the goals is trying to get rid of this requirement.
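
The "same logic" shared by the schema handler and the storage controller could be as small as one pure function. The sketch below is in Python for illustration only; which base tables belong to which of the four layouts is an assumption inferred from the table names listed above, not something the proposal spells out.

```python
def base_tables(entity_type, revisionable, translatable):
    """Return the base tables for one of the four table layouts.

    The table-per-layout choice is an assumption based on the names
    {entity}, {entity_revision}, {entity_field_data}, {entity_field_revision}.
    """
    tables = [entity_type]
    if revisionable:
        tables.append(entity_type + "_revision")
    if translatable:
        tables.append(entity_type + "_field_data")
    if revisionable and translatable:
        tables.append(entity_type + "_field_revision")
    return tables
```

Because the function depends only on the two entity type flags, the storage controller can call it at runtime without loading the full generated schema.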

Entity data handling

As I said above, the core schema handler supports performing only "data-safe" changes. This means that when data is available only the following schema changes are allowed:

Adding/dropping a field table
Adding/dropping a custom field column to/from a base/data/revision table
Switching from revisionable to non-revisionable
Switching from translatable to non-translatable

We probably need some kind of confirmation step before performing an operation that would imply data loss (dropping columns or tables). For instance this status might trigger a warning in the status report with a link to a confirmation form.

Any other change, mainly switching from a simpler table layout to a more complex one (e.g. from non-revisionable to revisionable) requires a data migration and as such is not supported. Any attempt to trigger such an action causes an exception to be thrown.
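
A minimal sketch of that rule for the two entity type flags, in Python for illustration only. The exception class and function names are hypothetical; the only behaviour taken from the text is that, once data exists, moving to a more complex layout is refused with an exception while moving to a simpler one is allowed.

```python
class UnsupportedSchemaChange(Exception):
    """Raised when a change would require a data migration."""


def check_layout_change(old, new, has_data):
    """Validate a change of the revisionable/translatable flags.

    `old` and `new` are dicts with boolean "revisionable" and
    "translatable" keys (a hypothetical shape for this sketch).
    """
    if not has_data:
        return  # Any layout change is fine on empty storage.
    for flag in ("revisionable", "translatable"):
        if new[flag] and not old[flag]:
            raise UnsupportedSchemaChange(
                "Switching to %s requires a data migration." % flag
            )
```

Dropping a flag passes the check here, but as noted above it implies data loss, so a real implementation would still need the confirmation step before applying it.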

Views support and (legal) raw SQL access

Views is the most obvious example of a module legally needing to perform SQL queries directly. In D8 this would be discouraged but Views offers a vast set of SQL-specific features that would be impossible to implement otherwise. The point is that Views' flexible architecture allows for pluggable query engines: there is nothing wrong in having a SQL-specific engine, as long as it is possible to provide other ones for alternative storages. However the fact that we support swappable entity storages implies that Views cannot rely on a fixed table layout (not even assuming the core entity storage), hence we need a way for it to generate entity (views) data automatically based on entity/field definitions (see #1740492: Implement a default entity views data handler). The ways this issue could help with that are:

Providing a lightweight field.property -> table.column mapping that Views could use to generate its data. This would be implemented by a method on the storage controller. This is an example of the hypothetical return values of SqlStorageInterface::getTableMapping($field_name) (see also #2079019: Make Views use SqlEntityStorageInterface):

'foo' -> array(
  'table' => 'my_entity',
  'revision_table' => 'my_entity_revision',
  'columns' => array('foo'),
)

'bar' -> array(
  'table' => 'my_entity_field_data',
  'revision_table' => 'my_entity_field_revision',
  'columns' => array('bar__column1', 'bar__column2'),
)

'field_baz' -> array(
  'table' => 'my_entity__field_baz',
  'revision_table' => 'my_entity_revision__field_baz',
  'columns' => array('baz_column1', 'baz_column2'),
)

Allowing the schema handler to return the full schema definition.

The data above + the entity field definitions should make it possible to implement a base entity views data controller.
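
For illustration, the mapping could be computed from very little input. The sketch below is Python rather than PHP, and everything beyond the table names is an assumption: the `storage` labels are invented for the sketch, shared-table columns follow the `foo` / `bar__column1` pattern from the example, and the `<field>_<column>` naming for dedicated field tables is a guess.

```python
def table_mapping(entity_type, field_name, columns, storage):
    """Map one field to its tables and column names.

    `storage` is "base", "data" or "dedicated" (labels invented here).
    """
    if storage == "dedicated":
        # Assumed naming for dedicated field tables: <field>_<column>.
        return {
            "table": f"{entity_type}__{field_name}",
            "revision_table": f"{entity_type}_revision__{field_name}",
            "columns": [f"{field_name}_{column}" for column in columns],
        }
    shared = {
        "base": (entity_type, f"{entity_type}_revision"),
        "data": (f"{entity_type}_field_data", f"{entity_type}_field_revision"),
    }
    table, revision_table = shared[storage]
    # Single-column fields reuse the field name; others get <field>__<column>.
    if len(columns) == 1:
        names = [field_name]
    else:
        names = [f"{field_name}__{column}" for column in columns]
    return {"table": table, "revision_table": revision_table, "columns": names}
```

For instance, `table_mapping("my_entity", "bar", ["column1", "column2"], "data")` yields the `my_entity_field_data` / `my_entity_field_revision` pair with the columns `bar__column1` and `bar__column2`.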

Comments welcome :)

Comment #49

andypost commented 7 March 2014 at 23:19

Also this code could be re-used to generate "table-per-entity type" tables for comment, comment statistics and history, which all have an integer entity_id even though fieldable entities could have a string ID. This is the approach suggested by @catch in #2081585-79: [META] Modernize the History module

Comment #50

plach commented 8 March 2014 at 02:56

If I understand correctly, those could be defined as custom fields. In that case, yes, they would get dedicated per-entity-type tables.

Comment #51

berdir commented 10 March 2014 at 09:30

The problem with that is also what @catch commented in #2205215: {comment} and {comment_entity_statistics} only support integer entity ids. We have no API to interact with single fields, only entities. So if we made them field storage, every change would need to save the whole entity.

The only thing we could do is to optimize internally that to only update field tables that have changed, but that's it.
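
As a rough sketch of that internal optimization (Python for illustration, not Drupal code): compare the loaded field values with the ones about to be saved and touch only the tables of fields that differ. The one-table-per-field naming `<entity_type>__<field>` is an assumption made for the sketch.

```python
def tables_to_write(entity_type, original, updated):
    """Return the dedicated field tables whose values changed.

    `original` and `updated` map field names to their values.
    """
    names = set(original) | set(updated)
    changed = sorted(n for n in names if original.get(n) != updated.get(n))
    # Assumed naming: one dedicated table per field, <entity_type>__<field>.
    return [f"{entity_type}__{name}" for name in changed]
```

The caller still saves the whole entity; only the set of tables written to shrinks.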

Comment #52

fago commented 22 March 2014 at 14:35

In the current proposal there is a new entity controller/handler (an instance of EntitySchemaHandlerInterface, actual names TBD), that is responsible for performing all the schema-related operations.

I'm not sure it makes sense to have a separate controller/handler for taking care of schema changes. How or whether you have to deal with schema changes is highly storage dependent, so I'd see this as the job of the entity storage. If it's about pluggability, I'd agree that it makes sense to separate that out - either in a service injected into the storage controller, or similarly using a composite pattern as I've already suggested while moving field storage to entity storage.

That said, shouldn't FieldableEntityStorageControllerInterface be enough of an interface for handling schema changes? It needs to be generalized to work with entity field interfaces (for which we need to create a FieldStorageInterface, as part of #2116363: Unified repository of field definitions (cache + API) or as separate issue) - see #2144263: Decouple entity field storage from configurable fields. As we can generate our schema based on the field definitions now, having the changed field definitions should be enough to allow any storage to perform any necessary change actions.

That said, imo the API should not be about schema changes, but about field definition changes.

For that, I think a good first step would be to get going with adding necessary information to our metadata, i.e. #2143069: Add getProvider() and hasCustomStorage to FieldDefinitionInterface and the revisionable flag, this should be the easy/quick wins. Then make sure we've FieldStorageInterface and generalize FieldableEntityStorageControllerInterface. Then, make sure stuff depending on the tables has a way to lookup tablenames (= add an extended interface for SQL storage engines: #2079019: Make Views use SqlEntityStorageInterface). Then finally we need to test and verify that stuff does not break.

One issue we need to solve is how we'd generate field definition changes for module-provided fields. Either we store previous definition versions and do it automatically, or we'd have to require developers to write suitable update functions (should be ok I think and is less auto-magic).
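
The first option (storing the previous definitions and deriving the changes automatically) boils down to a dictionary diff. A minimal Python sketch, for illustration only; the shape of a "definition" (any comparable value keyed by field name) is an assumption.

```python
def diff_definitions(previous, current):
    """Compare stored field definitions with the current ones.

    Both arguments map field names to a definition (any comparable value).
    """
    old_names, new_names = set(previous), set(current)
    return {
        "added": sorted(new_names - old_names),
        "removed": sorted(old_names - new_names),
        "changed": sorted(
            name for name in old_names & new_names
            if previous[name] != current[name]
        ),
    }
```

Each entry in the result would then be handed to the storage as a field definition change, which is the level the API is argued to work at above.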

Finally, one of the bigger tasks would be to make the schema generation for base tables the job of the storage controllers. A quick first step could be moving code and responsibility, while basing the base table schema on the field schema is probably more tricky - issue: #2183231: Make ContentEntityDatabaseStorage generate static database schemas for content entities.

That's how I'd see it, but let's discuss asap, nail down steps and get the ball rolling. Maybe we can discuss it in Szeged.

Comment #53

fago

German

Vienna

commented 22 March 2014 at 15:07

Comment #54

plach

he/him

Italian

Venezia

commented 23 March 2014 at 00:38

Given #52 I guess it's probably better to discuss this once again in Szeged before starting the actual work.

Just a quick answer: the main reason why I'd go for a separate class to handle entity schema is that it may involve quite an amount of code that we don't need to load on most requests. Obviously it would make little sense to swap the storage controller without swapping the schema handler too, so yes I see them strictly coupled in terms of actual usefulness. They would have two quite distinct responsibilities though, so having two classes would be correct IMHO.

Comment #55

Crell commented 23 March 2014 at 19:45

Just because schema changes live in a separate object doesn't mean they have to be part of a global subsystem that all storage drivers have to implement. This is another reason all of these objects should just be in the container, not coupled together via the annotation by class name.

It's completely legit for the entity storage handler for SQL to depend on 4 other objects, including the database service, in order to do what it needs to do. Meanwhile, the MongoDB handler can only depend on 2, neither of which map conceptually to any of the 4 sub-objects that the SQL handler uses. That's not only OK, that's a good architecture if that detail is hidden behind the storage handler's interface in ways that the calling code doesn't give a damn about.
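
A small sketch of that idea, with entirely hypothetical class names and closures standing in for real services: two storage handlers with different collaborators sit behind one interface, and the calling code cannot tell them apart.

```php
<?php

/**
 * Hypothetical minimal storage contract, the only thing callers see.
 */
interface SketchEntityStorageInterface {

  public function save(array $entity): string;

}

/**
 * SQL-flavoured handler with two collaborators (both hypothetical).
 */
class SketchSqlEntityStorage implements SketchEntityStorageInterface {

  private $connection;

  private $schemaBuilder;

  public function __construct(callable $connection, callable $schema_builder) {
    $this->connection = $connection;
    $this->schemaBuilder = $schema_builder;
  }

  public function save(array $entity): string {
    // The schema builder is a private detail of this handler.
    $table = ($this->schemaBuilder)($entity['type']);
    return ($this->connection)($table, $entity);
  }

}

/**
 * Document-store handler: one collaborator and no schema builder at all.
 */
class SketchDocumentEntityStorage implements SketchEntityStorageInterface {

  private $collection;

  public function __construct(callable $collection) {
    $this->collection = $collection;
  }

  public function save(array $entity): string {
    return ($this->collection)($entity);
  }

}

/**
 * Calling code depends on the interface only.
 */
function sketch_save_node(SketchEntityStorageInterface $storage): string {
  return $storage->save(['type' => 'node', 'title' => 'Hello']);
}

$sql = new SketchSqlEntityStorage(
  function (string $table, array $entity): string {
    return "INSERT INTO $table";
  },
  function (string $entity_type): string {
    return $entity_type . '_field_data';
  }
);
$doc = new SketchDocumentEntityStorage(
  function (array $entity): string {
    return 'insertOne into ' . $entity['type'];
  }
);
$sql_result = sketch_save_node($sql);
$doc_result = sketch_save_node($doc);
```

In a real container the closures would be injected services; the sketch only shows that the set of dependencies can differ per handler without leaking into the interface.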

Architect globally, implement locally. :-)

Comment #56

tstoeckler

he/him

German

Essen, Germany

commented 24 March 2014 at 08:02

I agree with everything that's been said so far. I also agree it makes sense to put this logic into a separate class and inject that as a service; I will refactor #2183231: Make ContentEntityDatabaseStorage generate static database schemas for content entities accordingly.

For now I will keep that issue in its limited scope, i.e. I will not account for any schema changes. I will also not create the schema only on demand but will hardcode the schema creation into ModuleHandlerInterface::install(). We will need to work out how we store the information which tables have been created with which state in a separate issue. I think that is a non-trivial problem on its own.

Something like a field-definition-update API that #52 talks about sounds pretty sweet. I was not aware of #2116363: Unified repository of field definitions (cache + API), I will read up on that.

Comment #57

plach

he/him

Italian

Venezia

commented 24 March 2014 at 15:12

Discussed this again with @fago, @tstoeckler and @berdir. We decided to split this into smaller tasks. We are currently working on:

#2183231: Make ContentEntityDatabaseStorage generate static database schemas for content entities
#2143069: Add getProvider() and hasCustomStorage to FieldDefinitionInterface

@Crell:

We agreed to introduce the schema builder as a service: my only concern was the inability to have a different class per entity-type, if needed. @fago and @berdir pointed out that we can simply inject a different service, so that works for me :)

Comment #58

plach

he/him

Italian

Venezia

commented 24 March 2014 at 15:12

Status:

Active

» Postponed

Comment #59

fago

German

Vienna

commented 24 March 2014 at 16:53

Another dependency: #2144631: Add a revisionable key to field definitions

Comment #60

Crell commented 25 March 2014 at 01:24

Make it a service, or make it a required service that every entity type or storage handler must define?

The former is what I was saying we should do. :-) The latter I'm saying is unnecessary, or if it's necessary then it's a sign of a deeper design flaw.

Comment #61

plach

he/him

Italian

Venezia

commented 25 March 2014 at 01:34

I'd say the former: every storage controller can define its own list of dependencies, and I don't think we are hardcoding the need for the schema builder anywhere in the public API...

Comment #62

dave reid

he/him

English

Nebraska USA

commented 25 March 2014 at 02:07

/me has no idea how this would affect or work for File Entity in contrib which "enhances" the file entity type with bundle information, which core doesn't even have the schema to hold.

Comment #63

plach

he/him

Italian

Venezia

commented 25 March 2014 at 10:22

@Dave Reid:

I think this should let you obtain the same result by just defining a new bundle field, which would then be automatically added to the schema. It may also let you add the other stuff that is currently defined in file_entity_schema(), if you define it as additional custom fields. Not sure whether that would actually make sense, I didn't study the code closely.

Comment #64

jessebeach commented 25 March 2014 at 12:29

Has this issue become a META for the following or is it really postponed on them?

#2183231: Make ContentEntityDatabaseStorage generate static database schemas for content entities
#2143069: Add getProvider() and hasCustomStorage to FieldDefinitionInterface
#2144631: Add a revisionable key to field definitions

Comment #65

plach

he/him

Italian

Venezia

commented 25 March 2014 at 15:38

For now we are planning to do some of the work described in #48 directly here.

Comment #66

jessebeach commented 26 March 2014 at 08:51

Issue summary:

View changes

Comment #67

fago

German

Vienna

commented 26 March 2014 at 09:24

Opened an issue for the field storage interface dependency: #2226197: Introduce FieldStorageDefinitionInterface in the Entity Field API.

Comment #68

plach

he/him

Italian

Venezia

commented 26 March 2014 at 09:26

Issue summary:

View changes

Comment #69

plach

he/him

Italian

Venezia

commented 26 March 2014 at 13:21

Issue summary:

View changes

Comment #70

plach

he/him

Italian

Venezia

commented 26 March 2014 at 13:23

Issue summary:

View changes

Comment #71

plach

he/him

Italian

Venezia

commented 27 March 2014 at 10:48

Comment #72

jessebeach commented 4 April 2014 at 14:23

Title:

Make FieldableDatabaseStorageController automatically generate tables for every defined entity type

» [PP-2] Make FieldableDatabaseStorageController automatically generate tables for every defined entity type

Comment #73

Anonymous (not verified) commented 5 April 2014 at 21:52

I just found this issue. OMG I will be so happy when this will be in core!!!!

Comment #74

plach

he/him

Italian

Venezia

commented 6 April 2014 at 15:36

Title:

[PP-2] Make FieldableDatabaseStorageController automatically generate tables for every defined entity type

» [PP-2] Make ContentEntityDatabaseStorage handle changes in the entity schema definition

The actual schema generation is addressed in #2183231: Make ContentEntityDatabaseStorage generate static database schemas for content entities. We will provide a unified API to track schema changes in #2144263: Decouple entity field storage from configurable fields. Here we will have to react to any change in the entity schema definition and handle it properly.

Comment #75

sylvain lecoy commented 9 April 2014 at 11:59

You might be interested in looking at the reverse procedure: e.g. building an entity object from Schema API: http://drupalcode.org/project/doctrine.git/blob/48b52566638b3e31e3e0ebff...

If you have any questions I'll be glad to help :)

Comment #76

dave reid

he/him

English

Nebraska USA

commented 1 June 2014 at 21:15

Issue tags:		+Contributed project blocker
Related issues:		+#2258347: Consider adding hook_entity_schema_alter()

Now that #2183231: Make ContentEntityDatabaseStorage generate static database schemas for content entities landed, I think this is now a major blocker for porting File entity to D8.

Comment #77

tstoeckler

he/him

German

Essen, Germany

commented 1 June 2014 at 22:19

Status:

Postponed

» Active

Comment #78

xjm

she/her

English

commented 3 June 2014 at 10:55

Thanks @davereid; that's important to know.

@tstoeckler: I think this issue was also still postponed on #2144263: Decouple entity field storage from configurable fields? Or is it possible to work on them at the same time?

Comment #79

tstoeckler

he/him

German

Essen, Germany

commented 3 June 2014 at 13:43

Title:	[PP-2] Make ContentEntityDatabaseStorage handle changes in the entity schema definition	» [PP-1] Make ContentEntityDatabaseStorage handle changes in the entity schema definition
Status:	Active	» Postponed

Hmm... I would have thought "Yes", but let's let @plach decide, as he has concrete plans for this issue, I think. Marking back to postponed for now.

Comment #80

effulgentsia commented 3 June 2014 at 18:53

Just chatted with @fago and @plach, and they say that while this issue won't be finishable/committable until after #2144263: Decouple entity field storage from configurable fields is done, it can still be started in parallel. Leaving postponed for now, until someone begins that work.

Comment #81

plach

he/him

Italian

Venezia

commented 4 June 2014 at 02:08

Assigned:

Unassigned

» plach

I will do it tomorrow

Comment #82

fago

German

Vienna

commented 4 June 2014 at 19:16

Issue summary:

View changes

Comment #83

fago

German

Vienna

commented 4 June 2014 at 19:17

updated the issue summary based on a call with plach and timplunkett

Comment #84

plach

he/him

Italian

Venezia

commented 4 June 2014 at 21:57

Issue summary:

View changes

Comment #85

plach

he/him

Italian

Venezia

commented 4 June 2014 at 21:58

Issue summary:

View changes

Comment #86

fago

German

Vienna

commented 7 June 2014 at 22:26

As discussed with yched, alexpott, mtift, plach and xjm (notes here) we'll have to move the field purging mechanism to the entity field API and keep deleted field definitions in state for being able to properly solve this. However, we can postpone that to a follow-up and just forbid uninstalling modules providing fields as long as the fields have data for now.
-> Thus, we'll have to forbid uninstalling modules providing fields as long as the fields have data as part of this issue.
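
As a rough illustration of that interim rule, here is a sketch of the check. The function name and the two input arrays are made up for the example; how core would actually look up field providers and stored values is not decided by this sketch.

```php
<?php

/**
 * Returns the reasons why a module cannot be uninstalled yet.
 *
 * Hypothetical helper. $field_providers maps field name => providing module,
 * $row_counts maps field name => number of stored values.
 */
function sketch_uninstall_blockers(string $module, array $field_providers, array $row_counts): array {
  $reasons = [];
  foreach ($field_providers as $field_name => $provider) {
    $count = $row_counts[$field_name] ?? 0;
    // A field only blocks the uninstall while it still holds data.
    if ($provider === $module && $count > 0) {
      $reasons[] = sprintf('Field %s still has %d stored value(s).', $field_name, $count);
    }
  }
  return $reasons;
}

$providers = ['file_type' => 'file_entity', 'body' => 'text'];
$counts = ['file_type' => 3, 'body' => 0];

$blocked = sketch_uninstall_blockers('file_entity', $providers, $counts);
$allowed = sketch_uninstall_blockers('text', $providers, $counts);
```

An empty result means the module may be uninstalled; once field purging lives in the entity field API (#2282119), this restriction could be relaxed.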

Comment #87

fago

German

Vienna

commented 7 June 2014 at 20:33

opened #2282119: Make the Entity Field API handle field purging

Comment #88

plach

he/him

Italian

Venezia

commented 28 June 2014 at 22:57

Status:

Postponed

» Needs review

Status	File	Size
new	et-entity_schema_handling-1498720-88.patch	178.37 KB

A first patch, not really ready for review yet, to see how many failures we have. This includes #2144263: Decouple entity field storage from configurable fields as it depends on it.

Comment #89

29 June 2014 at 00:00

Status:

Needs review

» Needs work

The last submitted patch, 88: et-entity_schema_handling-1498720-88.patch, failed testing.

Comment #90

plach

he/him

Italian

Venezia

commented 29 June 2014 at 00:03

Status:

Needs work

» Needs review

Status	File	Size
new	et-entity_schema_handling-1498720-90.patch	179.24 KB
new	et-entity_schema_handling-1498720-90.interdiff.txt	3.91 KB

Some fixes

Comment #91

29 June 2014 at 00:20

Status:

Needs review

» Needs work

The last submitted patch, 90: et-entity_schema_handling-1498720-90.patch, failed testing.

Comment #92

plach

he/him

Italian

Venezia

commented 29 June 2014 at 14:27

Status:

Needs work

» Needs review

Status	File	Size
new	et-entity_schema_handling-1498720-92.interdiff.txt	16.21 KB
new	et-entity_schema_handling-1498720-92.patch	186.33 KB

2 files were hidden/shown/deleted

Status	File	Size
hidden	et-entity_schema_handling-1498720-88.patch	178.37 KB
hidden	et-entity_schema_handling-1498720-90.patch	179.24 KB

This should fix quite a few test failures.

Comment #93

29 June 2014 at 15:28

Status:

Needs review

» Needs work

The last submitted patch, 92: et-entity_schema_handling-1498720-92.patch, failed testing.

Comment #94

plach

he/him

Italian

Venezia

commented 29 June 2014 at 16:02

Status:

Needs work

» Needs review

Status	File	Size
new	et-entity_schema_handling-1498720-94.interdiff.txt	3.1 KB
new	et-entity_schema_handling-1498720-94.patch	186.04 KB

1 file was hidden/shown/deleted

Status	File	Size
hidden	et-entity_schema_handling-1498720-92.patch	186.33 KB

More fixes

Comment #95

plach

he/him

Italian

Venezia

commented 29 June 2014 at 18:34

Status:

Needs review

» Needs work

Green, cool.

As I was saying, this is not ready for review yet, unless you just want to have a look at what the new API to handle field schema changes looks like. Still a lot to do, but we have a good foundation now.

Comment #96

plach

he/him

Italian

Venezia

commented 29 June 2014 at 23:13

Status:

Needs work

» Needs review

Status	File	Size
new	et-entity_schema_handling-1498720-96-review.do_not_test.patch	134.13 KB
new	et-entity_schema_handling-1498720-96.patch	196.93 KB
new	et-entity_schema_handling-1498720-96.interdiff.txt	19.84 KB

1 file was hidden/shown/deleted

Status	File	Size
hidden	et-entity_schema_handling-1498720-94.patch	186.04 KB

Ok, this should be more serious: it implements schema handling for shared table fields.

There is still room for a lot of clean-up and field data purging is not handled (it will be in #2282119: Make the Entity Field API handle field purging), but aside from additional test coverage this first step should not be too far from completion.

Next step is automatically detecting and applying changes. I will start working in a separate branch based on the current one, so we can decide whether to split that part off into a separate non-beta-blocking issue or whether it makes more sense to have them together.

The full patch is for the bot and includes #2144263: Decouple entity field storage from configurable fields, the one to review is, well, the .review one :)

Comment #97

29 June 2014 at 23:13

Status:

Needs review

» Needs work

The last submitted patch, 96: et-entity_schema_handling-1498720-96.patch, failed testing.

Comment #98

plach

he/him

Italian

Venezia

commented 30 June 2014 at 00:34

Status:

Needs work

» Needs review

Status	File	Size
new	et-entity_schema_handling-1498720-98.interdiff.txt	1.48 KB
new	et-entity_schema_handling-1498720-98.patch	197.6 KB

1 file was hidden/shown/deleted

Status	File	Size
hidden	et-entity_schema_handling-1498720-96.patch	196.93 KB

Oops

Comment #99

30 June 2014 at 01:16

Status:

Needs review

» Needs work

The last submitted patch, 98: et-entity_schema_handling-1498720-98.patch, failed testing.

Comment #100

plach

he/him

Italian

Venezia

commented 1 July 2014 at 00:52

Status:

Needs work

» Needs review

Status	File	Size
new	et-entity_schema_handling-1498720-100.interdiff.txt	884 bytes
new	et-entity_schema_handling-1498720-100.patch	197.83 KB

1 file was hidden/shown/deleted

Status	File	Size
hidden	et-entity_schema_handling-1498720-98.patch	197.6 KB

The last changes exposed this nice bug

Comment #101

plach

he/him

Italian

Venezia

commented 1 July 2014 at 22:53

Status	File	Size
new	et-entity_schema_handling-1498720-101.patch	136.81 KB

1 file was hidden/shown/deleted

Status	File	Size
hidden	et-entity_schema_handling-1498720-100.patch	197.83 KB

Rerolled after #2144263: Decouple entity field storage from configurable fields (yay!)

Comment #102

plach

he/him

Italian

Venezia

commented 1 July 2014 at 22:53

Title:

[PP-1] Make ContentEntityDatabaseStorage handle changes in the entity schema definition

» Make ContentEntityDatabaseStorage handle changes in the entity schema definition

And no longer postponed!

Comment #103

alexpott

he/they

English

🇪🇺🌍

commented 9 July 2014 at 15:34

Status	File	Size
new	101-103-interdiff.txt	13.82 KB
new	1498720.103.patch	139.84 KB

Read through the patch, just getting to grips with what is going on. The attached patch has some minor clean-up and shows that we lack test coverage around ContentEntityDatabaseStorage::doDeleteFieldItems(), ContentEntitySchemaHandler::getSharedTableFieldSchema() and ContentEntitySchemaHandler::getDedicatedTableSchema(). The last two use FieldException, but the patch in #101 didn't include a use statement for this class. Also, ContentEntityDatabaseStorage::doDeleteFieldItems() was not touching the revision table, which looks necessary.

Comment #104

9 July 2014 at 16:21

Status:

Needs review

» Needs work

The last submitted patch, 103: 1498720.103.patch, failed testing.

Comment #105

Anonymous (not verified) commented 9 July 2014 at 16:48

This may allow us to kill hook_entity_load() and hook_ENTITY_TYPE_load() altogether.

Nope nope nope nope nope nope nope . Why would you do that? That has nothing to do with schema. It's an event trigger and what modules do with it is not Drupal's concern.

Comment #106

plach

he/him

Italian

Venezia

commented 10 July 2014 at 13:10

@alexpott:

Sorry, I was going on with my work in the sandbox (see also #2298525: Test issue for Make ContentEntityDatabaseStorage handle changes in the entity schema definition), I hope I will be able to include your interdiff without merging issues :)

@ivanjaros:

That's no longer on the table (see #24), I will update the issue summary when I have a more stable patch.

Comment #107

alexpott

he/they

English

🇪🇺🌍

commented 10 July 2014 at 14:55

Status:

Needs work

» Needs review

Status	File	Size
new	103-107-interdiff.txt	4.08 KB
new	1498720.107.patch	141.33 KB

Fix tests. I got getEntityTypeId and getTargetEntityTypeId mixed up!

Comment #108

dawehner

German

commented 15 July 2014 at 14:32

Someone will have to update the issue summary, it is kinda outdated.

One general question: People did not want to expose the table name / column name as API, but I guess it is unavoidable here.

Comment #109

plach

he/him

Italian

Venezia

commented 22 July 2014 at 23:47

Someone will have to update the issue summary, it is kinda outdated.

I will soon, I'm almost done with the first part (at least we should be close to a reviewable patch), see #2298525: Test issue for Make ContentEntityDatabaseStorage handle changes in the entity schema definition.

One general question: People did not want to expose the table name / column name as API, but I guess it is unavoidable here.

Yep, I am afraid it is. The route I took here was "hiding" it in the table mapping, so it's available only if you have a storage class that implements SqlEntityStorageInterface.
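
To show what "hiding" means for calling code, here is a minimal sketch. The interfaces and classes are simplified stand-ins with made-up names (the real table mapping is an object, not an array); the point is that table names are only reachable after an explicit check for the SQL-specific interface.

```php
<?php

/**
 * Stand-in for the generic storage interface (hypothetical name).
 */
interface SketchStorageInterface {}

/**
 * Stand-in for the SQL-specific interface that exposes the table mapping.
 */
interface SketchSqlStorageInterface extends SketchStorageInterface {

  /**
   * Returns field name => table name. Simplified to an array here.
   */
  public function getTableMapping(): array;

}

class SketchSqlNodeStorage implements SketchSqlStorageInterface {

  public function getTableMapping(): array {
    return ['title' => 'node_field_data', 'body' => 'node__body'];
  }

}

class SketchRemoteNodeStorage implements SketchStorageInterface {}

/**
 * Returns the table holding a field, or NULL when storage is not SQL-based.
 */
function sketch_field_table(SketchStorageInterface $storage, string $field_name): ?string {
  if (!$storage instanceof SketchSqlStorageInterface) {
    // Non-SQL storage: there is no table to expose, and no fatal error.
    return NULL;
  }
  $mapping = $storage->getTableMapping();
  return $mapping[$field_name] ?? NULL;
}

$sql_table = sketch_field_table(new SketchSqlNodeStorage(), 'body');
$remote_table = sketch_field_table(new SketchRemoteNodeStorage(), 'body');
```

Code such as Views integration would have to handle the NULL case instead of assuming a table always exists.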

Comment #110

fago

German

Vienna

commented 23 July 2014 at 08:26

I briefly reviewed the patch, without going into all the details for now. Overall I like how code has been refactored and moved around!

Here are some remarks:

```
+++ b/core/lib/Drupal/Core/Entity/Schema/EntitySchemaHandlerInterface.php
@@ -7,8 +7,48 @@
+  public function markFieldSchemaAsDeleted(FieldStorageDefinitionInterface $storage_definition);
```
This naming and description confused me a bit - what does it mean to mark a schema as deleted? Schema does not support something like that ;)

Maybe, it should be just prepareFieldSchemaDeletion()?

```
+++ b/core/lib/Drupal/Core/Entity/Sql/DefaultTableMapping.php
@@ -177,4 +197,105 @@ public function setExtraColumns($table_name, array $column_names) {
+  function allowsSharedTableStorage(FieldStorageDefinitionInterface $storage_definition) {
...
+  function requiresDedicatedTableStorage(FieldStorageDefinitionInterface $storage_definition) {
```

Why does one allow it and the other require it? Can't it just be "hasXTableStorage()" or usesXTableStorage()?

Also, the logic seems a bit duplicated, maybe one could check the other, i.e. !customStorage + !shared table -> dedicated table?
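
Roughly what I mean, as a sketch with made-up function names and array-based definitions. The conditions for shared storage used here (base field, single value) are only an assumption for the example and not necessarily what the patch checks:

```php
<?php

/**
 * Whether a field may live in the shared entity tables.
 *
 * The conditions are assumptions for this sketch, not the patch's logic.
 */
function sketch_allows_shared_table(array $definition): bool {
  return empty($definition['custom_storage'])
    && !empty($definition['base_field'])
    && ($definition['cardinality'] ?? 1) === 1;
}

/**
 * Derived from the check above: neither custom storage nor shared table.
 */
function sketch_requires_dedicated_table(array $definition): bool {
  return empty($definition['custom_storage'])
    && !sketch_allows_shared_table($definition);
}

$title = ['base_field' => TRUE, 'cardinality' => 1];
$tags = ['base_field' => FALSE, 'cardinality' => -1];
$computed = ['custom_storage' => TRUE];
```

This way the dedicated-table rule cannot drift away from the shared-table rule, because it is defined in terms of it.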

```
+++ b/core/modules/file/file.views.inc
@@ -520,9 +519,12 @@ function file_field_views_data(FieldConfigInterface $field) {
+  $table_mapping = $entity_manager->getStorage($entity_type_id)->getTableMapping();
```
Can it rely on having SQL storage here? The same applies in various other places, though it seems to be a pre-existing assumption. I guess we should ensure it at least does not FATAL on non-SQL storages, besides still providing a neat way to provide views integration. Stuff for another issue.

Comment #111

plach

he/him

Italian

Venezia

commented 23 July 2014 at 09:59

Thanks @fago, but that patch is very outdated, I am working in #2298525: Test issue for Make ContentEntityDatabaseStorage handle changes in the entity schema definition. I will see if I can incorporate your feedback there.

Comment #112

yesct commented 26 July 2014 at 09:03

Issue tags:

+DX (Developer Experience)

adding the more widely used tag.

Comment #113

plach

he/him

Italian

Venezia

commented 28 July 2014 at 22:55

Status	File	Size
new	et-entity_schema_handling-1498720-113.patch	248.26 KB

5 files were hidden/shown/deleted

Status	File	Size
hidden	et-entity_schema_handling-1498720-101.patch	136.81 KB
hidden	101-103-interdiff.txt	13.82 KB
hidden	1498720.103.patch	139.84 KB
hidden	103-107-interdiff.txt	4.08 KB
hidden	1498720.107.patch	141.33 KB

This implements the first part of the work, see the (upcoming) updated issue summary. I am now going to start phase 2 ;)

Comment #114

plach

he/him

Italian

Venezia

commented 29 July 2014 at 00:11

Issue summary:

View changes

Comment #115

plach

he/him

Italian

Venezia

commented 29 July 2014 at 00:17

+#1497374: Switch from Field-based storage to Entity-based storage

Comment #116

plach

he/him

Italian

Venezia

commented 29 July 2014 at 00:31

@fago:

The latest patch addresses most of your review. Some replies:

Done, but I thought the previous name made sense as field tables are renamed (marked) as deleted :)
I discussed those names extensively with @eff in Austin, and those were the ones that seemed to be more meaningful to him. No problem in renaming them, but I guess we should involve Alex in the bikeshed ;)
Stuff for #1740492: Implement a default entity views data handler

Comment #117

plach

he/him

Italian

Venezia

commented 29 July 2014 at 00:35

Title:

Make ContentEntityDatabaseStorage handle changes in the entity schema definition

» Make ContentEntityDatabaseStorage handle changes in the entity and field schema definitions

Better title

Comment #118

plach

he/him

Italian

Venezia

commented 29 July 2014 at 00:37

Title:

Make ContentEntityDatabaseStorage handle changes in the entity and field schema definitions

» Make the entity storage system handle changes in the entity and field schema definitions

Even better

Comment #119

plach

he/him

Italian

Venezia

commented 29 July 2014 at 00:39

Issue summary:

View changes

Comment #120

sun

German

Karlsruhe

commented 29 July 2014 at 01:02

Ugh. I wasn't aware that the result of the Entity Schema effort was going to retry a reinvention of Data API ad-hoc and 5 minutes to midnight, even though there's plenty of past evidence to prove that the approach simply doesn't work out for more complex data types in SQL storages.

Likewise, for NoSQL storages, we're talking about either (1) excessive mass-updates of all stored items, or (2) application code that has backwards-compatibility layers baked in, so as to be able to understand data/properties/values in different formats.

What exactly did we sign up for by committing the entity/field schema abstraction?

Comment #121

xjm

she/her

English

commented 29 July 2014 at 17:18

FYI, "Five minutes to midnight" is an unreasonable characterization as these issues have been beta-blocking since December (and #2183231: Make ContentEntityDatabaseStorage generate static database schemas for content entities was committed on June 1).

Comment #122

berdir

German

Switzerland

commented 29 July 2014 at 19:13

Thanks @xjm :)

I experimented a bit with this with file_entity and managed to use it to add the type/bundle field: https://github.com/md-systems/file_entity/commit/8bd8e27f2cd5e488edba4bf...

I had one problem and that is that the bundle field is NOT NULL and I need an 'initial' value. I experimented with supporting that automatically, it's very simple but a bit random, see 8.x-et-support-initial-1498720-berdir. Not saying you should include it, just putting it out as an idea. Would certainly need tests and documentation. As an alternative, I could override the schema handler through a custom storage class and add the initial there.

Comment #123

chx commented 29 July 2014 at 19:16

I thought massive changes are going to be handled by the migrate system ; just add D8 source classes to taste?

Comment #124

plach

he/him

Italian

Venezia

commented 29 July 2014 at 23:32

Version:

8.0.x-dev

» 8.x-dev

@sun:

TBH I am a bit surprised by your comment: as usual, I am completely open to discuss technical stuff, but your comment seems to imply I am trying to sneak in this huge change last-minute without prior discussions. Actually it's the exact opposite:

I have been talking about this stuff to everyone who looked even barely interested in it at each Drupal event I attended since Denver
I posted my implementation plan 5 months ago (see #48), after discussing the big picture in Prague. I announced that post on the g.d.o. core group and on Twitter.
I discussed it again in Szeged and Austin

To sum up, I think I did everything I could to advertise my goals in general and this issue's goals in particular. If that was not enough, I am sorry, but it certainly wasn't due to ill will. The only reason I started working on this so late is that there was an almost infinite list of prerequisite issues to address before it.

That said, Thursday at 6pm CEST we are going to have the usual #drupal-entity meeting with most key people in this area: you'd be welcome to join us so we can discuss your concerns.

@Berdir:

Your code looks certainly promising, if there are no major objections I'd see it as a valid solution.

@chx:

Not sure whether you are referring to @Berdir's comment or this issue in general, I will assume the latter: the plan for core is to limit schema changes to those not implying data migrations, e.g. the usual field table addition. The service responsible for managing changes could be swapped in contrib with an implementation relying on the Migrate API to also deal with changes that involve data migrations.

Comment #125

plach

he/him

Italian

Venezia

commented 29 July 2014 at 23:27

Restoring new version

Comment #126

yched commented 30 July 2014 at 10:13

The case of "initial" is a bit tricky.
- it only applies to the case where you're adding a new field stored in the base table, where there is already one row per existing entity. When used on a field stored in a dedicated table, supporting an initial value would mean inserting potentially 1000s of new rows.
- the "initial value" is not inherent to the field type, but very much case-by-case (it's the code / person that adds the new field that knows the correct initial value for that new field specifically).

To me, the above says "not part of the field type schema, not handled by the field storage layer, belongs to migrations".
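
A toy sketch of the difference, with arrays standing in for tables and made-up function names: on a base table the initial value is one more column on rows that already exist, while on a dedicated table it means creating one new row per entity.

```php
<?php

/**
 * Adds a column with an initial value to every row of a toy base table.
 */
function sketch_add_base_column(array $rows, string $column, $initial): array {
  foreach ($rows as $id => $row) {
    $rows[$id][$column] = $initial;
  }
  return $rows;
}

/**
 * Builds the rows a dedicated field table would need for the same backfill.
 */
function sketch_dedicated_backfill(array $rows, $initial): array {
  $inserts = [];
  foreach (array_keys($rows) as $id) {
    // One new row per existing entity.
    $inserts[] = ['entity_id' => $id, 'delta' => 0, 'value' => $initial];
  }
  return $inserts;
}

// Existing entities keyed by ID; the 'undefined' value is just an example.
$files = [1 => ['filename' => 'a.png'], 2 => ['filename' => 'b.pdf']];
$with_type = sketch_add_base_column($files, 'type', 'undefined');
$inserts = sketch_dedicated_backfill($files, 'undefined');
```

In both cases the value itself ('undefined' here) comes from whoever adds the field, not from the field type, which is the reason for treating it as migration logic.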

Comment #127

fago

German

Vienna

commented 31 July 2014 at 16:02

I discussed those names extensively with @eff in Austin, and those were the ones that seemed to be more meaningful to him. No problem in renaming them, but I guess we should involve Alex in the bikeshed ;)

I'm not into bikeshedding, but I'd like to understand why one allows and the other requires the storage "approach". Is requiresDedicatedStorage() stronger? (Its docs seem to be wrong as they are from shared storage.)

Also, the logic seems a bit duplicated, maybe one could check the other, i.e. !customStorage + !shared table -> dedicated table?

Looks like this has been addressed, thanks :)

Comment #128

xjm

she/her

English

commented 1 August 2014 at 14:38

We discussed this issue in the IRC meeting yesterday (including @fago, @berdir, @effulgentsia, @plach, @yched, @swentel, and @alexpott). Conclusions:

We will file a followup issue for the "initial value" problem space.
plach will ping joachim to see whether he has any concerns following his experience with the Data API, though we agreed this issue is much narrower in scope than that. (I left him an IRC tell.)
fago and effulgentsia will provide some code reviews of Part 1.
plach will also begin work on part 2 and explore what options/libraries might be available for the schema diff.

Comment #129

joachim commented 5 August 2014 at 07:16

It's a long time since I've worked on Data module...

What I do remember is that it has a hideous circularity problem that I never properly managed to fix. I don't recall any of the details, but the comments I put in the code at the time can do that for me ;)

```
/**
 * Implements hook_schema_alter().
 *
 * This is a central piece of data module:
 * Here we tack schema information that has been defined through the API in data_tables
 * or by hook_data_default onto the $schema array.
 *
 * We do not use hook_schema() for exposing schema information as this would cause a race
 * condition: ctools/exports looks for data module's data_tables at the same time when
 * we are actually rebuilding it - follow path through
 * data_get_all_tables() ... _data_load_table() ... ctools_export_load_object().
 *
 * TODO: This is still rather hairy, and needs more work.
 * In the meantime, it's probably best to enable CTools first, and then Data
 * rather than both together.
 */
function data_schema_alter(&$schema) {
  // Sidestep this during installation, as otherwise this is circular:
  // data_get_all_tables() calls ctools stuff, which calls the schema, and
  // gets us right back here.
  if (drupal_get_installed_schema_version('data') != SCHEMA_UNINSTALLED) {
```

So it looks like the big problem was that Data table definitions are exportable with CTools. Also, Data needs to use hook_schema_*alter*() because it's drawing from data that's in a table. With the D8 config system that won't be a problem at least.

Comment #130

berdir

German

Switzerland

commented 5 August 2014 at 07:21

entity tables aren't exposed to drupal_get_schema() at all (nor are field tables), so anything related to that can't be an issue :)

Comment #131

chx commented 5 August 2014 at 07:57

I still don't get it. Now I read the IS. Why are we not telling people to just add configurable fields? Why put all this effort into storing things in the base table? Between that and migrate, I really don't get why this is necessary.

Comment #132

joachim commented 5 August 2014 at 08:19

Would be a good idea to also ping the maintainer of ECK, which deals with an almost identical situation to Data.

Comment #133

Anonymous (not verified) commented 5 August 2014 at 09:56

@chx IMHO a) to save unnecessary joins to additional field tables, b) mostly base fields of an entity are single valued so it's easy to just open the entity field data table and immediately see all values instead of "hunting" the values in different field tables (DX).

The b) point for me is very important but I guess moving entirely to Fields would save us quite a bit of coding (base field - definition + display settings + form settings + all the code in core that handles base fields), no field would be outside of configuration, etc... so IMHO I'd agree with such an approach.

Comment #134

plach

he/him

Italian

Venezia

commented 5 August 2014 at 23:25

@joachim:

Thanks for your feedback! As @Berdir pointed out, we are no longer exposing the schema array through hook_schema() so that issue should not affect us. Moreover atm I am trying to spot differences that might trigger schema changes by just inspecting entity type/field schema definitions, so that schema array should not be a concern at least in this basic implementation. I am envisioning a contrib module that could expand on the core behavior and provide schema diffs, change previews and similar cool stuff, but that would totally rely on existing code/libraries. Anyway, I pinged @fmizzell as you suggested :)

@chx:

Well, actually denormalization in particular and performance in general are some of the biggest reasons I have in mind. Certainly we are still supporting the ability to add configurable fields via code, actually it's even possible to define bundle fields via code and they would get a dedicated field table, exactly as configurable fields. Adding stuff to the base table(s) is just another option that I think may be quite useful in certain situations. I must say that I'd very much prefer to use base fields defined in code (instead of bundle ones), though. I've always been disturbed by the fact that a configurable field, even one added in code and locked, is somehow "unreliable" and basing coded business logic on it feels just wrong to me. A possible option to make it easier to add base fields in scenarios where data is already available, could be to expose a way to specify that a certain field should be stored in a dedicated table even if it's a single-value base field.

Comment #135

fmizzell commented 6 August 2014 at 06:41

I haven't looked at D8's entity stuff since the time when typed data was being discussed, so I am not sure that looking at the patch and trying to give feedback would be that helpful at all, but it does seem that @plach envisions a contrib module that could offer a UI to trigger creation and deletion of base fields, and that is exactly what ECK is doing in D7. So as long as that is possible ECK will be able to work with the core system.

Comment #136

plach

he/him

Italian

Venezia

commented 6 August 2014 at 16:18

@fmizzell:

Thanks! Is there any experience you'd like to share (tricky stuff, good solutions) that could be useful to design the D8 system?

Comment #137

fmizzell commented 6 August 2014 at 20:07

In ECK I am lucky to have almost full control of how base field's metadata gets manipulated (A user could modify the serialized array in the db, but why would you when there is an API?). Adding a new property/base field looks something like this:

$entity_type->addProperty(...);
$entity_type->save();

So I can record manipulations that need to be performed to existing db tables and perform them during save. I am guessing that in D8 a lot of this metadata is in code or in the config system, so assuming that people are going through an API might be out of the question.

Knowing that users will go through an API eliminates the need for schema versions and other historical data (which makes everything much simpler), but on the other hand, this issue is more ambitious than just adding/deleting base fields, and saving historical information like diffs will be necessary anyways to help out with auto-migration features, etc.

Comment #138

plach

he/him

Italian

Venezia

commented 6 August 2014 at 21:47

@fmizzell:

#137 makes a lot of sense to me. As you correctly point out, the main difference in D8 is that Entity Field CRUD operations are not observable, at least at the moment, so what we need to do to detect changes is storing a copy of entity type and field storage definitions and comparing them with the ones currently available in the system. This has also one advantage though: such operation can be performed whenever we want and allows for multiple changes to be applied in a single run.
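
The comparison described above can be sketched roughly as follows. This is a minimal illustration using plain arrays in place of real definition objects; `detectDefinitionChanges()` and the array layout are hypothetical and not taken from the patch.

```php
<?php

/**
 * Hypothetical helper: compares a stored snapshot of field storage
 * definitions with the ones currently provided by code.
 *
 * Both arguments are plain arrays keyed by field name, an illustrative
 * stand-in for real definition objects.
 */
function detectDefinitionChanges(array $stored, array $current): array {
  $changes = [];
  foreach ($current as $name => $definition) {
    if (!array_key_exists($name, $stored)) {
      $changes[$name] = 'created';
    }
    elseif ($stored[$name] != $definition) {
      $changes[$name] = 'updated';
    }
  }
  // Anything left only in the snapshot no longer exists in code.
  foreach (array_keys(array_diff_key($stored, $current)) as $name) {
    $changes[$name] = 'deleted';
  }
  return $changes;
}

// Snapshot saved the last time schema and definitions were in sync.
$stored = [
  'title' => ['type' => 'string', 'length' => 255],
  'legacy' => ['type' => 'integer'],
];
// Definitions currently provided by code.
$current = [
  'title' => ['type' => 'string', 'length' => 512],
  'subtitle' => ['type' => 'string', 'length' => 255],
];

// Several changes are picked up in a single run.
$changes = detectDefinitionChanges($stored, $current);
print_r($changes);
```

Because the snapshot is only compared when this function runs, any number of definition edits made in between are collected and can be applied together.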

Anyway, as discussed with @yched elsewhere, we should probably ensure the Entity Manager handles all this stuff and the storage is just an event subscriber or something like that. However we can probably implement that approach as an API addition, so no need to tackle that here.

Comment #139

chx commented 6 August 2014 at 22:23

You need to realize that I have no bones in this -- mongodb will store whatever is the current field schema. As there's no database schema, for all it cares every document in a collection can be completely different. So from that perspective, I shouldn't care about this issue.

My concern is timing and resources and maintainability of the entity storage code. Please think carefully: is this really a release blocking task? Isn't it merely a feature? Shipping a field config with a module should be quite reliable AFAIK; it's already built and working.

Comment #140

fago

German

Vienna

commented 7 August 2014 at 16:34

@chx:
Well, actually denormalization in particular and performance in general are some of the biggest reasons I have in mind.

While performance benefits are great to have, I think the main reason for us having to do this is something else: So far entity types have been able to do updates to their entity base tables by writing the schema update operations themselves. With the schema being generated, this is not possible anymore. Instead we need this API to continue to have the possibility for entity type providing modules to make changes to their base fields / base table schema.

Comment #141

gábor hojtsy

he/him

Hungarian

Hungary

commented 8 August 2014 at 08:12

Issue tags:

+Drupalaton 2014

Comment #142

fago

German

Vienna

commented 8 August 2014 at 16:08

Status	File	Size
new	d8_storage.interdiff.txt	10.23 KB
new	d8_storage.patch	248.01 KB

ok, I looked through the latest version of the patch (branch 8.x-et-entity_schema_handling-1498720-plach) in detail now. In general, this looks great. Adding an updated version with some small documentation stuff and glitches fixed.

Here are some remarks/questions:

      // There is just one field for each dedicated storage table, thus
      // $field_name can only refer to it.
      if (isset($field_name) && $this->requiresDedicatedTableStorage($this->fieldStorageDefinitions[$field_name])) {
        $this->allColumns[$table_name] = array_merge($this->getExtraColumns($table_name), $this->allColumns[$table_name]);
      }
      else {
        $this->allColumns[$table_name] = array_merge($this->allColumns[$table_name], $this->getExtraColumns($table_name));
      }

I do not understand why the array_merge() order differs here. So this could use a comment / some explanation.
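
For context, with plain lists of column names the order of the array_merge() arguments only changes where the extra columns end up in the result. A minimal sketch with made-up column names (not the patch's actual values); whether column ordering is the real intent of the two branches is an assumption here:

```php
<?php

// Made-up column lists, for illustration only.
$field_columns = ['body_value', 'body_format'];
$extra_columns = ['entity_id', 'langcode', 'delta'];

// Extra columns first, as in the dedicated-table branch of the snippet.
$dedicated = array_merge($extra_columns, $field_columns);

// Extra columns last, as in the else branch.
$shared = array_merge($field_columns, $extra_columns);

echo implode(', ', $dedicated), "\n";
echo implode(', ', $shared), "\n";
```

Both results contain the same five names; only their position differs, which is why a code comment explaining the chosen order would help.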

/**
 * Defines a schema handler that supports revisionable, translatable entities.
 */
class ContentEntitySchemaHandler implements ContentEntitySchemaHandlerInterface {

Given #2275659: Separate FieldableEntityInterface out of ContentEntityInterface it's actually not bound to content entities, but fieldable entities. I'm not sure whether it should be FieldableEntitySchemaHandlerInterface or better just "EntitySchemaHandlerInterface" + type hint on fieldable entity interface later on and ContentEntityInterface for now.

    // If we are adding a field stored in a shared table we need to recompute
    // the table mapping.
    if ($this->getTableMapping()->allowsSharedTableStorage($storage_definition)) {
      $this->tableMapping = NULL;
    }

Yep, but we do not generally know which changes influence the table mappings and which don't. So shouldn't we just re-compute the mapping on every change?

    if ($table_mapping->requiresDedicatedTableStorage($storage_definition)) {
      // Mark all data associated with the field for deletion.
      $table = $table_mapping->getDedicatedDataTableName($storage_definition);
      $revision_table = $table_mapping->getDedicatedRevisionTableName($storage_definition);
      $this->database->update($table)
        ->fields(array('deleted' => 1))
        ->execute();
      $this->database->update($revision_table)
        ->fields(array('deleted' => 1))
        ->execute();
    }

Shouldn't those changes be in the SchemaHandler as well? The name and description of the schema handler imply that it is used for dedicated fields as well. If we want to keep implementations separated, I'd suggest moving it to a trait and using the trait in the schema handler?

CommentSchemaHandler
This one seems to be unused?

    $storage = $entity_manager->getStorage($field_storage->getTargetEntityTypeId());
    $result = $storage instanceof ContentEntityDatabaseStorage ? $storage : FALSE;

Is there a reason it checks the implementation and not on the sql storage interface?

      ->setSetting('unsigned', TRUE)
      ->addConstraint('TermParent', array())
      ->setCustomStorage(TRUE);

oh, yes :)

Comment #143

fago

German

Vienna

commented 9 August 2014 at 07:02

        $this->dropEntitySchema($original);
        $this->storage->setEntityType($entity_type);
        unset($this->schema[$entity_type->id()]);
        $this->createEntitySchema($entity_type);

Does it check whether the entity has data somewhere? I'm missing it. ~~Also, it would make sense to handle simple cases e.g. column additions and removals without migration / data loss I think? -> We do :)~~

Comment #144

8 August 2014 at 16:55

Status:

Needs review

» Needs work

The last submitted patch, 142: d8_storage.patch, failed testing.

Comment #145

plach

he/him

Italian

Venezia

commented 8 August 2014 at 18:02

Issue summary:	View changes
Status:	Needs work	» Needs review

Status	File	Size
new	et-entity_schema_handling-1498720-145-part1.txt	255.37 KB
new	et-entity_schema_handling-1498720-145-part2.txt	46.29 KB
new	Status_report___Drupal_8_x_-_DEV.png	90.94 KB
new	Drupal_entity_schema_updates___Drupal_8_x_-_DEV.png	224.87 KB
new	Drupal_module_updates___Drupal_8_x_-_DEV.png	188.43 KB

Here's my current work: part2 still needs work, and I realized also part1 needs a bit of additional work to handle index/keys changes, but this is definitely looking more like it. Not sending to the bot yet, but manual testing here looks promising (see https://twitter.com/plach__/status/497797333015601153). Adding some screenshots of the UI tweaks.

(this does not include @fago's changes, just saw them)

Comment #146

plach

he/him

Italian

Venezia

commented 8 August 2014 at 17:58

Status:

Needs review

» Needs work

Comment #147

plach

he/him

Italian

Venezia

commented 8 August 2014 at 17:59

Issue summary:

View changes

Comment #148

fago

German

Vienna

commented 9 August 2014 at 08:46

Thanks!

+++ b/core/lib/Drupal/Core/Entity/Schema/ContentEntitySchemaHandler.php
@@ -95,6 +95,60 @@ public function __construct(EntityManagerInterface $entity_manager, ContentEntit
+    // A change in the storage class may or may not imply a data migration. We

This misses the storage class check although commented - it should probably just call the requiresChanges method for now as it's the same in case of entity data? Or do we handle going from revisionable to not revisionable without migration?

```
+++ b/core/lib/Drupal/Core/Extension/ModuleHandler.php
@@ -868,11 +869,12 @@ public function install(array $module_list, $enable_dependencies = TRUE) {
-            $entity_manager->getStorage($entity_type->id())->onEntityTypeDefinitionCreate();
...
+            $entity_schema_manager->onEntityTypeDefinitionCreate($entity_type);
```
So this makes the base system support schema changes instead of letting the entity storage handle changes. As update.php lists schema changes separately, it makes sense - but it still hard-codes entity storage changes to be schema changes only.
Not sure whether that's a use-case, but should we continue routing change notification calls through the storage, such that a non-SQL storage could implement its own storage adaptations as needed?
Then, the SQL storage could forward calls to the schema manager service internally. Thoughts?

Other storage engines might not have to do schema changes, but still e.g. data cleanup, changing indexes etc. might be interesting. Then who knows what sort of storage implementation people will come up with?

If we do this, the question is how we deal with the summary in update.php. I could see us just continuing to hard-code the summary to schema changes. Even if other storages cannot provide a summary, avoiding locking them out of storage changes seems reasonable.

Comment #149

plach

he/him

Italian

Venezia

commented 10 August 2014 at 15:12

Status:

Needs work

» Needs review

Status	File	Size
new	et-entity_schema_handling-1498720-149.patch	256.46 KB
new	et-entity_schema_handling-1498720-149.interdiff.txt	3.63 KB
new	et-entity_schema_handling-1498720-149.interdiff-142_145.txt	16.9 KB

5 files were hidden/shown/deleted

Status	File	Size
hidden	et-entity_schema_handling-1498720-113.patch	248.26 KB
hidden	d8_storage.interdiff.txt	10.23 KB
hidden	d8_storage.patch	248.01 KB
hidden	et-entity_schema_handling-1498720-145-part1.txt	255.37 KB
hidden	et-entity_schema_handling-1498720-145-part2.txt	46.29 KB

This should address @fago's reviews (thanks :). Attached you can find also the interdiff between #142 and #145, which probably @fago missed.

Given #2275659: Separate FieldableEntityInterface out of ContentEntityInterface it's actually not bound to content entities, but fieldable entities. I'm not sure whether it should be FieldableEntitySchemaHandlerInterface or better just "EntitySchemaHandlerInterface" + type hint on fieldable entity interface later on and ContentEntityInterface for now.

I've been asking myself more or less the same questions (here and in other places): I think for now it is safer to target the most specific scenario (just content entity types). We can discuss the details and "widen" the target safely, while the opposite strategy would imply API breaks.
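
The widen-versus-narrow argument can be illustrated from a caller's point of view with a small sketch. The interfaces and classes below are hypothetical; they only mirror the assumed relationship in which a content entity interface extends a fieldable one.

```php
<?php

// Hypothetical interfaces: "content" is the more specific of the two.
interface ExampleFieldableInterface {}
interface ExampleContentInterface extends ExampleFieldableInterface {}

class ExampleFieldableThing implements ExampleFieldableInterface {}
class ExampleContentThing implements ExampleContentInterface {}

// Narrow API: only content entities are accepted.
function handleNarrow(ExampleContentInterface $entity): string {
  return get_class($entity);
}

// Widened API: anything fieldable is accepted.
function handleWide(ExampleFieldableInterface $entity): string {
  return get_class($entity);
}

// Callers written against the narrow API keep working after widening.
echo handleWide(new ExampleContentThing()), "\n";

// Callers written against the wide API break if it is later narrowed.
try {
  handleNarrow(new ExampleFieldableThing());
}
catch (TypeError $e) {
  echo "TypeError: narrowing rejected a previously valid argument\n";
}
```

This covers callers only; code that implements or overrides a method with the type hint sees the change from the other side and may need separate consideration.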

Yep, but we do not generally know which changes influence the table mappings and which don't. So shouldn't we just re-compute the mapping on every change?

Fair point. But we can bring it even further and say that the storage should make no assumption on what is needed to reflect the change in the schema, so ideally it should be responsibility of the schema handler to instantiate a fresh table mapping. For now I just added a todo about removing those lines when we are done with #2274017: Make SqlContentEntityStorage table mapping agnostic .

Shouldn't those changes be in the SchemaHandler as well?

I don't think that code belongs to the schema handler: the storage handles field data, while the schema handler deals with schema. IMHO this way the overall (CR)AP strategy is correctly split into its two respective areas of responsibility.

CommentSchemaHandler
This one seems to be unused?

What do you mean? :)
It's overriding the getEntitySchema() method as usual.

Is there a reason it checks the implementation and not on the sql storage interface?

Well, the current implementation hardcodes assumptions on the table layout that are specific to the core implementation provided by ContentEntityDatabaseStorage. I think we can get rid of this entirely in #1740492: Implement a default entity views data handler.

Does it check whether the entity has data somewhere? I'm missing it.

Validation is being implemented in part2 atm. My current approach is leaving the possibility to override data handling policies in the schema manager. The storage layer just exposes new methods to describe how/if the change affects data. This part could use some IRC discussion :)

it should probably just call the requiresChanges method for now as it's the same in case of entity data? Or do we handle going from revisionable to not revisionable without migration?

Yep, we just need to drop revision tables. Added a comment to clarify that.

[...] Not sure whether that's a use-case, but should we continue routing change notification calls through the storage, such that a non-SQL storage could implement its own storage adaptations as needed? [...]

This is an area I spent quite some time thinking around: IMO ideally all these notifications should be routed through the entity manager, which in turn would be responsible for notifying interested entity handlers (probably only storage) and for proxying the notifications to other interested business objects, such as the schema manager. Among the rest this would allow the EM to clear its own caches as discussed with @yched in #2144263-87: Decouple entity field storage from configurable fields.

I agree the current approach is not the cleanest possible, but it is still an improvement wrt the HEAD code (although the interdiff looks worse) and it ensures that any change to definitions is persisted in state after being reflected in the storage. In fact every call is initially proxied to the storage, so we are losing nothing in terms of the use cases we are able to support.

I think we should clean this up in a dedicated issue: we can definitely implement the plan above without introducing relevant API changes.
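
A rough sketch of the routing plan described above, where the entity manager is the single entry point and proxies notifications to the storage and to the schema manager. All class names are hypothetical stand-ins, and the listener signature is simplified to a plain entity type ID string; only the method name onEntityTypeDefinitionCreate() is borrowed from the patch.

```php
<?php

/**
 * Hypothetical listener contract with a simplified signature.
 */
interface ExampleDefinitionListener {
  public function onEntityTypeDefinitionCreate(string $entity_type_id): void;
}

/**
 * Stand-in for a storage handler reacting to the notification.
 */
class ExampleStorage implements ExampleDefinitionListener {
  public $log = [];

  public function onEntityTypeDefinitionCreate(string $entity_type_id): void {
    $this->log[] = "storage prepared for $entity_type_id";
  }
}

/**
 * Stand-in for a schema manager keeping track of installed definitions.
 */
class ExampleSchemaManager implements ExampleDefinitionListener {
  public $installed = [];

  public function onEntityTypeDefinitionCreate(string $entity_type_id): void {
    $this->installed[$entity_type_id] = TRUE;
  }
}

/**
 * Stand-in for the entity manager acting as the single entry point.
 */
class ExampleEntityManager implements ExampleDefinitionListener {
  protected $listeners = [];
  public $cacheCleared = FALSE;

  public function addListener(ExampleDefinitionListener $listener): void {
    $this->listeners[] = $listener;
  }

  public function onEntityTypeDefinitionCreate(string $entity_type_id): void {
    // The manager can clear its own caches before proxying the call.
    $this->cacheCleared = TRUE;
    foreach ($this->listeners as $listener) {
      $listener->onEntityTypeDefinitionCreate($entity_type_id);
    }
  }
}

$storage = new ExampleStorage();
$schema_manager = new ExampleSchemaManager();
$manager = new ExampleEntityManager();
$manager->addListener($storage);
$manager->addListener($schema_manager);

$manager->onEntityTypeDefinitionCreate('aggregator_feed');
print_r($storage->log);
```

In this shape a site with a non-SQL storage could simply not register a schema manager, which is the kind of flexibility the discussion is after.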

Comment #150

plach

he/him

Italian

Venezia

commented 10 August 2014 at 14:57

4 files were hidden/shown/deleted

Status	File	Size
hidden	et-entity_schema_handling-1498720-149.interdiff-142_145.txt	16.9 KB
hidden	Drupal_module_updates___Drupal_8_x_-_DEV.png	188.43 KB
hidden	Drupal_entity_schema_updates___Drupal_8_x_-_DEV.png	224.87 KB
hidden	Status_report___Drupal_8_x_-_DEV.png	90.94 KB

Comment #151

plach

he/him

Italian

Venezia

commented 10 August 2014 at 15:01

Status	File	Size
new	et-entity_schema_handling-1498720-151.patch	299.37 KB
new	et-entity_schema_handling-1498720-151-part2.txt	46.55 KB

1 file was hidden/shown/deleted

Status	File	Size
hidden	et-entity_schema_handling-1498720-149.patch	256.46 KB

Sorry, patch in #149 contains only part1 while it was meant to contain both. Here is the complete one. Interdiffs above are correct. Attached the difference between #149 and #151.

Comment #152

xjm

she/her

English

commented 13 August 2014 at 15:38

Issue tags:

+Needs issue summary update

Looks like the summary could use an update. :) Is that the full implementation of part 2?

Comment #153

plach

he/him

Italian

Venezia

commented 13 August 2014 at 16:44

I am not sure what to update, I think I followed strictly what is proposed in the summary.

Comment #154

gábor hojtsy

he/him

Hungarian

Hungary

commented 13 August 2014 at 16:47

I think the biggest question is: are phase 1 and phase 2 both part of this issue and intended to be committed here? What are the missing pieces?

Comment #155

plach

he/him

Italian

Venezia

commented 13 August 2014 at 23:05

Part 1 and part 2 are split just for reviewer convenience. We agreed they should be brought on together. Then maybe we can commit them separately if that's the easier way to get this in, but they should be set to RTBC together.

I am still coding a few bits, but we should be close. Namely:

Detect changes in entity schema indexes/keys.
Block module uninstallation if data is available, since we have the purge issue still open: #2282119: Make the Entity Field API handle field purging.
Block entity schema changes. We need this temporarily as the various entity-type-specific storage classes do not support dynamic table layouts yet, so switching the entity schema causes queries to break. Moreover we need Views integration fixes (#1740492: Implement a default entity views data handler and friends) before allowing for those, for the very same reason. However this can be done in a follow-up, as just unblocking these changes (simply removing a throw) is a totally non API breaking change.
Improve test coverage for the schema manager.

Additionally, I am wondering whether we should add a method to reconcile the definitions stored in state with the ones available in code. This would allow people to apply changes manually and just notify the schema manager that everything is fine again.

Comment #156

fago

German

Vienna

commented 14 August 2014 at 18:04

Great work, we've come a long way already! Here is another review; I've not really gone into details, and tried to stay more on the bigger picture for now.

```
+++ b/core/lib/Drupal/Core/Entity/Schema/ContentEntitySchemaHandler.php
@@ -95,6 +95,65 @@ public function __construct(EntityManagerInterface $entity_manager, ContentEntit
+    // A change in the storage class may or may not imply a data migration. We
+    // assume it does. This method should be overridden otherwise. Basically the
```
hm, reading this again I'm wondering what an example of a change in the storage class would be, which would lead to a different schema being generated?
I guess it boils down to table mapping changes, so maybe we can add a todo to compare the generated table mappings? I guess changing the class would be something you'd like to be able to do without being forced into some migrations ;-)

+++ b/core/lib/Drupal/Core/Entity/Schema/ContentEntitySchemaHandler.php
@@ -95,6 +95,65 @@ public function __construct(EntityManagerInterface $entity_manager, ContentEntit
+    // only schema change that does not imply a data migration is from
+    // revisionable to non revisionable, as in that case we just need to drop
+    // revision tables.

Yes, but updateEntitySchema() seems to drop the schema and re-add it on the entity level - so there is still entity data migration required? It just handles dedicated revision tables better?

```
+++ b/core/lib/Drupal/Core/Extension/ModuleHandler.php
@@ -868,11 +869,12 @@ public function install(array $module_list, $enable_dependencies = TRUE) {
+          if ($entity_type instanceof ContentEntityTypeInterface && $entity_type->getProvider() == $module) {
+            $entity_schema_manager->onEntityTypeDefinitionCreate($entity_type);
```
hm, yet another object with those event listeners. Should we start proxying all of them via the manager already as part of this issue?
It would make more sense to me to have the storage notified from the manager and not from the schema manager as now, as there might be storages completely unrelated to schema. E.g. a site running mongodb might want something like a null schema handler (or even remove the service if possible later on).

Related, I'm not sure the SchemaManager should be the one who stores the definition changes into state. That does not directly seem to be related to schema management, as it will be required for figuring out definition changes in general, to which storages should be able to react - howsoever they need to do that. Maybe, mongo just needs to drop/clean some data?
So what about moving this to the EntityManager?

I've been asking myself more or less the same questions (here and in other places): I think for now it is safer to target the most specific scenario (just content entity types). We can discuss the details and "widen" the target safely, the opposite strategy would imply API breaks.

I see, good argument - but isn't it actually the other way round? E.g. changing an argument from FieldableInterface to ContentEntityInterface would not break things, but the other way round it would be broken as you narrow down the interface?

Fair point. But we can bring it even further and say that the storage should make no assumption on what is needed to reflect the change in the schema, so ideally it should be responsibility of the schema handler to instantiate a fresh table mapping. For now I just added a todo about removing those lines when we are done with #2274017: Make SqlContentEntityStorage table mapping agnostic .

True, ok.

I don't think that code belongs to the schema handler: the storage handles field data, while the schema handler deals with schema. IMHO this way the overall (CR)AP strategy is correctly split into its two respective areas of responsibility.

I'd not say this is traditional data that you saved as it's a consequence of the field definition change; i.e. it's a metadata/schema change not a data change. But as the storage class needs to have knowledge about the 'deleted' column anyway, it doesn't seem to matter much where it lives.

CommentSchemaHandler
This one seems to be unused?

What do you mean? :)
It's overriding the getEntitySchema() method as usual.

Nope? Not in the previous version I reviewed, nor in the latest one. Other storages override schemaHandler() to register it, but CommentStorage doesn't?

Well, the current implementation hardcodes assumptions on the table layout that are specific to the core implementation provided by ContentEntityDatabaseStorage. I think we can get rid of this entirely in #1740492: Implement an entity views data controller.

I see, yeah let's deal with that over there.

This is an area I spent quite some time thinking around: IMO ideally all these notifications should be routed through the entity manager, which in turn would be responsible for notifying interested entity handlers (probably only storage) and for proxying the notifications to other interested business objects, such as the schema manager.

Yep, I've been thinking about that as well. Given all the various reactions in different classes I've been wondering whether it makes sense to start making use of symfony events for that also. It should be possible to introduce without an API break though if we continue to call pre-existing listener methods. Proxying them through the storage for now is fine imo.

* - entity_type: a scalar having only the ENTITY_TYPE_UPDATED value.

If there is only one possible value, having the constant seems superfluous? Why not just set it to TRUE? Or better, just have general constants for CREATED, UPDATED and DELETED which we can re-use independent of what has been created|updated|deleted?

* Only changes that do not imply a data migration are applied when data is
* available for a certain entity type. If any change fails to comply with
* this policy the operation is aborted.

How would I notice an aborted operation? Does it throw an exception? If so, that misses docs.

Comment #157

effulgentsia commented 19 August 2014 at 01:54

Sorry to be joining the review party so late (17 days since #128.3 was written). I started trying to absorb the patch, and the first thing that I got stuck on is:

+++ b/core/modules/aggregator/src/FeedStorage.php
@@ -21,24 +21,11 @@ class FeedStorage extends ContentEntityDatabaseStorage implements FeedStorageInt
-  public function getSchema() {
-    $schema = parent::getSchema();
-
-    // Marking the respective fields as NOT NULL makes the indexes more
-    // performant.
-    $schema['aggregator_feed']['fields']['url']['not null'] = TRUE;
-    $schema['aggregator_feed']['fields']['queued']['not null'] = TRUE;
-    $schema['aggregator_feed']['fields']['title']['not null'] = TRUE;
-
-    $schema['aggregator_feed']['indexes'] += array(
-      'aggregator_feed__url'  => array(array('url', 255)),
-      'aggregator_feed__queued' => array('queued'),
-    );
-    $schema['aggregator_feed']['unique keys'] += array(
-      'aggregator_feed__title' => array('title'),
-    );
-
-    return $schema;
+  protected function schemaHandler() {
+    if (!isset($this->schemaHandler)) {
+      $this->schemaHandler = new FeedSchemaHandler($this->entityManager, $this->entityType, $this, $this->database);
+    }
+    return $this->schemaHandler;

Looks like this patch splits every content entity storage handler into a custom storage handler + a custom schema handler, but are we sure we want to require that split in every case? I support the split at the base class level (ContentEntityDatabaseStorage + ContentEntitySchemaHandler) and making it possible to subclass each one separately, but as a default case, would it be better DX for ContentEntityDatabaseStorage::getEntitySchema() to call back into $this->storage->adjustSchema() (pending better name), so that most content entity types can get away with only implementing the one storage class unless they need a custom schema handler for more exotic reasons?

Comment #158

effulgentsia commented 19 August 2014 at 22:51

+++ b/core/lib/Drupal/Core/Entity/Sql/DefaultTableMappingInterface.php
@@ -0,0 +1,118 @@
+interface DefaultTableMappingInterface extends TableMappingInterface {

Why does this need to be its own interface? Why not add the new methods to TableMappingInterface? I don't see where there's code that benefits from a narrower TableMappingInterface that lacks the new methods.

call back into $this->storage->adjustSchema() (pending better name)

At least based on the examples in core, would optimizeSchema() be a good name for this? If I'm reading it right, then the base class (ContentEntitySchemaHandler) generates a functionally correct schema, but what's left as a per-entity-type responsibility is adding indexes and setting not-null constraints (primarily to benefit indexes), so I think "optimize" would be a decent name for that, but open to other suggestions.
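
To make the suggestion concrete, here is a minimal sketch of a base schema handler that generates a functionally correct schema and then calls back into the storage so it can add indexes and NOT NULL constraints. The class names are hypothetical stand-ins, the generated schema is hard-coded for brevity, and optimizeSchema() is only the name proposed above, not an existing API.

```php
<?php

/**
 * Hypothetical stand-in for the base storage class.
 */
class ExampleBaseStorage {

  /**
   * Default implementation: no per-entity-type optimization.
   */
  public function optimizeSchema(array $schema): array {
    return $schema;
  }

}

/**
 * Hypothetical stand-in for the base schema handler.
 */
class ExampleBaseSchemaHandler {

  protected $storage;

  public function __construct(ExampleBaseStorage $storage) {
    $this->storage = $storage;
  }

  public function getEntitySchema(): array {
    // A real handler would derive this from the field definitions.
    $schema = [
      'aggregator_feed' => [
        'fields' => [
          'url' => ['type' => 'varchar', 'length' => 2048],
          'queued' => ['type' => 'int'],
        ],
        'indexes' => [],
      ],
    ];
    // Let the storage class tweak the generated schema.
    return $this->storage->optimizeSchema($schema);
  }

}

/**
 * Entity-type-specific storage that only overrides the optimization hook.
 */
class ExampleFeedStorage extends ExampleBaseStorage {

  public function optimizeSchema(array $schema): array {
    // NOT NULL columns make the indexes more performant.
    $schema['aggregator_feed']['fields']['url']['not null'] = TRUE;
    $schema['aggregator_feed']['fields']['queued']['not null'] = TRUE;
    $schema['aggregator_feed']['indexes'] += [
      'aggregator_feed__url' => [['url', 255]],
      'aggregator_feed__queued' => ['queued'],
    ];
    return $schema;
  }

}

$handler = new ExampleBaseSchemaHandler(new ExampleFeedStorage());
$schema = $handler->getEntitySchema();
print_r(array_keys($schema['aggregator_feed']['indexes']));
```

With this shape most entity types would ship a single storage class, and a separate schema handler subclass would only be needed for more exotic cases.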

Finally, neither #151 nor #2298525-48: Test issue for Make ContentEntityDatabaseStorage handle changes in the entity schema definition applies cleanly to HEAD anymore. Would it be possible to get an updated patch posted once HEAD changes are merged back into the sandbox?

Comment #159

plach

commented 19 August 2014 at 23:30

@fago:

1: One example is the contact_message entity: it currently has null storage, but we have tests where a regular storage class is swapped in. We need to generate the schema in that case.
2: Yep, still working on that, as part of the entity keys/indexes stuff.
3: Unless every (content) entity type is stored in Mongo, I don't think a null schema manager would make sense. Anyway, I think the switch to the EntityManager as the class responsible for notifying (actually proxying notifications) about entity/field definition changes could deserve its own issue, as I think there are a few aspects to figure out that are not completely trivial. Personally I'd avoid rushing it in here, but if there's consensus we should be addressing that now, fine by me. What we could end up with is a scenario where the EntityManager actually stores data in state and tracks changes in definitions, and the schema manager receives notifications about those and interacts with the status report and the storage class to actually apply them. The only reason why I am currently routing calls through the schema manager is to ensure definitions are properly stored in state.
4: If we restrict the API to ContentEntityInterface and then we decide to "relax" it to support EntityInterface we break no existing code. I am not sure where a FieldableInterface would fit in a hypothetical hierarchy, but I'd guess ContentEntityInterface would extend it, so the same reasoning would apply.
6: I agree it's metadata. Currently what determines the decision of putting some code in the storage or in the schema handler is: if code touches table records it goes in the storage, otherwise in the schema handler.
7: Sorry, I missed what you meant.
9: Yep, this looks very event-ish to me too :)
10: Good point.
11: It is supposed to throw an exception; if it does not, I will add that and document it :)

@effulgentsia:

There are two reasons for this:

The schema handler needs to retrieve the full entity schema (entity-type specific bits included) in various parts of its code; in its current form it can do that without needing to rely on the storage class.
The current way allows all the schema handling to be internal: there is no public method returning the schema array anymore. This allowed me to write all the consuming code in a cleaner way, without introducing assumptions that make sense only for our core storage.

Looking at core implementations, with the sole exception of block content, every entity type that needs a specific schema handler also needs a specific storage, so I don't think this is a big DX burden. If it turns out I am wrong, we can try to refactor things so that the schema handler is injected into the storage class, but that would prevent lazy loading, which would be bad since the schema handler is a big class that is used very rarely.

Comment #160

plach

commented 19 August 2014 at 23:29

@effulgentsia:

Why does this need to be its own interface?

Because this is the specific implementation for core storage: other storage classes might want to provide completely different ways to deal with SQL tables and implement completely different table layouts (see also #2274017: Make SqlContentEntityStorage table mapping agnostic ). In that case the concepts of dedicated/shared tables would be meaningless.

I am completing a couple of changes and then I will merge head. Hopefully it won't be too painful :)

Comment #161

effulgentsia commented 20 August 2014 at 00:49

The schema handler needs to retrieve the full entity schema (entity-type specific bits included) in various parts of its code, in the current form it can do that without needing to rely on the storage class.

How? $storage is a constructor dependency of ContentEntitySchemaHandler, and is used already within ContentEntitySchemaHandler::getEntitySchema(). All I'm saying is that at the end of that implementation, call $this->storage->optimizeSchema($schema). That doesn't add any additional object dependency that isn't already there.

The current way allows all the schema handling to be internal, there is no public method returning the schema array anymore.

Is that really a benefit? If a particular entity type's storage class extends ContentEntityDatabaseStorage, then it's already bound to the concept that some kind of schema exists, so outside code knowing that an optimizeSchema() method can be called on it isn't very harmful, is it? I actually think that would be a legitimate method to define in SqlEntityStorageInterface. Note, outside code couldn't get a schema from it, it would need to pass one in and let the method modify it (i.e., add indexes).

This allowed me to write all the consuming code in a cleaner way, without introducing assumptions that make sense only for our core storage.

I don't think this would change that. The consuming code you're talking about would still purely act on ContentEntitySchemaHandlerInterface exactly the same as now. It's only that the ContentEntitySchemaHandler implementation of the protected getEntitySchema() method would interact with the storage object it already has access to.

other storage classes might want to provide completely different ways to deal with SQL tables and implement completely different table layouts...In that case the concepts of dedicated/shared tables would be meaningless.

Ah, thanks for that explanation. In that case, should getReservedColumns() and getFieldColumnName() move to TableMappingInterface? Those seem generic to me rather than bound to any particular layout (dedicated/shared) strategy.
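As a rough illustration of that split (a sketch only: the `Sketch*` names, the simplified string-based signatures, the `usesDedicatedTable()` method, and the `deleted` reserved column are assumptions, not the patch's API), the generic column questions would live on the base interface while anything layout-specific stays on the default one:

```php
<?php

// Hypothetical base interface: questions any SQL table layout can answer.
interface SketchTableMappingInterface {

  // Column names that field columns may not use.
  public function getReservedColumns(): array;

  // Maps a field name and one of its columns to a table column name.
  public function getFieldColumnName(string $field_name, string $column): string;

}

// Hypothetical core-layout interface: adds the dedicated/shared table concept.
interface SketchDefaultTableMappingInterface extends SketchTableMappingInterface {

  public function usesDedicatedTable(string $field_name): bool;

}

class SketchDefaultTableMapping implements SketchDefaultTableMappingInterface {

  private $dedicatedFields;

  public function __construct(array $dedicated_fields) {
    $this->dedicatedFields = $dedicated_fields;
  }

  public function getReservedColumns(): array {
    // Assumed value for illustration.
    return ['deleted'];
  }

  public function getFieldColumnName(string $field_name, string $column): string {
    // Reserved columns keep their own name; others are prefixed by the field.
    return in_array($column, $this->getReservedColumns(), TRUE)
      ? $column
      : $field_name . '_' . $column;
  }

  public function usesDedicatedTable(string $field_name): bool {
    return in_array($field_name, $this->dedicatedFields, TRUE);
  }

}

$mapping = new SketchDefaultTableMapping(['body']);
$column = $mapping->getFieldColumnName('body', 'value');
```

Code that only needs column names could then type-hint the base interface and keep working with a storage that uses a completely different table layout.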

Comment #162

plach

commented 20 August 2014 at 22:07

How? $storage is a constructor dependency of ContentEntitySchemaHandler

Well, the plan for #2274017: Make SqlContentEntityStorage table mapping agnostic is to make the schema handler depend only on the table mapping class, by making the latter encapsulate the logic that is currently located in ContentEntityDatabaseStorage::getTableMapping(). This would allow instantiating a table mapping class wherever needed. However, I just realized we are currently using the hasData() and countFieldData() methods from the storage, so I guess we won't be able to make the schema handler completely independent from the storage.

Anyway, personally I find handling schema only in the schema handler cleaner, but if there's consensus that the current DX is too bad (I definitely don't think so) your proposal is probably the best way forward.

Comment #163

plach

commented 21 August 2014 at 00:20

Issue summary:

View changes

Actually any table layout change implies a data migration... updated summary.

Comment #164

plach

commented 22 August 2014 at 00:47

Status	File	Size
new	et-entity_schema_handling-1498720-164.interdiff.txt	9.76 KB
new	et-entity_schema_handling-1498720-164.patch	319.9 KB

This addresses bullets 2, 7, 10, 11 of #156. The interdiff does not include 2, as I did quite a few merges in the meantime. I am waiting for feedback on the other bullets. If we decide to address #156.3 in a separate issue, then I think we are ready to split this into sub-issues as @eff proposed. I still have a couple of issues on my todo list and there's still #161 to be discussed, but I think none of them would imply cross-issue changes, so we should be fine.

@effulgentsia:

If you want to proceed with the split, please use my sandbox and create a separate branch for each sub-issue, branching the dependent ones off the independent ones, so I can perform cross-branch changes if needed.

Comment #165

plach

commented 22 August 2014 at 00:48

Issue summary:

View changes

Comment #166

plach

commented 22 August 2014 at 09:31

Sorry, I missed this one:

Ah, thanks for that explanation. In that case, should getReservedColumns() and getFieldColumnName() move to TableMappingInterface? Those seem generic to me rather than bound to any particular layout (dedicated/shared) strategy.

Well, I am not sure whether getReservedColumns() is still needed, and it's tied to an implementation detail of field tables. However, I can see the value of moving those to the base interface. I will do it in the next reroll.

Comment #167

plach

commented 23 August 2014 at 14:24

Issue tags:

+sprint

Tagging for the rocketship.

Comment #168

effulgentsia commented 23 August 2014 at 22:40

@effulgentsia: If you want to proceed with the split, please use my sandbox and create a separate branch for each sub-issue, branching the dependent ones off the independent ones, so I can perform cross-branch changes if needed.

Ok, will do when I get a chance (probably Monday). For now, just testing out the first part in #2326719: Move pseudo-private table mapping functions from ContentEntityDatabaseStorage to public API of DefaultTableMapping.

Comment #169

effulgentsia commented 24 August 2014 at 23:18

Status	File	Size
new	et-entity_schema_handling-1498720-169-review-do-not-test.patch	232.57 KB

The issue in #168 is green, if anyone wants to review/rtbc that :)

Additionally, I opened #2326949: Move entity-type-specific schema information from the storage class to a schema handler, which is now also green, so let's make the decision there regarding #157/#162. I'm ok with that getting RTBC'd and committed if no one else objects to the DX of needing an extra class.

And, I opened #2326981: Move \Drupal\field\FieldException to \Drupal\Core\Field\FieldException, which is super simple, so can hopefully land very fast.

None of those 3 overlap, so they can land in any order. So, I'm attaching a patch here that is the same as #164, but rebased on top of those 3, in case it helps any other reviewers focus on what's not covered by those issues.

Per #168, I'll commit all these as branches to the sandbox tomorrow, if no one beats me to it.