Problem/Motivation

Spin off from #3410480: [META] Use events in Database Schema operations.

Today,

A Drupal schema definition is an array structure representing one or more tables and their related keys and indexes. A schema is defined by hook_schema(), which usually lives in a modulename.install file.

https://api.drupal.org/api/drupal/core%21lib%21Drupal%21Core%21Database%...

Proposed resolution

1) Replace the current way of defining a table's structure via a nested array, with a structure of value objects defined in the Drupal\Core\Database\SchemaDefinition namespace.

This structure is immutable and database abstract (for example, directionally in the future the db-specific field type array key 'mysql_type', or the 'mysql_engine' will no longer be defined here).

A table currently represented like

    $schema = [
      'description' => 'The base table for configuration data.',
      'fields' => [
        'collection' => [
          'description' => 'Primary Key: Config object collection.',
          'type' => 'varchar_ascii',
          'length' => 255,
          'not null' => TRUE,
          'default' => '',
        ],
        'name' => [
          'description' => 'Primary Key: Config object name.',
          'type' => 'varchar_ascii',
          'length' => 255,
          'not null' => TRUE,
          'default' => '',
        ],
        'data' => [
          'description' => 'A serialized configuration object data.',
          'type' => 'blob',
          'not null' => FALSE,
          'size' => 'big',
        ],
      ],
      'primary key' => ['collection', 'name'],
    ];

Becomes, note the usage of named parameters (that in the future could allow some more flexibility when it comes to adding/changing/deprecating/removing function arguments):

    new Table(
      name: 'config',
      description: 'The base table for configuration data.',
      columns: [
        new Column(
          name: 'collection',
          description: 'Primary Key: Config object collection.',
          type: ColumnType::VarcharAscii,
          length: 255,
          notNull: TRUE,
          default: '',
        ),
        new Column(
          name: 'name',
          description: 'Primary Key: Config object name.',
          type: ColumnType::VarcharAscii,
          length: 255,
          notNull: TRUE,
          default: '',
        ),
        new Column(
          name: 'data',
          description: 'A serialized configuration object data.',
          type: ColumnType::Blob,
          notNull: FALSE,
          size: ColumnSize::Big,
        ),
      ],
      primaryKey: new PrimaryKey(['collection', 'name']),
    );

2) Add self-validation of the properties of the objects:

  • $binary is only compatible with 'char', 'varchar' or 'text' columns
  • $length is only valid for type 'char', 'varchar' or 'text' fields
  • $unsigned is only valid for type 'int', 'float' and 'numeric' fields
  • can only add $scale and $precision properties to a 'numeric' column
  • $serialize is only compatible with text/blob columns
  • validate value types of array-based properties
  • equivalent of #3082239: Forbid limited length primary and unique keys, allow only in indexes
  • cannot set a default value for a serial column
  • serial column: adding notNull and unsigned is redundant
  • cannot define a Null default if notNull is true
  • cannot add a string default to a numeric column
  • default values should be restricted depending on the column type

Remaining tasks

User interface changes

API changes

Draft Change Record: https://www.drupal.org/node/3557405

Data model changes

Release notes snippet

Issue fork drupal-3411490

Command icon Show commands

Start within a Git clone of the project using the version control instructions.

Or, if you do not have SSH keys set up on git.drupalcode.org:

Comments

mondrake created an issue. See original summary.

mondrake’s picture

Assigned: Unassigned » mondrake

mondrake’s picture

Status: Active » Needs review
Issue tags: +Needs subsystem maintainer review, +Needs frontend framework manager review

Here's a MR that tries to do the minimum possible:

  • add all the relevant value objects to the Drupal\Core\Database\SchemaDefinition namespace
  • adds a converter from the new API to the old API + a basic test, and makes adjustments so that we can use the new API in hook_schema() or other functions without impacting the database driver code (that will be huge, but follow ups). In practice the new API is converted to the old one on the fly.
  • added a couple of example conversions to demonstrate the new API, for system_schema() and Drupal\Core\Batch\BatchStorage::schemaDefinition(). I would leave conversions to followups, too, because it's a lot of work, especially for cases where the old array structure is built dynamically.

Looking forward to reviews!

mondrake’s picture

Status: Needs review » Needs work

Let’s change the trait to an abstract class.

mondrake’s picture

Assigned: mondrake » Unassigned
Issue summary: View changes
Status: Needs work » Needs review

For review. Invocation of hook_schema now return by default the array version of the schema definition, for BC. Code needs to explicitly opt-in to get Table objects instead or array shapes. In the future this will change but this allows to manage transition to the new approach.

mondrake’s picture

Status: Needs review » Needs work

Adding an enum for column size.

mondrake’s picture

Issue summary: View changes
mondrake’s picture

Status: Needs work » Needs review
mondrake’s picture

Issue summary: View changes
needs-review-queue-bot’s picture

Status: Needs review » Needs work
StatusFileSize
new90 bytes

The Needs Review Queue Bot tested this issue. It no longer applies to Drupal core. Therefore, this issue status is now "Needs work".

This does not mean that the patch necessarily needs to be re-rolled or the MR rebased. Read the Issue Summary, the issue tags and the latest discussion here to determine what needs to be done.

Consult the Drupal Contributor Guide to find step-by-step guides for working with issues.

bradjones1’s picture

Worth noting we're doing something similar but not quite the same over at #3397622: Adding GIN and GIST indexes to PostgreSQL databases.

Per some feedback in Drupal Slack today, it sounds like the ideal state would be to use events a la #3410480: [META] Use events in Database Schema operations but we're postponed enough as it is on related issues. At the very least, this demonstrates how we can use objects in some circumstances inside of "ArrayPI" without breaking BC and enabling innovation.

mondrake’s picture

Status: Needs work » Needs review
Issue tags: -Needs frontend framework manager review
daffie’s picture

Status: Needs review » Needs work

I like that we add and later change to the use of an object based schema for table definitions. With that said, I do have some questions/objections and hopefully they can be addressed. The main one is that all the new objects are based on classes that are final classes. That makes that they are not extendable. Which is great for MySQL/MariaDB as no new functionality will be added to those databases. However for PostgreSQL we would like to add GIST and GIN indexes (#3397622: Adding GIN and GIST indexes to PostgreSQL databases) and maybe others in the future. I do not see how the PostgreSQL module can do that. It would be great if we could allow database driver modules to extend/override the base classes.

mondrake’s picture

@daffie now this starts getting a bit old and I cannot recall the details of two years ago. However, please read #3410480-4: [META] Use events in Database Schema operations: the part that this issue is covering relates to the 'immutable' schema, the driver/db specifics should be dealt with somewhere else.

So for me the question here should be: what is #3397622: Adding GIN and GIST indexes to PostgreSQL databases trying to achieve 'in abstract' i.e. independently from the db? Creating an index over a JSON column? If yes, then maybe we need another JsonIndex 'immutable' object to describe the schema. Then, each db would implement it on its own logic, Postgres with GIN/GIST, others they would just fallback to the normal index.

This is in principle, I haven't looked at the details. But my purpose here is to separate the abstract schema definition from the db implementation.

That said, I have no issue to remove final from the classes. As long as this is not seen as a door to make the 'abstract' schema, db-dependent again.

geek-merlin’s picture

Title: Replace array-based Schema API with a value object structure » Replace array-based DB Schema API with a value object structure
mondrake’s picture

Status: Needs work » Needs review
andypost’s picture

Read through code and it looks nice!
Approach with Enums looks like should give a room for extension, so new kinds of indexes looks doable

mradcliffe’s picture

Issue summary: View changes

I started a draft change record.

mondrake’s picture

daffie’s picture

I would like to suggest a couple of changes to support database specific stuff like GIST and GIN indexes on PostgreSQL.

1. Add on the class Drupal\Core\Database\SchemaDefinition\Index two new variables. The first is $dbSpecificType and is the same as the one in the class Column. The second is $dbSpecificExtra is the same as $dbSpecificType only extra string with config data for creating the index.
An index that should be a GIST index for PostgreSQL could then be defined as:

   new Index(
     name: 'ages',
     columns: ['age'],
     dbSpecificType: [
       'pgsql' => 'gist',
     ],
     dbSpecificExtra: [
       'pgsql' => 'gist_trgm_ops',
     ],
   );
 

Other database drivers will just ignore the PostgreSQL specific stuff.

2. Lets add an hook call to let the current database driver module to override ConvertDefinition::schemaToArray($schema). When we have a column with something like:

   new Column(
     name: 'db_timestamp',
     description: "The database timestamp",
     dbSpecificType: [
       'mysql' => 'timestamp',
       'pgsql' => 'timestamp',
       'sqlite' => 'datetime',
     ],
     notNull: FALSE,
     default: NULL,
   );
 

We get after calling ConvertDefinition::schemaToArray() for MySQL something like:

   'db_timestamp' => [
     'description' => "The database timestamp",
     'type' => 'timestamp',
     'not null' => FALSE,
     'default' => NULL,
   ];
  

For SQLite the result would be:

   'db_timestamp' => [
     'description' => "The database timestamp",
     'type' => 'datetime',
     'not null' => FALSE,
     'default' => NULL,
   ];
  

@mondrake: What is your opinion?

Edit: Maybe we do not need the hook for point #2.

mondrake’s picture

Hi @daffie, my opinion:

#21.1 - I think the opposite, i.e. that we should remove $dbSpecificType from Column.
$dbSpecificType kills abstraction, and is impractical as it forces core to add definition for db drivers that it does not own. And in fact it does not work: do you see in core specific types for Oracle or MSSql or anything else which is not a core-supported driver?
However, until 21.2 is done, I'd be fine to add what you would suggest as a stop-gap, with the understanding that it will only probably serve core db drivers anyway. Asking for framework manager review in that case.

#21.2 - that is EXACTLY the endgame that #3410480-4: [META] Use events in Database Schema operations describes, please look at it and at the PoC MR in it. The idea is

(1) SchemaDefinition (abstract) --> (2) dispatchSchemaDefinitionEvent (abstract) --> (3) subscribeSchemaDefinitionEvent (DBMS-specific) --> Schema (DBMS-specific)

(1) SchemaDefinition is a DMBS-abstract, immutable value object
(2) when we need to create a table on my TimeWarpBase database, we dispatch an event with the SchemaDefinition object as input that requests subscribers to collect the db-specific specs in the Schema object
(3) my TimeWarpBase database driver implements the subscriber and reacts on (2), providing the TimeWarpBase-specific specs
(4) at the end of event processing, the Schema objects has the dbms-specific specs necessary for creating the table

And: in this issue we only start addressing (1) which is a huge task on its own anyway.

mondrake’s picture

@daffie last commit introduces something along the lines of #21.1.

Only a single additional property to the Index object, that is converted into an additional spec key 'indexes extra'. Example:

      indexes: [
        new Index(
          name: 'foo',
          columns: ['job'],
          dbSpecificExtra: [
            'pgsql' => [
              'type' => 'gist',
              'operator' => 'gist_trgm_ops',
            ],
          ],
        ),
      ],

gets converted to

      'indexes' => [
        'foo' => ['job'],
      ],
      'indexes extra' => [
        'foo' => [
          'pgsql' => [
            'type' => 'gist',
            'operator' => 'gist_trgm_ops',
          ],
        ],
      ],
daffie’s picture

Status: Needs review » Needs work
Issue tags: -Needs subsystem maintainer review

@mondrake: For #23, yes a very good idea!

For my not reading the meta issue. I should have read the issue and I did not. My bad, and yes I agree with your proposal in the Meta issue. I have removed the tag "Needs subsystem maintainer review".

I have my a couple of remarks on the MR. Let me know what you think of them.

Again good work and a good proposal! Feel free to remove the tag for the framework managers review.

mondrake’s picture

Status: Needs work » Needs review

Thanks @daffie! Addressed your points.

nicxvan’s picture

Again good work and a good proposal! Feel free to remove the tag for the framework managers review.

I think this still needs framework manager review, I don't see one.

mondrake’s picture

daffie’s picture

From comment #22:

The idea is (1) SchemaDefinition (abstract) --> (2) dispatchSchemaDefinitionEvent (abstract) --> (3) subscribeSchemaDefinitionEvent (DBMS-specific) --> Schema (DBMS-specific)

I agree with the idea. The SchemaDefinition should be abstract. For the new object definition I think that is the right solution. However for the array definition we run into problems. With the Columns, Indexes and Tables we have the property $dbSpecificExtra and it is added to the array definition in different ways. The different ways of adding the "extra" data is to me all a bit hacky and we are only doing that to make the array definition also abstract. As we are going to deprecate and drop the array definition anyway, lets make the array definition database driver specific. We will end up with a prettier solution for the object definition and we do not need to do hacky solutions in the array definitions.

For the Column class:

new Column(
  name: 'db_timestamp',
  description: "The database timestamp",
  dbSpecificExtra: [
    'mysql' => [
      'mysql_type' => 'timestamp',
    ],
    'pgsql' => [
      'pgsql_type' => 'timestamp',
    ],
    'sqlite' => [
      'sqlite_type' => 'datetime',
    ],
  ],
  notNull: FALSE,
  default: NULL,
);

With the database specific type we are doubling down with database type. Not very pretty. We can do better with:

new Column(
  name: 'db_timestamp',
  description: "The database timestamp",
  dbSpecificExtra: [
    'mysql' => [
      'type' => 'timestamp',
    ],
    'pgsql' => [
      'type' => 'timestamp',
    ],
    'sqlite' => [
      'type' => 'datetime',
    ],
  ],
  notNull: FALSE,
  default: NULL,
);

The resulting array definition for SQLite:

'db_timestamp' => [
  'description' => "The database timestamp",
  'type' => 'datetime',
  'not null' => FALSE,
  'default' => NULL,
];

For the Table class:

new Table(
  dbSpecificExtra: [
    'mysql' => [
      'mysql_engine' => 'InnoDB',
      'mysql_character_set' => 'utf8mb4',
    ],
  ],
);

The database specific keys are directly added to the array definition. You can only hope that a key setting for one database does not break something in another database.

For the Index class:

new Index(
  name: 'foo',
  columns: ['job'],
  dbSpecificExtra: [
    'pgsql' => [
      'type' => 'gist',
      'operator' => 'gist_trgm_ops',
    ],
    'time_warp' => [
      'jump' => 'to_the_left',
      'step' => 'to_the_right',
    ],
  ],
);
'indexes' => [
  'foo' => ['job'],
],
'indexes extra' => [
  'foo' => [
    'pgsql' => [
      'type' => 'gist',
      'operator' => 'gist_trgm_ops',
    ],
    'time_warp' => [
      'jump' => 'to_the_left',
      'step' => 'to_the_right',
    ],
  ],
],

The database specific stuff is added in a second array with the key "indexes extra". Also not a very pretty solution. With a database specific array definition we can do much better.

The resulting array definition for PostgreSQL:

'indexes' => [
  'foo' => [
    'columns' => ['job'],
    'type' => 'gist',
    'operator' => 'gist_trgm_ops',
  ],    
];

@mondrake: What is your opinion?

Edit: If a framework manager agrees with your solution, we will do your solution.

mondrake’s picture

My opinion is that we should not touch the array definition here for maximum BC, and we are doing that everywhere except for indexes definitions, where we want to have an extra information about the dbms-specific index type that simply is not defined today. And in order to keep BC, we add an extra 'indexes extra' key to the spec because in the current 'indexes' key there's no space for it.

But @daffie let's not try to solve future problems here - strictly speaking, adding 'indexes extra' is scope for #3397622: Adding GIN and GIST indexes to PostgreSQL databases. Done here just to demonstrate the extensibility of the new approach.

mondrake’s picture

BTW, #3397622: Adding GIN and GIST indexes to PostgreSQL databases has now an MR built on top of the one here, that passes MySQL and PgSQL tests.

alexpott’s picture

Why are we introducing the ::Undefined everywhere? Isn't that the same as allowing NULLs?

mondrake’s picture

Why are we introducing the ::Undefined everywhere? Isn't that the same as allowing NULLs?

Not if we allow NULL as a valid value - we need to be able to distinguish NULL as a valid value, and not passing the argument (which does not necessarily mean that we can safely fall back to using NULL as a default).

alexpott’s picture

Re #32 - yeah but that is only for default value. Imo we should have a DefaultValue class rather than Property::Undefined everywhere.

alexpott’s picture

Status: Needs review » Needs work

I agree with @mondrake re #28. I think we should not be touching the array definitions here at all. In fact I do not think we should be adding indexes extra support here at all. That belongs in its own issue and deserves its own discussions.

alexpott’s picture

I'm not sure telling people to do \Drupal\Core\Database\Schema\ConvertDefinition::schemaToArray($schema) - in hook_schema. Ie. from the CR:

/**
 * Implements hook_schema().
 */
function system_schema(bool $asArray = TRUE): array {
   $schema['my_table'] = new Table(
   );

   return $asArray ? \Drupal\Core\Database\Schema\ConvertDefinition::schemaToArray($schema) : $schema;
}

Imo this is wrong. The caller should process the definitions and if the schema is not a Table it call the converter for you. In a follow-up we can add a new issue to trigger a deprecation if doing this.

geek-merlin’s picture

EDIT: Race condition with @alexpott #34. But still relevant for the followup.

@mondrake: +100 to make all this thoroughly typed!

But kittens (and me) suffering a lot from the 'extra' keys bringing back the array-PI that was about to be eradicated in the first place.

class-string pattern to the rescue. I guess you saw it before:

In this issue it is implemented like this:

new Table(
  extra: [
    MySqlTable::class => new MySqlTable(
      engine: MySqlEngine::InnoDB,
      characterSet: CharSet::utf8mb4,
    ),
  ],
);

Keying by class-string allows PhpStan, PhpStorm, and friends to infer types:

  /**
   * @template T
   * @param array<class-string<T>, T> $extra
   */
// Class can now be final as $extra allows infinite extensibility.
final class Table {
  public function __construct(
    private readonly array $extra,
  ) {}
}

Consuming code can now do:

$mySqlTableExtra = $table->extra[MySqlTable::class];
if ($mySqlTableExtra) {
  // This is NOT necessary, because of class-string.
  assert($mySqlTableExtra instanceof MySqlTable);
  ...
}
alexpott’s picture

The thing that bothers me about #35 is that module authors are going to have to make 2 changes to their hook schema in order to navigate through the deprecations. One convert the thing and support the $as_array thing and then another to remove all of that (and potentially a middle step when the default changes). I think we need a deprecation plan posted on this issue and we need to work out how to have the minimum number of steps for a module author to take. In an ideal world... hook schema all arrays and then they convert to use the objects and then do nothing else.

mondrake’s picture

Status: Needs work » Needs review

If we can 'special case' the invocation of hook_schema implementation in the module handler, we could make the conversion as part of the processing, after the invocation and before returning to the caller. So we could have hook_schema implementations either return an array of Tables or an array-schema and that be managed transparently. Easier for deprecation too because we would only throw deprecations in one place.

Not as easy for storage handlers though, as each one has its own method, and extenders may rely on array-schema only.

mondrake’s picture

#36 yes... but that's for later 😀

mondrake’s picture

Assigned: Unassigned » mondrake
Status: Needs review » Needs work
mondrake’s picture

Status: Needs work » Needs review

back to NR

mondrake’s picture

Assigned: mondrake » Unassigned
mondrake’s picture

Status: Needs review » Needs work

Umm no, we need to have the conversion done where createTable is called, because in later issues we will pass Table objects to it, and if we convert it upstream we will lose that.

mondrake’s picture

Status: Needs work » Needs review

Done #43

mondrake’s picture

Actually added more type specific value classes to make things stricter.

mondrake’s picture

Filed follow up #3558426: Add property validation for SchemaDefinition value objects. The value type classes will definitely help there.

alexpott’s picture

If we've got a difference between a NULL and an empty string value for description - then we shouldn't. That's feels like something completely unnecessary - for the reader of the description there is no difference both are the same.

mondrake’s picture

Assigned: Unassigned » mondrake
Status: Needs review » Needs work

#47 ok, will change that

mondrake’s picture

Assigned: mondrake » Unassigned
Status: Needs work » Needs review

#47 done

daffie’s picture

Status: Needs review » Needs work
mondrake’s picture

Status: Needs work » Needs review
daffie’s picture

Status: Needs review » Reviewed & tested by the community

It all looks good to me.
For me it is RTBC.

joachim’s picture

Status: Reviewed & tested by the community » Needs work

Left a comment on the MR.

Also, we have a problem because the BC policy says:

# Function arguments
Named arguments should not be considered part of the public API.

and these value objects surely can NOT change their argument names.

mondrake’s picture

Status: Needs work » Reviewed & tested by the community

#53 seems overzealous.

In coding standards, I see #3337834: Standards for PHP named arguments is an active (stalled) issue. PHP attributes were requested to be exempted, #3439004: Coding standard for attributes that turned into #3439004: Coding standard for attributes (stalled). In fact, core moved on and named arguments in attributes are common.

So this is at least debatable and that's why this is on framework manager's review. In fact SSM and a FM reviewed this already and they did not raise the point.

joachim’s picture

Status: Reviewed & tested by the community » Needs work

The link in the docs is incorrect.

mondrake’s picture

Status: Needs work » Needs review
joachim’s picture

Status: Needs review » Needs work

The link to the docs section needs to be made with @link - see https://project.pages.drupalcode.org/coding_standards/php/documentation/

mondrake’s picture

Status: Needs work » Needs review

@joachim can you please suggest the changes you believe are needed and I will include them.

alexpott’s picture

I think the CR needs an update for the default value changes...

alexpott’s picture

I think we should also file an issue and bake support for this in https://www.drupal.org/project/schema ASAP - we are obviously helped out by the fact there is no 11.x release yet :)

mondrake’s picture

Updated CR.

mondrake’s picture

daffie’s picture

Status: Needs review » Reviewed & tested by the community

All the changes look good to me.
The followup has been created.
Back to RTBC.

liam morland’s picture

@daffie GitLab is saying to me "Change requests must be approved by the requesting user" followed by your picture. I think you need to resolve the suggested changes.

mradcliffe’s picture

Status: Reviewed & tested by the community » Needs work

There's a rebase/merge from 11.x needed again.

joachim’s picture

The 'See the Schema API documentation for details on schema definition' text should use a @link tag to make a link to another docs section.

mondrake’s picture

Status: Needs work » Reviewed & tested by the community

merged with 11.x, no conflicts.

alexpott’s picture

Status: Reviewed & tested by the community » Needs work

Posted a couple of small comments on the MR - once addressed this can go back to RTBC so we can land early in the 11.4.x cycle, Thanks!

joachim’s picture

> See the Schema API documentation for details on schema definition
* structures.

I keep saying this needs to be made into a proper link to the 'schemaapi' section with a @link tag.

mondrake’s picture

Status: Needs work » Reviewed & tested by the community

@alexpott replied inline

@joachim see #58

joachim’s picture

Status: Reviewed & tested by the community » Needs work

> @joachim can you please suggest the changes you believe are needed and I will include them.

I've already explained what needs to be done -- use the @link tag. I also gave a link to the documentation that explains how to do it.

That needs to be fixed for this to pass the documentation gate.

mondrake’s picture

#71 I will be waiting for someone with expertise in this area to suggest in detail the changes required. Direct commit to the MR even better.

joachim’s picture

> #71 I will be waiting for someone with expertise in this area to suggest in detail the changes required

No expertise required. There is an example on the page I linked to. It's this:

* See also @link topic_foo Foo topic, @endlink and the
mondrake’s picture

Assigned: Unassigned » mondrake

[...] we could still change \Drupal\Core\Database\Schema::createTable() to allow table objects and use ConvertDefinition::tableToArray() in there instead [...]

OK, with ConvertDefinition I wanted to avoid adding, to new code, methods that we already know will be deprecated later. But fine, let's do that.

mondrake’s picture

Status: Needs work » Needs review
mondrake’s picture

Assigned: mondrake » Unassigned

For review again.

mradcliffe’s picture

I think most of @alexpott's review comments are addressed except “This also makes me think that maybe a schema object which implements ArrayAccess might actually be the best BC way to go here.”.

I reviewed database.api.php and function hook_schema does not need to be linked to schemaapi group because it is already in the schemaapi group, @joachim. When navigating on api.drupal.org, I can follow the “Database abstraction layer” link, then follow the “Schema API” link, and hook_schema is listed as a function. When I follow that link, “Schema API” is listed as a related topic.

mondrake’s picture

Not sure about implementing ArrayAccess at this point... $schema->tables will return the array of Table objects when needed, and $schema->toArray() will return the legacy arrayPI specification.

mradcliffe’s picture

Status: Needs review » Needs work

Seems okay to me, then.

But ConvertDefinitionTest has a failure and static analysis failure for named parameters.

mondrake’s picture

Status: Needs work » Needs review

Yes, I forgot to adjust the newly added unit test. Fixed.

mondrake’s picture

Replied inline.

Let me try adding a new method wrapping createTable and see how that could work.

mondrake’s picture

Added a new Schema::createTableFromDefinition() that I'm not particularly proud of. It swallows almost anything and creates a table out of it.

mradcliffe’s picture

I updated the change record.

There seems to be a random failure in CKEditor FunctionalJavascript test so I reran the pipeline.

mondrake’s picture

Assigned: Unassigned » mondrake
Status: Needs review » Needs work

Let's add one more hook_schema() conversion as system_schema() is kinda special one.

mondrake’s picture

Assigned: mondrake » Unassigned
Status: Needs work » Needs review

Converted ban_schema().

mondrake’s picture

Assigned: Unassigned » mondrake
Status: Needs review » Needs work

#85 exposed a bug.

mondrake’s picture

Assigned: mondrake » Unassigned
Status: Needs work » Needs review

Green again.

mondrake’s picture

Assigned: Unassigned » mondrake
Status: Needs review » Needs work

#3558426: Add property validation for SchemaDefinition value objects is almost complete and IMO should be incorporated here as it’s changing some methods that are introduced here. NW to get that done.

mondrake’s picture

Assigned: mondrake » Unassigned
Issue summary: View changes
Status: Needs work » Needs review

Done #88 and updated IS.

mondrake’s picture

fork update. still green.

mondrake’s picture

fwiw, #3557481: Convert hook_schema() implementations to SchemaDefinition is complete. Built on top of the MR here, converts all the hook_schema() implementations in core.

mradcliffe’s picture

Status: Needs review » Reviewed & tested by the community

I think this should be RTBC since that one is green. Thank you, @mondrake.

mondrake’s picture

#3566704: Convert core/tests to use SchemaDefinition objects is also practically ready for review once this will be merged.

catch’s picture

I haven't reviewed this in depth yet.

One question I have is about the top level Schema object - I'm not really sure what that gets us that an array of table objects would not.

Looking at comment_schema() in the conversion MR, it results in a fair amount of nesting compared to if we did $tables[] = new Table();

mondrake’s picture

#94

Apart from addressing @alexpott's comment here https://git.drupalcode.org/project/drupal/-/merge_requests/5987#note_638314, the Schema object now also comes along as a solid solution to reverse engineering (introspecting) / exporting as abstract a real db schema. See #3566893: [PP-2] Convert DbDumpCommand to use SchemaDefinition objects for example.

EDIT: the DbDumpCommand changes of that MR

catch’s picture

OK I need to think about #95 a bit but I still find the multiple levels of nesting harder to follow reading that MR. We could potentially build the table objects and add them to the schema object rather than all the nested calls?

mondrake’s picture

#96 already doing that in #3557481: Convert hook_schema() implementations to SchemaDefinition where multiple tables are in a single Schema object. See for example core/modules/search/search.install in that MR.

Also more thinking: a 'schema' in general DBMS terms can include other assets like views, stored procedures, etc etc. Not saying we should support that, but in case a need appears in the future, a mere array of tables will not be enough.

catch’s picture

OK that search one looks more like I was thinking, so I think the change would be using the same pattern for comment then.

What does name do in the Schema object? Can't immediately see how it's used.

Also more thinking: a 'schema' in general DBMS terms can include other assets like views, stored procedures, etc etc. Not saying we should support that, but in case a need appears in the future, a mere array of tables will not be enough.

Doesn't hurt to leave open the possibility.

mondrake’s picture

What does name do in the Schema object? Can't immediately see how it's used.

Yeah... I originally put it in to be consistent with all the rest of the objects in that namespace, with the idea to store the module name there when returning from hook_schema() implementations. But correct, for the moment it does nothing.

the change would be using the same pattern for comment then

Didn't do that because there's one table only in that hook_schema() implementation, but no problem to use the pattern used for search there too, when we come to it.

mondrake’s picture

fwiw - I am thinking to explore a way to create a registry of the abstract definitions of the tables created by Drupal. That would help solve the problem of dumping data from different types of databases (db-dump can only work on MySQL at the moment). Also, the schema introspection features are not guaranteed to rebuild a fully abstract specification - think of identifiers length that may cause a table/column abstract name to be shortened; for example, #3200743: Decouple identifier management from database Connection, introduce an IdentifierHandler class that would restrict identifiers to 63 chars in PostgreSQL: once done, there would be no direct way to retrieve a table or column abstract name from the concrete one.

So a registry (a db table with all the objects created and their abstract definition) would allow keeping the abstracted object data, that could be used for dumping x-platform and also facilitate schema operations like changing/renaming/deleting fields, adding/removng keys, etc... that currently require partial specifications to be passed in --- having a registry, those operations could fetch current state vs desired state and produce the specific DDL SQL needed.

All this would require quite some work, but dumping the idea here just to say that in that scenario the Schema name may play a role.

mondrake’s picture

Assigned: Unassigned » mondrake
Status: Reviewed & tested by the community » Needs work

In light of #100, I think we need to do something more with Schema here. On it.

mondrake’s picture

Assigned: mondrake » Unassigned
Status: Needs work » Needs review

The last commit adds a definition of the Schema "type" i.e. "what" is requiring the tables: a module via module_hook, an entity, a storage class, a migration plugin, a test, etc etc.

This also makes the storage classes actually return a Schema object and not a Table. This way we can simplify a lot the new Schema::createTableFromDefinition().

Schema type and name will then be captured in the registry mentioned in #100... a bit in the future.

mondrake’s picture

Assigned: Unassigned » mondrake
Status: Needs review » Needs work

Probably the enum name would better be SchemaDefinitionType, on it

mondrake’s picture

Retargeted to main

mondrake’s picture

Assigned: mondrake » Unassigned
Status: Needs work » Needs review

Done #104.

mondrake’s picture

Version: 11.x-dev » main
mondrake’s picture

Let’s make storage classes schemaDefinition() methods retirn a Schema and not a Table, see #3568040: Convert *Storage() classes' ensureTableExists methods to use SchemaDefinition objects

mondrake’s picture

Status: Needs work » Needs review

Oooops did not realize it’s done already.

mondrake’s picture

Updated CR with latest changes.

steinmb’s picture

Issue still tagged with "Needs framework manager review" - Does that mean we waiting on Daffie here or should the tag be removed?

nicxvan’s picture

@daffie is not a framework manager, he is the subsystem maintainer.

The backend framework managers are:

effulgentsia, catch, alexpott, and larowlan.

I've not read the issue to see if they've reviewed it with their framework manager hat on. You can find the list of roles in maintainers.txt

mondrake’s picture

Merged with main; since the ban module is now removed, swapped the example hook_schema() implementation with the one from the locale module.

mondrake’s picture

Merged with main, locale.install added an 'hash' column to the 'locale_file' table.

needs-review-queue-bot’s picture

Status: Needs review » Needs work
StatusFileSize
new90 bytes

The Needs Review Queue Bot tested this issue. It no longer applies to Drupal core. Therefore, this issue status is now "Needs work".

This does not mean that the patch necessarily needs to be re-rolled or the MR rebased. Read the Issue Summary, the issue tags and the latest discussion here to determine what needs to be done.

Consult the Drupal Contributor Guide to find step-by-step guides for working with issues.

mondrake’s picture

Status: Needs work » Needs review
amateescu’s picture

Status: Needs review » Needs work

Reviewed the MR.. I like the direction in principle, but I think it'll need a bit more thought and discussion :)

mondrake’s picture

Thanks for review @amateescu!

Will take some time to process it all...

mondrake’s picture

Status: Needs work » Needs review

Replied inline to @amateescu questions (well tried at least), and made some changes.

mondrake’s picture

wrt to methods that require partial implementation of a schema, IMHO we should ideally stop that in favor of always passing a fully immutable schema - so that the methods could always have full context of the destination schema. Will not be tomorrow but at least that should be the direction.

needs-review-queue-bot’s picture

Status: Needs review » Needs work
StatusFileSize
new90 bytes

The Needs Review Queue Bot tested this issue. It no longer applies to Drupal core. Therefore, this issue status is now "Needs work".

This does not mean that the patch necessarily needs to be re-rolled or the MR rebased. Read the Issue Summary, the issue tags and the latest discussion here to determine what needs to be done.

Consult the Drupal Contributor Guide to find step-by-step guides for working with issues.

mondrake’s picture

Status: Needs work » Needs review

rebased