Problem/Motivation

Postponed on #3492391: Make the event dispatcher available before container full bootstrap

Menu tree storage, config and cache all use the same pattern: execute a query, catch an exception, try to create the table, and if that succeeds, rerun the query. Duplicating this logic is already wasteful, and there is already demand for more of it: #2338747: Move {router} out of system.install and create the table lazy for it elsewhere.
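The duplicated pattern looks roughly like this (a minimal sketch; the table name, the query, and the ensureTableExists() helper are illustrative, not actual core code):

```php
try {
  // Run the query, assuming the table exists.
  return $this->connection->select('menu_tree', 'mt')
    ->fields('mt')
    ->execute();
}
catch (\Exception $e) {
  // If the table is missing, create it and rerun the query once.
  if ($this->ensureTableExists()) {
    return $this->connection->select('menu_tree', 'mt')
      ->fields('mt')
      ->execute();
  }
  throw $e;
}
```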

Proposed resolution

Create a trait for lazy table creation and use it in the existing backend services. Later, do the same for services that do not yet use lazy table creation.

Remaining tasks

User interface changes

API changes

The new trait Drupal\Core\Database\LazyTableCreationTrait is created.
The trait has two methods: Drupal\Core\Database\LazyTableCreationTrait::catchException() and Drupal\Core\Database\LazyTableCreationTrait::ensureTableExists(). The first method suppresses the exception when it was caused by the table not existing (and rethrows it otherwise). The second method tries to create the table. There is also an abstract method Drupal\Core\Database\LazyTableCreationTrait::schemaDefinition() that every service using the trait must implement; it should return the schema definition for the table.
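For illustration, such a trait might look roughly like this (a sketch only; the $connection and $tableName properties and the use of SchemaObjectExistsException are assumptions, not the actual core implementation):

```php
use Drupal\Core\Database\SchemaObjectExistsException;

trait LazyTableCreationTrait {

  /**
   * Returns the schema definition for the lazily created table.
   *
   * @return array
   *   A Schema API table definition, as in hook_schema().
   */
  abstract protected function schemaDefinition();

  /**
   * Creates the table if it does not exist yet.
   *
   * @return bool
   *   TRUE if the table already existed or now exists.
   */
  protected function ensureTableExists() {
    try {
      if (!$this->connection->schema()->tableExists($this->tableName)) {
        $this->connection->schema()->createTable($this->tableName, $this->schemaDefinition());
      }
    }
    catch (SchemaObjectExistsException $e) {
      // A concurrent request created the table first; that is fine.
    }
    return TRUE;
  }

  /**
   * Rethrows an exception unless it was caused by the table not existing.
   */
  protected function catchException(\Exception $e) {
    if ($this->connection->schema()->tableExists($this->tableName)) {
      throw $e;
    }
  }

}
```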

A number of class constants have been deprecated and will be removed before Drupal 12.0.0:

  1. Drupal\Core\Batch\BatchStorage::TABLE_NAME
  2. Drupal\Core\Flood\DatabaseBackend::TABLE_NAME
  3. Drupal\Core\Lock\DatabaseLockBackend::TABLE_NAME
  4. Drupal\Core\Queue\DatabaseQueue::TABLE_NAME

A number of protected class variables have been deprecated and will be removed before Drupal 12.0.0:

  1. Drupal\Core\Lock\DatabaseLockBackend::database
  2. Drupal\Core\Routing\MatcherDumper::tableName
Attachments:

#151: interdiff-2371709-120-149.txt (71.33 KB, bhanu951)
#120: on_demand_table-2371709-120.patch (55.17 KB, almaudoh)
#116: interdiff.txt (727 bytes, almaudoh)
#116: on_demand_table-2371709-116.patch (55.68 KB, almaudoh)
#109: interdiff.txt (4.53 KB, almaudoh)
#109: on_demand_table-2371709-109.patch (55.71 KB, almaudoh)
#105: on_demand_table-2371709-105.patch (55.1 KB, almaudoh)
#99: 2371709-99-do-not-test.patch (51.56 KB, kgoel)
#95: 2371709-95.patch (59.63 KB, kgoel)
#86: interdiff.txt (6.89 KB, dawehner)
#86: 2371709-86.patch (59.63 KB, dawehner)
#77: 2371709-77.patch (60.26 KB, dawehner)
#77: interdiff.txt (1.61 KB, dawehner)
#75: interdiff.txt (17.56 KB, dawehner)
#73: interdiff.txt (17.56 KB, dawehner)
#73: 2371709-71.patch (59.88 KB, dawehner)
#65: interdiff.txt (9.44 KB, amateescu)
#63: interdiff.txt (3.76 KB, amateescu)
#61: 2371709_61.patch (50.39 KB, chx)
#61: interdiff.txt (3.97 KB, chx)
#59: interdiff.txt (4.18 KB, chx)
#58: 2371709_58.patch (50.5 KB, chx)
#57: interdiff.txt (4.21 KB, chx)
#57: 2371709_57.patch (49.41 KB, chx)
#52: 2371709_52.patch (48.97 KB, chx)
#52: interdiff.txt (2.25 KB, chx)
#50: 2371709_49.patch (46.58 KB, chx)
#50: interdiff.txt (18.04 KB, chx)
#47: interdiff.txt (4.76 KB, chx)
#47: 2371709_46.patch (46.64 KB, chx)
#45: 2371709_44.patch (45.89 KB, chx)
#41: 2371709-pgsql.txt (763 bytes, amateescu)
#19: interdiff.txt (2.93 KB, chx)
#19: 2371709_19.patch (39.45 KB, chx)
#16: interdiff.txt (6.19 KB, chx)
#16: 2371709_15.patch (40.71 KB, chx)
#12: interdiff.txt (1.42 KB, chx)
#12: 2371709_12.patch (37.44 KB, chx)
#11: interdiff.txt (804 bytes, chx)
#11: 2371709_11.patch (37.53 KB, chx)
#10: 2371709_10.patch (37.1 KB, chx)
#9: 2371709_8.patch (35.15 KB, chx)
#8: 2371709_7.patch (35.17 KB, chx)
#5: 2371709_5.patch (34.2 KB, chx)
#3: 2371709_3.patch (30.66 KB, chx)
eate.patch (29.11 KB, chx)

Issue fork drupal-2371709


Comments

Status: Needs review » Needs work

The last submitted patch, eate.patch, failed testing.

chx’s picture

Issue summary: View changes
chx’s picture

Status: Needs work » Needs review
Attached: 30.66 KB

Bumped ensureTableExists to Connection; now we should install at least. Writing tests.

Status: Needs review » Needs work

The last submitted patch, 3: 2371709_3.patch, failed testing.

chx’s picture

Issue summary: View changes
Attached: 34.2 KB
chx’s picture

Status: Needs work » Needs review

Status: Needs review » Needs work

The last submitted patch, 5: 2371709_5.patch, failed testing.

chx’s picture

Status: Needs work » Needs review
Attached: 35.17 KB
chx’s picture

Issue summary: View changes
Attached: 35.15 KB

dawehner asked for a rename.

chx’s picture

Issue summary: View changes
Attached: 37.1 KB

Bumped ensureTableExists to Schema; there's one usage of it outside of the database layer, so it doesn't really matter, and it's really a schema-ish operation. Added a test for this one too. The patch is still net negative (54 LoC) despite adding a new interface and tests, and it of course enables doing this conversion for a lot of other services very cheaply.

chx’s picture

Issue summary: View changes
Attached: 37.53 KB, 804 bytes

Missing interface doxygen pointed out by dawehner. Added the results of some discussions with dawehner into the summary.

chx’s picture

Issue summary: View changes
Attached: 37.44 KB, 1.42 KB

I realized that running a select on an empty table is pointless. Genius, that's my middle name, isn't it? Took me a bit.

dawehner’s picture

Issue summary: View changes
  1. +++ b/core/lib/Drupal/Core/Cache/CachetagSchema.php
    @@ -0,0 +1,45 @@
    +
    +class CachetagSchema implements SchemaProviderInterface {
    

    It would be nice to explain why this schema is separated out ...

  2. +++ b/core/lib/Drupal/Core/Database/Query/Query.php
    @@ -19,6 +21,13 @@
    +   *
    +   * @var string
    +   */
    

    Given that it's more than just a string, we should document it if possible

  3. +++ b/core/lib/Drupal/Core/Database/Query/Query.php
    @@ -110,7 +119,41 @@ public function __clone() {
    +      while ($table instanceof Query) {
    +        $table = $table->getTable();
    +      }
    ...
    +  public function getTable() {
    +    return $this->table;
    +  }
    

    Alternative getTable() could always return the table string.

  4. +++ b/core/lib/Drupal/Core/Database/Schema.php
    @@ -684,6 +683,33 @@ public function createTable($name, $table) {
    +  public function ensureTableExists($table_name, SchemaProviderInterface $schema_provider) {
    +    try {
    +      if (!$this->tableExists($table_name)) {
    

    I'm curious ... did you consider adding the table name to the SchemaProviderInterface?

  5. +++ b/core/lib/Drupal/Core/Database/SchemaProviderInterface.php
    @@ -0,0 +1,28 @@
    +  /**
    +   * A schema API array.
    +   *
    +   * @return array
    +   */
    +  public function getSchema();
    

    Maybe put a @see hook_schema() on here.

  6. +++ b/core/modules/system/src/Tests/Database/SelectTest.php
    @@ -28,6 +29,29 @@ function testSimpleSelect() {
       /**
    +   * Tests rudimentary SELECT statements.
    +   */
    

    ha, this is not rudimentary anymore

  7. +++ b/core/modules/system/src/Tests/Database/SelectTest.php
    @@ -28,6 +29,29 @@ function testSimpleSelect() {
    +  function testSimpleSelectWithTableAssert() {
    

    Let's rename it to testSimpleSelectWithEnsuringTable

dawehner’s picture

Issue summary: View changes

Status: Needs review » Needs work

The last submitted patch, 12: 2371709_12.patch, failed testing.

chx’s picture

Status: Needs work » Needs review
Attached: 40.71 KB, 6.19 KB

All done except 4), because the query already has the table name.

dawehner’s picture

Status: Needs review » Reviewed & tested by the community

This change is pretty convenient, and in the future it could even make some of the tests faster, as well as avoid the manual creation of tables in DrupalKernel tests, which would be pretty handy.

Crell’s picture

Status: Reviewed & tested by the community » Needs work

Overall this makes good sense, and I like how it's broken up into pieces within the DB API. That said, there's some improvements we can still make:

  1. +++ b/core/lib/Drupal/Core/Cache/DatabaseBackend.php
    @@ -298,21 +275,11 @@ public function delete($cid) {
    +    // Delete in chunks when a large array is passed.
    +    foreach (array_chunk($cids, 1000) as $cids_chunk) {
    +      $this->connection->delete($this->bin)
    +        ->condition('cid', $cids_chunk, 'IN')
    +        ->executeEnsuringTable($this);
    

    If the table doesn't exist, why do we need to create the table and then delete? Our desired post-condition is already true (there are no records with those $cids in the database), so why bother creating the table? We only need do that on insert/update.

  2. +++ b/core/lib/Drupal/Core/Cache/DatabaseBackend.php
    @@ -346,17 +308,7 @@ public function deleteTags(array $tags) {
    +    $this->connection->truncate($this->bin)->executeEnsuringTable($this);
    

    Same comment here. If a truncate fails, our post-condition is known-true so why bother creating the table?

  3. +++ b/core/lib/Drupal/Core/Cache/DatabaseBackend.php
    @@ -389,25 +336,20 @@ public function invalidateMultiple(array $cids) {
    +    $tag_cache = &drupal_static('Drupal\Core\Cache\CacheBackendInterface::tagCache', array());
    +    $invalidated_tags = &drupal_static('Drupal\Core\Cache\DatabaseBackend::invalidatedTags', array());
    

    Not introduced by this patch so not really a blocker here, but can someone explain to me why there's a drupal_static() call inside a class? Wouldn't an object property accomplish the same thing with less hackery?

  4. +++ b/core/lib/Drupal/Core/Database/Query/Query.php
    @@ -19,6 +20,15 @@
       /**
    +   * The base table.
    +   *
    +   * Can be a Query object for subqueries.
    +   *
    +   * @var string|static
    +   */
    +  protected $table;
    

    Off topic, but a reasonable improvement.

  5. +++ b/core/lib/Drupal/Core/Database/Schema.php
    @@ -684,6 +683,33 @@ public function createTable($name, $table) {
    +   * @return bool
    +   *   TRUE if the table exists, FALSE if it does not.
    

    This could be read as whether or not the table existed before, not whether or not we're now sure the object exists. New wording: "TRUE if the table already existed or now exists, FALSE if there was an error and it does not."

  6. +++ b/core/modules/system/src/Tests/Database/SchemaTest.php
    @@ -440,4 +443,11 @@ protected function assertFieldCharacteristics($table_name, $field_name, $field_s
    +  /**
    +   * {@inheritdoc}
    +   */
    +  public function getSchema() {
    +    return drupal_get_schema_unprocessed('database_test', 'test');
    +  }
    

    Since that's a tests-only table, can we not move it into this class entirely rather than calling out to hook_schema?

  7. +++ b/core/modules/system/src/Tests/Database/SelectTest.php
    @@ -28,6 +29,29 @@ function testSimpleSelect() {
    +      self::addSampleData();
    

    Nitpick: static, not self.

Discussing with dawehner at BADCamp I suggested that maybe we should have a table object, rather than a "schema provider object" (since in all relevant cases we will be ensuring only a single table). However, reading through the patch I can see the benefit of having the schema defined directly on the object that is the swappable service. So I'm OK with this design.

chx’s picture

Status: Needs work » Needs review
Attached: 39.45 KB, 2.93 KB

1,2 reverted (and removed the bug fix, not necessary now)

3. off topic as noted

5. 7. fixed

6. The schema is in the module and is used by the installSchema() call in setUp(); why would I copy it?

dawehner’s picture

Status: Needs review » Reviewed & tested by the community

The feedback of @crell got addressed in #20.

6. The schema is in the module and is used by the installSchema() call in setUp(); why would I copy it?

Not copying the schema array is a valid point.

catch’s picture

Priority: Normal » Major

I've just committed #1426804: Allow field storages to be persisted when they have no fields. - this added some test code that we should be able to remove again here. Leaving RTBC since that can happen in a separate issue, but also not committing yet because I've not been able to give this a proper look yet (like the idea a lot though).

yched’s picture

This indeed looks like it would let us remove some edgy test code from #1426804: Allow field storages to be persisted when they have no fields., but that would require adding a SchemaProviderInterface class for Entity & Field table schemas, which is probably outside the scope of this issue.

On that aspect though :

1) SchemaProviderInterface::getSchema() [no param] means there has to be one SchemaProvider object per table, which doesn't look applicable for dynamic needs like, well, entities & fields. Could that method receive the $table_name as a param to allow the same object to return schemas for different tables ? Then single-table implementations like the ones in this patch can always ditch/overlook the param.

2) Naming fun...
In the "Entity/field SQL storage" space, SqlContentEntityStorageSchema is in charge of assembling entity & field tables schema and creating the tables (currently does so in the various onEntityTypeCreate() / onFieldStorageCreate() / ... methods from EntityTypeListenerInterface & FieldStorageDefinitionListenerInterface)
So it would most likely be the object to also implement the SchemaProviderInterface added here.

But then : SqlContentEntityStorageSchema implements SchemaProviderInterface ?
I.e. "a Schema is a SchemaProvider" ? That sounds quite recursive :-/

That's a case of two different-but-related systems using the same name for slightly different things. Naming things in the "SQL entity storage schema" issue was quite debated IIRC. In the end, the "Schema" objects in the "SQL entity storage schema" land are more "handlers for some schema logic".

If this issue adds its notion of SchemaProviderInterface (which I think is a perfectly reasonable name), and Entity/field SQL storage has to use it, then maybe it would make sense for SqlContentEntityStorageSchema and its existing interfaces to move to being "SchemaProviders" too, instead of just "Schemas"...

Anyway - item 2) is probably more for discussion with the Entity Storage folks, probably in the followup issue where we move entity/field tables to the on-demand-creation mechanism added here.

chx’s picture

Status: Reviewed & tested by the community » Needs work

> Could that method receive the $table_name as a param to allow the same object to return schemas for different tables ? Then single-table implementations like the ones in this patch can always ditch/overlook the param.

Very nice idea! I didn't want getSchema() to return an array indexed by table names because the single-table use is common, but your suggestion makes a lot of sense. So setting to CNW to move {cachetags} in.

plach’s picture

@yched:

In the early Entity Storage patches SqlContentEntityStorageSchema was called ContentEntitySchemaHandler or something like that. I guess SqlContentEntityStorageSchemaProvider would definitely make more sense although "handler" is the right word IMO, as it does more than just returning a schema definition. I think we removed the handler suffix for consistency with the other entity handlers, but I've never been a big fan of that choice.

Anyway, I guess such a rename would need to happen in a BC way and, given the new policies, I'm not even sure that would be enough to let it be approved.

yched’s picture

@chx, @plach:

Thinking a bit more about passing $table_name to SchemaProviderInterface::getSchema(), I don't think that would be enough for SqlContentEntityStorageSchema(). It means getSchema($table_name) needs to reverse-engineer the table name to figure out which table it is (the base table, the revision table, the data table, some single-field table...) - that is, do the opposite of TableMappingInterface::getXxxTableName(). That would be some brittle regexp logic - not even sure that's doable since we hash some parts to avoid long table names.

Would be more helpful if SchemaProviderInterface::getSchema() received the "internal code for the table" : 'base_table', 'revision_table', 'data_table', 'field_table for field foobar'...

Meaning, SchemaProviderInterface::getSchema() should receive the table name (good enough for other, non-entity cases) + an arbitrary $context that was passed to $query->executeEnsuringTable($schemaProvider, $context) ?
(the code that does the query knows which table it is talking about, and is able to pass the right $context to describe that table)

andypost’s picture

@chx++

Having this in core will help history and comment_statistics have their own storage on a per-bundle basis. The tracker could get some facelift too.

yched’s picture

Wondering how the patch would work for queries JOINing several tables.
- one single $query->executeEnsuringTable[s?]($schemaProvider), and the same $schemaProvider is used for all the tables in the query ?
(not really compatible with the per-table $context param suggested in #25)
- one $query->executeEnsuringTable($schemaProvider, $table_name) per table present in the request ?

yched’s picture

FTR, regarding #25 and #27 : I'd hate to block this patch on requests for the (somewhat complex...) Entity/Field case. Having on-demand creation for Entity/Field tables would definitely make tests faster and simpler to write, but moving *all* runtime Entity/Fields requests (SqlStorageController + EFQ + Views...) to "beware, the tables might not exist yet, use $query->executeEnsuringTable() correctly" is not likely to be completely trivial, and having on-demand table creation for other systems is totally worthwhile in itself.

Just trying to see if we can shape the initial feature in a way that keeps the door open for entities without too much API changes. I know, that's a fine line to scope creep :-/

chx’s picture

More and more I tend to think that while adding a table name serves this issue well, anything else should be a followup. Any query doing a JOIN is not fit for this. It's possible the field table case is too complicated to be covered by this simple mechanism. It is useful for so much else, though.

catch’s picture

This might be an awful idea, but what if we just put this logic directly into execute()?

The try/catch has no real cost.

In the catch, if there's no schema definition we'd just rethrow the exception - that will have a bit of overhead, but PDO exceptions are rare enough that it shouldn't matter.

Also I'm wondering what happens if lazy table creation is implemented, but the first query against that table is one where it's not the base table. That feels like a race condition but also possible - could we check other tables in the query if the base table is OK, and only rethrow the exception after doing that?
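Sketching that idea (hypothetical code only; the schemaProvider property, the isTableMissingException() helper and the retry shape are assumptions for illustration, not the patch):

```php
// Hypothetical sketch: Query::execute() with lazy table creation folded in.
public function execute() {
  try {
    return $this->connection->query((string) $this, $this->arguments());
  }
  catch (\Exception $e) {
    // Without a schema definition, or for a different kind of failure,
    // just rethrow the original exception.
    if (!$this->schemaProvider || !$this->connection->isTableMissingException($e)) {
      throw $e;
    }
    // Create the missing table from the provided schema and rerun once.
    $this->connection->schema()->createTable($this->table, $this->schemaProvider->getSchema($this->table));
    return $this->connection->query((string) $this, $this->arguments());
  }
}
```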

chx’s picture

Great idea with an optional schema_provider argument on execute() itself. But I'd rather answer your multi-table question the same as I did yched's: if there is more than one table, then simply re-throw. This solution is for the simple cases, I would say.

chx’s picture

By the way, this is why I still love working on Drupal core: I thought this was ready, but since then we got two really great ideas! Please bring them on :)

Crell’s picture

I think it's important to target the 80-90% use case here, not all use cases. To that end:

1) If you're a use case where you're not working with the base table, I guess this doesn't work for you. That's a case where you'd still need to handle a variant of this logic manually (just as the current use cases are doing now). If even that isn't sufficient, you're back to hook_schema() (i.e., what you need to do already). Fields may be such a use case where following this logic manually is necessary, or at least a task for a second pass.

2) I would prefer to not fold this logic into execute(). That complects executing the query with the (rare) case of lazy schema generation. If you want to mess with lazy schema generation you should know what you're doing. Keep each method single-purpose. (SRP and all that.)

chx’s picture

Crell's argument for 2) sounds compelling but I will wait for catch's answer.

catch’s picture

I'm OK with not handling entities and fields here, I do think we should open a follow-up to explore that more though, and try to leave that open in this patch as much as possible, pretty much what yched said in #28.

@Crell this is another case where 'complection' doesn't really stand up as a magic/dirty word when looking at the overall context of the change.

Having the separate method only works if every read from the table is from exactly the same system that's handling the write. In the case of entities/fields there's at least three systems that interact with the storage (entity CRUD, entity field query and Views). If we don't use this for fields (which clearly is the most advanced and difficult to handle use case), there are other places where lazy table creation would be useful, that also have Views integration.

The obvious example in core is dblog. The Logger/DbLog class is currently in dblog module, and there's a hook_schema(). However it seems quite likely that we'll want to move the actual logger to core/lib at some point and drop the hook_schema(). This would allow the logger to be used independently of the UI at least, and remove the implicit dependency of the class on hook_schema() and module installation which was my original motivation for opening #1167144: Make cache backends responsible for their own storage.

However dblog module would still provide views integration, services.yml and extra UI bits like displaying individual log messages.

With the logic in execute(), dblog only needs to do the hook_schema() -> SchemaProviderInterface change, which is pure implementation detail and no API change (with the partial exception of the schema no longer showing up in drupal_get_schema() but that's never actually used any more and we're close to dropping it at this point).

With the logic in executeEnsuringTable(), Views would need to be able to distinguish between tables that need executeEnsuringTable() and queries that don't. If the former, it would also need to add to the hook_views_data() API to allow modules to communicate whether a table is lazy-created or not.

(Side note, but any module can define Views integration for anything; I could see 'cache inspector' and 'queue inspector' modules showing up in contrib.)

So while having two separate public methods keeps the concepts separate in the database layer, it will slowly have the effect of bleeding knowledge about lazy table creation - both the general concept and whether particular tables are lazy-created or not, around different subsystems in core, creating considerably more coupling overall.

Given that, this looks like an example of SRP taken too far to me, found this discussion which looks relevant too:

There is a second question, as important as "does every class have only one reason to change?" and that is "does every change only affect one class?"

From http://programmers.stackexchange.com/questions/150760/single-responsibil...

Apart from that. I'd expect by Drupal 9 that we move almost entirely away from hook_schema(), and more or less all tables are created via this mechanism - because a major trend since Drupal 5 or so has been to decouple various subsystems from specific database implementations (cache, lock, queue, entities). Most of what we have left in core hook_schema() definitions is things that simply have not been got to yet, @andypost gave three good examples. So the use cases for querying a table that's not lazy-created have already become increasingly narrow and this will tend towards the default (let alone that writing database queries is becoming increasingly narrow anyway with Views and EFQ).

yched’s picture

Interesting points in #35.

Not sure I get how putting the logic in execute() is that different from having a dedicated executeEnsuringTable() method though. Whatever the method, a SchemaProviderInterface object needs to be passed to the query anyway, right ?

So the code that writes the query has to know 1) that the table is lazily created and 2) what the right SchemaProvider for the table is - execute() doesn't bring much over executeEnsuringTable() regarding "who needs to know what"?

(except executeEnsuringTable() is easier to add in query_alter() ?)

plach’s picture

Adding to #35: I think doing this for entities/fields would be tricky because atm the events that trigger schema creation/changes are completely separated from actual querying. Hence I'm not sure it would be easy to find a point in the various execution flows where it makes sense to call an explicit executeEnsuringTable() method (and pass a schema provider) and still keep things storage-agnostic. I've no actual concern, just the feeling that this would not play well with the current architecture...

(obviously this is not meant to be a blocker at all :)

catch’s picture

Whatever the method, a SchemaProviderInterface object needs to be passed to the query anyway, right ?

Thinking through it, that's making me think that having to provide the SchemaProviderInterface object at all is potentially brittle. It's not more brittle than what we do now, but if we apply the pattern to places like dblog then it gets trickier.

The only way to handle that would be to add either something declarative or an event so that when the exception is caught, it then figures out whether the table has a matching SchemaProviderInterface class somewhere - but that adds a fair bit of dependencies.

yched’s picture

The only way to handle that would be to add either something declarative or an event so that when the exception is caught, it then figures out whether the table has a matching SchemaProviderInterface class somewhere - but that adds a fair bit of dependencies

Which is a bit like hook_schema() with an indirection : mapping table names to "the name of a class that can provide the schema" instead of the schema array directly...
Yeah, sounds a bit complicated in terms of instantiating the classes - they are likely to be business objects too, rather than just SchemaProviderInterface objects, and thus have dependencies...

What this patch gives us is a mechanism for on-demand creation of tables, where the impact is that all queries on those tables need to account for "the table might not exist yet" and provide the SchemaProvider for the table.
That is good enough for self-enclosed systems where all queries are generated by using a defined API - which is a general trend anyway since we tend to abstract from hardcoded SQL implementations. --> log, cache, history, comment_statistics...

Then, as @catch points, there's Views, that virtually allows exposing any SQL table, bypassing the APIs above.
But then could Views simply try / catch its query and consider that "Exception, table does not exist" means "empty result set" ?
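That suggestion would amount to something like this around the Views query execution (purely illustrative; the surrounding plugin context and the isTableMissingException() call are assumptions):

```php
try {
  $view->result = $query->execute()->fetchAll();
}
catch (\Exception $e) {
  if ($connection->isTableMissingException($e)) {
    // A lazily created table that does not exist yet simply has no rows.
    $view->result = [];
  }
  else {
    throw $e;
  }
}
```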

chx’s picture

Issue summary: View changes

Had a massive discussion with catch; summary updated accordingly. I believe this works even for the field case and definitely works for Views. Now, can we get some PostgreSQL folks to implement isTableMissingException quickly?

amateescu’s picture

Attached: 763 bytes

Here you go :)

plach’s picture

+++ b/core/lib/Drupal/Core/Database/Driver/pgsql/Connection.php
@@ -323,6 +323,16 @@ public function rollbackSavepoint($savepoint_name = 'mimic_implicit_commit') {
+    return ($e->getCode() === '42P01') ? TRUE : FALSE;

Why not just return $e->getCode() === '42P01';?

amateescu’s picture

Because I'm dumb? :P

yched’s picture

@chx @catch : great to hear about the new plan !

What I don't get though is:

call the SchemaProvider interfaces with the table name and use the first that answers.

Which SchemaProviders do we call exactly ? How do we find them ?

chx’s picture

Issue summary: View changes
Status: Needs work » Needs review
Attached: 45.89 KB

There is no interdiff here; the change is too big, treat it as a completely new patch. amateescu, thanks for the pgsql patch.

The execute method debate is decided by patching query in the connection class instead.

Edit: re #44 we find them in the backtrace.

Status: Needs review » Needs work

The last submitted patch, 45: 2371709_44.patch, failed testing.

chx’s picture

Status: Needs work » Needs review
Attached: 46.64 KB, 4.76 KB

I didn't use the options carefully prepared in config, and caught the wrong exception twice. Also, two minor leftovers removed. Drupal Standard now installs in peace. Whether the tests pass, I do not know :)

Status: Needs review » Needs work

The last submitted patch, 47: 2371709_46.patch, failed testing.

dawehner’s picture

  1. +++ b/core/lib/Drupal/Core/Database/Connection.php
    @@ -218,6 +218,12 @@ public function destroy() {
    +   * - on_table_missing: When a string, it is a table name. If the query
    

    We talked about making that a little bit more explicit: 'create_missing_table'

  2. +++ b/core/lib/Drupal/Core/Database/Connection.php
    @@ -555,6 +562,25 @@ public function query($query, array $args = array(), $options = array()) {
    +      if ($this->isTableMissingException($e) && !empty($options['on_table_missing'])) {
    +        if ($options['return'] == Database::RETURN_STATEMENT) {
    +          return new StatementEmpty();
    +        }
    

    We should explain why return early is okay: this basically optimizes for SELECT queries, right?

  3. +++ b/core/lib/Drupal/Core/Database/Connection.php
    @@ -555,6 +562,25 @@ public function query($query, array $args = array(), $options = array()) {
    +          foreach (debug_backtrace(DEBUG_BACKTRACE_PROVIDE_OBJECT) as $bt) {
    +            if (!empty($bt['object']) && $bt['object'] instanceof SchemaProviderInterface && ($schema = call_user_func([$bt['object'], 'getSchema'], $table))) {
    

    Let's get rid of the cuf (call_user_func())

  4. +++ b/core/lib/Drupal/Core/Database/Query/Merge.php
    @@ -135,6 +135,9 @@ class Merge extends Query implements ConditionInterface {
    +    if (!isset($options['on_table_missing'])) {
    +      $options['on_table_missing'] = $table;
    +    }
    
    +++ b/core/lib/Drupal/Core/Database/Query/Select.php
    @@ -137,6 +137,9 @@ class Select extends Query implements SelectInterface {
    +    if (!isset($options['on_table_exception'])) {
    +      $options['on_table_exception'] = TRUE;
    +    }
    
    +++ b/core/lib/Drupal/Core/Database/Query/Truncate.php
    @@ -35,6 +35,9 @@ class Truncate extends Query {
    +    if (!isset($options['on_table_missing'])) {
    +      $options['on_table_missing'] = FALSE;
    +    }
    
    +++ b/core/lib/Drupal/Core/Database/Query/Update.php
    @@ -74,6 +74,9 @@ class Update extends Query implements ConditionInterface {
    +    if (!isset($options['on_table_missing'])) {
    +      $options['on_table_missing'] = FALSE;
    +    }
    

    What about using the += pattern here as well? ... Oh and btw. on_table_exception seems to be an older name for on_table_missing?

  5. +++ b/core/lib/Drupal/Core/Database/Schema.php
    @@ -739,4 +766,9 @@ protected function escapeDefaultValue($value) {
    +  protected function query($query, array $args = array(), $options = array()) {
    +    $options['on_table_missing'] = FALSE;
    +    return $this->connection->query($query, $args, $options);
    +  }
    

    Is there a reason to have this method even though it is not used at all? (Neither storm nor I could find a use of it.)

  6. +++ b/core/lib/Drupal/Core/Database/SchemaProviderInterface.php
    @@ -0,0 +1,31 @@
    + * Drupal\Core\Database\Query::executeEnsuringTable() and
    + * Drupal\Core\Database\SelectInterface::executeEnsuringTable() allows these
    

    ... old documentation

  7. +++ b/core/lib/Drupal/Core/Database/SchemaProviderInterface.php
    @@ -0,0 +1,31 @@
    +  /**
    +   * A schema API array.
    +   *
    +   * @return array
    +   *   A schema API array
    +   *
    +   * @see hook_schema()
    +   */
    +  public function getSchema($table_name = NULL);
    

Let's document it ... in general it is confusing for it to be NULL by default; what about an empty string? It should document that most of the time you can pretty much ignore the table name, unless you need to support multiple tables.

  8. +++ b/core/modules/system/src/Tests/Cache/DatabaseBackendTagTest.php
    @@ -48,7 +49,7 @@ public function testTagInvalidations() {
    -    $invalidations_before = intval(db_select('cachetags')->fields('cachetags', array('invalidations'))->condition('tag', 'test_tag:2')->execute()->fetchField());
    +    $this->assertFalse(Database::getConnection()->schema()->tableExists('cachetags'));
    

This is a bit confusing: what about documenting that just setting cache entries does not save cache tags ... that only happens when someone invalidates them.
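The += defaulting pattern asked about in point 4 above can be sketched in isolation. Note that with_defaults() below is an illustrative stand-in for the query class constructors, not code from the patch:

```php
// Keys already present in $options win; += only fills in missing keys, so
// callers can still override or explicitly disable the behavior.
function with_defaults(array $options, $table) {
  $options += ['on_table_missing' => $table];
  return $options;
}

// Caller did not set the option: the default (the table name) is used.
var_dump(with_defaults([], 'menu_tree')['on_table_missing']); // "menu_tree"

// Caller opted out explicitly: += does not overwrite the existing key.
var_dump(with_defaults(['on_table_missing' => FALSE], 'menu_tree')['on_table_missing']); // false
```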

chx’s picture

Status: Needs work » Needs review
StatusFileSize
new18.04 KB
new46.58 KB
  1. Renamed to 'create_missing_table', thanks, great idea!
  2. Added a comment -- thanks for bringing this up. I added a killswitch to Select::addJoin; I thought of it earlier, then dropped it, but it's crystal clear from writing the comment that this only works for single tables, as we discussed above.
  3. Gone!
  4. Most of these are just gone. Merge unconditionally sets the return option to Database::RETURN_AFFECTED, so you'd need to do $options = ['return' => Database::RETURN_AFFECTED] + $options + ['create_missing_table' => $table]. I shied away from doing that. As a follow-up we could explore setting it in update() instead of the constructor, but I'd rather not change Merge behavior for now.
  5. Removed, earlier attempt.
  6. Refreshed documentation.
  7. Same.
  8. Added a comment.

Status: Needs review » Needs work

The last submitted patch, 50: 2371709_49.patch, failed testing.

chx’s picture

Status: Needs work » Needs review
StatusFileSize
new2.25 KB
new48.97 KB

Ah yes, MenuTreeStorageTest asserts that reading creates the table. That doesn't happen anymore, so I refactored the test slightly to save instead.

yched’s picture

Hate to be the downer here, but I can't say I'm in love with the "fetch missing info from wherever we can find it higher in the callstack" thing.

- It feels like uncommon/surprising black magic
- It feels brittle too: many things can be in the callstack, including objects that happen to be SchemaProviders for unrelated systems. Since it seems a lot of simple systems won't check the table name param, you can end up creating the wrong table?
- De facto, it moves from (previous approach) "only code that knows the right SchemaProvider can query a lazy table" to (current approach) "only the SchemaProvider can query a lazy table". I don't see the gain; it's just more constraining? The previous approach (explicitly passing the SchemaProvider for the query) already had that "only the subsystem can query its tables" aspect, but was more flexible and without black magic.
- If joins are still not supported, we haven't gained in terms of use cases?

Won't fight this if everyone else likes the idea, but I'm not too yay myself :-)

amateescu’s picture

Edit: Stepped into a different bug.

dawehner’s picture

amateescu’s picture

@dawehner, yup, that was the problem. Manual installation works fine on MySQL but fails on Postgres:

Drupal\Core\Database\DatabaseExceptionWrapper: SQLSTATE[42P01]: Undefined table: 7 ERROR: relation "cache_bootstrap" does not exist LINE 1: ..., checksum_invalidations, checksum_deletions FROM cache_boot... ^: SELECT cid, data, created, expire, serialized, tags, checksum_invalidations, checksum_deletions FROM {cache_bootstrap} WHERE cid IN (:cids_0); Array ( [:cids_0] => module_implements ) in Drupal\Core\Extension\ModuleHandler->getImplementationInfo() (line 517 of /home/andrei/work/d8.dev/core/lib/Drupal/Core/Extension/ModuleHandler.php).

Drupal\Core\Extension\ModuleHandler->getImplementationInfo('stream_wrappers_alter')
Drupal\Core\Extension\ModuleHandler->getImplementations('stream_wrappers_alter')
Drupal\Core\Extension\ModuleHandler->alter('stream_wrappers', Array)
Drupal\Core\StreamWrapper\StreamWrapperManager->register()
Drupal\Core\DrupalKernel->boot()
Drupal\Core\DrupalKernel->prepareLegacyRequest(Object)
drupal_install_system(Array)
install_base_system(Array)
install_run_task(Array, Array)
install_run_tasks(Array)
install_drupal()

Edit: I can try looking into this a bit later today if no one beats me to it.

chx’s picture

StatusFileSize
new49.41 KB
new4.21 KB

@yched we can fix that. We were very close to fixing that, in fact -- the database already passes in the table name and the caller checks whether a schema is returned for that table name. Cache already returns a schema if it is something it is interested in. All we need to do is change the interface to make the table name required and make the implementations check it. In fact I had this coded, and this is why I switched from $bt['class'] to $bt['object']: this way getSchema can look at $this->table / $this->bin. But I backed out because I saw MenuTreeStorage calling $this->getSchema() and I thought we could have our cake and eat it too, but no. You are right, this is brittle, so MenuTreeStorage will need to pass in the table name. That's really not a big deal.

Because the win is gigantic: query strings work without bothering to pass the SchemaProvider in to query/queryRange/queryTemporary.
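The $bt['object'] probing described above can be reduced to a standalone sketch. All names here are invented for illustration (this is the approach later abandoned in favor of passing the provider through $options):

```php
// Walk up the call stack looking for an object whose getSchema() claims the
// missing table. Purely illustrative of the backtrace idea, not patch code.
function find_provider_in_backtrace(string $table) {
  foreach (debug_backtrace(DEBUG_BACKTRACE_PROVIDE_OBJECT) as $frame) {
    $object = isset($frame['object']) ? $frame['object'] : NULL;
    if ($object !== NULL && method_exists($object, 'getSchema') && $object->getSchema($table)) {
      return $object;
    }
  }
  return NULL;
}

class MenuTreeStorageStub {
  public function getSchema(string $table) {
    // Only claim the table this "service" owns.
    return $table === 'menu_tree' ? ['fields' => []] : NULL;
  }

  public function lookup(string $table) {
    // Somewhere deep inside a query, the connection probes the backtrace.
    return find_provider_in_backtrace($table);
  }
}

$storage = new MenuTreeStorageStub();
var_dump($storage->lookup('menu_tree') === $storage); // true: found in the stack
var_dump($storage->lookup('router'));                 // NULL: nobody claims it
```

The brittleness yched describes is visible here: any object in the stack with a getSchema() method could accidentally claim the table.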

chx’s picture

StatusFileSize
new50.5 KB
new4.18 KB

Edit: nevermind.

chx’s picture

StatusFileSize
new4.18 KB

Edit: nevermind. Double post, somehow.

Status: Needs review » Needs work

The last submitted patch, 58: 2371709_58.patch, failed testing.

chx’s picture

Status: Needs work » Needs review
StatusFileSize
new3.97 KB
new50.39 KB

@amateescu, is this enough to fix pgsql? No doubt this is necessary, because pgsql overrides the query() function without calling the parent (oops); I am just not sure whether it's enough.

amateescu’s picture

@chx, nope, doesn't seem to be enough. I also tried to debug it for a few hours but didn't get anywhere. I'll try again tomorrow.

Additional uncaught exception thrown while handling exception.
Original

Drupal\Core\Database\DatabaseExceptionWrapper: SQLSTATE[42P01]: Undefined table: 7 ERROR: relation "cache_bootstrap" does not exist LINE 1: INSERT INTO cache_bootstrap (cid, serialized, created, expir... ^: INSERT INTO cache_bootstrap (cid, serialized, created, expire, tags, checksum_invalidations, checksum_deletions, data) VALUES (:db_insert_placeholder_0, :db_insert_placeholder_1, :db_insert_placeholder_2, :db_insert_placeholder_3, :db_insert_placeholder_4, :db_insert_placeholder_5, :db_insert_placeholder_6, :db_insert_placeholder_7); Array ( ) in Drupal\Core\Extension\ModuleHandler->getHookInfo() (line 290 of /home/andrei/work/d8.dev/core/lib/Drupal/Core/Extension/ModuleHandler.php).

Additional

Drupal\Core\Database\DatabaseExceptionWrapper: SQLSTATE[42P01]: Undefined table: 7 ERROR: relation "cachetags" does not exist LINE 1: INSERT INTO cachetags (invalidations, tag) VALUES ('1', 'ext... ^: INSERT INTO cachetags (invalidations, tag) VALUES (:db_insert_placeholder_0, :db_insert_placeholder_1); Array ( ) in Drupal\Core\Cache\Cache::invalidateTags() (line 132 of /home/andrei/work/d8.dev/core/lib/Drupal/Core/Cache/Cache.php).
amateescu’s picture

StatusFileSize
new3.76 KB

Almost there :)

There are multiple problems with Postgres here:

1) in Drupal\Core\Database\Driver\pgsql\Insert::execute(), we set the return option to Database::RETURN_NULL

    ...
    if (!empty($table_information->sequences)) {
      $options['sequence_name'] = $table_information->sequences[0];
    }
    // If there are no sequences then we can't get a last insert id.
    elseif ($options['return'] == Database::RETURN_INSERT_ID) {
      $options['return'] = Database::RETURN_NULL;
    }
    ...

which makes

      $handled = $this->handleMissingTable($e, $query, $args, $options);
      if (isset($handled)) {
        return $handled;
      }

fail, because $handled will be NULL on an insert into a cache table. Added a hacky workaround to return an array instead and just use its first value.

2) since we've removed various pieces of code that ensured tables were created before running INSERT queries, we now have to extend the fixes/workarounds from #2181291: Prevent a query from aborting the entire transaction in pgsql to Drupal\Core\Database\Driver\pgsql\Insert::execute() as well as handling it inside Drupal\Core\Database\Driver\pgsql\Connection::query().

3) I'm now getting

Drupal\Core\Database\DatabaseExceptionWrapper: SQLSTATE[22P02]: Invalid text representation: 7 ERROR: invalid input syntax for type bytea LINE 1: ...'1416262383.549', '-1', 'entity_types', '0', '0', 'a:6:{s:19... ^: INSERT INTO cache_discovery (cid, serialized, created, expire, tags, checksum_invalidations, checksum_deletions, data) VALUES (:db_insert_placeholder_0, :db_insert_placeholder_1, :db_insert_placeholder_2, :db_insert_placeholder_3, :db_insert_placeholder_4, :db_insert_placeholder_5, :db_insert_placeholder_6, :db_insert_placeholder_7); Array ( ) in Drupal\Core\Plugin\DefaultPluginManager->setCachedDefinitions() (line 197 of /home/andrei/work/d8.dev/core/lib/Drupal/Core/Plugin/DefaultPluginManager.php).

Drupal\Core\Plugin\DefaultPluginManager->setCachedDefinitions(Array)
Drupal\Core\Plugin\DefaultPluginManager->getDefinitions()
Drupal\Core\Config\ConfigManager->getEntityTypeIdByName('core.extension')
Drupal\Core\Config\ConfigInstaller->createConfiguration('', Array)
Drupal\Core\Config\ConfigInstaller->installDefaultConfig('core', 'core')
drupal_install_system(Array)
install_base_system(Array)
install_run_task(Array, Array)
install_run_tasks(Array)
install_drupal()

which appears to be a general problem with inserting serialized PHP data into a bytea column, so I wonder how the installation even works without this patch.

Edit: to answer myself - it works by magic and I see that cache_discovery gets populated with serialized data just fine :/

To be continued tomorrow...
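The pitfall in point 1 above can be reduced to a minimal standalone example: an isset() check on the handler's return value cannot tell "not handled" apart from a query whose legitimate result is NULL (as with pgsql inserts using RETURN_NULL). The names below are illustrative, not from the patch:

```php
// Wrap the real result (which may itself be NULL) in an array, so a bare
// NULL is reserved for "the exception was not handled, rethrow it".
function handle_missing_table(bool $table_created) {
  return $table_created ? ['result' => NULL] : NULL;
}

$handled = handle_missing_table(TRUE);
var_dump($handled !== NULL);  // true: the caller knows the query was rerun...
var_dump($handled['result']); // NULL: ...even though it legitimately returned NULL
```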

chx’s picture

Discussed with @amateescu yesterday after he realized postgresql insert relies on schema. Here's the plan:

  1. Move the schema-finder into its own method
  2. Have the pgsql schema-from-information_schema method throw an exception for missing table
  3. Catch the exception and call 1. It'll have the info just necessary. Table can be created at this point as well.
amateescu’s picture

StatusFileSize
new9.44 KB

Catch the exception and call 1. It'll have the info just necessary.

Except that the info from 1. (a $schema array) is not what's necessary for insert :) But that's no biggie, we can just query information_schema again.

Anyway, the plan above allowed me to make some progress. A lot more tables are created now but some are still not, like cache_data, cache_menu, cache_render and menu_tree.

This interdiff is against #61 if anyone wants to play with it during the weekend.

Crell’s picture

I have to say, trying to crawl up the call stack to get to an object with a given interface strikes me as a "holy crap, why would you do such a thing?" action. That's relying on all sorts of implicit behavior and statefulness that feels like it's going to blow up in our faces, or more likely someone else's face.

To whatever extent a subsystem maintainer can veto something, I do so for this approach.

The original approach seemed far more contained and reasonable to me. The only caveat is Views, which to me seems like a non-issue. If the table isn't there, there's nothing to show anyway so who cares? Let's solve the 80% case, not over-complicate it with language black magic (which has performance concerns, too; generating the call stack is not a fast operation) to try and cover every possible edge case.

chx’s picture

Supporting query strings is not an edge case, and there you have problems doing this any other way. The performance concerns are likely invalid, since this is a very rare occasion coupled with one of the slowest possible database operations: creating a table.

chx’s picture

But, I have said everything I had to say multiple times now. I have carried this issue this far; if people don't like it, that's not really a problem of mine. Without any regrets I am unfollowing this issue. MongoDB now carries a little DB driver anyway to facilitate an easy install, and it has a Schema class that blackholes every call to it. If people want simpler and speedier tests they can continue with either the latest or earlier patches. I am out.

dawehner’s picture

As a compromise, can we replace the auto-detection of the calling class and pass the schema provider along as part of $options?
$connection->query(), $connection->select(), $connection->update() etc. all provide it.

catch’s picture

If we did it as part of options that gives us a potential route to supporting more than one table/schema provider (or the table not being the base table) later on as well as static queries. I'm thinking about an eventual situation where Views handles locating the schema provider, if one exists, for each table in a query and adds it to options.

Views isn't the only issue here; EntityQuery is almost as problematic (except that it's at least linked to the API dealing with the schema, whereas Views isn't at all, so EntityQuery probably has a closer route to discovering that information).

Using the backtrace isn't a performance concern for the reasons chx mentioned (we're already doing a DDL operation if we get to that point). I agree it's weird but so is an extra execute() method, $options does seem like a decent compromise between the two.
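The $options compromise could look roughly like the following standalone sketch (SQLite via PDO; the SchemaProviderInterface shape, the 'schema_provider' and 'table' keys, and query_with_provider() are illustrative stand-ins, not the patch's actual API):

```php
// On a missing-table error, ask the provider for the schema (raw DDL here,
// Schema API arrays in Drupal), create the table, and rerun the query once.
interface SchemaProviderInterface {
  public function getSchema(string $table_name): ?string;
}

function query_with_provider(PDO $db, string $sql, array $options = []): array {
  try {
    return $db->query($sql)->fetchAll(PDO::FETCH_ASSOC);
  }
  catch (PDOException $e) {
    // Real code would first check the SQLSTATE (e.g. 42S02 / 42P01) to make
    // sure this really is a missing-table error.
    $provider = $options['schema_provider'] ?? NULL;
    if ($provider && ($ddl = $provider->getSchema($options['table'] ?? ''))) {
      $db->exec($ddl);
      // Retry exactly once; a second failure bubbles up as usual.
      return $db->query($sql)->fetchAll(PDO::FETCH_ASSOC);
    }
    throw $e;
  }
}

$db = new PDO('sqlite::memory:', NULL, NULL, [PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION]);
$provider = new class implements SchemaProviderInterface {
  public function getSchema(string $table_name): ?string {
    // Only claim the one table this provider owns.
    return $table_name === 'cachetags'
      ? 'CREATE TABLE cachetags (tag TEXT PRIMARY KEY, invalidations INTEGER)'
      : NULL;
  }
};

$rows = query_with_provider($db, 'SELECT * FROM cachetags', [
  'table' => 'cachetags',
  'schema_provider' => $provider,
]);
var_dump(count($rows)); // 0: the table was created lazily and the query reran
```

Because the provider travels with the query, static query strings work without an extra execute() method, which was the sticking point with the earlier approaches.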

Crell’s picture

Using the options array makes more sense, I agree. It's not quite what it was originally intended for, but such expansion is why it's an array rather than just a primary/replica boolean. Let's see if we can make that work.

tstoeckler’s picture

EntityQuery is almost as problematic (except that it's at least linked to the API dealing with the schema, whereas Views isn't at all, so EntityQuery probably has a closer route to discovering that information

Spot on. Entity Query already has the storage injected, so it can get the schema quite easily.

dawehner’s picture

StatusFileSize
new59.88 KB
new17.56 KB

@amateescu
Great work!! Did you manage to make a bit more progress in the meantime?

Here is a first try based upon the last patch in #61 to implement the idea mentioned in #69.

After fixing a couple of calls to $connection->select() in MenuTreeStorage, this could actually pass.

amateescu’s picture

@dawehner, nope. And now I'll wait until the new approach/patch is RTBC before doing any other Postgres fixes.

dawehner’s picture

StatusFileSize
new17.56 KB

Using the options array makes more sense, I agree. It's not quite what it was originally intended for, but such expansion is why it's an array rather than just a primary/replica boolean. Let's see if we can make that work.

Cool that you agree, having a conflict here would be bad.

Spot on. Entity Query already has the storage injected, so it can get the schema quite easily.

It does? Afaik this should be part of \Drupal\Core\Entity\Query\Sql\Query, but I'm not really sure I see the entity storage there at the moment; can you enlighten me?
On the other hand, I'm trying to understand what the point of that is. We could just specify $options['create_missing_table'] = TRUE;,
which would return empty results in both Views and entity query, and be done. There is no real requirement to create the tables for that.

Status: Needs review » Needs work

The last submitted patch, 73: 2371709-71.patch, failed testing.

dawehner’s picture

Issue summary: View changes
Status: Needs work » Needs review
StatusFileSize
new1.61 KB
new60.26 KB

It's green again.

Updated the issue summary to include the new idea.

dawehner passes the ball to @amateescu

amateescu’s picture

Thanks, but, as I said earlier, I'll resume working on this only when it's RTBC or at least fully reviewed and agreed upon by the db maintainers.

yched’s picture

The IS still contains "climb the backtrace and call the SchemaProvider interfaces..."

Also, it's not clear what the status of selects with JOINs is in that latest approach? (Asking for entity/fields in EFQ/Views.)

dawehner’s picture

Issue summary: View changes

To be honest, I'm personally okay with not supporting every possible use case.

Updated the IS a bit more.

yched’s picture

To be honest I'm personally okay with not supporting every possible use case

Sure, as I said earlier, I absolutely do not intend to block this on supporting every use case. Just trying to have a clear view of what's supported and what's not, because that's the difference between "this can be used by self-enclosed systems like cache or history" and "this can be leveraged by entity storage".

dawehner’s picture

So @yched, do you agree with the patch as it is? We need an RTBC (which is odd, because that is the wrong state) so that @amateescu picks it up and tries to fix the PGSQL problems.

amateescu’s picture

Well, let me clarify that a bit. I'm waiting for an 'unofficial' RTBC (as in, not the issue status) from people who want this patch to go in a specific direction or have strong opinions about it, like @catch, @Crell or @yched.

I hope no one takes this the wrong way; it's just that there are a lot of other patches to work on instead of learning Postgres (which I installed and used for the first time just a couple of weeks ago to help @chx test that the error code from #41 works as expected).

Crell’s picture

  1. +++ b/core/lib/Drupal/Core/Cache/DatabaseBackend.php
    @@ -298,21 +269,11 @@ public function delete($cid) {
    +    // Delete in chunks when a large array is passed.
    +    foreach (array_chunk($cids, 1000) as $cids_chunk) {
    +      $this->connection->delete($this->bin, array('schema_provider' => $this))
    +        ->condition('cid', $cids_chunk, 'IN')
    +        ->execute();
         }
    

    I think I noted this earlier, but may not have been clear about it... we don't need to create the table lazily on delete, since we don't need it to be there for the post condition to be satisfied. However, we DO need to catch a table-not-found case and continue gracefully without letting the exception bubble up. I'd suggest just catching the exception outside the foreach and swallowing it.

    I think the same applies to a few other cases below.

  2. +++ b/core/lib/Drupal/Core/Database/Connection.php
    @@ -218,6 +218,12 @@ public function destroy() {
    +   * - create_missing_table: When a string, it is a table name. If the query
    +   *   fails with a table missing error then this table is created and if
    +   *   the creation succeeded then the query is re-run. When TRUE and
    +   *   return is set to Database::RETURN_STATEMENT and the query is failing
    +   *   with a table missing error then an empty statement is returned. Defaults
    +   *   to TRUE.
    

    This doesn't say where the table definition comes from. I assume it's from schema API but that's not at all clear. But isn't the idea to allow the schema to be passed in, not pulled from a static hook_schema definition?

    Also, I do not understand the TRUE meaning. If a query fails an exception gets thrown, so the return value is irrelevant. Don't load too much meaning into a single property.

    Also, schema_provider is not documented?

  3. +++ b/core/lib/Drupal/Core/Database/Connection.php
    @@ -555,6 +562,10 @@ public function query($query, array $args = array(), $options = array()) {
    +      $handled = $this->handleMissingTable($e, $query, $args, $options);
    +      if (isset($handled)) {
    +        return $handled;
    +      }
    

$handled should never be null. From the code here I don't see how it could ever be anything but a boolean. The code below says it can also be a statement, but that's still an object. Just don't let handleMissingTable ever return null, period.

  4. +++ b/core/lib/Drupal/Core/Database/Connection.php
    @@ -576,6 +587,43 @@ public function query($query, array $args = array(), $options = array()) {
    +    if ($this->isTableMissingException($e) && !empty($options['create_missing_table'])) {
    

    $options['create_missing_table'] should never be empty. defaultOptions() should ensure that. If this is a boolean check just do that.

  5. +++ b/core/lib/Drupal/Core/Database/Connection.php
    @@ -576,6 +587,43 @@ public function query($query, array $args = array(), $options = array()) {
    +      if (is_string($options['create_missing_table']) && isset($options['schema_provider'])) {
    +        $table = $options['create_missing_table'];
    +        /** @var \Drupal\Core\Database\SchemaProviderInterface $schema_provider */
    +        if ($schema_provider = $options['schema_provider']) {
    +          if ($this->schema()->ensureTableExists($table, $schema_provider->getSchema($table))) {
    +            // Theoretically the table must exist at this point so this
    +            // should not matter but still, databases. This makes sure
    +            // no infinite recursion happens.
    +            $options['create_missing_table'] = FALSE;
    +            unset($options['schema_provider']);
    +            return $this->query($query, $args, $options);
    +          }
    +        }
    +      }
    

I really don't get the point of create_missing_table at all. If the schema provider defines one table, we're done. If it defines multiple... wouldn't we need all of them? This feels seriously over-complex for what should be a simple task.

  6. +++ b/core/lib/Drupal/Core/Database/Database.php
    @@ -443,4 +443,26 @@ public static function closeConnection($target = NULL, $key = NULL) {
    +  /**
    +   * Extracts the SQLSTATE error from the PDOException.
    +   *
    +   * @param \Exception $e
    +   *   The exception
    +   *
    +   * @return string
    +   *   The five character error code.
    +   */
    +  public static function getSQLState(\Exception $e) {
    

    Why is this not a method on Connection instead, where it could be overridden per driver if needed? Database should just be the connection management and utility facades; this is not just a utility facade; it's internal behavior of a database driver.

  7. +++ b/core/lib/Drupal/Core/Database/Driver/pgsql/Connection.php
    @@ -143,6 +143,10 @@ public function query($query, array $args = array(), $options = array()) {
    +      $handled = $this->handleMissingTable($e, $query, $args, $options);
    +      if (isset($handled)) {
    +        return $handled;
    +      }
    

    As above.

  8. +++ b/core/lib/Drupal/Core/Menu/MenuTreeStorage.php
    @@ -760,13 +732,13 @@ public function getRootPathIds($id) {
    -      $query = $this->connection->select($this->table, $this->options);
    +      $query = $this->connection->select($this->table, NULL, $this->options);
    

    There's a lot of these lines. Does that mean this code was all that buggy before...?

  9. +++ b/core/modules/system/src/Tests/Database/SelectTest.php
    @@ -7,13 +7,14 @@
    @@ -23,11 +24,36 @@ function testSimpleSelect() {
    
    @@ -23,11 +24,36 @@ function testSimpleSelect() {
         $query->addField('test', 'name');
         $query->addField('test', 'age', 'age');
         $num_records = $query->countQuery()->execute()->fetchField();
    +    debug($num_records);
    

    womp womp.

I agree with the overall direction, but the internals feel very over-complicated at present.

I would see this as simply needing a schema_provider option. In case of table-not-found, install everything the schema_provider gives you and try again. If it still doesn't work or one wasn't provided, just exception as normal.

I can only assume that the double and undocumented keys here are for complex cases like Field API, but I believe the setup here puts the complexity in the wrong place. There should be no parameters on ProviderInterface::getSchema(). It should return all tables to be created. In a more complex case like Field API where the tables might be conditionally named, well, then you don't make the calling object itself a provider. Just instantiate one with appropriate constructor parameters. Something like (mostly pseudo-code):

class Foo {
  public function bar() {
    // All the per-field setup (the field name, etc.) goes into the
    // provider's constructor, so getSchema() itself needs no parameters.
    $provider = new FieldProvider($field_name);
    $this->connection->query('...', [], ['schema_provider' => $provider]);
  }
}

FieldProvider can then provide all of the necessary tables for that given field (or whatever) based on $field_name (or whatever other setup happens outside of getSchema()). That's a totally legit thing to do, and cleaner than optimizing for the degenerate case of Foo being its own provider. (If it can be, that's cool, but we shouldn't count on that always being the case.)

That lets us separate out the schema definition logic to its own object, keep the Foo class or whatever it represents clean, and keeps the handling code inside the DB API shorter and with fewer overloaded properties. The presence of schema_provider itself triggers the "on failure do this and try again" logic, and its absence means... do exactly what happens now.

Crell’s picture

Status: Needs review » Needs work
dawehner’s picture

Status: Needs work » Needs review
StatusFileSize
new59.63 KB
new6.89 KB

Thank you @crell for your review!

I think I noted this earlier, but may not have been clear about it... we don't need to create the table lazily on delete, since we don't need it to be there for the post condition to be satisfied. However, we DO need to catch a table-not-found case and continue gracefully without letting the exception bubble up. I'd suggest just catching the exception outside the foreach and swallowing it.

Note: given that we have both create_missing_table and schema_provider, we don't create tables on delete calls.

This doesn't say where the table definition comes from. I assume it's from schema API but that's not at all clear. But isn't the idea to allow the schema to be passed in, not pulled from a static hook_schema definition?

Also, I do not understand the TRUE meaning. If a query fails an exception gets thrown, so the return value is irrelevant. Don't load too much meaning into a single property.

Also, schema_provider is not documented?

Worked a bit on the documentation here.

$handled should never be null. From the code here I don't see how it should ever be anything but a boolean, The code below says it can also be a statement, but that's still an object. Just don't let handleMissingTable ever return null, period.

At least in my version of the code, in case you have a different exception ... you will return NULL;

There's a lot of these lines. Does that mean this code was all that buggy before...?

Yeah, you are right here.

Why is this not a method on Connection instead, where it could be overridden per driver if needed? Database should just be the connection management and utility facades; this is not just a utility facade; it's internal behavior of a database driver.

Fixed it.

I would see this as simply needing a schema_provider option. In case of table-not-found, install everything the schema_provider gives you and try again. If it still doesn't work or one wasn't provided, just exception as normal.

Well, we want to at least support the cache backends ... which have a dynamic table name.

amateescu’s picture

$handled should never be null. From the code here I don't see how it should ever be anything but a boolean, The code below says it can also be a statement, but that's still an object. Just don't let handleMissingTable ever return null, period.

At least in my version of the code, in case you have a different exception ... you will return NULL;

@Crell is right, $handled should never be null. That's a problem we need to fix for pgsql anyway; see why in #63.

Also:

I would see this as simply needing a schema_provider option. In case of table-not-found, install everything the schema_provider gives you and try again. If it still doesn't work or one wasn't provided, just exception as normal.

Well we want to at least support the cache backends ... which have a dynamic table name.

Wouldn't that be fixed by something like this?

+++ b/core/lib/Drupal/Core/Cache/DatabaseBackend.php
@@ -228,7 +199,7 @@ public function setMultiple(array $items) {
-        ->insert($this->bin)
+        ->insert($this->bin, array('schema_provider' => array($this->bin => $this)))

Status: Needs review » Needs work

The last submitted patch, 86: 2371709-86.patch, failed testing.

Crell’s picture

Dynamic table name, sure. Then getSchema() can return a table definition with the right name. If it's a method on the backend itself, it has access to $this->bin. If it's a separate object, pass $this->bin to that object's constructor. Problem solved.

Really, this should be a lot easier than the last patch was making it out to be. :-)
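A minimal sketch of that suggestion (class and field names invented here, not taken from the patch): the dynamic bin name is passed to the provider's constructor, so getSchema() stays parameterless:

```php
// Illustrative only: the backend's dynamic table name is baked into the
// provider at construction time.
class CacheBinSchemaProvider {
  protected $bin;

  public function __construct($bin) {
    $this->bin = $bin;
  }

  public function getSchema() {
    // A Schema API style definition keyed by the dynamic table name.
    return [
      $this->bin => [
        'fields' => [
          'cid' => ['type' => 'varchar', 'length' => 255],
          'data' => ['type' => 'blob', 'size' => 'big'],
        ],
        'primary key' => ['cid'],
      ],
    ];
  }
}

$provider = new CacheBinSchemaProvider('cache_bootstrap');
var_dump(array_keys($provider->getSchema())); // ['cache_bootstrap']
```

The backend itself could equally implement getSchema() using $this->bin; the separate object just avoids forcing the backend to be its own provider.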

yched’s picture

SchemaProvider::getSchema() has no argument; it provides the schemas for all the tables to be created

Not opposed to that, but just trying to clarify what it would mean when applied to entities / fields.

This means that the first request on an entity type, no matter which tables are actually being requested, creates all the base tables and all the per-field tables for the configurable fields that exist at that point in time.

1) In practice it probably means little actual perf gain for tests, since we create all field tables anyway.
If not for test perf, there's IMO little reason to bother moving the entity/field storage to the lazy model?

2) Say some new fields is created in the UI after that.
- Their tables are not created immediately, since in that scenario we moved to the "lazy" model.
- They are created by the first subsequent request involving one of those fields, which triggers a SchemaProvider call.
- The SchemaProvider returns all the previous tables + the new ones for the fields that were created since the last request that had to call the SchemaProvider.
- This means the "lazy create" supporting code needs to check which tables already exist, so that we don't create them again.
(We also have spent a bit more time re-computing schemas we don't need for tables that already exist - no biggie, since that is a rare runtime occurrence, but that still feels like a clustered, non-cached version of hook_schema())

So, as compared to "SchemaProvider::getSchema() receives the missing table name and only returns the schema for that one":
- some complexity moves from one place to the other, but can't be fully avoided ?
- "tables appear by bursts depending on the sequence of fields creation and requests being executed" feels a little more confusing/unpredictable than "a table appears when it is requested".
- it's much less interesting for test perf

Then again, it's still not clear to me whether the system designed here intends to support missing tables in JOINs. If it doesn't, then entities/fields are out of scope to begin with, and the above is irrelevant.

catch’s picture

Would an array of schema providers in $options help the fields case? It feels like that might be enough to support multiple tables. Of course entity/field then needs to figure out what that array is supposed to be, but on the exception we could iterate over each schema provider to check. That would be fewer tables at a time than every single field on an entity, for the case yched brings up.

Crell’s picture

Perhaps I'm not understanding Field API's usage patterns properly. Wouldn't it be querying Field-at-a-time, and so only creating the tables for a given field? (Remember, we only need to bother creating tables on write, not on read, since for a SELECT we'd still end up with empty data anyway in the end so no sense mucking with the schema.)

yched’s picture

@Crell

Wouldn't it be querying Field-at-a-time, and so only creating the tables for a given field? (Remember, we only need to bother creating tables on write, not on read, since for a SELECT we'd still end up with empty data anyway in the end so no sense mucking with the schema.)

Not good enough, unfortunately, because a *LEFT* JOIN on a table with no data does not mean the result set is empty.

Imagine the following case:
- You have a View that lists nodes. Let's say the current set of nodes is such that the View actually returns results.
- You add a new field to a node type, and change the view so that it filters on "nodes WHERE that new field is empty".
- With current HEAD, the new field table is created immediately. It is empty, but the Views query LEFT JOINs to it, so the View still returns the same result set as before. That is correct: there *are* nodes, and the new field is "empty" on them.
- With the current proposed approach, the View is suddenly empty. Until someone actually adds one field value in one arbitrary node (doesn't even have to be one of the nodes selected by the View), which creates the table, and the View is back to returning all its previous nodes. That would feel very broken :-)
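yched's point can be demonstrated with plain SQL. A minimal sketch using an in-memory SQLite database (the table and field names are invented for illustration):

```php
<?php

// A LEFT JOIN against an *empty* table still returns every row on the left
// side; only a *missing* table makes the query fail. That is the difference
// the lazy-creation approach would expose to Views.
$pdo = new PDO('sqlite::memory:');
$pdo->exec('CREATE TABLE node (nid INTEGER PRIMARY KEY)');
$pdo->exec('INSERT INTO node (nid) VALUES (1), (2)');

// The new field table exists but holds no data yet.
$pdo->exec('CREATE TABLE node__field_new (entity_id INTEGER, value TEXT)');

$rows = $pdo->query(
  'SELECT n.nid, f.value
   FROM node n
   LEFT JOIN node__field_new f ON f.entity_id = n.nid'
)->fetchAll(PDO::FETCH_ASSOC);

// Both nodes come back, with NULL for the (empty) field values.
echo count($rows), "\n"; // 2
```

Dropping the CREATE TABLE for node__field_new instead makes the same query throw, which is the "View is suddenly empty/broken" situation described above.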

An approach where:
- the SchemaProvider is responsible for a set of tables, and receives the name of the table that needs to be created,
- and the query can specify an array of SchemaProviders (e.g for our case one per entity type involved in the query); when a table is deemed missing, they are called in turn until one of them says "yep, that's mine" and actually returns a schema.
would seem to solve that.

That also means the "create on table not exists" logic would need to account for all the tables that are part of the query, and possibly behave differently based on the type of JOIN?
[edit: hm, not completely sure that "behave differently depending on the join" part would actually be needed]
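The multi-provider lookup sketched above could be quite small. A conceptual sketch, with hypothetical interface and class names (none of these exist in the patch):

```php
<?php

// Each provider owns a set of tables and can produce the schema for any one
// of them on demand.
interface SchemaProviderInterface {
  public function providesTable(string $table): bool;
  public function getSchema(string $table): array;
}

final class FieldSchemaProvider implements SchemaProviderInterface {
  public function providesTable(string $table): bool {
    return str_starts_with($table, 'node__');
  }

  public function getSchema(string $table): array {
    // Real code would derive this from the field storage definition.
    return ['fields' => ['entity_id' => ['type' => 'int']]];
  }
}

/**
 * Asks each provider in turn until one claims the missing table.
 *
 * @param \SchemaProviderInterface[] $providers
 */
function resolveMissingTable(string $table, array $providers): ?array {
  foreach ($providers as $provider) {
    if ($provider->providesTable($table)) {
      return $provider->getSchema($table);
    }
  }
  // No provider claims the table: the original exception should propagate.
  return NULL;
}

$providers = [new FieldSchemaProvider()];
var_dump(resolveMissingTable('node__field_tags', $providers) !== NULL); // bool(true)
var_dump(resolveMissingTable('users', $providers)); // NULL
```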

yched’s picture

Restating this to be extra clear: I'm just trying to outline what (I think) it would take to be able to leverage that lazy-create feature for entity/fields because, in other issues, @catch and @alexpott seemed to expect that it will solve a couple race conditions on entity/field table creation, that currently require weird workarounds in some tests. And also, yes, it's likely that it would make our tests run faster.

I don't intend to block a simpler version from getting in, and I'm perfectly fine with a decision of "yeah, too complex, screw entity / fields, let's just have a nice lazy-creation mechanism that works great for simpler APIs".
Or "let's get a simple version in first, and possibly enrich it to account for more complex uses in a later issue" (would then imply BC break concerns, of course).

In other words: not my call :-)

kgoel’s picture

StatusFileSize
new59.63 KB

Need to address #84 and #87. Just a re-roll right now since the patch didn't apply.

daffie’s picture

Status: Needs work » Needs review

For the testbot

Status: Needs review » Needs work

The last submitted patch, 95: 2371709-95.patch, failed testing.

kgoel’s picture

My bad for not adding the "do not test" suffix, as the patch was going to fail anyway. Working on addressing #84 and #87.

kgoel’s picture

StatusFileSize
new51.56 KB

This is probably not useful, but posting the patch anyway.

dawehner’s picture

Status: Needs work » Needs review

Let's always send it to the testbot.

yched’s picture

I don't intend to block a simpler version from getting in, and I'm perfectly fine with a decision of "yeah, too complex, screw entity / fields, let's just have a nice lazy-creation mechanism that works great for simpler APis".

Given that the discussion has stalled since then, I'd vote for doing this :-/

bzrudi71’s picture

Seems that this patch would help us fix some PostgreSQL-related issues while in a transaction. Didn't read the whole story, but +1 for this approach here! Adding the PostgreSQL issue as related.

amateescu’s picture

@bzrudi71, you might also be interested in comments #62, #63 and #65, where I was trying to make this patch work for PostgreSQL, before the patch started to drift in a different direction.

dawehner’s picture

Issue tags: +Needs reroll

Tagging; it's not trivial, by the way.

almaudoh’s picture

StatusFileSize
new55.1 KB

Rerolled. Took three steps and an hour. Small changes due to some changes to SQL queries in HEAD. The patch is smaller due to removals related to caching system changes.

Status: Needs review » Needs work

The last submitted patch, 105: on_demand_table-2371709-105.patch, failed testing.

almaudoh’s picture

+++ b/core/lib/Drupal/Core/Cache/DatabaseBackend.php
@@ -448,8 +350,8 @@ protected function normalizeCid($cid) {
+    $schema[$this->bin] = array(

I haven't really read the code yet, but shouldn't this be $schema[$table_name]?

almaudoh’s picture

+++ b/core/lib/Drupal/Core/Database/Connection.php
@@ -215,6 +215,12 @@ public function destroy() {
+   * - create_missing_table: Boolean value, whether to create a missing table.
+   * - missing_table_name: In case create_missing_table is TRUE, this will
+   *   be the table name used for the automatic table creation.
+   * - schema_provider: An object implementing
+   *   \Drupal\Core\Schema\SchemaProviderInterface . This will be called when

@@ -229,6 +235,7 @@ protected function defaultOptions() {
+      'create_missing_table' => TRUE,

@@ -616,6 +627,44 @@ protected function handleQueryException(\PDOException $e, $query, array $args =
+    if ($this->isTableMissingException($e) && $options['create_missing_table']) {
...
+      if (!empty($options['missing_table_name']) && isset($options['schema_provider'])) {
+        $table = $options['missing_table_name'];

+++ b/core/lib/Drupal/Core/Database/Query/Insert.php
@@ -72,9 +72,10 @@ class Insert extends Query {
+      'create_missing_table' => $table,

+++ b/core/lib/Drupal/Core/Database/Query/Merge.php
@@ -135,6 +135,9 @@ class Merge extends Query implements ConditionInterface {
+      $options['create_missing_table'] = $table;

The main cause of the failures: inconsistency between the implementation of 'create_missing_table' and 'missing_table_name'.
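Restating the intended option contract from the quoted hunks (a sketch of the fix, not the actual patch): 'create_missing_table' should stay a boolean flag, with the table name carried separately.

```php
// Intended shape of the query options, per the Connection docblock above:
// a boolean flag plus a separate table name, never the table name stuffed
// into the boolean slot (which is what Insert.php and Merge.php were doing).
$options = [
  'create_missing_table' => TRUE,
  'missing_table_name' => $table,
  'schema_provider' => $schema_provider,
];
```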

almaudoh’s picture

Status: Needs work » Needs review
Issue tags: -Needs reroll
StatusFileSize
new55.71 KB
new4.53 KB

Fixed #107 and #108.

Status: Needs review » Needs work

The last submitted patch, 109: on_demand_table-2371709-109.patch, failed testing.

The last submitted patch, 109: on_demand_table-2371709-109.patch, failed testing.

basic’s picture

This patch is causing testbot issues and not succeeding on the bots. Please review / re-roll the patch before submitting again.

The last submitted patch, 109: on_demand_table-2371709-109.patch, failed testing.

almaudoh’s picture

I wonder why testbot keeps re-queuing it :(

almaudoh’s picture

Status: Needs work » Needs review
StatusFileSize
new55.68 KB
new727 bytes

Re-rolled and fixed a stray debug...

Status: Needs review » Needs work

The last submitted patch, 116: on_demand_table-2371709-116.patch, failed testing.

The last submitted patch, 109: on_demand_table-2371709-109.patch, failed testing.

jthorson’s picture

almaudoh’s picture

Status: Needs work » Needs review
StatusFileSize
new55.17 KB

Reroll

almaudoh’s picture

Status: Needs review » Needs work

The last submitted patch, 120: on_demand_table-2371709-120.patch, failed testing.

Status: Needs work » Needs review

Status: Needs review » Needs work

The last submitted patch, 120: on_demand_table-2371709-120.patch, failed testing.

dawehner’s picture

The more I think about this particular issue, the more I think it is wrong to do this in this subsystem. The query parts are separated from the schema at the moment, and this patch would break that.

Crell’s picture

Version: 8.0.x-dev » 8.1.x-dev

I'm pretty sure this won't happen until 8.1 anyway...

dawehner: Can you clarify? You access the schema through the connection object already anyway.

dawehner’s picture

dawehner: Can you clarify? You access the schema through the connection object already anyway.

Fair, but I still think that it connects two subsystems which ideally would not know about each other.

Crell’s picture

You mean the DB connection and the schema management? How would those be fully decoupled? The schema needs a connection in order to do anything...

fgm’s picture

I /think/ the idea is that the schema needs a storage (hence a connection) to implement its operations, but the schema mechanism itself should not need it.

Version: 8.1.x-dev » 8.2.x-dev

Drupal 8.1.0-beta1 was released on March 2, 2016, which means new developments and disruptive changes should now be targeted against the 8.2.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.2.x-dev » 8.3.x-dev

Drupal 8.2.0-beta1 was released on August 3, 2016, which means new developments and disruptive changes should now be targeted against the 8.3.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.3.x-dev » 8.4.x-dev

Drupal 8.3.0-alpha1 will be released the week of January 30, 2017, which means new developments and disruptive changes should now be targeted against the 8.4.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.4.x-dev » 8.5.x-dev

Drupal 8.4.0-alpha1 will be released the week of July 31, 2017, which means new developments and disruptive changes should now be targeted against the 8.5.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.5.x-dev » 8.6.x-dev

Drupal 8.5.0-alpha1 will be released the week of January 17, 2018, which means new developments and disruptive changes should now be targeted against the 8.6.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.6.x-dev » 8.7.x-dev

Drupal 8.6.0-alpha1 will be released the week of July 16, 2018, which means new developments and disruptive changes should now be targeted against the 8.7.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.7.x-dev » 8.8.x-dev

Drupal 8.7.0-alpha1 will be released the week of March 11, 2019, which means new developments and disruptive changes should now be targeted against the 8.8.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

andypost’s picture

Looks like the issue brings some confusion. In #2664830-72: Add search capability to help topics
it is not clear which schema management approach to use: hook_schema() vs. a definition from the service's method.

Version: 8.8.x-dev » 8.9.x-dev

Drupal 8.8.0-alpha1 will be released the week of October 14th, 2019, which means new developments and disruptive changes should now be targeted against the 8.9.x-dev branch. (Any changes to 8.9.x will also be committed to 9.0.x in preparation for Drupal 9’s release, but some changes like significant feature additions will be deferred to 9.1.x.). For more information see the Drupal 8 and 9 minor version schedule and the Allowed changes during the Drupal 8 and 9 release cycles.

Version: 8.9.x-dev » 9.1.x-dev

Drupal 8.9.0-beta1 was released on March 20, 2020. 8.9.x is the final, long-term support (LTS) minor release of Drupal 8, which means new developments and disruptive changes should now be targeted against the 9.1.x-dev branch. For more information see the Drupal 8 and 9 minor version schedule and the Allowed changes during the Drupal 8 and 9 release cycles.

Version: 9.1.x-dev » 9.2.x-dev

Drupal 9.1.0-alpha1 will be released the week of October 19, 2020, which means new developments and disruptive changes should now be targeted for the 9.2.x-dev branch. For more information see the Drupal 9 minor version schedule and the Allowed changes during the Drupal 9 release cycle.

Version: 9.2.x-dev » 9.3.x-dev

Drupal 9.2.0-alpha1 will be released the week of May 3, 2021, which means new developments and disruptive changes should now be targeted for the 9.3.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

catch’s picture

Title: Move the on-demand-table creation into DBTNG » Move the on-demand-table creation into the database API
catch’s picture

Version: 9.3.x-dev » 9.4.x-dev

Drupal 9.3.0-rc1 was released on November 26, 2021, which means new developments and disruptive changes should now be targeted for the 9.4.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Version: 9.4.x-dev » 9.5.x-dev

Drupal 9.4.0-alpha1 was released on May 6, 2022, which means new developments and disruptive changes should now be targeted for the 9.5.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Version: 9.5.x-dev » 10.1.x-dev

Drupal 9.5.0-beta2 and Drupal 10.0.0-beta2 were released on September 29, 2022, which means new developments and disruptive changes should now be targeted for the 10.1.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

andypost’s picture

Issue tags: +Needs reroll

Version: 10.1.x-dev » 11.x-dev

Drupal core is moving towards using a “main” branch. As an interim step, a new 11.x branch has been opened, as Drupal.org infrastructure cannot currently fully support a branch named main. New developments and disruptive changes should now be targeted for the 11.x branch, which currently accepts only minor-version allowed changes. For more information, see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Bhanu951 made their first commit to this issue’s fork.

bhanu951’s picture

Issue tags: -Needs reroll
StatusFileSize
new71.33 KB

Tried to re-roll the patch from #120 to the 11.x branch.

bhanu951’s picture

Status: Needs work » Needs review
smustgrave’s picture

Status: Needs review » Needs work

The reroll seems to have test failures.

daffie’s picture

Issue summary: View changes

I have updated the IS and created a CR.

daffie changed the visibility of the branch 2371709-move-the-on-demand-table to hidden.

daffie’s picture

Status: Needs work » Needs review

The MR is ready for review.

nicxvan’s picture

Oh, I've had my eye on this; it will make the hook_schema() deprecation much nicer.

andypost’s picture

Added a suggestion.

daffie’s picture

@andypost: Thank you for the review.

Changing the class variable to a readonly variable with a default value is not allowed in PHP. From the PHP documentation:

Specifying an explicit default value on readonly properties is not allowed, because a readonly property with a default value is essentially the same as a constant, and thus not particularly useful.

See: https://www.php.net/manual/en/language.oop5.properties.php

I have changed the variables to be of the string type.
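A minimal demonstration of the constraint quoted above (the class and property names here are illustrative, not core's actual code):

```php
<?php

final class BatchStorage {
  // Allowed: a readonly property may only be initialized from inside the
  // class, typically in the constructor.
  public readonly string $tableName;

  public function __construct(string $table_name = 'batch') {
    $this->tableName = $table_name;
  }
}

// Not allowed -- a fatal error at compile time, because a readonly property
// with a default value would effectively be a constant:
// class Broken { public readonly string $tableName = 'batch'; }

$storage = new BatchStorage();
echo $storage->tableName, "\n"; // batch
```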

catch’s picture

Out of interest why could we not use constants for this case? The classes should be able to override those.

daffie’s picture

@catch: In some cases it can be a constant; in other cases it cannot, like with Drupal\Core\Menu\MenuTreeStorage, Drupal\Core\Routing\MatcherDumper and Drupal\Core\KeyValueStore\DatabaseStorage. In those cases the table name is set in the class constructor.

mondrake’s picture

A bit late to the party... but why not use events instead of a trait? #3410480: [META] Use events in Database Schema operations

For example (conceptual, not tested):

from

  /**
   * Returns a new batch id.
   *
   * @return int
   *   A batch id.
   */
  public function getId(): int {
    $try_again = FALSE;
    try {
      // The batch table might not yet exist.
      return $this->doInsertBatchRecord();
    }
    catch (\Exception $e) {
      // If there was an exception, try to create the table.
      if (!$try_again = $this->ensureTableExists()) {
        // If the exception happened for other reason than the missing table,
        // propagate the exception.
        throw $e;
      }
    }
    // Now that the table has been created, try again if necessary.
    if ($try_again) {
      return $this->doInsertBatchRecord();
    }
  }

to

  /**
   * Returns a new batch id.
   *
   * @return int
   *   A batch id.
   */
  public function getId(): int {
    $event = new EnsureTableExistEvent(
      callable: [$this, 'doInsertBatchRecord'],
      table: static::TABLE_NAME,
      schema: $this->schemaDefinition(),
    );
    $this->connection->dispatchEvent($event);
    return $event->result();
  }

OR

from

  public function load($id) {
    // Ensure that a session is started before using the CSRF token generator.
    $this->session->start();
    try {
      $batch = $this->connection->select('batch', 'b')
        ->fields('b', ['batch'])
        ->condition('bid', $id)
        ->condition('token', $this->csrfToken->get($id))
        ->execute()
        ->fetchField();
    }
    catch (\Exception $e) {
      $this->catchException($e);
      $batch = FALSE;
    }
    if ($batch) {
      return unserialize($batch);
    }
    return FALSE;
  }

to

  public function load($id) {
    // Ensure that a session is started before using the CSRF token generator.
    $this->session->start();
    try {
      $batch = $this->connection->select('batch', 'b')
        ->fields('b', ['batch'])
        ->condition('bid', $id)
        ->condition('token', $this->csrfToken->get($id))
        ->execute()
        ->fetchField();
    }
    catch (\Exception $e) {
      $event = new TablePossiblyMissingEvent(
        exception: $e,
        table: static::TABLE_NAME,
        schema: $this->schemaDefinition(),
      );
      $this->connection->dispatchEvent($event);
      $batch = FALSE;
    }
    if ($batch) {
      return unserialize($batch);
    }
    return FALSE;
  }

EDIT: ... or, alternatively, how about a SchemaGenerator service with methods similar to the two events above?
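For illustration, the value object behind the first event in #163 might look roughly like this (entirely hypothetical; none of these names exist in core, and the real event would extend the Symfony base Event class):

```php
<?php

// Carries everything a subscriber needs: the operation to (re)try, the table
// it depends on, and the schema to create the table with if it is missing.
final class EnsureTableExistEvent {
  private mixed $result = NULL;

  public function __construct(
    private readonly \Closure $callable,
    public readonly string $table,
    public readonly array $schema,
  ) {}

  public function invokeCallable(): void {
    $this->result = ($this->callable)();
  }

  public function result(): mixed {
    return $this->result;
  }
}

// A subscriber would call invokeCallable(), catch a "table missing"
// exception, create $event->table from $event->schema, and invoke again.
$event = new EnsureTableExistEvent(fn () => 42, 'batch', ['fields' => []]);
$event->invokeCallable();
echo $event->result(), "\n"; // 42
```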

mondrake’s picture

A PoC for #163 in MR!10397.

mondrake’s picture

Hm... looking at the MR now, I'm starting to think we could simplify further and let the dynamic query layer handle this:

  public function getId(): int {
        return $this->connection->insert('batch')
          ->fields([
            'timestamp' => $this->time->getRequestTime(),
            'token' => '',
            'batch' => NULL,
          ])
          ->onFailureEnsureSchemaAndRetry([static::TABLE_NAME => $this->schemaDefinition()])
          ->execute();
  }

  public function delete($id) {
      $this->connection->delete('batch')
        ->condition('bid', $id)
        ->onFailureEnsureSchema([static::TABLE_NAME => $this->schemaDefinition()])
        ->execute();
  }
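Under the hood, a helper of this style could reduce to a retry wrapper along these lines. This is a sketch with injected callables standing in for the real connection and schema APIs; the real helper would also inspect the exception to confirm it is actually a "table missing" error:

```php
<?php

/**
 * Runs $execute; on failure, creates the given tables and retries once.
 *
 * @param callable $execute
 *   The query execution to attempt.
 * @param array $schemas
 *   Map of table name => schema definition.
 * @param callable $ensure_table
 *   Creates a table if missing; returns TRUE if it created one.
 */
function executeEnsuringSchema(callable $execute, array $schemas, callable $ensure_table): mixed {
  try {
    return $execute();
  }
  catch (\Exception $e) {
    $created = FALSE;
    foreach ($schemas as $table => $schema) {
      $created = $ensure_table($table, $schema) || $created;
    }
    if (!$created) {
      // No table was actually missing, so the failure had another cause.
      throw $e;
    }
    return $execute();
  }
}

// Simulate a query that fails until its table has been created.
$table_exists = FALSE;
$execute = function () use (&$table_exists) {
  if (!$table_exists) {
    throw new \RuntimeException('no such table: batch');
  }
  return 'ok';
};
$ensure_table = function (string $table, array $schema) use (&$table_exists): bool {
  $table_exists = TRUE;
  return TRUE;
};

echo executeEnsuringSchema($execute, ['batch' => []], $ensure_table), "\n"; // ok
```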

catch’s picture

#163-#166 are incredibly concise so if we can make those work, that is very tempting.

mondrake’s picture

#166 would have BC concerns (db drivers would have to match the behavior of the onFailureEnsureSchema() method on all classes extending from Query), so maybe for now we can settle on a proxy of that.

mondrake changed the visibility of the branch 2371709-use-events-II to hidden.

mondrake’s picture

MR!10403 is for review. It includes the framework and some conversions, but not all of them. Once/if the framework is agreed upon, we can easily complete the remaining ones.

An interesting next step would be to pass the schema as a value object instead of an array, but there are other issues for that.

daffie’s picture

Status: Needs review » Needs work

mondrake changed the visibility of the branch 2371709-lazy-table-creation-trait to hidden.

mondrake’s picture

Assigned: Unassigned » mondrake

Thanks for review @daffie! Let's continue with the new approach then. Before completing the conversions: add tests for Connection::executeEnsuringSchemaOnFailure().

mondrake’s picture

Status: Needs work » Needs review

MR!10403 is now doing #163, deprecates/converts all uses of ::ensureTableExists() and ::catchException(), and is green on all databases. A review at this stage would be helpful.

With this underlying layer in place, #166 is just a small step away, so I will push that forward.

mondrake’s picture

Assigned: mondrake » Unassigned

Flood\DatabaseBackend now implements #166, and if this is OK we need to convert the rest where possible (i.e. calls to dynamic queries extending Query).

Pausing now to let reviewing happen.

mondrake’s picture

daffie’s picture

Assigned: Unassigned » acbramley

@mondrake: It looks like a beautiful solution. My problem is that the solution will not work with MongoDB. The solution relies on the database throwing an exception when the table does not exist. However, that is not what MongoDB does: MongoDB just creates a table when one does not exist on an insert. A table in MongoDB is called a collection. It will be a table with no validation added to make sure that it only allows data in the right form. See: https://www.mongodb.com/docs/manual/reference/method/db.collection.inser....

The reason I started to work on this issue was to add a hook to allow the database driver to change the table schema. The database driver for MongoDB stores timestamps as MongoDB\BSON\UTCDateTime instead of an integer value, as is done by the core-supported databases.

If we could make it work for MongoDB, that would be great. I will need some time to think about how to make it work with MongoDB. If you have ideas or suggestions, please speak up.

daffie’s picture

Assigned: acbramley » Unassigned

Wrongly assigned the issue to @acbramley.

mondrake’s picture

@daffie well, that's why I suggest using events, that gives us the possibility to extend/override how they are processed. At least conceptually, the MongoDB driver could implement a subscriber/listener for the ExecuteMethodEnsuringSchemaEvent, make it so that it processes the event before core's SchemaRequestSubscriber does, do its alternative processing of the schema request, and stop propagation of the event so that core no longer processes it afterwards. Of course, needs to be proven in practice.
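A conceptual sketch of the subscriber described above. Every class name here is hypothetical (ExecuteMethodEnsuringSchemaEvent and SchemaRequestSubscriber only exist in the MR under discussion); this is the idea, not working code:

```php
// The MongoDB driver registers a listener with a higher priority than core's
// SchemaRequestSubscriber, handles the schema request its own way, and stops
// propagation so core never tries to issue a CREATE TABLE.
final class MongoDbSchemaRequestSubscriber implements EventSubscriberInterface {

  public static function getSubscribedEvents(): array {
    // Priority 100: runs before core's subscriber (default priority 0).
    return [
      ExecuteMethodEnsuringSchemaEvent::class => ['onSchemaRequest', 100],
    ];
  }

  public function onSchemaRequest(ExecuteMethodEnsuringSchemaEvent $event): void {
    // MongoDB creates collections implicitly on insert; instead of creating
    // a table, the driver could translate the schema, e.g. map integer
    // timestamp columns to \MongoDB\BSON\UTCDateTime handling.
    // ... driver-specific handling ...
    $event->stopPropagation();
  }

}
```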

daffie’s picture

Status: Needs review » Needs work

Version: 11.x-dev » main

Drupal core is now using the main branch as the primary development branch. New developments and disruptive changes should now be targeted to the main branch.

Read more in the announcement.

catch’s picture

Title: Move the on-demand-table creation into the database API » [PP-x] Move the on-demand-table creation into the database API
Issue summary: View changes

Adding the issue this is postponed on to the issue summary.