In #2121299: Migrate in Core: Drupal 6 to Drupal 8 (the web tests) we introduced a mechanic where you pass file paths to a prepare() method, to specify the dumps to load.
Imo this is not the most elegant way to do this.

Some proposed refactoring:

  1. Pass around objects instead of file paths. The IDE will love us back with type checking and refactoring options.
  2. More meaningful method names: loadDrupal6Dump() instead of prepare().
  3. Composition and method arguments over inheritance:
    Currently every Dump class inherits from a base class, which receives the database in the constructor.
    I propose to instead pass the database (or rather, a database wrapper with additional methods) to the load() method, and let the Dump classes only implement an interface.
  4. The Drupal6DbWrapper also gives us a dedicated place for the "migrateTables" array that prevents tables from being created twice.
  5. Clarify what is for Drupal in general and what is for Drupal 6 only.

Patch will clarify.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

donquixote’s picture

Component: other » migration system
Status: Active » Needs review
FileSize
208.61 KB

Local testing showed a hickup in MigrateCommentVariableInstance, but let's see what testbot says.

sun’s picture

+++ b/core/modules/migrate_drupal/lib/Drupal/migrate_drupal/Tests/Dump/Drupal6ActionSettings.php
@@ -7,17 +7,19 @@
+  public function load(Drupal6DbWrapper $dbWrapper) {
+    $dbWrapper->createTable('variable');
+    $dbWrapper->getDbConnection()->insert('variable')->fields(array(

Could we...

1) rename the argument to $database?

2) rename getDbConnection() to getConnection()?

It's also a bit weird that insert() is located on a connection, but createTable() is not...?

Normally, it's Database::getConnection()->schema()->createTable()

i.e., createTable() lives on schema(), and both schema() + insert() lives on a $connection.

donquixote’s picture

1) rename the argument to $database?

This could be misleading. What does $database mean these days? A db connection? Or something else?

Normally, it's Database::getConnection()->schema()->createTable()

Unlike schema()->createTable(), our wrapper method createTable() does not complain if the same table is added twice.
Maybe it should be renamed createTableOnce(), similar to include_once.
I did not invent this, it is from Drupal6DumpBase::createTable().

We could simply add one or two more shortcut methods on the wrapper, since most of the time it is just insert().
(and sometimes merge())

donquixote’s picture

In fact, a lot of it is insert('variable'), which could be short-cutted to $dbWrapper->variableSet($key, $value) or $dbWrapper->variableSetMultiple(array($key => $value)).

dawehner’s picture

I do agree with some that getDbConnection is kinda off, given that drupal is really written with the database in mind at the moment.

 /**
  * Base class for the dump classes.
  */
-class Drupal6DumpBase {
+class Drupal6DbWrapper {

Let's please adapt the documentation, given how often this class is used ...

Unlike schema()->createTable(), our wrapper method createTable() does not complain if the same table is added twice.

We could also just go with ensureTable

I wonder whether we could split the refactoring into smaller pieces. For example the introduction of MigrateDrupal6TestBase does not seem to be required for the other bit

donquixote’s picture

I do agree with some that getDbConnection is kinda off, given that drupal is really written with the database in mind at the moment.

I am fine with getConnection() then..

Let's please adapt the documentation, given how often this class is used ...

Yes.

We could also just go with ensureTable

This leads to the question whether we still want a want a wrapper, or rather pass around the db connection itself.

I wonder whether we could split the refactoring into smaller pieces. For example the introduction of MigrateDrupal6TestBase does not seem to be required for the other bit.

This would give us one big but repetitive part, and one smaller part. I am ok with it.

On a more general level, we can wonder if anyone has "bigger plans" for this system.
The change I suggested here is something that seems reasonable without looking too much at the details.

donquixote’s picture

I wonder whether we could split the refactoring into smaller pieces. For example the introduction of MigrateDrupal6TestBase does not seem to be required for the other bit

The only reason of existence for this class is
- to create and hold an instance of the Drupal6DbWrapper.
- to pass the Drupal6DbWrapper instance around to dump objects.
- to provide methods loadDrupal6Dump() and loadDrupal6Dumps() to load the dumps using the Drupal6DbWrapper.

So, creating an empty MigrateDrupal6TestBase as a separate commit, without any Drupal6DbWrapper, seems a bit pointless..

donquixote’s picture

I am tempted to introduce variableSet() and variableSetMultiple() on the db wrapper class.
I already started to replace some of the insert('variable') queries with these shortcuts, and it looks great.

However, now this makes me think: Maybe these dumps were actually auto-generated in some way? In this case, the variableSet() would actually make our life more difficult instead of easier.

donquixote’s picture

Some renames and comment improvements since #1, as suggested by sun and dawehner.

  • getDbConnection() renamed to getConnection()
  • $database renamed to $connection (parameter and object property)
  • createTable() renamed to ensureTable()
  • different logic in ensureTable(), eliminating the need for a migrateTables variable in Drupal6DbWrapper.
  • Comments improved.

Not changed:

  • $dbWrapper parameter on Dump classes is still named $dbWrapper, but I am open to suggestions.
    (but I am still sceptical about naming it $database, since this is often used for a db connection.)

Preview, but probably better off in a follow-up:

  • Introducing variableSet() and variableSetMultiple() on Drupal6DbWrapper, as a shortcut for getConnection()->insert('variable')->..
  • Replacing some getConnection()->insert('variable')->.. with variableSet() and variableSetMultiple().

The latter shows that variableSet() and variableSetMultiple() can make the Dump classes easier to read.
On the other hand, they become harder to create with the core/scripts/generate-d6-content.sh script.
So I'd say do this in a follow-up, which may include some changes to this script. Or even a different version of the script.

EDIT:

  • The first patch ("--history") contains separate commits to show the changes from #1.
    The last two commits introduce the variableSet() stuff, and use it to replace some of the insert('variable') queries. It does not replace all those queries, it is rather meant as a preview.
  • The second patch ("--squashed") is similar to the first, only that all but the last 2 commits are squashed into one.
  • The third patch ("--noVariableSet") is like the second, but without the two commits that introduce variableSet().
mikeryan’s picture

Status: Needs review » Needs work

Nearly a year later - definitely some major rerolling necessary.

mikeryan’s picture

Status: Needs work » Closed (duplicate)
Related issues: +#2550291: Improve and generalize database dump tools

A lot of water under the bridge since this was opened - plenty of changes to dumps made over that time, and a major refactoring is on the way at #2550291: Improve and generalize database dump tools, we expect that issue should address all remaining concerns (feel free to weigh in over there if you spot any holes).

@donquixote: Sorry we never got back to your patch when it would have helped.