Problem/Motivation

Migrate source tests are dependent on a source DB that is currently stored as a .gz file. This complicates the patch creation process.
This issue was brought up by @alexpott over in #2281393: D6->D8 Blocks - Custom titles not imported.

Proposed resolution

I discussed this issue at a Florida DrupalCamp sprint with @mrjmd, @douggreen, @thomasVI, @phenaproxima and we discussed several solutions. The consensus was that the simplest solution was to just get rid of the d6.gz file all together in lieu of a d6.sql file. Along with with, we discussed decreasing the size of the d6.gz file by only exporting the structure of certain (cache, session), tables to keep the size of the d6.sql file to a minimum. This solution will make it so the patch creating process for tests involving source DB changes will be the same as the rest of Drupal core and remove the need for a binary file in patches. @mrjmd and @douggreen are working on this patch...

As an additional step, we also discussed getting rid of the d6.sql/.gz file completely by writing a script that generates a new database from the /core/modules/migrate_drupal/src/Tests/Table/d6/ table dumps. This script will also need to generate the necessary cache (and other) tables that aren't included in the table dumps. @thomasVI and @phenaproxima are working on thTis patch.

Remaining tasks

Write the patches, test, and discuss.

User interface changes

None.

API changes

None.

CommentFileSizeAuthor
#73 2469623-73.patch625.26 KBphenaproxima
#71 2469623-71.patch612.14 KBphenaproxima
#69 interdiff-2469623-69.txt812 bytesphenaproxima
#69 2469623-69.patch625.26 KBphenaproxima
#67 interdiff-2469623-67.txt1.1 KBphenaproxima
#66 2469623-66.patch625.61 KBphenaproxima
#62 migrate-db.sh_.txt11.75 KBultimike
#62 interdiff.txt79.08 KBultimike
#62 2469623-62.patch625.33 KBultimike
#56 migrate-db.sh_.txt11.74 KBphenaproxima
#56 2469623-56.patch618.68 KBphenaproxima
#55 migrate-db.sh_.txt11.75 KBphenaproxima
#55 2469623-55.patch618.69 KBphenaproxima
#53 interdiff.txt1.37 KBamateescu
#53 2469623-53.patch618.34 KBamateescu
#50 migrate-db.sh_.txt10.27 KBchx
#49 interdiff-2469623-48-49.txt369 bytesphenaproxima
#49 2469623-49.patch659.08 KBphenaproxima
#48 interdiff-2469623-46-48.txt850 bytesphenaproxima
#48 2469623-48.patch659.1 KBphenaproxima
#46 interdiff-2469623-41-46.txt515 bytesphenaproxima
#46 2469623-46.patch659 KBphenaproxima
#41 migrate-db.sh_.txt10.03 KBphenaproxima
#41 2469623-41.patch658.84 KBphenaproxima
#39 2469623-39.patch637.25 KBphenaproxima
#37 2469623-37.patch637.09 KBphenaproxima
#36 2469623-36.patch637.11 KBphenaproxima
#35 2469623-35.patch679.22 KBphenaproxima
#32 2469623-31.patch678.72 KBphenaproxima
#30 2469623-30.patch657.54 KBphenaproxima
#29 interdiff-2469623-28-29.txt1017 bytesphenaproxima
#29 2469623-29.patch419.31 KBphenaproxima
#28 interdiff-2469623-27-28.txt3.33 KBphenaproxima
#28 2469623-28.patch419.31 KBphenaproxima
#27 interdiff-2469623-24-27.txt73.82 KBphenaproxima
#27 2469623-27.patch418 KBphenaproxima
#24 2469623-24.patch376.91 KBphenaproxima
#17 2469623-17.patch376.43 KBphenaproxima
#3 2469623-3.patch338.26 KBmrjmd
#2 2469623-2.patch1 KBultimike
#1 2469623.patch1.02 KBdouggreen
Support from Acquia helps fund testing for Drupal Acquia logo

Comments

douggreen’s picture

FileSize
1.02 KB

Attached patch writes sql file and excludes data from several cache tables. The original gz file is 211k and the new uncompressed sql file i 237k.

ultimike’s picture

FileSize
1 KB

Ok - I think this patch looks pretty good, with one small change - we think we need to keep the menu_router data (no interdiff - sorry), I've updated the patch with this change. Here's how we tested:

  • Imported the existing d6.gz DB to a fresh D6 site.
  • Ran the updated scripts to generate a new d6.sql file.
  • Imported the new d6.sql file to a fresh D6 site, confirmed it worked.
  • Ran the updated script again.
  • Verified there weren't any new table dump diffs.

Also - @mrjmd worked on this patch with both @douggreen and I and should be added to any commit message.

Thanks,
-mike

mrjmd’s picture

Issue summary: View changes
Status: Active » Needs review
FileSize
338.26 KB

The attached patch gets rid of our need for either a gzip or a sql file, by adding a new script that is the inverse of ./core/scripts/migrate-dump-d6.sh. It includes the php files for dump schema of the previously excluded tables, and data for menu_router.

Written by @phenaproxima, @thomasVI.

phenaproxima’s picture

A couple of things I'd like to do, ideally, with this patch:

  • I think the dump/import scripts should be merged into a single script that switches on a command-line option.
  • We should be able to specify the tables that should be schema-only (command-line option).
  • Right now, both scripts expect a d6_migrate DB connection to be defined in settings.php. That should be changeable with an option.

I'm sure more things will occur to me, but these are the major ideas for now.

ultimike’s picture

Ok - this is really interesting now. We've eliminated the need for the .sql dump file to be in the repo at all... Some thoughts:

  • We had to add all the D6 tables to the table dumps, but we're not dumping data for cache tables and the like.
  • The new migrate-generate-d6.sh script is the inverse of migrate-dump-d6.sh, as it reads the table dumps from /core/modules/migrate_drupal/src/Tests/Table/d6/ and created the D6 source DB.
  • migrate-generate-d6.sh uses the same $databases['d6_migrate']['default'] settings.php entry for the connection to the D6 source DB.
  • This change will take care of the main issue (binary files in the repo) and reduce the size of the code base.
  • This change will require developers who need to modify D6 (and soon D7) tests to run the new migrate-generate-d6.sh script to create the source DB.
  1. +++ b/core/scripts/migrate-dump-d6.sh
    @@ -200,4 +199,4 @@ function _db_get_query($table) {
    +// There used to be a call to mysqldump here, and I got rid of it. Hahahahahaha.
    

    Please change "Hahahahahaha" to "Mwahahaha" for more evil-ness.

  2. +++ b/core/scripts/migrate-generate-d6.sh
    @@ -0,0 +1,41 @@
    + * This script rebuilds a Drupal 6 database from the PHP files generated
    + * by migrate-dump-d6.sh. To use, create a DB connection, called d6_migrate,
    + * in your settings.php. Then run this here script. I advise you to use
    + * either a completely different database from your main DB, or prefixing.
    + * This is the inverse operation of migrate-dump-d6.sh.
    

    We should probably re-write this to be similar to the format (numbered) as the migrate-dump-d6.sh (just for consistency).

This patch also will need include the removal of the d6.gz file.

-mike

benjy’s picture

I actually wrote a script to go from PHP to DB a while back when we first moved to the automated dumps: https://www.drupal.org/files/issues/createdb.txt Not sure if that helps here.

The only thing that concerns me is how do we know that people used the Drupal 6 site to make changes to the dumps? We don't want people manually editing the dumps. Seeing the changes in the .gz file used to give us this confirmation that they knew to generate the binary file.

In general though, i'm +1 for this.

phenaproxima’s picture

At the risk of sounding ignorant, is it really a major problem if someone manually edits the dump files? As long as it's valid PHP and the data is correct, what's the difference between manually editing the files and using D6 to do it? If anyone posts a patch with modifications to the dump files, it will be way easier to review the changes and ensure they won't break anything, whereas this is impossible to do at a glance with a gzip. On the other hand, if we continued to use a gz file, we would be unable to tell if D6 was used to make the changes, or the MySQL command line.

benjy’s picture

and the data is correct

This is the tricky bit to validate without using D6.

We used to manually manipulate the dumps and it was error prone. People made changes to the dumps because it showed a bug or "seemed right" but didn't actually represent a state that Drupal 6 could get into. These were very difficult to track down when I converted into the automated dump creation or just when they arose in general.

phenaproxima’s picture

It seems to me that the real issue there is less about the way we want to deliver the dump files (PHP or gz) and more about how to validate that the dumps are safe, since neither method prevents the database from being manually mucked with. Separate issue? :-?

ultimike’s picture

I guess it comes down to whether or not we need/want an automated method for validating any changes to the dump files **and** whether or not validating the dumps is important enough to hold up this issue.

How do we validate the dumps? Would we have to create a new DB from the dumps then run some automated tests on the resulting site? Wasn't this the plot of Inception?

We used to manually manipulate the dumps and it was error prone.

- that's an understatement :\

Just throwing this out there, but is there some way we can pseduo-verify that the dumps were creating using the script using some sort of timestamp or something inserted into the dumps as a comment?

-mike

phenaproxima’s picture

Just throwing this out there, but is there some way we can pseduo-verify that the dumps were creating using the script using some sort of timestamp or something inserted into the dumps as a comment?

How about using checksums? MySQL, it seems, natively supports them: https://dev.mysql.com/doc/refman/5.0/en/checksum-table.html. We could get a checksum for each table we dump, embed it in the dump file (as a public constant on the table class, say), and then verify that checksum after creating the table.

My mind also went to the rabbit-hole solution of spinning up a D6 and automatically testing it, but then I felt the beginnings of a headache and decided it wasn't a good idea.

benjy’s picture

The timestamp/checksum is a great idea. When you're reviewing migrate patches you want an easy way to see that the patch was generated using these scripts and not via manually manipulating the dumps. The gz gave us that and so will the table checksum.

+1

phenaproxima’s picture

I'll take a stab at a checksum patch.

amateescu’s picture

It would be awesome to have this issue fixed at some point, there are a few patches in the queue that have to update the d6.gz dump and it's not an easy procedure to say the least :)

For example, the most recent ones that I know of are #2454669: SQLite: Fix tests in migrate_drupal test group and #2477853: PostgreSQL: Add support for reserved field/column names.

benjy’s picture

and it's not an easy procedure to say the least :)

Ha, you should have tried adding a CCK field by manually updating all the dump files, I think the pain that caused wounded most of us :)

But yes, would love to get this finished.

ultimike’s picture

A little birdy tells me that good progress will be made on this issue starting this week...

-mike

phenaproxima’s picture

FileSize
376.43 KB

Added MD5 hash check to migrate-dump-d6.sh and migrate-generate-d6.sh. Sorry this took so long -- life got in the way!

I decided to use an MD5 of the dump code, rather than a table checksum provided by MySQL, after discussing this with @benjy on IRC. For two big reasons:

1) I don't know if table checksumming is supported by things other than MySQL;

2) Table checksums are extremely sensitive to schema variations, and rightfully so, but migrate-dump-d6.sh does not precisely recreate the schema that a normal Drupal 6 installation will generate. @benjy tells me that this does not affect the tests (at least not yet), but it's something to be aware of. Improving the exactness of the dump files' schema is probably a good idea, but I don't know if that belongs in this issue.

benjy’s picture

diff --git a/core/modules/migrate_drupal/src/Tests/Table/d6/Cache.php b/core/modules/migrate_drupal/src/Tests/Table/d6/Cache.php
diff --git a/core/modules/migrate_drupal/src/Tests/Table/d6/CacheBlock.php b/core/modules/migrate_drupal/src/Tests/Table/d6/CacheBlock.php
diff --git a/core/modules/migrate_drupal/src/Tests/Table/d6/CacheBootstrap.php b/core/modules/migrate_drupal/src/Tests/Table/d6/CacheBootstrap.php
diff --git a/core/modules/migrate_drupal/src/Tests/Table/d6/CacheConfig.php b/core/modules/migrate_drupal/src/Tests/Table/d6/CacheConfig.php
diff --git a/core/modules/migrate_drupal/src/Tests/Table/d6/CacheContent.php b/core/modules/migrate_drupal/src/Tests/Table/d6/CacheContent.php
diff --git a/core/modules/migrate_drupal/src/Tests/Table/d6/CacheDiscovery.php b/core/modules/migrate_drupal/src/Tests/Table/d6/CacheDiscovery.php
diff --git a/core/modules/migrate_drupal/src/Tests/Table/d6/CacheFilter.php b/core/modules/migrate_drupal/src/Tests/Table/d6/CacheFilter.php
diff --git a/core/modules/migrate_drupal/src/Tests/Table/d6/CacheForm.php b/core/modules/migrate_drupal/src/Tests/Table/d6/CacheForm.php
diff --git a/core/modules/migrate_drupal/src/Tests/Table/d6/CacheMenu.php b/core/modules/migrate_drupal/src/Tests/Table/d6/CacheMenu.php
diff --git a/core/modules/migrate_drupal/src/Tests/Table/d6/CachePage.php b/core/modules/migrate_drupal/src/Tests/Table/d6/CachePage.php
diff --git a/core/modules/migrate_drupal/src/Tests/Table/d6/CacheUpdate.php b/core/modules/migrate_drupal/src/Tests/Table/d6/CacheUpdate.php
diff --git a/core/modules/migrate_drupal/src/Tests/Table/d6/Cachetags.php b/core/modules/migrate_drupal/src/Tests/Table/d6/Cachetags.php

I wonder if we need these cache tables?

Also, can we add a newline after the hash?

I'll try and give this a proper test soon.

phenaproxima’s picture

@benjy: Don't we at least need the schemas? My patch in #3 alters migrate-dump-d6.sh so that it will dump schema for those tables, but not the data.

benjy’s picture

Well we never read from those tables to do the migrations so we don't need it for that.

I'm not sure if we need it when going back from D8 -> D6, it would depend on whether D6 automatically re-creates those tables if they're missing? If it doesn't, we would need the schemas yes.

phenaproxima’s picture

@benjy: I think we'll need them because we're going to be using these dump files, rather than an SQL file, to generate new D6 DBs for people to make changes in.

ultimike’s picture

We'll have to document the process for people to generate a D6 DB from the dump files as well.

-mike

ultimike’s picture

I jumped the gun with the "Save" button on my previous post...

Would it be possible to modify the migrate-generate-d6.sh script with a "--check-md5" switch that won't actually import the DB, but just make sure all the table hashes match? This would make it easier for patch reviewers.

Also - holy menu_router table, Batman!

Finally:

+++ b/core/scripts/migrate-generate-d6.sh
@@ -0,0 +1,50 @@
+  ¶

The most inconsequential of comments: extra space.

thanks,
-mike

phenaproxima’s picture

FileSize
376.91 KB

This patch adds a --validate option, which will output the result of every table's hash check. If you omit the option, the tables will actually be imported, although the hash will still be checked and invalid tables will still be skipped.

ultimike’s picture

@phenaproxima - thanks for adding the --validate option - I think that will be really helpful.

I don't think I'm qualified to RTBC this, but it looks good to me. I think you may have missed benjy's request to add a new line after the hashes though...

So, in practice, I guess the hashes would work like this:

Scenario 1 - creating a new patch involving a change to the D6 DB tables

  1. Run the generate script to re-create the D6 DB.
  2. Make changes to the DB by clicking around in the D6 site.
  3. Run the dump script to dump the D6 DB to our table dumps.
  4. Commit changes to the repository.

Scenario 2 - reviewing a patch involving a change to the D6 DB tables.

  1. Apply the patch.
  2. Run the generate script with the --validate option to ensure the table dumps were created from actual tables and not manually.
  3. Review the patch.

Does that sound about right?

thanks,
-mike

andypost’s picture

This patch adds a --validate option

please add interdiff.txt next time, to easy follow changes

PS: review of 300k is hard

phenaproxima’s picture

Here's an updated patch with a --database option added to both scripts, allowing the user to specify which database to use instead of forcing them to use d6_migrate. (d6_migrate is the default value, but this can now be overridden). As per discussion in the Google hangout last night, we will probably end up generalizing these scripts so that they can be used to create dumps for Drupal 6 and Drupal 7, so this adds a step towards that.

This patch also adds a new line after the hash in each dump file, as per @benjy's request.

Also my first go with interdiff. Yay!

phenaproxima’s picture

This fixes a bug in the previous patch, and adds support for detecting the version of Drupal core being dumped (or restored). As far as I can tell, this makes both the dump and restore scripts D7-ready.

phenaproxima’s picture

Status: Needs review » Active
FileSize
419.31 KB
1017 bytes

This patch fixes an options-related bug in #28 and #27, so I've hidden those broken brothers. This one works, and is capable of dumping/restoring D6 and D7 databases alike, although it defaults to D6 (using the d6_migrate database). I'm not sure how this affects the work being done over in #2410625: Update D7 dumps to match the D6 dumps.

Ideally I'd like to refactor these two scripts into one, called migrate-db.sh, which combines the dump and restore functions. But in the meantime, would love to get input on this.

phenaproxima’s picture

Status: Active » Needs review
FileSize
657.54 KB

Yet another version. This patch:

  • Removes d6.gz and replaces it with dump files, at the end of which are MD5 hashes for validation.
  • Replaces migrate-dump-d6.sh and migrate-generate-d6.sh with a single script, called migrate-db.sh, which contains both functions, activated by command-line switches.
  • Adds the ability to validate the dump file hashes without importing anything (--validate option)
  • Supports D6 and D7 alike.
  • Adds --database option for specifying a particular database connection to use.

I tried to generate an interdiff, but it choked at some point; my guess is that this is caused by the presence of the binary data formerly known as d6.gz.

Status: Needs review » Needs work

The last submitted patch, 30: 2469623-30.patch, failed testing.

phenaproxima’s picture

Status: Needs work » Needs review
FileSize
678.72 KB

Apparently, git doesn't like trailing white space when applying patches. This patch should appease the testbot.

ultimike’s picture

Looks pretty good - I love the fact it is just a single script for D6/D7 and dump/restore at this point. So clean.

benjy mentioned he wanted to play with it a bit as well, so combining my comments below with his desire to check it out leads me to think it's not ready to RTBC just yet.

Are the scenarios I outlined in comment 25 above valid? Are there any others I'm not thinking of?

Some comments below.

  1. +++ b/core/modules/migrate_drupal/src/Tests/Table/d6/System.php
    @@ -284,7 +284,7 @@ public function load() {
    -      'info' => 'a:10:{s:4:"name";s:4:"Path";s:11:"description";s:28:"Allows users to rename URLs.";s:7:"package";s:15:"Core - optional";s:7:"version";s:4:"6.29";s:4:"core";s:3:"6.x";s:7:"project";s:6:"drupal";s:9:"datestamp";s:10:"1384980946";s:12:"dependencies";a:0:{}s:10:"dependents";a:0:{}s:3:"php";s:5:"4.3.5";}',
    +      'info' => 'a:10:{s:4:"name";s:4:"Path";s:11:"description";s:28:"Allows users to rename URLs.";s:7:"package";s:15:"Core - optional";s:7:"version";s:4:"6.35";s:4:"core";s:3:"6.x";s:7:"project";s:6:"drupal";s:9:"datestamp";s:10:"1426707431";s:12:"dependencies";a:0:{}s:10:"dependents";a:0:{}s:3:"php";s:5:"4.3.5";}',
    

    What's up with all these 'info' diffs that looks the same?

  2. +++ b/core/scripts/migrate-db.sh
    @@ -0,0 +1,292 @@
    + * need new Drupal data. Drupal 6 and 7 are supported by this script. The version
    

    "Drupal source data" maybe?

  3. +++ b/core/scripts/migrate-db.sh
    @@ -0,0 +1,292 @@
    + * ./core/scripts/migrate-db.sh --dump --database=CONNECTION_KEY
    

    Is there a default behavior if the user doesn't specify --dump, --restore, or --validate?

  4. +++ b/core/scripts/migrate-db.sh
    @@ -0,0 +1,292 @@
    + * ./core/scripts/migrate-db.sh --restore --source-dir=path/to/table/dumps --database=CONNECTION_KEY
    

    Why do we need to define the source directory? Can we just define the major Drupal version?

  5. +++ b/core/scripts/migrate-db.sh
    @@ -0,0 +1,292 @@
    + * ./core/scripts/migrate-db.sh --validate --source-dir=path/to/table/dumps
    

    Same question about the source as above.

  6. +++ b/core/scripts/migrate-db.sh
    @@ -0,0 +1,292 @@
    +  echo "Invalid options.\n";
    

    Ah - I see, if no --dump, --validate, or --restore, then error out.

-mike

phenaproxima’s picture

What's up with all these 'info' diffs that looks the same?

Looking at it closely, it looks like the "datestamp" item in that serialized pile of detritus looks different across diffs. Not sure if that presents a problem (I can't imagine why it would), but that seems to be the reason why those diffs are happening.

Why do we need to define the source directory? Can we just define the major Drupal version?

Just to normalize the input. If we asked for a version, someone could conceivably enter "d7", "drupal7", "Drupal 7", "7", or just about anything else. Asking for the path to the dump files removes all ambiguity. If that's too clumsy, one possibility is to define additional command-line switches to specify the major version, like this:

For Drupal 7:
migrate-db.sh --restore -7 --database=d7_migrate

For Drupal 6:
migrate-db.sh --validate -6

...which, actually, I really, really like. Will post an updated patch with that.

phenaproxima’s picture

FileSize
679.22 KB

This patch ditches the --source-dir option in favor of version options (-6 and -7). This makes it more consistent with the --dump behavior, which knows the path to the dump files anyway.

I again apologize for the lack of an interdiff. It really seems to hate binaries in patches.

phenaproxima’s picture

FileSize
637.11 KB

Whitespace errors, you will be the death of me. This applies cleanly and should pass the testbot.

phenaproxima’s picture

FileSize
637.09 KB

One more -- removed an errant mention of the defunct --source-dir option.

ultimike’s picture

Hmm - I wonder if those datestamps in the "info" field of the "system" table are going to be different for every row in the table each time someone runs the script with --dump? That would suck. Is that the timestamp the module was installed, generated on d.o., or something else?

Also - for the version arguments, it might be easier to understand with something like --version=6 or --version=7 instead of just -6 or -7.

Thoughts?

-mike

phenaproxima’s picture

FileSize
637.25 KB

Replaced -6 and -7 with --core, which accepts the values 6, 6.x, 7, and 7.x. I agree that it makes for better UX.

As for the datestamps being odd...I'm not sure what's up with that. The dump files in the patch have been spun off from d6.gz as it exists in 8.0.x.

benjy’s picture

Contains \Drupal\migrate_drupal\Tests\Dump\Cache.

The docs are missing the namespace d6/d7 from the namespace.

Could we get a version of the patch with just the script? Make it easier to review?

phenaproxima’s picture

FileSize
658.84 KB
10.03 KB

Here's an updated patch which fixes that. I'm also attaching migrate-db.sh, as contained in the patch. :)

phenaproxima’s picture

I've postponed #2410625: Update D7 dumps to match the D6 dumps until this issue is fixed, since the current version of the patch covers Drupal 6 and 7.

benjy’s picture

This is coming on really well, I love the validate option, so handy!

I tried to restore though and had a fatal error?

╰─➤  ./core/scripts/migrate-db.sh --core=6 --restore --database=d6_migrate                                                                                                                            255 ↵
PHP Fatal error:  Class 'Drupal\migrate_drupal\Tests\Dump\DrupalDumpBase' not found in /Users/benjy/Sites/d8/core/modules/migrate_drupal/src/Tests/Table/d6/Access.php on line 20
PHP Stack trace:
PHP   1. {main}() /Users/benjy/Sites/d8/core/scripts/migrate-db.sh:0
PHP   2. file_scan_directory() /Users/benjy/Sites/d8/core/scripts/migrate-db.sh:213
PHP   3. restore_table() /Users/benjy/Sites/d8/core/includes/file.inc:1074
PHP   4. require_once() /Users/benjy/Sites/d8/core/scripts/migrate-db.sh:238

Fatal error: Class 'Drupal\migrate_drupal\Tests\Dump\DrupalDumpBase' not found in /Users/benjy/Sites/d8/core/modules/migrate_drupal/src/Tests/Table/d6/Access.php on line 20

Call Stack:
    0.0021     283208   1. {main}() /Users/benjy/Sites/d8/core/scripts/migrate-db.sh:0
    0.0370    5662152   2. file_scan_directory() /Users/benjy/Sites/d8/core/scripts/migrate-db.sh:213
    0.0422    5949784   3. restore_table() /Users/benjy/Sites/d8/core/includes/file.inc:1074
    0.0423    5959360   4. require_once('/Users/benjy/Sites/d8/core/modules/migrate_drupal/src/Tests/Table/d6/Access.php') /Users/benjy/Sites/d8/core/scripts/migrate-db.sh:238
phenaproxima’s picture

I've had that too when I try to restore while migrate_drupal isn't enabled. ;)

benjy’s picture

Seems like something worth adding a check for?

phenaproxima’s picture

Point. This adds that check to the --restore mode.

benjy’s picture

Status: Needs review » Needs work
function restore_table($path) {
  require_once $path;
  $version = _get_core_version_from_options();
 
  $class = 'Drupal\migrate_drupal\Tests\Table\\' . $version . '\\' . substr(basename($path), 0, -4);
  return;
  try {
    (new $class($connection))->load();
  }

This return seems to be hindering my ability to restore? :)

Also, I don't think $connection is in scope here.

phenaproxima’s picture

Status: Needs work » Needs review
FileSize
659.1 KB
850 bytes

/me is an idiot :)

phenaproxima’s picture

FileSize
659.08 KB
369 bytes

Got rid of an errant return statement pointed out by @benjy on IRC.

chx’s picture

Status: Needs review » Reviewed & tested by the community
FileSize
10.27 KB

The patch is big and frightening but git diff HEAD --name-only|grep -v /Table/ reveals only three non-Table files changed, and if you add --diff-filter=D you will also see two of them are deleted.

The table changes surely are good since they pass tests.

So, the only file needing review is the new dumper script. For easier review I have attached it. The only worrisome code is $php = str_replace('{{PHP_VALUES}}', $values, $php); but since it is not a generic dumper script we won't have problems with {{PHP_VALUES}} in the table data. I also like the exclusive use of === without a single ==. Wish the rest of core did that.

The md5() hashing function is also fine for testing purposes -- although it is no longer used for runtime code, PHPunit mock objects also use md5().

rocketeerbkw’s picture

Status: Reviewed & tested by the community » Needs work
  1.     WHERE (`TABLE_SCHEMA` = 'd6_migrate') AND (`TABLE_NAME` = '{$table}') AND (`COLUMN_KEY` = 'PRI')
    

    Database name d6_migrate still hardcoded here

  2.     // Only dump the actual table values if we're NOT looking at a cache table,
        // watchdog, router, or sessions tables.
        if (substr($table, 0, 5) !== 'cache' && !in_array($table, array('watchdog', 'sessions'))) {
    

    Should the router table be checked for, or removed from the comment?

  3.   foreach ($tables as $table) {
        restore_table($table->uri, $connection);
      }
    

    Not skipping invalid tables anymore?

amateescu’s picture

Status: Needs work » Reviewed & tested by the community

#51.1: Fixed.
#51.2: The code does not exclude it anymore so I updated the comment.
#51.3: That's not needed because we're avoiding unneeded tables at dump time.

Setting back to RTBC because my changes are more than minimal.

I also reviewed the rewritten generate script and it looks very good. Actually, so good that at I'm going to use it as the base dump script for this critical issue: #2447573: [meta] Make sure 8.x - 8.x hook_update_N() testing is possible. Great work, @phenaproxima!

amateescu’s picture

FileSize
618.34 KB
1.37 KB

Oops, now with the patch.

amateescu’s picture

@phenaproxima pointed in out in IRC that I'm silly and got the database name wrong, he will post an updated patch :)

phenaproxima’s picture

Status: Reviewed & tested by the community » Needs review
FileSize
618.69 KB
11.75 KB

Updated patch:

1) Fixes incorrect DB name being used when querying for table schema info.
2) Restores table validation during restore, as @rocketeerbkw suggested in #51, since that's a data integrity issue.
3) Documented all the helper functions.

phenaproxima’s picture

FileSize
618.68 KB
11.74 KB

Whoops! Botched a return statement in the previous patch.

You know what they say: 56th time's the charm.

chx’s picture

Status: Needs review » Reviewed & tested by the community

Well, actually, 3rd time is a charm so let's try an RTBC for the third time :)

Status: Reviewed & tested by the community » Needs work

The last submitted patch, 56: 2469623-56.patch, failed testing.

phenaproxima queued 56: 2469623-56.patch for re-testing.

phenaproxima’s picture

Status: Needs work » Reviewed & tested by the community

Setting this back to RTBC after testbot failure.

ultimike’s picture

Status: Reviewed & tested by the community » Needs review

I found a small bug in the new script with the --validate parameter - interdiff and patch included. Here's what I did:

Applied the patch from 56.
Restored the D6 DB.
I didn't make any changes and I dumped the DB using the script. This resulted in a bunch of changes from a doc typo ("cores" -> "core") and a field int length(?). The bug fix in the interdiff is at the very bottom:

diff --git a/core/scripts/migrate-db.sh b/core/scripts/migrate-db.sh
index 6b104cf..e59cb49 100755
--- a/core/scripts/migrate-db.sh
+++ b/core/scripts/migrate-db.sh
@@ -238,7 +238,7 @@ public function load() {
     $tables_dir = DRUPAL_ROOT . '/core/modules/migrate_drupal/src/Tests/Table/' . $version;
     $tables = file_scan_directory($tables_dir, '/.php$/', array('recurse' => FALSE));
     foreach ($tables as $table) {
-      echo (table_is_valid($path) ? 'OK' : 'INVALID') . ": {$path}\n";
+      echo (table_is_valid($table->uri) ? 'OK' : 'INVALID') . ": {$table->uri}\n";
     }
   }
   else {

I also spoke with alexpott about this issue, and he likes everything about it but would like to see a phpunit test built for the --validate option.

-mike

ultimike’s picture

FileSize
625.33 KB
79.08 KB
11.75 KB

Ah bugger - I forgot to add the diffs to the last comment...

-mike

alexpott’s picture

Status: Needs review » Needs work

We can add a PHPUnit test to ensure that file and hash are correct. I'm very +1 to removing the .gz - gonna be way nicer when reviewing patches that change stuff.

vendion’s picture

The current state of the database script both UID 0 and UID 1 do not get created when running the restore function. Is there a reason for this? It seems like if we would need to login with admin access to update data for tests this would create a hurdle.

It was noticed that in the current migrate-db.sh script the _db_get_query() function explicitly ignores UID 0 and 1.

benjy’s picture

Now symfony/console is going into core as a dependency for the upgrade script, maybe we should use that instead: https://www.drupal.org/node/2447573#comment-9911479

It might make testing a bit easier with their CommandTester.

phenaproxima’s picture

Status: Needs work » Needs review
FileSize
625.61 KB

Added a unit test of migrate-db.sh's validation procedure. See core/modules/migrate_drupal/tests/src/Unit/MigrateDBTableValidationTest.php. No changes were made to migrate-db.sh since #62, so I'm not bothering to attach it again.

phenaproxima’s picture

FileSize
1.1 KB

At @alexpott's request, posting the unit test by itself, since nothing else has been changed in the previous patch.

amateescu’s picture

@benjy, whether we can use symfony console or not has not been decided yet, and that issue will refactor the script added here anyway, so I suggest to move forward with the current patch.

phenaproxima’s picture

FileSize
625.26 KB
812 bytes

Changed the test as per @alexpott's request. It will now actually validate all table dumps so that contributors will know, at patch submission time, if they screwed up by manually editing a dump file.

The interdiff is only a renamed copy of the test itself, since nothing else has changed.

Status: Needs review » Needs work

The last submitted patch, 69: 2469623-69.patch, failed testing.

phenaproxima’s picture

Status: Needs work » Needs review
FileSize
612.14 KB

This patch regenerates the table dumps so the tests will pass. Beyond that, there are no changes to any code.

Status: Needs review » Needs work

The last submitted patch, 71: 2469623-71.patch, failed testing.

phenaproxima’s picture

Status: Needs work » Needs review
FileSize
625.26 KB

Re-rolling yet again to account for DB changes made in ultimike's patch in #62.

benjy’s picture

Does the test just copy the logic from the script to show that it works? I don't see the point in that.

Also, in the patch in #73, I can't see MigrateDBTableValidationTest at all?

phenaproxima’s picture

@benjy, it does copy the script's logic. But not to prove that it works, but rather to automatically test all the dumps at patch submission time. Let testbot help us, I say; CI for the win!

As for the name, it was changed to MigrateTableDumpTest. It's at the very top of the patch. :)

benjy’s picture

Status: Needs review » Reviewed & tested by the community

Ah ok, validating the hash in a test makes sense, nice call! RTBC again for me.

One small improvement could be a message on the assertion, rather than having hash x does not equal hash y. It could be something like, Table x looks like it has been changed manually and the hash is outdated?

phenaproxima’s picture

It sets the path of the dump file as the assert message for that exact reason.

benjy’s picture

Ah great, I was still looking at the patch in #67

ultimike’s picture

RTBC +1 on this!

-mike

kdechant’s picture

I've long wanted a DB fixture system for use with Drupal automated tests (e.g., for custom modules). The script migrate_db.sh (and the files it outputs to "src/Tests/Table") is close to what I had in mind. Would it make sense to (eventually) convert this into a general-purpose fixture generator that isn't specific to Migrate?

benjy’s picture

@kdechant, i've actually used the existing script in core for such a purpose, i used it on the iFrame example module: https://www.drupal.org/sandbox/benjy/2422307

alexpott’s picture

Status: Reviewed & tested by the community » Fixed
Related issues: +#2447573: [meta] Make sure 8.x - 8.x hook_update_N() testing is possible

re #80 that is exactly what @amateescu is planning to do on #2447573: [meta] Make sure 8.x - 8.x hook_update_N() testing is possible.

I think that Drupal\migrate_drupal\Tests\MigrateTableDumpTest could be a PHPUnit test - but let's do that in a followup.

Committed 78cc7d8 and pushed to 8.0.x. Thanks!

  • alexpott committed 78cc7d8 on 8.0.x
    Issue #2469623 by phenaproxima, ultimike, amateescu, mrjmd, douggreen,...

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.