Problem/Motivation

I've extended the File source plugin with a custom one as I needed to join on the file_usage table in the query() method and ran into the following error when running the migration:

Migration failed with source plugin exception: SQLSTATE[23000]: Integrity constraint violation: 1052 Column 'fid' in on clause is ambiguous

SELECT f.*, map.sourceid1 AS migrate_map_sourceid1, map.source_row_status AS migrate_map_source_row_status
FROM
{file_managed} f
INNER JOIN {file_usage} fu ON f.fid = fu.fid
LEFT OUTER JOIN DB_NAME.migrate_map_MIGRATION_NAME map ON fid = map.sourceid1
...

As can be seen in my custom source plugin I've correctly give file_usage a custom alias, fu, however the migrate code doesn't seem to know what the 'file_managed' alias is in order to use it in the query.

This is a similar issue to #2987832: Term Node query Integrity constraint violation where the Term migration was missing the alias.

Steps to reproduce

Create custom source plugin and join on the file_usage table which also has the fid column.

Example of plugin:

class ExampleFile extends File {

  /**
   * {@inheritdoc}
   */
  public function query() {
    $query = parent::query();

    // Join on file_usage table to only migrate files used 
    // through certain modules.
    $query->innerJoin('file_usage', 'fu', 'f.fid = fu.fid');

    // Some custom condition.
    $query->condition('fu.module', ['example', 'example2'], 'IN');

    return $query;
  }

}

Proposed resolution

The getIds() method in the File source plugin is missing the 'alias' property for the 'fid' column, so the migrate module doesn't use the alias in the SqlBase source plugin query.

I've written a patch to add the alias to the getIds method and also added the alias to the query conditions to make sure there's no risk of them being seen as ambiguous in some other scenario (ie. if someone joins on some random table that happens to have 'uri' as a column).

I've tested this under 8.9.x so created a patch for that but I checked the File source plugin on all 9.x versions and none have the alias specified in the method so this patch could be applied there too.

Issue fork drupal-3183226

Command icon Show commands

Start within a Git clone of the project using the version control instructions.

Or, if you do not have SSH keys set up on git.drupalcode.org:

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

oldspot created an issue. See original summary.

oldspot’s picture

Attaching patch I created for this issue when running my migration which solved the error.

oldspot’s picture

Status: Active » Needs review
Matroskeen’s picture

Status: Needs review » Needs work
Issue tags: +Needs tests

I did literally exactly the same thing and encountered the same issue, so it's a huge +1 to have this fixed.

However, we should extend the test coverage to show the problem (publish a patch with test only) and then a fix to prove that it was fixed (patch with test + fix).

I'm also wondering if it makes sense to add a configurable filter by used files directly to the File source plugin. It might look like this:

if ($this->configuration['used_only']) {
  $query->innerJoin('file_usage', 'fu', 'f.fid = fu.fid');
}

I think it might be a good improvement but would like to get another opinion before creating a task.

Matroskeen’s picture

Version: 8.9.x-dev » 9.2.x-dev
Assigned: Unassigned » Matroskeen

Matroskeen’s picture

The last submitted patch, 7: 3183226-7-test_only.patch, failed testing. View results

Matroskeen’s picture

Issue tags: -Needs tests

In the first patch (and the first commit in merge request) I added a new test module with a custom plugin that extends File source plugin and joins file_usage table. (this is how we both reproduced this issue).

In the second patch there is the same test and fix from #2.

The request is ready for review, but there are still open questions:

1) As we can see, the test failed with the following message:

PHP Fatal error: Uncaught LogicException: The database connection is not serializable.

When I was debugging this on my local environment, I was able to see the actual exception message, which was the same as in the task description:

Migration failed with source plugin exception: SQLSTATE[23000]: Integrity constraint violation: 1052 Column 'fid' in on clause is ambiguous

I think it should be fixed (probably in a follow-up issue).

2) Do we want to include a configuration to include used files only in the original File source plugin?

quietone’s picture

Status: Needs review » Needs work

I have reviewed the patch and here are my thoughts.

First, looks good and the test fixes the problem in the IS. it is disappointing to have to write a test module and a Migration test to prove this fixes the error reported in the IS. That said, the test does the job.

I found the mix of names used in the test module and the migration test a bit confusing. I think if we use 'file_used' or FileUsed it will make this easier to follow and maintain. That means only the two changes below need to be made.

  • file_test_migration module -> migrate_file_used
  • MigrateFileCustomSourcePluginTest.php -> MigrateFileUsedTest.php
  1. +++ b/core/modules/file/tests/modules/file_test_migration/src/Plugin/migrate/source/d7/FileUsed.php
    @@ -0,0 +1,30 @@
    + * Drupal 7 file source from database restricted to used files only.
    

    nit, 'only' can be dropped without losing meaning.

  2. +++ b/core/modules/file/tests/modules/file_test_migration/src/Plugin/migrate/source/d7/FileUsed.php
    @@ -0,0 +1,30 @@
    +    // Join on file_usage table to only migrate files used through certain
    +    // modules.
    

    This join isn't joining on the module name so the 'through certain modules.' can be removed.

  3. +++ b/core/modules/file/tests/src/Kernel/Migrate/d7/MigrateFileCustomSourcePluginTest.php
    @@ -0,0 +1,26 @@
    + * Migrates used files in the file_managed table.
    

    This is testing the migration. So 'Tests the migration of used files.'

9.1 Yes, that error is a pain. Migrations never used to fail with the serialization error, it is a fairly recent nuisance.
9.2 I would say no, since it is not needed to migrate core. But it is still a good idea to ask the other maintainers.

Matroskeen’s picture

@quietone, thanks for the review!

It was addressed in the last commit except this change: file_test_migration module -> migrate_file_used
The idea was to create a test module that might be re-used in the future (for any other migration modifications), so I didn't want to name it specifically for this plugin. Please let me know what you think about it.

quietone’s picture

Status: Needs review » Needs work

@Matroskeen, your welcome!

And to my thoughtss...

My experience with migrate is that test modules are unique to a particular test. If migrations and/or plugins are added to this new test module sometime in the future then that changes the environment of the existing test and that should not happen. For example, if the current test included an assertion on the number of migrations discovered it would fail when a new migration was added. I know that is not the best example, but it is the principle about changing the test environment of an existing test to accommodate a different test.

And from a DX viewpoint, it is really handy to have the names in agreement. And if, say a new test migration was added to support a different test, and I was looking at the original test I would be confused why there is a second migration that is not used in the test. I would then spend time trying to find out why.

I hope that makes sense. ;-)

Matroskeen’s picture

It does make sense, but I still have mixed feelings about the test and test module. I'll try to explain why.

We already changed the test name to MigrateFileUsedTest.php and almost changed the test module name to migrate_file_used. However, we do not test the actual filter by used files here. We do not care whether only used files were migrated. What we really do here - make sure the query is not broken when we join another table with the same column names. I think it might be even more confusing for other developers.


Another thing is the issue might exist in other migrate source plugins. Here is an example: #2987832: Term Node query Integrity constraint violation.
Following this approach, we'll have to create a separate test module and extend the original query for each source plugin, which sounds like overkill. 🤷‍♂️

quietone’s picture

Right, so your thinking this is just one example of adding aliases to getids() and that there will be other cases. That would change the scope here or we need a meta to investigate the need to do that and track progress for the existing 121 source plugins. We should bring that up at a migrate meeting.

However, I still think that the names should be consistent with the purpose of one test. And as you point out the test is really about getids and not files uses do the names would more accurately names test_file_getIds, MigrateFileGetIdsTest.

Matroskeen’s picture

Status: Needs work » Needs review

I was looking for better options (trying to reduce the number of changes or make the tests more generic) but didn't find any 😔
Published one more commit to address feedback in #14.

Matroskeen’s picture

Assigned: Matroskeen » Unassigned

.

quietone’s picture

Status: Needs review » Reviewed & tested by the community

@Matroskeen, thank you for those changes!

I reviewed the patch and it fixes the problem in the IS and tests it is well. All good to go.

quietone’s picture

I wonder if this needs a change record?

alexpott’s picture

@quietone we'd need a change record if we expect people to have to make changes to existing migrations - is that the case here?

quietone’s picture

@alexpott, existing migrations are fine. This helps anyone extending the file plugin.

alexpott’s picture

Version: 9.2.x-dev » 9.1.x-dev
Status: Reviewed & tested by the community » Fixed

Committed and pushed 118b1c33a7 to 9.2.x and b999b138a8 to 9.1.x. Thanks!

  • alexpott committed 118b1c3 on 9.2.x
    Issue #3183226 by Matroskeen, oldspot, quietone: Specify table alias for...

  • alexpott committed b999b13 on 9.1.x
    Issue #3183226 by Matroskeen, oldspot, quietone: Specify table alias for...
Wim Leers’s picture

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.