Support for Drupal 7 is ending on 5 January 2025—it’s time to migrate to Drupal 10! Learn about the many benefits of Drupal 10 and find migration tools in our resource center.
Problem/Motivation
@todo
Proposed resolution
Remaining tasks
User interface changes
API changes
Data model changes
Release notes snippet
Comment | File | Size | Author |
---|---|---|---|
#41 | core-derive_path_alias_migrations-3122649-41.patch | 32.19 KB | Wim Leers |
#41 | interdiff.txt | 1.12 KB | Wim Leers |
#40 | 3122649-40.patch | 31.61 KB | ravi.shankar |
#38 | core-derive_path_alias_migrations-3122649-38.patch | 31.6 KB | Wim Leers |
#37 | core-derive_path_alias_migrations-3122649-37.patch | 33.56 KB | Wim Leers |
Comments
Comment #2
huzookaComment #3
huzookaComment #4
Wim Leers🤔 I was really confused by what
$optional_migration_dependencies
was for.But I think I get it now!
It's about the optional migration dependencies for the
d*_url_alias
migration's "catch-all" derivative.I think a variable name like
$other_entity_destination_migrations
would have made this a lot clearer.🤔 This comment talks about required dependencies, but then the code it comments does not do anything with required dependencies?
🤔 I find this pretty difficult to understand.
Why only
entity
andentity_complete
? Why notnode
andnode_complete
? And isn't itd7_node_complete
for examle?The first comment belongs on the second if-statement.
The first if-statement needs a comment like
// Only modify migrations with an entity destination.
👍🤣 Imagine, aliases for aliases!
👍 I was gonna ask:
But this is much more robust! 👏
Rather than this comment, I think you could write
🐛
d6_url_migration
→d6_url_alias migration
👍 This matches the logic of
\Drupal\jsonapi\ResourceType\ResourceTypeRepository::isLocatableResourceType()
.When can this happen?
Why is it okay to return early?
This needs some documentation :)
🤓 Let's make this a class constant.
This comment is pointing out something pretty obvious — this can be removed?
😍👏 Very elegant!
s/that's/whose/
80 cols 🤓
😍👏
🤔 Couldn't this result in hundreds or even tens of thousands of values in that
IN
expression in the SQL query? That query would fail.I did not expect this would change the number of
path_alias
entities. This sounds like a bug? 🤔Comment #5
Wim LeersEverything in #4 is superficial, except for points 17 and 18. I wonder if point 18 is caused by point 3.
Comment #6
Wim LeersI think a label like
Non-entity path aliases
would make more sense.Comment #7
huzookaRe #6:
Marginal note: If the taxonomy module is not installed on the destination site, we won't create term specific derivatives, but the Other derivative will try to migrate the term aliases from the source site.
Comment #8
Wim LeersI found and fixed a dependency problem. :)
Comment #10
huzookaAddressed most of the feedbacks in #4; except of #4.18.
#4.1: variable renamed 🙂
#4.2: I hope that the new comment is clearer now.
#4.3: Well, we don't have migration source plugins like
node
ornode_complete
. I check the destination plugin here.d7_node_complete
is a migration plugin ID.#4.4: Fixed.
#4.5, #4.6: 😊
#4.7, #4.8: Fixed
#4.10: Hoping that the new line makes more sense...
#4.11: I made a new trait where this is a protected property. May it work?
#4.12: Removed.
#4.13: Thanks! 🙂
#4.14, #4.15: Fixed
#4.16: 😊
#4.17: I fixed this by the help of a new trait that provides the base for both the Deriver class and the migration source plugin.
#4.18: I don't know why this is happening, we have to check this. The Drupal 7 database fixture contains only 6 rows...
Comment #11
Wim Leers👏
Comment #12
huzookaOnly one failing test expected...
Comment #13
huzookaComment #15
Wim LeersThis rebases the patch on latest
9.0.x
and massively improves query performance.For a site with 28K nodes, 50K path aliases and 7 node types, this makes the computing of the per-node-bundle derivatives
d7_url_alias:node:*
go down from 61 seconds to 20 milliseconds — or about 3000 times faster!Comment #17
Wim LeersWith #15 solved,
\Drupal\path\Plugin\migrate\source\d7\UrlAlias::count()
was still super slow. And in the project I'm working on, we're hitting that all the time.But …
public function count($refresh = FALSE)
is the signature. The refresh is optional. And if the source DB doesn't change after discovery time, then the counts stay the same. Clearing all caches after refreshing the source DB is no big deal obviously.So … YAY, rather than doing expensive queries and trying to optimize those (which I first did: I was bringing the same more efficient query pattern from #15 to
\Drupal\path\Plugin\migrate\source\d7\UrlAlias::query()
), we can just use the numbers at query time! Much faster!Comment #19
Wim LeersOops.
Comment #21
Wim LeersWell, #17 made
::count()
run in constant time, but we still need to speed up::query()
since that's actively used by the migration system also.Applying a similar query optimization as in #17.
Comment #23
Wim Leers@huzooka queued PostgreSQL & SQLite test runs of #21, those show that the migrations are now failing on those databases. This is due to the use of
SUBSTRING_INDEX()
, which is MySQL-specific. Figuring out an alternative query … 🤔Comment #24
Wim LeersThis should make the patch also work on SQLite.
Comment #25
Wim Leers😬 Forgot to
git add
a few bits 🙈Comment #28
Wim Leers@huzooka rightfully pointed out in chat to me just now that it's wrong to assume the first link template is the canonical one.
However, for the currently supported entity types (
Node
andTerm
), this assumption holds true 🙈🤓Whenever we expand this to support more entity types (or before this patch gets committed to Drupal core), we'd need to fix this.
Comment #29
huzookaNitpick: We are actively using the patches here, but it seems that no one has any idea what "Non-entity URL aliases" are. So I'll change the label of the remaining aliases' migration to 'URL aliases (remaining)' (as requested).
Comment #30
huzookaComment #32
Wim LeersThis introduced a regression.
It only counts the unique number of terms of a vocabulary that have an URL alias.
That means it fails to count every URL alias for terms in a given vocabulary.
In other words: it always counts a single URL alias per term, even if a term has lots of historical path aliases.
The consequence when combined with the
__count_at_discovery_time
optimization is that the reported total count can be lower than the imported count! 🙃🙈This interdiff fixes that, and reverts a bunch of the now obsolete changes that I introduced in #15.
Comment #34
Wim LeersWe should also skip
redirect
migrations (for https://www.drupal.org/project/redirect).Comment #37
Wim Leers#17 was true at the time but now with #3190815: Source count caching broken: impossible to enable source count caching for SqlBase-based source plugins (plus, unneeded cache I/O), a better/simpler approach is possible: let the migration system's built-in source count caching take care of this!
Comment #38
Wim LeersOops. That included a
*.orig
file 🙈The patch in #37 is incorrect. The interdiff is correct though.
Comment #39
Wim LeersAh, apparently this conflicts with #3096951: d7_node migration should have dependency on d7_node_title_label migration 😬Which the migration maintainers have shot down. Ah well, then there won't be a patch that applies.
Comment #40
ravi.shankar CreditAttribution: ravi.shankar at OpenSense Labs commentedAdded reroll of patch #38.
Comment #41
Wim LeersThe patches so far generate optional dependencies on
d7_shortcut
,d7_custom_block
, et cetera. This does not make sense, because they only have admin-facing URLs.