A Migrate plugin that could convert basic <a href="node/123"> links into the required format would be really useful, both because lots of sites used that structure and because it could serve as an example for people building custom plugins for their own scenarios.

It would need to replace <a href="node/123"> links with the following structure:
<a data-entity-substitution="canonical" data-entity-type="node" data-entity-uuid="f52dbbd3-17cb-4d7f-8d65-f136645298db" href="/node/123">

Comments

DamienMcKenna created an issue. See original summary.

damienmckenna’s picture

Status: Active » Needs review
Status | Size
new | 3.79 KB

This appears to work fine.

Before:
<a href="node/123">

After:
<a data-entity-substitution="canonical" data-entity-type="node" data-entity-uuid="href="78645291-82d5-475c-8a01-32beb27ebf92" href="/node/123">

This is how you use it:

   field_body/value:
     plugin: linkit_links
     source: body
damienmckenna’s picture

Status | Size
new | 3.79 KB
new | 760 bytes

Doh! The output should now be like this:

<a data-entity-substitution="canonical" data-entity-type="node" data-entity-uuid="78645291-82d5-475c-8a01-32beb27ebf92" href="/node/123">

damienmckenna’s picture

Right now the plugin only handles node links, but it could be easily expanded to handle other entities.

damienmckenna’s picture

Status | Size
new | 4.67 KB
new | 4.08 KB

After running the migration with the patch above I discovered there were a dozen links which had attributes after the href attribute, so searching for a complete tag didn't work.
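For what it's worth, one way to tolerate attributes on either side of href is to match the whole opening tag and rebuild it. Below is a minimal sketch of that string handling, written in Python only to illustrate the idea. It is not the code from the patch, and `convert_links` and the `uuid_by_nid` mapping are hypothetical stand-ins for the plugin's transform step and the entity lookup.

```python
import re

# Opening <a> tag whose href is a bare node path ("node/123" or "/node/123"),
# with any other attributes before or after the href.
NODE_LINK = re.compile(
    r'<a\b(?P<before>[^>]*?)\shref="/?node/(?P<nid>\d+)"(?P<after>[^>]*)>',
    re.IGNORECASE,
)


def convert_links(html, uuid_by_nid):
    """Rewrite node links to the Linkit format; leave unknown nodes untouched.

    uuid_by_nid is a hypothetical {node ID: UUID} mapping standing in for
    an entity lookup on the destination site.
    """

    def rewrite(match):
        tag = match.group(0)
        nid = match.group("nid")
        if "data-entity-uuid" in tag:
            return tag  # already converted, so do not add the attributes twice
        uuid = uuid_by_nid.get(int(nid))
        if uuid is None:
            return tag  # target not available (for example, not migrated yet)
        before, after = match.group("before"), match.group("after")
        return (
            '<a data-entity-substitution="canonical" data-entity-type="node" '
            f'data-entity-uuid="{uuid}"{before} href="/node/{nid}"{after}>'
        )

    return NODE_LINK.sub(rewrite, html)
```

A regex like this still breaks on attribute values that contain ">", so it is only a sketch of the matching logic, not a replacement for proper DOM parsing.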

wim leers’s picture

Issue tags: +migrate-d7-d8
wim leers’s picture

  1. +++ b/src/Plugin/migrate/process/LinkitLinks.php
    @@ -0,0 +1,133 @@
    + * Also converts "node/123" links from D6/7 to the expected "/node/123" format.
    

    What about links that linkit generated that use path aliases?

  2. +++ b/src/Plugin/migrate/process/LinkitLinks.php
    @@ -0,0 +1,133 @@
    + * Note: this will skip rows where the entity cannot be loaded. This will happen
    + * if the entity doesn't exist, the primary reason being that it has not been
    + * migrated yet. As such, it may be necessary to run any migration this process
    + * is added to twice, once to pass dependencies for other migrations, and a
    + * second time when the entities have been loaded. YMMV.
    

    This is very problematic :| But I don't see an elegant solution either.

    If you have a foo node that is referring to another foo node, it's impossible to know the UUID of the referred node until it's migrated, but it could be migrated in the reverse order.

    Even if you could order this "correctly", it would still not solve circular references, which are definitely allowed in text fields.

    Tricky.

    I wonder if the migration system maintainers have elegant solutions for this. Basically this process plugin can only be reliably run after the migration has already run once. It's almost like a "post-process" phase.

    The only thing I can think of is to make \Drupal\migrate\Plugin\MigrateProcessInterface::transform() track unresolvable links (time A), and create a new/separate linkit_post_processor migration whose rows are effectively the ones tracked at time A, so that while the migration at time A is running, the number of rows to process is literally generated during the migration.

    Then at a later time B, we could in principle reliably update all of the tracked rows!

    (Yes, I'm handwaving a lot here. I'm hoping this will trigger a response from one of the migration system maintainers pointing out a much better way to do this 😄)
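To make the "time A / time B" idea above concrete, here is a rough sketch of the bookkeeping, in Python purely for illustration. None of this is Migrate API code: `convert`, `first_pass`, `post_process` and `make_uuid` are hypothetical names, and `rows` is just a dict of node ID to body HTML. It assumes a row's UUID only becomes known once that row has been imported.

```python
import re

# Opening <a> tag with a bare node href; other attributes may surround it.
NODE_LINK = re.compile(
    r'<a\b(?P<before>[^>]*?)\shref="/?node/(?P<nid>\d+)"(?P<after>[^>]*)>'
)


def convert(html, uuid_by_nid):
    """Return (html, unresolved_nids) after converting every resolvable link."""
    unresolved = set()

    def rewrite(match):
        tag, nid = match.group(0), int(match.group("nid"))
        if "data-entity-uuid" in tag:
            return tag  # converted in an earlier pass
        uuid = uuid_by_nid.get(nid)
        if uuid is None:
            unresolved.add(nid)  # target not imported yet, so track it
            return tag
        before, after = match.group("before"), match.group("after")
        return (
            '<a data-entity-substitution="canonical" data-entity-type="node" '
            f'data-entity-uuid="{uuid}"{before} href="/node/{nid}"{after}>'
        )

    return NODE_LINK.sub(rewrite, html), unresolved


def first_pass(rows, make_uuid):
    """Time A: import rows in order and record rows with unresolved links."""
    uuid_by_nid, pending = {}, []
    for nid in list(rows):
        rows[nid], unresolved = convert(rows[nid], uuid_by_nid)
        uuid_by_nid[nid] = make_uuid(nid)  # UUID exists only after import
        if unresolved:
            pending.append(nid)
    return uuid_by_nid, pending


def post_process(rows, pending, uuid_by_nid):
    """Time B: revisit only the tracked rows; return those still unresolved."""
    still_pending = []
    for nid in pending:
        rows[nid], unresolved = convert(rows[nid], uuid_by_nid)
        if unresolved:
            still_pending.append(nid)
    return still_pending
```

With two nodes that link to each other, the first pass can only resolve one direction; the second pass picks up the other, and anything that points at a node that never arrives simply stays in the pending list.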

joel_osc’s picture

We do the post-process as you described... in the source plugin for your post-processor you select all nodes inner joined with your migrate map table, and then you convert links to linkit format. It gets even more fun coming from a non-Drupal system, where you may be converting non-Drupal a tags to linkit links that could have image assets inside of the tag that get converted to media entities. ;)

mstrelan’s picture

+ * Because the plugin doesn't match links which have extra attributes before the
+ * "href" attribute, it might be worth testing the migration first, seeing what
+ * items are missed, and then adding extra str_replace commands first.

It seems this would be better off extending the dom_str_replace plugin in migrate_plus.

joel_osc wrote: "We do the post-process as you described... in your source plugin for your post-processor you select all nodes inner joined with your migrate map table. And then you convert links to linkit format."

Care to share example code?

wim leers wrote: "What about links that linkit generated that use path aliases?"

This would be good to have. You could do a lookup on the path_alias table, but would you want to do this for all links or just those with the /node/123 pattern? Also, the aliases may not exist yet (or at all) on the destination site.
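A small sketch of what that lookup could look like, in Python for illustration only. The `alias_to_path` dict is a hypothetical stand-in for whatever alias data is available (for example rows read from the path_alias table), and `resolve_node_id` is a made-up helper name. Links whose alias is unknown return None, so they can be left alone or retried later.

```python
import re

NODE_PATH = re.compile(r"^/node/(\d+)$")


def resolve_node_id(href, alias_to_path):
    """Return the node ID an href points at, following an alias if known.

    alias_to_path maps aliases such as "/about-us" to system paths such as
    "/node/123". Returns None when the href is neither a node path nor a
    known alias of one.
    """
    path = "/" + href.lstrip("/")         # treat "node/5" like "/node/5"
    path = alias_to_path.get(path, path)  # swap a known alias for its path
    match = NODE_PATH.match(path)
    return int(match.group(1)) if match else None
```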

It almost seems it would be better to develop this as a standalone batch task that can be run at any time.

damienmckenna’s picture

Title: Migrate process plugin to convert basic "node/123" links to the correct format » Example migrate process plugin to convert basic "node/123" links to the correct format

+++ b/src/Plugin/migrate/process/LinkitLinks.php
@@ -0,0 +1,133 @@
+ * Also converts "node/123" links from D6/7 to the expected "/node/123" format.

wim leers wrote: "What about links that linkit generated that use path aliases?"

I wasn't looking at that; I thought bundling a simple optional plugin could be beneficial as a learning exercise for people who wanted to map custom HTML. Properly mapping Linkit links from D7 to D9 should be handled separately.

liam morland’s picture

I noticed the examples in this issue all show the site being at the root of its server. This solution should also work with sites that are not at the root, for example: <a href="/site-path/node/123">
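As a rough illustration of the matching this would need (Python, just to show the pattern, not plugin code): the helper below accepts a relative "node/123" or an absolute path under a configurable base path. `node_id_from_href` and its `base_path` parameter are hypothetical names, not something the patch provides.

```python
import re


def node_id_from_href(href, base_path="/"):
    """Return the node ID for "node/N" or "<base_path>node/N", else None.

    base_path is the site's path on the server, e.g. "/" or "/site-path/".
    """
    trimmed = base_path.strip("/")
    base = "/" + trimmed + "/" if trimmed else "/"
    # The base path is optional so relative links ("node/123") still match.
    pattern = r"^(?:%s)?node/(\d+)$" % re.escape(base)
    match = re.match(pattern, href)
    return int(match.group(1)) if match else None
```

In this sketch an absolute link that does not start with the configured base path is treated as pointing outside the site and is left unmatched.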

liam morland’s picture

Is there a reason site paths are included at all? See related issue.

damienmckenna’s picture

Status: Needs review » Needs work

This needs work to make it cover additional scenarios.

damienmckenna’s picture

Version: 8.x-5.x-dev » 6.0.x-dev
Status: Needs work » Needs review
Status | Size
new | 830 bytes
new | 4.79 KB

This improves the documentation.

damienmckenna’s picture

Status: Needs review » Needs work

Back to "needs work" per #9, #11, etc.

mark_fullmer’s picture

Version: 6.0.x-dev » 7.x-dev
dalemoore’s picture

This hasn't had any traffic in over a year, but I'm running into the same issue. I figured out how to create a migration to fix links that broke when the node IDs/aliases changed during a migration a former coworker did, but the links are converted to the node/ID format, and Linkit doesn't seem to turn them into their nicer-looking aliases without those extra data attributes. I thought that resaving the pages would make it kick in, but I guess it doesn't without those attributes.