As part of a presentation, Jonathan Franks created a PDF parser plugin for Migrate Plus. I think it would be a great addition to the Migrate Plus plugins: https://github.com/jonathanfranks/d8migrate/tree/master/web/modules/cust...
Attaching a sample module, to speed up testing the plugin (rename from migrate_pdfs.tar_.gz to migrate_pdfs.tar.gz after download).


Comments

ressa created an issue.

heddn

We'd need to add a suggestion to the project's composer.json and check in the process plugin's constructor whether the parser class is available. Plus tests. But great suggestion.

ressa

Sounds great. I have alerted the author to this issue, to hear whether he wants to create the plugin himself.

ressa

Here is a first patch. It has no tests yet; I hope somebody else can help out with that.

ressa

Status: Active » Needs review

Status: Needs review » Needs work

The last submitted patch, 4: migrate_plus_pdf_parser_plugin-3019758-4.patch, failed testing.

ressa

Issue summary updated.
File size: 187.35 KB

Sample module, to speed up testing the plugin (rename from migrate_pdfs.tar_.gz to migrate_pdfs.tar.gz after download).

ressa

Status: Needs work » Active

Can somebody else help with debugging why the patch fails? It works fine locally ...

heddn

Status: Active » Needs work

You can't patch a composer.json file and then assume the new requirements have been downloaded. But that also brings up a good point: we need to add the package as a suggestion.
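For illustration, a suggestion entry in the project's composer.json might look like the fragment below. The package name comes from the error message in the patch. The description text is only a placeholder, not wording anyone in this issue has agreed on.

```json
{
    "suggest": {
        "smalot/pdfparser": "Needed to use the parse_pdf process plugin."
    }
}
```

Composer only displays suggested packages; it does not install them, so the plugin still has to check for the class at runtime.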

+++ b/src/Plugin/migrate/process/ParsePDF.php
@@ -0,0 +1,51 @@
+    if (!class_exists('\Smalot\PdfParser\Parser')) {
+      echo "You need to install the smalot/pdfparser package with composer require 'smalot/pdfparser' to use parse_pdf.\n";
+      echo "Reset you migration with drush migrate:reset-status migration_id.\n";
+      exit();
+    }

Put some of this into the constructor. And see if we can somehow make plugin discovery fail silently when the PDF parser code isn't available.
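A minimal sketch of what a constructor-level check could look like, using the class name from the patch. It is written as plain PHP so it runs without Drupal. `ParsePdfSketch` and its `parserClass()` accessor are hypothetical stand-ins, not Migrate Plus code, and `\RuntimeException` is only a placeholder for whatever exception the real plugin should throw. It replaces the `echo` and `exit()` calls with an exception so the caller decides how to report the failure.

```php
<?php

/**
 * Hypothetical stand-in for the parse_pdf process plugin (not Migrate Plus code).
 */
final class ParsePdfSketch {

  /**
   * Fully qualified name of the parser class to use later.
   */
  private string $parserClass;

  /**
   * Fails early if the parser library is not installed.
   *
   * The class name is a parameter only so the check can be exercised
   * without the library; a real plugin would take Drupal's usual
   * plugin constructor arguments instead (assumption).
   */
  public function __construct(string $parserClass = '\Smalot\PdfParser\Parser') {
    if (!class_exists($parserClass)) {
      // Throw instead of echo/exit() so the caller can handle the error.
      throw new \RuntimeException(sprintf(
        "Class %s not found. Run composer require 'smalot/pdfparser' to use parse_pdf.",
        $parserClass
      ));
    }
    $this->parserClass = $parserClass;
  }

  /**
   * Returns the parser class name that passed the availability check.
   */
  public function parserClass(): string {
    return $this->parserClass;
  }

}
```

This does not address the second point, failing silently during plugin discovery; it only moves the availability check out of the transform step.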