Problem/Motivation
In our case we have 1,000 URLs, and for a subset of these the expected value is not present in the markup. For each of those URLs we get an InvalidArgumentException with the following message:
The current node list is empty
See the Crawler class:
...
public function link(string $method = 'get'): Link
{
    if (!$this->nodes) {
        throw new \InvalidArgumentException('The current node list is empty.');
    }
Steps to reproduce
Set up a migration with a large set of URLs in which a few rows have no value for an XPath selector. The migration fails on the first such row and halts.
Proposed resolution
I see we're using the same InvalidArgumentException class as the Crawler class does, but perhaps we could create a separate exception class to tell the two apart?
Then we can still halt the migration on faulty usage of the migrate source plugin, while catching the Crawler error and emitting a warning for the failing URLs.
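A minimal sketch of that idea, in plain PHP without Drupal or Symfony so it runs on its own. Everything named here is an assumption for illustration: the ScraperConfigurationException class, the extractValue() stand-in for the Crawler lookup, the scrapeAll() loop, and the 'xpath'/'css' filter types are hypothetical and not taken from the module. The one point it is meant to show is that the configuration exception does not extend InvalidArgumentException, so catching the Crawler's error per URL cannot swallow a plugin misconfiguration.

```php
<?php

declare(strict_types=1);

// Hypothetical exception for faulty plugin usage. It extends RuntimeException,
// not InvalidArgumentException, so the per-URL catch below never intercepts it.
final class ScraperConfigurationException extends \RuntimeException {}

/**
 * Stand-in for a Crawler lookup (assumption: the real code calls the Crawler).
 * Throws the same exception type and message when nothing matches.
 *
 * @param array<string, string> $page Selector => value pairs present in the page.
 */
function extractValue(array $page, string $selector): string
{
    if (!isset($page[$selector])) {
        throw new \InvalidArgumentException('The current node list is empty.');
    }
    return $page[$selector];
}

/**
 * Scrapes all pages, logging a warning for pages without a match.
 *
 * @param array<string, array<string, string>> $pages URL => page content.
 * @return array{rows: array<string, string>, warnings: list<string>}
 */
function scrapeAll(array $pages, string $selector, string $filterType): array
{
    // Faulty plugin usage still halts the whole run.
    // The list of filter types is an assumption for this sketch.
    if (!in_array($filterType, ['xpath', 'css'], true)) {
        throw new ScraperConfigurationException("Unsupported filter type: $filterType");
    }

    $rows = [];
    $warnings = [];
    foreach ($pages as $url => $page) {
        try {
            $rows[$url] = extractValue($page, $selector);
        } catch (\InvalidArgumentException $e) {
            // One URL without a value: record a warning and keep going.
            $warnings[] = sprintf('%s: %s', $url, $e->getMessage());
        }
    }

    return ['rows' => $rows, 'warnings' => $warnings];
}

// Usage: the second URL has no value for the selector.
$pages = [
    'https://example.com/a' => ['//h1' => 'Title A'],
    'https://example.com/b' => [],
    'https://example.com/c' => ['//h1' => 'Title C'],
];
$result = scrapeAll($pages, '//h1', 'xpath');
```

In a real migrate source plugin the warning would presumably go to the migration's message log or a logger rather than an array, but that wiring is outside the scope of this sketch.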
Remaining tasks
1. Create a different Exception class for supported filter types within the migrate source plugin
2. Catch and log the Crawler's "The current node list is empty" error but continue migration process
3. Review and test change
User interface changes
N/A
API changes
N/A
Data model changes
N/A
Issue fork migrate_source_scraper-3505532
Comments
Comment #2
baikho commented
Comment #10
reinchek
Hi @baikho, there's now a new custom exception called PluginErrorException, designed to handle plugin configuration errors. Furthermore, I've added a try/catch around the main for loop to catch and log InvalidArgumentException errors, so the migration no longer stops.
Everything is included in the new release 1.0.1.
Thanks ;)
Comment #11
reinchek
Comment #12
reinchek
Comment #13
reinchek
Comment #14
baikho commented
#10 Great stuff, many thanks!