I am doing a migration where I have source files which contain body text, and I am extracting image files from IMG tags. I'm therefore using MigrateSourceList with MigrateListFiles as the list handler, to which I'm passing a custom subclass of MigrateContentParser. This analyses the body text, and extracts source IDs for the image files, which my migration can then use to retrieve the files from the site I'm importing from.
However, MigrateListFiles::getIDsFromFiles() only returns a chunk ID when a single file has multiple chunks. When a file happens to have only one IMG in the body, it returns just the file ID:
if ($this->parser->getChunkCount() > 1) {
  foreach ($this->parser->getChunkIDs() as $chunk_id) {
    $ids[] = str_replace($this->baseDir, '', (string) $file->uri) . MIGRATE_CHUNK_SEPARATOR . $chunk_id;
  }
}
else {
  $ids[] = str_replace($this->baseDir, '', (string) $file->uri);
}
This is a problem because, in the single-chunk case, my MigrateItem handler (or my migration) has to parse the original file all over again. It also forces a special case for those files, which is not very clean.
It would be handy to have an option somewhere to cause all IDs to have the chunk ID appended.
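To illustrate the request, here is a minimal standalone sketch of the ID-building logic with such an option, plus the consumer side that splits an ID back apart. It does not use the Migrate classes: the function names, the `$always_append` parameter, and the `'::'` separator are all hypothetical stand-ins (the real code uses the `MIGRATE_CHUNK_SEPARATOR` constant).

```php
<?php

// Placeholder separator for this sketch only; the real module uses its own
// MIGRATE_CHUNK_SEPARATOR constant.
const EXAMPLE_CHUNK_SEPARATOR = '::';

/**
 * Builds the source IDs for one file (hypothetical helper).
 *
 * With $always_append = FALSE this mirrors the current behaviour: a file
 * with zero or one chunk yields only the bare file ID.
 */
function example_build_ids(string $relative_path, array $chunk_ids, bool $always_append = FALSE): array {
  if (!$always_append && count($chunk_ids) <= 1) {
    return [$relative_path];
  }
  $ids = [];
  foreach ($chunk_ids as $chunk_id) {
    $ids[] = $relative_path . EXAMPLE_CHUNK_SEPARATOR . $chunk_id;
  }
  // Note: with $always_append = TRUE and no chunks, this sketch returns no
  // IDs at all. That is a choice made here, not something from the issue.
  return $ids;
}

/**
 * Splits an ID into its file and chunk parts (hypothetical helper).
 *
 * 'chunk' is NULL when the ID carries no chunk part, which is exactly the
 * special case the item handler currently has to deal with.
 */
function example_split_id(string $id): array {
  $parts = explode(EXAMPLE_CHUNK_SEPARATOR, $id, 2);
  return ['file' => $parts[0], 'chunk' => $parts[1] ?? NULL];
}
```

With the option on, every ID splits into a file part and a chunk part, so the consumer never has to re-parse the file to find the lone image.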
Comment | File | Size | Author |
---|---|---|---|
#2 | 2550793.migrate.always-append-chunk-ids.patch | 1.75 KB | joachim |
Comments
Comment #2
joachim commented:
Here's a patch. It's on top of the patch I've posted at #2505683: Pass item id to chunk parser for debugging purposes, hence it's going to fail tests right now.
I wasn't sure where to add the option. MigrateListFiles's __construct() is already loaded with parameters, and I figured this is something for the content parser to decide.
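As a rough illustration of that design choice (not the contents of the patch), the sketch below lets the parser object carry the option and has the list code ask it. `ExampleImageParser` is a stand-in class, not the real `MigrateContentParser`; only `getChunkCount()` and `getChunkIDs()` come from the code quoted above, while the `$alwaysAppendChunkIds` property, `example_ids_for_file()`, and the `'::'` separator are assumptions made for this example.

```php
<?php

/**
 * Stand-in for a content parser. NOT the real MigrateContentParser API.
 */
class ExampleImageParser {

  // Hypothetical option: the parser decides whether IDs always get a chunk ID.
  public $alwaysAppendChunkIds = TRUE;

  protected $chunkIds;

  public function __construct(array $chunk_ids) {
    $this->chunkIds = $chunk_ids;
  }

  public function getChunkCount() {
    return count($this->chunkIds);
  }

  public function getChunkIDs() {
    return $this->chunkIds;
  }

}

/**
 * Hypothetical list-side logic: no extra constructor parameter is needed,
 * because the option is read from the parser.
 */
function example_ids_for_file($relative_path, ExampleImageParser $parser, $separator = '::') {
  $ids = [];
  if ($parser->alwaysAppendChunkIds || $parser->getChunkCount() > 1) {
    foreach ($parser->getChunkIDs() as $chunk_id) {
      $ids[] = $relative_path . $separator . $chunk_id;
    }
  }
  else {
    $ids[] = $relative_path;
  }
  return $ids;
}
```

Keeping the flag on the parser means the list class's signature stays unchanged, and each parser subclass can opt in independently.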
Comment #6
mikeryan commented:
Committed, thanks!