Problem/Motivation

We have a new 8.3.x-compatible branch of the User Guide -- see #2857156: Make a 3.x branch. It includes several new languages, including one RTL language (Farsi), many new screenshot images, and a few text updates.

Today I tested importing the 8.x-3.x branch of the User Guide on a development site, just to make sure it would work. The import mostly worked. However, for French, for instance, the text of most pages hadn't changed, so the nodes didn't get updated. Most of those pages did need to be updated, because all of the French screenshots had changed.

The problem, I'm pretty sure, is in how Feeds decides whether the incoming content differs from what was imported last time. It makes a hash of all of the "mapping sources" (all the incoming content that is mapped to fields on the node). On each import it stores the hash, and on the next import it compares the new incoming hash to the stored one. If they're the same, the node is skipped; if they're different, the node is updated. Here's the code that makes the hash:

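  protected function hash($item) {
    // Names of all incoming sources that are mapped to fields on the node.
    $sources = feeds_importer($this->id)->parser->getMappingSourceList();
    // Keep only the mapped values from the incoming item.
    $mapped_item = array_intersect_key($item, array_flip($sources));
    // Hash the mapped values together with the mapping configuration.
    return hash('md5', serialize($mapped_item) . serialize($this->getMappings()));
  }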

The problem is that for image files, all that gets hashed here is the full path to each file, and the paths do not change from import to import, even when the contents of the images do.
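To make the failure mode concrete, here is a minimal standalone sketch. It assumes a free-function stand-in for the hash() method quoted above (`sketch_item_hash` is a hypothetical name), and the source names, mappings, and temporary file are made up for illustration rather than taken from the real importer. It suggests that the item hash stays the same when only the bytes of the image file change.

```php
<?php
// Sketch only: a free-function stand-in for the hash() method quoted above.
// The source list and mappings are passed in (hypothetical values) instead of
// being read from a Feeds importer.
function sketch_item_hash(array $item, array $sources, array $mappings) {
  $mapped_item = array_intersect_key($item, array_flip($sources));
  return hash('md5', serialize($mapped_item) . serialize($mappings));
}

// Hypothetical mapping sources and mappings for a guide page.
$sources = array('title', 'body', 'image');
$mappings = array(
  array('source' => 'title', 'target' => 'title'),
  array('source' => 'body', 'target' => 'body'),
  array('source' => 'image', 'target' => 'field_image'),
);

// A temporary file stands in for a screenshot whose path never changes.
$image_path = tempnam(sys_get_temp_dir(), 'guide');
$item = array(
  'title' => 'Example page',
  'body' => 'Text that did not change.',
  'image' => $image_path,
);

// First import: the old screenshot.
file_put_contents($image_path, 'old screenshot bytes');
$first_item_hash = sketch_item_hash($item, $sources, $mappings);
$first_file_hash = md5_file($image_path);

// Second import: a new screenshot at the same path.
file_put_contents($image_path, 'new screenshot bytes');
$second_item_hash = sketch_item_hash($item, $sources, $mappings);
$second_file_hash = md5_file($image_path);

unlink($image_path);

// The item hash is identical even though the file contents differ.
$item_hash_unchanged = ($first_item_hash === $second_item_hash);
$file_contents_changed = ($first_file_hash !== $second_file_hash);
```

Because only the path string goes into the serialized item, nothing in the hash can notice a replaced screenshot.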

Proposed resolution

The Feeds node processor has an option to ignore the hash and update all the nodes no matter what. We should use that option.

It will mean that every node will be updated in every import, which is less efficient than only updating the ones that need updating. But since we don't do updates very often, it shouldn't be too much of a load on the server.
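As a rough illustration of what the option changes, the sketch below models the skip-or-update decision described above. The function `sketch_needs_update` is hypothetical and is not the Feeds API. I believe Feeds 7.x-2.x stores this setting as `skip_hash_check` in the processor configuration, but treat that name as an assumption to verify against the feature export.

```php
<?php
// Hypothetical helper, not part of Feeds: models the decision described above.
// $skip_hash_check stands for the "update every node" processor option
// (assumed to correspond to Feeds' skip_hash_check setting).
function sketch_needs_update($stored_hash, $incoming_hash, $skip_hash_check) {
  if ($skip_hash_check) {
    // Ignore the hashes entirely and always update.
    return TRUE;
  }
  // Default behavior: update only when the hash has changed.
  return $stored_hash !== $incoming_hash;
}

// Same image path on both imports, so the hashes match (made-up path).
$stored_hash = md5('images/example.png');
$incoming_hash = md5('images/example.png');

$updated_by_default = sketch_needs_update($stored_hash, $incoming_hash, FALSE);
$updated_when_forced = sketch_needs_update($stored_hash, $incoming_hash, TRUE);
```

With matching hashes, the default path skips the node, while the forced path updates it. That is the trade-off described above: every node is updated on every import.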

I don't think this is necessary for the Contributor Guidelines book, because it has few images and they wouldn't change unless the text in that page changed. But for the User Guide, we update images based on new versions of Drupal, more translations, etc., and the text in the corresponding pages often is not updated at the same time.

Remaining tasks

1. Make a patch. The Feeds options are in a feature in the drupalorg project.

2. Test the patch on a development site. Note that Fastly is in use on drupal.org and on staging, and it caches images for up to a day. So after an import, it may still appear that a given image hasn't been updated until a day has passed. This is not the case on the development servers.

3. Deploy the patch to drupal.org production.

User interface changes

When we import the User Guide, all nodes will be updated, and images will be updated.

Comment  File           Size       Author
#2       2888561.patch  742 bytes  jhodgdon

Comments

jhodgdon created an issue. See original summary.

jhodgdon commented:

Status: Active » Needs review
File status: new, size: 742 bytes

Here is a patch. Going to test it out on the guide-drupal.dev.devdrupal.org site and see if images are updated relative to the import I did earlier today on the same site.

jhodgdon commented:

Status: Needs review » Reviewed & tested by the community

OK, this is working on the guide site, once I got around some permissions problems: the existing image files couldn't be overwritten when I imported using the GUI instead of the drush command line.

  • drumm committed da3dd3a on 7.x-3.x, dev authored by jhodgdon
    Issue #2888561 by jhodgdon: User guide pages not being updated for...
drumm commented:

Status: Reviewed & tested by the community » Fixed

This has been deployed.

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.