Hello,
today I found another problem using Boost and cache expiration on a site with Entity Translation.

I have 7 languages on the site, but one content type is edited only in English and Croatian.
1) I created a Croatian node (the node has a Croatian title and path alias; the other languages use the fallback node/nid).
2) After editing the node (while the site language is set to Croatian) I cannot expire all 7 cached pages, because the Cache Expiration module looks for the following URLs:
www.visitnovalja.hr/hr/node-title -> this is the only cache file that exists
www.visitnovalja.hr/hr/it/node/nid -> this one and the remaining 5 are non-existent cache files because of the doubled language prefix

3) My temporary solution was to create and edit the nodes in the default language, English (without any language prefix).
Now all the language fallback paths are cleared:
www.visitnovalja.hr/node/nid (the English one)
www.visitnovalja.hr/it/node/nid (the Italian one) ... and so on
But the cache file for the Croatian node, which has a title alias, remains undeleted (www.visitnovalja.hr/hr/node-title).
It seems that clearing the cache for www.visitnovalja.hr/hr/node/nid doesn't clear the cache for www.visitnovalja.hr/hr/node-title.

In my opinion, clearing the cache of any node with entity translation enabled should clear every path in every language: both the translated ones (with node-title in the URL) and the untranslated ones (with node/nid in the URL).
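The requested behaviour can be sketched in plain PHP, outside Drupal. This is only an illustration: example_paths_to_expire() and its parameters are hypothetical names, not part of Expire or Boost, and the language prefixes and aliases are passed in by hand instead of being read from the site.

```php
<?php
/**
 * Builds every path that should be expired for one node.
 *
 * Hypothetical helper, not part of Expire or Boost.
 *
 * @param string $internal_path
 *   The internal path, e.g. 'node/12'.
 * @param array $prefixes
 *   Language code => URL prefix ('' for a language without a prefix).
 * @param array $aliases
 *   Language code => path alias, only for languages that have one.
 */
function example_paths_to_expire($internal_path, array $prefixes, array $aliases) {
  $paths = array();
  foreach ($prefixes as $langcode => $prefix) {
    $base = ($prefix === '') ? '' : $prefix . '/';
    // The untranslated fallback path exists in every language.
    $paths[] = $base . $internal_path;
    // The aliased path exists only where a translation has an alias.
    if (isset($aliases[$langcode])) {
      $paths[] = $base . $aliases[$langcode];
    }
  }
  return array_values(array_unique($paths));
}

$paths = example_paths_to_expire(
  'node/12',
  array('en' => '', 'hr' => 'hr', 'it' => 'it'),
  array('hr' => 'node-title')
);
// $paths is: node/12, hr/node/12, hr/node-title, it/node/12
```

With seven languages this yields seven node/nid paths plus one path per translated alias, and no doubled prefix such as hr/it/node/nid.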

thank you for your help and effort
regards
marko srdoc


Comments

SebaZ’s picture

Have you found any solution? Patch, module, quick fix to solve this problem?

It works as you described and it is not proper behaviour.

das-peter’s picture

Version: 7.x-2.0-rc3 » 7.x-2.x-dev
Status: Active » Needs review
FileSize
5.96 KB

Just stumbled over this too.
Attached patch changes the expire api to be able to handle multiple languages for an object / aliases of the object.
If the object has entity translation enabled all available languages are collected and used to create alias urls.
I've also added a helper for other modules to get language prefix aware urls.
That said: as far as I understand, Expire doesn't want to deal with things like language prefixes.
It simply provides the "base urls" that other modules implementing hook_expire_cache() can act on.
So we still need to make other modules like authcache / varnish language (prefix) aware.
For varnish I just posted this patch: #2546218: Support for language prefixes in expire integration

joelpittet’s picture

I wonder if we can get the maintainer to enable automated testing and we can put this to the test-bot?

This is somewhat related regarding authcache #2507797: authcache_builtin_expire_v2_expire_cache is a killer with locale on

joelpittet’s picture

+++ b/includes/expire.api.inc
@@ -55,21 +55,26 @@ class ExpireAPI {
+        if (entity_translation_enabled($object_type, $object) && $handler = entity_translation_get_handler($object_type, $object)) {

Should there be a module_exists check here?

Also, I tried the patch and didn't seem to do anything in my case, but my use-case may not fit:
I've got entity_translation on some nodes but others don't need it. In my case a non-entity-translated page expires in the language it was edited in, but not in the other language.

das-peter’s picture

Should there be a module_exists check here?

Oh indeed, this absolutely needs to be caught.

Also, I tried the patch and didn't seem to do anything in my case, but my use-case may not fit:...

The only thing the change should do is fetch all available aliases for an object. The base path, e.g. node/1, stays the same because this is entity translation.
However node/1 likely has a language specific alias for every available translation. And since we can't be sure if language neutral or language specific fields were edited we need to flush all the aliases too.
Thus the patch changes the signature of ExpireAPI::processInternalPaths() so that it accepts multiple languages - and uses all of them to detect aliases.

joelpittet’s picture

@joelpittet I think from testing I sorted out why it didn't seem to work. If entity translation is on but turned off for certain node types, this script will only pick up the translation in the current entity language. In my case, let's say I have the codes ['en-US', 'en-CA'] (because I do):

The prefix will be /en-ca/ only for that node, even though it has the same content on /en-us/.

With entity translation we can also have fields that cross all translations or prefix all pages.

It may be just easier to get all active languages and the prefix-less version and add them to the array?

Sounds expensive but I wouldn't know how else to account for the permutations, thoughts?

joelpittet’s picture

Status: Needs review » Needs work

I think I may have found part of the problem and some other things in the patch:

  1. +++ b/expire.module
    @@ -358,3 +358,35 @@ function _expire_load_single_entity($entity_type, $entity_id) {
    +function expire_get_language_prefixed_urls($urls) {
    
    +++ b/includes/expire.api.inc
    @@ -78,17 +83,17 @@ class ExpireAPI {
    +        $langcodes = $language->language;
    

    Maybe the code is meant to do what I was suggesting?

    Is this supposed to be adding not overriding as a string?

    Aka $langcodes[] =

  2. +++ b/includes/expire.api.inc
    @@ -392,12 +397,16 @@ class ExpireAPI {
    +   * @param array $langcode
    

    $langcodes

  3. +++ b/includes/expire.api.inc
    @@ -392,12 +397,16 @@ class ExpireAPI {
    +  protected static function processInternalPaths($internal_paths, $langcodes = NULL) {
    +    // Backward compatibility.
    +    if (!is_array($langcodes) && !empty($langcodes)) {
    

    Is this likely that this protected method needs BC code?

    Also just a nice little trick you can cast a string to an array and an array to an array and it won't nest. Also works with NULL
    $langcodes = (array) $langcodes;

    $ php -r "var_dump((array) NULL, (array) array(), (array) 'string');"
    

    Though should we default the param to an empty array()?
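A minimal sketch of the cast-based normalization suggested in point 3, in plain PHP. The function name example_normalize_langcodes() is hypothetical, and the empty-array default is only the option raised in the question above, not what the patch does.

```php
<?php
/**
 * Normalizes a language argument to a flat list of language codes.
 *
 * Hypothetical helper for illustration only.
 */
function example_normalize_langcodes($langcodes = array()) {
  // (array) NULL gives array(), (array) 'en' gives array('en'), and an
  // existing array is returned unchanged rather than nested.
  return (array) $langcodes;
}
```

Callers can then always loop over the result with foreach, whether they passed NULL, a single language code or a list of codes.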

joelpittet’s picture

Status: Needs work » Needs review

While that is part of the problem, my initial problem from #6 remains: if I save the US version, the CA version doesn't expire.

joelpittet’s picture

Status: Needs review » Needs work

Whoops still needs work for #7

joelpittet’s picture

Explaining my tests:

  1. Authcache caching pages.
  2. 2 languages en-US and en-CA
  3. Basic Page cache primed in both languages for admin, editor and anonymous.
  4. Edit the body of the basic page (not entity translated but ET is turned on the site)
  5. Language prefixed urls

Saving the page in en-CA seems to clear it for admin in both languages. For editor it doesn't seem to change in either language (assuming an authcache key problem), and for anonymous it is expired in en-CA but not in en-US.

Also this module really could use simple tests, it would really help

joseph.olstad’s picture

subscribing...

das-peter’s picture

It may be just easier to get all active languages and the prefix-less version and add them to the array?
Sounds expensive but I wouldn't know how else to account for the permutations, thoughts?

I think you're right.
Especially the alias handling needs to take all languages into account. A pretty simple case is if you want to flush a node path (imagine a node with entity translation) using the rules implementation; as it is right now, aliases are completely ignored because there's no language information whatsoever.
Another compelling reason was brought up by joelpittet too:

With entity translation we can also have fields that cross all translations or prefix all pages.

So even if we have the entity object and we figure out what language it was edited in, we don't know which fields have changed. If it was a non-translatable field, we need to flush all language versions.

So all in all, the most convenient and safe approach seems to be to be greedy with languages and let people optimize the other way around, using hook_expire_urls_alter().

The updated patch still uses all means to get the appropriate language for the given object - but self::processInternalPaths() is called with all enabled languages.
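To illustrate "optimizing the other way around", here is a sketch of the kind of filter a site could call from its own hook_expire_urls_alter() implementation. It is plain PHP with hypothetical names; it assumes the array keys are internal paths that may start with a language prefix, which depends on how the URLs were built.

```php
<?php
/**
 * Drops URLs whose language prefix is known but not wanted.
 *
 * Hypothetical helper. URLs without any known language prefix are kept.
 *
 * @param array $urls
 *   Path => URL data, as collected for expiration.
 * @param array $known_prefixes
 *   All language prefixes configured on the site.
 * @param array $allowed_prefixes
 *   The prefixes that should still be expired.
 */
function example_filter_urls_by_prefix(array $urls, array $known_prefixes, array $allowed_prefixes) {
  $kept = array();
  foreach ($urls as $key => $url) {
    // Look at the first path segment only.
    $segments = explode('/', ltrim($key, '/'));
    $first = $segments[0];
    $is_prefixed = in_array($first, $known_prefixes, TRUE);
    if (!$is_prefixed || in_array($first, $allowed_prefixes, TRUE)) {
      $kept[$key] = $url;
    }
  }
  return $kept;
}
```

For example, a site that knows a node was only translated to French could pass array('fr') as $allowed_prefixes and skip the other language versions.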

dsutter’s picture

Using patch #12 with the latest dev of Expire and a bilingual distribution called wetkit, we observed that the patch expires the path alias in the wrong language when the source language is NOT English.

publish http://localhost/en/content/frequently-asked-questions (source language is french)

Expire then expires, for example:
URL: http://localhost/fr/content/frequently-asked-questions
Wildcard: false
Expired object: node
--------
Note: http://localhost/fr/content/frequently-asked-questions is a page that doesn't cache, because it doesn't exist.

It does however correctly expire the french one:
http://localhost/fr/content/foire-aux-questions

But the cached document for the English page, http://localhost/en/content/frequently-asked-questions, is not expired.

das-peter’s picture

Found an issue with #12, but it is unlikely to be related to #13. I have to check what's up regarding #13 first.

joelpittet’s picture

@das-peter

+++ b/expire.module
@@ -358,3 +358,35 @@ function _expire_load_single_entity($entity_type, $entity_id) {
+function expire_get_language_prefixed_urls($urls) {
+  $language_prefixed_urls = array();
+  // Check if any language has the url prefixes enabled.
+  if (language_negotiation_get_any(LOCALE_LANGUAGE_NEGOTIATION_URL)) {
...
+    }
+  }
+  return $language_prefixed_urls;

Shouldn't this return the original $urls passed in if the language_negotiation_get_any() fails?

Example:

function expire_get_language_prefixed_urls($urls) {
  // Return the original urls if URL language negotiation is not enabled.
  if (!language_negotiation_get_any(LOCALE_LANGUAGE_NEGOTIATION_URL)) {
    return $urls;
  }

  $language_prefixed_urls = array();
  $languages = language_list();
  foreach ($languages as $language) {
    if (!empty($language->prefix)) {
      $prefix = $language->prefix . '/';
      foreach ($urls as $key => $data) {
        if (isset($data['path'])) {
          $data['path'] = $prefix . $data['path'];
        }
        else {
          $data = $prefix . $data;
        }
        $language_prefixed_urls[$prefix . $key] = $data;
      }
    }
  }
  
  return $language_prefixed_urls;
}

joseph.olstad’s picture

As for #13, I confirm your observations.
The use case:
Create a node and set its language to something other than the default language.

Assuming the system's default language is English and your second language is French: create a new node, select French as the language of the node, save it, then add an English translation (in that order).

Save the translation, publish the node and review what was expired. Make sure the French title/alias is not the same as the English one. Then edit the draft of this node, make some noticeable change, publish it again and look at what is expired or attempted to expire.

When Expire tries to expire, you will see results similar to those in comment #13.

Given this, I think a new patch should look at the source language of the node entity being published to determine which path to expire, rather than assuming it is the default language. Currently it's an assumption. We need a patch that checks the source language in order to get the correct path in all situations; either that, or force everyone to always create nodes in the default language, but that's not how things work in Drupal presently.

das-peter’s picture

@joelpittet Hmm, I'm not sure about that. The function explicitly says it's returning language-prefixed URLs. And right now I'm more confused by the fact that the function isn't used anywhere ;) It's just a helper function, but I wonder if I forgot to let it help somewhere :O

joelpittet’s picture

Here's my current workaround to get around content translation:

/**
 * Implements hook_expire_urls_alter().
 */
function MODULE_expire_urls_alter(&$urls, $object_type, $object) {
  // Leave the urls untouched if URL language negotiation is not enabled.
  if (!language_negotiation_get_any(LOCALE_LANGUAGE_NEGOTIATION_URL)) {
    return;
  }

  // Check if base urls should be included.
  $include_base_url = variable_get('expire_include_base_url', EXPIRE_INCLUDE_BASE_URL);
  $language_prefixed_urls = $urls;
  $languages = language_list('enabled');
  $languages = $languages[1];
  foreach ($urls as $path => $data) {
    foreach ($languages as $language) {
      $prefix = !empty($language->prefix) ? $language->prefix . '/' : '';
      if ($include_base_url) {
        $language_prefixed_urls[$prefix . $path] = url($path, array(
          'absolute' => TRUE,
          'alias' => TRUE,
          'language' => $language,
        ));
      }
      else {
        $language_prefixed_urls[$prefix . $path]['path'] = $prefix . $path;
      }
    }
  }

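  // Filter out duplicates. array_unique() compares values as strings, so only
  // use it when the values are absolute URL strings; in the other case the
  // values are arrays and the keys are already unique.
  $urls = $include_base_url ? array_unique($language_prefixed_urls) : $language_prefixed_urls;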
}

It has some similarities, though it doesn't deal with the problem of not clearing the cache of actual translated content. To me that is better than not clearing the cache at all, and also better than clearing the whole page cache.

joseph.olstad’s picture

Hi Joelpittet, that patch looks very interesting. I may be using part of it to improve our own sandbox project.

for our small multilingual site we took a similar (but very simple) approach and created this sandbox project called "boost_blast" , which is essentially executing our simplistic blast logic on hook_expire to clear everything regardless of what $urls are expired. This is because our site has multiple aliases and the caches fill up in different folder trees, without the blast we were only seeing cache expire on one of the aliases.

we have ideas to make it a bit more sophisticated with this patch to the same sandbox project, still in testing phase.

One unrelated thing I did notice during testing, however, is that Expire currently does not support the bean entity, so when we update a bean entity hook_expire doesn't get called. There's a separate, unrelated issue for that.

beltofte’s picture

We have just implemented a similar solution on a site hosted at Acquia and using Acquia Purge. Our solution is:

/**
 * Implements hook_expire_urls_alter().
 */
function MODULE_NAME_expire_urls_alter(&$urls, $object_type, $object) {
  // Leave the urls untouched if URL language negotiation is not enabled.
  if (!language_negotiation_get_any(LOCALE_LANGUAGE_NEGOTIATION_URL)) {
    return;
  }
  
  // Only manipulate object type node.
  if ($object_type != 'node') {
    return;
  }

  // Check if base urls should be included.
  $include_base_url = variable_get('expire_include_base_url', EXPIRE_INCLUDE_BASE_URL);

  // Find the language object and prefix for the current object language.
  // $language stays NULL for nodes whose language is not an enabled language
  // (e.g. language neutral nodes).
  $languages = language_list('enabled');
  $language = isset($languages[1][$object->language]) ? $languages[1][$object->language] : NULL;
  $prefix = ($language && !empty($language->prefix)) ? $language->prefix . '/' : '';

  $language_prefixed_urls = $urls;
  foreach ($urls as $path => $data) {
    if ($include_base_url) {
      $language_prefixed_urls[$prefix . $path] = url($path, array(
        'absolute' => TRUE,
        'alias' => TRUE,
        'language' => $language,
      ));
    }
    else {
      $language_prefixed_urls[$prefix . $path] = $prefix . $path;
    }
  }

  // Filter out duplicates.
  $urls = array_unique($language_prefixed_urls);
}

masipila’s picture

I was testing the patch in #14 without luck. Here are the notes from investigation together with a new patch.

Modules & versions

Expire: latest 7.x-2.x-DEV
Boost: latest 7.x-2.x-DEV (because latest DEV has support for |wildcard)
Entity translation: 7.x-1.0-beta4
Expire settings: Include base URL in expires

Content type settings

  • Content type / Publishing settings / Multilingual support: Enabled, with field translation
  • Most of the fields are language neutral and shared for all language versions
  • Some of the fields are translatable with Entity Translation
  • I have two languages on my site: fi (primary) and en
  • Path for the fi language version is fi/foo/[node:nid]
  • Path for the en language version is en/foo/[node:nid]
  • I have also some views that show content of this type but I'll leave that out of this for now.

Expected result

For the sake of simplicity, let's assume that only node page is set to be expired when a node is created / updated / deleted.

When a node, let say nid 123 is updated, I would expect the following paths to be expired:

Actual result

Only FI paths are expired:

Debugging and code analysis, patch #14 applied

Function processInternalPaths tries to handle the languages as follows:

      // Get the path aliases for this path, and add it to the array if one was
      // found.
      foreach ($langcodes as $langcode) {
        $alias = drupal_get_path_alias($path, $langcode);
        if ($alias != $path) {
          $urls[$alias] = array(
            'path' => $alias,
            'query' => array(),
          );
          $wildcards[$alias] = $wildcard;
        }
      }

In the use case I described above, this will only add node/123 and foo/123 to the $urls. However, if I had configured different path alias patterns for each language (let's say foo/[node:nid] for FI and bar/[node:nid] for EN), patch #14 would do the magic here and add both foo/123 and bar/123 to $urls as expected. So nothing wrong here as far as I can see.

However, the language prefixes are not handled properly when absolute URLs are created in function executeExpiration when $include_base_url is set.

      // Adds paths aliases, defines wildcards, etc.
      list($urls, $wildcards) = self::processInternalPaths($urls, $langcodes);
	  
      // Debug comments: At this point $urls are the internal paths without
      // language prefixes. $urls will contain node/123, foo/123 (and bar/123
      // if different path patterns are defined for different languages) but
      // they are all without any language prefix at this point.

      // If base site url should be included, then simply add it to the internal paths.
      if ($include_base_url) {
        foreach ($urls as $raw_url => $url) {
		
          // Debug comments: we are generating the absolute path here, using
          // only one language. As a result, $urls will contain
          // http://www.example.com/fi/node/123 and http://www.example.com/fi/foo/123
          // but no paths in the en language version.
          $urls[$raw_url] = url($url['path'], array(
            'absolute' => TRUE,
            'alias' => TRUE,
            'language' => $language,
            'query' => $url['query'],
          ));
        }
      }

Cache engines that expect to get just the internal paths (e.g. Varnish) will most probably work with #14 (assuming that they are able to add language prefixes for all languages correctly). Cache engines like Boost, which expect full absolute URLs, will not work correctly because we are only providing the absolute URLs with one language prefix.

Attached is a patch that modifies the $urls array so that all language prefixes are added. The array keys are modified so that the paths contain the language prefixes. When we do this, we also need to modify the $wildcards array, so I wrote a small helper function to do this. Patch and interdiff attached.

Cheers,
Markus

masipila’s picture

Nggh, I forgot to add the patches to previous comment.

masipila’s picture

Previous patch (#21/#22) incorrectly added the language prefixes to files as well, for example http://www.example.com/en/files/foo.txt. This new patch attached adds language prefixes only if the path is not pointing to a file.
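The approach described in #21 to #23 can be sketched in plain PHP as follows. This is not the code from the attached patch: the function name, the $files_dir parameter and the way a file path is detected (a simple check against a public files directory) are assumptions made for illustration.

```php
<?php
/**
 * Adds every language prefix to the url keys and keeps wildcards in sync.
 *
 * Hypothetical helper; paths below $files_dir are left without a prefix.
 *
 * @return array
 *   array($urls, $wildcards), both keyed by the prefixed path.
 */
function example_add_language_prefixes(array $urls, array $wildcards, array $prefixes, $files_dir = 'sites/default/files') {
  $new_urls = array();
  $new_wildcards = array();
  foreach ($urls as $path => $url) {
    $wildcard = isset($wildcards[$path]) ? $wildcards[$path] : FALSE;
    // Files are served without a language prefix, so keep them as they are.
    if (strpos($path, $files_dir . '/') === 0) {
      $new_urls[$path] = $url;
      $new_wildcards[$path] = $wildcard;
      continue;
    }
    foreach ($prefixes as $prefix) {
      $key = ($prefix === '') ? $path : $prefix . '/' . $path;
      // Replace the path, keep the remaining data (e.g. the query).
      $new_urls[$key] = array('path' => $key) + $url;
      // The wildcard flag has to follow the url to its new key.
      $new_wildcards[$key] = $wildcard;
    }
  }
  return array($new_urls, $new_wildcards);
}
```

Called with the prefixes array('fi', 'en') and the path node/123, this produces the keys fi/node/123 and en/node/123, while a path such as sites/default/files/foo.txt stays unprefixed.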

Markus

gge’s picture

I tested #23 and nothing happens if a translation is removed.

vadym.kononenko’s picture

HI, guys.

I've applied patch #23 and see that the keys passed to hook_expire_cache() are not uniform.
Could someone explain this, and whether these expiring URLs are correct?

[] => http://example.org/
[/node] => http://example.org/node
[lt/node] => http://example.org/lt/node

Why do the URLs without a language prefix begin with a slash while the others don't?

joseph.olstad’s picture

@vadym.kononenko using your example:

[] => http://example.org/
[/node] => http://example.org/node
[lt/node] => http://example.org/lt/node

[] should expire if lt is your default language
[/node] should expire if lt is your default language
[lt/node] should always expire if that is indeed the language of the content that triggered the expire lt in the first place.

as for patch #23 , I'll try to run some tests asap and report back.

jefuri’s picture

I stumbled upon this after I already implemented my own patch in a project. We added a setting to enable the expiration of all languages on entities when needed.

Which means this patch just gets the language list if the setting is turned on and generates an absolute URL for every language of the entity being expired. More wouldn't be necessary, right?

jefuri’s picture

Fixed an issue where wildcards were not taken into account when generating language-specific URLs.

esolitos’s picture

I have re-rolled the patch, which no longer applied. I also extended the behaviour by adding handling of "default" paths without an alias.
The number of expired paths increases with my patch. The reason is that the same path can be accessed in many ways in Drupal, so the expiration should reflect this and expire all the required paths.

Reviews and tests are very welcome as it's hard to wrap the head around all the edge cases.

quotesBro’s picture

Confirming #29 works well.
(I use Boost 7.x-1.x-dev)

kyilmaz80’s picture

I had the same problem. My workaround was to add a new rule that purges the pages using PHP code that calls curl.

AmiOta’s picture

An easy-to-implement alternative is the module https://www.drupal.org/project/node_edit_redirect, which simply redirects the admin to the language of the content being edited. That way the current content is expired in its own language.