Hi everybody.
I am not able to use this wonderful module because when I enable it on a field I get a 500 error. Of course, that's because it tries to reindex all nodes, and I have too many to do it easily (thousands and thousands).
I was wondering if I could force reindexing items via the usual Drupal page, 500 items at a time, and get the module working this way. After all, when Drupal indexes items, the module should do its job too, shouldn't it?


Comments

giuvax created an issue. See original summary.

generalredneck’s picture

Status: Active » Postponed (maintainer needs more info)

This should be handled via a batch job. The question now is whether the gathering of the initial list is efficient enough.

You can see that here. http://cgit.drupalcode.org/views_natural_sort/tree/views_natural_sort_te...

Looking at the code, I'm not 100% sure I can make it much more efficient, as I'm only getting a list of node ids anyway when creating items to batch. See http://cgit.drupalcode.org/views_natural_sort/tree/views_natural_sort_te...

Can you tell me two bits of info: 1) What's your PHP memory limit? 2) How many thousands of nodes have this field?

Lastly, see if drush queue-list shows anything in the Views Natural Sort queue.

giuvax’s picture

Thank you generalredneck.
1) PHP memory limit: 256 MB
2) I manage a magazine digital archive, so I have three content types: 48870 movies, 101933 people, 18767 articles.

But the truth is I just need the natural sort on one content type (articles), or even better on a single taxonomy term (about 1000 articles). I know it's impossible to deal with such numbers, but... what can I do? I have to print a catalogue of articles with the correct alphabetical sort; otherwise I have to sort them manually, moving pages in Acrobat after printing the PDF.

generalredneck’s picture

Status: Postponed (maintainer needs more info) » Active

When you apply natural sorting to a field, it applies to that field across all bundles the field is used on (to the base field, not the instance).
So the reason it would do "all" nodes is that the field you are attempting to enable natural sorting on exists on all content types. I ran a couple of tests.

First, I set my memory limit to 192 MB, the default memory limit for most PHP installs.
Next, I generated 160,000 nodes with values in that field. I got:

Fatal error: Allowed memory size of 201326592 bytes exhausted (tried to allocate 9437184 bytes) in /var/www/drupalvm/web/includes/entity.inc on line 1395

Which is in the heart of EntityFieldQuery. What is failing is this:

  // Fetches every matching entity id in a single result set.
  $query = new EntityFieldQuery();
  $result = $query->entityCondition('entity_type', $entry_type['entity_type'])
    ->entityCondition('bundle', $bundles, 'IN')
    ->execute();

At 256 MB I get the following:

Fatal error: Allowed memory size of 268435456 bytes exhausted (tried to allocate 20480 bytes) in /var/www/drupalvm/web/includes/database/select.inc on line 1529

Which is interesting as that is in the query building portion of a select query.

Regardless, the challenge here is that I MUST get the initial list of data to be able to batch process, which is what is happening at the point of failure. I could rebuild the query without EntityFieldQuery and build my own, but I'm not 100% sure how much that would gain us.

But the truth is I just need the natural sort on one content type (articles), or even better on a single taxonomy term (about 1000 articles).

The way to achieve what you propose would require custom work on your part. It's not impossible, but requires some intimate knowledge of the hooks that VNS exposes to you... and a module MUCH like views_natural_sort_text_field.

Another option would be... if at all possible, increase your memory limit. A temporary bump would get you initially indexed. The fact you are writing this probably means that's impossible.

The last option is for me to refine the logic gathering the ids. EFQ is not the most memory-efficient way to do this, but it is much more difficult to get entity ids without it, since I'd have to do all the lookups for table names, key fields, and whatnot. All I can say is let me see what I can whip up, but no hard promises.
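One way to bound the memory cost of that initial gathering step is to fetch ids in fixed-size ranges keyed on the last id seen, instead of pulling the whole result set at once. This is a hypothetical sketch in plain PHP, not the module's actual code; `fetch_ids_in_chunks()` and its `$query` callback are my own names standing in for a real `db_select()` with a range and condition:

```php
<?php
// Hypothetical sketch of chunked id gathering — not the module's actual code.
// Instead of loading every id at once, fetch fixed-size batches keyed on the
// last id seen, so peak memory is bounded by the chunk size.
function fetch_ids_in_chunks(callable $query, int $chunk = 500): \Generator {
  $last_id = 0;
  while (true) {
    // $query(...) stands in for a db_select() with a condition and range.
    $ids = $query($last_id, $chunk);
    if (!$ids) {
      break;
    }
    yield $ids;
    $last_id = end($ids);
  }
}

// Simulate a table holding node ids 1..1234.
$all = range(1, 1234);
$query = fn(int $after, int $limit) =>
  array_slice(array_filter($all, fn($id) => $id > $after), 0, $limit);

$count = 0;
foreach (fetch_ids_in_chunks($query, 500) as $batch) {
  $count += count($batch);  // each $batch holds at most 500 ids
}
// $count === 1234, gathered in three batches of 500, 500, and 234.
```

Each iteration only ever holds one chunk in memory, which is the same idea the eventual fix relies on.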

generalredneck’s picture

First thing... I'm confused what you mean by this

I was wondering if I could force reindexing items via the usual Drupal page, 500 items at a time, and get the module working this way. After all, when Drupal indexes items, the module should do its job too, shouldn't it?

I'm not sure where you are making the distinction between "Drupal" and "the module", but I think you assume that Drupal's search index is the same thing as Views Natural Sort's index. They are not. VNS is an entirely different process that cleverly transforms strings in such a way as to trick MySQL into sorting them as if they were naturally sorted. Its index is a mapping table that matches a field's value to that transformed string so that you can sort by it using Views.
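To make the transform idea concrete, here is a toy version of my own (VNS's real transformations are configurable and more involved, e.g. stripping leading articles like "The"): zero-pad every embedded number to a fixed width, so a plain byte-wise sort, like MySQL's ORDER BY on the stored string, comes out in natural order.

```php
<?php
// Toy illustration of a natural-sort transform — not VNS's actual code.
// Zero-pad every run of digits so that byte-wise comparison matches
// numeric comparison ("Issue 2" sorts before "Issue 10").
function natural_sort_key(string $text, int $pad = 10): string {
  return preg_replace_callback(
    '/\d+/',
    fn(array $m) => str_pad($m[0], $pad, '0', STR_PAD_LEFT),
    $text
  );
}

$titles = ['Issue 10', 'Issue 2', 'Issue 1'];
usort($titles, fn($a, $b) => strcmp(natural_sort_key($a), natural_sort_key($b)));
// $titles is now ['Issue 1', 'Issue 2', 'Issue 10'].
```

Storing a key like that in a side table, matched to the field value, is essentially what the module's mapping table is for, which is why the index has to be built when the setting is first enabled.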

As for the problem you are having...

I was able to build a version of the code that was memory efficient enough to work. The first problem was EFQ... the second problem was entity_load()'s cache.

The new challenge is that your web server, php-fpm server, or command-line PHP will most likely time out before all those nodes get loaded into the queue... What I'm trying to avoid is having a process that batches a process that places items in a batched process...

If you know you can run command-line PHP without a time limit (most of the time there isn't one, unless you are on certain hosting companies)... you can try the following:

  1. Apply the patch I provided to make the memory usage more bueno.
  2. Check the checkbox on the field and let it time out (this will get you the field setting correct).
  3. Run drush eval "views_natural_sort_text_field_views_natural_sort_queue_rebuild_data(array('entity_type'=>'node','field'=>'FIELD_MACHINE_NAME'))". On my extremely fast setup, this took about 3 minutes.
  4. Run drush queue-run views_natural_sort_index_queue.

That should get you your initial index and any new node will be indexed on save.

Other solutions may be to spread the reindex out somehow but that's going to take an overhaul of the way items are queued and batched... and I don't think I can tackle that just this minute.

generalredneck’s picture

Lastly, I might suggest another workaround for you. If you CAN move the content from the field that the article bundle is using to another field that isn't shared across other bundles, that would help you out significantly. It would require you to write a quick update hook moving all the content from one field to the other, then removing that field instance from the article content type, then setting up the view to use the new field.

This would cut the indexing down to 18,767 nodes instead of ~160,000.

Yet another workaround would be to write a quick script that only indexes the items you want: check the checkbox for natural sorting on the field, let it time out, then alter the snippet below and run it:

  // $FIELDNAME is a placeholder: the field's machine name, as a string.
  $queue = views_natural_sort_get_queue();
  $field = field_info_field($FIELDNAME);
  $entity_type = 'node';
  $entity_info = entity_get_info($entity_type);
  /*
   Replace the query below with one that gets the entity ids for the nodes you
   expect to be ordered naturally. $bundles is a placeholder for an array of
   bundle machine names.
   $entity_ids = db_select($entity_info['base table'], 'e')
     ->fields('e', array($entity_info['entity keys']['id']))
     ->condition('e.' . $entity_info['bundle keys']['bundle'], $bundles, 'IN')
     ->execute()
     ->fetchCol();
  */

  if (!empty($entity_ids)) {
    foreach ($entity_ids as $entity_id) {
      // The TRUE argument resets entity_load()'s static cache each pass,
      // which keeps memory from climbing as you loop.
      $results = entity_load($entity_type, array($entity_id), array(), TRUE);
      $entity = reset($results);
      $entries = _views_natural_sort_text_field_to_vns($entity_type, $entity, $field);
      foreach ($entries as $entry) {
        $queue->createItem($entry);
      }
    }
  }

You could build this either via an update hook or as a script you could run with drush scr.

giuvax’s picture

First of all, thanks a lot for your help.
I'm answering point by point.

1) Yes, I was assuming the indexing process would run together with Drupal's normal search index building; that's why I wanted to try resetting the Drupal search index. OK, that's not possible.

2) I have a server of my own, so I can increase the PHP memory limit, but if I understand everything you wrote, that will not work. Anyway, I have 32 GB of RAM; could I try with, say, 2 or 4 GB as the memory limit? If you want, I can experiment and report back.

3) Just give me time to learn to use drush. I just migrated the website from shared hosting, and on my own server, where I can do everything independently and freely, it's all brand new and I haven't had a chance to install drush yet.

I suppose that once I'm able to use drush, the patch plus the drush command will achieve the goal of building the initial index of ids, because after that the module is able to batch the indexing. Is that right?

Thanks again. Will be back with feedback as soon as I can.

Giulia

generalredneck’s picture

In thinking about your use case, I may be able to alter the code so that I'm not queueing up candidates for indexing, but instead querying for candidates when the batch is initiated, based on a query provided by the Views Natural Sort plugins (currently VNS for entity properties and VNS for entity fields). This would offload the heavy lifting to cron (or the batch process) rather than the call that is putting things into the batch. This may take me a bit to get to... most likely later this month before I can start working on it. I'll probably create a new ticket soon and relate it to this one.

2) I have a server of my own, so I can increase the PHP memory limit, but if I understand everything you wrote, that will not work. Anyway, I have 32 GB of RAM; could I try with, say, 2 or 4 GB as the memory limit? If you want, I can experiment and report back.

If you apply the patch I supplied, that won't be necessary. I'll most likely go ahead and make it part of the codebase, as you helped me optimize that query and it will be useful later when I'm working out what I mentioned above.

3) Just give me time to learn to use drush. I just migrated the website from shared hosting, and on my own server, where I can do everything independently and freely, it's all brand new and I haven't had a chance to install drush yet.

Depending on how technical you are... work with caution. I do this kind of thing for a living, so I take that skill for granted. Make sure to back up your database before you do any work on the site. I would suggest, if you can, working on a copy of your site the first time or two so that you can always scrap it if something doesn't work. Setting up a "server" on one of your machines is fairly easy, with a ton of options out there to match a variety of technical levels. You've got Acquia Dev Desktop (probably the easiest), MAMP, WAMP, or XAMPP for servers on your host machine (your laptop or desktop)... you have Drupal VM if you want to go the virtual machine route, and Lando if you want to go the Docker route.

You can also set it up side by side on your server since you got your own machine.

giuvax’s picture

Thank you once again.

Just to update you: I don't think I'm able to install drush on my server. I have Debian 9, and I'm not familiar with Composer. So, unless I hand this part of the job to someone more skilled than me, I don't know how to proceed.
Besides, for this particular need I guess I have to look for a workaround, because the catalogue has to be printed at the end of the month. So maybe I'm going to create a new indexed field, just like you said, and let the module work on it (on just one content type).

But never mind: when you get back to working on it, just tell me, because I'd like to help with testing and experimenting.

Thank you very much.

generalredneck’s picture

On Debian, you will want to do the following to install Composer (note that the SHA-384 hash below is tied to a specific installer release; if verification fails, get the current hash from getcomposer.org):

php -r "copy('https://getcomposer.org/installer', 'composer-setup.php');"
php -r "if (hash_file('SHA384', 'composer-setup.php') === '544e09ee996cdf60ece3804abc52599c22b1f40f4323403c44d44fdfdd586475ca9813a858088ffbc1f233e9b180f061') { echo 'Installer verified'; } else { echo 'Installer corrupt'; unlink('composer-setup.php'); } echo PHP_EOL;"
sudo php composer-setup.php --install-dir=/usr/local/bin --filename=composer
php -r "unlink('composer-setup.php');"

From there you should be able to do

composer global require drush/drush:8.1.14

Make sure that you have something similar to this in your ~/.bashrc file:
export PATH=~/.composer/vendor/bin:$PATH

If you have to add that, you will need to run source ~/.bashrc for it to take effect, or log out and log back in.

That will take care of installing drush...

  • generalredneck committed 40db154 on 7.x-2.x
    Issue #2913741 by generalredneck, giuvax: Fix memory limit issues during...
generalredneck’s picture

Status: Needs review » Fixed
FileSize
305.03 KB

@giuvax,
I've managed to fix this issue. Even with the code fix, you still have to wait for all 160,000 items to index... but with that said, it should no longer time out or run you out of memory, and you should get a nice progress message and such now.

Let me know if you have any issues after downloading the dev version. I'll release it as 2.6 soon.

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.

generalredneck’s picture

Version: 7.x-2.x-dev » 8.x-2.x-dev
Category: Support request » Bug report
Issue tags: +port to d8

Reopening, as this needs to be ported to the D8 version.

generalredneck’s picture

Status: Closed (fixed) » Active
Issue tags: -port to d8 +needs port to Drupal 8
giuvax’s picture

Hi generalredneck, sorry for the late answer. I'll try the dev version and get back to you.
Thank you so much.

generalredneck’s picture

Status: Active » Fixed

Tested today to see if I have a similar challenge on Drupal 8/9. Tested with 100,000 nodes and all seemed well.

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.