Maybe I'm mistaken, but at the moment, whenever I add a new node I need to either index it manually via the Search API interface or wait for the cron job.
Is there a particular reason why indexing isn't triggered when a node is saved? It's counter-intuitive for my users that new nodes show up immediately on some views (the normal node ones) while on others they have to wait.
Could this be implemented as (yet another) separate module?


Comments

drunken monkey’s picture

The reason is that apparently most other search modules do it the same way (core search and apachesolr at least) and I'm confident that they have some reason. ;) Indexing can take some time, depending on necessary preprocessing, and in those cases letting the user wait that long on node saves would probably be a bad choice.
fago is currently on vacation, according to his profile, but as far as I know he planned to provide a patch for adding a Rules action for indexing entities. With that, you could just set up a Rules action to directly index a node on saves.

Adding an option to indexes for indexing "dirty" items immediately should of course also be possible rather easily, and does make sense. Let's see when I get to it; I don't have much time currently …
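To make the trade-off concrete, here is a minimal sketch of "mark dirty on save, index on cron" next to an "index immediately" option. It is only an illustration: `ToyIndex`, `index_directly`, `on_save` and `cron` are hypothetical names invented for this sketch, not the Search API's classes or settings.

```python
class ToyIndex:
    """Hypothetical stand-in for a search index; NOT the Search API code."""

    def __init__(self, index_directly=False):
        # Assumed per-index option: index items as soon as they are saved.
        self.index_directly = index_directly
        self.dirty = set()   # IDs of items waiting to be (re-)indexed
        self.indexed = {}    # item ID -> text as the search backend sees it

    def on_save(self, item_id, text, storage):
        """Called whenever an item is saved."""
        storage[item_id] = text
        self.dirty.add(item_id)
        if self.index_directly:
            # The user waits for this step while saving.
            self.index_items([item_id], storage)

    def index_items(self, item_ids, storage):
        """Send the given items to the backend and clear their dirty mark."""
        for item_id in item_ids:
            self.indexed[item_id] = storage[item_id]
            self.dirty.discard(item_id)

    def cron(self, storage, limit=50):
        """Index up to `limit` dirty items; returns how many were indexed."""
        batch = sorted(self.dirty)[:limit]
        self.index_items(batch, storage)
        return len(batch)
```

With `index_directly=False`, a saved item stays in `dirty` and is missing from `indexed` until `cron()` runs, which is the delay described in the original report. With `index_directly=True`, it is searchable right after the save, at the cost of doing the indexing work during the save.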

Shadlington’s picture

Do entities only get marked as 'dirty' when they are saved?

If that's the case then - in addition to a rules action for indexing entities - an action that marks entities as 'dirty' could be useful in some scenarios, I think.
Primarily it'd be useful to modules that store some data related to the entities but which are updated at a time other than when saving.
Flag and votingAPI modules spring to mind. Which probably don't work with Search API just yet, but I can imagine that they (or others like them) may in the future.

EDIT: Sorry for issue hijacking... Just happened to read the rules action comment and got thinking.

drunken monkey’s picture

If that's the case then - in addition to a rules action for indexing entities - an action that marks entities as 'dirty' could be useful in some scenarios, I think.
Primarily it'd be useful to modules that store some data related to the entities but which are updated at a time other than when saving.
Flag and votingAPI modules spring to mind. Which probably don't work with Search API just yet, but I can imagine that they (or others like them) may in the future.

Actually, that problem is already there with just core entities, although now somewhat mitigated by #1012878: Add a way to index an entity directly. E.g., when indexing a node author's name, the node won't be re-indexed when the username changes. Generally, this can occur when indexing any related entity's field that can change.
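The author-name case can be sketched as follows. This is only an illustration under assumed data structures: `nodes` as a dict with an `author_id` key, `stale_node_ids` and `on_user_rename` are all hypothetical and do not reflect how the Search API stores or tracks items.

```python
def stale_node_ids(changed_user_id, nodes):
    """Return the IDs of nodes whose indexed author name is now out of date."""
    return sorted(
        node_id
        for node_id, node in nodes.items()
        if node["author_id"] == changed_user_id
    )


def on_user_rename(user_id, nodes, dirty):
    """Mark every node written by the renamed user as dirty.

    Saving the user does not save the nodes, so without this step
    the nodes would keep the old username in the index.
    """
    dirty.update(stale_node_ids(user_id, nodes))
    return dirty
```

The same pattern would apply to any related entity whose fields are indexed along with the item: a change to the related entity has to be translated into dirty marks on the items that reference it.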

I can put your mind at ease though: that Rules action is already planned (for some time now), fago has agreed to add this. Maybe I should bug him again about it …

Shadlington’s picture

Heh. I really do love how pretty much every time I think 'hey, it'd be good if Search API could do this' it turns out it's already planned or even in the works :)

Anyway, my mind is at ease! Thanks!

girishmuraly’s picture

I believe the reason indexing isn't run on every save is that the backend caches get cleared on each indexing run, so queries would keep hitting the backend directly, slowing down result load times. Running it as a cron job instead makes sense in this regard.

For example, the Solr cache gets cleared during each indexing run. For large sites it would be better to index only at longer intervals. The optimum frequency depends on various factors like how quickly the updates need to show up, backend cache size, and speed of indexing, amongst others.

drunken monkey’s picture

Ah, yes, that's of course another reason. Didn't think about that one.
Anyways, indexing right away might make sense for some sites, so allowing this (without making it the default) shouldn't do any harm.

drunken monkey’s picture

Status: Active » Needs review

Please see the attached patch for a first stab at this.
In principle, it works for me.

fangel’s picture

Subscribing. I'll test out the patch tomorrow on our largish indices.

quazardous’s picture

+1

fangel’s picture

Sorry, it took a while before I could test this. I can confirm that the patch in #7 does as promised. Awesome.

drunken monkey’s picture

Title: Index documents on node_save / node_update » Add option to index entities instantly after they are saved
Status: Needs review » Fixed

OK, committed.

Shadlington’s picture

Are you sure you committed it? Git says otherwise.

drunken monkey’s picture

I did commit it, just forgot to also push it …
That's one problem we didn't have with CVS. But it's done now.

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

j0rd’s picture

I'm having a problem where my nodes are not showing up right away even when this is checked.

When creating the node, I'm creating it unpublished, then after the user pays for the node, I publish it. This usually happens within the span of 2-3 minutes. After it's published, though, the node does not show up on the site.

`drush cc all` does not fix
`drush core-cron` does not fix

The only way I can get it to take is by rebuilding the Search API cache.

I'm using the Apache Solr backend.

Any ideas about how I can fix this? It's important for me that the users can see their nodes once they're published. Performance is not an issue for my small site.

Cheers,
Jordan

stoporko’s picture


@j0rd Hi, did you manage to solve your problem? I am stuck with exactly the same issue:
The Search API index and server (Solr) are running fine, and results are displayed via Views. When an admin creates a new node, it is NOT queued for indexing; I need to rebuild the index. Any ideas what could be wrong?

Stoporko