Problem/Motivation

On large sites (for example, one with 16k content nodes), installing the module fails because it tries to populate the URL alias table in a single PHP process. During install I hit a 60-second timeout message; on cloud environments such as Acquia, the deployment logs show "killed".

Steps to reproduce

Install the module on a large site (for example, one with around 16k content nodes).

Proposed resolution

Use the Batch API to split the work of populating the views_url_alias table into chunks.

Remaining tasks

Review the changes.

User interface changes

Shows a batch progress bar when rebuilding the index of URL aliases.

API changes

Rebuilding the index of URL aliases no longer does the work immediately; instead it sets a batch that does it.

Data model changes

None.

Comments

NicholasS created an issue. See original summary.

nicholass’s picture

I tried the Batch API. It works through the UI, but the table is not populated when the module is installed during a drush cim.

nicholass’s picture

I tested this locally. MR 16 contains only the Queue API change. I also need the fix from another issue I had open, so MR 17 combines both, since I can't have two patches patching the same lines.

Please review MR 16

nicholass’s picture

Status: Active » Needs review
nicholass’s picture

Issue summary: View changes
nicholass’s picture

I have done a lot of testing on my site and I think MR 17 should be reviewed: it fixes multiple things with this module and works as intended. I closed the other individual issues in favor of this single change, since it looks like this module doesn't have many maintainers.

joel_osc made their first commit to this issue’s fork.

joel_osc’s picture

Great patch everyone! It is necessary on my site in order to use this module. I noticed that some nodes could not be found by alias. Looking into it, I found that the queueing code stores the paths it has already handled and checks that list before queueing. The key the code used did not include the langcode of the path, so I was only getting each node in one of two languages. Small fix committed above.
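
A minimal, language-neutral sketch of the problem described in this comment, written in Python rather than the module's PHP. The function name `paths_to_queue` and the dictionary fields are hypothetical and only illustrate why a "seen" key without the langcode drops translations.

```python
def paths_to_queue(aliases, include_langcode=True):
    """Return the aliases to queue, skipping any whose key was already seen.

    Hypothetical illustration only; not the module's actual code.
    """
    seen = set()
    queued = []
    for alias in aliases:
        # With the langcode in the key, each translation is tracked separately.
        if include_langcode:
            key = (alias["path"], alias["langcode"])
        else:
            key = alias["path"]
        if key in seen:
            continue
        seen.add(key)
        queued.append(alias)
    return queued


aliases = [
    {"path": "/node/1", "langcode": "en", "alias": "/about"},
    {"path": "/node/1", "langcode": "fr", "alias": "/a-propos"},
]

# Keyed by path only: the French alias is skipped as a duplicate.
path_only = paths_to_queue(aliases, include_langcode=False)
# Keyed by path and langcode: both translations are queued.
with_langcode = paths_to_queue(aliases)
```

With the path-only key, the second translation of the same node is treated as already processed, which matches the "one of two languages" symptom reported here.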

dstorozhuk’s picture

Version: 8.x-2.x-dev » 3.x-dev
dstorozhuk’s picture

The queue option might not work for people who have cron disabled for some reason.
I think the right option here is a batch operation for views_url_alias_rebuild_path(). Module installation should also use a batch somehow.

michaelsoetaert’s picture

I've rerolled the changes from MR#17 on branch 8.x-2.x-with-issue-3396154 on the latest version of the 3.x branch.

We needed multilingual support (different URL aliases for different translations), which branch 3.x provides, but we were also getting timeouts because of the size of the website (where the changes in this issue come into play).

michaelsoetaert’s picture

Sadly, the patch in comment #13 still resulted in timeouts on our higher environments (due to different PHP values). The issue seemed to be the large amount of data being loaded in views_url_alias_rebuild_path, since it retrieves the complete entity object of each path alias.

I decided to try the approach @dstorozhuk suggested (using the Batch API): views_url_alias_rebuild_path now loads only the path alias IDs and splits the list into chunks, and each batch operation loads the entity objects for just the path alias IDs it was given. That seems to have fixed the timeouts.

Attached patch with the described functionality.
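
The chunking idea described above can be sketched as follows. This is a language-neutral Python illustration, not the attached patch: `chunk_ids`, `rebuild_in_batches`, the `load_entities` and `index_entity` callbacks, and the chunk size of 50 are all assumptions made for the example. In Drupal's Batch API each chunk would run as its own operation, spread over separate requests, rather than in one loop.

```python
from typing import Callable, Iterable, Iterator, List


def chunk_ids(ids: List[int], size: int) -> Iterator[List[int]]:
    """Yield consecutive slices of `ids`, each holding at most `size` IDs."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


def rebuild_in_batches(
    ids: List[int],
    load_entities: Callable[[List[int]], Iterable[dict]],
    index_entity: Callable[[dict], None],
    chunk_size: int = 50,
) -> int:
    """Index every ID, loading full entities one chunk at a time."""
    processed = 0
    for chunk in chunk_ids(ids, chunk_size):
        # Only this chunk's entities are held in memory at once.
        for entity in load_entities(chunk):
            index_entity(entity)
            processed += 1
    return processed


# Stand-in loader and index table, for demonstration only.
index_table: List[dict] = []
total = rebuild_in_batches(
    ids=list(range(1, 8)),
    load_entities=lambda chunk: [{"id": i, "alias": f"/alias-{i}"} for i in chunk],
    index_entity=index_table.append,
    chunk_size=3,
)
```

The point of the design is that the cheap list of IDs is fetched once up front, while the expensive entity loading is bounded by the chunk size.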

steven jones’s picture

I'm evaluating this module for a feature that I need to implement for a site, and so this might be a bit of a 'drive-by' contribution if I decide to not really use it, but I would like to commend the approach in #14 to use the Batch API.

The approach this module takes, maintaining a separate index table of data, feels a lot like the node access API (or maybe Search API), so I suggest taking inspiration from those systems. In particular, I'd recommend no longer trying to do the work on install of the module at all, and instead building a robust Batch API or queue process that does the indexing needed. Then, on install, show a message informing users that there's something else they need to do (unless there are no aliases in the DB), and add a hook_requirements message that also informs administrators.

Also the patch in #14 doesn't apply to the latest 3.1.0 as far as I could tell, so this probably Needs Work.

steven jones’s picture

Issue summary: View changes

I've applied the patch from #14 to a 3.x-dev version and I'll open a MR shortly with those changes.

From my testing, if you enable the module through the web UI you get a nice progress bar and batch process for building the index.
If you install via Drush you don't get a progress bar, but it does work and uses the batch to build up the table. Nice! No timeouts etc.

I suppose that for environments where that Drush command would time out, it's still not a great experience. Maybe we should move to setting some kind of 'rebuild' flag on module install, and then detect that flag and clear it when rebuilding.

steven jones’s picture

Merge request !29 now contains a 'flag' version whereby it's super quick to install the module, but you have to run a one-time process in the web UI to build up the index table.

Does this need a Drush command maybe?
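
A rough sketch of the 'flag' idea, in Python rather than the module's PHP. The class `AliasIndex`, the `needs_rebuild` key, and the dictionary standing in for a key-value store are hypothetical; MR !29 may be structured quite differently.

```python
class AliasIndex:
    """Toy model: install only sets a flag, rebuild does the work and clears it."""

    def __init__(self, state, aliases):
        self.state = state      # dict standing in for a persistent key-value store
        self.aliases = aliases  # existing URL aliases on the site
        self.table = []         # stand-in for the index table

    def on_install(self):
        # Cheap: no indexing here. Flag a rebuild only if there is something to index.
        self.state["needs_rebuild"] = bool(self.aliases)

    def needs_rebuild(self):
        # Could drive a warning message or a status report entry.
        return self.state.get("needs_rebuild", False)

    def rebuild(self):
        # In practice this would be a batch or queue process, not a single pass.
        self.table = list(self.aliases)
        self.state["needs_rebuild"] = False


index = AliasIndex(state={}, aliases=["/about", "/contact"])
index.on_install()
flag_after_install = index.needs_rebuild()
index.rebuild()
flag_after_rebuild = index.needs_rebuild()
```

Because the flag lives outside the rebuild code, the same check could be reused by a UI warning, a requirements message, or a command-line rebuild.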

steven jones’s picture

steven jones changed the visibility of the branch 8.x-2.x to hidden.

steven jones changed the visibility of the branch 8.x-2.x-with-issue-3396154 to hidden.

n-m-daz’s picture

Status: Needs review » Reviewed & tested by the community

Tested and working on my side.

Steps:
1. Use Devel Generate to create 20K nodes with URL aliases and translations.
2. Enable views_url_alias.
The module installs very quickly.
The views_url_alias table is not populated during install or update of the module.
3. A warning message is shown on /admin/help/views_url_alias: "The Views URL Alias table needs to be rebuilt. Rebuild table."
[Screenshot: Needs rebuild]
4. At /admin/structure/views/settings/alias, click "Rebuild table". A batch starts to rebuild the views_url_alias table.
[Screenshot: Rebuild views url alias]
5. After the batch finishes, the message "All paths have been processed." is shown.
6. The views_url_alias table has been populated.
[Screenshot: Table populated]
7. Configure a View for URL aliases: add the URL alias relationship, the URL alias field, and an exposed filter.
8. The View, the exposed URL alias filter, and the language filter all work.
[Screenshots: View working, Alias filter working, Language filter working]
9. I also found no coding standards issues.

nicholass’s picture

Can the typehint be added back? Removing it makes my other patch fail to apply https://www.drupal.org/project/views_url_alias/issues/3491389

nicholass’s picture

Sorry to report, but even with this patch and my other one I still can't bulk process all the nodes. It worked intermittently locally but almost never on our dev server. I am just going to remove this module; it's been too problematic for our site. Hopefully you can get it working one day. I think it has something to do with all the nested language loops in the batch. Oh, and I think I had about 29k nodes.
[Screenshot: ajax error]

AI tools thought the following could be the issue, but after hours of attempting to patch I gave up. But here are some breadcrumbs for the next person.

After reviewing the code, I've identified several critical memory and performance issues:

Major Issues:
Router Lookup Explosion: views_url_alias_get_path_entity_type() is called for every language variant, even when the path is identical (e.g., /node/123). This is extremely expensive.

Repeated Entity Loading: The same entity is loaded multiple times across different languages without caching.

Individual Database Operations: Each save operation performs separate DELETE and INSERT queries instead of batching.

Memory Accumulation: Entity cache pollution from loading multiple translations without cleanup.
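
If the first point in the list above is accurate (nobody in the thread confirmed it), one common mitigation is to memoize the lookup per path so that language variants sharing a path resolve only once. The sketch below is a Python illustration with a made-up lookup; `entity_type_for_path` is hypothetical and does not reproduce views_url_alias_get_path_entity_type().

```python
from functools import lru_cache

lookups = []  # records each real (uncached) lookup, for demonstration only


@lru_cache(maxsize=1024)  # bounded, so the cache itself cannot grow without limit
def entity_type_for_path(path):
    """Stand-in for an expensive router lookup: '/node/123' -> 'node'."""
    lookups.append(path)
    parts = path.strip("/").split("/")
    if len(parts) == 2 and parts[1].isdigit():
        return parts[0]
    return None


# The same system path appears once per language.
rows = [("/node/123", "en"), ("/node/123", "fr"), ("/user/7", "en")]
types = [entity_type_for_path(path) for path, _langcode in rows]
```

Here three rows trigger only two real lookups, because the second `/node/123` row is served from the cache.
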
rachel_norfolk’s picture

Status: Reviewed & tested by the community » Needs work

This doesn’t look like it is RTBC any more. Maybe someone can confirm where we are?

n-m-daz’s picture

It might be best to wait for https://www.drupal.org/project/views_url_alias/issues/3491389 to be merged.
That ensures the views_url_alias_get_path_entity_type method returns only null or a ContentEntityInterface.

n-m-daz’s picture

Status: Needs work » Needs review

Fixed merge conflicts in MR !34.

I believe most of the problems from the previous comment come from issue https://www.drupal.org/project/views_url_alias/issues/3491389, which is already closed.

Moving this to "Needs review".

The steps in comment #22 can be used as a testing guide.

andreisandu’s picture

I’ve re-tested the merge request following the steps outlined in comment #22.

Using Devel Generate, I created ~10k nodes with URL aliases and translations. Enabling views_url_alias completed quickly, and rebuilding the Views URL Alias table via the admin UI correctly ran as a batch process.

After the batch completed (“All paths have been processed”), the views_url_alias table was fully populated. Views using the URL alias relationship, exposed alias filter, and language filter all worked as expected.

From my re-test, everything still appears to be working correctly.

andreisandu’s picture

Status: Needs review » Reviewed & tested by the community

rachel_norfolk’s picture

Status: Reviewed & tested by the community » Fixed

Love it - merged!!

Thank you all! Don’t forget to check your contribution attribution is correct.

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.