It would be great to add some drush commands for bulk updating and deletion.

Files: 
CommentFileSizeAuthor
#43 pathauto-add-drush-support-867578-42.patch4.36 KBOwen Barton
PASSED: [[SimpleTest]]: [MySQL] 316 pass(es).
[ View ]
#30 pathauto-add-drush-support-867578-30.patch4.36 KBmake77
PASSED: [[SimpleTest]]: [MySQL] 316 pass(es).
[ View ]
#29 pathauto-add-drush-support-867578-29.patch3.88 KBDuaelFr
PASSED: [[SimpleTest]]: [MySQL] 316 pass(es).
[ View ]
#25 867578-pathauto-drush-generate.patch2.25 KBDave Reid
PASSED: [[SimpleTest]]: [MySQL] 304 pass(es).
[ View ]
#21 new_drush_commands_to_create_and_delete_url_aliases-867578-21.patch5.58 KBcolan
PASSED: [[SimpleTest]]: [MySQL] 280 pass(es).
[ View ]
#10 new_drush_commands_to_create_and_delete_url_aliases-867578-10.patch5.72 KBcolan
PASSED: [[SimpleTest]]: [MySQL] 280 pass(es).
[ View ]
#7 pathauto.drush_.inc_.patch4.97 KBdelta
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch pathauto.drush_.inc__0.patch. This may be a -p0 (old style) patch, which is no longer supported by the testbots.
[ View ]
#5 pathauto.drush_.inc_.patch4.72 KBdelta
FAILED: [[SimpleTest]]: [MySQL] Invalid patch format in pathauto.drush_.inc_.patch.
[ View ]

Comments

msonnabaum’s picture

Here's a command I wrote a while back for pathauto 6.x-1.x which may help move things forward:

https://gist.github.com/940710

It could use a little cleanup and I think there's a bug with the reporting, but it worked great when I used it after a big migration. I stole the drush_backend_invoke() technique from migrate module, which works quite well to avoid memory limits.

Dave Reid’s picture

Is there docs on how Drush can run a batch API process? Because that's what we use now.

msonnabaum’s picture

We have our own implementation of batch, which is used for update.inc:

http://drupalcode.org/project/drush.git/blob/HEAD:/commands/core/drupal/...

It should accomplish the same as my example above.

Dave Reid’s picture

Component:Code» Bulk generation
delta’s picture

Status:Needs work» Needs review
StatusFileSize
new4.72 KB
FAILED: [[SimpleTest]]: [MySQL] Invalid patch format in pathauto.drush_.inc_.patch.
[ View ]

Hello,
i use this implementation actually on my project, for the moment i tested it just with 1500 alias max (node,user)

it implements two command :

pathauto-bulk-update all OR (node,taxonomy_term,user,other_modules...)

pathauto-delete-alias all OR (node,taxonomy_term,user,node/,other/default/path/,..)

the code contains enough comment and it's directly inspired from the actual pathauto code, actually this code use the batch from the pathauto code.

  batch_set($batch);
  $batch =& batch_get();
  $batch['progressive'] = FALSE;

  // Process the batch.
  drush_backend_batch_process();

Status:Active» Needs work

The last submitted patch, pathauto.drush_.inc_.patch, failed testing.

delta’s picture

StatusFileSize
new4.97 KB
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch pathauto.drush_.inc__0.patch. This may be a -p0 (old style) patch, which is no longer supported by the testbots.
[ View ]

Git diff it's maybe better

Status:Needs review» Needs work

The last submitted patch, pathauto.drush_.inc_.patch, failed testing.

colan’s picture

Note for later: Once this gets finalized, we need to update the documentation page.

colan’s picture

Status:Needs work» Needs review
StatusFileSize
new5.72 KB
PASSED: [[SimpleTest]]: [MySQL] 280 pass(es).
[ View ]

I found some problems. Here are the changes with this new patch:

  • Fixed the function name mismatch between the help and the callback.
  • Gave commands more consistent naming: pathauto-aliases-create and pathauto-aliases-delete.
  • Fixed formatting.
  • Added the Drush include file to the info file.
  • Put the Drush command file, pathauto.drush.inc, in the current directory, not '"b/'. (I'm writing this weirdness off as a Windows problem. ;)
  • Added proper Git metadata.

Things seem to work, but recreating the nodes on my site takes hours, so will report back if there are problems. Deleting and creating everything else works fine.

EDIT: Turns out it only took 20 minutes with no problems. From the UI, it took several hours.

Alex Andrascu’s picture

Go go go :). Get it in .dev. Need this badly !

colan’s picture

@Alex: This process would go much faster if you helped test the patch, and/or provide a code review, and then report back. Comments like that really don't help speed things along.

Alex Andrascu’s picture

OK @colan.
You'll be happy to know then, that patch applied successfuly against 7.x-1.x-dev and works as intended with drush version 4.5.
Don't get so dramatic :)

colan’s picture

Status:Needs review» Needs work

There's something missing here. Translations (with Entity Translation) aren't being updated, even though this works perfectly well from the UI. See #1155134: Integrate pathauto bulk generation for details. I'll try to figure out what's going on.

@Alex: Dave Reid has ranted about getting spam in the past. I'm just trying to keep you on his good side. ;)

Dave Reid’s picture

1. I think it would probably be better to present the user with a list of options for bulk generation or deletion (like the drush cache-clear command) that also works with an argument if you provide it. Also like the drush cache-clear command I'd also not worry about multiple types, but an all or only one single at a time. Should make this a bit easier.
2. Don't need to add pathauto.drush.inc to pathauto.inc as that file doesn't contain any classes for the D7 class registry. Let's back out that change.

Dave Reid’s picture

Assigned:Dave Reid» Unassigned
Dave Reid’s picture

That said, I definitely would love to see this feature land. Can we also add unit tests for Drush commands provided by a module at all? Anyone know?

colan’s picture

Assigned:Unassigned» colan

Here is a list of things that I can get done in the immediate future:

  • Drop the addition to pathauto.info.
  • Get field translations working like they do through the UI.
  • Modify the "create" command to operate on more than only empty values. It should do whatever's set by the Update Action parameter in the UI. (This can be changed by a "drush vset" call to the variables table, or we can stick this vset call in another pathauto drush command later.)
    • Do nothing. Leave the old alias intact.
    • Create a new alias. Leave the existing alias functioning.
    • Create a new alias. Delete the old alias.
Dave Reid’s picture

I'd rather the create command keep with the current logic of the bulk alias batch operation and only work on unaliased items only and not also attempt to update existing content with aliases as well. We can consider that a separate feature change and improvement to the bulk generation after this patch lands.

Dave Reid’s picture

colan’s picture

StatusFileSize
new5.58 KB
PASSED: [[SimpleTest]]: [MySQL] 280 pass(es).
[ View ]

I'd rather the create command keep with the current logic of the bulk alias batch operation and only work on unaliased items only and not also attempt to update existing content with aliases as well.

Until a few minutes ago, I thought the bulk update from the UI was using that setting, but it's not. So yes, I agree.

Here's a patch that:

  1. removes the addition to pathauto.info.
  2. gets field translations working like they do through the UI.

The problem with the second one was that only one module's settings were being kept for each type. So Entity Translation's (ET's) override settings for node were being clobbered by those from node itself. I've changed it so that settings for each type are only being recorded once. If they already exist, they don't get set again. This works because ET's settings get added to the settings array before node's. (I'll admit I don't really understand why. ET's weight (11) is higher than Pathauto's (1) which contains the settings for node.)

I changed:

<?php
$callbacks
[$settings->module] = $settings;
?>

...to:

<?php
// Only assign the settings once for each type.  Otherwise, we'll be
// overwriting settings of other modules that got there first.
if (!isset($callbacks[$settings->module])) {
 
$callbacks[$settings->module] = $settings;
}
?>

If I understand Dave's comments correctly, we still need to add the following:

  • Turn the type argument into a menu chooser (unless it's provided in "--types").
  • Allow for an entity ID argument. (I don't know how this would work, as these aren't unique. It would be necessary to specify the entity type as well.) If it exists, only operate on that entity. Otherwise, operate on all of them.
colan’s picture

colan’s picture

Status:Needs work» Needs review
delta’s picture

Thank you for your help, we need this feature, but i don't know how to make a patch in a good (drupal) way. so thank you for your help. I'm gone a test the latest patch and let you know.

Thx

Dave Reid’s picture

StatusFileSize
new2.25 KB
PASSED: [[SimpleTest]]: [MySQL] 304 pass(es).
[ View ]

Here's my effort to get this working as 'pathauto-aliases-generate' which also uses the same options as the tab in the pathauto UI, offers the user the ability to select a type, or provide it via the command. I've left off the delete callback as I'd really like to have #1359306: Bulk delete "Users" should not delete aliases for user/register, user/login or user/password land first, then add support for bulk delete via the batch operations.

delta’s picture

Your code seems to be very clean as if we not pass the parameters, he add a list of choices of types of alias pathauto are going to recreate. We need to merge the last patch from colan #21
with #25. Are you agree with that ? and do you want to add a choices of types too for the delete command ?

_redfog’s picture

Hello everyone.

I applied the path new_drush_commands_to_create_and_delete_url_aliases-867578-21.patch and 867578-pathauto-drush-generate.patch but independently.

For the first one, I launch the command:
drush pathauto-aliases-create

and for the second one, I tried this:
drush pathauto-aliases-gerenate

Commands are executed, but end up quickly. I checked some url, nothing happened. Someone else tried ?

Pauthauto version I use: 7.x-1.1

colan’s picture

Assigned:colan» Unassigned
DuaelFr’s picture

StatusFileSize
new3.88 KB
PASSED: [[SimpleTest]]: [MySQL] 316 pass(es).
[ View ]

Added delete support on the David's patch basis because of a personal and urgent need.
Use with caution.

make77’s picture

Version:7.x-1.x-dev» 7.x-1.2
Status:Needs review» Needs work
StatusFileSize
new4.36 KB
PASSED: [[SimpleTest]]: [MySQL] 316 pass(es).
[ View ]

I updated patch given in #29 as it didn't work with pathauto 7.x-1.2.

make77’s picture

Status:Needs work» Needs review
perke’s picture

Just tested the patch in #30 on bulk delete and update on ~25k nodes and it works perfectly.

Only issue was that job was failing due to exceeding maximum execution time of 120 seconds but that's related to my particular setup, so I had to run the command 3 times to get everything updated.

Thanks

Mac_Weber’s picture

adding an argument to limit the number of nodes to be processed would fix the issue related on #32

MiroslavBanov’s picture

SGhosh’s picture

Patch at #30 tested. Deleted and generated 72 aliases at one go. Works awesome!

Mac_Weber’s picture

Status:Needs review» Needs work

It seems good but I think it would need a bit more work to address scenarios on comment #32, maybe as stated on #33

Dave Reid’s picture

I'm surprised the drush batch processing doesn't automatically handle that for us? Is there something we're not doing right in invoking the batch?

pwaterz’s picture

I dont think the command should print out output each time it updates. Writing to stdout can slow down the command. I think it should be a percentage update.

emilyf’s picture

I tried patch #30 on 7.x-1.2 and it applied cleanly but then when I issued either command

drush pathauto-aliases-generate
drush pathauto-aliases-create

I get:
Fatal error: Cannot redeclare redirect_drush_command() (previously declared in /sites/all/modules/redirect/redirect.drush.inc:13) in /sites/all/modules/redirect1/redirect.drush.inc on line 24
Drush command terminated abnormally due to an unrecoverable error. [error]

EDIT: That error was just my own stupidity with a redirect issue. THE PATCH APPLIES AND WORKS. Running drush pathauto-aliases-generate then gives me the option to choose what I want to generate.

emilyf’s picture

Anyone have ideas on why it might be extremely slow?

Just doing about 20 nodes took more than 20 minutes...

drush pathauto-aliases-generate
Enter a number to choose which entity types to bulk generate aliases.
[0] : Cancel
[1] : all
[2] : node
[3] : taxonomy_term
[4] : user

2
Updated alias for node 14893. [ok]
Updated alias for node 14918. [ok]
Updated alias for node 14943. [ok]
Updated alias for node 14968. [ok]
Updated alias for node 14993. [ok]
Updated alias for node 15018. [ok]
Updated alias for node 15043. [ok]
Updated alias for node 15068. [ok]
Updated alias for node 15093. [ok]
Updated alias for node 15118. [ok]
Updated alias for node 15143. [ok]
Updated alias for node 15168. [ok]
Updated alias for node 15193. [ok]
Updated alias for node 15218. [ok]
Updated alias for node 15243. [ok]
Updated alias for node 28918. [ok]
Updated alias for node 28943. [ok]
Updated alias for node 28968. [ok]
Updated alias for node 28993. [ok]

DuaelFr’s picture

@emilyf : This article contains some performance tips it is a bit outdated but it could help you.

saurabh.arya’s picture

drush pathauto-aliases-generate is not working for me.I have large number of node,any other solution?

Owen Barton’s picture

Status:Needs work» Reviewed & tested by the community
StatusFileSize
new4.36 KB
PASSED: [[SimpleTest]]: [MySQL] 316 pass(es).
[ View ]

I think this looks good - confirmed the Drush batch usage is correct. Attached patch just fixes a doxygen typo.

The parent php process is still subject to php timeout constraints (the batch backend calls are still useful in that they limit excess memory usage). In theory we could set an unlimited timeout in the callback, but I think this is a bad idea - if this is being run on cron and something goes wrong (e.g. infinite loop somewhere) it will kill the server - best to leave that decision to the site/sysadmin. This can be changed in any systemwide php.ini, or by the user via Drush drush.ini/DRUSH_INI, or via $drush_php_override/PHP_OPTIONS.

In terms of the script output, I agree that might be improved, but I don't think is a performance factor (writing a line of terminal output is minute relative to loading a node), so would suggest it is better to do in a followup patch.

thermador’s picture

Applied the patch, got it working, but regardless of my selection:

Enter a number to choose which entity types to bulk generate aliases.
[0]  :  Cancel
[1]  :  all
[2]  :  node
[3]  :  taxonomy_term
[4]  :  user
[5]  :  forum

... I always get "No New URL aliases to generate."

Do I really have to *delete* all of the old URL aliases before I generate new ones?

Can't I just *update* the aliases to a new replacement pattern (specified in: /admin/config/search/path/patterns) without deleting the old aliases?

I don't want to break all my links, and the redirect module don't seem to pick up on the fact that the aliases are changing (even though it is supposed to, I have the box checked "Automatically create redirects when URL aliases are changed."), so deleting an alias == broken link.

I think I'm probably just going to have to update my URL aliases 50 at a time from admin/content/.

Pathauto is used by over half a million sites, one of the most widely used modules (it should be in the core) and yet there still isn't any simple upgrade path from Drupal 6 -> Drupal 7...

jlporter’s picture

Issue summary:View changes

Patch from #43 works for me. It's beyond time to merge this so it will get used and further user testing can be done. Low risk to release as only drush users will be affected. I was shocked this didn't exist already.

annya’s picture

#43 works great for me. I update this issue, cause we really should to implement drush integration in pathauto.

kenorb’s picture

Tested and it works fine.

$ drush pathauto-aliases-delete Forums
All of your Forums path aliases have been deleted. 
$ drush pathauto-aliases-generate forum
Updated alias for forum 652. [ok]
Updated alias for forum 666. [ok]
Generated 38 URL aliases. 

Testing path alias change via:

drush vset pathauto_forum_pattern '[term:vocabulary]/[term:parents-all]/test' && drush pathauto-aliases-delete Forums && drush pathauto-aliases-generate forum && drush sqlq "SELECT source, alias FROM url_alias WHERE source LIKE 'forum/%'"

Works.

However these seems to inconsistent (capital and lower-case names):

$ drush pathauto-aliases-delete
Enter a number to choose which entity types to bulk delete aliases.
[0]  :  Cancel        
[1]  :  all           
[2]  :  Users         
[3]  :  Content       
[4]  :  User blogs    
[5]  :  Taxonomy terms
[6]  :  Forums
$ drush pathauto-aliases-generate
Enter a number to choose which entity types to bulk generate aliases.
[0]  :  Cancel       
[1]  :  all          
[2]  :  node         
[3]  :  taxonomy_term
[4]  :  user         
[5]  :  forum        
[6]  :  blog
DamienMcKenna’s picture

I suggest including this in the 1.3 release.

Anybody’s picture

Thanks a lot, this patch works just fine and is quite important. Is it possible to get this into the next release soon?

bewilled’s picture

In regards to the performance issue mentioned in #40 the following query shows how to generate url aliases at a database level for nodes in Postgresql:

INSERT INTO url_alias(source, language, alias)
SELECT 'node/' || sub.nid,
                  'und',
                  CASE
                      WHEN sub.row_num = 1 THEN ''%your_path%/'||lower(replace(regexp_replace(sub.title,'[^A-Za-z0-9 ]', '', 'g'),' ','-'))
                      ELSE '%your_path%/'||lower(replace(regexp_replace(sub.title,'[^A-Za-z0-9 ]', '', 'g'),' ','-'))||'-'|| sub.row_num
                  END AS title
FROM
  (SELECT title,
          nid,
          ROW_NUMBER() OVER(PARTITION BY title
                            ORDER BY nid) AS row_num
   FROM node where node.type = ''%your_content_type%') AS sub;

replacing %your_path% and '%your_content_type% with their respective value.

What this basically does is select the titles from the node table, trim off all non alphanumeric characters(except spaces), then replace all spaces with the '-' char. If there happens to be a two titles that are equal, then it appends a number that increases as many times as there are duplicates to the end of the url.

Use at your own risk.

doitDave’s picture

Subscribing, can #43 please make it into 1.3, and more important: Can 1.3 pls make it soon into the public? ;) (Any plans on this?)

DamienMcKenna’s picture

@doitDave: We need a) code review on several pending issues, possibly a few needing feedback, b) a maintainer / new co-maintainer who has the time to deal with it.

Media Crumb’s picture

Did this ever go anywhere?

DamienMcKenna’s picture

@Media Crumb: The issue is still at RTBC, therefore it has not proceeded anywhere in the past two years.

Media Crumb’s picture

Crazy,

I've been using it for a while now and have had not real issues with it. Seems a shame not to move it closer to a commit.

doitDave’s picture

@damien #53 sorry, somehow I did not receive the "updated" flag but found it updated accidentally today.

Is there still something to be done? Can I assist?

DamienMcKenna’s picture

Version:7.x-1.2» 7.x-1.x-dev

@doitDave: If you think you have the time to contribute to this module, going through the issue queue and reviewing patches would be the best step forwards, then send Dave a message offering to comaintain the module.