The generate-content scripts are fantastic. Many people who generate content will want to run tools like Siege against it: content pages, taxonomy pages, feeds, user pages.

Could we generate a list of URLs for all the content we create?

Kieran

Comments

moshe weitzman’s picture

Anyone want to take this on?

jamesJonas’s picture

Here is a very simple program to create a urls.txt file for 'siege'. Not fancy, but it gets the job done.

$domain_name = the domain name you are testing
$min_node = the first node ID on your site
$max_node = the last node ID on your site
$number_urls = the number of URLs you wish to create
$subdir = the subdirectory on your site

This is designed to run from the command line. If you wish to add several different types of content, run it several times, changing $subdir = "/node/"; to some other path and opening the file in append mode with $fh = fopen("/usr/etc/urls.txt", "a"); (see the second example below).

php -r '$domain_name = "example.com";
 $subdir = "/node/";
 $min_node = 1;
 $max_node = 1000;
 $number_urls = 100;
 $urls = "";
 for ($counter = 1; $counter <= $number_urls; $counter++) {
   $urls .= $domain_name . $subdir . rand($min_node, $max_node) . "/" . "\n";
 }
 echo $urls;
 $fh = fopen("/usr/etc/urls.txt", "w");
 fwrite($fh, $urls);
 fclose($fh);'
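
For instance, a second pass that appends taxonomy pages to the same file might look like this (a sketch only; the /taxonomy/term/ path and the term ID range are assumptions about your site):

php -r '$domain_name = "example.com";
 $subdir = "/taxonomy/term/";  // assumed Drupal taxonomy path
 $min_id = 1;
 $max_id = 50;  // assumed range of term IDs on your site
 $number_urls = 25;
 $urls = "";
 for ($counter = 1; $counter <= $number_urls; $counter++) {
   $urls .= $domain_name . $subdir . rand($min_id, $max_id) . "\n";
 }
 $fh = fopen("/usr/etc/urls.txt", "a");  // "a" appends instead of overwriting
 fwrite($fh, $urls);
 fclose($fh);'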

Running siege

siege -c 32 -i -t 11m -d 5 -f /usr/etc/urls.txt
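
For reference: -c 32 runs 32 concurrent simulated users, -i ("internet" mode) hits the URLs from the file in random order, -t 11m runs the test for eleven minutes, -d 5 sleeps each user for a random interval of up to five seconds between hits, and -f names the URLs file.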

James

joshk’s picture

Component: Code » devel

FYI I'll be working on this:

http://drupal.org/project/siege

joshk’s picture

Also, for people interested in getting real-world data for Siege purposes, Sproxy is a great resource:

http://www.joedog.org/Sproxy/Manual

federico’s picture

You can take a URL list from the XML sitemap module.

Just copy everything in yoursite/sitemap.xml, paste it into a spreadsheet editor, and divide the columns (text to columns). Then copy the first column to a text editor.
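
A minimal sketch of automating that extraction with PHP's SimpleXML instead of a spreadsheet (the sitemap URL is an assumption about your site; the namespace is the standard sitemaps.org one, and loading a remote file needs allow_url_fopen):

php -r '$xml = simplexml_load_file("http://example.com/sitemap.xml");  // assumed sitemap location
 $xml->registerXPathNamespace("s", "http://www.sitemaps.org/schemas/sitemap/0.9");
 $fh = fopen("/usr/etc/urls.txt", "w");
 // every <loc> element in the sitemap holds one URL
 foreach ($xml->xpath("//s:loc") as $loc) { fwrite($fh, $loc . "\n"); }
 fclose($fh);'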

devyd’s picture

For the record:
In case it's not obvious, Sproxy tunnels your browser requests. That means the resulting urls.txt will include all the .js, .css, favicon.ico, etc. requests from a real browser -- exactly what you need.

As joshk mentioned, it doesn't get any more realistic than this. Think of it as an "unprimed-cache front-page hit" urls.txt file. A great solution. I don't see the need for module support here -- anyone who uses Siege for their work should also be able to do a quick installation of Sproxy, imho.
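
A rough sketch of such a session (the -o output flag is my reading of the manual linked above, so check it before relying on this; Sproxy listens on port 9001 by default):

sproxy -o /usr/etc/urls.txt

Then set your browser's HTTP proxy to localhost:9001 and click through the site; every request the browser makes lands in the file, ready for siege -f.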

salvis’s picture

Assigned: Amazon » Unassigned
Category: task » support
Status: Active » Fixed

Thanks, devyd!

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.