I can make an assumption that during the advagg process of parallel_css it's always gonna be whatever is the selected mapping url(s) we want to balance this out as evenly as possible.
pretty much the same formula:
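<?php
// Pick a server from a per-file id (the cdn_pick_server() style).
$servers_for_file[$unique_file_id % count($servers_for_file)];
// OR pick the next server in turn with a running counter (the parallel_css style).
$paralell_css_settings_urls[$parallel_css_counter % $paralell_css_settings_count];
?>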
With that said I can do like this:
<?php
// If cdn basic mapping is available, assign as the mapping url array
// else if parallel css mapping is available, assign as the mapping url array
// else do nothing
// then process the mapping url array and the contents data
?>
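A minimal sketch of the fallback order outlined in the comments above, assuming both mappings are plain newline-separated strings of URLs. The function names `example_parse_mapping()` and `example_pick_mapping()` are hypothetical and are not part of cdn or parallel_css; the optional `|ext` suffix handling is an assumption based on the mapping format quoted in this thread.

```php
<?php
// Hypothetical helper: turn a mapping textarea value into a flat list of base URLs.
function example_parse_mapping($text) {
  $urls = array();
  foreach (preg_split('/\R/', (string) $text) as $line) {
    // Drop an optional "|ext" suffix such as "http://img1.drupal.org|png".
    $parts = explode('|', trim($line));
    // Drop a trailing slash so callers can append "/path" safely.
    $url = rtrim($parts[0], '/');
    if ($url !== '') {
      $urls[] = $url;
    }
  }
  return $urls;
}

// Prefer the CDN mapping, fall back to the parallel css mapping, else an empty list.
function example_pick_mapping($cdn_basic_mapping, $parallel_css_mapping) {
  $urls = example_parse_mapping($cdn_basic_mapping);
  if (empty($urls)) {
    $urls = example_parse_mapping($parallel_css_mapping);
  }
  return $urls;
}
```

An empty result means "do nothing", which matches the third branch of the outline.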
So I pretty much make a separate module for the parallel css mapping admin, just in case folks don't want to make use of the cdn module but still want to do load balancing on their css aggregates. Or pretty much just remove the admin aspect of parallel css and use cdn.
You want some sort of hash on the filename, that way the same file will always be coming from the same server; thus your browser will always have the cache of it. I don't think your current code does that. Also set the weight of this to be heavier than css_emimage
static0.computerdocs.com.au. 240 IN A 165.228.91.94
static1.computerdocs.com.au. 240 IN A 165.228.91.94
static2.computerdocs.com.au. 240 IN A 165.228.91.94
// Equal distribution algorithm (running counter)
$paralell_css_settings_urls[$parallel_css_counter % $paralell_css_settings_count];
// result set of (img1,img2,img3,img4,img1,img2,img3,img4)
// Hash based algorithm (per-file id)
$servers_for_file[$unique_file_id % count($servers_for_file)];
// result set of (img1,img1,img1,img2,img1,img2,img3,img4)
Matt Butcher made some suggestions on how to speed it up from:
md5
Overall Summary
Total Incl. Wall Time (microsec): 4,401 microsecs
Total Incl. MemUse (bytes): 102,348 bytes
Total Incl. PeakMemUse (bytes): 199,084 bytes
Number of Function Calls: 466
crc32
Overall Summary
Total Incl. Wall Time (microsec): 3,173 microsecs
Total Incl. MemUse (bytes): 101,664 bytes
Total Incl. PeakMemUse (bytes): 181,248 bytes
Number of Function Calls: 420
Many Thanks for working this through.
The use of crc32() may not be unique enough in some cases, hence the reasoning for using md5().
CRC is 'at most' an error detection method rather than a serious hash function. It helps in identifying say 'corrupted files' rather than uniquely identifying them.
Given a file, and a CRC32 checksum, it is relatively simple to make small modifications to the file so that it has the desired checksum. There is no easy way to do this with md5 sums.
CRC32 is useful for say, a communications checksum, because it's fast and efficient and effective at catching the kinds of errors that happen over a communications line (short bursts of errors, at most, in relatively small blocksizes). It's easy to implement and long predates MD5.
But if you're using it for anything other than a simple communications checksum, 'it's being abused'.
Most browsers will only open up 2 concurrent connections per cname. This means that if all of your assets are being served from http://example.com and you have a lot of little images and scripts on the page, the client's browser will only open two connections to example.com and pipeline or use those two connections to download all the assets on the page.
By using a wildcard subdomain or manually setting dns so that you spread out the static assets over a few different subdomains, you let the browser open two connections per subdomain so all the assets will download in a more parallel fashion. This can be the difference between your page loads lagging at the end while they load up all the little assets and having the page snap into place and seem a lot quicker to the user.
So if we use a simple little view helper method for all of our image urls we can spread the load out by faking the browser into thinking it is connecting to multiple servers. For example if we serve all our images from these subdomains:
This will give us 8 concurrent connections from the browser to the server for static assets which dramatically decreases page load time. The thing to watch out for is that you always want to serve the same asset from the same subdomain or else you defeat browser caching and won’t gain anything from this trick. So we will use a Zlib CRC32 hash of the asset url modulo 4 to choose a subdomain. Here is a simple helper:
require 'zlib'
# balance images across many domains to force the opening of more connections
# updated to use Zlib.crc32 instead of md5 as per
# comment from David
def balanced_asset_url(asset)
  idx = (Zlib.crc32(asset || "error") % 4) + 1
  %!http://asset#{idx}.#{request.domain}#{asset}!
end
Then use it like this:
<%= image_tag balanced_asset_url('/images/foo.png') %>
By hashing the asset path we make sure that each time this helper is called for the same asset it will always return the same subdomain.
This technique is most useful when you have many objects on a page that need to make an additional http request each to render. By tricking the browser into making more concurrent connections when fetching assets we can speed up our page load times and make our sites seem more ‘snappy’
The above 'quote' is only meant as an idea 'template' and 'brain food' :)
One of the golden rules for front-end performance optimisation — one recommended by both Yahoo's YSlow and Google's Page Speed — is to split your page assets across multiple hostnames to allow web browsers to download more of those assets in parallel. Unfortunately it turns out that some consumer-grade network devices will block traffic to sites that use these techniques if the asset hosts all have the same IP address.
Consequently, if your site downloads page assets from multiple hosts — often referred to as domain sharding — make sure they all have separate IP addresses.
...
... Timeout woes, SYN Flood to Host
Unfortunately, days later we started to get a steady trickle of customers complaining that they were getting timeout errors when accessing the LOVEFiLM website. They were reporting that the first page loaded, but most (though not all) of the images were broken. Any subsequent page requests all failed with a timeout error. Other than the symptoms, the customers had very little in common; ISPs, operating systems and browsers all seemed to be affected proportionately to our visitor stats.
The problem turned out to be caused by a well-intentioned but ultimately misguided setting baked into the stateful firewall built into certain consumer-grade ADSL routers. These routers track the number of unfinished TCP connections — that is, outbound TCP connections where the SYN packet has been sent but the router has yet to see a SYN ACK response from the server, otherwise known as embryonic connections — to each IP address. If the number of unfinished TCP connections to an individual IP address exceeds a given threshold, all subsequent packets to that IP are silently dropped for a period of 5 minutes. In the user's web browser, this results in timeout errors for any requests that did not make it through before the door was slammed shut.
The setting in question is commonly labelled Maximum unfinished TCP/UDP connections per host. On some devices such as the Belkin F5D7630 this setting is configurable through a hidden page in the router's web-based admin interface, but on others the threshold is simply baked into the firmware and cannot be changed. Worse, some devices ship with a default value as low as 10 for this setting. Modern web browsers make anywhere between 6 and 15 HTTP connections per hostname, so loading static assets from more than one hostname is almost certain to trigger this rule.
The only clue a user would have that their router was causing the connection to be blocked is the SYN Flood to Host entry in their firewall logs:
I can only assume that this setting is an attempt to lessen the effectiveness of DDoS attacks from the client-side. A noble intention, to be sure, but preventing or lessening the effectiveness of DDoS attacks on websites is not something I would consider to be within the domain of a consumer-grade ADSL modem. By all means protect the user against inbound DoS/DDoS attacks, but blocking outbound traffic based on what the router manufacturer deems to be normal usage seems like a step too far.
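To make the arithmetic behind the quoted router problem concrete, here is a rough worst-case sketch. It assumes every connection to an IP is still unfinished at the same moment, which is a simplification; the function name is hypothetical, and the numbers are the ones quoted above (a threshold of 10, and 6 connections per hostname as the low end of the quoted range).

```php
<?php
// Hypothetical worst case: all connections to one IP are "embryonic" at once.
function example_exceeds_router_threshold($hostnames_on_same_ip, $connections_per_hostname, $threshold) {
  $simultaneous = $hostnames_on_same_ip * $connections_per_hostname;
  return $simultaneous > $threshold;
}

// 4 asset hostnames on one IP, 6 connections each, router threshold of 10.
var_dump(example_exceeds_router_threshold(4, 6, 10)); // bool(true): 24 > 10

// Same hostnames on 4 separate IPs: each IP sees only 6 connections.
var_dump(example_exceeds_router_threshold(1, 6, 10)); // bool(false): 6 <= 10
```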
Refer #25 Oh joy! My company website is like that http://www.straightnorth.com
We are pretty much using (img1.straightnorth.com,img2.straightnorth.com,img3.straightnorth.com,img3.straightnorth.com,img4.straightnorth.com,css.straightnorth.com) all pointing to the same ip :(
*smile* That is only meant to be a 'heads up' about some 'possible issues' + how some 'typically older consumer' grade ADSL routers offer 'crude' 'firewall' protection... eg: "SYN Flood to Host" :)
Unfortunately it turns out that some consumer-grade network devices will block traffic to sites that use these techniques if the asset hosts all have the same IP address.
Personally, I use a dual-wan Linksys RV082 ADSL2+ on two active ADSL2+ lines - with two static IP's... feeding a dedicated Linux Server (3 x Ethernet Ports / Gateway). In this event, I have 'disabled' the Linksys RV082 WAN firmware 'crud protection' and use Linux 'packet stateful' firewall..
Notes: Google is pushing a growing number of hits for your module:
Showing results for drupal advagg
Search Results
Advanced CSS/JS Aggregation | drupal.org
drupal.org/project/advagg
19 Feb 2011 ... If the user has the permission of "bypass advanced aggregation" then adding ?advagg=0 to the end of the URL will turn off aggregation for ... Parrallel CSS - AdvAgg Plugin | drupal.org
drupal.org/project/parallel_css
8 Jun 2011 ... Inspired by the request AdvAgg - Use the CDN module for ...
This is a bit tricky. I am troubled by cdn's approach, where only those who know php will be able to pull this off. http://drupal.org/node/962266
We need a better approach here:
What do you think of this:
@ /admin/settings/advagg/parallel-css
If function_exist(cdn_file_url_alter) :
[X] Use Available CDN Mapping and CDN pick-server
----------------------------------------------------------------
Be sure to read: http://drupal.org/node/962266
----------------------------------------------------------------
URL:
----------------------------------------------------------------
Enter the domain urls you want included, one per line. Warning: don't include a '/' at the end of the domain url.
Instead of htaccess rules there is an issue for CDN in regards to SEO. It's fairly high on my priority list #1060358: CDN and SEO as in it might get done in 2 weeks
I mean .... I am giving an option for people to use the CDN mapping and cdn_pick_server instead of using parallel_css mapping and logic.
Pretty much a checkbox in the admin settings page of parallel_css
[ YES OR NO ] [X] Use Available CDN Mapping and CDN pick-server
----------------------------------------------------------------
Be sure to read: http://drupal.org/node/962266
-----------------------------------------------------------------
@peter bowey
Not the right solution. We need to send out a 404 at a minimum or a 301 ideally if someone tries to access html content on your server through the CDN.
In nginx.conf, I use something like this for the CDN private back-channel URI path (what the CDN pulls from):
# Drupal STATIC DOMAIN = Static Assets for CDN PULL
server {
    server_name cdn1.peterbowey.com.au cdn2.peterbowey.com.au cdn3.peterbowey.com.au cdn4.peterbowey.com.au;
    root /var/www/virtual/peterbowey.com.au;
    limit_conn gulag 12; # max concurrent connections per client /ip
    index index.php index.html;
    if_modified_since exact;
    access_log /var/log/nginx/wp_static.log main buffer=32k;
    # ...
    # ...
    # Avoid bandwidth stealing (Media resources) - serve 1x1 transparent GIF
    valid_referers none blocked server_names www.peterbowey.com.au www.pbcomp.com.au small.gdlcdn.com ~(peterbowey.com.au.|google.); # reduce linking from outside
    if ($invalid_referer) {
        return 403;
    }
    # Deny illegal Host headers
    if ($host !~* ^(cdn1.peterbowey.com.au|cdn2.peterbowey.com.au|cdn3.peterbowey.com.au|cdn4.peterbowey.com.au|www.peterbowey.com.au)$ ) { # allow access for CDN + self
        return 444;
    }
    # ...
    # ...
    # send our static not cached requests to our drupal PHP Dynamic Domain with clean URLs support (301)
    location @drupal {
        rewrite ^/(.*)$ $scheme://www.peterbowey.com.au/index.php?q=$1 last;
    }
    # ...
    # ...
    # deny access to any php files
    location ~* ^.+\.php$ {
        deny all;
    }
}
Then something like this on the Drupal PHP side:
server { # DRUPAL DYNAMIC SECTION:
    server_name www.peterbowey.com.au;
    root /var/www/virtual/peterbowey.com.au;
    limit_conn gulag 20; # max concurrent connections per client /ip
    index index.php index.html;
    access_log /var/log/nginx/peter-drupal.log main buffer=32k;
    error_log /var/log/nginx/bad-error.log;
    # Deny illegal Host headers
    if ($host ~* ^(cdn1.peterbowey.com.au|cdn2.peterbowey.com.au|cdn3.peterbowey.com.au|cdn4.peterbowey.com.au)$ ) { # Remote CDN should NOT come here
        rewrite ^ $scheme://cdn1.peterbowey.com.au$request_uri permanent; # send it to the correct static host
    }
    # ...
}
hey how expensive is it to get several ips and host accounts and just have it rsync? trying to solve my straightnorth.com and imgX pointing to same ip problem.
BTW CDN make use of hook_file_url_alter via cdn_file_url_alter --- that function is a beast with user access checks, cdn testing checks and cdn_devel_page_stats stuff. i am currently pretty much copying and pasting the important parts of cdn_file_url_alter -- or i could go the route of calling cdn_file_url_alter... what do you guys think?
*Same IP's*
That should only be a 'problem' if the router starts blocking packets. See #28
Unfortunately it turns out that some consumer-grade network devices will block traffic to sites that use these techniques if the asset hosts all have the same IP address.
pretty much we still have to go to
@ /admin/settings/advagg/parallel-css
Check the box [X] use available cdn mapping and cdn_pick_server of cdn
this doesnt do the following CDN features:
a.) CDN supports HTTPS
b.) Drupal paths entered in this blacklist will not serve any files from the CDN. This blacklist is applied for all users.
c.) Drupal paths entered in this blacklist will not serve any files from the CDN. This blacklist is applied for authenticated users only.
These will probably have to be other issues. Right now, I don't know how to pull these off.
Why are you copying the cdn_file_url_alter function? Just require the CDN module and be done with it. Or am I missing something? Run the image references in the CSS through file_create_url or if they are running CDN on a non patched drupal, detect it by stealing the first part of cdn_init() (variable_get(CDN_THEME_LAYER_FALLBACK_VARIABLE, FALSE) == TRUE) and then call cdn_file_url_alter directly.
Have it look something like this
// CDN Support.
if (module_exists('cdn')) {
  $status = variable_get(CDN_STATUS_VARIABLE, CDN_DISABLED);
  if ($status == CDN_ENABLED || ($status == CDN_TESTING && user_access(CDN_PERM_ACCESS_TESTING))) {
    if (variable_get(CDN_THEME_LAYER_FALLBACK_VARIABLE, FALSE) == TRUE) {
      // Non patched Drupal: file_create_url() will not reach the CDN module,
      // so call it directly. Assumption: cdn_file_url_alter() alters $path
      // by reference rather than returning it.
      cdn_file_url_alter($path);
      return $path;
    }
    else {
      // Patched Drupal: file_create_url() invokes the CDN module itself.
      return file_create_url($path);
    }
  }
}
else {
  // "Simple" fallback if CDN is not installed. Don't re-implement a module's logic.
}
@chriscalip, your module may have possibly saved me from moving my D6 Core to D7 (a long story within...) .. :)
@mikeytown2, the updated advagg_build_uri() you applied has made the 'great Code Sun' shine here :)
see http://drupal.org/node/1185786. Great, see my comment above to Chris.
Many thanks for a useful module Chris, additionally - we have learned some 'cool stuff'.
Teamwork = Cool!
Comments
Comment #1
chriscalip commented
Well I thought about it.. and this one really complements the CDN module. Pretty much CDN rewrites the links inside the html page and this module in conjunction with (advagg) rewrites the links in the aggregated css files.
And I could use some opinion about this:
I can be wrong about this but the admin panel in CDN admin/settings/cdn/details has the logic set up by file type. Usually people would do
CDN Mapping:
http://img1.drupal.org|jpg
http://img2.drupal.org|gif
http://img3.drupal.org|png
http://img4.drupal.org|ico
While this module has it for the sequential replacement element of the logic.
http://img1.drupal.org
http://img2.drupal.org
http://img3.drupal.org
http://img4.drupal.org
pretty much with a data set of
background:url('sites/all/themes/do/1.png');
background:url('sites/all/themes/do/2.png');
background:url('sites/all/themes/do/3..png');
background:url('sites/all/themes/do/4.png');
background:url('sites/all/themes/do/5.png');
background:url('sites/all/themes/do/6.png');
background:url('sites/all/themes/do/7.png');
background:url('sites/all/themes/do/8.png');
The result would be:
background:url('http://img1.drupal.org/sites/all/themes/do/1.png');
background:url('http://img2.drupal.org/sites/all/themes/do/2.png');
background:url('http://img3.drupal.org/sites/all/themes/do/3..png');
background:url('http://img4.drupal.org/sites/all/themes/do/4.png');
background:url('http://img1.drupal.org/sites/all/themes/do/5.png');
background:url('http://img2.drupal.org/sites/all/themes/do/6.png');
background:url('http://img3.drupal.org/sites/all/themes/do/7.png');
background:url('http://img4.drupal.org/sites/all/themes/do/8.png');
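The sequential replacement shown above can be sketched roughly as follows. This is only an illustration, not the module's actual code: `example_sequential_rewrite()` is a hypothetical name, the regular expression is a simplification that also prefixes URLs that are already absolute, and the closure needs PHP 5.3 or later.

```php
<?php
// Hypothetical sketch: prefix every url() reference with the next domain in turn.
function example_sequential_rewrite($css, array $domains) {
  $count = count($domains);
  if ($count === 0) {
    return $css;
  }
  $counter = 0;
  return preg_replace_callback(
    // Matches url(path), url('path') and url("path").
    '/url\(\s*([\'"]?)([^\'")]+)\1\s*\)/i',
    function ($m) use (&$counter, $domains, $count) {
      $domain = $domains[$counter % $count];
      $counter++;
      return 'url(' . $m[1] . $domain . '/' . ltrim($m[2], '/') . $m[1] . ')';
    },
    $css
  );
}
```

Because the server depends only on the position of each url() in the file, adding or removing one reference shifts every reference after it to a different domain.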
Comment #2
chriscalip commented
Having said that what would be the user experience for managing this? I can't picture a way to consolidate these 2 different requirements into just one admin interface aka the mapping Text Area field. What do you think?
Comment #3
mikeytown2 commented
I helped Wim Leers with some prototype code that deals with this exact situation. It is available in the CDN module. See readme.txt for details.
Your mapping on admin/settings/cdn/details would look like
And then the "PHP code for cdn_pick_server()" on admin/settings/cdn/other would look like
This will spread all cdn requests fairly equally across the 4 different img* domains.
Comment #4
chriscalip commented
It is very possible to do integration with cdn. Because both $cdn_basic_mapping and $parallel_css_settings are just a string of urls.
$parallel_css_settings is always:
http://img1.drupal.org
http://img2.drupal.org
http://img3.drupal.org
http://img4.drupal.org
While $cdn_basic_mapping can be
http://img1.drupal.org
http://img2.drupal.org
http://img3.drupal.org
http://img4.drupal.org
With:
$CDN_PICK_SERVER_PHP_CODE_VARIABLE:
$filename = basename($servers_for_file[0]['url']);
$unique_file_id = hexdec(substr(md5($filename), 0, 5));
return $servers_for_file[$unique_file_id % count($servers_for_file)];
or this:
http://img1.drupal.org|png
http://img2.drupal.org|gif
http://img3.drupal.org|jpg
http://img4.drupal.org|ico
The devil is in the details
Comment #5
chriscalip commented
I can make an assumption that during the advagg process of parallel_css it's always gonna be whatever is the selected mapping url(s) we want to balance this out as evenly as possible.
pretty much the same formula:
With that said I can do like this:
So I pretty much make a separate module for the parallel css mapping admin, just in case folks don't want to make use of the cdn module but still want to do load balancing on their css aggregates. Or pretty much just remove the admin aspect of parallel css and use cdn.
What do you think?
Comment #6
Peter Bowey commented
@chriscalip
I love the idea = +1.
Load Balancing => 'yes'
Notes: I have not started using this module yet, I prefer to read the source and see where it is going...:)
Well done!
Comment #7
mikeytown2 commented
You want some sort of hash on the filename, that way the same file will always be coming from the same server; thus your browser will always have the cache of it. I don't think your current code does that. Also set the weight of this to be heavier than css_emimage
Comment #8
Peter Bowey commented
Referring to #7
In the 'ancient' non CMS days .... :)
I used a 'parallel' URI CDN 'hash' like this: (0-2)
eg: ('old timer' HTML method sample):
DNS:
Comment #9
mikeytown2 commented
@peter bowey
In regards to #8, that works great until the order of your link tags change; once they change then you have to re-download the same CSS file from a different domain instead of getting it from your browser cache. Or in this case if you add/remove a url() link at the top of a CSS file then all the url() references will be pointing to a different server.
The code below shows how the filename hash thing works. If you change the number of servers then the modulus will be different. This isn't perfect by any means but in terms of code complexity VS getting it right, it's a pretty good tradeoff. The url() changes when the # of available servers changes, which makes sense.
Outputs
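A minimal sketch of the filename-hash idea, not mikeytown2's original code. It reuses the md5/hexdec formula quoted in comment #4; `example_pick_server_by_hash()` is a hypothetical name, and hashing only the base name of the file is an assumption carried over from that formula.

```php
<?php
// Hypothetical sketch: pick a server from a hash of the file name.
function example_pick_server_by_hash($path, array $servers) {
  // The first 5 hex digits of the md5 fit comfortably in an integer.
  $unique_file_id = hexdec(substr(md5(basename($path)), 0, 5));
  return $servers[$unique_file_id % count($servers)];
}

$servers = array(
  'http://img1.drupal.org',
  'http://img2.drupal.org',
  'http://img3.drupal.org',
  'http://img4.drupal.org',
);

// The same file name maps to the same server, whatever CSS file or
// position it is referenced from.
$first = example_pick_server_by_hash('sites/all/themes/do/1.png', $servers);
$again = example_pick_server_by_hash('sites/default/files/1.png', $servers);
var_dump($first === $again); // bool(true)
```

The choice only changes when the number of servers changes, which is the tradeoff described above.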
Comment #10
Peter Bowey commented
@mikeytown2
Many thanks!
That is an 'acceptable' method!
I will plan to integrate it into the advagg + parallel interface 'thingy'
Appreciate your will and time to encourage an 'old dog non-cms coder'.
* I am still learning the correct Drupal 'bark' - it is not 'woof - woof' - more like 'callback sometime grrrr' * :)
Comment #11
chriscalip commented
Hey mikey,
I made you a co-maintainer if you want to handle the hash code, I'll whip up the cdn integration thing. Sorry, talking with a client. Can't respond for a time.
Comment #12
Peter Bowey commented
@mikeytown2 project support count = +1
Mike, must be about 22+ projects you love + support :)
I elect that you have 26 hours per day, the rest of us 24....
Comment #13
chriscalip commented
Yikes, I thought about #11 more .. it's just that I wasn't aware of the concept. I can quickly research and implement it, but if you want to take care of it (at least that part of the module) that's okay too :)
Comment #14
mikeytown2 commented
I'll be busy over here for a little while so the ball is in your court :)
http://groups.drupal.org/node/154564
Comment #15
chriscalip commented
I got this, should be finished by tomorrow. Need to sleep and all
Comment #16
Peter Bowey commented
@chriscalip
I think @mikeytown2 has planted enough 'good seed' to get this 'hash code' rose 'in bloom' :)
Comment #17
chriscalip commented
I could not sleep. This is interesting.
I re-read the messages and i realized that i am not getting the big picture here.
Picking on the clues "asset collective" and "to be heavier than css_emimage" I started
reading the issue queues of several modules including advagg and css_emimage.
Having said that I just want to be clear on what we are trying to pull off here.
Drupal site http://www.example.com
has several css files including the following
site admin installs cdn, advagg, parallel_css, and css_emimage.
CDN mapping url:
Three scenarios:
Scenario A: parallel_css, advagg compress css, core advagg css/js are enabled. css_emimage is not.
During the css aggregation process, because parallel_css has a weight of -10 (see parallel_css.install), it gets first dibs
on hook_advagg_css_alter. parallel_css gets the mapping url array from cdn_basic_mapping and then proceeds to the replacement
process. After the replacement process, $content gets passed to the other implementers of hook_advagg_css_alter, and at the
end of the process we get an aggregated file of css_0f8107b462965cd0d36e3ad9a51359e7_0.css containing among its contents:
--- So why does parallel_css need to implement hash code if the other modules are doing it?
Scenario B: parallel_css, advagg compress css, core advagg css/js are enabled. css_emimage is too.
parallel_css is lightest.
we get an aggregated file of css_0f8107b462965cd0d36e3ad9a51359e7_1.css containing among its contents:
--- Is Css Embedded Image able to handle a domain name ???
Scenario C: parallel_css, advagg compress css, core advagg css/js are enabled. css_emimage is too.
parallel_css is heavier than css_emimage
we get an aggregated file of css_0f8107b462965cd0d36e3ad9a51359e7_2.css containing among its contents:
--- are these strings valid ???
Comment #18
mikeytown2 commented
XXXX-CSS-EmbeddedString-XXXXX is a BASE64 encoded version of that file. You get the benefits of an image sprite without some of the hassles that come with it. So this module (Parrallel CSS) needs to check that the url() is not base64 encoded and is a file. css_emimage will only drop in 32kb of image data into the CSS file so anything larger will then be processed in this module.
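A rough sketch of the check described in this comment, assuming embedded images appear as `data:` URIs inside url(). `example_is_rewritable_url()` is a hypothetical name; the sketch only filters by the shape of the reference and does not confirm that the file exists on disk, which the real module would presumably also want to do.

```php
<?php
// Hypothetical check: TRUE only for url() targets that look like plain local file references.
function example_is_rewritable_url($target) {
  // Strip whitespace and surrounding quotes.
  $target = trim($target, " \t\n\r'\"");
  if ($target === '') {
    return FALSE;
  }
  // Embedded images look like "data:image/png;base64,...": leave them alone.
  if (stripos($target, 'data:') === 0) {
    return FALSE;
  }
  // Leave absolute and protocol-relative URLs alone as well.
  if (preg_match('#^([a-z][a-z0-9+.-]*:)?//#i', $target)) {
    return FALSE;
  }
  return TRUE;
}
```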
Comment #19
chriscalip commented
Mikeytown2 and peter bowey , you guys were pretty deep, I could not get what you guys were saying. mbutcher and i figured it out and even made some improvements.
First I wanted to make sure that this is the concept that we are trying to achieve.
1.png from 1.css is loaded the first time as img1.d.o/1.png
at the next pages: 1.png from 2.css is appearing as img2.d.o/1.png
what we want to make sure is 1.png is always attached to the same server.
1.png is always http://img1.d.o/1.png from any aggregated css.
although the distributed set is not always optimal i agree that this is the best way in the long run.
Matt Butcher made some suggestions on how to speed it up from:
md5
Overall Summary
Total Incl. Wall Time (microsec): 4,401 microsecs
Total Incl. MemUse (bytes): 102,348 bytes
Total Incl. PeakMemUse (bytes): 199,084 bytes
Number of Function Calls: 466
TO:
crc32
Overall Summary
Total Incl. Wall Time (microsec): 3,173 microsecs
Total Incl. MemUse (bytes): 101,664 bytes
Total Incl. PeakMemUse (bytes): 181,248 bytes
Number of Function Calls: 420
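For comparison with the md5 formula from comment #4, a crc32-based pick could look roughly like this. It is a sketch under assumptions, not the committed code: `example_pick_server_crc32()` is a hypothetical name, and the `sprintf('%u')` step is there because crc32() can return a negative integer on 32-bit PHP builds.

```php
<?php
// Hypothetical sketch: same idea as the md5 version, but with crc32().
function example_pick_server_crc32($path, array $servers) {
  // sprintf('%u') gives the unsigned value even on 32-bit PHP builds.
  $id = (float) sprintf('%u', crc32(basename($path)));
  return $servers[(int) fmod($id, count($servers))];
}

$servers = array('http://img1.d.o', 'http://img2.d.o', 'http://img3.d.o', 'http://img4.d.o');
// 1.png referenced from two different css files still lands on one server.
var_dump(
  example_pick_server_crc32('themes/a/1.png', $servers) === example_pick_server_crc32('themes/b/1.png', $servers)
); // bool(true)
```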
Comment #20
Peter Bowey commented
@chriscalip
Many Thanks for working this through.
The use of crc32() may not be unique enough in some cases, hence the reasoning for using md5().
Given a file, and a CRC32 checksum, it is relatively simple to make small modifications to the file so that it has the desired checksum. There is no easy way to do this with md5 sums.
CRC32 is useful for say, a communications checksum, because it's fast and efficient and effective at catching the kinds of errors that happen over a communications line (short bursts of errors, at most, in relatively small blocksizes). It's easy to implement and long predates MD5.
But if you're using it for anything other than a simple communications checksum, 'it's being abused'.
Comment #21
chriscalip commented
@peter bowey
My pleasure, it's a fun project for me. http://drupalcode.org/project/parallel_css.git/commit/18974e3 Done.
Comment #22
Peter Bowey commented
Refer #19:
See http://brainspl.at/articles/2006/12/29/speed-up-page-loads
The above 'quote' is only meant as an idea 'template' and 'brain food' :)
Comment #23
chriscalip commented
I think we have achieved this now. :) 1.png will always be assigned to the same domain.
@TODO if cdn_basic_mapping exist use that instead of the parallel_css_mapping
@TODO make parallel_css weight more heavy than css_emimage
Comment #24
Peter Bowey commented
Refer #23
@chriscalip
Looking through the latest code @ http://drupalcode.org/project/parallel_css.git/blob_plain/refs/heads/6.x...
The above code methods look good to me.
I will test this 'real-time' today! :)
+1
Many thanks for contributing to Drupal projects!
Comment #25
Peter Bowey commented
Refer #23
It is also interesting reading through other projects / ideas that used this parallel asset method:
See -> http://statichtml.com/2010/use-unique-ips-for-sharded-asset-hosts.html
Overloading of brain food (sorry!)... :)
Comment #26
chriscalip commented
@TODO make parallel_css weight heavier than css_emimage
http://drupalcode.org/project/parallel_css.git/commit/32c84d8 Done.
Comment #27
chriscalip commented
Refer #25 Oh joy! My company website is like that: http://www.straightnorth.com
We are pretty much using (img1.straightnorth.com, img2.straightnorth.com, img3.straightnorth.com, img4.straightnorth.com, css.straightnorth.com), all pointing to the same IP :(
Comment #28
Peter Bowey commented
Refer #27
@chriscalip
*smile* That is only meant to be a 'heads up' about some 'possible issues' + how some 'typically older consumer' grade ADSL routers offer 'crude' 'firewall' protection... eg: "SYN Flood to Host" :)
Personally, I use a dual-WAN Linksys RV082 ADSL2+ on two active ADSL2+ lines, with two static IPs, feeding a dedicated Linux server (3 x Ethernet ports / gateway). In this setup, I have 'disabled' the Linksys RV082 WAN firmware 'crud protection' and use the Linux 'packet stateful' firewall.
Comment #29
Peter Bowey commented
Refer #26
@chriscalip
Good work Chris! +1
Just one to go: :)
Of interest see the following Drupal CDN links:
http://drupal.org/node/962266
http://drupal.org/node/956164
Notes: Google is pushing a growing number of hits for your module:
+1:)
Comment #30
chriscalip commented
This is a bit tricky. I am troubled by CDN's approach, where only those who know PHP will be able to pull this off.
http://drupal.org/node/962266
We need a better approach here:
What do you think of this:
@ /admin/settings/advagg/parallel-css
[X] Use Available CDN Mapping and CDN pick-server
----------------------------------------------------------------
Be sure to read: http://drupal.org/node/962266
----------------------------------------------------------------
URL:
----------------------------------------------------------------
Enter the domain URLs you want included, one per line. Warning: don't include a '/' at the end of the domain URL.
* For example http://img1.drupal.org
* http://img2.drupal.org
* http://img3.drupal.org
* http://img4.drupal.org
* https://s1.amazonaws.com/drupal_cdn
In addition, for SEO purposes (to prevent duplicate content), please update the .htaccess file
In between these two lines:
# RewriteBase /
# Rewrite URLs of the form 'x' to the form 'index.php?q=x'.
* # Parallel CSS - Start
* RewriteCond %{HTTP_HOST} img1.drupal.org [NC]
* RewriteCond %{REQUEST_URI} !\.(png|gif|jpg|jpeg|ico)$ [NC]
* RewriteRule ^(.*)$ http://www.drupal.org/$1 [L,R=301]
*
* RewriteCond %{HTTP_HOST} img2.drupal.org [NC]
* RewriteCond %{REQUEST_URI} !\.(png|gif|jpg|jpeg|ico)$ [NC]
* RewriteRule ^(.*)$ http://www.drupal.org/$1 [L,R=301]
*
* RewriteCond %{HTTP_HOST} img3.drupal.org [NC]
* RewriteCond %{REQUEST_URI} !\.(png|gif|jpg|jpeg|ico)$ [NC]
* RewriteRule ^(.*)$ http://www.drupal.org/$1 [L,R=301]
*
* RewriteCond %{HTTP_HOST} img4.drupal.org [NC]
* RewriteCond %{REQUEST_URI} !\.(png|gif|jpg|jpeg|ico)$ [NC]
* RewriteRule ^(.*)$ http://www.drupal.org/$1 [L,R=301]
*
* RewriteCond %{HTTP_HOST} s1.amazonaws.com/drupal_cdn [NC]
* RewriteCond %{REQUEST_URI} !\.(png|gif|jpg|jpeg|ico)$ [NC]
* RewriteRule ^(.*)$ http://www.drupal.org/$1 [L,R=301]
# Parallel CSS - End
----------------------------------------------------------------
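The rewrite rules above boil down to one decision: a request that arrives on one of the image hostnames for anything other than an image is sent to the main site with a 301. A rough PHP sketch of that decision, for checking the logic only. The function name example_shard_redirect() and the example.com hostnames are assumptions, it is not part of parallel_css or CDN, and it compares hostnames exactly where RewriteCond does a pattern match.

```php
<?php
/**
 * Return the URL to 301-redirect to, or NULL if the request should be
 * served as-is. Hypothetical helper, for illustration only.
 */
function example_shard_redirect($host, $request_uri, array $shard_hosts, $canonical_base) {
  // Hostnames are case-insensitive, like the [NC] flag in the rules.
  $host = strtolower($host);
  if (!in_array($host, array_map('strtolower', $shard_hosts), TRUE)) {
    // Not one of the image hostnames: nothing to do.
    return NULL;
  }
  // Check the extension on the path only, ignoring any query string.
  $parts = explode('?', $request_uri, 2);
  if (preg_match('/\.(png|gif|jpg|jpeg|ico)$/i', $parts[0])) {
    // Images are allowed to be served from the image hostnames.
    return NULL;
  }
  // Everything else goes to the same path on the main site.
  return rtrim($canonical_base, '/') . $request_uri;
}

$shards = array('img1.example.com', 'img2.example.com');
$target = example_shard_redirect('img1.example.com', '/node/1', $shards, 'http://www.example.com');
```

Here $target is the main-site URL for /node/1, while an image request on the same hostname returns NULL and is served normally.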
Comment #31
mikeytown2 commented
Instead of htaccess rules, there is an issue for CDN in regards to SEO. It's fairly high on my priority list (#1060358: CDN and SEO), as in it might get done in 2 weeks.
Comment #32
Peter Bowey commented
Refer #30
"Oh No", not Apache .htaccess rules 'again'.... :(
"tongue-in-cheek"
I use exclusively Nginx, that poor Apache 2.x 'sod' died for me 2 years past (R.I.P.)
Research Reference: http://drupal.org/node/1060358#comment-4333802
Comment #33
chriscalip commented
#32
I mean... I am giving people the option to use the CDN mapping and cdn_pick_server instead of the parallel_css mapping and logic.
Pretty much a checkbox on the admin settings page of parallel_css:
[ YES OR NO ] [X] Use Available CDN Mapping and CDN pick-server
----------------------------------------------------------------
Be sure to read: http://drupal.org/node/962266
-----------------------------------------------------------------
Comment #34
Peter Bowey commented
Refer #33
@chriscalip
Sounds good to me Chris! +1
I got the 'shakes' when I saw that .htaccess 'thingy' :)
Comment #35
chriscalip commented
Okey-dokes. Going with that option.
Comment #36
Peter Bowey commented
Refer #30 + #31
For those setups affected by CDN 'duplicate' SEO, a partial solution exists here ->
http://drupal.org/project/files_proxy
Comment #37
mikeytown2 commented
@peter bowey
Not the right solution. We need to send out a 404 at a minimum or a 301 ideally if someone tries to access html content on your server through the CDN.
Comment #38
Peter Bowey commented
Refer #37
@mikeytown2
Thanks, I misunderstood the 'doc' reading @ http://drupal.org/project/files_proxy
In nginx.conf, I use something like this for the CDN private back-channel URI path (what the CDN pulls from):
Then something like this on the Drupal PHP side:
Comment #39
chriscalip commented
Hey, how expensive is it to get several IPs and host accounts and just have them rsync? I'm trying to solve my straightnorth.com problem of imgX all pointing to the same IP.
Comment #40
chriscalip commented
BTW, CDN makes use of hook_file_url_alter via cdn_file_url_alter. That function is a beast, with user access checks, CDN testing checks and cdn_devel_page_stats stuff. I am currently pretty much copying and pasting the important parts of cdn_file_url_alter, or I could go the route of calling cdn_file_url_alter directly. What do you guys think?
Comment #41
Peter Bowey commented
Refer #39
*Same IPs*
That should only be a 'problem' if the router starts blocking packets. See #28
Comment #42
chriscalip commented
#41
Is that the router of the users looking at the site, or the router of the site's hosting company?
Comment #43
Peter Bowey commented
Refer #42
A) = Host / Server router
Side-note: Obviously, you 'hire' hosting. I run my own dedicated server -'in-house' :)
All I pay for, is 2 x ADSL2+ 'public' lines / connections (100Gb x 2 - per-month use)...
In your case, I do not think that a professional 'host' company would have 'that issue' with their modern routers!
Comment #44
chriscalip commented
#43
Thank you.
Comment #45
chriscalip commented
First working prototype of CDN integration, very basic.
http://drupalcode.org/project/parallel_css.git/commit/6f62c02
Pretty much, we still have to go to
@ /admin/settings/advagg/parallel-css
and check the box [X] Use Available CDN Mapping and CDN pick-server.
This doesn't support the following CDN features:
a.) CDN supports HTTPS
b.) Drupal paths entered in this blacklist will not serve any files from the CDN. This blacklist is applied for all users.
c.) Drupal paths entered in this blacklist will not serve any files from the CDN. This blacklist is applied for authenticated users only.
Comment #46
Peter Bowey commented
Refer #45
http://drupalcode.org/project/parallel_css.git/commit/6f62c02
Looking good so far! +1
Comment #47
chriscalip commented
This doesn't support the following CDN features:
a.) CDN supports HTTPS
b.) Drupal paths entered in this blacklist will not serve any files from the CDN. This blacklist is applied for all users.
c.) Drupal paths entered in this blacklist will not serve any files from the CDN. This blacklist is applied for authenticated users only.
These will probably have to be separate issues. Right now, I don't know how to pull these off.
Comment #48
mikeytown2 commented
Why are you copying the cdn_file_url_alter function? Just require the CDN module and be done with it. Or am I missing something? Run the image references in the CSS through file_create_url, or, if they are running CDN on a non-patched Drupal, detect it by stealing the first part of cdn_init() (variable_get(CDN_THEME_LAYER_FALLBACK_VARIABLE, FALSE) == TRUE) and then call cdn_file_url_alter directly.
Have it look something like this
Comment #49
chriscalip commented
Doh! Or even better
Incidentally, this is the one prone to leaving the relative URLs as ../.., which causes a bug like http://drupal.org/node/1183062
I am hoping that advagg_build_css_bundle always gets run somewhere in the process.
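For anyone following #48 and #49, the general idea of passing every image reference in an aggregated CSS file through a URL-altering callback can be sketched in plain PHP. This is only an illustration: example_css_rewrite_urls() is a hypothetical name, the example.com hostnames are made up, and the callback stands in for whatever does the real work in Drupal (file_create_url or cdn_file_url_alter, per #48). It does not resolve relative ../ paths and it needs PHP 5.3+ for the closures.

```php
<?php
/**
 * Pass every url(...) reference in a CSS string through $alter.
 * Hypothetical helper, for illustration only.
 */
function example_css_rewrite_urls($css, $alter) {
  // Matches url(x), url('x') and url("x"); group 1 is the quote (if any),
  // group 2 is the reference itself.
  $pattern = '/url\(\s*([\'"]?)([^\'")]+)\1\s*\)/i';
  return preg_replace_callback($pattern, function ($m) use ($alter) {
    $url = trim($m[2]);
    // Leave data URIs and already-absolute URLs untouched.
    if (preg_match('#^(data:|https?:|//)#i', $url)) {
      return $m[0];
    }
    return 'url(' . $m[1] . call_user_func($alter, $url) . $m[1] . ')';
  }, $css);
}

// Stand-in for the real URL alter step: prefix a single made-up domain.
$alter = function ($path) {
  return 'http://img1.example.com/' . ltrim($path, '/');
};
$css = "a{background:url('sites/all/themes/do/1.png');} b{background:url(http://other.example.com/x.png);}";
$out = example_css_rewrite_urls($css, $alter);
```

The first reference in $css gets the domain prefixed, and the second, already absolute, is left as it was.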
Comment #50
mikeytown2 commented
Good idea!
I've added in the fallback logic on my end so advagg_build_uri() looks a lot like #48 (#1185786: allow for URLs to get CDN-ed even if cdn patch is not applied). As for #1183062: Support for URI (path) rather than Domain, how that works is configurable in the CDN module. The reason it wasn't working is that by default CDN disables itself on all paths that start with admin/*; I have a special case to handle those now.
Comment #51
chriscalip commented
http://drupalcode.org/project/parallel_css.git/commit/1dcee3d Done.
Nice one on the CDN fix!
Comment #52
Peter Bowey commented
@chriscalip
@mikeytown2
Nice teamwork Mike and Chris!
@chriscalip, your module may have possibly saved me from moving my D6 Core to D7 (a long story within...) .. :)
@mikeytown2, the updated advagg_build_uri() you applied has made the 'great Code Sun' shine here :)
see http://drupal.org/node/1185786. Great, see my comment above to Chris.
Many thanks for a useful module Chris, additionally - we have learned some 'cool stuff'.
Teamwork = Cool!
Comment #53
chriscalip commented
It was! Let's do it again sometime.
Comment #54
chriscalip commented