See #734080: By default, don't aggregate jquery.js - having this default to TRUE means that aggregation is broken by any contrib module that (a) adds JS/CSS conditionally (i.e. not on every page), and (b) doesn't explicitly set this parameter to FALSE. This is a very common mistake and leads to users downloading a new aggregate file on many pages, downloading duplicate code many times over, which slows page loads, breaks browser caching and wastes bandwidth.
Basically, all JS/CSS should either:
- be added unconditionally on every page, and preprocess set to TRUE (a suitable approach for frequently used JS/CSS, such as fieldset collapsing, autocomplete etc).
- be added only on pages where it is known to be needed, and preprocess set to FALSE (a suitable approach for less commonly called JS/CSS, such as tablesort, or the views admin interface).
Because of the care needed to ensure the file addition is unconditional when setting this option to TRUE, the sane default is FALSE.
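As a sketch of those two patterns against the Drupal 7 drupal_add_css() API (the module name "example" and the file names are hypothetical):

```php
<?php

/**
 * Implements hook_init().
 *
 * Pattern 1: CSS needed on (nearly) every page is added unconditionally
 * on every request and flagged for aggregation.
 */
function example_init() {
  drupal_add_css(drupal_get_path('module', 'example') . '/example.base.css',
    array('preprocess' => TRUE));
}

/**
 * Page callback for a rarely visited page.
 *
 * Pattern 2: CSS needed only here is added conditionally and kept out
 * of the aggregate, so it never forces a new aggregate to be built.
 */
function example_settings_page() {
  drupal_add_css(drupal_get_path('module', 'example') . '/example.settings.css',
    array('preprocess' => FALSE));
  // ... build and return the page content ...
}
```

Mixing the two (conditional addition with preprocess TRUE) is exactly the mistake described above: each page that happens to include the file produces a different aggregate.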
Comment | File | Size | Author |
---|---|---|---|
#230 | drupal.preprocess.230.patch | 38.04 KB | effulgentsia |
#226 | drupal.preprocess.226.patch | 38.07 KB | effulgentsia |
#220 | drupal.preprocess.220.patch | 35.81 KB | mfer |
#211 | drupal.preprocess.211.patch | 37.32 KB | effulgentsia |
#205 | drupal.preprocess.205.patch | 37.08 KB | effulgentsia |
Comments
Comment #2
sun: The fails seem to be real. Probably the tests presume that default value.
Comment #3
catch: Are there existing drupal_add_js() and drupal_add_css() calls which need to explicitly set preprocess if the default has changed? http://api.drupal.org/api/function/system_init/7 for example.
Comment #4
sun: Oh dear god. system.css is full of styles that belong in admin.css. system-behaviors.css should only be loaded if one of the behaviors is actually loaded. And so on.
From those, only system.css and admin.css should be aggregated.
jQuery + jQuery.once() are defined as libraries in http://api.drupal.org/api/function/system_library/7 - they should be the only two aggregated by default.
Comment #5
sun: Though the question remains... does the condition for admin.css also count as a condition? Or, slightly rephrased: when fixing our aggregates, is our goal ideally to have 2 aggregates (one for regular pages, one for /admin)?
Comment #6
catch: Not drupal.js then? What about node.css and user.css?
Comment #7
Owen Barton commented: This is not done; it just does step 1 - removing all the redundant "preprocess" => FALSE entries.
Comment #8
sun: Before going ahead with further patches, let's find a consensus first.
1) drupal.js definitely should be aggregated.
2) What about the high-level goal? Just treat any condition as a condition? Or should the ultimate goal be to have 2 aggregates - one for all regular pages, and another for all administrative pages?
3) Those system*.css stylesheets definitely need to be cleaned up in a separate issue to apply proper preprocess options for each one.
Comment #9
pwolanin commented: I tried it here also: http://drupal.org/node/734080#comment-2878458
I could not get it to work right. I think we just need to disaggregate jquery.js.
Comment #10
Owen Barton commented: I have been working on a patch to set preprocess back to TRUE where appropriate.
I think the basic direction is that css/js files that an "average" user would require on an "average" page load should be aggregated. This might include some basic content entry type stuff, especially if it is tiny (e.g. < 1k). Files primarily for admins, or stuff focused on a single occasionally visited page are not worth aggregating, at least for now.
I think a separate aggregate for /admin is less important - admins are normally on the site for multiple page loads, and now that we have the version querystring and mod_expires rule for non-aggregated files, they shouldn't even need to do a 304 check after the initial load. If we can come up with an elegant, reliable approach we could consider this, but I think that is probably something for Drupal 8 (or at least a different issue). One way to approach this would be to associate css/js files with permissions for the functionality they are associated with. We could then use those permissions to build a custom aggregate file per role. I think using an "if {}" based interface for splitting aggregates will just be too hard to get right.
Comment #11
sun: Bumping to critical.
Comment #12
sun: @Owen:
That only makes sense if the "basic content entry type stuff" is loaded unconditionally. Otherwise, clients will re-download an entirely new aggregate, most of whose files were already contained in the "unconditional global default" aggregate.
Comment #13
Owen Barton commented: Yes - that is exactly the point of this issue :)
...anything that is set to be aggregated would be included unconditionally, if it is not already.
I will upload an initial cut at a patch tonight.
Comment #14
sunCross-linking: #630100: Check non-preprocessed CSS/JS is loaded only when really needed
Comment #15
Owen Barton commented: Here is a work-in-progress patch - it is certain to need some tweaks here and there, and I haven't tested it at all yet. I have gone through all the module and include code however. I will go over it some more tomorrow - feedback welcome in the meantime.
For things that looked borderline I tended towards aggregating, on the basis that it is easy to remove the chunk if we decide it isn't worth it. It is easy to test things using Charles Proxy or similar (see #81835: Separate CSS files increase initial page load time by 40% to 70% for a method) and determine the effect this would have on the page time for an aggregate (a) without the chunk, (b) with the chunk and (c) without the chunk, but with an additional http request to download the js/css separately. If (a) ~= (b) and both << (c) then that suggests aggregation is optimal.
To test, we should actually start with the patch in comment 7, and ensure that we don't get aggregate files anywhere even when aggregation is enabled (I got one report that we do, suggesting we have a TRUE lurking). We should then test this patch and run through pretty much all the CSS and JS on the site, making sure nothing has broken - obviously this could take some time. We also need to check that we don't get any duplicate aggregate files, with the sole exceptions of one aggregate file per "type" (print css, footer js etc), per language and per theme. The aggregate will change when modules are enabled or disabled, which is also expected.
For anyone comparing notes with #630100, they are actually both correct - js/css that is aggregated should only ever be loaded in hook_init or equivalent (every page load), whereas everything else should be loaded as rarely as possible, only when needed (using #attached for rendered stuff).
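In D7 terms, "using #attached for rendered stuff" means hanging the asset off the render array, so it is only requested on pages that actually render that element (a hedged sketch; the module and file names are made up):

```php
<?php

/**
 * Hypothetical render array for a sortable table. The JS travels with
 * the element via #attached, so it is loaded only where the table
 * appears, and preprocess is FALSE so it stays out of the aggregate.
 */
$build['report_table'] = array(
  '#theme' => 'table',
  '#header' => array(t('Name'), t('Count')),
  '#rows' => $rows, // Assembled earlier by the page callback.
  '#attached' => array(
    'js' => array(
      drupal_get_path('module', 'example') . '/example.tablesort.js' => array(
        'preprocess' => FALSE,
      ),
    ),
  ),
);
```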
Comment #16
Owen Barton commented: Here is a variant of patch 1 that removes a TRUE that snuck in on drupal.js, and a spider confirms that we have caught all these and are starting with a clean slate (i.e. we don't have any accidental conditionally aggregated files left in).
Comment #17
Owen Barton commented: OK, fixed a couple of minor issues with the patch and did some testing. Here are the results which, as you can see, show the issue really dramatically - even in core (where one would think we know what we are doing!) we get it wrong often enough to create a significant number of duplicate aggregates. Imagine this multiplied by a few contrib modules that also get it wrong and the issue is even more obvious.
Before:
After:
As you can see here, the patch results in a 98% reduction in aggregate file download size, and avoids 33 unnecessary http requests for aggregate files (the latency of which adds perhaps 50-100ms each to the pages on which they occur). The other thing to note is that a simple check confirms that the vast majority of the lines in the before case are duplicated.
In the after case we have an interesting issue with CSS - there were actually only 2 unique files transmitted, but the aggregate version number was incremented, leading to one set of data (with different filenames) being downloaded 3 times, and another set downloaded twice. If we could resolve this, there would only be 14K of CSS transfer in total. There are actually 2 patches that take similar approaches to resolve this - #678292: Add a development mode to allow changes to be picked up even when CSS/JS aggregation is enabled (mtime based) and #721400: Order JS files according to weight, don't change filenames for aggregated JS/CSS (file contents based). I tested the second patch on top of my patch (they both applied cleanly, bonus!) and had the following results:
With both patches:
This pushes the bandwidth saving to over 99% compared with the current code, and cuts the number of http requests (for aggregate files) down to 4 total.
For anyone interested in reproducing these, here is my test routine (you will probably get slightly different numbers if you test, but overall the same pattern):
Next up is to make sure simpletests pass and test that JS and CSS across Drupal are still working as expected. We may want to tune the list of things included/excluded to add anything that appears frequently for anon/regular-auth users (with more typical permissions!) and remove anything that is more obscurely used than I thought (although given the above, it may be better to commit first and tune in a separate patch).
The other thing I forgot to mention is that in addition to getting "one aggregate file per "type" (print css, footer js etc), per language and per theme" you also get a separate aggregate file when it is necessary to place one or more non-aggregated files in between to maintain the correct code order. In my tests this did not create any duplication, although it could if conditionally included non-aggregated files pop up on one page where they are not on another. This should be fairly rare, and I can't think of a good way around this without restricting/changing the order people can put things in, but perhaps it is something to look at for Drupal 8.
Comment #19
Owen Barton commented: Don't expect this to pass all the tests yet, but worth a try.
Comment #21
sun: Wow, just wow. I have to say those stats look just amazing! More or less exactly what I imagined.
Comment #22
bleen commented: Cool beans!! Subscribe.
Comment #23
q0rban commented: Subscribe.
Comment #24
pwolanin commented: This is important, but not critical. We can exclude jquery.js from aggregation as a stop-gap.
Comment #25
effulgentsia commented: Subscribing.
Comment #26
andypost: Subscribe.
Comment #27
alanburke commented: #19: preprocess_false_4.patch queued for re-testing.
Comment #29
alanburke commented: An attempted re-roll.
Comment #31
alanburke commented: Another attempted re-roll.
Comment #32
alanburke commented: I'll get this right eventually... maybe.
Comment #33
alanburke commented:
Comment #35
Owen Barton commented: The previous patch included the patch in the diff (mmm, recursion!) - here is one without it. There are basically 2 tests failing - the CSS test should be pretty straightforward to fix, but the AJAX one is a bit confounding - it is hard to see how it is related to this at all.
Comment #36
Owen Barton commented: Through a tedious process of bisecting the patch chunks and testing, it is clear that the AJAX Framework failure is caused by the JS loading being moved from '#attached' to hook_init (see attached files). This is being done on the basis that the files are tiny and hence the bandwidth cost of adding them to the aggregate is negligible relative to the large latency cost of loading the files later on. Of the 4 files, the comment JS seems the most important one to aggregate, since that would be present on most site pages on many sites.
I am not yet clear what exactly the cause of the problem is - the AJAX test simply loads a JSON callback, decodes it and checks some parameters. I have looked at the output from the callback and it is identical both with and without the patch, and decodes to identical arrays. As far as I can tell from http://api.drupal.org/api/function/drupal_process_attached/7 all that is doing is calling drupal_add_* - but my guess is that there must be some dependency check, or some other smoke and mirrors hidden away somewhere.
@sun - do you have any ideas as to what is going on here?
Comment #37
sun: Let's get some testbot results on these. I've had issues with JS/CSS via #attached/hook_init() before, but need a test case name and error message to recall. All or at least most JS/CSS is supposed to be added via #attached.
However, I'm quite puzzled by what happened here -- starting from #17, we suddenly changed most (or all?) JS and CSS in core to be loaded unconditionally? Most of those files are actually needed only on a few certain pages. No, I really don't want to load book.js and book.css on each and every page of my site, just because I thought it would be a nice idea to provide a tiny little user handbook somewhere, deeply buried in the site. Let's rethink those latest changes, please. I'm also unable to find a reasoning for them in this issue.
Comment #38
alanburke commented: One of the problems with the current aggregation process is that the code must make assumptions about the nature of the final site when deciding which files to add to the aggregated file.
"How often will x.css be needed - should it be in the aggregate?" has to be decided for all files, but the answers can, and will, vary wildly from site to site, depending on the use case of each module.
I don't think core needs to include all of the functionality of a module like
http://drupal.org/project/sf_cache
but if anything could be done to make that an easier contrib module to write, then that would be useful.
I support the decisions made in this patch re which files to aggregate.
The performance numbers above show that this patch should improve front end performance for all sites.
To that end, I'm tagging this as major - It's so close to being critical for the performance improvement it brings.
Comment #39
alanburke commented:
Comment #41
sun: 1) You cannot load arbitrary JS and CSS on all pages. I mean, you can, but it sucks. It also reveals bugs. We don't want to reveal bugs at this point in time. :P
2) WTF? This change has an effect on inline CSS? (was somewhat stripped/minified previously, is now output raw/as-is)
No, really. If anything, we need to defer all of those CSS+JS preprocess changes to a separate patch/issue. It's much more important to fix the default for preprocessing.
That means making those tests pass - and making manual testing work again. I don't want to see that comment JS fix committed as part of this issue/patch. Let's scale this patch back down to its essentials, i.e., the attached patch along the lines of #16.
Comment #42
sun: Most of this permanently added JS/CSS is highly debatable anyway. The only one I'd remotely consider would be field.css, as you can probably assume that there is almost no page in D7 that does not contain any fields (in whatever kind of rendering).
But most of the other CSS, and more importantly JS, is only ever needed in conditional situations. It would be insane to load that for everyone, anytime, and anywhere. The JS/CSS is mostly bound to specific form elements, which may appear in... what? 1 of 1,000 pages for every 100,000th site visitor?
Consider a site that does not even allow visitors/users to post comments or whatnot. Gazillions of innocent web users will needlessly load a bloat of JS/CSS for some form elements, which they will never never ever never see.
To do what you had in mind here would require different sets/groups of aggregates, per user role(s). If that were the case, then unconditionally loading all of this CSS/JS for registered users would make sense, I agree. Until that is possible, sf_cache will be the friend of all who need to tweak this further.
Comment #43
Owen Barton commented: @sun - I actually agree with the majority of these changes (the selection of what was in/out was really just a first pass), and I agree that we should fundamentally be targeting what anonymous users would need on a typical page on a typical site. Of course some sites would ideally have more stuff aggregated (most users are authenticated, for example a private forum) and some less (minimal brochure type sites), so we clearly can't get it "perfect" for everyone (D8 watch out!).
Some specific things I think need a bit more consideration:
Let's consider the effects on the initial hit for an anonymous user under different site configurations:
Given that (a) a significant number of sites with comment module enabled will have comments enabled for anonymous users to view, or even add (hopefully with Mollom or CAPTCHA!), and (b) a significant number of sites without commenting will simply disable the module, I think it makes a lot of sense to aggregate comment.css. If this file is not aggregated you add an additional http request to any comment-enabled page (which is most pages on many sites), which adds a serious amount of roundtrip latency (potential browser pipelining aside). On the other hand, the cost to sites that have commenting enabled for a few authenticated users is negligible - less than 1ms (comment.css is only 246 bytes - hardly "bloat").
I think this argument applies for the following files:
comment.css
field.css
openid.css
openid.js
As you described, field.css will be needed on pretty much any site page that has fields, which should be pretty much every page on most sites. If OpenID is enabled, then there is a good chance you need these files on every page (assuming the user login block is enabled), and if it is disabled then the files are not aggregated anyway. The attached patch makes these changes - feel free to ping me on IRC if you want to thrash this out :)
Comment #44
sun: Again:
openid, we need to defer to a separate issue. Neither the JS nor the CSS is prepared for being loaded on any page.
To get comment-node-form.js back into the game, you need to merge in the fix from my patch in #41. However, that's rather a stop-gap fix, too. I'd prefer to defer that as well - the bottom line is: comment-node-form.js requires form.js, but form.js is rightfully only loaded on pages containing forms. See #749748: Contact, register and comment forms do not prefill with user info from browser
Sorry if I didn't make myself clear before.
Comment #45
sun: Another bottom line: a regular Drupal page, without any optional modules enabled, did, at least until just recently, not load any JavaScript at all. OK, I'm actually not sure whether that is still the case, but seeing that it was the case just recently (i.e. only some weeks ago) was quite impressive. :)
Comment #46
Owen Barton commented: @sun - my patch only included comment.css and field.css, not comment-node-form.js (please see #43 for the explanation). I am not aware of any issues that these 2 css files would have on pages without comments/fields.
I can confirm that openid.js causes errors when added unconditionally - let's just exclude openid files from aggregation for now.
Comment #47
sun: Aight! Works, thanks!
Also, bumping back to critical for two reasons: 1) There's an API change here. 2) We absolutely need to fix the default preprocess behavior for #attached, which was introduced in D7.
Comment #48
aspilicious commented: If I'm correct, Bartik needs some adjustments too?
Comment #49
sun: In addition to #48, I reconsidered: we also want to revise and clarify the existing PHPDoc description of the "preprocess" option for drupal_add_js/css(). Absolutely required to make D7 developers understand its proper usage.
Comment #50
marcingy commented: Changing to major as per tag.
Comment #51
MustangGB commented: Tag update.
Comment #52
Owen Barton commented: I took a pass at the doxygen, made it quite a bit more detailed and specific, and also added some for drupal_add_js(). The docs could use some review as I got to the point where I couldn't tell if my tweaks were an improvement or not. I also removed the redundant option from Bartik as suggested by @aspilicious.
The other thing I noticed and fixed is that the regular css/js added by themes (in the theme.info files) was not set to preprocess, but should be (as these files are always added on every page). As mentioned in #17, this does create duplicate caches per-theme, but this is not an issue as changing themes is rare.
Comment #53
pwolanin commented: This looks basically good to me.
How big is the search module CSS? That's the only case I question moving into hook_init.
Comment #54
sun: "by a typical user on the majority of site page views" sounds too limiting to me. The next paragraph is much more precise and correct: "unconditionally" is the magic word.
(same applies elsewhere)
s/drupal_add_css/drupal_add_js/
Ideally, we'd split this into search.css and search-results.css.
Comment #55
Owen Barton commented: I tried to relax the wording around when preprocess should be used - alternative proposals welcome.
I fixed the add_js issue.
I don't see a good reason for splitting search.css - search is one of the most commonly used functions on many sites (the only function for some users!), and the cost of bundling the css with the initial view is a negligible (IMO) 2ms or so, relative to the additional http request on search which would be more like 40-100ms. Also, the advanced search form is very often on the SERP, so splitting it up would have no advantage in these cases. Either way, I think it should be discussed in a separate issue.
Comment #56
pwolanin commented: Not disagreeing about including the search CSS; it was just the one case where it wasn't 100% obvious to me that it should always be loaded.
Given that search.css is only 624 bytes, splitting it up or not including it is, indeed, likely to cause more pain than gain.
Comment #57
sun: No, this still doesn't make sense to me. The 1st para effectively mentions exceptions, the 2nd mentions *unconditionally*. Only the 2nd is right, and is actually what this patch changes throughout Drupal core.
Comment #58
Owen Barton commented: The intent of the first para is only to give some guidance as to when preprocess should be used in the first place (i.e. for CSS/JS relating to pages a regular visitor would likely see, but not admin pages). The second para should detail exactly how it should be used (i.e. added unconditionally) once it has been decided that this is the correct approach for a particular file. Could you take a pass at rewording this in a way that captures this for you, at least the first para?
Comment #59
sun: Much shorter + to the point.
Comment #60
sun: Additionally, I wondered whether all of those details about preprocess shouldn't be moved into @param $options - preprocess:
But let's defer that to a separate, minor issue. This patch is RTBC unless you disagree; if you don't, please mark it so.
Comment #61
Owen Barton commented: That looks great, I think this is ready to go!
Comment #62
Dries commented: Sad but necessary. Fortunately, we have good documentation on this, so hopefully module developers will know to do the right thing. Committed to CVS HEAD.
Comment #63
rfay: IMO this doesn't need an API change announcement - I don't think it breaks backward compatibility in any way. If you think otherwise, let me know.
Comment #64
sun: It needs to be documented in the module upgrade guide.
@rfay: Announcing this change would be a good idea, I think. We need to make sure that each and everyone understands how preprocessing is supposed to work. JS/CSS that is loaded in already ported modules should be verified.
@Dries: Actually, I'd consider the performance gained by this bugfix anything but sad.
Comment #65
moshe weitzman commented: Wow, just learned about this issue. Sorry for my tardiness.
$preprocess defaulting to FALSE may not be so wise. I really think devs will miss this and our typical number of http requests will increase. Our own docs say "Load fewer external objects. Due to request overhead, one bigger file just loads faster than two smaller ones half its size." I get that we were re-aggregating a lot. But that's still one http request versus several. I didn't see much discussion here about this tradeoff.
It's tough to balance all the concerns here.
Comment #66
mfer commented: The discussion around this patch was missing one major element: the context in which pages are served. It is true that downloading css and js more than once is not good for people navigating a site. But we have to assume that people navigate more than one or two pages on a site for the performance benefit to hold.
If someone visits a site, looks at one or two nodes of the same type, and then leaves, the benefit of breaking the css and js into multiple files on that page is lost. Those users saw slower performance because of the network latency of downloading more files.
This patch made an assumption, which I did not see discussed, that we are targeting users who visit many different types of pages on sites with many different configurations.
For many users visiting many sites this patch will increase the amount of time to render a page and the bandwidth. This is a problem that should have been fixed in Drupal 8 with something that targets multiple types of sites and users. This patch didn't produce a net benefit. It switched the benefit from sites where users visit one or two pages to sites that have users visiting lots of pages with lots of differences.
Comment #67
DamienMcKenna: Could an option be added to the default installation profile to let users select which they would prefer, with it defaulting to what works best for small sites (i.e. for sites where people don't know what they're doing)?
Comment #68
Crell commented: A per-site default would clash with whatever the developer coded in the first place.
There's a much better solution: Just always load your CSS and JS and make sure the JS self-terminates fast if it doesn't apply to that page. The whole thing downloads once and then you're done. If you're calling drupal_add_css() outside of hook_init(), ask yourself what good reason you're doing it for. Odds are you'd be better off just adding it everywhere to be done with it. Aka, it's a bug.
Comment #69
moshe weitzman commented: Right. Conditional adding of css/js should be rare. There is very little benefit given our aggregation.
This patch should be rolled back IMO.
Comment #70
mfer commented: 95% of Drupal sites run on shared hosts. What type of default aggregation best suits those sites? Before we can say, we need data on how visitors use those sites.
For the smaller sites I've run (more than a few), defaulting to preprocessing files is better. I looked into it earlier this year.
Switching the default to something that helps larger and more complex sites is a change that will hurt the smaller sites I worked on. For those larger sites to optimize their js/css, they should use hook_js_alter and hook_css_alter to make the changes for themselves. The default settings should benefit the 95%.
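That per-site tuning could look like the following sketch using the real D7 alter hooks (the module name "mysite" and the blanket decision to aggregate everything are illustrative only):

```php
<?php

/**
 * Implements hook_css_alter().
 *
 * Illustrative: a large site that prefers one big aggregate forces
 * every file-based stylesheet into it, trading a bigger initial
 * download for fewer HTTP requests over a multi-page visit.
 */
function mysite_css_alter(&$css) {
  foreach ($css as $file => $options) {
    if ($options['type'] == 'file') {
      $css[$file]['preprocess'] = TRUE;
    }
  }
}
```

The same pattern applies to hook_js_alter() for scripts.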
This change should be rolled back.
Comment #71
sun:
Comment #72
mfer commented: @sun - I would love to have the best of both worlds. I had planned on attacking that when D8 opens up, because I cannot come up with a simple solution that does so. It would be a big change too late in the game. This is about optimization, and that is just not all that simple to do. You need use cases to optimize for and the ability to implement optimization based on some customizable logic.
I did some basic testing of my blog's use case. I found a 23% increase in the time it takes to render the page from before the patch to after. This was done on my local system, so network latency does not come into play either. This was due to the use of the comment form on the full node view. When that happens, the following non-preprocessed JS files are added to the page:
The increased time to render the page was due to having to go and get each of these files. I did the testing in FF 3.6, which supports async downloading of files. Many browser versions in popular use do not support async downloading of scripts. Looking at the July 2010 stats and the major browser versions that do not support async downloads (meaning they have script blocking), I see the number is over 31% of usage. This is significant. For these users the time it takes to render the page will be even greater than what I experienced.
I'm all for solving this problem. But, making this broad change is going to make the default performance for rendering pages worse for many common use cases.
I would solve this with #352951: Make JS & CSS Preprocessing Pluggable. I was waiting for handlers to make it in, which I should not have done. This will have to wait for D8 and is the better solution.
Comment #73
sun: At least jquery.js and jquery.once.js should be preprocessed.
form/textarea/filter basically fit into the comment module discussion above. However, whenever we set files to preprocess TRUE, it's required to also move them from #attached into new hook_init() implementations.
Comment #74
mfer commented: @sun - Your patch is a no-brainer. If it passes the test bot (and it will), it should be good to go.
But this doesn't address the larger question in this issue. The change made here optimized Drupal's default for the minority use case rather than the majority.
Comment #75
mertskeli commented: "every site visitor would have to download >= 500 KB"
The question is why it is >= 500 KB?
Actually, it is already >= 500 KB for JS ONLY + another >= 300 KB for CSS.
And it is already >= 100 JS files and >= 100 CSS files...
I don't know what the average 95% of users do, but I have to alter 95% of core css and js files to return to normal html behavior with 20-50 KB of custom CSS (just 2 actually - system.css and admin.css).
Comment #76
mfer commented: @mertskeli - overriding css is a different issue from this one. If you want to replace core css with your own altogether, there is already a method for that which does not involve hacking core.
Also, the size of the download is only part of the picture when it comes to front end performance. One larger file loads faster in a browser than two smaller files.
Comment #77
mertskeli commented: @mfer, my English must be very bad. I haven't said a single word about hacking.
"One larger file loads faster in a browser than two smaller files."
I do understand what server requests are and it has been mentioned in #65.
Two smaller? How about current 100? Is there any page which adds less than 10?
Some css are just few lines. There are so many classes for every element, there's such a mess in styling, that Seven theme has to use reset.css (which still has duplicates) to start styling all over again. Corolla was trying to do it, but was abandoned. So I have to use css_alter function to clean it up, otherwise it becomes uncontrollable.
And the size of the download has always been the most important part of web performance.
Comment #78
Crell CreditAttribution: Crell commented@mertskeli: Actually no, that's not true. The size of the download is a major factor in web performance, but not "the most important". For Drupal 5, when the CSS compressor was first added, benchmarks showed very, very clearly that it takes a browser longer to download four 25k files than to download one 100k file. The extra round trips to the server, and the HTTP overhead of doing so, can be quite slow. Additionally, even if the files are already cached by the browser, on subsequent requests the browser still has to send an "is it updated?" request to the server for each file; even though it downloads nothing, that's still 4 HTTP round trips instead of 1.
The question is where the trade-off point is between one big file, not all of which you need on a given page, or lots of small targeted files, with all the added network traffic that generates. It has nothing to do with how complex the CSS is or if you need to override it; that's totally irrelevant here.
What this patch has done is shift from a "we'd rather waste bytes in one big download" approach to a "we'd rather waste network traffic on lots of requests" approach. For the common case, I do not believe that to be a good trade-off. For the uncommon case, there are tools available to switch approaches if needed. But that burden should be on the uncommon case and the "huge sites with a professional staff who knows how to do it", not on the casual site built by a button-pusher.
I am also in favor of a rollback here.
Comment #79
mertskeli CreditAttribution: mertskeli commentedThe question is where the trade-off point is between one big file, not all of which you need on a given page, or lots of small targeted files, with all the added network traffic that generates.
Evidently, small targeted files are a better option. But not when there are 100 of them.
Comment #80
sunIn the end, the question is how this benchmark effectively scales. It might work when re-downloading ~100k on every page, of which ~50% is duplicate. But in D7, we have lots more JS+CSS, especially in the form of generic/re-usable libraries. It's clear that this math has a limit, and it heavily depends on one's connection speed and latency.
Also, I haven't heard any concrete problems with the proposed solution. As far as I can see, it would be a smart compromise between one-time/single-page visitors and authenticated users who normally visit more pages - thus nicely resolving the 80/20 default use case.
Comment #82
Crell CreditAttribution: Crell commentedNo, that's the point. Prior benchmarks said exactly the opposite.
Comment #83
mertskeli CreditAttribution: mertskeli commented@Crell, I mean 100 separate css files, like we have now in D7.
Comment #84
mfer CreditAttribution: mfer commented@mertskeli - Prior to this patch the default was to preprocess all CSS files, turning many files into one per media type. The patch that landed here changed that and defaulted to NOT preprocessing them.
If we were looking for a real performance gain we should use something like http://drupal.org/project/sf_cache and bundle preprocessed files. At the time all these discussions came up it was too late to make such a feature change.
Comment #85
Owen Barton CreditAttribution: Owen Barton commented@moshe_weitzman, @Crell - I think you have really misunderstood the point and value of this patch.
Drupal's aggregation system is completely, utterly broken without this patch. This is coming from the person who proposed and argued for aggregation in the first place and also did much of the benchmarking that Crell referenced earlier.
I absolutely agree that "one large download is better than many smaller downloads", but we are in danger of oversimplifying. If users are downloading a duplicate aggregate file on many page loads (as they are on Drupal 6 sites), browser caching is broken and the huge cost of a "first site visit" is essentially replicated on every single subsequent page view - a 50-200Kb duplicate aggregate can add unnecessary seconds to subsequent page loads. This is certainly slower than with this patch, where you get an aggregate file on the first page view, but on subsequent page views you have either no request at all, possibly a request or 2 for small additional files on more obscure pages, or some small 304 checks (if mod_expires is disabled) on already downloaded files.
If you agree that sending duplicate aggregate files is broken, then this patch is a no brainer. Aggregation should be something that is selected for a particular file, knowing that it will be needed by "most visitors on most page loads". Files that are needed on 1 obscure admin page (which actually make up a huge proportion of our CSS/JS) should not automatically be aggregated and served to all users, just to make that visit by one admin user once a month 100ms faster.
This is also not a safe default at all - adding files conditionally, yet leaving aggregate set to TRUE (since it is the default), happens in pretty much every contrib module I have looked at. Even in core, where you would hope we know better, we are doing this all over the place. An analogy would be if cache_get() had a parameter that defaults to TRUE and clears the cache unless you remember to set it to FALSE - this would break Drupal's caching in the same way we break browser caching.
This patch is all about optimizing for the common case - regular site visitors. The current situation is optimized for no-one - you have to choose between sending your users lots of tiny files (aggregation off) OR sending them huge files again and again (aggregation on) which is also slow.
The intent with this patch is to optimize for regular visitors on typical Drupal sites - in other words, the idea is that the vast majority of visitors should need only the aggregate file - no additional requests - over the course of their visit. If we need to tweak what is included/excluded in the aggregate to better achieve this goal then I am totally happy with that. As you can see from sun's and my discussion, I was originally in favor of aggregating a somewhat larger number of files and he was in favor of a smaller one - we settled somewhere in the middle.
This patch does not get us exactly where we need to be, but it is a huge improvement. It is not currently possible to optimize for every possible case - but let's compare 2 possibilities (these are invented, but I think typical numbers):
A) We optimize for regular site visitors on typical sites. Regular visitors would have to download perhaps 30 files with aggregation disabled. With it enabled they have to download 2 files, each perhaps 50Kb. Editors/admins on this site have to download these aggregates, plus perhaps 50 additional files over the course of a few page loads.
B) We optimize for site editors/admins (which seems to be the proposal). Regular visitors have to download 2 files, but now they are 300 or 400Kb each. Admins get the same.
In the second scenario the initial page load for regular visitors would clearly be several seconds slower than in the first, due to the huge file size. Depending on the connection latency and speed, this could even be slower for regular visitors than if aggregation were disabled in the first place. For admins/editors the first scenario is slower, but the files trickle in slowly (not the huge 30-file pileup you get with aggregation disabled), and of course if mod_expires is enabled then they have no additional requests at all once they have visited a few pages (and of course this is expected behavior for admins/editors).
In Drupal 8, I think we can do better than this - the 2 main changes I would like to see:
1) Add a declarative approach for adding CSS/JS that should be preprocessed - i.e. a new or existing hook that returns an array (similar to, or reusing, our library system). This can safely default preprocess to TRUE and would be much harder to break by accidentally adding files conditionally. We could still keep the regular functions for adding files on really obscure pages if we want, but I think we could probably move the bulk of the code to a declarative model.
2) Through the returned array, files can then be associated with permissions that indicate the pages/sections that need that CSS/JS. We can then create aggregate files per-role that contain exactly what users of that role (or role combo) need.
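A rough sketch of what the two points above could look like together - the hook name and keys here are invented purely for illustration; nothing like this exists yet:

```php
/**
 * Hypothetical declarative hook listing files to preprocess.
 *
 * Because the files are declared rather than added at runtime, they are
 * by definition unconditional, so preprocess can safely default to TRUE.
 */
function example_preprocessed_files() {
  $path = drupal_get_path('module', 'example');
  return array(
    'css' => array(
      // Needed by all visitors: goes into the common aggregate.
      $path . '/example.css' => array(),
      // Only needed by users with this permission: could be placed in a
      // per-role aggregate, per point 2 above.
      $path . '/example.admin.css' => array('access' => 'administer example'),
    ),
  );
}
```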
Comment #86
mfer CreditAttribution: mfer commented@Owen Barton I understand your good intent in making this change and I am all for optimization. But what is a typical use case for a user on a Drupal site? I think this is what is under more debate than anything. If you are talking about a Drupal blog, the typical use case is someone visiting a full node view. Maybe they visit a few posts. Each of these is a blog node type with the same CSS/JS package. Having them all compressed into as few files as possible is best for this use case.
On the flip side you have a site like drupal.org, where users visit multiple pages with different combinations of CSS/JS. The D6 behavior forces lots of duplicate downloading over the course of browsing the site.
Different use cases have different optimal solutions. In D6 the solution was beneficial to the former case described here. The change in this issue is more beneficial to the latter. Which is the default case we target out of the box in Drupal?
In D7, we can change the preprocessing values using hook_js_alter(). The defaults should be for the typical use case, while this hook is used for less typical installations.
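For example, a site-specific tweak through the alter hook might look like this (a sketch; the module names and file path are placeholders, not real projects):

```php
/**
 * Implements hook_js_alter().
 */
function mysite_js_alter(&$javascript) {
  // On this site, foo.js is effectively loaded on every page, so pull it
  // into the aggregate even though the module left preprocess at FALSE.
  $file = drupal_get_path('module', 'foo') . '/foo.js';
  if (isset($javascript[$file])) {
    $javascript[$file]['preprocess'] = TRUE;
  }
}
```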
While I believe the default should be to enable preprocessing, Owen has a point in not preprocessing a lot of files. I would keep the default on while going through Drupal and optimizing what is preprocessed and what is not. Even if preprocessing stays off by default, someone needs to review all the CSS and JS files in Drupal and clean up the non-optimized mess we have. The fact that jquery.js and jquery.once.js are not preprocessed tells me this patch went in without cleaning up the core files to be more optimized.
As for D8, I think we need to change approaches. But I'll leave that for D8 and its issues :)
Comment #87
sunI think that Crell/mfer only suggest reverting this patch, not producing very large aggregate files of all possible files - although that would be the only logically consistent intermediate solution when following this direction. Otherwise, everyone - regardless of which pages and how many pages they visit, and regardless of user permissions - has to (re-)download quite a lot of duplicate code.
Trying to summarize:
- Without this patch, Drupal generates JS aggregate files ranging between 80 KB and 180 KB and CSS aggregate files ranging between 30 KB and 60 KB on a typical Drupal site (D6, that is). Thus, visitors are potentially re-downloading between 120 KB and 240 KB per page. Aforementioned values are based on D6 - D7 has much larger JS/CSS files. These defaults entirely break native browser caching.
- With this patch, we restore compatibility with browser caching as well as relatively common web server extensions like Apache's mod_expires. Only unconditionally added files are aggregated, so site visitors do not re-download any duplicate code. This also means that there will likely be just one reliable aggregate file (overall). Of course, all of this presumes that site visitors are actually visiting more than just a single page.
- To precisely answer what exact numbers are invalidating the math of "one aggregate fits all", we'd need benchmarks.
- Patch in #80 restores one-aggregate-file for anonymous users (only). But, if anonymous users happen to visit more than one page, the situation gets as evil as in D6; they likely have to download a new, large aggregate file, potentially on every single page they are visiting. Overall, however, that would lead to best of both worlds: large aggregates for anonymous users (presuming they'll just visit one page) and a single aggregate optimized for multiple page views for authenticated users.
- Patch in #80 also fixes jquery.js and jquery.once.js not being aggregated due to a wrong array key location.
- If we decide to revert, then we should manually revert the defaults only, as I think that all other changes are good + wanted, regardless of the default.
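To illustrate the array-key mistake mentioned in the summary above (a sketch of the hook_library() data structure, not the literal system_library() code):

```php
// Broken: the options end up as part of the file data, so
// 'preprocess' => TRUE never reaches jquery.js.
$libraries['jquery']['js'][] = array('misc/jquery.js', array('preprocess' => TRUE));

// Correct: the options array is keyed by the file path.
$libraries['jquery']['js']['misc/jquery.js'] = array('preprocess' => TRUE);
```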
Comment #88
mfer CreditAttribution: mfer commented@sun - After chatting with Owen in IRC I'll stop calling for the revert of this patch if we do a thorough review of the css/js in core and make sure the preprocessing settings are appropriate for them. I imagine there are more cases like jquery.js and jquery.once.js that slipped through. The 23% increase in time to render the page in my tests is unacceptable, but some tweaks to the current situation could help minimize that.
I don't like the idea of anonymous users being handled differently than authenticated users. That leaves more to debug and more painful debugging :(
Comment #89
Crell CreditAttribution: Crell commentedI have argued in the past that conditional CSS is a module bug more often than not for exactly this reason, so auditing our use of it (and reducing it) is a good thing that I fully support. I'm also completely behind moving to a more declarative model in Drupal 8 as Owen suggests. (Module info files need stylesheet and script arrays!) That's for later, though.
Comment #90
sun@Crell:
If it was a bug, then you would be entirely invalidating the idea and conceptual design behind #attached. We're using conditional CSS and JS files throughout all of core. I'd consider any extreme solution attempt, such as declaring all conditionally loaded files as bogus, as dangerous, negatively affecting Drupal's modularity.
Comment #91
Owen Barton CreditAttribution: Owen Barton commentedOK - so I have done some benchmarks, which I think back up most of the assertions I made in my previous post.
These were all done on localhost to avoid the internet, which is far too variable to be useful for testing, using Charles Proxy on its default settings to introduce artificial throttling and latency (256kbps, 40ms). Tests were on a fresh D7 install, with dummy content generated and all modules and blocks enabled (to simulate a production site as best we can). All tests are averaged over 5 page loads, measured with Firebug in Firefox.
Using these we can evaluate different scenarios - let's say we have an anonymous user that visits the site for the first time, then clicks on 2 pages that wouldn't (in Drupal 6) generate duplicate aggregates, and then 2 pages that would (e.g. going to a listing page and a user profile page). Comparing A and B, and also B and C, we can approximate 0.1 seconds (100ms) for every 4 additional HTTP requests, which is pretty typical.
Patch rollback (same as Drupal 6, from an algorithmic point of view): B+D+D+E+E = 9.7 seconds
Drupal 7, as is, assuming no extra http requests: B+D+D+D+D = 8.3 seconds
Drupal 7, as is, assuming <4 extra http requests: B+D+D+D+D + 0.1 = 8.4 seconds
Drupal 7, as is, assuming <8 extra http requests: B+D+D+D+D + 0.2 = 8.5 seconds
Drupal 7, as is, assuming <16 extra http requests: B+D+D+D+D + 0.4 = 8.7 seconds
Drupal 7, tweaked inclusion, assuming no extra http requests: C+D+D+D+D = 8.2 seconds
Drupal 7, aggregating all files (as Moshe suggested): F+D+D+D+D = 28.2 seconds
Let's make it easier on the status quo and say that this time the visitor visits 4 pages that wouldn't cause a new aggregate, and only 1 that does:
Patch rollback (same as Drupal 6, from an algorithmic point of view): B+D+D+D+E = 9 seconds
There are several takeaways here - the most obvious is that while latency is critical, reducing download size is also critical. Hitting a duplicate aggregate even once adds enough wait time that it is slower than a properly non-duplicating aggregate, even with a fairly high number of additional requests for non-aggregated files. Note that the "duplicate" aggregate file in this case was a measly 22KB - on production sites it will likely be an order of magnitude bigger, amplifying the effect. Equally, adding lots of rarely used files to the aggregate is also a very poor strategy.
I have also included a run of "tweaked inclusion", which assumes that we will be able to refine what is in/out of the aggregate a little further, to the extent that my sample site could use only the aggregate. Including jQuery, for example, is a good idea - openid is another contender (especially since it was coming up in the middle of the CSS aggregate, splitting it in 2). Note that the improvement from this was pretty minor though - 0.1 seconds - much less than the slowdown from a duplicate aggregate file.
In short, duplicate aggregate files should be avoided at all costs - even if that means adding a small number of additional http requests to some page loads. We should not be looking at aggregation as a "cure-all" for all our front-end performance problems, but rather as a tool to eliminate the bulk of the latency for the most common situations.
Comment #92
Owen Barton CreditAttribution: Owen Barton commentedIn terms of #attached - I think we essentially need to push this in 2 directions. We want to load all the common stuff into an aggregate (ideally an aggregate per role combo), but load the less frequently used stuff (which includes most forms) as conditionally as possible, ideally via a lazy-loader.
Comment #93
moshe weitzman CreditAttribution: moshe weitzman commented@sun - that's a bit overstated. #attached was invented for those infrequent cases when conditional css/js is desirable. just because we have #attached does not mean that we need to overuse it.
thanks for the benchmarks, owen.
this brings us to one of the underlying issues here, which is that no one trusts contrib developers to set the right value for $preprocess. one option is to make drupal_add_css() and drupal_add_js() private functions by prepending an underscore. we then introduce drupal_add_css_always() and drupal_add_css_infrequent(). those two new functions would call into _drupal_add_css/js and set the right value for $preprocess. this forces devs to make a decision. further, devs should be revisiting all drupal_add_css|js calls anyway, since we now have #attached for those pesky conditional cases.
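A sketch of how those wrappers might look (function names as proposed above; the underscore-prefixed function is the hypothetical privatized drupal_add_css(), which does not exist in core):

```php
/**
 * Adds CSS that is needed on (nearly) every page; it gets aggregated.
 */
function drupal_add_css_always($data = NULL, $options = array()) {
  $options['preprocess'] = TRUE;
  return _drupal_add_css($data, $options);
}

/**
 * Adds CSS that is needed only on a few pages; kept out of the aggregate.
 */
function drupal_add_css_infrequent($data = NULL, $options = array()) {
  $options['preprocess'] = FALSE;
  return _drupal_add_css($data, $options);
}
```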
Comment #94
Owen Barton CreditAttribution: Owen Barton commented@moshe - I like the idea of semantically named functions - not quite as elegant as a declarative model, but a big improvement either way. It seems like it could be tough to get into Drupal 7, but it could be worth a try on the basis of it leading to better performance from contrib. I think it should be a separate issue though, and we can use this one to continue to thrash out what should be included/excluded in the aggregate in core itself.
In general, it's not so much a matter of not trusting developers, but more about being nice to them (DX) - having safe defaults that allow people to learn Drupal incrementally. Without this patch, if you are learning and screw up, you create duplicate downloads on at least one, potentially several, pages/sections, each time adding about 0.7 seconds (possibly several times that) to those page loads. With the patch, you only add about ~0.1 seconds to the first page load. Making the add unconditional or disabling preprocess for it are the only 2 reasonable technical approaches. We can't yet rely on preprocessed files being added unconditionally (other than shouting via the documentation and hoping), so our only recourse for a sane API is to default to FALSE.
Comment #95
Crell CreditAttribution: Crell commentedIf we have #attached for conditionals, why not make those default to non-aggregated, but have files added normally with drupal_add_css() aggregate?
Comment #96
sunAt least one definitive conclusion can be drawn: At least the preprocess default for #attached should be FALSE.
Speaking of which, #attached and drupal_add_js/css() most probably are that "infrequent"/"always" separation already. Though to stay on topic for this issue: if you set preprocess = TRUE, then you must not use #attached; move your drupal_add_js/css() into hook_init().
Comment #97
cosmicdreams CreditAttribution: cosmicdreams commentedIncredible discussion here guys. One quick question: if this patch doesn't make it into Drupal 7's HEAD, can we wrap it up in a module? I'd like to see modules for different aggregation schemes.
Comment #98
Crell CreditAttribution: Crell commentedOwen: Something else occurred to me last night. Your benchmarks (which are extensive, thank you) were all run on localhost, to avoid network variance. However, that cuts out a large chunk of the round-trip time for each extra HTTP request. If you were running against a remote server, the CPU overhead on each end of the extra HTTP traffic wouldn't change but the latency for them would change considerably. So I am not sure if that's a valid test case without including a "real network".
Comment #99
sunCurrent conclusions.
Comment #101
Owen Barton CreditAttribution: Owen Barton commented@Crell
Here are a few reasons:
1) No one has shown any benefit to drupal_add_* preprocess defaulting to TRUE. I think my benchmarks showed pretty conclusively that with our current method it is simply not the case that "as much as possible" should be aggregated, either in the form of "one huge aggregate" or "duplicate aggregates". We should aim to aggregate only those files that regular visitors would expect to need on a "typical" site visit, and they should receive the aggregate only once.
2) There are a large number of files that don't make sense to aggregate, and not all of these can or should use #attached, so a TRUE default also doesn't make sense as the "most common" option. If you do "egrep 'drupal_add_[cj]ss?\([^\)]' * -R | grep -v preprocess | grep -v simpletest | grep -v '.api.' | wc -l" on your modules directory you will see that core has around 30 files that are not aggregated, plus around 13 aggregated ones (not including theme files, which are always aggregated). Of course this number may change a little as we tweak what is included, but seems unlikely to radically shift. Contrib modules will likely be in the same situation - most will have 0-2 files that should be aggregated and 0-2 that shouldn't.
3) From a DX point of view, you agree that adding aggregated files intuitively fits a declarative model (i.e. returning an array from some hook). Conditionally adding non-aggregated files more intuitively fits an event-driven model. Since all we have right now is an event-driven model it is much less confusing to developers (and breakages are much less costly - see #94) for the default to use the event-driven paradigm.
4) Experienced developers (and people who carefully read the docs) should be able to choose the correct value of preprocess irrespective of what the default is - so the value shouldn't matter from that point of view. New developers are much more likely to go with the default value, hence given the above we should go with a safer and more intuitive default.
Comment #102
Owen Barton CreditAttribution: Owen Barton commentedIf you re-read my post you will see that I was using Charles Proxy to introduce typical latency artificially (and also to throttle the bandwidth to a more typical level). This is pretty much the only way to test this kind of patch - over a real network the latency is often too variable to get reproducible results.
Comment #103
mfer CreditAttribution: mfer commented@Owen - Sadly, I think it is a mistake to assume that people will read the docs. In my experience only a select few read the docs in any detail or understand how this stuff works. This means that having a good default is really necessary.
A good default really depends on the use case of your site. If people are only going to visit one or two pages that share the same CSS and JS aggregation, then aggregating everything will be the best course of action for them. If someone is going to visit many pages on your site, and those pages have different combinations of CSS/JS, only aggregating the common ones is better. The first page load will be a little slower, but later ones will be faster, for a net gain.
To come up with a good default you have to assume a particular case, which is hard given the wide variety of ways Drupal is used. Really we need to default to one case and then provide good documentation/posts on how to optimize for other cases, which is more easily possible with hook_js_alter() and hook_css_alter().
The question is, what should we assume is the best default case assuming a lot of developers won't think about optimization or read the docs?
If this patch were to be reverted, all the JS and CSS in core (since there is so much more now) would need to be optimized. With this patch in, we still need to optimize the aggregation of files. For example, the work sun started in #73 would need to be finished.
Comment #104
Owen Barton CreditAttribution: Owen Barton commented@mfer - if you think many people won't read the docs, then this is a strong argument for the default to be FALSE.
In terms of optimizing for different sites, I agree with your general argument that different sites have different usage patterns etc, and that using the alter hooks (hopefully via some clever contrib) will be more optimal for a specific site than whatever we can come up with here. There are certainly reasonable arguments that we should include for somewhat less or somewhat more files than the current patch to be better for a "typical" Drupal site. However, this argument only affects our strategy for what files we include/exclude from the aggregate individually. This argument does not affect what the default should be at all, because we can set the preprocess value correctly per file whatever the default is, and we understand enough to do it correctly.
Hence, the overriding factor in what the default should be is the effect on contrib developers who haven't properly read the documentation. My benchmarks show pretty clearly that the effect of them accidentally creating duplicate aggregates is much worse than the effect of an additional HTTP request (or even several) for that contrib module's code. Of course we are all welcome to submit patches to improve contrib and have it aggregate (added unconditionally, of course) where appropriate, which will be a nice incremental improvement - but right now the TRUE default causes duplicate-aggregate behavior so prevalent that you would have to patch pretty much every single contrib module (and probably core) before you could see an improvement.
Also, you seem to be under the impression that this patch just changed the default. That is not the case - in fact we fixed many, many cases of duplicate aggregate generation in core itself, because evidently even we can't follow our own (previously poor, admittedly) documentation. From here it is quite simple to make specific files conditional or preprocessed.
Comment #105
DamienMcKennaIf a concern that "contrib has to be fixed" leads D7 to needlessly ship a less-than-optimal default, then how about:
With the amount of interest in migrating to D7 and the amount of energy that so many people have put into it, holding back a better out-of-the-box experience for the majority of users because of a perceived lack of interest is not the best action to take. Draw us a map and we'll get everyone on the right road.
Comment #106
sunFriends, I have the feeling we're talking past each other. #99 already implements those sane defaults, totally safe for contrib, unless contrib explicitly overrides the defaults ("ahem..." ;). Properly ported modules should be using #attached already. All others are still using drupal_add_*(), so #99 restores 100% backwards compatibility for those modules (if there is such a thing as BC in this case).
I really need to ask: What else is left?
Let's stop overcomplicating this issue. We approximately know what a proper solution for D8 would have to look like. We can't change more for D7. The already committed changes fixed the entirely, utterly broken aggregation of D6. This rather easy follow-up restores full BC.
So, if you agree, let's get this done:
1) Fix those tests. (needs to revert some more .test file lines related to inline CSS processing of the patch that went in)
2) Properly fix usage of #attached and drupal_add_css() throughout core. (Lovely!) For example, both http://api.drupal.org/api/function/block_admin_demo/7 and http://api.drupal.org/api/function/block_admin_display_form/7 should use #attached, not drupal_add_css(). Both totally easy to fix. (Ideally, we'd also rename block.css to block.admin.css, but that's perfectionism already.) The full list of invocations exists already, auto-generated, for http://api.drupal.org/api/function/drupal_add_css/7 and http://api.drupal.org/api/function/drupal_add_js/7 -- all hook_init() implementations can be ignored; leaving us with only a few remaining invocations that most likely need tweaking.
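The block.module fix suggested here would look roughly like this (a sketch of the #attached pattern on a form builder, not the actual committed code):

```php
function block_admin_display_form($form, &$form_state, $blocks, $theme) {
  // Attach the stylesheet to the form instead of calling drupal_add_css()
  // directly; it then only loads when this form is actually rendered.
  $form['#attached']['css'][] = drupal_get_path('module', 'block') . '/block.css';
  // ... rest of the form builder ...
  return $form;
}
```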
Comment #107
Owen Barton CreditAttribution: Owen Barton commentedThis is my point - defaulting to FALSE is just as optimal from a performance point of view. Files that should be aggregated will be moved to hook_init() with preprocess set to TRUE; all other files are just added as they were before - the performance profile is identical, but the DX is better and the risk of accidental breakage is much, much less. If anyone has any arguments or benchmarks otherwise, please share them.
Comment #108
mfer CreditAttribution: mfer commented@Owen - Thoughts on #99?
Comment #109
Owen Barton CreditAttribution: Owen Barton commentedOn #99, the jquery/jquery.once change makes sense. The revert of the default for drupal_add_* does not make any sense to me - not only for the DX and contrib performance/risk reasons I have described in detail above, but also because we would be back to having lots of duplicate aggregates in core (since all the conditionally added files would become aggregated again) - which is itself a huge step backwards performance wise, and defeats the purpose of this issue.
Comment #110
pwolanin CreditAttribution: pwolanin commentedSeems like we need a version of the patch that simply makes the jquery/jquery.once change?
Comment #111
Owen Barton CreditAttribution: Owen Barton commentedHere is a patch that makes just this fix.
Also, I want to reiterate that the selections sun and I came to are not necessarily perfect, and so proposals/patches that refine what files that should or should not be included in the aggregate would be really great. Typically we want to aggregate CSS/JS that "a typical visitor would expect to need on a typical visit to a typical site", as obviously hard as that is to define.
To this end I am including 2 text files that indicate (roughly) what is being aggregated and not aggregated right now. Note that this is not exhaustive - it doesn't include #attached files (i.e. drupal_add_* functions only), since these should clearly almost never be aggregated, nor does it include theme files, since these nearly always should be aggregated - it also doesn't include a few special cases, such as where the file name is a variable. If you want to play with this at home I was doing "egrep 'drupal_add_[cj]ss?\([^\)]' * -R | grep preprocess | grep -v simpletest | grep -v '.api.' | grep -o '[a-z.]*\.[jc]s*' | sort | uniq". I normally find that simply reading the css/js file indicates pretty quickly if it is destined for an admin/editor type page, or for something more general (occasionally both) - in a few cases you may need to dig into the module to figure this out though.
Comment #112
sunMost of the non-aggregated are admin-only files. From those,
form.js (+ jquery.form.js)
help.css
locale.css
openid.css (not prepared for global aggregation)
timezone.js
user.js
Ideally, we'd be renaming all other files from $name.css/.js to $name.admin.css/.js to clarify the situation and prevent common frontend code being put into those conditionally loaded admin-only files. As always, system.module's JS is a total mess.
Comment #113
mertskeli CreditAttribution: mertskeli commented@#112
All of them are just 1 KB. They can be joined into the existing admin.css. Benefits: 1. Fewer server requests, 2. Fewer functions to override, 3. Easier to edit.

Comment #114
sun@mertskeli: err. Seems like no one told you that Drupal is a modular system? :) Help, Locale, OpenID are modules that can be disabled.
Comment #115
pwolanin CreditAttribution: pwolanin commentedCNR?
Comment #116
mertskeli CreditAttribution: mertskeli commented@sun: I've heard about that.
Comment #117
David_Rothstein CreditAttribution: David_Rothstein commentedRegarding the latest patch in #111, are we sure we actually want to aggregate jquery.js?
This is a ~70KB file, which I think means that the latency due to the extra HTTP request is small (compared to the amount of time it takes to download the file itself). So we don't gain much by aggregating it, but do take on a lot of extra risk; it balloons the size of the overall aggregated JS file, and people will need to redownload the entire jQuery library whenever any of the cached JS changes on the site. Whereas if we leave it out of aggregation, it's added to the page like this:
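The markup from the original comment is not preserved; roughly, a non-aggregated jquery.js reference looks like this (the query string is invented for illustration):

```html
<!-- Illustrative reconstruction: jquery.js keeps a stable URL. -->
<script type="text/javascript" src="/misc/jquery.js?l2qgs1"></script>
```

Because the URL stays the same from page to page, the browser can keep serving it from cache.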
and therefore essentially "never" needs to be redownloaded by any user.
In general, it really seems like file size is an important consideration here; I'm not sure where the appropriate cutoff is, but e.g. the first graph at http://www.die.net/musings/page_load_time strongly suggests that by the time you hit a ~70KB file size, you are already doing well, and aggregating that file isn't going to help you that much.
***
In terms of the overall issue and earlier committed patch, I'm still trying to get my head around this, but it seems to me like with the way CSS grouping works in Drupal 7 (http://api.drupal.org/api/function/drupal_group_css/7), things will not go well when this approach is combined with contrib modules; even with 'preprocess' set to FALSE, they still might be able to screw the aggregation up pretty badly. Need to think about that a little more.
Comment #118
David_Rothstein CreditAttribution: David_Rothstein commentedSo here are some things to consider (about the overall 'preprocess' => FALSE change introduced in this issue):
Here is a practical example. If I do a standard D7 install and enable aggregation, then visit /admin (the dashboard), the CSS on my page looks like this:
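The original listing is not preserved here, but the shape of the problem can be illustrated with invented file names and hashes: two media="all" aggregates separated by non-aggregated files in between:

```html
<!-- Illustrative reconstruction; file names and hashes are invented. -->
<link type="text/css" rel="stylesheet" media="all" href="/sites/default/files/css/css_aaaa1111.css" />
<link type="text/css" rel="stylesheet" media="all" href="/misc/ui/jquery.ui.core.css?l2qgs1" />
<link type="text/css" rel="stylesheet" media="all" href="/modules/contextual/contextual.css?l2qgs1" />
<link type="text/css" rel="stylesheet" media="all" href="/sites/default/files/css/css_bbbb2222.css" />
```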
Notice how the media="all" aggregated CSS is in two files rather than one, because it's broken up by non-aggregated files added in between. If I then visit admin/config (which does not have jQuery UI or contextual links), those intervening CSS files are not added to the page, leaving this:

So a new media="all" aggregated CSS file is generated (which contains the duplicated, combined contents of the previous two), and browsers need to redownload it.

The above example may be unusual in core, but when combined with contrib module habits it seems like it could erase many of the presumed gains in this issue. All it takes is some contrib modules which have code like this in hook_init() - which I'm pretty sure I've seen many of them do:
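The original snippet is not preserved here; the pattern being described is conditional asset addition in hook_init(). A minimal sketch, with invented module and file names:

```php
<?php
/**
 * Implements hook_init().
 *
 * Hypothetical contrib module: the file is added only on some pages, so
 * it lands between aggregated groups and splits the aggregate wherever
 * it appears (and forces a new combined aggregate wherever it doesn't).
 */
function mymodule_init() {
  if (arg(0) == 'mymodule') {
    // 'preprocess' defaults to FALSE, but the conditional addition still
    // changes how the surrounding aggregated groups are assembled.
    drupal_add_css(drupal_get_path('module', 'mymodule') . '/mymodule.css');
  }
}
```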
And then even though the default value of 'preprocess' is FALSE, it will still force a new, duplicated set of aggregated CSS files to be generated for those pages, by the same process as above (with modules whose hook_init ran before this module in one aggregations group, and modules whose hook_init ran after it in another).
Now imagine a site with a large number of poorly-coded custom or contrib modules, each doing some version of the above...
The 31 stylesheet limit was fixed in D7 in #228818: IE: Stylesheets ignored after 31 link/style tags but that was based on the idea that not too many CSS files would be added with 'preprocess' set to FALSE. That assumption was broken by the patch committed here.
The above issues (in particular the first two) seem to me like they could be serious.
Leaving JS aside for the moment (it's a bit odd that it currently uses a totally different grouping and aggregation method than CSS), the fact that we already have CSS grouping in core suggests one way out of the issue, which I will throw out there as a trial balloon: we could go back to having 'preprocess' default to TRUE, but then include the weight of the CSS file as a criterion when putting things in aggregation groups (and add a couple of new default weights for that purpose):
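The concrete weights from the original comment are not preserved; as a rough sketch of the idea (the new constant name is invented, the other three exist in Drupal 7):

```php
<?php
// Existing Drupal 7 weight constants: CSS_SYSTEM (-100), CSS_DEFAULT (0),
// CSS_THEME (100). The proposal would add one or two more, for example:
define('CSS_CONDITIONAL', 50); // Invented name: conditionally added files.

// Grouping would then consider weight as well as media and preprocess,
// so conditional files could no longer split the main aggregates:
$group_key = $item['media'] . ':' . (int) $item['preprocess'] . ':' . $item['weight'];
```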
Comment #119
Owen Barton CreditAttribution: Owen Barton commented@David_Rothstein: thanks for the great input. Overall I think we are building a good picture of the various weaknesses of our current "event driven" aggregation system - even if we can't fix all the issues here, hopefully we can use these as some design criteria for a fresh approach in Drupal 8.
Regarding jquery.js, that is a fair point - having it aggregated will indeed force a redownload of jQuery when JS code changes are pushed, but that shouldn't be frequent on most sites. As long as we are not producing duplicate aggregates, users would at least only download it once per session with this patch (unlike with Drupal 6, where users can end up redownloading it many, many times even in a single session). To a certain extent it can also be faster to have a limited number (e.g. fewer than six) of additional HTTP requests of roughly similar size, to better utilise the browser's simultaneous downloads - something that might make sense here. I think it could be hard to determine which is better, though, without browser-collected timing data from a good sample of real sites.
Regarding "1: The effect of CSS grouping" - I could be wrong here, but I am pretty sure that Drupal 6 does exactly the same thing in this case (it was split out into a separate function in Drupal 7 because of the "31 limit" re-factoring). We don't really have any choice but to add CSS and JS in the correct order, because we need to maintain correct CSS inheritance and respect JS dependencies. With our current system I think this implies that some splitting (and hence duplication) of the aggregate is unfortunately inevitable. We can try and change the weights we use (where possible) to avoid splitting the aggregate, but without rewriting the whole thing there may be some limits to what we can do here in Drupal 7. I think it would be best to approach this in a separate issue, since I don't think this really affects the default - aggregating either nothing or absolutely everything is very problematic from both a functionality and performance point of view, and the ideal fix of moving to a declarative system (so we can aggregate by role combination) would be a pretty major API change at this stage of the game.
Regarding "2: The IE 31 stylesheet limit" - my understanding of that issue is that the benefit of keeping the number of files added with "preprocess" set to FALSE low is that, if the total number of stylesheets stays below 31, we can use link tags, which are preferable for a variety of reasons. It seems very unlikely that the number of non-aggregated stylesheets on a single page would approach anything like that number (certainly we are nowhere close with core), and it doesn't appear that it would break - it would just use the less-optimal style tags instead. With contrib this can be resolved simply by setting appropriate files to aggregate (and ensuring they are added unconditionally). This doesn't seem like a breakage to me, relative to the performance penalty of duplicate aggregates incurred by having preprocess default to TRUE - especially considering that the non-aggregated files are likely to be added mostly on complex form pages (which are mostly admin/editor pages), whereas the duplicate aggregate files would hit regular visitors.
Regarding "3: CSS compression" - I think this is beside the point. If you look at my benchmarks up in #91 it is pretty clear that even though the aggregate is compressed, on less typical site pages (that need files not included in the aggregate) it is still much slower to redownload a large duplicate aggregate than it is to download a number of small additional http requests.
Regarding your fix for changing weights to avoid splitting aggregates - I think this is a useful approach to investigate, although I think there may still be cases where we cannot avoid splitting the aggregate to maintain correct CSS inheritance and JS dependencies. That said, I don't see how it requires setting the default back to TRUE at all - reworking the weights would not avoid any of the issues discussed previously with duplicate aggregates, and would in fact introduce these issues again. Changing the default is not necessary for the weights to be changed and aggregate splits avoided.
Comment #120
Owen Barton CreditAttribution: Owen Barton commentedThinking about the weight thing some more - one thing we could do that would help would be to do something like this: when we process the files, automatically subtract 0.1 (for example) from the weight for just the preprocessed files whenever one of the "standard constants" is used for the weight. This should be safe to do, because assuming the constant is used it should not be dependent on other files with the same weight, as a group. It would then float the preprocessed files up to the top of that group, meaning that we would (in general) create at most one aggregate per-group, which would be much better than the worst case when aggregates could be split within the same weight for no reason (i.e. not because of the CSS cascade or dependencies).
Comment #121
sunDavid threw up some really good points, indeed awesome.
Yes, something along the lines of #120, i.e., only allowing the default _LIBRARY/_SYSTEM, _DEFAULT, _THEME weights to be aggregated, and only if 'preprocess' is TRUE, sounds like a good idea. However, jquery.js itself has a weight of JS_LIBRARY - 20, jquery.once.js uses JS_LIBRARY - 19.
Comment #122
David_Rothstein CreditAttribution: David_Rothstein commentedThanks! Responding to some of the points above:
My understanding from http://api.drupal.org/api/function/drupal_get_css/6 is that Drupal 6 only groups by media type, and also pulls out all non-aggregated CSS and sticks it at the beginning of the list, so the same situation does not occur - i.e., there is no way for non-aggregated CSS to "interrupt" the aggregation groups.
Either way, I guess it doesn't matter much now :)
We don't have too many real world examples yet, but I do have one: Drupal Gardens. Running the latest D7 code in my local dev environment, I get 16 CSS files on the front page of a Drupal Gardens site logged in as an admin user, with CSS aggregation turned on. When I open the themebuilder tool on that front page, the number jumps to 28 (which is uncomfortably close to 31). Granted, this is a bit of a special case since it's a complex tool - but not that special: the main reason for the increase is that the themebuilder uses a lot of jQuery UI libraries, and from http://api.drupal.org/api/function/system_library/7 you can see that just about every jQuery UI library comes with its own CSS file. Those can add up fast, and will likely do so on any page that makes heavy use of jQuery UI.
Also, I do believe it will break in IE - the way I understand/remember the code in http://api.drupal.org/api/function/drupal_pre_render_styles/7 is that it only groups things in <style> tags when they are part of a group that would have been aggregated together, but CSS aggregation is turned off on the site. So it allows you to have more than 31 stylesheets without breaking IE, but more than 31 stylesheet "groups" still will break it. Each file with preprocess set to FALSE counts as its own group, and gets its own link tag, so there's nothing currently there to stop us from going over the 31 sheet limit in this case. (I double checked this by forcing 'preprocess' to FALSE on a couple files that on my Drupal Gardens setup we normally have set to 'preprocess' to TRUE, and did wind up with more than 31 link tags on the page.)
The reason I wanted to suggest setting the default back to TRUE there was (a) the 31 stylesheet issue, but also (b) the case of lots of small files which would benefit from aggregation even if they introduced some duplication. For example, imagine a hypothetical site with an average of 10 #attached CSS files per page; since attached CSS is usually pretty short and focused, assume (again completely hypothetically) 20 lines of CSS per file. I think we want to aggregate those by default, provided we do it in a way that does not interfere with other, larger aggregated files on the same page. At least, based on the tradeoffs people are discussing here, I think we would want to. Those 10 files would be expensive in terms of latency for a first-time visitor to the site, but combined are only 200 lines of code, so even if we have to redownload ~200 lines of CSS on each new page with some duplication, that is a tradeoff most people would want to make - whereas they might not make it if the aggregated CSS file were much larger (as per your benchmarks above).
I think we are constrained by not wanting to change anything too drastically in Drupal 7, as well as not wanting to make the defaults too complex, but in theory, I suspect the "rule of thumb" defaults we would want to strive for to optimize the overall experience are something like this:
It seems to me that would be a good compromise between keeping the number of files low for a first-time visitor, but still having most of those files (especially the largest ones) not need to be redownloaded as any individual user navigates the site.
The question is what we can still do in Drupal 7 to get closer to that. #120 sounds like it's definitely a start, but wonder if we need to go a little further.
Comment #123
JacineSad, sad pandas.
I came looking to post a bug report after looking at the output in <head>, in sheer disbelief that the mess in front of me was actually intended.

Aggregation off:

Aggregation on:
As webchick would say, "Dear sweet merciful crap."
I am not getting what I expect. It looks completely broken, like there is a pile of crap in <head>, and I don't like this at all.

I'm not trying to discount the efforts being made here. I read this entire issue, and while the debate is interesting and both sides make good points, I don't know that the gain (for some situations) is worth the WTF trade-off. We fought really hard for the changes in #228818: IE: Stylesheets ignored after 31 link/style tags. I agree with @David that this will break the work that was done there. That in itself is so sad. It is an issue that many of us, myself included, deal with on a daily basis, and it was a really exciting victory for Drupal 7 themes. To see that all reversed out of nowhere, and replaced with something that is near impossible to explain to newbies, at this stage in the game is disappointing to say the least.
I hope that I'm missing something. If not, I really hope that a better solution can be reached or this can be reverted for now, and properly addressed in D8.
Comment #124
Damien Tournoud CreditAttribution: Damien Tournoud commentedIt seems that this issue only shows how WTF is the IE-fix introduced by #228818: IE: Stylesheets ignored after 31 link/style tags. @import rules *or* link href, pick one. Using both for different types of CSS just feels insane.
Comment #125
JacineI don't understand how the IE patch could be blamed for the current mess. When aggregation was off @import was used, with one style tag per media type. With aggregation on, you would get one link tag per media type. For that same page, in DRUPAL-7-0-ALPHA6, you would see this:
Aggregation off:
Aggregation on:
That makes sense. The current situation is what feels insane to me.
Comment #126
mfer CreditAttribution: mfer commented@Jacine - We should not so concerned with how this looks visually in the html but more with how it performs for end users of a site. If the preprocess defaults to FALSE we need to educate developers and themers how to add css files so they are aggregated and done well. So, we do not run into that 31 stylesheet limit. In any case the assumptions made in the IE issue have changed so we have have to revisit how that is implemented.
Comment #127
Owen Barton CreditAttribution: Owen Barton commented@Jacine - I agree with mfer, the point of this issue is that it fixes a huge front-end performance problem, not the visual look of the tags.
That said, I have done some quick tests and can confirm that adding files with preprocess = FALSE does create link tags and could potentially go above 31 tags in some cases, due to assumptions made in the "31 limit" patch. We need to fix both these issues - there is no point in breaking one thing while fixing another. Attached is a rough work-in-progress patch that tries to resolve this: essentially, we allow non-preprocess files to be grouped, but change the logic when printing the tags so that any groups of non-aggregated files (ones where preprocess is FALSE, or where aggregation is disabled) get the import (with 31 limit) treatment. This patch needs a bit more work to confirm that the logic works as expected, and a fair few of the code comments still need tweaking.
In terms of the "splitting aggregates" issue we described previously, I think the first step is to see if tweaking the weights could limit the number of splits (in the most common situations, anyway). If so, I think we should try to get that in as a contingency fix if nothing else. Once we have that in, I think we should work in a new issue on getting the aggregated files into a small number of fixed locations along the lines described in #122. I think it would be good to move to a dedicated API for adding aggregated files (either a wrapper function as Moshe suggested or a very simplified declarative hook) - I don't think using the weight for this is ideal, as the API gets really hard to explain and unintuitive ("please only use weights in the range 10-20, 30-40 and 40-50 for non-aggregated files, and weights in the range 0-10, 20-30, and 50-60 for aggregated files").
The patch also includes the jQuery and jquery.once fixes. I think we still need to do some benchmarking around the time impact of having jQuery as a separate file on the "initial visit" time and when the aggregate changes (e.g. with a module upgrade, forcing a redownload). If the impact on the "initial visit" time is zero or very minimal then it probably makes sense to split it out.
If anyone wants to work on any of the above three fixes, please go ahead - I won't have much more time to work on this until my vacation (next week).
Comment #128
JacineFine. I'll just wait to see docs on this.
FTR, I am not ONLY complaining because it looks like a mess. I'm complaining because it is very hard to understand what is going on, and because working with CSS files in the D7 theme layer is a nightmare, and this is the icing on the cake.
Comment #129
moshe weitzman CreditAttribution: moshe weitzman commentedCan anyone review or build upon Owen's last patch? We discussed this at the sprint but noone here had a clear idea of how to proceed.
Comment #130
Dries CreditAttribution: Dries commentedI read up on the issue again and I'm thorn. Not easy.
- I don't feel we have enough data to make a good decision. More testing is in order. Given more data, this might actually be an easy discussion/decision.
- One issue seems to be that we might have been abusing #attached for "conditional CSS/JS", whereas the intended usage is more subtle. We should avoid conditional CSS/JS unless we really have to. We need sun's patch in #90/#91 so that there is a penalty for using #attached when you don't have to. Maybe we need to revisit the documentation too; I certainly think we should document it. Maybe someone can revive sun's patch?
- I also think we need Owen's patch in #127 but he is still working on it so let's see what he comes back with.
- I checked the Mollom module and while we don't use #attached, we only load mollom.js and mollom.css on pages that have forms with Mollom protected CAPTCHAs. That is, the Mollom module actually causes the one big aggregate CSS file to be invalidated on pages with forms. This module is maintained by sun, Dave Reid and myself. It is probably better to always load mollom.js/mollom.css, including on the main page where there are no forms, so we don't break aggregation. I have no actual performance data to back that up, but I like to believe that the new default behavior in core (i.e. default to FALSE) could be a good thing given that sun, Dave and myself sort of missed this in the Mollom module. It suggests that we should default to what we believe would be the most optimized or fastest behavior.
Comment #131
effulgentsia CreditAttribution: effulgentsia commentedFirst off, I just want to say that I fully support what has already been committed on this issue. Yeah, some follow-up is needed, but the old situation of new aggregate files being made on different pages of the site was really, really inefficient, so I think we've made good progress, even if there's still room for improvement.
With regards to the methodology of aggregating CSS files, I disagree with #127. Here were the initial reasons for using @import for aggregatable-but-not-aggregated files and LINK tags for everything else, and I still think they're relevant:
The problem really boils down to us no longer feeling comfortable about that 1st assumption. Let's fix that. I think we have 4 categories of files to think about:
I also like the idea already floated in earlier comments to implement aggregate groups. If not in core, it will surely be taken up in contrib, perhaps by http://drupal.org/project/sf_cache. But if we can get something into core, great.
Comment #132
Dries CreditAttribution: Dries commentedGreat comments, effulgentsia -- I certainly agree.
Come to think of it; whenever drupal_add_js()/drupal_add_css() is called, it is probably called from the wrong place (i.e. not called from hook_init). Moving most of those to .info would probably be a good best practice. If not, we should update their phpDoc to recommend using them in hook_init().
I think it comes down to:
Comment #133
moshe weitzman CreditAttribution: moshe weitzman commentedAdding stylesheets and scripts to module .info files got a big +1 from the sprinters here in Boston. Even though we are in Alpha, we think benefits outweigh risks.
As for identifying admin pages, we now have http://drupalcontrib.org/api/function/path_is_admin/7. So I guess admin stylesheets would get added with
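The snippet this comment trails off into is not preserved; presumably something along these lines (module and file names invented):

```php
<?php
/**
 * Implements hook_init().
 *
 * Hypothetical: load admin-only CSS on admin paths, and aggregate it,
 * since every admin page adds it unconditionally.
 */
function mymodule_init() {
  if (path_is_admin(current_path())) {
    drupal_add_css(drupal_get_path('module', 'mymodule') . '/mymodule.admin.css',
      array('preprocess' => TRUE));
  }
}
```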
Comment #134
rfayGetting CSS files into .info files seems like a far more likely route to success than "improving the documentation" or "encouraging developers", both of which sound like the road to hell.
+1
Even though it's way too late.
Comment #135
DamienMcKenna@moshe: looks great, particularly that path_is_admin() runs off some hooks (admin_paths() and admin_paths_alter()) which would give sites lots of flexibility to control this as needed, e.g. for special use cases. +1
Comment #136
Abeaudrian CreditAttribution: Abeaudrian commentedHi Owen
How exactly do I apply this patch? Where does it need to go and do I have to replace an existing file or add the contents of the patch to a file?
Please advise.
Regards
Adrian
Comment #137
effulgentsia CreditAttribution: effulgentsia commented@Abeaudrian: Hopefully, you find this helpful: http://drupal.org/patch/apply. If not, please comment on that page with suggestions for improvement. Thanks!
Comment #138
bleen CreditAttribution: bleen commentedThis patch allows modules to include .css files (and/or .js files) directly in the .info file just like you can with themes.
IF (in big giant capital letters) we decide that this idea is fit for D7 consumption, I'm sure this patch will need some more love as I wasn't 100% sure where the logic for actually including the stylesheets should go (and some tests) but it's definitely a good start.
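For reference, the proposed module .info syntax presumably mirrors what themes already support; this example is a guess at the patch's syntax, with invented names:

```ini
; mymodule.info - hypothetical example
name = My module
core = 7.x
stylesheets[all][] = mymodule.css
scripts[] = mymodule.js
```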
Comment #139
David_Rothstein CreditAttribution: David_Rothstein commentedLooks like a great start - on a quick review, one thing I noticed is that for performance reasons, we definitely don't want to call system_rebuild_module_data() so often.
A quick improvement would be to use http://api.drupal.org/api/function/system_get_info/7 instead, but maybe even better is to have this data stored in http://api.drupal.org/api/function/system_list/7; notice how that function already stores the full record for themes, so adding an additional element there that just stored the module stylesheets would probably work fine.
And that function is already heavily cached, so calling system_list('module_stylesheets') or what not all the time would not be a big deal.
Comment #141
sunReading through last follow-ups:
One thing that may need further discussion is whether libraries can be loaded that way, too. The concern is that we would likely get a .info file property clash with "libraries", because I think that property is already used by contrib (not Libraries API, though; rather Drush Make/Features/etc.).
I also tinkered with whether some very simple "grouping" like the following would be possible:
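The example from the original comment is not preserved; a sketch of what such grouping might look like (the string group values are the proposal, not an existing API):

```php
<?php
// 'preprocess' => TRUE: the static default aggregate for all pages.
drupal_add_css($path . '/mymodule.css', array('preprocess' => TRUE));
// 'preprocess' => 'admin': a separate aggregate for admin pages.
drupal_add_css($path . '/mymodule.admin.css', array('preprocess' => 'admin'));
// 'preprocess' => 'user': a separate aggregate for logged-in users.
drupal_add_css($path . '/mymodule.user.css', array('preprocess' => 'user'));
```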
...whereupon we would build separate aggregates on the same page for each specified group: one for TRUE (the static default group for all pages), one for 'admin' (if there are any files to be preprocessed), and another for 'user' (if any). Effectively this prevents re-downloading the TRUE aggregate when switching from a frontend to a backend page (where the 'admin' aggregate would otherwise also have to contain the files of the TRUE aggregate).
This could even be continued to additionally aggregate all files in the FALSE group on every page. Also, we could put a restriction on the file 'weight' option for those groupings and apply the weight only within the group (so as to make individual file weights not clash with other aggregates).
Comment #142
bleen CreditAttribution: bleen commentedThis patch is a followup to #138...
a) I took David_Rothstein's suggestion (#139) and used system_get_info() for performance's sake. (Using system_list() is certainly feasible, but that would be a fairly far-reaching change, as it would likely change the structure of the array that module_list() returns.)
b) Fixes the errors found by our trusty friend testbot
Comment #143
bleen CreditAttribution: bleen commentedSame patch as #142 without whitespace issue...
Comment #144
David_Rothstein CreditAttribution: David_Rothstein commentedNo reason for it to affect module_list() - this could be stored in a different array key within system_list() than the one module_list() uses.
Comment #145
bleen CreditAttribution: bleen commentedI guess I'm not following ...
This is the code from system_list():
Are you suggesting creating a new value for $type? If not, then this code would need to be changed to this:
Happy to talk about this on IRC ... I'd like to understand what you're suggesting.
Comment #146
David_Rothstein CreditAttribution: David_Rothstein commentedI was thinking something like:
Then calling system_list('module_data') retrieves it. Maybe even system_get_info() could make use of it internally, in that case.
Comment #147
David_Rothstein CreditAttribution: David_Rothstein commented[accidental duplicate post]
Comment #148
David_Rothstein CreditAttribution: David_Rothstein commentedAnother option would be more like:
Given how many modules there are on a typical site, that's a lot of data to be loading (even from a cache) on every single page request when we only need a very small part of it.
Comment #149
sunPreparing/processing stylesheets within system_list() won't fly. The function is complex enough already, and we have to keep it lean and mean.
FYI: The $type is 'module_enabled' because we originally added support for 'module', returning all module records (including disabled), just like 'theme' returns all theme records. However, that's different from what has been suggested in #146 with the _data suffix. It's ugly that 'module_enabled' does not return full data records. Ideally we'd change that, as it's plain silly to query lots of data and throw it away afterwards - but anyway.
http://api.drupal.org/api/function/system_get_info/7 seems like a perfect fit here.
Comment #150
bleen CreditAttribution: bleen commentedSo if system_get_info is the way to go here (or even if there is still some debate about adding a 'module_enabled_data' $type to system_list()) then I guess the real question is still can we make this change to D7 at this stage? Webchick/Dries ?? I'm happy to keep working on this patch as reviews come in, but it would be great to get a green light :)
Comment #151
David_Rothstein CreditAttribution: David_Rothstein commentedsystem_get_info() more or less duplicates the database query that system_list() does and doesn't have any caching, which is fine for its current use case of being called on an occasional admin page or two -- but if we are adding something to the module .info file that we now need on every single non-cached page request, it really seems like we'd need to improve the performance of that function in order to use it?
So instead of building it up separately, it really seems like it would be better to reuse system_list(), which already has all the performance stuff figured out nicely.
Comment #152
sunTake this.
Comment #153
sunmeh :)
Comment #154
sunAnd well, module.inc has to work without common.inc. Sorry.
Comment #155
bleen CreditAttribution: bleen commentedI like that solution ... Best of both worlds!
I'll combine the two patches as soon as I'm off the bus :)
Comment #156
sunNote: It might make sense to tackle this in #887870: Make system_list() return full module records first.
Comment #158
friedjoff CreditAttribution: friedjoff commented#46: drupal.preprocess-false.46.patch queued for re-testing.
Comment #159
effulgentsia CreditAttribution: effulgentsia commentedYay! Glad to see so much progress on #141.1!
Re #141.4: I really like the 'preprocess' => $group idea. I like the idea of adding the two groups you suggest: "user" for files based on some user_access() permission, and "admin" for files only needed on administrative pages. The system would allow contrib to extend with more groups, but this seems like a great and doable-in-D7 core set. However, I think this would also require a "theme" group, assuming it makes sense to sandwich the "admin", "user", and contrib-defined groups between the module .info and theme .info groups.
The permission based one ("user") seems simple enough: the module should drupal_add_(js|css)() in hook_init(), but within an if statement that checks user_access(). So, upon first contact with the site, you get 2 aggregate files (one with the module .info specified source files, and one with the theme .info specified source files). Then, after logging in, you get one more aggregate file containing the sum of what's been added in all hook_init() implementations for your level of access - only what you don't already have browser-cached from the first two files.
The path based one ("admin") is a little trickier, I think, since we have hook_admin_paths_alter(). What a module really needs to say is "I need these JS/CSS files for this page", not "I need these JS/CSS files if Drupal thinks this page is administrative". Note, overlay module JS/CSS files are different, and probably fall closer into the user permission based category. I don't know what our compelling examples of page-specific JS/CSS files are in core, but a contrib example might be something like the Views UI module. What if we expanded hook_menu() to have "stylesheets" and "scripts" keys? Something like:
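The example from the original comment is not preserved; reconstructing its shape from the surrounding text (the 'stylesheets'/'scripts' keys are the proposal itself, everything else is invented):

```php
<?php
function mymodule_menu() {
  $items['mymodule-page-with-fancy-ui'] = array(
    'title' => 'Fancy UI',
    'page callback' => 'mymodule_fancy_page',
    'access arguments' => array('use mymodule fancy ui'),
    // Proposed additions: declare page-specific assets up front, so the
    // system can decide how to group and aggregate them.
    'stylesheets' => array(drupal_get_path('module', 'mymodule') . '/mymodule.admin.css'),
    'scripts' => array(drupal_get_path('module', 'mymodule') . '/mymodule.admin.js'),
  );
  return $items;
}
```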
But then what do we do with this info? Are we saying that when a user goes to the 'mymodule-page-with-fancy-ui' page, they should get both 'mymodule.admin.css' and 'myothermodule.admin.css' (along with all other files whose 'preprocess' is 'admin'), since otherwise, as they navigate from page to page, they keep getting a different aggregate file with some redundancy with previous aggregate files they already have? Or, by removing the module.info, user-permission based, theme.info, and large files added without a 'preprocess' flag from this aggregate, is it now ok to re-introduce this much smaller level of aggregate file denormalization? I'm leaning towards saying that we should first try the latter, and let a contrib module experiment with the former. Perhaps it even makes sense to introduce two groups here: "pages" and "admin", though I'm not sure if there's practical benefit to doing so, unless we go with the former approach for at least one of them.
Re #141.3
I disagree and stand by what I said in #131. Unless you have a highly complex site, when you enable aggregation, if you are still left with >31 CSS files on a given page, we should consider that a bug (a buggy module or a poor choice in enabling too many modules that don't aggregate well together). If the way the site builder / owner learns about this bug is by complaints from IE users, great! We should not impose the less efficient @import statements on everyone in order to coddle these buggy sites. The use of less efficient @import statements was deemed acceptable only for situations where aggregation is disabled, because those are situations where you've already decided to throw away front-end performance. For the highly complex sites that have a legitimate occurrence of >31 files with aggregation enabled, they can enable a contrib module that works around the IE limit (e.g., by overriding drupal_pre_render_styles()).
Comment #160
sun
1) I'm also quite pleased with the 'preprocess' => $group idea, but I'm also a bit concerned about the complexity + knowledge we're demanding from "regular Drupal developers". Not strictly associated with this very solution attempt though; rather an overall observation.
2) Aren't "admin" and "user" actually the same? user = user + admin
3) Using the menu system would be fun, if it allowed for that. Reading the suggestion, I immediately agreed. But right now, we'd have to introduce 1-2 columns for {menu_router} to store the JS/CSS data. Also likely needs inheritance from parent paths, and needs to be serialized during router rebuilding. Not trivial, but doable.
4) Didn't fully understand the conclusion and/or questions drawn in the para right before "Re #141.3".
5) I won't comment any further, because I don't know the details of that d.o IE >31 files issue. Hopefully, Owen will. :)
Comment #161
effulgentsia CreditAttribution: effulgentsia commented
Given #160, perhaps the best path forward is:
- Have modules add permission-dependent files in hook_init(), inside an if (user_access(...)) block and with 'preprocess' => TRUE. This is relatively minor redundancy: just need a full new aggregate file once after login.
- And evaluate where we are after that, and whether hook_menu() complexity and/or 'preprocess' => $group complexity has sufficient payoff.
[EDIT: and yeah, if we do try the hook_menu() idea, it should be the 'pages' group, since I think you're right, and 'admin' doesn't mean anything not already covered by 'user', and I think my conclusion of the moment is we should allow the 'pages' aggregate file to fully vary from page to page, just like the complete aggregate file does in D6, because I don't think any 1 'pages' file will be very big.]
Comment #162
Owen Barton CreditAttribution: Owen Barton commented
I have had to catch up with other stuff this week so haven't had much time to weigh in. However, I am off on vacation next week, so hopefully (in typical Drupal-geek style!) should have a bit of time to have a play with some of these ideas.
In general I think all the ideas being discussed sound reasonable, and we just have to balance the performance benefits against the DX and against the scope/complexity of each change (so we can actually get it to D7).
Picking out a few of the points:
- Adding files in module .info files - obviously a great idea, really fixes the DX: let's keep plugging away at this.
- Using roles/paths to have a separate aggregate for admins - I think we need to be very careful here. I agree that a separate aggregate for admins is very likely faster (even if it duplicates the "vanilla" aggregate). That said, depending on the implementation, I think it has the potential to really reduce the DX and reliability (and hence performance) of the system in practice. Adding files to be aggregated and adding files individually & conditionally really should be 2 quite different things, from a developer mindset point of view. In Drupal 6 we had 1 method to do 2 different things, which led to major performance issues. In Drupal 7, if we move to .info files for adding aggregated files then we will have 2 methods to do 2 different things, 1 each - simple, intuitive and hard to break. If we then re-add the suggestion that you can also add aggregated files conditionally in specific situations in hook_init (or hook_menu elsewhere) then we now have 3 methods to do 2 different things, and it gets increasingly hard for developers to pick the correct one, as well as implement it correctly. Hence, I would really only support this if it could be integrated into the .info file system somehow.
- Ideally I would love to go one step further and actually have an aggregate file for each role-combination. I think the way to do this is actually to use permissions as the key for all files that are set to aggregate. We can then determine the contents of each aggregate file based on the permissions. It might even make sense to move to a situation where the aggregates are generated when specific admin pages are saved, rather than in the "lazy" way we do now. I plan to have a play with this over the next week - even if it turns out to be too complex for D7 it could perhaps be turned into a contrib module (allow contrib to associate files with permissions, perhaps with a UI for non-participating modules).
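The "aggregate file for each role-combination" idea above can be sketched as keying the aggregate filename off the user's permission set rather than their roles, so any combination of roles granting the same permissions shares one file. All names below are hypothetical illustration, not core or patch code:

```php
/**
 * Sketch: derive a deterministic aggregate key from a user's permissions.
 *
 * Two users whose role combinations grant identical permission sets get
 * the same key, and therefore share the same cached aggregate file.
 * Function name and filename scheme are made up for this example.
 */
function mymodule_aggregate_key(array $permissions) {
  // Sort so the order roles were assigned in does not change the key.
  sort($permissions);
  return 'css_' . md5(implode(':', $permissions)) . '.css';
}
```

Aggregates could then be regenerated whenever permissions are saved, instead of lazily at request time.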
- In terms of the 31 stylesheets issue I think we need to be more clear about what our objective is. I don't think that it necessarily makes sense to (for example) send smaller duplicate aggregates just to avoid this one problem. Of course if there are performance or DX reasons for the change then that is a different story - but I think that we need demonstrated merit. I totally agree we should avoid @imports for any typical case - I wasn't trying to suggest that we should do that with my patch, but was just experimenting. @effulgentsia - I seem to remember a reason in the original issue, but I can't find it now - is there something stopping us using link tags up to the 31 limit, and then multiplexing the remaining files in with @imports? Also, the approach depends greatly on if we add an "admin pages" aggregate (or similar) - which I am pretty sure would make approaching the 31 limit much much smaller.
- In terms of sending smaller duplicate "per-page" aggregates. I think the jury is out on this one - from my benchmarks there is not enough data (since we would be comparing a smaller duplicate download). We need to determine a typical number of files that would go into this benchmark (a quick click round the core admin suggests perhaps 6 as a number to start with) as well as a typical size of these files. Once we have this it should be pretty easy to benchmark. I think a determining factor will also be the "repeatability" (and hence cache-ability) of these additional files - if many of them tend to occur on multiple pages (although presumably not enough for the regular aggregate) then it seems likely that allowing the browser to cache them is always going to be faster (assuming the user in question visits more than 1 page every 2 weeks).
- We should ensure we don't forget about the "weight splitting" issue. If we move to an .info file approach, this could be pretty much solved in any case, since I don't think that users would have any weight control anyway (other than the module weight and perhaps insertion order - combined with an alter if they need to do something weird). In this scenario we would just need to set the default weights of aggregate files in a sensible way - we might want to use fractional weights to reduce the chance of having non-aggregated files pop up if they happen to have the same weight. We could also consider combining the aggregate groups if possible (or make it more likely for them to "glob together" by setting a similar weight) - it would be nice if core, contrib and themes could "normally" form into at most 2 aggregates.
Comment #163
Dries CreditAttribution: Dries commented
The 'preprocess' => $group idea, in combination with .info file support, would work for me.
Comment #164
alanburke CreditAttribution: alanburke commented
Was speaking to somebody [sorry, can't remember who] at the sprint today about this issue.
We discussed something similar to #328357: Allow module .info files to add CSS/JS as part of the solution.
Just noting it here.
Comment #165
sun
Actively worked on follow-up issues:
1) #887870: Make system_list() return full module records
2) #328357: Allow module .info files to add CSS/JS
Comment #166
bleen CreditAttribution: bleen commented
Since I'm a bit confused where I should be posting patches for adding CSS/JS to .info files, I'm noting that there is an updated patch at #328357: Allow module .info files to add CSS/JS
Comment #167
jenlampton
Let's keep that on the other issue :)
Comment #168
mfer CreditAttribution: mfer commented
There is one massive concern with 'preprocess' => $group. What if one module adds a js file to the admin group and another module adds the same script to the user group? Depending on the pages these are displayed on and the order they are included, you could end up with 4 files for groupings instead of 2. And that's just in the case of one file. For each additional file this can get worse.
A simple use case for this is ui.dialog.js. There are times where this will be used for the admin and for front-end users. Modules tailoring this for their case can cause problems.
While I love the idea of groupings and will use them, there is potential for the default case to cause a worse duplication problem than we already have. Each site really needs to set up the groupings for its use cases. Sure, we can configure core to be optimal. But how do we make it easy for contrib to not screw this all up?
Comment #169
Jacine
FYI - system.css cleanup should be ready to go here: #885228: CSS Files are in major need of clean up!.
If splitting CSS files to follow a module.admin.css structure will help this issue, just say the word, and I will be happy to start working on it next.
Comment #170
Owen Barton CreditAttribution: Owen Barton commented
I just had a bit of a brainwave, and I think we can get some dramatic grouping improvements really easily and cleanly. The attached (very early stage and untested) patch is on top of #328357: Allow module .info files to add CSS/JS - there is an interdiff also, so you can see the changes clearly. In summary, we can associate CSS/JS that is added through the .info files with permissions, and simply check the permissions before adding - a little mental Venn diagram shows that this gets us the "aggregate by role combination" that was discussed earlier.
Advantages of this approach:
Comment #172
David_Rothstein CreditAttribution: David_Rothstein commented
Hm, is that really a common enough use case to mess with the 'stylesheets' syntax in the info file for, though?
Normally, the bar for adding anything resembling "programming logic" to info files is high - see #522006: Conditional Styles in .info files for a lengthy example :) And AFAIK the same effect can be achieved in hook_init() pretty easily, since drupal_add_css() has it:
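The hook_init() example originally shown here was lost; the following is a self-contained sketch of that approach, with stubbed core functions so it runs outside Drupal (the module name, file name, and permission string are made up — only drupal_add_css(), drupal_get_path(), and user_access() are real D7 APIs):

```php
// Stubs so this sketch runs outside Drupal; in a real site these are core APIs.
if (!function_exists('user_access')) {
  function user_access($perm) { return TRUE; }
  function drupal_get_path($type, $name) { return "modules/$name"; }
  $GLOBALS['added_css'] = array();
  function drupal_add_css($file, $options = array()) {
    $GLOBALS['added_css'][$file] = $options;
  }
}

/**
 * Implements hook_init() (sketch; names are illustrative).
 *
 * The file is added on every page, but only for users with the permission,
 * so 'preprocess' => TRUE is safe: the aggregate stays stable per user.
 */
function mymodule_init() {
  if (user_access('access mymodule features')) {
    drupal_add_css(drupal_get_path('module', 'mymodule') . '/mymodule.css',
      array('preprocess' => TRUE));
  }
}
```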
In core it seems we only have a couple modules that would need that. I wonder what the ratio is in general, of modules that want to add CSS to all pages for all users vs modules that want to add CSS to all pages but only for certain users.
Comment #173
webchick
Yeah, the proposed syntax is a no-go for me. Permissions are just one way you might want to conditionally add CSS to the page. The .info file is not a substitute for PHP logic.
Comment #174
sun
Technically, this could also be (example modified for clarity):
I.e.: to put a file into an aggregation group (permission), use a string key (the permission name) instead of an index key.
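A sketch of how such a permission-keyed .info entry could look (the string-key form is hypothetical syntax from this proposal; only the plain indexed `stylesheets[all][]` form is what core actually supports):

```ini
; Unconditional: goes into the default aggregate for every user.
stylesheets[all][] = mymodule.css
; Hypothetical: the string key names a permission that gates the file,
; placing it into a permission-based aggregation group.
stylesheets[all][administer mymodule][] = mymodule.admin.css
```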
Having touched the relevant functions for #575298: Provide non-PHP way to reliably override CSS just recently, I'm pretty confident that we could alter the preparation of styles/scripts collated from .info files, so that the above would result in the file being assigned to a permission-based aggregation group.
So your proposal *is* actually the 'preprocess' => $group idea, just based on a user permission, which makes sense. It might be limited and not sufficient for 100% of use-cases, but more advanced scenarios can still use drupal_add_css/js() manually, no harm in that.
Furthermore, this solution would be no API change and fully backwards compatible.
Comment #175
Owen Barton CreditAttribution: Owen Barton commented
Yes - because of the limitations of the current system (discussed at length above), right now only a small portion of our CSS can sensibly be aggregated, and no JavaScript (we could perhaps include a couple of files).
Basically we are forced to optimize only for typical anonymous users - all authenticated users will need additional http requests on various pages. Using the proposed approach, a large portion of the rest of the CSS and JavaScript could be aggregated - improving performance for authenticated users. In addition, we would no longer be sending anonymous users CSS for stuff that they have no permission to use and will never need.
I think this makes it quite clear why we need this patch. The whole point of adding aggregate files in .info is that we need to tightly constrain the possible logic for a robust, easy to use API. The only conditions that make sense are things that are constant over the course of a user's visit - if developers start using other conditions to add preprocessed files (for example path-based conditions) you completely break aggregation and defeat the purpose of this issue. Other than permissions, the only other conditions that are in this category would be checks on the site settings, $user object, $_SESSION or user profile - I think these would be quite rare (and you could still do these in several different ways - e.g. hook_init if you knew what you were doing, or more simply just add them with preprocess = FALSE, use inline CSS if it is small, or have them inherit off a body class).
@sun - thanks for noticing that we can simplify the syntax - I was not entirely happy with this myself, there is clearly no need to add an extra array dimension :)
Comment #176
David_Rothstein CreditAttribution: David_Rothstein commented
I think maybe I'm missing something. So far it seems like we've identified three CSS files in core (aggregator.css, comment.css, and search.css) that meet the criteria of:
(a) They generally already appear on many/most pages of the site,
(b) But only for a certain subset of users.
Which are the others that would benefit from this?
Comment #177
Owen Barton CreditAttribution: Owen Barton commented
@David_Rothstein - I think you took my patch a bit too literally; it is intended to just be a sketch :)
The files noted there are only the ones that we are already aggregating (which is not many). The point of this approach would be to work through the many other CSS and JS files that we are not currently aggregating (and can't, because with the current code we would end up sending anonymous users a huge pile of code they never need) and allow them to be added to the aggregate for the specific users that need them. This avoids the extra http requests we have now for these users, and should make page loads faster for them, with no detriment to other users.
In other words, because the aggregates would be targeted at appropriate users we can significantly lower the bar for what gets aggregated. For example with the patch it would be fine to aggregate admin oriented CSS/JS (especially for commonly used admin pages - such as perhaps node forms, menu admin, block admin). Right now we can't aggregate any of this (so these files all create separate http requests for admin users), because it would make the anonymous aggregate really large for no reason (see #91).
Comment #178
David_Rothstein CreditAttribution: David_Rothstein commented
Oh, no, I did actually understand that :) I'm just trying to get a sense of what those files would be.
There certainly could be a few more (toolbar.css and shortcut.css come to mind), but I am not sure most files in Drupal really fit that paradigm? Even some of the above ones more "properly" belong in #attached (where they are closely related to the thing they are styling and therefore can easily be swapped out in alter hooks, etc) - it's just that we might be saying we'd rather add them separately on all pages since we expect them to be needed on most pages anyway.
Then there are library files (e.g. jQuery UI - and as discussed above, those can really add up on certain pages), which don't quite seem to fit this permission paradigm...
And then there's JavaScript, which I think we have to be careful to add only on pages where it's needed/expected, or all sorts of funky problems can arise (theoretically that can be a problem with unexpected CSS too, but I think it's less common).
Whether or not it makes sense to add a syntax for this to the .info file probably depends on how common it is compared to other use cases. And I don't think adding it to the .info file enabled anything that wasn't possible before; it just potentially makes it more convenient. You could always wrap drupal_add_css() in an if(user_access(...)) statement if you wanted to. I think this user_access() idea definitely has promise, but it probably can't be the whole solution. In the end, I think maybe we'll still want/need to aggregate (separately) a lot of the files that are "left over" after all the above is done, even knowing that they may need to be redownloaded sometimes.
Comment #179
alanburke CreditAttribution: alanburke commented
Hi all
Just a question.
Are the following fair assumptions?
A css rule within a core module's css file should never override a rule from another core module's css file.
A css rule within a contrib module's css file should never override a rule from another contrib module's css file.
[Or at least, it wouldn't work reliably unless they set the weight in drupal_add_css]
Given these assumptions, we could do aggregation based not on the order of the files, but based on the weight in drupal_add_css.
Where the weight is equal for the files, they can all be in the same aggregated file.
On a default install, where no funny business has happened with drupal_add_css weights,
we would end up with 3 aggregated files.
Core css aggregated file
Contrib css aggregated file
theme css files aggregated file
These would be intermixed with the non-aggregated files.
Does that sound right - or just clueless utterings?
Comment #180
alanburke CreditAttribution: alanburke commented
So based on #179, here is a very rough and ready patch.
The idea is to preserve the #weight value, so that it can be used to group css files.
A new value called 'insertion_order' is added to keep the css files in order.
A new function is added so ksort can work.
Only for CSS for now anyway.
If this works correctly, we should end up with a single aggregate file per weight.
Needs a lot of work, but it might provide a starting point.
Comment #182
alanburke CreditAttribution: alanburke commented
Please ignore my ramblings above in #179 and #180.
That's how it works already...
Comment #183
sun
I think we should wait for the module .info file feature to see the actual current status and base further decisions on that. However, one idea crossed my mind just yesterday:
What if we (ab)used libraries? To build aggregation/preprocessing groups/containers?
Comment #184
effulgentsia CreditAttribution: effulgentsia commented
While waiting on #328357: Allow module .info files to add CSS/JS, I wanted to share my latest thinking on this: I think most of it is a rehash/remix of ideas already floated in earlier comments by David, Owen, sun, and others.
We currently have constants CSS_SYSTEM (-100), CSS_DEFAULT (0), and CSS_THEME (100) (and the JS counterparts: JS_LIBRARY (-100), JS_DEFAULT (0), JS_THEME (100)). These are constants used for setting 'weight', so we have code that does stuff like 'weight' => JS_LIBRARY - 20 or 'weight' => CSS_DEFAULT + 1.
What if we introduce a new property, 'group', repurpose the constants to be groups, and refactor the above code to 'group' => JS_LIBRARY, 'weight' => -20 and 'group' => CSS_DEFAULT, 'weight' => 1 (note, by definition of the word 'default', you can omit 'group' instead of explicitly setting it to CSS_DEFAULT or JS_DEFAULT).
Then we can change our sorting logic from being by weight alone to being by group first, then within a group by items with 'preprocess' => TRUE before items with 'preprocess' => FALSE, and finally by weight. This ensures we only have one aggregate file per group.
Then, we add group constants CSS_MODULE and JS_MODULE (each can be set to -50 to order it between CSS_SYSTEM/JS_LIBRARY and CSS_DEFAULT/JS_DEFAULT). We put anything listed in the module.info files into these groups. If we choose not to have module.info files capable of specifying permission requirements, then we also let developers know that they can use hook_init() to conditionally call drupal_add_*() within a condition that stays constant for a given user across a site experience, and when doing so, to specify 'group' => CSS_MODULE (or JS_MODULE).
Then, CSS_DEFAULT and JS_DEFAULT remain for when drupal_add_*() is called within something dynamic, as well as for #attached.
Finally, we implement our aggregating logic to make separate aggregate files per group.
So we potentially have 4 aggregate files: one for each of the groups SYSTEM/LIBRARY, MODULE, DEFAULT, THEME. But, all but the DEFAULT one can be efficiently reused across the site visit. The DEFAULT one would change per page, but would be fairly small on each page.
Contrib modules and themes can add new groups easily, since our core constants have lots of space between them.
I think with this setup, we'd want to go back to making 'preprocess' => TRUE the default.
I think the above DX is pretty straightforward. For each file, you basically have 3 choices:
1) Put it in the .info file (people are already used to this from theme.info experience)
2) Use hook_init() to call drupal_add_*() with 'group' => *_MODULE (this one is new, but grokable)
3) Use drupal_add_*() or #attached from somewhere other than hook_init() (people are already used to this)
And the above 3 choices map to what a front-end developer SHOULD be considering. i.e.,
1) Will most people need this file during a typical site visit?
2) Will a minority of people need this file during a typical site visit?
3) Will this file only be needed on a page that's not part of a typical site visit?
Comment #185
sun
#328357: Allow module .info files to add CSS/JS just went in. So let's move forward here. I guess we need new benchmarks and actual aggregate-file statistics in order to make any decisions.
Comment #186
mfer CreditAttribution: mfer commented
I don't think grouping is critical. Not even sure it is major. We could update the Support File Cache module in contrib, and people who want this functionality can use it there. This piece of functionality can live in contrib.
I also do not like the API change to add a feature this late in the game. Isn't it past the freeze point?
What is critical is optimizing the files we have.
At what point does this become a feature for D8?
Comment #187
effulgentsia CreditAttribution: effulgentsia commented
I believe some minimal grouping as per #184 is critical because of the points raised in #117-#123. I think Support File Cache will still have a contrib role to play to handle more advanced use-cases, but core should come with what is needed by core use-cases, and we've identified that:
1) D6's approach of more-or-less a single aggregate file is really inefficient, due to every page requiring a slightly different version of the aggregate file, hence, how this issue started.
2) As per points raised in #117-#123, and elaborated since then, simply defaulting 'preprocess' to FALSE, as is what is now in HEAD, leads to all sorts of problems.
I'll try to work on a patch that implements #184 soon, but if someone else wants to try that, or wants to carry on and refine the idea in #170, that's cool too. A little parallel experimentation here is not a bad thing. I think as long as we end up with something that has good DX and good front-end performance for core modules, then that solves this issue.
Comment #188
sun
Attached patch for starters.
Looking at the current results in my browser, the situation doesn't look as bad as I imagined.
- For anonymous users, I see 2 CSS aggregates (all/print) and 1 JS aggregate.
- The only non-aggregated files are for authenticated users only. As expected, they break up the aggregates of anonymous users. Interestingly, only the CSS aggregates are broken up. (1 for all became 2)
- However, looking at the actual files, I fail to see any relationship between the unaggregated ones. Some of them are loaded permanently, conditionally by user permission, but the permissions don't really "map". So unless we're talking about some higher-level meta categories à la "admin" and "form" or similar, I don't see how those files could ever be aggregated into a single aggregate that makes remotely sense with regard to front-end performance.
- It looks like JS/CSS loaded via hook_library() is currently loaded with the special weight CSS_SYSTEM/JS_LIBRARY (DRUPALWTF@THATMISMATCH?!) - which I think was done purposely, but may not be true for all "bundles of JS/CSS"; e.g., contextual links are just normal JS/CSS files. When using that special system/library weight, they happen to break up aggregates.
Comment #189
sun
I actually think that this is wrong. It is wrong that hook_library() defines the preprocess option in the first place. drupal_add_library() has to support $options instead. Only the caller knows whether a library is going to be loaded on all pages and should thus be aggregated, or whether it is a one-time/one-page/user-specific operation.
drupal_add_library() should support $options, but not all regular $options. Having libraries pre-define their own weight is correct.
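A sketch of what the suggested change could look like — a caller-supplied $options array merged onto each of the library's files, with library-owned properties like 'weight' excluded. This is a hypothetical signature written for illustration (hence the _sketch suffix), not the core drupal_add_library():

```php
/**
 * Sketch: let callers pass $options through to each file in a library,
 * except for properties the library itself owns (like 'weight'), since
 * libraries pre-define their own ordering. Hypothetical; not core code.
 */
function drupal_add_library_sketch(array $library, array $options = array()) {
  // The caller may decide aggregation context, but never the weight.
  unset($options['weight']);
  $added = array();
  foreach ($library['js'] as $file => $defaults) {
    // Caller options win over library defaults for the remaining keys.
    $added[$file] = $options + $defaults;
  }
  return $added;
}
```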
That's slightly off-topic for this issue though. Any takers? Please link to the issue here. I'll happily review.
Comment #190
mamarley CreditAttribution: mamarley commented
Some of the work done here appears to be completely breaking CSS and JS aggregation for me on Alpha 7. Instead of getting a small number of aggregated files, I get a few files that were created by the aggregator but only have the contents of one CSS/JS file in them, and several more files that have not been aggregated at all. Additionally, comments and the like have not been removed from the JS files. I filed a bug about that (http://drupal.org/node/913972), but it would appear that this issue also has something to do with it. Is this known to be broken, or is there something wrong with something I am doing? Thanks.
Comment #191
jide CreditAttribution: jide commented
I have to subscribe to this incredible issue.
Comment #192
sebas5384 CreditAttribution: sebas5384 commented
#188: drupal.preprocess.188.patch queued for re-testing.
Comment #193
catch
I don't understand why filter and openid moved from form_alter() to .info. The login form may not be shown on every page - plenty of sites just have a link to it. Similarly, anonymous users might not be able to post nodes or comments, or have access to the filter switcher, or the comment form might be on the 'separate page' option, so loading that unconditionally seems odd too. These are basic configuration options in core which plenty of sites make use of; the logic here ought to be able to account for them without penalising one or the other option.
Even taxonomy.css is only loaded on term view (default core term listing pages) - it may well not be loaded on views, panels or node view pages, which will be the bulk on any real site.
Also - what is critical here?
The original patch went in, there seems to be a followup for jquery which I don't think has been committed, then there's new discussions and patches which move the entire issue in a different direction, none of this helps it get resolved.
So - if jquery/jquery.once changes aren't in, can we split that into a different issue and get it committed?
Why is jquery grouping a critical bug report when no-one even mentioned it here until after the original patch was committed and there doesn't appear to be any consensus on how to do it? Can that be split into a new issue too? If not can someone write an issue summary?
Comment #194
sun
As the front-end performance benchmarks in this issue outline, the point is that we need to find a sane standard for aggregating files into the default aggregate. All files that are usually (most commonly) loaded by anonymous users (user, filter, openid, taxonomy, etc.) should be unconditionally loaded via the module's .info file, so they get into the default aggregate. Ultimately, the goal is to have a single aggregate file for anonymous users, since any additional HTTP request is much slower than loading a few JS/CSS bytes more.
I agree that we likely want to commit #188 first, or move those quick fixes into a separate issue to keep this one focused on the macro-level problem. The remaining critical issue is that users can easily get many separate aggregate files, which is caused by files that use 'preprocess' => FALSE and appear in between all other files (most often using an individual weight). This, in turn, may mean that we save client/server bandwidth but slow down front-end performance due to many smaller HTTP requests.
Comment #195
catch
I disagree that these are all 'commonly' loaded - it depends on site configuration; the existing code accounts for that, the changes here don't.
Comment #196
mfer CreditAttribution: mfer commented
Because the preprocess bug for jquery and jquery.once was being skipped over in favor of other conversations, I moved it to its own issue in #925100: Preprocess is misdefined in system_library().
I agree with @catch on the commonly loaded scripts. OpenID should not be included everywhere. For many cases it is bloat, and a more intelligent loading mechanism is in order. But that is a small optimization.
This big issue that has come up here is adding a grouping mechanism into preprocessing. We need to move on this for D7 now or move that change to D8. I am happy to help this in D8 but will not have the time to really help if we move on that for D7. Is someone else going to make it happen right away or is it best left for the next release cycle?
Comment #197
sun
It looks like the last comments missed the point of this entire issue. The remaining OpenID and Filter module assets are currently loaded with preprocess FALSE, so they won't be aggregated. While preprocess could be changed to TRUE in the calling code, that would lead to NEW aggregate files, 80% DUPLICATE aggregate files on certain pages, and ultimately a giant waste of bandwidth. This has been explained and proven with front-end benchmarks over and over again since the beginning of this issue.
So we have to make a case and find some sensible default patterns. This cannot be deferred to D8. Also, there is no way back "to before", because aggregation was utterly broken before (which is also explained various times in the past ~200 follow-ups).
As of now, the simplest pattern most of us can agree with is to unconditionally load a module's assets if it is most commonly used by anonymous users - basically applying an 80/20 rule here. We therefore allowed module .info files to specify JS/CSS assets, too, which are unconditionally loaded, similar to a theme's assets; that eventually allows us to apply a special weight or other property to those files, so they will never ever be broken up into multiple aggregates. The remaining, primary challenge is to prevent an unintended aggregate split-up from happening, as that means that clients have to re-download stuff they already downloaded.
Given all that, we already changed a couple of module assets to be loaded unconditionally via the module .info file patch. This patch here merely changes the remaining occurrences I found through manual testing + analyzing the contents of those asset files and their usage.
This is the best or ideal solution for >80% of all sites. If a site does not allow anonymous commenting and really thinks it has to cut off the few bytes of JS/CSS that filter.js/filter.css is adding, then - as usual - they can install Support File Cache module or whatever other solution arises for D7 in contrib. Same applies to OpenID - if you install the module, then it's almost insane to expect that the module's assets will not be needed for anonymous users. Of course, this only applies to enabled modules.
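For reference, the module .info file mechanism mentioned above lets a module declare assets that are loaded unconditionally on every page. A minimal sketch, with hypothetical file names (not a file from the patch):

```
; Hypothetical module .info entries: assets listed here are added on
; every page and are therefore safe to keep in the stable aggregate.
name = Example
core = 7.x
stylesheets[all][] = example.base.css
scripts[] = example.js
```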
Second to last: please bear in mind that all of those files are minimal in size. All of them are no-brainers. The real challenge will probably hit contrib instead, in case you have a not-so-small asset that "should" be unconditionally loaded for the 80% use-case, but could be a problem if the file is larger than the few bytes we're talking about here.
Lastly, we still have to try to come up with ideas to group conditionally loaded assets, and ensure that aggregates of unconditionally loaded assets are not blatantly broken up by conditionally loaded ones, which is still the case currently.
Comment #198
catchSo with OpenID...
If the login block is shown on every page, then every page will include the CSS - no different from the patch.
If the login block is only shown on the user/login page, then only that page will include the CSS (which will result in an aggregate for the user/login page). Does user/login get a high percentage of anonymous user traffic? No it doesn't.
What point is being missed?
Comment #199
sunThe point and difference is that you load a separate, non-aggregated file on every page that contains the login block, effectively leading to a separate HTTP request, which in turn slows down page rendering. Our intention is to get the default behavior of Drupal core back into a state similar to where it was before: a single aggregate file for the most common use-case.
In case you now want to suggest to just set preprocess to TRUE, then we immediately circle back to the problems of conditionally loaded files I stated in #197 already, which is what remains to be resolved here.
Comment #200
catchIf it's conditionally loaded, but also loaded on every request, what's the difference from loading it unconditionally?
Comment #201
geerlingguy CreditAttribution: geerlingguy commentedSubscribe. I've only been able to read through about 2/3 of this issue so far... ugh.
Comment #202
effulgentsia CreditAttribution: effulgentsia commentedFYI: I'm going to spend the next couple hours or so (or with any luck, less) implementing what I outlined in #184, and will post a patch when it's ready. We might not go with it, but I think it will be a useful discussion point that will help bring this issue to closure regardless, as it will give us a concrete basis of comparison with #188 for discussing the points being argued between catch and sun with respect to where http request inefficiencies get introduced.
Comment #203
effulgentsia CreditAttribution: effulgentsia commentedOK. Let's see if we can get this to pass tests, then evaluate what it does for front-end performance.
Comment #205
effulgentsia CreditAttribution: effulgentsia commented.
Comment #206
effulgentsia CreditAttribution: effulgentsia commentedWhile testbot is running, here's a summary of the patch. There are some differences relative to what I suggested in #184:
- As per #184, this patch introduces a 'group' option, and repurposes the (CSS|JS)_(LIBRARY|SYSTEM|DEFAULT|THEME) constants as groups rather than as weight offsets. In other words, instead of code that passes a 'weight' => JS_LIBRARY + 2 option to drupal_add_js(), this patch changes it to 'group' => JS_LIBRARY, 'weight' => 2.
- This patch also adds an 'every_page' option that defaults to FALSE, but that code that calls drupal_add_(css|js)() from a hook_init() implementation or other code that puts the file on every page can set to TRUE (done automatically for files added via lines in a .info file). The purpose of this flag is to enable the system to order the files in a way that minimizes aggregate file inefficiency. More details on this are in docblocks in the patch.
- This patch adds a drupal_sort_css_js() function so that the CSS and JS files are sorted by group, then by the every_page flag, then by weight.
- This patch changes drupal_get_css() and drupal_get_js() to separate aggregate files by group and the every_page flag. Explained in patch.
- This patch reverts the 'preprocess' option to default to TRUE; the only place it is set to FALSE in core is for IE-conditional CSS. As discussed in earlier comments, this issue originally changed the 'preprocess' default because of the inefficiency of a single aggregate file that's always changing. The addition of 'group' and 'every_page' allows us to partition aggregate files in a way that minimizes that inefficiency, and as other comments have mentioned, there are benefits to allowing things to be aggregated. We may, of course, find compelling reasons for particular files to set 'preprocess' to FALSE, but let's do those on a case-by-case basis, with arguments as to why that's wanted.
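A rough sketch of what the summary above describes (the option names come from the patch; the file name, weights, and the comparator function name are hypothetical illustrations, not the patch's actual code):

```php
<?php
// Before the patch: the position was encoded entirely in a weight offset.
drupal_add_js('misc/example.js', array('weight' => JS_LIBRARY + 2));

// With the patch: group, weight, and the every-page hint are separate.
drupal_add_js('misc/example.js', array(
  'group' => JS_LIBRARY,
  'weight' => 2,
  // Set by code that adds the file on every page (e.g. from hook_init());
  // set automatically for files listed in a .info file.
  'every_page' => TRUE,
));

// Approximate sort order implemented by drupal_sort_css_js():
// by group, then by every_page (TRUE first), then by weight.
function example_sort($a, $b) {
  if ($a['group'] != $b['group']) {
    return $a['group'] < $b['group'] ? -1 : 1;
  }
  if ($a['every_page'] != $b['every_page']) {
    return $a['every_page'] ? -1 : 1;
  }
  if ($a['weight'] == $b['weight']) {
    return 0;
  }
  return $a['weight'] < $b['weight'] ? -1 : 1;
}
```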
I guess the real test of this patch is to evaluate the resulting difference in front-end performance. I'm not really sure how to do that. @Owen, @mfer, @sun, et al: how do we evaluate this?
Comment #208
mfer CreditAttribution: mfer commentedA couple things about the patch in #205.
Comment #209
sunI must say that #205 makes a lot of sense and is an excellent patch. I don't really see the problems that @mfer mentioned. Contrary to what he says, right now JS_LIBRARY is a weight, not a type. The idea of this patch is to turn it into a group to help us build and ensure sane aggregates. Groups like "admin" or "all pages" have been discussed earlier in this issue already, and there have been many valid counter-arguments for why such groups do not work. Lastly, files are still ordered by weight within a group, so I don't see a dependency problem; in general, all Drupal assets should avoid precise (and therefore fragile) dependencies anyway. Basically, this patch re-introduces groups like the simple 'core', 'module', 'theme' groups we've been able to work with in the past, merely with better names, better aggregation, and continued support for individual file weights.
Another masterpiece from effulgentsia - you rock! Let's resolve that 1 failure and be done with this.
Comment #210
effulgentsia CreditAttribution: effulgentsia commentedWorking on the test failure now. @mfer: I'm in IRC if you still have concerns after reading #209.
Comment #211
effulgentsia CreditAttribution: effulgentsia commentedComment #212
effulgentsia CreditAttribution: effulgentsia commentedThis is somewhat true, as in the following code that was needed to fix the test failure:
As per #209, I don't think the move from 'weight' => JS_LIBRARY - 21 to 'group' => JS_LIBRARY, 'weight' => -21 makes things any harder. But the 'every_page' flag does increase complexity a bit, because if you want a page-specific JS file to come before an 'every_page' JS file, you can't accomplish that with weight alone. For example, if the above wasn't a test case, but was a real need, how would you accomplish it? You could decide to make your "I need to run early" file something that is added to every page, as above. Or, you could do this: 'group' => JS_LIBRARY - 1. However, that only works if it's okay for your file to run before all JS_LIBRARY files. What if you want a page-specific JS file to run after jquery.js but before jquery.once.js? In HEAD, you can do this with 'weight' => JS_LIBRARY - 19.5, but with this patch, you could only do it with 'group' => JS_LIBRARY, 'every_page' => TRUE, 'weight' => -19.5. But if the JS file you're putting there shouldn't be on every page, then you're abusing the flag, and causing the aggregate file inefficiency that this issue is trying to solve.
I'm tempted to say that this is an edge enough use-case that we shouldn't worry about it. You shouldn't have JS files that need to be sandwiched between two files that would normally be aggregated together without also including yours in the same aggregate. So if you really need your JS to run between two other JS files that are output on every page, then you should also output yours on every page.
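In concrete terms, the sandwich case would have to be written something like the following under the patch (the file name is hypothetical, and this assumes, as the comment implies, that jquery.js and jquery.once.js sit at weights -20 and -19 within the JS_LIBRARY group):

```php
<?php
// Forces the file onto every page just to win the ordering battle -
// exactly the abuse of 'every_page' being warned about above.
drupal_add_js('sites/all/modules/example/shim.js', array(
  'group' => JS_LIBRARY,
  'every_page' => TRUE,
  'weight' => -19.5,
));
```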
But I did want to point this out, in case anyone has thoughts on whether this is a show-stopper for the patch, and if so, how to improve it.
Also, while I agree that moving from a single weight option to 3 options that affect sort order is inherently some increase in complexity, I think it's worth the trade-off. Front-end developers should be thinking about front-end performance. This is giving them the tools to implement front-end performance. A little bit of grokkable complexity is, IMO, better than telling people: sorry, you're stuck with inefficient aggregation that you can't do anything about. That would be WordPress's answer, but it's not the spirit of Drupal.
Comment #213
mfer CreditAttribution: mfer commentedI believe the next step is to do the type of tests Owen previously did. What does this approach get us?
Comment #214
effulgentsia CreditAttribution: effulgentsia commentedDiscussed in IRC with mfer. He remains wary of the complexity the patch introduces to developers that want tight control of order. Note, as per #212, tight control of order isn't lost, just requires dealing with the 'every_page' flag, in a sometimes unintuitive way.
As a reminder, we originally changed from how we did things in D6 because of this comment from #17:
And the numbers in #91, along with this comment:
But #118 and #122 raised really good concerns about the decision to default 'preprocess' to FALSE.
#211 tries to achieve Owen's original intent, but with 'preprocess' set back to TRUE by default.
Do we need to generate more numbers like Owen originally did, in order to properly evaluate #211? I'm not up for that task. Is anyone?
Comment #215
mfer CreditAttribution: mfer commented@effulgentsia we are doing all of this to improve performance. yes, we need to run the numbers.
Comment #216
mfer CreditAttribution: mfer commentedI stopped working on API changes like these months ago because of the phase of the D7 cycle we are in. Webchick just reiterated the phases in #927792: Please explain what is still considered for D7. Is this actually critical, and if so, why?
Comment #217
effulgentsia CreditAttribution: effulgentsia commented#62, July 29, 2010: Dries commits a change from 'preprocess' defaulting to and almost always being TRUE to the opposite, with the following comment:
#118 - #125, August 11-16, 2010: People start discovering issues with this change. Not that the initial change didn't have merit, just that it has undesired consequences.
#163, August 22, 2010: Dries indicates support for finding a solution to the dilemma, even if it means some minor API change.
In #927792-13: Please explain what is still considered for D7, webchick states "Serious regressions from behaviour that worked in previous releases" as an example of a critical bug, but "DX, UX, TX problems that do not prevent Drupal from being used" as not critical. What category does the current mess of what's aggregated and what isn't aggregated, which is a regression from D6, fall in? Beats me. But given the recent timeline of this, I don't think we have to worry that fixing this sends a message that there's any sort of favoritism happening (one of webchick's valid points about why we should be strict about what's still on the table).
Just to be clear, I'm not arguing for #211 as something that is in-scope relative to D6. Just as something that is in-scope relative to what is currently half-broken in HEAD.
Comment #218
mfer CreditAttribution: mfer commented@effulgentsia I wouldn't call the state of affairs a regression. We changed the functionality of the JS/CSS handling.
In Drupal 6:
In Drupal 7:
On top of this sits preprocessing as a performance improvement. Is this a regression, or is it that preprocessing has to work differently to be a performance improvement on top of a different underlying system for JS management?
Personally, I would like to see preprocessing improvements. But, is this critical? Is it a regression? I'm inclined to say no to both of those.
The current proposal seems to be (please correct me if I'm wrong on this) to change the ordering to better support preprocessing: files are ordered by group, then by the every_page flag, then by weight.
Coming from Drupal 6, where items were already grouped by type, this will not be that big of a deal. This patch adds two categories within types. But, is this actually critical? And, if we are doing this for performance it would be good to show this with numbers to back it up.
Comment #219
mfer CreditAttribution: mfer commentedDo we need to explain the functions that do this for .info files? While the documentation is verbose, I think we can just say it's set to TRUE for entries in .info files. What if the functions change? Will we remember to update this comment block?
Overall this looks good. I still need to apply and test this out. But, on a quick pass this looks good. More reviewing to come.
Comment #220
mfer CreditAttribution: mfer commentedThere were offsets when I applied the patch in #211. The only change in this patch is that it takes the offsets into account.
Nice work effulgentsia.
Comment #221
catchI'd like to see the current approach compared with the numbers in #91 and elsewhere. Owen are you able to run benchmarks again?
Comment #223
mfer CreditAttribution: mfer commentedThe failure looks to be because #901062: Regression: Themes can no longer remove stylesheets by overriding them in .info files went in.
Comment #224
mfer CreditAttribution: mfer commentedSome performance numbers. Like Owen, I used Charles Proxy. All the modules were enabled, all blocks placed, and there was dummy content. Each run was done 3 times and the average taken. This is just the time for the JS/CSS.
Under setup B, many of the JS files are only used on a handful of pages, or are admin UI files, and are not preprocessed.
In setup C, the JS file that's included on all pages will be cached, with no duplicates being downloaded. There are still cases where other JS content will be downloaded with duplication, when modules add files only to select pages.
As I clicked through the site there were some varying results. For example, on the blocks page, A and B both downloaded 3 new JS files. C downloaded a single new preprocessed file. The file from C was larger and contained duplicate content. The file in C (7.43s) took longer to download than the 3 files in A and B (5.8s), which were not preprocessed.
In switching to C we should be smart about page placement and whether files should be preprocessed to really take advantage of the preprocessing.
Comment #225
effulgentsia CreditAttribution: effulgentsia commentedLooking into test failures now.
Comment #226
effulgentsia CreditAttribution: effulgentsia commentedFixes the #220 test failures by:
1. Adding a modules/simpletest/tests/system.base.css file, since the fix to #901062: Regression: Themes can no longer remove stylesheets by overriding them in .info files now requires that for files whose 'preprocess' is TRUE.
2. Adding this hunk to drupal_add_css():
Also incorporates the feedback from #219.
The numbers in #224 look encouraging. As mfer and I discussed in IRC, we may identify specific files that we want to set 'preprocess' FALSE on, but as discussed in this issue, I think it's better for the default to be TRUE, which is what this patch allows, and for any we want to make FALSE to be handled in separate follow-ups. Since the #224 numbers don't identify any performance regression of the patch compared to HEAD, what else is needed for this patch to be ready?
Comment #227
effulgentsia CreditAttribution: effulgentsia commented#226: drupal.preprocess.226.patch queued for re-testing.
I asked for a re-test, because #561858: [Tests added] Fix drupal_add_js() and drupal_add_css() to work for AJAX requests too by adding lazy-load to AJAX framework just landed, and it's possible there's a conflict now.
Comment #228
effulgentsia CreditAttribution: effulgentsia commentedThis doesn't need to be assigned to me any more.
Comment #230
effulgentsia CreditAttribution: effulgentsia commentedStraight re-roll.
Comment #231
Owen Barton CreditAttribution: Owen Barton commentedAs I understand it (from the plan above - I haven't looked at the patch in detail) there is a single group that varies by page load. If this is the case, we need to do tests across multiple page loads to quantify this. I need to head out, so this is brief...can elaborate if needed.
Why? The Drupal 6 approach to aggregation is "faster" within a single page load than current HEAD, because it eliminates additional HTTP requests - however, it is much slower across multiple page loads because it breaks browser caching, causing duplicate code downloads. This patch is an improvement because significant portions of the code are put into invariant groupings and can be cached correctly, and the amount of duplicate code should be smaller.
Hence, we still need to validate the assumption that the smaller duplicate aggregate is faster than leaving the individual files in this group unaggregated (and cacheable individually). My gut feeling is that the latter case (for which we have no patch yet) may well be faster.
Comment #232
David_Rothstein CreditAttribution: David_Rothstein commentedIt would be very useful to test, but I don't think it's an assumption or a requirement. The requirement should be more like "not too much slower".
That is, I think we are making a conscious decision here to potentially make a small tradeoff - fewer HTTP requests the first time you visit (thereby privileging and speeding up the experience of first time site visitors), at the cost of a slightly slower experience as you continue to browse the site. Probably any tests should specifically separate out the first hit on the site (so the speed of that can be compared in addition to the average overall speed)...
Comment #233
catchI agree with #232. Also, the duplicate bundles are only going to be on different page types - if you land on an article page then hit a 'related articles' link, which is a very common pattern for people coming from search engines, you may not have any duplicate requests. Say there are ten duplicate bundles: if you browse a lot of pages, you'll end up with them all cached anyway.
Comment #234
mfer CreditAttribution: mfer commentedThe patch in #230 looks RTBC. It's well documented, passes the tests, and everything looks in order.
In reference to what Owen wrote in #231, I'm not sure how we test this in depth. It will depend on usage patterns by developers. There is a potential for developers to add a lot of module code to specific pages where it is preprocessed. So, as someone browses a site they end up downloading many different bundles and a lot of duplicate code.
The solution for the patch in #230 really helps people who visit a site once or primarily visit one type of page. Compared to the preprocessing from D6, this approach tries to minimize the impact of duplicate code to a greater extent. As I noted in #224, when I went to a second page of a different type, with preprocessing like what this patch provides, it was slower than what we have now. This was due to downloading duplicate code.
It comes down to (at least) one big question: do we privilege the person who visits multiple pages/page types, or the person who visits one or two pages of the same type? Each case would need to be optimized differently. The setup we have now is optimized for people who interact with the site across multiple pages/types. What we had in D6 was optimized for someone visiting one page/type. It privileged people who did not spend a lot of time on the site. This was OK until all the additional JS used in D7 turned the privilege given to a first-time visitor into a hefty penalty for the person who visited a lot of pages/types.
What we are looking for in D7 core is something in the middle, I think. Core should not privilege either case and not be a huge penalty to either case. The patch in #230 does that. It is a middle of the road approach.
I think there is great value in use case specific approaches. What about the blog where users only go to 2 or 3 page types? Or sites where users visit lots of page types? Or the cases that we have not imagined? While core should be a good middle of the road we need a way to handle the others well. I started the Preprocessing project. It will be a place to provide other solutions that suit specific use cases.
Comment #235
effulgentsia CreditAttribution: effulgentsia commented@Owen: It sounds from #231 that you question whether #230 is an improvement over HEAD. I'm wondering if you're more concerned about the fact that #230 reverts to defaulting 'preprocess' to TRUE, or whether you agree with that (given the separation of the every_page aggregates from the non-every-page aggregates), but think there are some specific core files that we should explicitly set 'preprocess' to FALSE for. If only the latter, I suggested in #226 that we deal with those in non-critical follow-ups, but if you think there's really obvious ones we should include for you to be happy with #230, then please suggest what those are.
As a reminder, the main motivation for #230 comes from some of the excellent points made in #118 and discussed in later comments, and it was the end of that comment from David that sparked ideas about grouping aggregates intelligently so that most files can be preprocessed, but without reverting to the massive inefficiency of the D6 way.
Leaving as RTBC, but Owen, please add a comment indicating how you feel about the RTBC status, given the last few comments. Thanks!
Comment #236
Dries CreditAttribution: Dries commentedPhew. Just spent quite a bit of time re-immersing myself in this patch. Decided to commit it. Thanks all!
Comment #237
webchickI'm concerned that this patch may have broken HEAD. Testbot's been stalled all day: http://qa.drupal.org/head-status
The failure it shows is "The overriding CSS file is output. Other common.test 748 CascadingStylesheetsTestCase->testRenderOverride() from Cascading stylesheets (CascadingStylesheetsTestCase) [System]"
Does anyone have the ability to check into this with git bisect or whatever?
Comment #238
catchsystem.base.css was not committed.
Comment #239
catchComment #240
webchickWell that'd do it. Thanks!
Let's try that again.
Comment #241
webchickComment #242
Dries CreditAttribution: Dries commentedSorry!
Comment #243
rfaySomebody needs to write a blog post or add an essay to the 6/7 module update page to explain how this actually came out. Any takers? Or post an essay here that can later be transplanted elsewhere.
I'm adding this to the API change notification, but a few minutes reading the bottom 75 or so comments didn't help me understand what really happened or what best practice is now.
Comment #244
torgosPizzaAny chance this can be backported to D6? We really need this in Pressflow 6.x.
Pressflow bug is here: https://bugs.launchpad.net/pressflow/+bug/627149
Comment #245
andypostreasonable follow-up #961518: drupal_add_css() preprocesses inline CSS even when the site-wide aggregation setting is off
Comment #246
jason.fisher CreditAttribution: jason.fisher commentedHas anyone considered (or prototyped) optimizing CSS further by reconciling redundant rules?
I typically prefer to keep a theme's default CSS in action and customize (sometimes wildly) with one or more override files. This could cause the original rule to be in the optimized CSS but immediately overridden.
i.e., if you have:
body { margin: 5px; }
and I add in an override file:
body { margin: 10px; }
... then why not remove the first rule completely from the optimized CSS?
Comment #247
rfay@jason, For D8, please see #494498: Add CSS parsing capability to Drupal
Comment #249
momper CreditAttribution: momper commentedsub
Comment #250
rfayThe outcome of this issue needs to be explained to the world. I can write it up, if somebody can sit with me on Skype for a few minutes and make *me* understand it.
Comment #251
rfayComment #252
chx CreditAttribution: chx commentedComment #253
rfayFrom @sun:
From @ksenzee:
Comment #254
jhodgdonSounds like this needs to be added to the 6/7 update page? Updating tag with that assumption. If not, please remove the needs update doc tag.
Comment #255
mattyoung CreditAttribution: mattyoung commented~
Comment #256
rfayCross-posted with mattyoung on this. Putting the tags back.
Comment #257
jhodgdonI think the tag change was a cross post. If not, please explain what needs to be documented mattyoung, or else remove all the "needs doc" and "needs update doc" tags.
Comment #258
mattyoung CreditAttribution: mattyoung commentedI didn't touch the tags. How did they get changed? I just wanted to subscribe.
Comment #259
jhodgdonWe edited at the same moment. I changed the tags. Your change came in after mine, and had the original tags as part of the edit submission, so your submit changed the tags back.
Comment #260
rfayPlease review this update in the 6/7 module upgrade guide.
I also added about loading in .info files in the D7 Managing JS page.
Comment #262
grendzy CreditAttribution: grendzy commentedfollow-up issue: #1034208: drupal_sort_css_js() ignores media and browser keys