When I use IE v6 to do an upgrade the 'select versions' collapsible list is showing up in the left sidebar instead of the content area. See attached screenshot. This doesn't happen for me on Firefox 2.0.0.x

Note this issue is being spun out of issue http://drupal.org/node/196630 - see #12 & #14 for more context.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

JirkaRybka’s picture

For reference: The above mentioned #14 (additional to the initial post here) came from me. On my test upgrades till now, I often (but not always) saw update.php falling to non-js mode, and task list themed as just black bold text with no done tasks highlighting. I'm unsure about my other problems (not mentioned in that other issue) - sometimes I'm under impression that I was logged in as uid=1 in the same browser session, but still I get access denied (note that I need to check this again, this might be my fault), and at the very end of the update batch (progress bar at 100%), I see a single letter 'a' instead of the message 'updating foo module' (this one seems to be mysteriously solved by my patch at http://drupal.org/node/199336 - but again, it only *seems* to be fixed, no analysis done yet).

I'm going to look into this as time permits, hopefully providing more solid info then (i.e. steps to reproduce and/or a patch), but can't promise given my time situation.

My config: Firefox 2.0.0.6 (JS enabled), php 5.1.6, MySQL 4.1.19, Apache 1.3.37, Mandriva Linux 2007.0 (Drupal upgrade from 5.x)

JirkaRybka’s picture

FileSize
13.55 KB

So, after a bit of testing I think I know what happens in my case (I seem to have remembered some bits inaccurately - my apologies - and the 'uid1' and 'a' problems not reproduced now, however hard I try, so let's say these are no more valid).

Steps to reproduce:
- Do a fresh 5.5 install, going through the installer, and setting up the first user account. Don't close browser in order to keep the session to access update later.
- Replace code to 6.x-dev
- Run update.php

The tasks list is not themed. Collapsible fieldset ok (Firefox), no other visible problems. After starting the updates, the JS version fire, but only just blank page with the title 'Updating' comes. It's stuck there, nothing happens. Dead. (Screenshot attached.)

I hit 'refresh' on my browser, and it helps - update then seems to work.

The reason: Browser cache!

I simply have the 5.x versions of .css and .js files cached in browser, still used because 6.x files have the same filenames/paths and so the browser doesn't refresh them. Obviously, 5.x css doesn't theme the new tasks list, and 5.x JS contains no Batch API things. After hitting 'refresh', new files are loaded, and problem solved. I also tried with flushing the browser cache just before update.php run, and it works fine then. (This might also explain, why vjordan on the other issue reproduced the problem only first time - not on re-tries after flushing all caches/cookies etc. He probably had 5.x things in browser cache too on his first run (?))

So this is in fact trivial (unless there's some other problem involved), but non-obvious to users (to me, at least, it was). The options I see:

- Probably not acceptable: Change 6.x filenames to avoid re-use of 5.x files (just kidding ;)
- Is there a way to force browser-cache flush? I guess not... (Although the problem fixes itself after a few auto-reloads in the non-js update progress - not sure, how these are done)
- Document this (on update.php initial screen "hit 'refresh' on your browser NOW!", plus UPGRADE.txt, and perhaps more places?). Theming issues are not that bad, but update process seemingly dying at the very start is really not good.

yched’s picture

One of those 'aaah !' moments. Of course, stale .js and .css files from 5.x. Doh.
Not sure about the fix either, but loads of thanks for nailing this, I didn't think about that one :-)

What's still not clear to me is why the update batch would start in js mode, then ? It only does so if it finds the 'has_js' cookie, which is only set by D6 drupal.js... Well, at least we have more food for thought now

JirkaRybka’s picture

Thanks for additional info! :-)

Based on #3, I easily figured this out: I simply just browsed some 6.x installs before my test (maybe even on the same testing URL), so the cookie was there. Now I re-tested, this time removing all cached files *and* cookies before setting up 5.x - and indeed update.php started in no-JS mode this time (so not dying, and 'fixing' the tasks theming after some 2-3 reloads). This also explains, why I had the non-JS mode so often previously, without being able to see a pattern in that.

yched’s picture

OK, makes sense :-)

So this leaves us with nothing critical, right ?
- The ill-styled fieldset reported by vjordan in IE6 is pretty ugly, but should we bother with IE6 if it looks OK in firefox ?
- We can probably live with the unstyled task list
- Getting a slower non-JS progress page when you could have the faster JS one is sad (esp. since the D5->D6 update can be quite long), but I guess we can live with that too...
Of course, finding a way to force a browser refresh for those files would be cool... But a message advising the user to do a fresh reload might be enough. Not sure how exactly to word that.

I still have to figure why messages issued during the upgrade are lost in non-JS mode, but that's for the original thread (http://drupal.org/node/196630)

JirkaRybka’s picture

Version: 6.0-beta4 » 6.x-dev

Yes, nothing critical. I'm also not sure about wording, but I support the idea of just adding a message about suggested re-load (most of all to avoid further bug reports). Not beta4-specific anymore BTW.

Nick Lewis’s picture

You can force a browser (most browsers, even ie6 -- unless ie6 is feeling grumpy on that day....) to clear its cache of a page like a so:

header("Cache-Control: no-cache, must-revalidate");

Its worked for me in the past.

JirkaRybka’s picture

Looks nice, but does that apply to embedded js/css files too?

Nick Lewis’s picture

In theory :-)

JirkaRybka’s picture

I'm even unsure whether we have that already? drupal_page_header() seems to do that. My concern was - if we sent that header with the css file (i.e. from the Apache under 5.x - not really possible for us), then it should be ok. But from a new HTML page? Just thoughts, I haven't a chace to test now.

Nick Lewis’s picture

Yeah -- I'm not sure about this case. They only times I've used these header settings was when I was dealing with IE6 caching issues for an ajax comments module. IE6 was being to aggressive about caching the json responses to the post, and no-cache ended up being the solution.

Gábor Hojtsy’s picture

One usual thing to do to force reload of CSS and JS files is to add some random query string to them. Like style.css?core=6.x, so that the browser loads the file, because you requested a different URL. Whether the update code / theme can add such a query string to all CSS and JS files is to be looked into (I am not volunteering:).

JirkaRybka’s picture

Wow! That's great idea, thanks for tip. Going to look into this as time permits.

JirkaRybka’s picture

Status: Active » Needs review

Now that I'm thinking about it, I see that the problem is broader than just update.php: If we managed to add some dummy query-string for all files needed in update.php only, then stale files will bite us later on the upgraded page (i.e. breaking the user's first post-upgrade experience), and also any css/js files not needed for update.php will bite us whatever we do on update.

I also see in common.inc, that Drupal already uses this sort of "cache-control" for JavaScript files in the 'no cache' mode (which probably proves this approach as usable, acceptable, and already-considered).

So I propose to extend the query-string-based cache-control from just *some* JS files to *all* css/js file links generated in core. This ensure reload of *all* files on/after every core upgrade, regardless update.php touching them or not, and so should solve the problem completely.

The attached patch does this. The files are now generated along the lines of <link type="text/css" rel="stylesheet" media="all" href="/xyz/modules/system/system.css?6.0-dev" />. The patch fixed the problem for me completely: No more unthemed task lists, update.php running smoothly in the JS mode, no side effects.

JirkaRybka’s picture

Status: Needs review » Active
FileSize
3.81 KB

File uploads on Drupal.org are currently not working :-( (Filled an issue on that: http://drupal.org/node/201088)

Patch waits on my local machine...

JirkaRybka’s picture

Status: Active » Needs review

OK, now attached successfully, so CNR. See #15/#14 for patch/description.

JirkaRybka’s picture

Title: Theme error on update when javascript not firing » Stale css+js files on upgrade

We'll do with a better title too...

Gábor Hojtsy’s picture

Status: Needs review » Needs work

Doing this on all pages is too much IMHO. It also exposes the exact Drupal version used on the site, which is not exactly ideal.

yched’s picture

'It also exposes the exact Drupal version used on the site, which is not exactly ideal.'

What about a md5 hash of site_name.VERSION ?

JirkaRybka’s picture

That would make the paths quite long (and ugly, which is probably not a problem here). Another way might be a count of update.php runs kept in variable_set('something'), installed schema version, or just some random character(s) generated and variable_set() stored at the start of update.php (possibly just time() ).

But if this query-strings-everywhere approach is not acceptable, then the only other option is just documenting the need of manual page reload, on update.php screens IMO. Adding query strings for update.php only is not nice: Doing yet another update exception in core is not only ugly, it only just postpone the problem to the post-upgrade browsing, and so makes it even worse: update.php is run only once, and with clean results page, minor issues are easily forgotten. But post-upgrade browsing is most probably testing in the style "let's see how good this new version is, and what's broken", so having stale files backfiring there is bad.

catch’s picture

Note this will also affect site visitors post-upgrade when browsing, not just admins.

I reckon a hashed url is probably fine, and better than appending version - given most production sites will use css and js aggregation anyway.

Gábor Hojtsy’s picture

You can do if (defined('MAINTENANCE_MODE') && (MAINTENANCE_MODE == 'update') { add extra query params to JS and CSS files }

JirkaRybka’s picture

Gábor - please read at #14 and #20: That will only postpone the problem to post-upgrade stages, looking even worse there. Or am I missing something?

Gábor Hojtsy’s picture

Oh, yes, understandably. Eeek. It would be so great to come up with an idea to refresh these cached CSS/JS files in the browser, even if the browser is not asking for them. Having a version specific query string on CSS files does not sound right, even if some web firms do exactly this when refreshing designs.

catch’s picture

I reckon yched's idea in #19 is good - if it used hash of 6.0example.com, 6.1example.com etc. could it potentially solve this issue if it arose in minor version upgrades as well?

Looks like the aggregated css file is used on update (obviously not js because that's not in D5, although it'll be an issue in D7) - so that should probably be cleared as well if possible (and has the same issues when browsing after upgrade as well).

JirkaRybka’s picture

Catch is right at #21 - this might be even worse for site *visitors* getting broken js/theming after upgrade. Also the aggregation will make the hashed querystrings go away, indeed, so I think it's acceptable to have these (unless we find a better solution).

But when we are at it, we probably need to check whether update.php pages may get stale *aggregated* files. Brief look into drupal_get_css() suggests, that we should add a MAINTENANCE_MODE check there, to disable aggregation. Of course it also needs a rebuild at the end of update, but that bit is already in there (although might be good to use the new drupal_flush_all_caches() function now, instead of duplicating code in update.php).

And MAINTENANCE_MODE vs. global $update_mode are duplicates BTW, introduced separately by two neighboring commits; we might as well clean this up too.

I'll roll a new patch, but only if time permits - no promises from me.

Edit: Cross-posted, but points are still valid.

catch’s picture

funny cross post.

Everything JirkaRybka said in #26 makes sense to me. Disabling aggregation on update.php fixes stale caches temporarily - then drupal_flush_all_caches() sorts things out before we leave maintenance mode - and covers us in the future as well for things like .js updates - could be much more of these in D6 contrib.

JirkaRybka’s picture

Status: Needs work » Needs review
FileSize
9.75 KB

Attaching a new patch, let's see what the feedback will be...

I'm still adding a query string on all files (not only update.php) - sorry to all who dislike that, but it's necessary to really gain control over browser-caching in a sensible way, which is the basic underlying issue here. The more I think about this, the more I see that site maintainer running update.php is not the biggest problem here: The biggest problem IMO is every single site *visitor* having stale files after every single change done on the site. If the maintainer tweaked a custom theme, for example, he surely knows about the need to hit reload - but site visitors will get new HTML on top of stale CSS and so broken display. If maintainer runs update.php while switching to D6, he is likely to manage (non-js mode even results in a reload already, somehow), but site visitor will then - dunno - get broken js on forms and such......

But however, we need to make the query string small and reveal nothing in it. I decided NOT to use VERSION-based hashes as query-string, for two reasons:
- Hashes are long and ugly (although it's hidden in the HTML source only)
- Such a hash will only change on core version upgrade, leaving contrib updates, or simply just site-maintainer's custom hacks, without a fix still. We need to force reload on EVERY update, even if core version didn't change.

* My patch adds a single-digit counter as query string, which is upped on every update.php run (but kept single-digit always).
<link type="text/css" rel="stylesheet" media="all" href="/x55/themes/garland/style.css?3" />
As a bonus, it's also upped in the central cache-flushing function drupal_flush_all_caches(), making the flushing button on admin/settings/performance page reach there too (so site maintainer may enforce reload, after editing his files, together with page- and other caches flush, in a single blow). The default on variable_get() makes the query-string become zero, if database is not available (the DB-down pages).

* The css/js aggregation is now disabled on update.php run, really no need to risk firing of these unnecessary features. Otherwise, it now uses hashes salted with our new query-string: Since the (aggregated) filename is built as a hash of filenames included in the file, without the query-string presence we'll end with the same filename after rebuild, and so no reload again. Now the filename changes, avoiding explicit query string need for aggregated files. (Note that creation of duplicate files is not a problem, because the counter is upped only on places, where the old files are also flushed.)

* The patch also changes the final flushing of caches at the end of update.php to the new central drupal_flush_all_caches() function, so that it will be easier to keep in sync in future. But still, we need to change the query-string at the *start* of update.php too, to solve the initial problem of this issue (drupal_flush_all_caches() would be dangerous there, as it contains various data rebuilds, and the database is not up-to-date on that point). So we actually change the query-string twice in update.php, but it doesn't hurt at all.

* While playing with the update exceptions here and there in our code, I also cleaned up the duplicity between $update_mode and MAINTENANCE_MODE flags, standardizing on MAINTENANCE_MODE (rather than choosing which one to use here). These two were introduced by http://drupal.org/node/179143 ($update_mode), and http://drupal.org/node/141727 (commit at #68 - MAINTENANCE_MODE - two weeks later), and probably no-one noticed the duplicity. (Disclaimer: My one went in first ;)) So this bit is in fact a follow-up to both of these issues.

JirkaRybka’s picture

FileSize
9.73 KB

Re-rolled - a lot of offsets. Still needs review!

Dries’s picture

I'm not really convinced with this patch -- I find it quite a hack. Why is using HTTP headers not sufficient?

moshe weitzman’s picture

subscribe ... i agree that unique querystrings on css and js includes are needed but i can't articulate a reply right now.

catch’s picture

Dries,

Although it'd be possible to force no-cache in maintenance mode I guess, for site visitors after upgrade, surely you'd have to temporarily force no-cache for every individual visitor then switch it off again for every visitor once they've got the new versions of the files? Even if it's possible, it sounds overly complex and prone to a lot of bugs.

JirkaRybka’s picture

Re: #30 Dries - We're already sending no-cache HTTP headers, but it only applies to the generated HTML (or at the very least, my FF 2.0.0.6 behave this way). For js/css files, the relevant HTTP headers are sent by Apache, while handling these files directly. We can't change these, unless serving the files via php, which is unacceptable for performance (I'm unsure if .htaccess or other Apache settings help here, but that's out of question due to limited permissions available on most shared hosts, I believe).
Correct me if I'm wrong.

Gábor Hojtsy’s picture

So it seems like we would need to add some query string to the runtime and maintenance CSS paths as well. What about trying to make this as small as possible. Like on every update.php run, we iterate through letters and store all letters used before (or the letters still available). After all the letters are used, we can start over. So we have the least overhead on the URLs used, ie. we would have style.css?f or something like that :)

- randomize an alphabet (so that we are not going through a-z on all sites, so how many times was the update run will not be visible from the URL); you get: "fiohzabcxw.."
- keep an index on the alphabet string of what letter was last used
- increment this index with each update.php run
- when the index goes over the end of the string, start over from the first char

This ensures a long enough window of unique strings on the query string (so cache problems will not arise), but a short enough query string to not mess up with the HTML page that much.

Since the analysis above shows that the CSS files are served from Apache and that as long as we use the same file names before and after updates, we need to have them refreshed, so we need to tweak with the HTTP request with query string mangling.

catch’s picture

Status: Needs review » Needs work

Single letter from an alphabetised alphabet sounds sensible to me - presumably that could also be added to the hash for aggregated files. Looks like a little more work needed then.

moshe weitzman’s picture

aggregated css and js files automatically recognize when their sizes change. so they get regenerated instead of stale. i don't think they need to be considered here.

catch’s picture

Moshe, according to JirkaRybka in #28 there's still a risk:

Otherwise, it now uses hashes salted with our new query-string: Since the (aggregated) filename is built as a hash of filenames included in the file, without the query-string presence we'll end with the same filename after rebuild, and so no reload again.

The salted hash is nicer than an appended query string though.

JirkaRybka’s picture

Right now, I can't work on this (due to my personal time limits).

My patch at #29 only just needs to adjust two instances of the query-string-changing code:

  // Change query-strings on css/js files to enforce reload for all users.
  $css_js_counter = variable_get('css_js_flush_counter', 0);
  variable_set('css_js_flush_counter', (++$css_js_counter > 9) ? 0 : $css_js_counter);

The above-quoted code from #29 is cycling through 0-9. If that's not desirable, then I'll suggest just random selection from a-z 0-9 and check to ensure a change from previous value. Risk of repeats is reasonably low for this purpose IMO.

It would be nice if someone go forward and roll a new patch, otherwise it may take quite a while before I come back to this. Sorry about that.

JirkaRybka’s picture

Status: Needs work » Needs review
FileSize
10.09 KB

Today, I have a tiny bit of time...

Attaching a new patch. The only difference is, that random query-strings are used now. Single character is picked from a-z A-Z 0-9 under condition of being different from the previous one. So now you can't tell anything from these URLs, apart from the plain fact that it changed. I think that it's sufficient: random selection from 62 possible characters makes the risk of repeats reasonably low. Storing some history/pattern strings somewhere (in a variable) would be overkill here IMO, no need to add yet another variable just for the very edge case of random numbers repeating and user not hitting reload.

As for the hash-filenames on aggregated files: The hash is built from serialized array of css/js info, i.e. things like filepath, module/theme type, RTL and the like. Not actual contents of the files. Without salting, we'll get the same hash if the same files (filenames/paths) are added.

gpk’s picture

I think somethink like Gábor suggested at #34 still needed, i.e. randomize the $characters and then cycle through them. Otherwise it's possible to get the same $new_string as you had the time before last.

i.e. something like this as the additions to drupal_flush_all_caches() [sorry, my PHP may be a bit 2nd rate]

  // Change query-strings on css/js files to enforce reload for all users.
  $old_string = variable_get('css_js_query_string', ' ');
  $characters = variable_get('css_js_query_characters', 0);
  if (!$characters) {
    $characters = str_split('abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789');
    shuffle($characters);
    $characters = implode($characters, '');
    variable_set('css_js_query_characters', $characters);
  }
  $pos = strpos($characters, $old_string);
  $pos = ($pos === FALSE) ? -1 : $pos; // Make sure we start with the first character first time
  $new_string = ($pos = strlen($characters - 1)) ? $characters[0] : $characters[1 + $pos];
  variable_set('css_js_query_string', $new_string);

HTH

catch’s picture

hmm, variable stillseems like overkill. If we want to avoid a > b > a. why not add two/three characters instead? Then you have a very tiny chance of getting the same string twice for a much longer period of time.

JirkaRybka’s picture

FileSize
11 KB

Ok, if you all insist ;)

This version keeps 20 characters history, so the random selection only picks characters not used at least 20 flushes back. The history is stored in the same variable as the actual query-string, so we don't need a new one: The string simply shifts to the right, 21st character disappearing, and only first (newest) one used as the query-string.

I abstracted the string modification into a separate helper function, so that it may be used from drupal_flush_all_caches() and update.php, without code duplication.

I also added it to install.php, so that newly installed sites start with a random character. Otherwise the default zero would be a visible indicator of a new site, or a site running with no upgrades yet, meaning a possibly novice/careless site maintainer and so easy target for an attack.

gpk’s picture

Very neat :-D

catch’s picture

Status: Needs review » Reviewed & tested by the community

OK fresh install. First page load I get:

"/modules/node/node.css?T"
After cache_clear_all with no aggregation:
"/modules/node/node.css?P"
css aggregation on:
3bf3a2ec0d47c921ed4a60a800b30903.css
after cache_clear_all:
88b7141151b02488dea23fb3f2832c0c.css

javascript without aggregation (on the user permissions page where I know there's some js):
/misc/jquery.js?8
then with:
d6131d77335cde8f8406655bd20ad430.js
after cache_clear_all:
07bbf5296c1dd68d5fef642e19fb10b3.js

Going to update.php also changes the string (note I didn't actually test 5.x-6.x upgrade, just went to the url and viewed source, but this is generic so ought to be fine).

everything working as normal obviously.

Abstracting it out sounds sensible, as does not letting people know that someone never upgraded their site. Code all looks sane and well commented, so I'm marking RTBC.

Gábor Hojtsy’s picture

All changes look good, consolidating $update_mode into MAINTENANCE_MODE is all good, and it even removes some code for us, so all looks great.

The only thing which left me wonder is how often full cache flushes happen? Since redoing the aggregated CSS and JS files on each cache flush looks like a possibility to fill up the files folder with CSS and JS files if full cache flushes happen often. If this is the case, we should rather factor in file size and modification date on the source CSS/JS files into the md5(), instead of the query string char.

JirkaRybka’s picture

That's no problem. drupal_flush_all_caches() is only called by clicking the flushing button on administrative interface, plus now introduced at the end of update.php run. grep confirms that, too.

Filling up the files folder is not a problem, too, because drupal_flush_all_caches() also calls drupal_clear_css_cache() and drupal_clear_js_cache(). These explicitly delete all existing aggregated css/js files. I tested that previously to be sure.

So to sum it up, we do this flush only on upgrade, or on explicit manual request from site-maintainer, and we always start over with an empty folder for the (new) aggregated files.

Gábor Hojtsy’s picture

Status: Reviewed & tested by the community » Fixed

Thanks for the explanation. I committed the latest patch.

I need to note here that this is the first release, which relies on the same core theme as the previous release (with the same file names), but greatly enhanced look and functionality in the little details, so this is why we face this problem now, and this is why we needed to fix this problem now. I think we managed to find the least disruptive solution.

catch’s picture

http://drupal.org/node/187023 was duplicate.

In addition to Gabor's comment on theme changes, this is also the first release with an update to jQuery.

Anonymous’s picture

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for two weeks with no activity.

xgretsch’s picture

I have the same problem on an Ajax-heavy module (see www.bachtrack.com) on a newly created site.

The problem with HTTP headers is this: I can't think of a combination of headers which will do what I want, which is to reload the JS and CSS files *when they are changed* but not otherwise. If I use no-cache, the large JS files will be reloaded every time. If I use an expiry date, the users will get a suddenly long page refresh at some arbitrary date unrelated to any changes that may or may not have been made.

Actually, it seems to me that a query string with the CVS revision number of the relevant JS or CSS file would do the job just nicely, albeit at the cost of some manual work every time I do a module release. I'm about to try that with my own module code.

JirkaRybka’s picture

Now that #42 is committed in core, for 6.x you can just call _drupal_flush_css_js() each time your CSS/js changed.

anders.fajerson’s picture

Status: Closed (fixed) » Active

Sorry for coming in so late on this, just did a View source and saw this new feature. While a think this solves a real problem, the query string solution has one big drawback: in some browsers the files will never be cached.

Quoting Cal Henderson (Flickr fame) at http://www.thinkvitamin.com/features/webapps/serving-javascript-fast:

At this point, you might ask why we don’t just add a query string to the end of the resource - /css/main.css?v=4. According the letter of the HTTP caching specification, user agents should never cache URLs with query strings. While Internet Explorer and Firefox ignore this, Opera and Safari don’t - to make sure all user agents can cache your resources, we need to keep query strings out of their URLs.

Personally I will use the aggregate functionality on live sites which don't have this problem (it uses an unique filename), but just wanted to bring this up since I didn't saw it mentioned above.

Update: Reading through the comments it seems like recent versions of Safari and Opera DO cache these files.

anders.fajerson’s picture

Status: Active » Fixed

So I'll mark this as fixed again, given that only older versions of Opera and Safari should have a problem caching files with a query string at the end. Sorry for the fuss.

Anonymous’s picture

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for two weeks with no activity.