Problem/Motivation

As a general practice, we try to provide "tools, not policy" to the users of the Aegir project. The current nginx configuration goes against this principle by configuring a *lot* of policies in the configuration written by the software.

Because of the way the templates are done right now, the only way to override those policies is to basically *fork* aegir (which is what omega8cc is doing). I believe this is inappropriate and we should not override the policies sysadmins may want to apply themselves. Those policies should instead be provided by a sample configuration file, a contrib module or at the very least be optional.

Proposed resolution

This specific issue is about two of those policies, but other issues have been filed for separate problems.

One of those policies is the fastcgi caching. I have made a patch to remove it in 471212b. The other policies are mostly things like timeouts, buffer sizes, etc. This is removed in 8d14c47.

Other issues have been filed seperately:

It is suspected that this cleanup could also fix other issues, like #1608910: Installing Aegir with nginx leads non-drupal sites on server to have other config ignored. or #1757696: nginx fails to restart with "map_hash_bucket_size" directive is duplicate in /etc/nginx/conf.d/aegir.conf:42.

Cleanup of the nginx code happens in the dev-nginx-cleanup branch of provision.

Remaining tasks

The above issues and patches need to be reviewed. A third-party module must be written to support configurations where the remove settings are expected.

User interface changes

No user interface change.

API changes

FastCGI caching would get turned off by default and certain tuned parameters would be moved to a third party module.

Original report by anarcat

(The original report was integrated with the above summary, and was about the two commits mentionned in the Proposed resolution section above. It has been expanded to cover related features that were difficult to find, but is still mostly about those two commits and discussions should keep to them.)

Comments

anarcat’s picture

Status: Active » Needs review
omega8cc’s picture

Status: Needs review » Needs work

The problem is that those settings are not intended to enforce anything for any special/custom purpose.

They are to provide production-ready Nginx configuration, with settings tuned for setup as generic as possible, because vanilla Nginx provided by Debian or Ubuntu is less than usable.

While I agree that it is generally good idea to avoid "controlling" anything as much as possible, there is a dark side of that "tools, not policy" approach, because many (most) of those settings and associated configuration must be included in config templates, as otherwise to get them to work, *every* user of Nginx based Aegir install would be *forced* to fork Provision, which is probably not what we want.

My opinion is that removing the stuff you *think* don't belong in the templates included in Aegir without really understanding how this stuff works and what can be and what can't be made optional etc, is not the way we should handle it. Hence my initial strong *no* to most of those changes.

You can't just move them to some contrib module/config/example/whatever or to some "docs", because these "docs" would have to instruct the user to fork Provision to have it working without issues.

I can promise only one thing: I can try to move away (to docs or optional includes) as much as possible of currently included settings, but some of them can't be made optional, because they would require not just including some extra (and optional) config, but they would require *replacing* some key parts of the config.

This may easily end up with some not really working "generic" config, breaking the stuff on upgrade for all people using current config, so we must proceed with care, provide working upgrade path etc (which doesn't work currently).

As promised, I will publish my work in a separate dev branch.

anarcat’s picture

My opinion is that removing the stuff you *think* don't belong in the templates included in Aegir without really understanding how this stuff works and what can be and what can't be made optional etc, is not the way we should handle it. Hence my initial strong *no* to most of those changes.

I find this rather rude: while I am not very familiar with nginx specifically, I have been managing apache webservers for a long time and they have similar concepts. The SSL cache, gzip compression, socket timeouts and so also exist in other webservers and are not defined in Aegir's apache configuration.

...because many (most) of those settings and associated configuration must be included in config templates, as otherwise to get them to work, *every* user of Nginx based Aegir install would be *forced* to fork Provision, which is probably not what we want.

You can't just move them to some contrib module/config/example/whatever or to some "docs", because these "docs" would have to instruct the user to fork Provision to have it working without issues.

I don't understand this: as I see things right now, the *only* way to *change* the currently hardcoded settings is exactly to fork provision, while if the settings would be *absent* from the generated files, third party config modules could add those settings as necessary. It seems to me the situation is pretty much the opposite of what you are saying. If we move the policy settings out of the stock configs, they can be modified by third party modules, if they stay in the stock config, they cannot be modified without forking.

At the very least, those settings should be modifiable within aegir, but I do not think that level of customization belongs to the frontend at all.

You have yet to explain what exactly breaks in the dev-nginx-cleanup branch. You quickly mention the upgrade path, I would be curious to hear how it is broken and I would be happy to work on fixing that.

I am also curious to see your alternative proposals.

omega8cc’s picture

It is not my job to prove that you are wrong (and where).

It is your job to prove that you are right, because it is you who started "fixing" this config and introducing changes.

That is where we are and how it should work.

I'm trying to be constructive here, even if you are not making it easy.

omega8cc’s picture

As an example, I would be interested in hearing some reasoning behind removing microcaching introduced here: #1408410: Add microcaching to Nginx config by default

anarcat’s picture

It is not my job to prove that you are wrong (and where).

It is your job to prove that you are right, because it is you who started "fixing" this config and introducing changes.

I believe this is a fallacy. Most specifically, you are shifting the burden of proof on me. Nevertheless, I believe I have made clear points previously and will continue to do so here. In other words: yes, it is your job to prove that I am wrong if I have made valid arguments (and I think I have) that prove that I am correct.

I have explained that hardcoding settings is a problem because they cannot be overriden by the server admin at the global without forking provision.

To take your example, the reason I want to remove microcaching by default is that I believe it is a site-specific policy. Someone may want to disable or change the timeout on the cache and right now there is no clear way to do this. You may want to do this because you have a frontend cache server other than nginx (varnish for example).

I understand that this can be done in global.inc, from the example you have provided on irc, where adding the following to global.inc would disable caching:

header('X-Accel-Expires: 0');

The problem with this is similar than what we had with the default Drupal caching in settings.php: we were imposing a policy (caching) to admins by default, and they had to go through extra steps to disable that (adding stuff to global.inc). I believe this should be disabled by default and easy to enable.

This can be done in the global nginx.conf (which is not under Aegir's control) or through include files, or through custom modules. Either way, it should be optional.

I understand you have your own perspective on this, and will welcome your contributions on that level. Maybe only then will we be able to discuss this, as you said, constructively.

Anonymous’s picture

As per IRC discussion, my thoughts / summing up are:

1. The current nginx defaults provide a very good out-of-the-box experience for probably 99% of end users

2. But they rely on a lot of baked in settings that can't be easily overridden for users who want to customise that or manage settings outside of aegir (eg /etc/nginx/conf.d/, maybe puppet-managed)

3. We can't just rip it all out or we will break the upgrade path for many users

4. We should work on a compromise that retains the current config as the *default*, but allow the user to switch that off if they want more control

I think we could achieve 4 by exposing a checkbox when a user adds a server node of type nginx / nginx_ssl, saying 'Use recommended nginx config?' and it should be checked by default.

The user can then switch it off in the server node. The server node will then be verified, and in the backend we can check for this checked/unchecked option and conditionally use a different template to generate the server config based on that.

If the user opts to turn off the defaults, they get a very bare config, and they are on their own to set that up elsewhere (but they are equally still able to *copy* the default config to /etc/nginx/conf.d/ or similar, and make their modifications)

This is just my thoughts for now - there may be a better way of doing this.

anarcat’s picture

I do not believe that ripping out those settings will break the upgrade path for people. 2.x is exactly when we can have API changes and similar modifications to the software, so as long as we document those changes in the upgrade path, people will be able to follow through.

I believe we have an agreement that we should provide tools, not policy. The configuration deployed by Aegir for nginx hardcodes specific policies, like caching, that cannot be easily overriden by the admin. In similar situations outside of Nginx, we have removed those settings (the canonical example is caching in the settings.php). I think this is a similar situation.

I have proposed two very specific changes:

One of those policies is the fastcgi caching. I have made a patch to remove it in 471212b.
The other policies are mostly things like timeouts, buffer sizes, etc. This is removed in 8d14c47.

Let's talk specifically about those changes.

I believe the latter policies can be re-enabled with a third party module, because there is a $extra_config variable in there that other modules can hook in. Writing such a module is trivial and I would be happy to do it.

The fastcgi caching is harder, because it is all over the place. The server bit can be fixed the same way as the above, but there is no hook to modify the parameters sent to the "index.php" file. It seems to me this is a problem in itself - admins may want to extend that in other ways. I suspect that those fastcgi parameters can be set higher up however, at the server level, and the above overrides could then work.

If there are no objections, I will start working on such an extension.

I strongly believe we should not enable fastcgi caching by default. It will make testing and development of this project harder and while I really appreciate the hard work that was done in attaining this great configuration, I think our default install should be simpler and on par with the Apache configuration, which doesn't have caching enabled.

I look forward to further feedback on this issue.

anarcat’s picture

Issue summary: View changes

rework issue summary

realityloop’s picture

As someone who has come up against nginx config issues with Barracuda (don't get me wrong I love the work thats you have done here omega8cc), I really like the approach that mig5 is suggesting in #7

omega8cc’s picture

I agree with mig5 suggestions. This needs to be done wisely but I didn't have a time to work on this yet. Early December, I think.

ergonlogic’s picture

I had a very informative discussion with omega8 this morning, that I've summarized somewhat here:

11:44 ergonlogic> as I understand the issues under discussion, you've been building a highly tuned default config
11:45 ergonlogic> and anarcat's position is that this conflicts with the 'tools not policy' design principle?
11:45   omega8cc> tuned, highly secure, so restrictive etc 
11:46 ergonlogic> these defaults can't be overridden though
11:46 ergonlogic> so, mig5's suggestion is to make them optional
11:47 ergonlogic> but keep them as defaults
11:47   omega8cc> they could be, easily, but they should stay default
11:48 ergonlogic> but they'd be defaults by implementing the hooks and so forth that anarcat is proposing?
11:48   omega8cc> but you can't move it to the contrib space, however, because you will blow up upgrades
11:48 ergonlogic> along the lines of injecting into apache vhosts?
11:48   omega8cc> anarcat wants the simple as default
11:49   omega8cc> and move the complicated to contrib
11:49 ergonlogic> could a compromise work?
11:49 ergonlogic> keep the complicated ones in core
11:49 ergonlogic> but the defaut is simple?
11:50 ergonlogic> I prefer better (more performant, secure) defaults myself
11:50 ergonlogic> but if they're just a click away...
11:51   omega8cc> you could set the simple as default for *new* installs, especially for apt-get based install
11:51 ergonlogic> and that can be easily maintained as part of the upgrade path, I'd think
11:51   omega8cc> but there is no reason to arbitraly breaks things for manual installs and upgrades
11:52   omega8cc> It should be really simple with just one variable to make the config switch easy, I just didn't have a time to work on this - and not because of vacations, because I had none, but because of my workload which is still very high.
11:52 ergonlogic> upgrades could be handled by a hook_update to enable the new flag by default, no?
11:52   omega8cc> no
11:53   omega8cc> it is on the provision domain
11:53   omega8cc> pretty low level
11:53   omega8cc> it may still fail horribly on upgrade, no matter which config will be set as default, because of this issue I had to develop a workaround for where variables stored in .drush/server_master.alias.drushrc.php are not used on upgrade
11:53 ergonlogic> so that flag would have to be added to the server context?
11:53   omega8cc> yes, but it will not work anyway
11:53   omega8cc> because server_master is not re-verified on upgrade and the config files are left as-is - there is an open issue about this: http://drupal.org/node/1552430
11:54 ergonlogic> hmm, ok
11:54 ergonlogic> but if we could overcome that?
11:54 ergonlogic> is that the only blocker you see?
11:56   omega8cc> yes, as long as you keep the old config default
12:00   omega8cc> just fix that  http://drupal.org/node/1552430
12:00 ergonlogic> yeah, I'm reviewing it now
12:00 ergonlogic> I think good defaults in core is the way to go, personally
12:01   omega8cc> really, it could took 2 days to code and test properly, I just didn't have that time yet
12:01 ergonlogic> if nothing else, they make on-boarding easier
12:01 ergonlogic> and provide good examples of how to implement such things
12:05   omega8cc> and there are built-in options to override anything you want in the Nginx config, but sure, if you are new to this stuff, you may prefer no-control (and no-security and no-performance approach)
12:07 ergonlogic> so, in http://drupal.org/node/1552430, we probably need a provision-save from the front-end, because there's a need to keep a setting there
12:07 ergonlogic> the default/minimal nginx config flag
12:08 ergonlogic> that is the default/minimal nginx config flag is essentially a front-end variable, that'll need to be passed to the backend server context
12:09 ergonlogic> or am I mistaken?
12:11   omega8cc> there are too many weird issues with frontend missing communication with backend to rely on something like that without extensive testing, there are more issues like that, with missing aliases etc, I would prefer to just use a backend variable, as adding interface to this is rather a recipe to really complicate things on the fly.
12:13   omega8cc> this should be an option passed during install and upgrade and stored in backend variable, it shouldn't depend on db and frontend in general
12:13   omega8cc> or you are asking for troubles
12:13 ergonlogic> so, how would we change it, then?
12:14 ergonlogic> without hacking the alias?
12:14 ergonlogic> once an aegir is installed, I mean
12:15 ergonlogic> oh, I see
12:16   omega8cc> this switch shouldn't  be too easy to find and use (like in the frontend) because it may lock the user in the middle of failed task etc. the user should really understand what he is doing, as it will involve changes to the config files, re-verify etc, and there is always a chance that something will fail - like it fails for aliases, and it is too important to allow it to fail
12:16 ergonlogic> but we already re-verify servers after altering settings
12:17   omega8cc> that is why I still see it a wishful thinking when reading about the "simple" approach
12:17 ergonlogic> wouldn't it be equivalent to changing a port, or adding ssl support?
12:17   omega8cc> no we don't
12:17   omega8cc> or rather, it doesn't work really
12:17   omega8cc> no
12:18   omega8cc> it is not like extra feature or one feature less
12:18   omega8cc> it is a config replacement, in fact, plus, it may break (and it will break) some sites which silently depend on that "complicated" config
12:19   omega8cc> so, In fact it would require three and not two options
12:19   omega8cc> like
12:19   omega8cc> bare bone (basically, you are on your own)
12:20   omega8cc> second: things expected to work out of the box, but without enforcing anything which could be skipped
12:20   omega8cc> third: out-of-the-box (as it is now)
12:21 ergonlogic> ok, that seems reasonable
12:23 ergonlogic> and most of that is in here: http://drupalcode.org/project/provision.git/tree/refs/heads/6.x-2.x:/http/Provision/Service/http ?
12:24 ergonlogic> it looks like there's already almost that
12:24 ergonlogic> in nginx.conf, nginx_simple_include.conf, and nginx_advanced_include.conf
12:24 ergonlogic> I'll need to dig into them a bit more
12:25   omega8cc> here is only that out of the box config, with minor difference between advanced and simple, but it is about something else (the upload progress stuff)
12:26   omega8cc> also, you will need more conditional main config and then the includes features depend on it, or it will fail, and it will, at least until the issue mentioned above will be fixed
12:29   omega8cc> and we have already spent too much time on supporting failed upgrades etc, all because the config is not updated on upgrade, so you really must do this with minimum moving parts and even after many tests you will find that it will break things for some older installs upgrades etc, it is not really as simple as it looks while you are testing during development
ergonlogic’s picture

So, we have 3 hooks for configuring Apache:

We don't appear to have nginx equivalents. I think providing these would be a good place to start fixing this issue.

ergonlogic’s picture

Actually, it looks like we do: Provision_Config_Nginx_Server::process. We'll have to update the API docs.

ergonlogic’s picture

I noted the new hooks needing docs in #1111254: Document the Hostmaster API.

In doing so, I also found drush_hook_provision_nginx_vhost_config() defined in Provision_Config_Nginx_Site::process

Both this one and the server one noted above are also invoked in the equivalent SSL classes.

Notably absent is the Nginx equivalent to drush_hook_provision_apache_dir_config().

omega8cc’s picture

There is no purpose to use drush_hook_provision_apache_dir_config() (Append Apache configuration to platform configuration.) equivalent in the Nginx configuration. At least I'm not aware of any, so it was skipped for Nginx.

omega8cc’s picture

Status: Needs work » Fixed

Previous (extended) configuration has been reduced where possible, but is still used by default, to provide backward compatibility without issues, while anyone can easily switch to use new, bare bones configuration, or later back to the extended configuration, using just one empty control file.

To enable basic config (either on initial install or upgrade):

$ touch /etc/nginx/basic_nginx.conf
$ su -s /bin/bash - aegir -c "drush @hostmaster hosting-task @server_master verify --force -d"

To enable extended config:

$ rm -f /etc/nginx/basic_nginx.conf
$ su -s /bin/bash - aegir -c "drush @hostmaster hosting-task @server_master verify --force -d"

Note that only basic->extended switch is safe, while extended->basic switch will break sites config (on upgrade only), until all vhosts will be re-verified to use correct include and match updated main, bare bones config. A simple workaround would be to replace the included config file with:

$ cd /var/aegir/config/includes/
$ cp -af nginx_simple_include.conf nginx_advanced_include.conf
$ service nginx reload

I hope this fixes all previous issues related to design principles and backward compatibility.

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

anarcat’s picture

Status: Closed (fixed) » Needs work

First, thanks for taking the time to fix this! :)

I had some issues transitionning to the basic system. This Aegir 2.x head as of now, and nginx 1.2.1-2.2 on Debian Wheezy.

Running the following:

# touch /etc/nginx/basic_nginx.conf
# su -s /bin/bash - aegir -c "drush @hostmaster hosting-task @server_master verify --force -d"

on a clean 2.x install, yields:

Executing: sudo /etc/init.d/nginx reload
  Reloading nginx configuration: nginx: [emerg] "fastcgi_cache" zone "speed" is unknown in /etc/nginx/nginx.conf:73
  nginx: configuration file /etc/nginx/nginx.conf test failed

Line 73 is where the conf.d is sourced...

You also need to run this for every site:

su -s /bin/bash - aegir -c "drush @hostmaster hosting-task @hostmaster verify --force -d"

I am not sure I understand how the basic/advanced switch works, but shouldn't it generate the included files as well?

Also, mig5's original suggestion was a config option to the backend, why do we need to touch a file instead?

omega8cc’s picture

I have tested this on Debian Squeeze only.

Where would you like to store such config option? We would end up in a egg/chicken loop, if we would like to rely on an option stored in the server drush alias, for example.

We simply use the most reliable switch, which uses the same method we already use to determine Nginx binary path and its init script path.

anarcat’s picture

I was expecting this to be stored in the same way that ports or the restart command are configured right now in a server.

How about generating the included file from a template instead of providing both include files and changing which is included in the vhost?

What I mean here is related to #1622846: Nginx config include files should be merged into one. My idea is that we should have a single file included in vhosts, and manage that file using a template. So instead of having nginx_advanced_include.conf and nginx_simple_include.conf, and change the vhost to include one or the other, we could have a nginx_vhost_common.conf file that gets generated differently depending on the configuration. That way, the whole system would remain consistent and configuration could be quickly changed across all vhosts transparently.

omega8cc’s picture

I agree that having just one include file auto-generated from the template already fixes both problems - conflicts between options in the main server config and includes and incorrect includes referenced in all vhosts, so it is a good idea.

The only problem: we still need to generate those two files, even if they will be identical, because old vhosts never heard about nginx_vhost_common.conf include, and if we will decide to use nginx_vhost_common.conf from now on, we will need to generate *three* include files, just to stay backward compatible with vhosts not verified after the upgrade/downgrade. But it doesn't hurt that much, I guess.

omega8cc’s picture

The restart command is stored, but first discovered the same way. But the restart command is not something expected to change after the initial install, so it is enough to simply store it and use as-is. With config mode it is different, because we want to be able to change the mode from the outside, on upgrade, so we deliberately always check for the control file existence (once) and the re-use the result via variable stored in the server drush alias to generate correct config from templates.

anarcat’s picture

One way to deal with the upgrade path is to generate simple stubs for the two current files that include the generated file. That way, yes, we generate three different files, but two of those are mostly empty and only useful for the 1.x to 2.x upgrade path. We can then discard them in 3.x.

omega8cc’s picture

Good idea!

anarcat’s picture

anyone working on this?

omega8cc’s picture

I should be able to complete all discussed changes later this week (Friday?)

anarcat’s picture

awesome, i wish to release alpha2 on friday too! :)

omega8cc’s picture

Ouch! Then I will do my best to complete this on Thursday :)

anarcat’s picture

any news here?

omega8cc’s picture

It is all done a few days ago: http://ci.aegirproject.org/job/D_aegir-debian-build-2x/186/

Sorry for not updating issues yet. One sec.

anarcat’s picture

That's great, thanks!

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

Anonymous’s picture

Issue summary: View changes

nit pick

  • Commit 34a69f4 on dev-drupal-8, 6.x-2.x, dev-ssl-ip-allocation-refactor, 7.x-3.x, dev-subdir-multiserver, 6.x-2.x-backports, dev-helmo-3.x by omega8cc:
    Issue #1635596 by anarcat - nginx: do not decide the policy for users