Yes, SNI doesn't work on Windows XP with old versions of Internet Explorer. But it's still pretty useful and could be supported without heavy modifications.

Activating SNI in the UI would cause Provision to skip the locking mechanism it currently uses to uniquely assign IP addresses to each SSL-enabled site.

Comments

anarcat’s picture

Version: 6.x-1.9 » 6.x-2.x-dev

Sure. Note that the SSL allocation code was significantly revamped in 2.x to clean up the IP allocation code.

There was code in this issue to implement SNI, no idea in what state it would be now: #1126640: move the SSL IP allocation to the frontend.

Note that this will not be implemented in 1.x core, but I'd be happy to review patches for 2.x.

anarcat’s picture

The more I look at the crap around IP addresses (now redoing the IP forms *again* in #1968226: manage each IP individually on the server level), the more I think back about this issue.

Basically, the question here is: do we want to support Windows XP for our SSL certificates? I feel that the time has come to say no. Microsoft itself will stop supporting XP entirely in April 2014, and XP has been in "extended support" since 2009 (which basically means security updates only).

SNI is so much simpler - we could simply ditch the whole IP allocation code instead of wasting more time coding for IP support.

Opinions on this matter would be very welcome.

helmo’s picture

Not to spoil your fun, but how do mobile devices handle SNI these days?

It's been a while since I've tried SNI, but I remember having trouble with some mobile phone browsers.

omega8cc’s picture

This may seem tempting, but let's remember that the real users of the SSL feature are not devs or designers but online shop owners. Depending on their target audience or market, dropping Windows XP support could be the equivalent of ignoring 64% of 512 million internet users in one country.

My personal opinion is that it is at least one full year too early for such a move.

Also, how would we support existing installs through SSL feature upgrades or conversion? I guess that is not even possible if we just drop support for dedicated IPs.

ergonlogic’s picture

#1932616: IPs deleted from hosting_ip_addresses table on server verify has been marked fixed, in favour of #1968226: manage each IP individually on the server level, which appears to indicate that the problem is only with removing IP addresses. Is IP allocation actually broken in 2.x? If not, or if there are viable workarounds, I suggest leaving it as is, rather than expend further effort on something we probably won't need in a year's time. The lifecycle for our 2.x branch can then extend sufficiently to support Windows XP users until its end-of-life. Then we can move to SNI in Aegir 3.x fairly soon.

omega8cc’s picture

@ergonlogic Not exactly. This feature is actually totally broken at the moment, because the ID associations are rewritten/randomized and lost on every server re-verify. That is why anarcat suggested replacing the current textarea with separate fields per IP to avoid this randomization.

[EDIT]

"that the problem is only with removing IP addresses"

No, this is no longer a problem, but the actual problem is no better, as explained in the comments linked above.

anarcat’s picture

Alright. So I agree that we can wait until Microsoft discards XP support (one year, basically the lifetime we're aiming for with 2.x) before we switch to SNI.

So I will finish the IP management code properly in #1968226: manage each IP individually on the server level. It may yield some new bugs, but that's why we have alpha release cycles. I believe we are too far into that refactoring to revert to what 1.x is doing, and I think that issue is the right approach.

We may even be able to slap SNI support in there as a hack for 2.x - it's necessary for #1784108: pack (and cluster?) modules incompatible with SSL, which is a huge priority for us at Koumbit right now - and one of the reasons why I wanted to change directions here. :) But we can also wait for 3.x for that and maintain a quick hack on the site, I guess.

Thanks for the feedback everyone.

Guillaume Beaulieu’s picture

I think there are a bunch of obscure reasons why we wanna keep different servers with different IPs. Having access to obscure things in the backend might be useful. We might eventually need an SSL-tunneling-only server that connects to different webheads, so having IPs allocated on the webheads would let us migrate a site from server to server without having to dick around in the backend at all. I think all the interconnects between varnishes/solrs/sql servers/web heads should be IP based, so we can realign a misbehaving sql server/web head/etc to some other machine without having to think too much. In short, the IP mashing code is reusable!

G

anarcat’s picture

Version: 6.x-2.x-dev » 7.x-3.x-dev

anarcat’s picture

just a note to mention that in #2071317: Incorrect SSL IP is deployed to servers in a web pack I am working on re-implementing the IP address passing code between the backend and frontend. If no IP is provided for a specific server, it will use the wildcard, which in turn could simply enable SNI for that host...

from what I understand from SNI (http://wiki.apache.org/httpd/NameBasedSSLVHostsWithSNI) - this is all that is required anyways!
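One way to check that a wildcard vhost really hands out a different certificate per name is to connect to the same address with different server names and compare the certificate subjects. The sketch below relies on the `-servername` option of `openssl s_client`; the function name `served_cert_subject` and the hosts in the usage comment are placeholders for illustration, not anything Aegir provides.

```shell
# Print the subject of the certificate a server presents for a given SNI name.
# Usage: served_cert_subject <address> <server-name> [port]
served_cert_subject() {
    addr=$1
    name=$2
    port=${3:-443}
    # -servername sets the SNI extension in the TLS ClientHello;
    # </dev/null makes s_client exit once the handshake is done.
    openssl s_client -connect "$addr:$port" -servername "$name" </dev/null 2>/dev/null |
        openssl x509 -noout -subject
}

# Example (hypothetical hosts): two names on one IP should yield two subjects.
# served_cert_subject 192.0.2.10 site-a.example.org
# served_cert_subject 192.0.2.10 site-b.example.org
```

If both calls print the same subject, the server is most likely falling back to its first or default SSL vhost instead of selecting one by name.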

now what I wonder is - at what level should the checkbox be: do we enable SNI at the server level? platform? site?

cweagans’s picture

If SSL is enabled, we should also add NameVirtualHost *:443 to the server config file.

anarcat’s picture

I have been able to make this work on a production server by removing the check on the IP address field in the frontend:

commit dc43c1047856b23770060d2c00b04f49bea75a4c
Author: Antoine Beaupré <anarcat@koumbit.org>
Date:   Tue Feb 4 16:53:20 2014 -0500

    do not set IP address if none provided, this should permit SNI, in the frontend at least

diff --git a/server/hosting_server.module b/server/hosting_server.module
index 1cef51c..9d75158 100644
--- a/server/hosting_server.module
+++ b/server/hosting_server.module
@@ -230,7 +230,7 @@ function hosting_server_form(&$node) {
   // taken mostly from the hosting_site alias stuff
   $form['ips_wrapper'] = array(
       '#title' => t('IP addresses'),
-      '#description' => t('A list of IP addresses this server is publicly available under, one per line. If none is specified, a DNS lookup will be performed based on the server hostname above. <br /><strong>This should point to the public network, if you have such a separation.</strong>'),
+      '#description' => t('A list of IP addresses this server is publicly available under, one per line. If none is specified, X509 certificates will be offered through the SNI mechanism by TLS-enabled webservers. <br /><strong>This should point to the public network, if you have such a separation.</strong>'),
       '#type' => 'fieldset',
       '#tree' => FALSE,
       '#weight' => -9,
@@ -307,18 +307,15 @@ function hosting_server_form(&$node) {
 /**
  * Implementation of hook_presave()
  *
- * We resolve the server name to IP addresses if none has been given
- * by the operator. we also fire up the regular services hooks.
+ * We notice the operator if no IP was given. We do not set any to allow
+ * operators to use SNI. We also fire up the regular services hooks.
  */
 function hosting_nodeapi_server_presave(&$node) {
   if (empty($node->ip_addresses)) {
     // this returns an array or FALSE
     $ips = gethostbynamel($node->title);
     if ($ips) {
-      drupal_set_message(t('Initialized the IP to %ip based on hostname %name. If an HTTP service is enabled, this will be used to create database grants so make sure it is the right address, as seen from the database server.', array('%ip' => join(',', $ips), '%name' => $node->title)), 'message');
-      $node->new_ip_addresses = $ips;
-    } else {
-      drupal_set_message(t("Could not resolve IP address of server %name, not automatically setting IP address. DNS may fail.", array('%name' => $node->title)));
+      drupal_set_message(t('No IP addresse provided for server. We guess the IP is %ip based on hostname %name. You should set IP addreses in the server node unless you are ready to use SNI (Server Name Indication) which is incompatible with IE and Safari on Windows XP.', array('%ip' => join(',', $ips), '%name' => $node->title)), 'message');
     }
   }
   hosting_server_services_from_post($node);

I pushed this on the dev-sni branch if anyone wants to test.

I'll be testing this in production now.

anarcat’s picture

Status: Active » Needs review

One bit that is still missing is for the backend to add a NameVirtualHost *:443 to the global configuration file if no IP is found on the server. But I am not sure we have the knowledge to make such a decision right now in the backend so I am configuring this by hand for now.
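Until the backend can make that decision, the manual step can at least be made repeatable. The following is a minimal sketch, assuming the directive is simply appended to a config file when it is not already there; `ensure_directive` is a hypothetical helper, and the demo writes to a scratch file instead of the real server configuration. Keep in mind that on Apache 2.4 `NameVirtualHost` is deprecated and no longer needed, so this only matters for older Apache versions.

```shell
# Append a directive to a config file only if that exact line is not there yet.
# Usage: ensure_directive <file> <directive>
ensure_directive() {
    file=$1
    directive=$2
    # -x: match whole lines, -F: fixed string, -q: quiet
    grep -qxF -- "$directive" "$file" 2>/dev/null ||
        printf '%s\n' "$directive" >> "$file"
}

# Demo on a scratch file rather than the real server config.
conf=$(mktemp)
ensure_directive "$conf" 'NameVirtualHost *:443'
ensure_directive "$conf" 'NameVirtualHost *:443'   # second call is a no-op
cat "$conf"
```

Because the helper checks before appending, it is safe to run again after every server verify.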

ergonlogic’s picture

Neat! If this is really all it takes, we could easily add a 'Support SNI (experimental)' option to the settings page, and switch conditionally. Assuming testing proves successful, I'd support backporting this to 6.x-2.x.

anarcat’s picture

Another patch is necessary to remove the check in the frontend when absolutely no IP can be allocated, also on the dev-sni branch.

anarcat’s picture

Status: Needs review » Needs work

Hmmm... I had to manually remove the ip_addresses field from the alias of a site to make this work. It seems this doesn't get cleaned up properly...

xurizaemon’s picture

If you want SNI + Aegir today, there is a workaround. This could be more graceful - it won't handle site migrations so you are going out on a limb. That said ... I just tested this out with Nginx server, and it's working.

  1. cp /var/aegir/config/server_master/nginx/vhost.d/example.org /etc/nginx/sites-available/example.org-ssl
  2. Edit the new -ssl config and add entries for ssl_certificate and ssl_certificate_key
  3. Modify the listen directive to match your SNI IP, terminating in :443 ssl;. (Note that Aegir gets first dibs on IPs, so you must not assign this IP in Aegir at all.)
  4. Enable your new vhost
  5. nginx -t && nginx -s reload

If any details in the original vhost config change, you need to update them manually. A better approach would be to hook into your vhost config and append a customised SSL setup.
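Steps 1 to 3 above can be roughly scripted. This is only a sketch under stated assumptions: `make_ssl_vhost` is a made-up helper, the sample vhost is invented for the demo (real Aegir vhosts contain much more), and it assumes the copied file has its `listen` directives on lines of their own.

```shell
# Turn a copy of a plain vhost into an SSL one by replacing its listen directive.
# Usage: make_ssl_vhost <src> <ip> <cert> <key>   (writes the result to stdout)
make_ssl_vhost() {
    awk -v ip="$2" -v crt="$3" -v key="$4" '
        /^[ \t]*listen[ \t]/ {
            if (!done) {
                indent = $0
                sub(/listen.*/, "", indent)   # keep only the leading whitespace
                print indent "listen " ip ":443 ssl;"
                print indent "ssl_certificate " crt ";"
                print indent "ssl_certificate_key " key ";"
                done = 1
            }
            next                              # drop the original listen lines
        }
        { print }
    ' "$1"
}

# Demo with a made-up vhost and documentation-range IP.
src=$(mktemp)
printf 'server {\n  listen *:80;\n  server_name example.org;\n}\n' > "$src"
ssl_vhost=$(make_ssl_vhost "$src" 192.0.2.10 /etc/ssl/example.org.crt /etc/ssl/example.org.key)
printf '%s\n' "$ssl_vhost"
```

The output would still need to be saved under sites-available, enabled, and checked with `nginx -t` before reloading, as in steps 4 and 5.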

Back to the issue at hand now :)

SocialNicheGuru’s picture

on 6.2 the dev-sni backport seems to work for me.

fittypants’s picture

Version: 7.x-3.x-dev » 6.x-2.0

Installing the dev-sni branch of the hosting module fixed this for me. I used the CentOS 6 manual installation guide and, after a successful install, updated the hosting module manually using the dev-sni branch:

cd /var/aegir/hostmaster-6.x-2.0/profiles/hostmaster/modules/
mv hosting hosting.orig
git clone --branch dev-sni http://git.drupal.org/project/hosting.git

I get a warning about SNI compatibility when creating a new site that shares a common wildcard SSL cert, but things work perfectly.

SocialNicheGuru’s picture

seems like the changes are in hostmaster-6.x-2.1

Edit:
commit: http://drupalcode.org/project/hosting.git/commit/dc43c1047856b23770060d2...
this is already applied from dev-sni to 2.1
http://drupalcode.org/project/hosting.git/patch/dc43c1047856b23770060d2c...

This commit has not been applied to 2.1: http://drupalcode.org/project/hosting.git/commitdiff/26905e71a2e296b83bb...
Here is the patched code that applies to 2.1
http://drupalcode.org/project/hosting.git/patch/26905e71a2e296b83bbcbdfa...

Looking through the logs I did not see other changes

anarcat’s picture

I don't recall merging that branch in 2.1... I would assume that it's not merged yet.

kristofferwiklund’s picture

I am looking into this.
Some sites are fine without SSL for anonymous users, but login should be available over SSL. My solution for 6.x-2.x is to add the same IP to the server node over and over again, so that the IP is duplicated. Aegir will then find a "unique" IP for the next cert when saving the cert.

gboudrias’s picture

I tested the dev-sni branch and encountered the following problems:

  • Every site seems to have a self-signed certificate created for it the first time we activate SSL for it using the wildcard cert. We have to change it and then switch to the wildcard cert again.
  • I once lost the wildcard certificate from the frontend by disabling SSL on one site. I suspect it didn't find any "other" IPs assigned to the cert and that's how it decided to remove it.
  • I also had to add NameVirtualHost *:443 and remove the IPs from the aliases as per #13 and #16.

(Edit: Part of this may be because I hadn't removed the IP address from the frontend at first!)

NWOM’s picture

I had the same issues in my environment as above.

milovan’s picture

I can't find the dev-sni branch anywhere. Am I missing something? The only thing I found was dev-ssl-ip-allocation-refactor, but it is the same as 2.1, so that is not it. Please let me know, as I would like to test it.

milovan’s picture

Okay, I tested the dev-sni branch with Aegir 2.1, and here is what I found so far:

  • On one existing site, A (Drupal 6), I enabled SSL and it created a key for it, which is fine.
  • On another site, B, I enabled SSL and chose the key from the previous site (A), but in the end it created a new key for site B instead of using key A.
  • I created a new site and chose to use the key from B, but instead it created a new key for itself, notifying me about SNI (message: Unable to allocate IP address for certificate, assuming SNI (Server Name Indication) will work (incompatible with Safari and IE on Windows XP, Android 2.2, etc). ).
  • If I visit site A (which is Drupal 6) over https, I see the Drupal 7 installation instead of site A.
  • If I visit site B (which is Drupal 7) over https, it is displayed correctly.
  • If I visit site C (which is Drupal 7) over https, it redirects to the https version of site B.

The dev-ssl-ip-allocation-refactor branch behaves the same way.

So for me, the SSL fix for the regression introduced by Aegir 2 doesn't work correctly. I can now create more SSL sites than I have IPs in my pool, but they are not created correctly at all, as can be seen from the bullets above. I would really like to help and test everything needed. Not having a common SSL cert for all websites on test and development servers is really a showstopper for upgrading to Aegir 2: the upgrade will naturally fail, since many of those development servers rely on SSL, and adding 30 more IPs to the pool on each development server is simply not an option.

milovan’s picture

After Anarcat told me on IRC how to properly configure SNI (single-cert sites and SNI sites cannot be mixed), I did the following, and here are the results:

  • Disable SSL on all sites in Aegir
  • On the server node, remove all IP addresses
  • Enable SSL on all sites where you disabled SSL
  • While enabling, I used both options, "Use existing key" and "Generate new"

Results:

  1. Whether or not you choose "Use existing key", it will ALWAYS create a NEW key, as if you had chosen "Generate new key".
  2. The HTTPS version of the first site enabled with SNI works correctly.
  3. The HTTPS versions of the other sites DO NOT work. First, they always generate a new certificate and notify about using SNI; they do not reuse the certificate of the site where SSL was enabled first. This feature was present in branch 1.x and still works correctly there!
  4. What is important here is that all those HTTPS versions (except for the first SSL-enabled site) lead to a Drupal installation! Which Drupal installation (6 or 7) depends on what the FIRST SSL-enabled site runs on. If the first SSL-enabled site was D6, then all other HTTPS sites will lead to a D6 installation.

So, after completing my testing, I must say something is terribly wrong here, and this feature is not even close to the one that is missing from the 1.x branch.

But there is a workaround. It is not too cool, but it is better than nothing, as this issue is a showstopper for migrating development servers to branch 2, which also means production servers have to wait for the development servers. The workaround is to add the same IP address to your server node in Aegir multiple times. For example, if your IP address is 192.168.1.15, add it as many times as you have sites waiting for SSL. Let Aegir complete verification of the server and platforms, then enable separate SSL certs, which will all bind to the same IP. That way SSL will finally work correctly on Aegir 2.

This is an uber dirty workaround, and it is also quite slow and painful. On 1.x you had a text area and could add many IP addresses at once; on Aegir 2 you can add only one IP address at a time, and saving the server node triggers verification of the server and its platforms. So if you add 4 copies of one site you are developing for multiple teams, and those sites need SSL, you need to re-verify everything FOUR times. That takes some time, especially if you have many platforms and sites, which IS the case on development servers, as they hold many copies of production platforms on which testing, evaluation, fixes and other development are done.

That said, I am willing to help as much as possible with this feature, by testing or anything else needed, because I would really like to migrate off the deprecated 1.x branch as soon as possible. I am aware that 2.x is short-lived, but without this I am currently stuck between deprecated and incomplete versions until 3 comes out or SNI gets into 2.

realityloop’s picture

Version: 6.x-2.0 » 7.x-3.x-dev

bumping to 3.x branch (with backport maybe?)

helmo’s picture

Issue tags: +Aegir 3.0.0

valkum’s picture

I think this needs to be built properly from the ground up. Is there a way to organize a meeting to share ideas and maybe discuss the structure of such a big feature?
This isn't a feature that can be fixed with a few frontend changes.
We should be clear about what we have, what we need and how we do it. Maybe we should meet on IRC or so. I would like to contribute to this issue, as we would like to 'encrypt everything' on our Aegir instance.

milovan’s picture

I agree with valkum. Count me in as well!

bgm’s picture

For what it's worth, for the past year I have been running with a patch on Aegir to remove most of the IP management stuff:

- IP management stuff on SSL breaks IPv6 support by declaring a vhost on a specific IP. You may not be using IPv6 today, but you should, and definitely should by 2016 for North America and Europe.
- IP management is not necessary if we're using SNI, which, face it, it's 2015 and no one cares about Windows XP anymore. If they do, they're probably not still using IE 6.

People look to solutions such as Aegir to push their infrastructure forward, not backwards. Let's help make it so :)

ergonlogic’s picture

I have to agree with @bgm on this. Our management of IP addresses leaves a lot to be desired (a usable UI, for example). Moving to SNI would greatly simplify proper SSL management (which itself requires an overhaul).

@bgm: can you share that patch you mention?

bgm’s picture

@ergonlogic: we mostly override the vhost templates for apache/nginx, and either allocate the same IP many times to the server, or comment out the code that checks for available IPs / deletes certificates, but I don't have a formal patch for that, it was mostly to experiment.

Templates are available here:
https://github.com/coopsymbiotic/provision_symbiotic

ergonlogic’s picture

Version: 7.x-3.x-dev » 6.x-2.x-dev
Status: Needs work » Patch (to be ported)

I've merged the SNI branch into 7.x-3.x and it appears to work quite nicely. I suggest that we also merge it into the 6.x-2.x branch before the next release.

Also: note that I've started a broader discussion on how to move forward with SSL in Aegir over in #2466977: [META] SSL refactor

ergonlogic’s picture

Version: 6.x-2.x-dev » 7.x-3.x-dev
Status: Patch (to be ported) » Needs work

I've found some problems with our current implementation of SNI support, so I'm resetting this to 'Needs work'.

First off, in /var/aegir/config/server_master/apache.conf, we need a NameVirtualHost *:443 directive. This should be a simple tweak to the template.

I also had to add Listen 10.0.0.1:443 to my /etc/apache2/ports.conf, so that the proper interface is listening on port 443, but I don't believe this is in scope for Aegir config. It'll just have to be properly documented, and perhaps added to Ansible roles, Puppet modules, etc.

This allows SSL to work after removing all IP addresses from the http server node. However, it results in an unwanted behaviour. That is, if one attempts to access a non-ssl site via https, we end up re-directed to an SSL site, if we click through the SSL warning. At present, we are re-directed to the alphabetically first SSL-enabled vhost.

The solution here is to provide a default 443 vhost. Of course, this requires a certificate too, though, technically, not a valid one. That is, a self-signed cert will work here (still throwing the usual SSL-doesn't-match warning), but can then re-direct to a 404. For example, the following vhost works for me:

<VirtualHost *:443>
    SSLEngine on
    SSLCertificateFile /var/aegir/config/server_master/ssl.d/default/openssl.crt
    SSLCertificateKeyFile /var/aegir/config/server_master/ssl.d/default/openssl.key
    ServerName default
    Redirect 404 /
</VirtualHost>

I generated that cert by hand using default.invalid as the domain name, but we could presumably use the server's FQDN instead. This might cause some issues if we were to enable SSL on the Aegir front-end, which would normally use the FQDN.

Anyway, I figure we can generate such a default cert when we enable the SSL feature, along with adding the missing elements to the server's apache config.
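For reference, a throwaway certificate like the one described above can be produced with a single `openssl req` call. This is a sketch of the by-hand step, not the code that Aegir runs; `make_default_cert` is a hypothetical helper, and the demo writes to a scratch directory rather than /var/aegir/config/server_master/ssl.d/default.

```shell
# Generate a throwaway self-signed key and certificate for a default vhost.
# Usage: make_default_cert <directory> [common-name]
make_default_cert() {
    dir=$1
    cn=${2:-default.invalid}
    mkdir -p "$dir" &&
        openssl req -x509 -nodes -newkey rsa:2048 -days 365 \
            -subj "/CN=$cn" \
            -keyout "$dir/openssl.key" -out "$dir/openssl.crt" 2>/dev/null
}

# Demo in a scratch directory; the file names match the vhost shown above.
certdir=$(mktemp -d)
make_default_cert "$certdir"
openssl x509 -in "$certdir/openssl.crt" -noout -subject
```

Passing the server's FQDN as the second argument would give the variant discussed above, with the caveat about enabling SSL on the Aegir front-end.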

This will presumably require some work on the nginx side as well.

ergonlogic’s picture

Status: Needs work » Needs review

I just pushed a couple of commits that resolve the issues I brought up in #36: 8e90444480 and 644fbe5c6e. The first just adds the NameVirtualHost (which we'll need to remove at some point, since it's deprecated by Apache). The second generates the default SSL vhost and certificate. I'm not certain that this is the best approach, but it has the virtue of working. We end up with quite a bit of duplication b/w the write() methods for the ssl server and ssl site classes, which we might be able to clean up somewhat.

I'll test this further, but it would be good to get more eyes on this.

ergonlogic’s picture

Version: 7.x-3.x-dev » 6.x-2.x-dev
Status: Needs review » Patch (to be ported)

I'm going to consider this fixed for Aegir 3.x, as there are plenty of other SSL issues to tackle.

I think the dev/sni branch should be merged into the 2.x branch, along with cherry-picking 8e90444480 and 644fbe5c6e.

ergonlogic’s picture

I merged these changes (along with those from #1324466: provision-migrate fails because provision-backup creates a useless dump) into the 6.x-2.x-backports branch, and I'm running them in production. This branch should otherwise be identical to 6.x-2.3, and so safe to test for anyone looking to get this fix. Note that you'll need to run the same branch of Hosting, as well.

I had some challenges with existing SSL sites, since the cert->IP mapping was still in place. I eventually uninstalled hosting_ssl, and manually removed the IPs from the sites' contexts, before re-installing hosting_ssl.

gboudrias’s picture

In 6.x-2.x-backports, we still have the problem where it will generate a self-signed certificate when you first set SSL on a site, regardless of which certificate you actually selected. This doesn't occur in vanilla 2.3.

milovan’s picture

I did tests on two separate servers:

  1. Upgrade from Aegir 2.1 to Aegir 2.4
  2. Clean install Aegir 2.4

The results are the same:

  • Creating a new website using an existing key doesn't work; instead, it generates a new key (if there is a free IP in the pool)
  • Editing an existing site to use some other key works

As a reminder of how it worked on Aegir 1.x: there we had one key that worked with all websites, both when creating and when editing a site. So I believe there is still work to be done to overcome the regression introduced with Aegir 2.

crash98’s picture

Is there any useful solution to this problem in the 6.x-2.x branch?
I'm using a wildcard certificate for multiple Sites on one Server with one IP address. Right now I'm living with the workaround mentioned in https://www.drupal.org/node/2023621#comment-8780059 by modifying the aegir database and enabling the certificates for the sites manually.

The solution withstands also a "verify" task by aegir but fails if any of the sites using the key is deleted or cloned. After that, all the sites show "(key deleted)" in site-overview and the certificate entry is removed from the hosting_ssl_certs table.

edit: after testing for quite some time now, I also found that the only suitable solution is to add more IPs (the same IP multiple times) to the server node. So far, it also works fine to add a huge number of IPs (80 in my case) directly to the database table hosting_ip_addresses.

meanderix’s picture

Version: 6.x-2.x-dev » 7.x-3.x-dev

This still seems to be a problem in the 7.x-3.x branch. I'm using a similar method as suggested by #34, by manually changing {ip_address} in vhost_ssl.tpl.php to *. There must be a better way of doing this that doesn't require hacking the code.

acrollet’s picture

Just to document what I've seen - I was having trouble getting a new SNI-based virtualhost to work. Turning up apache's LogLevel gave the following error:

"No matching SSL virtual host for servername {sitename} found (using default/first virtual host)"

After much gnashing of teeth, I found that the hostmaster virtual host had the IP set in the vhost file, causing this error. I was able to prevent this by editing hm.alias.drushrc.php and hostmaster.alias.drushrc.php and emptying the ip_addresses array. They have not come back after verifying the hostmaster site and the server, so my fingers are crossed that it won't re-break. (I think this may have been caused by my mistakenly setting an IP address on the server config at one point)
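A quick way to spot this kind of stray IP binding is to search the generated vhost files for port-443 vhosts that are not bound to the wildcard. The sketch below is an illustration only: `find_ip_bound_vhosts` is a made-up helper, the demo uses invented files, and the real Apache vhost directory is an assumption (by analogy with the nginx path mentioned earlier in this thread, something like /var/aegir/config/server_master/apache/vhost.d).

```shell
# List vhost files that bind port 443 to something other than the wildcard.
# Usage: find_ip_bound_vhosts <vhost-directory>
find_ip_bound_vhosts() {
    # Matches e.g. <VirtualHost 10.0.0.1:443> but not <VirtualHost *:443>.
    grep -rlE '<VirtualHost[[:space:]]+[^*[:space:]>]+:443' "$1"
}

# Demo with two made-up vhost files in a scratch directory.
vdir=$(mktemp -d)
printf '<VirtualHost *:443>\n</VirtualHost>\n' > "$vdir/sni.example.org"
printf '<VirtualHost 10.0.0.1:443>\n</VirtualHost>\n' > "$vdir/hostmaster.example.org"
find_ip_bound_vhosts "$vdir"
```

Any file it reports is a candidate for the ip_addresses cleanup described above.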

millenniumtree’s picture

Is anyone actively working on this? We are about to begin a server migration from a symlink-based Drupal setup to a new Aegir 3 platform with centralized database / hostmaster server, and 3-4 sub-servers running nginx.

We have a mixed environment using shared IPs as well as many legacy IP/Cert links (No SNI).
As it is getting more and more difficult to get new IPs, we would like to use SNI on the new platform as well.

I've played a little with linking certs and IPs and the current support is not good. If you add a shared SNI cert to two sites, then delete one of the sites, your cert disappears from both. Mixed SSL/Non-SSL sites are a bit janky too.

I'd like to help out with testing and development, but need to bounce some questions off someone with knowledge of the existing code. PM me if you can help answer some questions.

kristofferwiklund’s picture

We are running multiple Aegir 3 servers in a master/master configuration. With Aegir 3, the standard setup works great with SNI (just do not enter an IP for the server). We use it for all of our servers and Let's Encrypt certs. But we have not tested a mixed setup on one server with both a shared IP and an IP cert.

millenniumtree’s picture

I think with how many SSLs and IPs we currently have, it would not be feasible to put all of our sites on a single IP, or build a new VPS for each client with a unique IP. At least for the foreseeable future, we will need a solution that allows multiple IPs, legacy certs, and maybe SNI in the future.

EDIT:
Forget everything I stated above.
We have gotten the servers set up with SNI and LetsEncrypt.

Thanks kristofferwiklund, for the suggestion. I think this will work out very nicely.

ergonlogic’s picture

Status: Patch (to be ported) » Closed (outdated)

The 6.x-2.x branch is no longer supported (since Drupal 6.x isn't either). So back-porting SNI to 2.x doesn't make sense anymore.