I'm not saying we should merge the modules, per se - I'm saying we should permanently consider it / think about it while doing work on the modules.

Quick conclusion: the differing usage of unique user IDs (point 2 below) is the biggest fundamental hurdle for merging. So far, this is just informational: outlining the modules' differences in a more detailed way than the 'contributed modules' page does.

Situation / history

The saml_sp and samlauth modules have the same purpose and even use the same supporting library; there is no real use for them both existing. This is a clear example of what the Drupal community does not want to happen.

These are not even the only modules implementing a SAML SP; there are also

  • OneLogin Integration which also uses the same supporting library.
  • miniorange_saml (SAML SP 2.0 etc...) which does the same thing while not depending on an external library. Its distinguishing features seem to be that the codebase is ported from a non-Drupal source and that it mainly exists to advertise the services of the company that created it.

We briefly mention these in point 2 below.

How that happened?

  • saml_sp existed first - for D7.
  • samlauth was the first D8 module. I don't know how that came about, but I'm guessing that Universal just needed a module quickly to satisfy a business need, and decided to have one built that suited their purpose. While for the outside world that's unfortunate, I can imagine situations where that's more practical than working together to get your requirements integrated into an existing project.
  • onelogin_integration got created, its README.md saying "the modules that are there for Drupal 8 lack documentation, proper coding or don't even work correctly.".
  • salmauth got backported to D7 (which was apparently also a paid assignment). saml_sp got ported to D8. Et voila.

This issue is 4 1/2 years overdue. That's also just what happens. I got some paid time to invest in making this module more widely usable, which unexpectedly got me maintainership of the 2.x branch, but the rest was volunteer time... and it took 4 1/2 years until all existing / steadily incoming issues were dealt with; only after this, I got further plans worked out.

Differences between modules

The main ones I see:

1)
The saml_sp module supports multiple IdPs (enabling a user to log into the same Drupal site from multiple locations / authentication providers that use SAML).

2)
At this moment, the handling of authentication done by the two modules is incompatible - specifically, the way in which we handle a unique, immutable per-user ID value from the SAML assertion to identify a Drupal user who can log in.

The samlauth module uses a SAML attribute with a configurable name for this ID.

The saml_sp uses the SAML NameID for this ID. Its standard mode of operation is to expect this ID to be an email address (it always requests 'email' format from the SP) - but it can also be a different value. See also point 8.

Some additional opinion about this uniqueness/immutability and the reservations I have about using 'email' for this, are in #3211380: NameID support. In #3118296: Add extra fields to config, saml_sp might be exploring how to expand their usage of the NameID but I'm not 100% sure.

About the two other modules:

The onelogin_integration module doesn't implement a concept of a "unique, immutable per-user ID value"; it only configures 'name' and 'email' attributes, and the module will match the user to log in, based on (configurable) either of those two attributes. No words are spent on how the selected attribute must never change values on the IdP side or a different user could be logged in / new user will be created. (This, combined with the fact that it doesn't contain functionality the other modules lack plus the small install base, IMHO make it unnecessary to 'consider merging'.)

The miniorange_saml module can be configured to take this unique ID from either the NameID or an attribute - and always matches it to an existing Drupal user's email address, to decide which user to log in.

3)
The samlauth module has the possibility to take actions based on other attribute values (like adding / synchronizing on every login, extra user properties including roles - through extra modules and an event subscriber). The D8/9 version of the saml_sp module lacks this so far and contains a TODO to implement this.

4)
saml_sp for D8/9 does not implement logout/SLS yet.

5)
samlauth currently has more configurable options to accommodate IdP behavior - owed largely to issues/patches contributed by users. saml_sp has options to advertise contact information in the SP's metadata, which samlauth doesn't.

6)
saml_sp can install multiple X.509 certificates to check the SAML responses against. samlauth only one or two (following specific scenarios implemented in the SAML Toolkit library). Outdated: samlauth can now also support an arbitrary amount of IdP certificates. The implementation differs - I am guessing that is because saml_sp has implemented this before the upstream library supported it. samlauth just populates Settings properties that take care of this.

6.5) samlauth suppors the Key module, i.e. alternative storage options for the private key/certificates.

7)
saml_sp checks whether the certificate used by the SP signing/encrypting messages is expired. samlauth has no such provision yet. This is closely related to point 6.

Some other details that can influence a potential path to merging:

8)
The modules store the mentioned 'Unique ID' in a different manner. The saml_sp module stores it in a user field (which is the user's email by default, but can be configured to be a different field). samlauth v2+ uses the External Authentication module as a dependency which stores it in a separate 'authmap' table (like D7 core did).

9)
saml_sp is split into two modules: a module to handle the SP functionality and another optional module to handle Drupal login. (While this is a neat concept, I personally don't see a practical purpose yet of using a Drupal site as a SAML SP while not logging into Drupal immediately afterwards.)

10)
saml_sp checks Drupal Core settings for user registration, and optionally allows a nonexistent user to request an account. samlauth operates separately from Drupal Core and has an option to automatically create accounts from login data.

Other nitty-gritty details that I've seen so far, which would only come into play if we actually decided to merge:

11)
The saml_sp module treats the Onelogin PHP SAML Toolkit library less as a 'black box' than the samlauth module does; it has more child classes that alter its functionality. I believe that's partly necessitated by having multiple IdPs.

(samlauth will likely also replace the upstream Auth class in its 4.x version. The future of the Settings class isn't clear yet; the upstream module's architecture doesn't really enable extension/subclassing, even though we'll probably have to.)

12)
The samlauth module has (to a certain extent) clunky hardcoded paths for the location of the SP key/cert pair. That's a result of not wanting to reinvent logic contained in the Toolkit library. (I believe the best way to overcome this without unnecessarily duplicating code, is to open a simple github PR that enables Auth::construct() to accept a Settings class as well as the current array.) Clunky hardcoded paths are fixed in v3.3. The code fixing this isn't ideal, and should be rewritten in v4.x, but we don't know exactly how yet.

13)
samlauth is not letting the Toolkit library issue 'Redirect' headers and exit, but makes a point of always returning response objects back to the caller. (Besides being the theoretically right thing to do / possibly having some benefits for a small subset of Drupal installations... this has caused an enormous amount of grief by unwittingly opening the gates of hell#2638686: Correctly handle cache data instead of throwing an Exception in EarlyRenderingControllerWrapperSubscriber().)

...and there are probably other differences I didn't pick up on.

Comments

roderik created an issue. See original summary.

roderik’s picture

I wanted to type the above, partly as a form of documentation / info about the two modules... but re. the practical push toward merging: I am not at this moment going to do it.

I do have plans (#2882568: Plan for SAML Authentication 4.x) but

1) They are just plans. Of course I'd love to tackle the lack of tests and de-spaghettify this module's code... but now that we're finally stable without any outstanding weird bugs or major user requests... barring sponsorship or volunteers to do the work... my attention will very likely shift to other projects.

And I don't expect much eagerness for sponsorship/volunteers, given that the above is 'below the surface' work. The module works well.

2) The plans point into a different direction of cooperation - being: externalauth. In an ideal world, I can see functionality (most of the event subscriber code around e.g. attribute mapping / synchronization, some intricacies around user management) be turned into generalized/optional functionality that other externalauth-using modules could enable.

The work on externalauth's current structure of events is a prerequisite for de-spaghettifying the code in SamlService. If/when that is made possible, the effect will be that this module becomes easier to merge and/or exchange code with.

We could in theory also go back to not using externalauth the ExternalAuth service, instead of trying to rewrite much of its code. But that feels like a step backwards from (instead of towards) greater cooperation / unification of modules.

jproctor’s picture

As promised, my big pile of thoughts non-authoritatively from the SAML SP side.

For my own history, I tried both modules (and SimpleSAMLphp) when I was setting up my D8 site 3 years ago, and SAML SP was the one I got working first. Since then I had to switch to an IdP that uses three certs (front channel signing, back channel signing, encryption), so I’m kinda glad luck dropped the way it did.

  1. I really like having IdPs as config entities. It made testing my IdP replacement (and then executing it) super easy, even though the site only shows one at a time. And perhaps unique to me: since I often end up helping the IdP admin in our organization, including setting up attribute resolution in that new IdP, it’s super convenient to use my dev instance of Drupal as an SP so I can see what the IdP does in various “real-world” situations that aacli.sh can’t get to.
  2. Skipping over this one for now. It’s obviously big enough for its own issue. :)
  3. Yeah, I’m a little jealous of this. I do have a use case, but I only have about 20 users on my site, so we’ve been doing it manually (So-and-so got married; go update their name).
  4. I would like this too, but it’s super low priority because of the nature of the site, so my boss keeps dropping it down the list.
  5. Neither a barrier to merging nor likely a significant distinguishing factor — either module’s functionality could be adapted to the other.
  6. My IdP requires this.
  7. Neither a barrier to merging nor likely a significant distinguishing factor.
  8. I have in my head that one of the principles of SAML SP was not to require additional modules, but I don’t see that written down anywhere, so I might be mis-remembering. Anyway, this is related to the problem of NameID.
  9. My assumption (I never asked) was that it was separate so someone could come along and change everything about how it deals with the response from the IdP to uniquely identify a user and what to do about their attributes/fields. It is the “solution” to #s 2, 4, 8, 10, and maybe 13 by dumping the effort on the developer who needs to deviate from the standard case. AFAIK no one has done this, because it would be a pain.
  10. I think it’s also possible for SAML SP to actually create the user, if Drupal’s Account Settings says that should be allowed. I’m sure it used to be (one of the times I reworked that code) but it might have gotten lost somewhere, or the code unreachable because of some configuration oddity. Regardless, I’d say it’s neither a barrier to merging nor likely a significant distinguishing factor.
  11. I have never looked really closely at these (despite being the person who made them work with OneLogin 3.x). I generally favor not overriding someone else’s library without a good reason. It would take some thought in a merged module to see which bits are still necessary/useful, but that might be good effort to undertake. There may be cruft here that never gets called.
  12. I have settings.dev.php and settings.prod.php files (conditionally included from my main settings.php) to let me point to different cert/key pairs. Why there and not Config Split? I have no idea; I set it up 3 years ago and it still works so I haven’t changed it. Anyway, this functionality is useful, and whether we get OneLogin to improve their API or keep doing it the clunky way, I hope it would survive a merge (if there is one).
  13. “We want to do some processing, stash the result someplace useful, and then send the user to that page with a new state based on our processing” feels like it ought to be easy in a web app, and it never is. I think SAML SP also always returns a RedirectResponse object.

I can’t name any other differences, but I haven’t studied it nearly as much as you.

As for your plans:

  1. At least you have tests. I haven’t even written a task issue for this. And wow do we need them.
  2. I didn’t previously know about ExternalAuth, but I agree it looks like a way to move towards more standardization and cooperation.

Other things I’ve thought of for SAML SP but haven’t really explored:

  1. IdP-initiated login (we had a question about this and I figured out today it isn’t currently possible with SAML SP)
  2. Dynamic IdP metadata (with caching of course), pulled independently from each IdP or through a federation like InCommon
  3. Completely restructuring IdP config, including attribute mapping (see #3126850: IdP-specific login config though it’s out of date with the latest work on #3118296: Add extra fields to config)

Personally (again, I do not speak for SAML SP), I like the idea of merging, for the reasons you mention. One of the things that attracted me to Drupal many years ago was that there is usually one common standard for which contrib module to use for something, then a few specialty solutions if you have an exceptional case. It does not feel like that’s how or why our modules diverged.

I can also imagine letting them grow more distinct (or even encouraging it), but simple vs complex feels like the most logical distinction to allow growth but eliminate the problem that users must to choose which features are most important to them because they can’t get everything in one place.

This leads me down the path of thinking about merging, but with submodules or even plugins: a core with basic functionality and optional things (including an API) if you have special needs. That is a scary, dark path. We’re not Views or Search API. But I keep coming back to my days as a Perl programmer 20 years ago, and the motto “make easy things easy, make hard things possible.”

Regardless of the architecture I think every feature in the list above could be provided.

And I agree the most difficult issue is identifying the authenticated user, and we both already have that challenge, from different sides. #3211380: NameID support is a good place to discuss that, regardless of whether we grow our modules closer together. We should proceed with that.

If SAML SP were the foundation of a merged SAML module, I’d probably suggest a major rewrite and lose some backward compatibility — not because the existing code is bad, but because it grew organically instead of being planned and I think it could be better now that everyone involved knows more about what’s expected of it. That feels like a lot of work that benefits the community but not my site, so my contributions would be mostly volunteer time instead of paid for. I’m happy to help, but I drift in and out of availability.

You know your code far better than me; how would you approach a merge? What features from SAML SP would you want first? Are there any things from SAML SP that you would not want?

roderik’s picture

I'm happy/thankful that you reacted so quickly/positively! Even if neither of us is going to plan doing the practical work at this stage...

I don't currently see a reason for 'specialization' of one of the modules in a specific direction, because (almost?) all the features mentioned are desirable to have - and most of them are 'easy in principle' to merge / wouldn't bloat the resulting module in a way that requires splitting them off.

Maybe I'd want more discussion on the multi-IdP feature - and whether that should live in some kind of 'add-on module'. It's the biggest awesome feature of SAML SP, but... since (also from reading your description) production sites are generally going to have a single IdP configuration... I'd like to hear more thoughts on whether working with configuration values vs. a single config entity, fits people's workflow on e.g. their multi-site platform.

(I have less experience with configuration wrangling techniques than the average large/multi site D8 developer.)

For the rest I don't see an issue in merging functionality. How I'd approach it at this moment? Probably, just try to merge things into the samlauth module. It doesn't need a big rewrite.

--

I'll clarify details around this opinion - this is just an opinion:
The main functionality is in SamlService (which basically contains the same public method names as the controller... but only 'real' functionality so you don't need to see the awful code dealing with redirects/exception handling hidden away in the controller).

The bad:

  • Parts are in need of a rewrite - re. what kind of objects it handles / passes around. #3211529: Refactor SamlService::getAttributes() to a value object. jotted down about 3 (unfortunately interconnected) things to do inside it.
  • acs() is long and windy. The code path may make you go cross eyed. This won't change until we either stop using the ExternalAuth service for registering accounts, or rewrite it.

The good:

After you're over the "going cross eyed" part and decide to live with it... it's actually in pretty decent shape, and can handle additions. The 'NameID stuff', for instance, can be added to the code easily. Sure, some things will need interface breaking changes (per the point above), but I don't think anything will require it to have a big conceptual overhaul.

The multi-IdP stuff is the biggest, sure. But that, and some other work, is separated from the main SamlService code. (In the end the whole big retooled configuration will... result in something like a different Saml2\Settings child class that can be used in the same way.)

roderik’s picture

And about point 13: SamlSPAuth::login() does not return. Unless specifically instructed not to do so, the library internally does

        header('Pragma: no-cache');
        header('Cache-Control: no-cache, must-revalidate');
        header('Location: ' . $url);
        exit();

My message is: do not change this and start returning TrustedRedirectResponses (which you are pretty much required to use for redirecting to an external URL) without executing inside a render context. The results are... bizarre.

(My guess is that the current version of saml_sp_start itself will immediately cause exceptions. Which at least is better than e.g. having the odd user suddenly report exceptions at /saml/ routes at some random point in the future... and in the end discovering it is because they upgraded Rules to the latest alpha version. Which has no connection to SAML.)

roderik’s picture

Issue summary: View changes

Features update:

I didn't expect to do real work on samlauth this quickly, but someone close made a case for Key support (#2915977: Key Management) so I added that.

This required abstracting the way we read keys/certificates, and before I did that... I took a better look at how the upstream library handles them. As a result, the module now handles multiple IdP certificates properly - and if you use files for the SP key/cert, the strange dependency on hardcoded file paths is gone.