Update

The below discussion has helped me sharpen my thoughts.

The module will in itself support any nameID format... we can't really prevent people configuring strange formats, anyway.
But it will have

  • a README section about the importance of immutability. (It already mentions this, but the NameID must be clearly woven into this.)
  • warnings in the UI if a 'mutable' (transient) NameID seems to be used / clear hints not to use one.

----

Original text

There's one glaring thing that hasn't been addressed yet in this module: handling of NameIDs. (We've had a configuration option for 'NameID format' since the first version of this module, but it took me a long time to figure out what it really means and... that we don't actually use it.)

The main purpose of this issue is to invite others to comment if they think we should accommodate the use of NameIDs as 'unique ids'. And to lay out where I should be convinced/informed, in order to do so.

But first, zooming out: what's important here:

A SAML login must be able to unambiguously identify which Drupal account to log in as. So we must have a property which is present in the SAML login response, that is both unique and immutable - and that is stored with the account in Drupal.

If that property is not immutable, that can spell trouble:

  • When it changes on the IdP side, the next SAML login won't log into the same account anymore. Instead, depending on configuration, what can happen is
    • The login gets linked to a different Drupal account.
    • A new Drupal account gets created.
    • Login is denied.
  • When it changes in Drupal: same thing (except for 'getting linked to a different Drupal account', that Drupal account would also need to have its unique property changed accordingly.)

All this implies: if you choose the wrong property, this can lead to potential security issues.

What gets used as the 'unique user id' property:

This module has always required administrators to pass such a property in a SAML attribute in the message - alongside the user name and email (and possibly other attributes).

SAML however has an explicit property to identify the subject of a SAML assertion (e.g. in our case, the user who is logging in). Wouldn't that be intrinsically more suitable to use for this purpose?

...Maybe.

It is a somewhat popular concept, because the saml_sp module does it. Also, we've had a proposed samlauth 2 branch in 2016 that supported it (https://git.drupalcode.org/project/samlauth/-/tree/droath).

But the NameID isn't always intrinsically suitable for this; it is not always immutable.

When/how can we use the NameID:

Besides the NameID, a SAML assertion specifies its format. See e.g. StackOverflow for a bit of explanation.

The format urn:oasis:names:tc:SAML:2.0:nameid-format:persistent would be suitable, and would be easy to implement using the attached patch. I'd implement it if someone gives a good reason for it - the patch IMHO needs to be extended with a re-work of placement/explanation of the NameID Format in the configuration screen.

Some searching around suggests (though I'm not sure) that it isn't used very much.

Note that getting back a persistent NameID is not as easy as just specifying this NameID format in our configuration. That will send the format in the authentication/logout request as <samlp:NameIDPolicy Format=...>, but it does not necessarily result in actually having a NameID format of the same format being returned in the response. (I tried specifying 'permanent' to the SimpleSAMLphp test IdP I use, and it just returns a 'transient' NameID without indicating any error.)

In the case of the 'persistent' format, this figures: the authentication source must be equipped for it. It must have a way to store that persistent identifier somewhere along with the data of the authenticated user.

Then there's the email format: urn:oasis:names:tc:SAML:1.1:nameid-format:emailAddress. Searches suggest that this format is used often. Also, the saml_sp module uses it.

And if it's used often, that's a reason to support it. My fundamental question is this, though: What happens if an IdP user's email address changes?

I see two possibilities (and I'm going to assume that there are IdPs that implement either, or maybe even both):

A) The NameID doesn't actually change along with the user's email.

This means it's just a special form of 'persistent NameID' and we can implement it as such. No problem.

I assume this also means the changing email is (also) sent along in a SAML attribute.

B) The NameID changes along with the email address.

This means two things:

1. The NameID can also be used to populate the user's email field. (This means a bigger code change on our end.)

Also, if this is the case, I imagine such an IdP does not send the email as a separate attribute.

2. If we allow this NameID to be used as our unique but not immutable ID for the user, the "potential security issues" mentioned above apply.

This is why I'm hesitant to implement support for the 'email' NameID. I'm not against supporting it if there are IdPs that only work this way, especially because there are many IdPs where non-administrators can never change any email address. However, this support will have to be accompanied by at least a warning and clarifying changes in the README, so site administrators don't inadvertently set up an insecure situation.

This is analogous to our support of linking logins to existing Drupal users by email address, which has the same potential security implications - and has had a warning in the UI added in 8.x-3.1.

Other formats we can support if someone details them. The consideration is the same - summarizing:

  • Are its values unique per user and guaranteed not to change? Can do. See "persistent".
  • Is that not the case? That needs arguments / documentation then.

Conclusion: this module can implement support for NameIds if there is demand for it. But the demand will need to be clarified, and the implementation will need to be accompanied by documentation. Especially the inclusion of support for 'email NameID' will need to clarify which things an administrator must be aware of, if they want to prevent setting up a potentially insecure situation.

Comments

roderik created an issue. See original summary.

roderik’s picture

Issue summary: View changes
jproctor’s picture

I try not to get too hung up on immutability: even a persistent NameID, which is unique to the combination of IdP and SP, could be externally compromised and need to change. That should be extremely rare, even compared to someone changing username or email address, but it can happen. There is no perfect identification scheme that will never require intervention by the admin, and the expectation of a reasonable amount of work is going to vary widely by site.

For example, the site I run has about 20 users, all coworkers, and most of them are in less than once per week. If something major changes, I will likely hear about it before the next time they try to log in. I know there are very different situations on other sites, but identifying by email address is sufficient for me.

In an ideal world, I think we give the admin a choice of how they want to ask the IdP for identifying information: NameID or the presence of some specific attribute.

And as an IdP administrator, I would hope we could make it part of the SP’s metadata (using <md:NameIDFormat> or <md:RequestedAttribute name="..." isRequired="true"/>, but in a multi-IdP environment, it might need to be in the <AuthnRequest>. Some IdP somewhere is going to be an absolute jerk about what they provide, and we will need to adapt. On the other side, I’ve seen so many rigid but nonstandard requirements by SPs I would also hope we can be flexible about interpreting what we get. This is the approach I’ve tried to take in saml_sp — I inherited the requirement that it must be email address, but it can come from NameID or an attribute, and the attribute can be named mail or urn:oid:0.9.2342.19200300.100.1.3 (I probably should have included urn:mace:dir:attribute-def:mail for completeness but I don’t actually care about SAML 1 unless someone asks for it).

When we get a response, we need a translation between what we get from the IdP and information in the Drupal database. I can understand isolating that from automatically updating user account information, but I also see it as the same problem. That’s what I mean by attribute mapping in the saml_sp issue.

For saml_sp, I had been thinking we create hook_get_saml_user() (or maybe a service would be a better option). Needs more thought. If I understand correctly, ExternalAuth does this through a lookup in a separate table instead of looking directly at fields on the users.

Thinking about that, as well as the simple core plus submodules idea, I can imagine something like this (based on saml_sp just because I know it better):

My hypothetical saml_sp 5.x has the default of using email via NameID. It matches directly with the user field, and it probably should have a big warning that users might change email addresses and this could break things. Internally, the mapping works like it currently does: if email value is not actually in NameID, quietly check for likely attributes from the IdP. Or make that an option with its own security warning, anyway. I haven’t decided whether the base module should provide any other options; I’m leaning towards no.

Hypothetical submodule saml_sp_user_api allows the other NameID formats, brings in the ExternalAuth module, and provides both the hook to override those default behaviors (so you can implement your own if you don’t like what we’re doing) and an implementation of it that uses ExternalAuth. Probably it also provides the hook to update user entity fields.

Hypothetical submodule saml_sp_attribute_map provides an interface to map IdP-supplied attributes to fields in the database. It implements the hooks from saml_sp_user_api, which lets us use those connections either by comparison for identifying a user, or for updating them after a login event.

As I read that before submitting, it might not be logically consistent — it seems to be setting up an attribute map or ExternalAuth, and I would want them to work together. I’m going to try to find some time soon to look at samlauth in more detail and see if I can refine this sketch into something that makes more sense.

roderik’s picture

Issue summary: View changes

I understand your position of not wanting to be hung up on immutability.

I also kind-of understand your plans with add-on modules. I don't have much of an opinion about it or understand all details (leaving aside 'arbitrary attribute mapping', which we also have as an add-on module).

samlauth is in a different position. Above two points in reverse order:

2)
I agree that it is best to give the admin a choice between using NameID and using an attribute. samlauth doesn't need add-on modules to make this work. It literally is a patch the size of the one I've uploaded, plus a checkbox added to the configuration screen.

Plus, specifically for the situation where you have email NameID and no email value inside an attibute: some extra lines where the user email field is populated (but not 'synchronized on login') from the NameID. (Plus some corresponding UI changes that make this clear.) Still easy; I don't mind hardcoding this specific behavior.

Summarizing the issue text: the one reason I'm not applying that right now, is I want to spend more time on re-doing wording on the config screen (and the current position of the "NameID format" element which is in a weird place). So I'm waiting for a 'user story' from someone who wants this implemented. Your 'user story' / perspective helps. Still I'll probably also wait for another actual samlauth user who wants this too, who I can nag about some UI or documentation details in return, that I still have to think up.

(Aside: I'm not going to look it up now, but I believe we already advertise the requested NameIDFormat in both the SP metadata and the AuthnRequest. And I've seen that doesn't necessarily prevent some IdPs from completely ignoring that info without returning an error, and just returning a transient NameID in the AuthnResponse.)

1)
And... samlauth is going to keep being hung up about immutability. I don't want to require all users to be, but I want them to have the option. Our user base includes sites with thousands of users, and (at least) one installation with (a subset of admin) users logging into thousands of Drupal sites through an IdP. I'm pretty sure they care about this, to keep their sanity.

This means I want to tell people "please know this before configuring anything. This is why you should care about immutability. [reasons.] If you don't: that's fine (and we know there are many examples of people who don't - e.g. email NameID) but at least now you're aware."

That's where externalauth (a.k.a. an authmap entry a.k.a. "a data row that is not editable by non-admins which contains the DrupalUID - SamlUniqueID mapping, that is saved once on first SAML login and not usually modified afterwards") helps. We have this baseline - and if people don't actually want to think about it: that's fine.

If I understand things correctly, SAML SP arguably now has this 'authmap entry' concept conditionally: if you use an email NameID and you look up the user through a user field... then you have it - it's just stored directly in the user object, but it's in the same way supposed to be saved when creating the user, and not/hardly ever supposed to be changed. If you use an email NameID by which you look up the user directly by email... then you skip the concept of an authmap entry.

samlauth doesn't need to change internally to support 'email NameID + no email in an attribute', besides the "Plus" mentioned above... it just has the side effect that the email value is saved in two places: the authmap entry and the user object. Which is fine; there's a benefit to keeping just one internal mechanism.

roderik’s picture

By the way, do you care about bringing any of this back to D7? Because

provides both the hook to override those default behaviors (so you can implement your own if you don’t like what we’re doing)

samlauth doesn't (need to) provide hooks for this. If people don't like what we're doing, they can override the samlauth.saml service with a class that e.g. rips the usage of authmap out.

(Granted, that's going to take them more effort than is ideal... because acs() consists of a few unfortunately-too-long spaghetti strands :-) But I do keep it in mind, to hopefully next time divide it up into some smaller protected methods that someone could override.)

gpotter’s picture

I've recently come across the need to use NameID as the unique identifier, so I'll throw in my vote to include it as an option in the admin configuration. I had to go back to another client to see if they can include the same identifier in the SAML attributes. The patch roderik created, along with a checkbox to toggle it, and some disclaimer would be nice.

roderik’s picture

@gpotter out of curiosity, what identifier would be in the NameID, in your case?

Is it email / urn:oasis:names:tc:SAML:1.1:nameid-format:emailAddress, or some other 'permanent' ID?

And in your case, is it a 'truly permanent / immutable' value (e.g. there exist IdP situations where people can absolutely never change their email) or "it could occasionally change but it's good enough for me"?

--

Disregarding the issue of time investment, there's no real reason to delay this. Things just needed to ripen a bit in my head and the conversation with jproctor gave relevant info and allowed to focus my thoughts (thank you @jproctor).

But any description of a practical situation still allows me to better know how to e.g. phrase documentation.

(Random example: I recently talked to someone who has a SalesForce ID being passed in their attributes, but is still just fine with using email as their "unique ID" because it is sometimes more unique than the unique ID, in practice. Since they sometimes have people leaving and re-joining their subsidiary organizations who get their old e-mail back, even though they get a new employee number... if they use e-mail as unique ID, those users can then log into the Drupal site with their old account without any issues ¯\_(ツ)_/¯ )

gpotter’s picture

@roderik, in our case it was an "employee ID". It was truly immutable from what I know of the client that was passing it. I'm pretty new to SSO some I'm not quite sure how common this situation is or if it was in "good standards" to pass the employee id as the NameID field.

roderik’s picture

Thank you. Conceptually (in my head) it makes complete sense to me to pass something like an Employee ID in the NameID. It feels (to me) like that's exactly what it was made for.

I just have never seen it 'in the wild' and haven't seen an explicit request to include it for this reason, until your request. I can't say whether this is because noone does this, because of my lack of experience with diverse setups, or whatever other reason.

ghost of drupal past’s picture

StatusFileSize
new3.27 KB

In our case the nameid contains the email and there are no attributes whatsoever.

mingsong’s picture

The NameID formats supported by php-saml/SAML2

 // NameID Formats
    const NAMEID_EMAIL_ADDRESS = 'urn:oasis:names:tc:SAML:1.1:nameid-format:emailAddress';
    const NAMEID_X509_SUBJECT_NAME = 'urn:oasis:names:tc:SAML:1.1:nameid-format:X509SubjectName';
    const NAMEID_WINDOWS_DOMAIN_QUALIFIED_NAME = 'urn:oasis:names:tc:SAML:1.1:nameid-format:WindowsDomainQualifiedName';
    const NAMEID_UNSPECIFIED = 'urn:oasis:names:tc:SAML:1.1:nameid-format:unspecified';
    const NAMEID_KERBEROS   = 'urn:oasis:names:tc:SAML:2.0:nameid-format:kerberos';
    const NAMEID_ENTITY     = 'urn:oasis:names:tc:SAML:2.0:nameid-format:entity';
    const NAMEID_TRANSIENT  = 'urn:oasis:names:tc:SAML:2.0:nameid-format:transient';
    const NAMEID_PERSISTENT = 'urn:oasis:names:tc:SAML:2.0:nameid-format:persistent';
    const NAMEID_ENCRYPTED = 'urn:oasis:names:tc:SAML:2.0:nameid-format:encrypted';

Reference:
https://github.com/SAML-Toolkits/php-saml/blob/76b784ef51b8dfcf2619b8d4a...

mingsong’s picture

As a contribute module, the more flexibility provided, the more popular this module can be. So an universal solution should can handle all formats, given that I think we should treat the NameID from IdP as an ID of a SSO session rather than an entity ID of a user.
Even If the format is NAMEID_PERSISTENT, in this case, the user entity ID + service ID is used as the session ID of SSO. Regardless what format is used, the NameID can always be treated as the ID of a SSO session.

In the Drupal side, once we have the connection between the NameID from an IdP and the user ID in Drupal, we can figure out which user a NameID represents as.

Therefore, I think we can consider a mapping data table for the NameID and uid in Drupal.

For example

ID NameID UID Formats Timestamp IP
1 _a4a9e072102724495db8c956176f141912afb544aa 10 NAMEID_TRANSIENT 1691973372 127.0.0.1
roderik’s picture

Title: [RFC] NameID support » NameID support
Issue summary: View changes

@chx / #10:

Thank you for the patch. I had some draft work but was overcomplicating things; indeed just adding the NameID into the return value of getAttributes() is the right thing to do, and I'll add this.

Caveats:

  • I'll replace 'NameID' with '!nameid' because I consider it more likely that someone ever creates a regular attribute named NameID than !nameid. So anyone using this patch will need to change their *_attribute config values.
  • The reason I didn't jump on this immediately is that it's adding another footgun to the module and I wanted to take the time to add good language to the README / config screen. I think I have time and ideas to do this now.
  • Something in me finds extending getAttributes() icky, but that doesn't matter because it's already icky anyway. (The same part of me doesn't like to have both regular and 'friendly' attribute names duplicated in the same return value.) This should eventually be fixed by #3211529: Refactor SamlService::getAttributes() to a value object..

@Mingsong:

I don't think I agree with that solution and I think that maybe there's a misunderstanding here... but I'm unable to pinpoint the exact misunderstanding. So I don't know exactly how to approach my reply...

I think we should treat the NameID from IdP as an ID of a SSO session rather than an entity ID of a user.

What is the conceptual difference between treating something as an SSO session vs. a user? if we are not using the SSO session / NameID / ... to look up a Drupal user and log them in, then... what use cases are there for having that session?

In the Drupal side, once we have the connection between the NameID from an IdP and the user ID in Drupal, we can figure out which user a NameID represents as.

I must be missing something here. I read "once we have the user, we can figure out the user". Well uhm...

Therefore, I think we can consider a mapping data table for the NameID and uid in Drupal.

We already have a mapping data table for a 'unique id from the IdP' and the uid in Drupal. (The authmap table.)

What we see as the 'unique id from the IdP', is configurable. At the moment it must always be an attribute. This issue is about extending the possible configurations to also include the NameID.

So conceptually this issue is pretty simple. It is, however, complicated by the fact that we can't use transient NameIDs (AFAICT), just like I was arguing against using 'mutable' attributes. This is a question of proper clear documentation.

You seem to be saying either or both of the following two things:

  • We have a use case for an "SSO session" that is not necessarily tied to a user ID? (See above.) In that case, we need better specifications of that use case - and that likely needs to evolve into a new separate issue.
  • We can use any format of NameID to derive the user. I do not think so. I do not see how we could derive a Drupal user from a transient NameID. (Unless you mean through the feature called 'user linking', but 1) I am leaning toward discouraging that, lately; 2) if we can derive the user already, then I don't really see why/when we need to store the transient NameID. However... I don't think you actually mean this.)
mingsong’s picture

@Roderik, let me explain my thought by examples.

What is the conceptual difference between treating something as an SSO session vs. a user? if we are not using the SSO session / NameID / ... to look up a Drupal user and log them in, then... what use cases are there for having that session?

In a user case, the IdP won't expose any meaningful information of a user identity, such as employee ID. So that the NameID returned from an IdP is a temporary random string to distinguish between SSO sessions. The NameID format is "NAMEID_TRANSIENT". In the SP (Drupal) side, for log/audit purpose, need some information that can tell the connection between NameID (temporary) and Drupal user ID. The currently workaround is that the externalauth module will log the event of a user login via this module. Something like this
Location: http://yourdomain/saml/acs
Message: External login of user user1@example.com
User: user1@example.com
If the 'Log incoming SAML messages' setting is enabled, then there should be a samlauth log that logged the NameID received from the IdP. The way to connect those information is the timestamp of those logs. In a high-traffic, the timestamp could be confusing and not reliable.

We already have a mapping data table for a 'unique id from the IdP' and the uid in Drupal. (The authmap table.)

Again, this data table comes from externalauth module and there is no NameID stored in this table. Why we need NameID? This is back to the original question of this discussion.

roderik’s picture

In the SP (Drupal) side, for log/audit purpose, need some information that can tell the connection between NameID (temporary) and Drupal user ID.

Yes, exactly. (Not for log/audit purpose, but we simply need some information to determine which user to log in.) And in case of a transient NameID... that needed information is not there. So we can't tell the connection.

The currently workaround is that the externalauth module will log the event of a user login via this module.

Uhm, what?

I have to guess at your assumptions... but you seem to be implying that the externalauth module is (magically?) determining which user to log in, based on the information in the SAML response?

That's not true. samlauth determines which user to log in. It then calls externalauth code; externalauth is basically used as a generic utility to store the connection between "unique ID" and user.

Again, this data table comes from externalauth module and there is no NameID stored in this table

The table comes from externalauth module. The data stored in the table is provided by samlauth module.

After this issue is fixed, NameIDs will be stored in this table (if the samlauth configuration specifies this; see e.g. chx' patch).

mingsong’s picture

StatusFileSize
new79.45 KB

I have to guess at your assumptions... but you seem to be implying that the externalauth module is (magically?) determining which user to log in, based on the information in the SAML response?

That is not an assumption. If you have ever checked the Drupal logs, you will see the log I mentioned. I didn't mean which module is in charge of the login process at what stage, I just said there will be Drupal log to record that event.

Log

By the way, here is the line that the External Auth module signs in a user.

https://git.drupalcode.org/project/externalauth/-/blob/2.0.3/src/Externa...

Yes, External Authentication module is an API module, which means another module that calls the API is in charge of the business logic.

In this case, samlauth module calls that API function at following lines
https://git.drupalcode.org/project/samlauth/-/blob/8.x-3.9/src/SamlServi...
https://git.drupalcode.org/project/samlauth/-/blob/8.x-3.9/src/SamlServi...

As you can see from the codes above, there is no parameter to tell External Authentication module what NameID is that the Drupal user mapping to.

The table comes from externalauth module. The data stored in the table is provided by samlauth module.

So I don't know how can you provide the NameID to externalauth module.

roderik’s picture

This conversation has gotten surreal.

So I don't know how can you provide the NameID to externalauth module.

You provide the NameID in the $unique_id parameter.

mingsong’s picture

You provide the NameID in the $unique_id parameter.

The $unique_id is assigned by following line
https://git.drupalcode.org/project/samlauth/-/blob/8.x-3.9/src/SamlServi...

which is specified by the configuration of 'unique_id_attribute'.

More specific, by the configuration form.

https://git.drupalcode.org/project/samlauth/-/blob/8.x-3.9/src/Form/Saml...

A SAML attribute whose value is unique per user and does not change over time. Its value is stored by Drupal and linked to the Drupal user that is logged in.

It could be the NameID, If the it won't be changed ever.

nicklasmf’s picture

Sorry for not having read the thread.

I am in a situation where I need to implement SAML for the Danish Government (NemLog-in).
The user can login with different accounts (private, employee etc.), and they return with different attributes according to their account, thus no attribute is common except the NameID (that I know of).

I used your patch (2022-01-06/samlauth_nameid.patch) but I end up in a situation where this is longer than the allowed length of the name field.
SQLSTATE[22001]: String data, right truncated: 1406 Data too long for column 'name' at row 1: INSERT INTO "users_field_data" with NameID https://data.gov.dk/model/core/eid/professional/uuid/d9a3db11-a100-10d6-9c3e-ac3da2000dba.

I am thinking about expanding the database data type but just want to let you know.
Maybe we could consider add an event that allows to alter the field further before using it? Let's say the field is conditional?

roderik’s picture

1)

The SQL error you are seeing comes from the 'name' field. I see that's 60 characters.

So I am assuming you are setting "NameID" in the "User name attribute" field.

There is an event to shorten the user name if you want - or do anything else, like construct intricate logic to build the user name out of various attributes; the USER_SYNC event. (I'm linking to HEAD where the comments were changed a bit recently - but the event code hasn't changed.)

(You maybe want to only change the user name when the account is new. Note whatever you do here is not restricted by the module's Synchronize user name on every login" option. If you shorten the new user's name and it happens to clash with another already existing user, then you should get a horrible exception without anything being saved yet -- which I guess is as good as anything.)

2)

I am also assuming you are setting "NameID" in the "Unique ID attribute" field. This is stored in the authmap.authname field, which is 128 characters.

That can be altered on the ExternalAuthEvents::AUTHMAP_ALTER event -- only when newly registering a user. Please have a look at the ExternalAuthAuthmapAlterEvent class and/or its caller ExternalAuth::register(). This is outside the samlauth module (though it's called by samlauth).

---

And since you haven't read the thread, I'll repeat something:

I'll replace 'NameID' with '!nameid' because I consider it more likely that someone ever creates a regular attribute named NameID than !nameid. So anyone using this patch will need to change their *_attribute config values.

(Unfortunately several issues came in between me finishing this. It's still on the top of my volunteer-coding list.)

  • roderik committed 23d45bbf on 8.x-3.x
    Issue #3211380 by chx, roderik, gpotter: Support for NameID as Unique ID...
roderik’s picture

Status: Active » Fixed

Well that took longer than I hoped. Now committed on top of a revised/split edit screen.

(A 3.10 release will hopefully follow soonish but there are now some changes since 3.9 that I need to take a good look at before really doing a release.)

Wanted to credit jproctor in the commit message, used wrong name, will try to set credit now, while setting the issue fixed.

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.