Problem/Motivation
The entity reference form element assumes that either numeric "\d" or string ids "\w" are used.
(See \Drupal\Core\Entity\Element\EntityAutocomplete::extractEntityIdFromAutocompleteInput)
This does not cover the use case where for instance URIs are used as ids (specifically entities in a triple store database), or any other ID scheme where special characters are used.
Proposed resolution
The reference element should not make assumptions on the format of the referenced ID. For the moment we should still disallow the usage of parentheses in the ID otherwise the field widget will not be able to extract the ID from the value. This case is handled in #2520416: Autocomplete node ID numbers are confusing, only display when required.
Remaining tasks
Review + commit.
User interface changes
None.
API changes
None.
Data model changes
None.
Original report by sandervd:
The entity reference form element assumes that either numeric "\d" or string ids "\w" are used.
(See \Drupal\Core\Entity\Element\EntityAutocomplete::extractEntityIdFromAutocompleteInput)
This does not cover the use case where for instance URIs are used as ids (specifically entities in a triple store database).
Attached patch adds support for URIs, but I think the whole way of determining the allowed format should be reconsidered.
The reference element should not make assumptions on the format of the referenced id, this could be delegated to the referenced entity class.
Comment | File | Size | Author |
---|---|---|---|
#21 | interdiff.txt | 750 bytes | claudiu.cristea |
#21 | 2756401-21.patch | 5.04 KB | claudiu.cristea |
#11 | 2756401-11-test-only.patch | 3.97 KB | pfrenssen |
Comments
Comment #2
sandervd CreditAttribution: sandervd commentedComment #3
sandervd CreditAttribution: sandervd commentedComment #4
sandervd CreditAttribution: sandervd commentedComment #6
gielfeldt CreditAttribution: gielfeldt commentedI agree that it might be too strict. Menu names as well are susceptible to this error, since they can use hyphens instead underscores in their machine names. Here's a patch, which makes the extraction a lot less strict.
Comment #7
dawehnerYeah, let's be honest about the tests :)
Comment #8
pfrenssenAs per Allowed changes during the Drupal 8 release cycle, since this is a non-disruptive bug fix this may be filed against 8.1.x.
The patch from #6 relaxes the validation of IDs, but still rejects IDs that contain parentheses. This is intended to make sure the IDs can still be recognized by the current implementation of the entity reference field widget which puts the ID between parenthesis. This is not ideal, since it is legal for an entity ID to contain parentheses. This is out of scope for this issue though, this is being handled in #2520416: Autocomplete node ID numbers are confusing, only display when required. Let's add a line of documentation explaining why we are doing this for now, and a link to the issue.
I fully agree. I would love to have a method like
EntityInterface::validateId()
that would allow the entity itself to validate its ID. But I think it would be better to tackle this in a separate issue. In here we are dealing with a simple bug that prevents any IDs with special characters from being selected. I would suggest to fix the bug here and provide a test that prevents it from regressing in the future. Then in a followup we can do a feature request to have a method on theEntityInterface
to do entity specific ID validation.Comment #9
pfrenssenGoing to work on a test.
Comment #10
pfrenssenIt seems to me that this also matches the first part of the
if () {} elseif () {}
statement which specifically checks for a number. We could get rid of that check.Comment #11
pfrenssenAdded test and addressed my remark from comment #10.
Comment #13
pfrenssenSetting back to NR, the test-only patch failed as expected.
Comment #14
pfrenssenComment #15
dawehnerWhy do we need the change of the regex at the end, aka
\s*
, this wasn't needed before.Nitpick: you could add the "," here, while adding a new line. As its a comment this could be also fixed on the commit :P
Comment #17
claudiu.cristeaYes, trailing spaces should not be present after the closing parenthesis.
Comment #18
dawehnerThank you @claudiu.cristea!
Comment #19
alexpottIt is nice now that we only do a single regular expression to extract this. Looking at the expression can't it be:
/.+\s\(([^\)]+)\)/
? We shouldn't need to check for more(
and the checking for the end of string is new and not done before.Comment #20
claudiu.cristea@alexpott, I tried your pattern with this example and works, it captures the correct parenthesis:
EDIT: Intentionally I used parenthesis in the label.
Comment #21
claudiu.cristeaFixed regexp according to #19, #20.
Comment #22
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedAwesome :)
Comment #23
alexpottThe new pattern will have a problem with spaces and brackets in the ID - something like
abc (xyz) (http://ex ( ample.com)
but this is not a new problem. I think that we might need to consider how we handle this in #2520416: Autocomplete node ID numbers are confusing, only display when required. I think the idea of doing something(ID:whatever)
as the identifier is probably a good one and the regex should match the end of the string because we know this should come at the end of the string.What I don't really get is why we're fixing this in a follow up - and not here.
Comment #24
alexpottSomething like: https://3v4l.org/ddpmA
Comment #25
amateescu CreditAttribution: amateescu for Pfizer, Inc. commented@alexpott, you mean the pattern suggested by you in #19 or the one from the previous patches?
If the follow-up you mention is #2520416: Autocomplete node ID numbers are confusing, only display when required, that's not really fixable with our current autocomplete JS implementation (i.e. jQuery UI's autocomplete), we would need something like #2346973: Improve usability, accessibility, and scalability of long select lists in order to do that.
Comment #27
claudiu.cristeaComment #30
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedComment #31
alexpott@claudiu.cristea & @amateescu - if we're fixing the regex to cope with non-numeric strings I don't understand why we don't go for something like https://3v4l.org/ddpmA - since that is robust enough to deal with both spaces and brackets in a string ID. I don't understand why we can't add
(ID:
instead of just(
even with our current jQuery UI based solution?Comment #32
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedIsn't that an API change?
Comment #33
claudiu.cristea@alexpott, do we know if there are no custom/contrib modules that extended
\Drupal\Core\Entity\Element\EntityAutocomplete::getEntityLabels()
? Or other JS tweaks that expect only the ID into parenthesis (without 'ID:' prefix)? I'm not sure your proposal assures the BC. I think it's possible to break functionality by changing that. Also, there's a dedicated issue for that #2520416: Autocomplete node ID numbers are confusing, only display when required.I'm moving this back to RTBC to get @alexpott's feedback.
EDIT: Cross-posted with @amateescu :)
Comment #34
alexpottOkay well I guess we're improving things here and fixing a bug in a way that has no BC implications. My initial reaction was the controllers are not API but this is not a controller - this is a form element. I ponder about whether it is API but we can consider that in #2520416: Autocomplete node ID numbers are confusing, only display when required or another followup.
Committed and pushed 729340a to 8.3.x and 5fc86bc to 8.2.x. Thanks!