Problem/Motivation

The entity reference form element assumes that either numeric "\d" or string ids "\w" are used.
(See \Drupal\Core\Entity\Element\EntityAutocomplete::extractEntityIdFromAutocompleteInput)

This does not cover the use case where for instance URIs are used as ids (specifically entities in a triple store database), or any other ID scheme where special characters are used.

Proposed resolution

The reference element should not make assumptions on the format of the referenced ID. For the moment we should still disallow the usage of parentheses in the ID otherwise the field widget will not be able to extract the ID from the value. This case is handled in #2520416: Autocomplete node ID numbers are confusing, only display when required.

Remaining tasks

Review + commit.

User interface changes

None.

API changes

None.

Data model changes

None.

Original report by sandervd:

The entity reference form element assumes that either numeric "\d" or string ids "\w" are used.
(See \Drupal\Core\Entity\Element\EntityAutocomplete::extractEntityIdFromAutocompleteInput)

This does not cover the use case where for instance URIs are used as ids (specifically entities in a triple store database).
Attached patch adds support for URIs, but I think the whole way of determining the allowed format should be reconsidered.

The reference element should not make assumptions on the format of the referenced id, this could be delegated to the referenced entity class.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

sandervd created an issue. See original summary.

sandervd’s picture

sandervd’s picture

Status: Active » Needs review
sandervd’s picture

Issue tags: +Needs tests

Version: 8.2.x-dev » 8.3.x-dev

Drupal 8.2.0-beta1 was released on August 3, 2016, which means new developments and disruptive changes should now be targeted against the 8.3.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

gielfeldt’s picture

I agree that it might be too strict. Menu names as well are susceptible to this error, since they can use hyphens instead underscores in their machine names. Here's a patch, which makes the extraction a lot less strict.

dawehner’s picture

Status: Needs review » Needs work

Yeah, let's be honest about the tests :)

pfrenssen’s picture

Version: 8.3.x-dev » 8.1.x-dev
Issue summary: View changes
Related issues: +#2520416: Autocomplete node ID numbers are confusing, only display when required

As per Allowed changes during the Drupal 8 release cycle, since this is a non-disruptive bug fix this may be filed against 8.1.x.

The patch from #6 relaxes the validation of IDs, but still rejects IDs that contain parentheses. This is intended to make sure the IDs can still be recognized by the current implementation of the entity reference field widget which puts the ID between parenthesis. This is not ideal, since it is legal for an entity ID to contain parentheses. This is out of scope for this issue though, this is being handled in #2520416: Autocomplete node ID numbers are confusing, only display when required. Let's add a line of documentation explaining why we are doing this for now, and a link to the issue.

Attached patch adds support for URIs, but I think the whole way of determining the allowed format should be reconsidered.

The reference element should not make assumptions on the format of the referenced id, this could be delegated to the referenced entity class.

I fully agree. I would love to have a method like EntityInterface::validateId() that would allow the entity itself to validate its ID. But I think it would be better to tackle this in a separate issue. In here we are dealing with a simple bug that prevents any IDs with special characters from being selected. I would suggest to fix the bug here and provide a test that prevents it from regressing in the future. Then in a followup we can do a feature request to have a method on the EntityInterface to do entity specific ID validation.

pfrenssen’s picture

Assigned: Unassigned » pfrenssen

Going to work on a test.

pfrenssen’s picture

+++ b/core/lib/Drupal/Core/Entity/Element/EntityAutocomplete.php
@@ -354,7 +354,7 @@ public static function extractEntityIdFromAutocompleteInput($input) {
-    elseif (preg_match("/.+\s\(([\w.]+)\)/", $input, $matches)) {
+    elseif (preg_match("/.+\s\(([^\(\)]+)\)\s*$/", $input, $matches)) {

It seems to me that this also matches the first part of the if () {} elseif () {} statement which specifically checks for a number. We could get rid of that check.

pfrenssen’s picture

Added test and addressed my remark from comment #10.

Status: Needs review » Needs work

The last submitted patch, 11: 2756401-11-test-only.patch, failed testing.

pfrenssen’s picture

Status: Needs work » Needs review

Setting back to NR, the test-only patch failed as expected.

pfrenssen’s picture

Issue summary: View changes
dawehner’s picture

  1. +++ b/core/lib/Drupal/Core/Entity/Element/EntityAutocomplete.php
    @@ -348,13 +348,10 @@ public static function getEntityLabels(array $entities) {
    +    if (preg_match("/.+\s\(([^\(\)]+)\)\s*$/", $input, $matches)) {
    

    Why do we need the change of the regex at the end, aka \s*, this wasn't needed before.

  2. +++ b/core/modules/system/tests/modules/entity_test/src/Entity/EntityTestStringId.php
    @@ -25,7 +25,8 @@
    + *     "label" = "name"
    

    Nitpick: you could add the "," here, while adding a new line. As its a comment this could be also fixed on the commit :P

Version: 8.1.x-dev » 8.2.x-dev

Drupal 8.1.9 was released on September 7 and is the final bugfix release for the Drupal 8.1.x series. Drupal 8.1.x will not receive any further development aside from security fixes. Drupal 8.2.0-rc1 is now available and sites should prepare to upgrade to 8.2.0.

Bug reports should be targeted against the 8.2.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.3.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

claudiu.cristea’s picture

FileSize
5.04 KB
1.35 KB

Yes, trailing spaces should not be present after the closing parenthesis.

dawehner’s picture

Status: Needs review » Reviewed & tested by the community

Thank you @claudiu.cristea!

alexpott’s picture

Status: Reviewed & tested by the community » Needs review
+++ b/core/lib/Drupal/Core/Entity/Element/EntityAutocomplete.php
@@ -348,13 +348,10 @@ public static function getEntityLabels(array $entities) {
-    // Take "label (entity id)', match the ID from parenthesis when it's a
-    // number.
-    if (preg_match("/.+\s\((\d+)\)/", $input, $matches)) {
-      $match = $matches[1];
-    }
-    // Match the ID when it's a string (e.g. for config entity types).
-    elseif (preg_match("/.+\s\(([\w.]+)\)/", $input, $matches)) {
+    // Take "label (entity id)', match the ID from inside the parentheses.
+    // @todo Add support for entities containing parentheses in their ID.
+    // @see https://www.drupal.org/node/2520416
+    if (preg_match("/.+\s\(([^\(\)]+)\)$/", $input, $matches)) {

It is nice now that we only do a single regular expression to extract this. Looking at the expression can't it be:
/.+\s\(([^\)]+)\)/? We shouldn't need to check for more ( and the checking for the end of string is new and not done before.

claudiu.cristea’s picture

@alexpott, I tried your pattern with this example and works, it captures the correct parenthesis:

$ php -r "preg_match('/.+\s\(([^\)]+)\)/', 'abc (xyz) (http://example.com)', \$match); print_r(\$match);"
Array
(
    [0] => abc (xyz) (http://example.com)
    [1] => http://example.com
)

EDIT: Intentionally I used parenthesis in the label.

claudiu.cristea’s picture

FileSize
5.04 KB
750 bytes

Fixed regexp according to #19, #20.

amateescu’s picture

Status: Needs review » Reviewed & tested by the community

Awesome :)

alexpott’s picture

The new pattern will have a problem with spaces and brackets in the ID - something like abc (xyz) (http://ex ( ample.com) but this is not a new problem. I think that we might need to consider how we handle this in #2520416: Autocomplete node ID numbers are confusing, only display when required. I think the idea of doing something (ID:whatever) as the identifier is probably a good one and the regex should match the end of the string because we know this should come at the end of the string.

What I don't really get is why we're fixing this in a follow up - and not here.

alexpott’s picture

Something like: https://3v4l.org/ddpmA

amateescu’s picture

The new pattern will have a problem with spaces and brackets in the ID

@alexpott, you mean the pattern suggested by you in #19 or the one from the previous patches?

What I don't really get is why we're fixing this in a follow up - and not here.

If the follow-up you mention is #2520416: Autocomplete node ID numbers are confusing, only display when required, that's not really fixable with our current autocomplete JS implementation (i.e. jQuery UI's autocomplete), we would need something like #2346973: Improve usability, accessibility, and scalability of long select lists in order to do that.

Status: Reviewed & tested by the community » Needs work

The last submitted patch, 21: 2756401-21.patch, failed testing.

claudiu.cristea’s picture

Status: Needs work » Reviewed & tested by the community

Status: Reviewed & tested by the community » Needs work

The last submitted patch, 21: 2756401-21.patch, failed testing.

The last submitted patch, 21: 2756401-21.patch, failed testing.

amateescu’s picture

Status: Needs work » Reviewed & tested by the community
alexpott’s picture

Status: Reviewed & tested by the community » Needs work

@claudiu.cristea & @amateescu - if we're fixing the regex to cope with non-numeric strings I don't understand why we don't go for something like https://3v4l.org/ddpmA - since that is robust enough to deal with both spaces and brackets in a string ID. I don't understand why we can't add (ID: instead of just ( even with our current jQuery UI based solution?

amateescu’s picture

I don't understand why we can't add (ID: instead of just ( even with our current jQuery UI based solution?

Isn't that an API change?

claudiu.cristea’s picture

Status: Needs work » Reviewed & tested by the community

@alexpott, do we know if there are no custom/contrib modules that extended \Drupal\Core\Entity\Element\EntityAutocomplete::getEntityLabels()? Or other JS tweaks that expect only the ID into parenthesis (without 'ID:' prefix)? I'm not sure your proposal assures the BC. I think it's possible to break functionality by changing that. Also, there's a dedicated issue for that #2520416: Autocomplete node ID numbers are confusing, only display when required.

I'm moving this back to RTBC to get @alexpott's feedback.

EDIT: Cross-posted with @amateescu :)

alexpott’s picture

Status: Reviewed & tested by the community » Fixed

Okay well I guess we're improving things here and fixing a bug in a way that has no BC implications. My initial reaction was the controllers are not API but this is not a controller - this is a form element. I ponder about whether it is API but we can consider that in #2520416: Autocomplete node ID numbers are confusing, only display when required or another followup.

Committed and pushed 729340a to 8.3.x and 5fc86bc to 8.2.x. Thanks!

  • alexpott committed 729340a on 8.3.x
    Issue #2756401 by claudiu.cristea, pfrenssen, sandervd, gielfeldt,...

  • alexpott committed 5fc86bc on 8.2.x
    Issue #2756401 by claudiu.cristea, pfrenssen, sandervd, gielfeldt,...

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.