Git log entries such as Merge branch '2060691-migrate-mentor-field' into 7.x-1.x at https://www.drupal.org/node/651778/commits or any other list of commits are hard to read.
Is is possible either to render them properly or, to reduce confusion, at least disable linking to issues when the hash sign is immediately preceeded by an ampersand? For example, don't ever link &#039 in commit messages (or anywhere).

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

drumm’s picture

Project: Bluecheese » Version Control API
Component: Code » API module
Issue tags: +drupal.org hitlist

Yep, I've seen this too. These Views are made by the VersionControl module.

cburschka’s picture

Component: API module » Commit Log
Status: Active » Needs review
Related issues: +#2474005: commitlog links "'" entity to node/039.
FileSize
820 bytes

I've created a patch for this at #2474005: commitlog links "'" entity to node/039..

Moving it over here:

Any single quotes that appear in commit messages are escaped to ' in the commitlog - fine so far.

Unfortunately, whatever filter renders issue links then inserts a link to make it &<a href="/node/039">#039</a>;. Oops. This also breaks the apostrophe entity.

I'm not sure what the best approach is, here - clearly the markup does need to be escaped before the links are added, and check_plain() uses ENT_QUOTES.

Adding (?<!&) to the regex pattern would make it not match #\d* preceded by an ampersand.

Alternatively, adding ENT_HTML5 to core's check_plain() (which was added in PHP 5.4, and might make sense) would avoid the numerical entity and make it print &apos;.

I haven't taken the time to set up a test environment for this module, but the regex seems to do what it should.

php > var_dump(preg_replace('/#(\d+)\b/i', '<a href="node/\\1">#\\1</a>', htmlspecialchars("This isn't a commit message for issue #2474005.", ENT_QUOTES)));
string(102) "This isn&<a href="node/039">#039</a>;t a commit message for issue <a href="node/2474005">#2474005</a>."
php > var_dump(preg_replace('/(?<!&)#(\d+)\b/i', '<a href="node/\\1">#\\1</a>', htmlspecialchars("This isn't a commit message for issue #2474005.", ENT_QUOTES)));
string(79) "This isn&#039;t a commit message for issue <a href="node/2474005">#2474005</a>."

marvil07’s picture

@cburschka thanks for the patch and the effort to report it here.

return preg_replace('/(?<!&)#(\d+)\b/i', $replacement, $message);

I have not tried the patch yet, but any reason to not use a boundary meta-character instead?
i.e. '/\b#(\d+)\b/i'

cburschka’s picture

Hmm... that would definitely exclude entities (since & and # are both non-word) but isn't whitespace also a non-word character? If I understand the \b assertion right, this would require the # to be preceded by [a-zA-Z0-9_]...

Eg. "Issue #123" isn't matched by \b#\d+\b; only "Issue#123" is.

jacob.embree’s picture

Status: Needs review » Reviewed & tested by the community

I tested #2. It fixed the apostrophe problem.

drumm’s picture

Confirming this looks good on a Drupal.org dev site and looks safe.

  • marvil07 committed 09d105d on 7.x-1.x authored by cburschka
    Issue #2392847 by cburschka, jacob.embree, drumm: HTML entities...
marvil07’s picture

Status: Reviewed & tested by the community » Fixed

@jacob.embree, @drumm: Thanks for the reviews.

The patch is now added to 7.x-1.x, I guess it is time to plan for a new release.

I added a follow-up to add related tests on #2915026: Add tests for versioncontrol_handler_field_operation_message with a not-yet-working patch, help there is appreciated.

drumm’s picture

Thanks, this has been deployed to Drupal.org.

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.