It seems that pathologic encodes HTML entities found in local paths.

http://not-local-site.com/questionmark?apersand&pipe|lbracket]rbracket[colon:

renders correctly as

http://not-local-site.com/questionmark?apersand&pipe|lbracket]rbracket[colon:

However, domains specified in pathologic's filter settings get rewritten (and break)

http://local-site.com/questionmark?apersand&pipe|lbracket]rbracket[colon:

renders as

http://local-site.com/questionmark?apersand&%3Bpipe%7Clbracket%5Drbracket_colon%3A



Glancing at the code, I thought perhaps pathauto settings might have an impact on punctuation. I set ampersands to not be replaced, but the result is the same.

This is the URL I'm hoping to render out on my site: <a href="/civicrm/contribute/transact?reset=1&amp;id=1">CiviCRM DONATE PAGE</a>

CommentFileSizeAuthor
#2 pathologic-wysiwyg-links-1303782-2.patch583 bytesdealancer
Support from Acquia helps fund testing for Drupal Acquia logo

Comments

dealancer’s picture

Title: WYSIWYG modules integration » punctuation rewritten in local URLs
Priority: Critical » Normal
Status: Needs review » Active

Follow!

This happens also for 6 version. The problem occurs when FCK editor module (or any WYSIWYG) is enabled.

So what happens:

a) CKEditor replaces & to &amp;
b) For some reason pathologic replaces &amp to &, hence we get &;
c) Then in CKeditor ; is replaced to %3B, so we get &%3B in links.
dealancer’s picture

Priority: Normal » Major
Status: Active » Needs review
FileSize
583 bytes

Here is a patch for D6 version of the module, but I think it will be very very simmilar for D7 version of Pathologic.

I hope it would work good and I hope there is no links that contains encoded html.

dealancer’s picture

Title: punctuation rewritten in local URLs » WYSIWYG modules integration
Priority: Major » Critical
Garrett Albright’s picture

Title: punctuation rewritten in local URLs » WYSIWYG modules integration
Version: 7.x-1.3 » 7.x-1.4
Status: Active » Needs review

a) CKEditor replaces & to &

So CKEditor is encoding entities in paths as HTML entities, which it shouldn't be doing. Doesn't this completely screw up your links whether you're using Pathologic or not?

Arg. Working with Pathologic would be so much more fun if I weren't constantly being demanded to work around WYSIWYG editors doing stupid crap.

dealancer’s picture

> So CKEditor is encoding entities in paths as HTML entities, which it shouldn't be doing.

I think that not only CKeditor does this.

> Doesn't this completely screw up your links whether you're using Pathologic or not?

No, other links works ok. html entities are decoded by browser. Check your own:

<a href="http://example.com?a=b&amp;c=d">test link</a>

Only links processed by both CKedior and Pathologic are damaged.

> Arg. Working with Pathologic would be so much more fun if I weren't constantly being demanded to work around WYSIWYG editors doing stupid crap.

Please look on the patch, it contains only one line of code:

+  // Process links modified by WYSIWYG modules
+  $matches = array_map('decode_entities', $matches);

Here is a prove that this patch works 100%. There is 0 probability that there can be html entities in links added by user manually, anyway all of them will be decoded by browser. And there is a big possibility that links will be encoded by CKeditor or any other WYSIWYG editor. So we just need to decode it in Pathologic, before browser does this.

What do you think?

dealancer’s picture

This patch works and prevents issue with Pathologic + WYSIWYG. The real problem:

a) WYSIWYG should not encode & in links
b) Pathologic should not replace & with &amp, it should replace & with &amp; or doesn't touch it at all.
fearlsgroove’s picture

a) WYSIWYG should not encode & in links

Not so. Ampersands are supposed to be encoded in URLs, WYSIWYG editors that convert & to & are doing exactly what they're supposed to be doing.

dealancer’s picture

Priority: Normal » Major

Great, so the patch in comment #2 should be reviewed and committed.

fearlsgroove’s picture

Not sure .. why are we decoding then decoding again? What if we just remove the decode from the URL, since URLs are supposed/allowed to be encoded.

dealancer’s picture

@fearlsgroove, the real problem is that pathologic do wrong decoding: it replaces "& a m p ;" to "& ;". It should not do such replacement or make replacement correctly. I could not find bug, cause of difficult regular expressions, so just provided quick fix in a one line. I appreciate if you could help us to find and fix the real issue of wrong decoding.

artis’s picture

Version: 7.x-1.4 » 7.x-2.0-beta2
Status: Needs review » Needs work

this issue persists in 7.x-2.0-beta2

artis’s picture

Version: 7.x-2.0-beta2 » 7.x-1.4
Status: Needs work » Reviewed & tested by the community

Patch at #2 works for 7.x-1.4.
7.x-2.0-beta is too dramatic of a change for the patch to work.

hass’s picture

I have disabled entities in CKEditor module... Does this help? We don't need them as Drupal is UTF8.

hass’s picture

Status: Reviewed & tested by the community » Closed (won't fix)

Disable entities in CKEditor, please.

dealancer’s picture

Priority: Major » Normal
Status: Closed (won't fix) » Needs work

hm....

> Disable entities in CKEditor, please.

How could I technically do it?

Also, if entities are disabled, that means that user could not embed < and >, because they will be processed as tags. That is wrong, I guess, not a good solution for this issue.

So changing status to Needs Work.

hass’s picture

Status: Needs work » Closed (won't fix)

There is a checkbox in ckeditor profile settings. Links can be added with other link or ckeditor_link module. Both works well.

dealancer’s picture

Status: Closed (won't fix) » Active

1) I could not find this checkbox. I am speaking about WYSIWYG module. Please do not close the issue, until it is not solved yet.

2) THERE IS STILL AN ISSUE IN PATHALOGIC MODULE.

For some reason pathologic replaces &_a_m_p_; to &; but it should replace &_a_m_p_; to &. Got it? There is a difference between &; and &, isn't it?

hass’s picture

I used WYSIWYG in past. I'm using CKEditor module as WYSIWYG is not useable if you need to rearange/limit the buttons. Nothing goes forward with this module.

Garrett Albright’s picture

Status: Active » Closed (won't fix)

Closing for now since this doesn't seem to be Pathologic related.

meecect’s picture

Status: Closed (won't fix) » Active

I think dealancer is correct. It seems like pathologic is not properly parsing url arguments that have html encodings and are valid HTML. I wrote a detailed comment to what is basically the same issue in the 6.x branch.

Please see my comment here:

http://drupal.org/node/419826#comment-7140614

Let me know and I can roll a similar patch to my solution in http://drupal.org/node/419826#comment-7140614 (and similar to the patch in #2).

meecect’s picture

Issue summary: View changes

urls weren't quite right

Garrett Albright’s picture

Version: 7.x-1.4 » 6.x-3.x-dev
Issue summary: View changes
Status: Active » Closed (won't fix)

6.x branch bugs will never be fixed. Sorry.