(Any regex experts willing to tackle this?)

When users paste text from MS Word documents, you sometimes end up with HTML markup like this:

<a href="mailto:user@example.com"><span>user@example.com</span></a>

By the time spamspan has chewed this, you get a pretty ugly result. It comes out looking like this:

userexample [dot] net

and the mailto: href is also badly mangled.

Help!

Comment  File                                    Size     Author
#6       spamspan-embedded_tags-1167084-6.patch  2.68 KB  vitalie

Comments

gpk

I should add that I have disabled all filters except spamspan.

peterx

Status: Active » Closed (won't fix)

@gpk, the email address is processed by a regex. Changing a regex is a pain, and there are dozens of possible combinations to handle.

gpk

Thanks for working on this module, Peter. This is quite a major bug for me, and I'm a little surprised if it's not affecting others, given that a lot of content must get pasted into websites from word processing apps. It is a bit of a pain having to trawl through the raw HTML cleaning it up, and this is beyond most content creators.

I appreciate a fix might not make it into 6.x but maybe this should be flagged up in 7.x for a regex expert. I have a rough workaround which helps a bit on our site, though it's not really production-ready. Maybe I should post it up here when I get a moment.

Thanks!

peterx

@gpk Post it.

peterx

Version: 6.x-1.6 » 7.x-1.1-beta4
Status: Closed (won't fix) » Postponed (maintainer needs more info)

vitalie

Status: new; file size: 2.68 KB

The patch below should partially fix this: it just strips the tags. It also includes the patch for issue #2386967 (Link text replaced with email address), since without it testing this issue becomes problematic.

Keeping the tags needs more work, which I am postponing until the community actually requests it.
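
For anyone following along, here is a rough Python sketch of the tag-stripping idea. It is not the patch itself (the module is PHP, and its real regex is more involved); the pattern and the function name `strip_embedded_tags` are made up for illustration, and it assumes the href is double-quoted and the link has no nested anchors.

```python
import re

# Hypothetical pattern: a mailto link with a double-quoted href.
# Group 1 is the address, group 2 is whatever sits between <a ...> and </a>.
MAILTO_LINK = re.compile(
    r'<a\s+href="mailto:([^"]+)"[^>]*>(.*?)</a>',
    re.IGNORECASE | re.DOTALL,
)

# Any single HTML tag, opening or closing.
TAG = re.compile(r'<[^>]+>')


def strip_embedded_tags(html: str) -> str:
    """Remove tags nested inside mailto links, keeping only the link text."""

    def clean(match: re.Match) -> str:
        address, inner = match.group(1), match.group(2)
        text = TAG.sub('', inner)  # drop <span>, <b>, etc. inside the link
        # Note: any extra attributes on the original <a> are discarded here.
        return f'<a href="mailto:{address}">{text}</a>'

    return MAILTO_LINK.sub(clean, html)


sample = '<a href="mailto:user@example.com"><span>user@example.com</span></a>'
print(strip_embedded_tags(sample))
# prints: <a href="mailto:user@example.com">user@example.com</a>
```

Running something like this before the address-matching step would give that step a plain link to work with, at the cost of losing any formatting inside the link.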

vitalie

Version: 7.x-1.1-beta4 » 7.x-1.2
Assigned: Unassigned » vitalie
Status: Postponed (maintainer needs more info) » Needs review

  • vitalie committed 23f74a1 on 8.x-1.x
    ported patches for issues #1422462, #1167084, and #1012088
    

  • vitalie committed 65a0a19 on 7.x-1.x
    Issue #1167084 by vitalie: Tags embedded inside the <a href="..."> tag...

vitalie

Version: 7.x-1.2 » 7.x-1.x-dev
Status: Needs review » Closed (fixed)