The change in #3402582: Does not work with ckeditor 5 breaks the filter function when you are not using an editor like CKEditor in your text format.

I have the following text format that no longer works after updating to beta5. Downgrading to beta4 solves the issue.

uuid: 2223de04-7710-4620-8144-4a1908ea1962
langcode: en
status: true
dependencies:
  module:
    - soembed
name: oEmbed
format: oembed
weight: 0
filters:
  filter_soembed:
    id: filter_soembed
    provider: soembed
    status: true
    weight: -49
    settings:
      soembed_maxwidth: '780'
      soembed_replace_inline: '0'
      soembed_allowed_buckets: {  }

Issue fork soembed-3417152

Command icon Show commands

Start within a Git clone of the project using the version control instructions.

Or, if you do not have SSH keys set up on git.drupalcode.org:

Comments

Peter Törnstrand created an issue. See original summary.

peter törnstrand’s picture

Issue summary: View changes

  • Ki committed 47084194 on 8.x-2.x
    Issue #3417152: beta5 breaks filter
    
ki’s picture

Status: Active » Needs review

It was not easy to find regex to handle different scenarios. Pushed the fix in the release 8.x-2.0-beta7.

peter törnstrand’s picture

Still having issues with this using beta7, if I change the regexp to #([^"])?(https?://[^\s<]+)([^"])# it works. Made the first match capturing group optional.

To make my case clear, the only content in the textarea is the URL, so no other content, no P tags or anything.

http://host.com/embed/eyJxdWVyeSI6IkRpYWJldGVzIiwibGFuZyI6InN2IiwiZmlsdGVycyI6eyJwcm9maWxlIjoicHJvZmlsZSIsImFsbCI6MCwicHVibGljYXRpb24iOjAsImdyb3VwIjowLCJncmFudCI6MH0sImhpdHMiOiIyMCJ9 
ki’s picture

We'd need to come up with a regex that works with ckeditor 4, ckeditor 5, and no editor. Will the suggested fix satisfy it?

peter törnstrand’s picture

Sorry for late reply. I have not tried it beyond my own use case. I will test and report my findings.

peter törnstrand’s picture

Status: Needs review » Reviewed & tested by the community
StatusFileSize
new116.36 KB

Yes, the pattern #([^"])?(https?://[^\s<]+)([^"])# works for all three cases.

  • Ki committed d4a05f35 on 8.x-2.x
    Issue #3417152 by Peter Törnstrand: beta5 breaks filter
    
ki’s picture

Status: Reviewed & tested by the community » Fixed

Applied the fix to the release 8.x-2.0-beta8.

This is almost identical to the pattern for `Replace in-line URLs` option, though. Now I wonder if I should get rid of the option, which is getting irrelevant.

loze’s picture

This now fails for me when using no text editor and there are no characters following the url, see screenshot

This patch fixes it for me.

loze’s picture

Status: Fixed » Needs review

  • Ki committed 1d1437e0 on 8.x-2.x authored by loze
    Issue #3417152 by loze, Peter Törnstrand, Ki: beta5 breaks filter
    
ki’s picture

Peter Törnstrand

I've applied loze's patch and pushed to the latest 8.x-2.x-dev version.

Could you check it works for you?

Thanks.
Ki

loze’s picture

Status: Needs review » Needs work
StatusFileSize
new30.33 KB

After further testing the latests commit, the regex still fails in a few places

in particular the url inside the <a> tag gets converted to a video in the following example.
<a href="https://www.youtube.com/watch?v=CsGQOPLGImA">https://www.youtube.com/watch?v=CsGQOPLGImA</a>

urls that are linked should not embed. At least that's how this used to work.

Also some urls that do not have an enabled provider are mangled. for example:
the website <a href="https://drupal.org"><em>Drupal</em></a>
displays broken html, see screenshot

loze’s picture

Status: Needs work » Needs review

I think is regex should work
'#(<p>|\n|^)(https?://\S+)(</p>|\n|$)#'

This works with ckeditor and without in my testing when urls are on their own line.
It also works for Peter Törnstrand's use case when it is the only text in the textarea.

loze’s picture

the regex in #16 still failed in some cases. Try this one, It seems to work.
'#(<p>|\n|^)(https?://[^\s<]+)(</p>|\n|$)#'

ki’s picture

It seems one regex that works for one easily breaks another. I'll try to find time to do tests, but I'd appreciate those in the thread test their case and work together to come up with a regex that works for all cases. Or is that actually possible? Should we introduce a checkbox setting to handle ckeditor5 as an exceptional case? Since ckeditor5 merges all lines into one in the backend, it is essentially no different from the option setting "Replace in-line URLs"

What if we change the label of the checkbox from "Replace in-line URLs" to "For in-line URLs or ckeditor5" and revert the regex to as it was?

mortona2k’s picture

Priority: Normal » Major

The links on my site broke after enabling this module.

I'm using the "Convert urls into links" filter, and this is causing them to render with broken html.

I think this meets the criteria for a major bug. https://www.drupal.org/docs/develop/issues/fields-and-other-parts-of-an-...

ki’s picture

Attempt to support ckeditor 5's behavior of saving content source in one line caused too much trouble. The behavior existed for a while and seems to have gone now.

I have reverted the regex back to old simple one. Instead, I've revised the regex for in-line URL option and updated the description for the "Replace in-line URLs" checkbox to "Recognize in-line URLs. Enable too if the wysiwyg editor saves the content source in one line."

The update has been released in beta-9 version for review and testing.

@mortona2k
If you are a new user of the module, please make sure that the filter for "Convert URLs into links" comes after. As mentioned in README, the order of filters is important for the soembed filter to work.

Thanks.

loze’s picture

I've created MR10 as an attempt to improve the regex for common cases with text editors.

Moderators may inadvertently add &nbsp; or regular spaces, or the URL may be wrapped inside a <div>.

These are situations I frequently encounter with moderators who are not familiar with HTML and often complain, "My links don't embed." In 99% of cases, the issue is that the HTML is slightly malformed, and they don’t know what to check or how to fix it.

My code attempts to make the url embeddable if it is "visually" on a new line even if it isn't in the code. And is working well, so far, for us.

loze’s picture

StatusFileSize
new2.08 KB

And here is the patch to help test it with composer

  • ki committed 1261d9d4 on 8.x-2.x
    Issue #3417152 by ki, loze, peter törnstrand: beta5 breaks filter