This is a follow-up to my comment, and @Wim Leer's answer, in #2630076: Inline images: how to calculate the absolute file URL.
My comment should have been its own separate issue but I didn't think that far when first finding the discussion there....
Thanks to Wim, for the quick and thorough answer despite me hijacking that issue.

Problem/Motivation

I can configure the upload dialog in CKEditor to upload files into public://whatever, making me assume paths will be relative to my public files folder. If so, everything would be OK. But, uploaded image paths are instead hardcoded as relative to the Drupal root at the time of upload. This is not as bad as it was before (see referenced issue), but it still causes headaches.

It's quite common for us to have multiple dev instances of the site example.com on dev.example.com, dev2.example.com, or even one at foo.com, but they are all really hosted as a single multi-site installation. Usually, these are managed by Aegir since it makes it easy to clone/move sites at various stages in our workflow.
If I upload an image on either site and then try to move that content to the live site, it will end up using file paths from the dev environment so all images will return 404, unless we do as Wim suggested (and we currently do just that), add a site alias to sites.php to force Drupal to save files using the live version of the path instead.

Wim outlined some of the reasoning behind the current behavior (in quotes), and here are my reactions to them:

Most sites have different domains (hosts) when they change from dev -> staging -> prod. Only some of them being in a subdir is rare, and not wise, because it means you're probably missing things in QA.

The issue is the same regardless of if you change the domain and/or the subdomain. Both cause the path to the public files folder to be different. Everything in Drupal 8 appears to deal with this change perfectly fine, except when uploading images.

I've used multiple workflows with Drupal 7 and the Media module where URLs (domain and/or path) have changed wildly between dev/staging/testing/live without any issues at all, thanks to there being no hardcoded paths and everything relies on the file entity id.

Moving from dev.example.com to example.com is of course totally valid, but then it's up to you to set up your sites.php file correctly: dev.example.com and staging.example.com should point to example.com. That means files will be uploaded to /sites/example.com/files always, regardless of dev/staging/prod. That's even exactly what /sites/example.sites.php documents!

This breaks down completely if multi-sites (Aegir) to easily manage multiple parallell development variants of the same site on the same platform. There's no way multiple instances of the same site can co-exist anymore if everything has to point to the same [settings] folder just to preserve the paths in the editor.

We went with a practical compromise in core: use relative file URLs, because that works fine 99% of the time (see above explanation).

I can see why and I'm glad it works for the almost all use cases. I'm also a bit surprised that choice was made given all work that has already gone into making sure as much as possible uses the public:// scheme. Core including a filter and editor plugin to perform the transformation from [public|private]:// to example.com/sites/[whatever]/files/ just seemed like the logical way to accomplish this, IMHO.

But, we also thought about the other 1%. Whenever you insert an image, there are data-entity-type and data-entity-uuid as well. So you can do reliable transformations. You can create a filter that looks for those attributes, loads the file entity, generates a URL for the current site. It's trivial to do so.

I did notice that, and I'm very glad that information is there!
However, doing only that may turn the embedded image's href="/sites/[something]/files/...." attribute into a red herring where you can change it to anything you'd like and it'd simply be ignored. The in-editor preview would of course break if you make it an invalid path, but an editor plugin could do the same transformations to forcibly set the real path every time it needs to update the placeholder. Then there's the case if you disable the editor, so why not have it as public://whatever to begin with?.

In fact, that's where Media module got it from!

Wow, thanks to this line I dug deeper into why it appears that Media module was not available at all for Drupal 8*, and found a whole bunch of new modules integrating in various (so far quite confusing) ways. This will take some time to figure out, but I just found some documentation about how the Entity Embed module does this. Much of the docs are just TODOs still though...
Anyway, what I've found so far does not do away with this problem in Core....

Proposed resolution

  1. Store embedded image paths as public://whatever instead of /sites/[sitename]. Or even better, don't set a path, since it is completely irrelevant when we have the entity id in there already.

    If <img data-entity-type="file" data-entity-uuid="foo" src="#" alt="..." title="..." /> was all I saw in the database it would make me very happy and confident I could do pretty much anything to the site as long as the managed_files table is intact.

    See #15 + #16: moved to #2755223: Consider letting EditorImageDialog return canonical public:// file URLs and let the drupalimage CKEditor plugin transform those to the correct URLs.
  2. Have the editor image plugin replace the private|public path with the actual path just before switching to WYSIWYG mode (and do the reverse when going to source mode).
    See #15 + #16: moved to #2755223: Consider letting EditorImageDialog return canonical public:// file URLs and let the drupalimage CKEditor plugin transform those to the correct URLs.
  3. Have a filter generate the file entity's URL when content is rendered.

Wim again:

I think it would be okay to update \Drupal\editor\Plugin\Filter\EditorFileReference() to automatically update those src attributes on images. Then even your 1% use case is taken care of.

Sounds like a good place to start! I can't promise I'll have time to dig into it myself though.

If you want to do that, please create a new issue referencing this issue, please don't repurpose [that] one.

Done, and again; apologies for my bad manners.

* The Media module page links to an issue, which in turn has a broken link to GitHub. There's nothing on that repo page, but that account does have lots of other Media and File Entity related repositories.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

TwoD created an issue. See original summary.

Wim Leers’s picture

Title: How to deal with hardcoded /sites/[sitename]/ paths in embedded images » Inline images: consider storing either 'public://image.jpg' or '' (nothing at all) on an <img>' src attribute, instead generate URL in filter
Version: 8.0.3 » 8.0.x-dev
Component: ckeditor.module » editor.module
Category: Support request » Task

Thanks, great issue :) Retitling for clarity and marking as task, not support request, since this is something you'd like to see happen, but it's also not really broken (hence not bug report).

Version: 8.0.x-dev » 8.1.x-dev

Drupal 8.0.6 was released on April 6 and is the final bugfix release for the Drupal 8.0.x series. Drupal 8.0.x will not receive any further development aside from security fixes. Drupal 8.1.0-rc1 is now available and sites should prepare to update to 8.1.0.

Bug reports should be targeted against the 8.1.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.2.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Wim Leers’s picture

Title: Inline images: consider storing either 'public://image.jpg' or '' (nothing at all) on an <img>' src attribute, instead generate URL in filter » EditorFileReference filter: generate URLs for inline images
Priority: Normal » Major
Status: Active » Needs review
Issue tags: +Contributed project blocker
FileSize
5.12 KB

Generating file URLs for files referenced in filtered texted not only solves this problem, it also ensures that if a module implements hook_file_url_alter(), that it actually works for those files also.

Clearest example: the CDN module (which I maintain).

If Drupal core doesn't do this, then the CDN module would have to do it.


Attached patch implements this and updates the test coverage. Note that the expected file URLs look a bit strange… this is because the way PHPUnit-based kernel tests uses VFS (virtual file system) results in very hacky/clunky URLs, but that's a pre-existing problem that we shouldn't address here.

Wim Leers’s picture

Version: 8.1.x-dev » 8.2.x-dev

@timmillwood pointed out in IRC that this can't go in 8.1.x, and he's of course right.

timmillwood’s picture

Status: Needs review » Reviewed & tested by the community

This will solve so many headaches.

For Deploy in Drupal 8 (when doing single site content staging or with cross site content staging) we have a lot of normalizers to make sure entity references and other things go back the way they are supposed to, currently we don't do anything much with data in fields. This patch will save us some of this effort by making files referenced in ckeditor work after they're replicated to a different place (on the same or different site).

Status: Reviewed & tested by the community » Needs work

The last submitted patch, 4: editor_filter_create_url_for_referenced_files-2666382-4.patch, failed testing.

Wim Leers’s picture

Status: Needs work » Reviewed & tested by the community

Random migrate fail:

fail: [Other] Line 139 of core/modules/migrate_drupal_ui/src/Tests/MigrateUpgradeTestBase.php:

Status: Reviewed & tested by the community » Needs work

The last submitted patch, 4: editor_filter_create_url_for_referenced_files-2666382-4.patch, failed testing.

Wim Leers’s picture

Status: Needs work » Reviewed & tested by the community

MADDENING this is.

#2724871: Random failure in \Drupal\migrate_drupal_ui\Tests\d7\MigrateUpgrade7Test is where this random fail is being fixed.

alexpott’s picture

Status: Reviewed & tested by the community » Needs work
+++ b/core/modules/editor/src/Plugin/Filter/EditorFileReference.php
@@ -71,6 +71,15 @@ public function process($text, $langcode) {
+            $result->addCacheableDependency($file);

Doesn't this need testing?

Wim Leers’s picture

#11: actually… it's not even necessary, the existing code already does that. The tests already test this. That's why it's green. It's just a force of habit by now :P

Removed that line, and added a code comment to clarify what that filter does:

  1. stage 1: updating the 'src' attribute (note that this attribute may not be set for every occurrence of @data-entity-type="file" and @data-entity-uuid — for example entity_embed does not have a src attribute, yet relies on this filter!)
  2. stage 2: add cache tag of the current file only once
timmillwood’s picture

Status: Needs review » Reviewed & tested by the community
catch’s picture

Status: Reviewed & tested by the community » Needs review
Issue tags: +Needs issue summary update

This needs an issue summary update. I'm not sure why we can't store public:: when saving the field, then generate the URL from that in filter processing. Storing whatever URL we get, then fixing it by loading the file entity and pulling it from there seems a bit heavy to fix a 1% case.

Wim Leers’s picture

Status: Needs review » Reviewed & tested by the community
Issue tags: -Needs issue summary update +Workflow Initiative

Storing whatever URL we get, then fixing it by loading the file entity and pulling it from there seems a bit heavy to fix a 1% case.

It's to allow rich text editors like CKEditor to actually show the image.

If we don't do this, then we'd need to make CKEditor do a lot of additional processing. We could do that, but:

  1. then we have automatic-processing functionality in CKEditor that every other text editor will also need to implement
  2. then we need to make the drupalimage CKEditor plugin aware of which specific text format filters are used, and act only when appropriate
  3. then we need to write JS test coverage — I try to change Drupal core's custom CKEditor plugins as little as possible
  4. I do see that that is semantically more accurate, but I don't see it gaining us much in practice

I originally said it's a 1% case, but that's not true anymore, we have at least three cases:

  1. multiple development instances (as @TwoD described in the issue summary)
  2. CDN (see #4)
  3. Deploy/Workflow Initiative (see #6)

Finally, storing src="public://…" and then letting CKEditor do the conversion is something we could still do later. What this patch does is essential. That other part is non-essential.


(Not updating the IS because I don't see how to update it without making it more confusing because of the way it explains the problem. If this issue grows much bigger, I'll take that on though. But for now, at 14 comments, having the answer to #14 in this comment seems clearer.)

Wim Leers’s picture

Alex Pott asked me in IRC to update the issue summary because it's too confusing at the moment. So, did that.

This is still right:

Finally, storing src="public://…" and then letting CKEditor do the conversion is something we could still do later. What this patch does is essential. That other part is non-essential.

The changes I made in the IS here mean that this issue only does point 3 of the proposed resolution in the IS. It's the only essential part. It's what the issue title says. And now there's #2755223: Consider letting EditorImageDialog return canonical public:// file URLs and let the drupalimage CKEditor plugin transform those to the correct URLs for points 1 and 2.

alexpott’s picture

Status: Reviewed & tested by the community » Needs review

I think we might need an upgrade path - doesn't a cache need clearing?

Wim Leers’s picture

Status: Needs review » Reviewed & tested by the community

There's no need or an upgrade path: the stored data is the same. The filter is already enabled, the filter just does a bit more now.

And the filter's output was already using cache tags. The test coverage is already ensuring that those cache tags are present. When either of those files is renamed, moved to a different directory or even just updated with different content, the filter will therefore run again.

Before vs after this patch, nothing actually changes in the output. The change here is that changes to files will actually cause the image file URLs to be updated, which they do not in HEAD.

Or am I missing something?

alexpott’s picture

But if someone has moved a file location and this has broken - then if we don't clear the caches then it is still broken until the next filter cache clear at which point it magically fixes itself - no?

Wim Leers’s picture

Correct. But updating to Drupal 8.2 will already clear all caches anyway.

Happy to add an explicit Cache::invalidateTags(['rendered']) in an hook_update_N() if you want that though. If so, I think you'll also want an explicit upgrade path test?

Please confirm, I'll make it happen. I'm personally not convinced this is worth our time, precisely because updating === clearing all caches.

Status: Reviewed & tested by the community » Needs work
Wim Leers’s picture

Status: Needs work » Reviewed & tested by the community
alexpott’s picture

Status: Reviewed & tested by the community » Needs review
FileSize
548 bytes
5.56 KB

@Wim Leers if a module does something that requires a cache clear then the module should have an empty update. We've decided these should be post updates - because it is easy cause no numbers involved. If there were no updates for 8.2.0 then update.php would not clear the cache.

Status: Needs review » Needs work

The last submitted patch, 23: 2666382-23.patch, failed testing.

alexpott’s picture

Status: Needs work » Needs review
Wim Leers’s picture

Status: Needs review » Reviewed & tested by the community

I didn't know update.php didn't always clear caches!

#23's interdiff looks great to me. (I'd personally add a @see https://www.drupal.org/node/2666382 to this post-update function, but that's a nitpick.)

RTBC'ing #23's interdiff, hence re-RTBC'ing the entire issue.

effulgentsia’s picture

Ticking credit boxes.

  • effulgentsia committed 9268dcc on 8.2.x
    Issue #2666382 by Wim Leers, alexpott, TwoD, timmillwood, catch:...
effulgentsia’s picture

Status: Reviewed & tested by the community » Fixed

Pushed to 8.2.x. Thanks!

effulgentsia’s picture

Even though I committed this, I now have a few reservations about it:

  1. Minor: EditorFileReference has a class doc that says "Passes the text unchanged", which is no longer true. Could be addressed in a follow-up.
  2. Minor: EditorFileReference::process() looks to me like it now does more loadEntityByUuid() calls than needed. The method already does that call for cache tag management, so can we reuse it? And can we also optimize it to not do it multiple times for the same file, as we already do for the cache tag processing? That may require changing $processed_uuids to track the files rather than just a TRUE/FALSE. This is also fine for a follow-up.
  3. The above follow-ups might not be needed depending on how #2755223: Consider letting EditorImageDialog return canonical public:// file URLs and let the drupalimage CKEditor plugin transform those to the correct URLs gets implemented.
  4. My biggest concern though is what if there's a module that either during upload time, or during filter processing time, sets the src attribute to an image style derivative of the file? With #28, EditorFileReference now overrides that back to the original file. I don't know if there's such modules out there. https://www.drupal.org/project/entity_embed doesn't appear to run into this problem, but potentially only because entity-embed-container.html.twig wraps any resulting <img> output in an <article> wrapper, which separates which element ends up containing the src attribute from the one containing the data-entity-uuid attribute. But in theory, some other filter might support image styles without creating such a wrapper. This concern almost makes me want to revert this patch, but I'll hold off on that for feedback from others, including what direction #2755223: Consider letting EditorImageDialog return canonical public:// file URLs and let the drupalimage CKEditor plugin transform those to the correct URLs takes.
Wim Leers’s picture

Great concerns!

Let me first address the big one.

  1. It'd be wrong to modify the src attribute value during upload time. This is filtered text. Any overrides you want to make must respect the original data; you must not ever modify it. Therefore you must use a filter.

    Then there are two possible choices: does this hypothetical filter that wants to make some alteration to the <img src> allow the content creator to make a choice (e.g. pick a separate image style for every inserted image), or is that choice reserved to the site builder (e.g. pick an image style for all images)? In the latter case, it'd be a filter setting. In the former case, it'd have to be a data- attribute on the <img> tag, so that the choice is stored in the original data, and a filter does the necessary transformations on output.

    This is exactly what #2061377: [drupalImage] Optionally apply image style to images uploaded in CKEditor 5 does!

    In other words: you're not the first to think about this concern. We've chosen to use data- attributes to represent these content author choices, which allow us to store these choices in a structured form, and to then let filters take action on them. It also means that you can easily uninstall contrib modules that provide such functionality, and your images will still work. It also means migrations away from Drupal 8 (to Drupal 9 or some other CMS, or just when processing the data stored in Drupal for another channel of output has that same metadata to make smart decisions during filtering).

    See http://wimleers.com/article/drupal-8-structured-content-authoring-experi... — this is exactly why we also have data-caption, data-align, data-entity-type and data-entity-uuid.

    Therefore But in theory, some other filter might support image styles without creating such a wrapper is wrong: the solution is to add data- attributes to the element you care about, and then have a filter that runs late enough to ensure your changes apply — no wrappers necessary.

  1. Nice nit, we'll fix that in a follow-up: #2769529: EditorFileReference class doc needs a small update.
  2. I had the same exact concern. But we can't reuse it. Because we are loading the file entity for very different purposes:
    1. It's possible that we have data-entity-uuid, but we don't have src. In that case, the code this patch added won't execute its loadEntityByUuid() call (this would be the case for entity_embed, for example).
    2. On the other hand, we load the file entity once, just to add its cache tags to the processed result. But we only need to do this once, even if the same file is referenced multiple times.

    These are completely orthogonal things. I tried to bring it down to a single loadEntityByUuid() call, trust me. But there's no way. This code is simple, legible, maintainable. Why make it needlessly complex for what is by definition a premature optimization? Let's leave the optimizations for later, if they turn out to be necessary.

  3. I don't see how points 1 and 2 are related to #2755223: Consider letting EditorImageDialog return canonical public:// file URLs and let the drupalimage CKEditor plugin transform those to the correct URLs.

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.