Problem/Motivation
In addition to listing files, the core file view also lists and links usages. Another difference between the two views is that media may be deleted from the media view, whereas the files may be marked for deletion based on automatic (but broken) usage tracking.
User stories:
- As a media manager, I want to see where my media is used across the site (in entity relationship fields or in WYSIWYGs) so that I can effectively manage my media library and related content.
- As a media manager, I want to be prompted when I try to delete a piece of media that's in use (in entity relationship fields or in WYSIWYGs) so that I don't inadvertently break content or designs.
Proposed resolution
Add (non-broken) information on media usage and present it to the site builder where appropriate (on the media view / media library, possibly on the media deletion confirmation form, etc.).
Remaining tasks
Postponed on #2821423: Dealing with unexpected file deletion due to incorrect file usage.
User interface changes
API changes
Data model changes
| Comment | File | Size | Author |
|---|---|---|---|
| #25 | delete_find_usage.png | 20.38 KB | anavarre |
| #25 | find_media_usage.png | 36.32 KB | anavarre |
Comments
Comment #2
dawehner
Personally, I have seen a lot of use cases for a view of the places where particular files are used. This will probably be harder when you don't track those files, but it's not a super common feature.
Comment #3
Gábor Hojtsy
Need to get feedback from the subsystem maintainer, and probably also the product manager, given this will affect how files and file usages are trackable for users. I am keeping this with the file module because that is the affected one: the media system would make this change, but file.module would get it.
Comment #4
Berdir
This is the related media_entity issue in contrib: #2756747: Track media usage, with a link to a sandbox module for generic entity usage tracking, which we'd basically need for this.
Comment #5
tkoleary at Acquia
I don't think we need to (or should) get rid of the file view. There are certainly cases where you might simply want to know what files are stored, regardless of how or whether they are used in media entities.
For now we could just hook menu alter the tabs at /content when media library is installed so that:
Something like this:
Then when media library is no longer experimental we can make that the default configuration.
Comment #6
Gábor Hojtsy
Well, currently the secondary tabs on /admin/content/media are for the different media types, such as document, image and remote video, and facilitate filtering the media list. If we sprinkle in the files tab, that would be confusing IMHO, since it would not behave like the rest of the media library does.
Comment #7
tkoleary at Acquia
@Gabor Hojtsy
Hmm. Ok. That's actually a different problem. It didn't occur to me before but that would be an anti-pattern to the way core handles sorting and filtering. If we follow the pattern of content (as we probably should) all media will be shown and the user can filter it. If we introduce this pattern of displaying preset filters as tabs then any module that adds a tab creates the same problem.
So I suggest we revisit the filters and go back to the standard pattern, in which case my suggestion above is the same.
To the question of "should we have preset filters for the default media handlers" I still say yes, but we need to render them as something other than tabs. At the risk of sounding like a broken record, Select2... Select2... Select2...
Comment #8
Gábor Hojtsy
Hm, in #2828538: Produce high fidelity screens based on Media prototype, we produced the media library designs in Seven's style, and the tabs evidently mapped to Seven's secondary tabs (just inlining the images from there):
Also similarly in the modal:
It is entirely true that modules cannot add tabs to this without it becoming fully confusing. But I think it was/is a defining part of the design, wasn't it?
Comment #9
tkoleary at Acquia
@Gabor Hojtsy
Indeed it was. And it was a big oversight on my part that I did not identify the problem earlier. We definitely need to revisit it though.
Comment #10
Gábor Hojtsy
I think we can/should bring this to the UX meeting later today (in 3 hours) for review and discuss that question there.
Comment #11
tkoleary at Acquia
Yes, I agree.
Comment #12
Gábor Hojtsy
YouTube recording from the UX meeting where this was discussed: http://youtu.be/hf8AovBZflo
It came up that Files may need to be a tab under Media, which would invalidate the use of tabs as filters and significantly alter the design of the library. On the other hand, without that there is no extensibility under this tab for other modules, at least not by adding more tabs.
Comment #14
xjm
For #2821423: Dealing with unexpected file deletion due to incorrect file usage, we are going in the direction of deprecating file usage tracking, so at most this will need to follow the solution we come up with there. There will need to be a way through the UI to manually delete files. Postponing.
Comment #15
xjm
Comment #19
xjm
Incorporating some bits from #2938473: Users can delete media from admin overview page without knowing where the media is in use, updating based on the current state of HEAD, and adding credit for @anavarre, who reported the duplicate.
Thanks!
Comment #20
xjm
Comment #21
Berdir
I don't think this needs to be postponed on that.
I know @catch wants to remove the file usage system completely, but I don't see that as realistic.
The reality is that we need it, for example to handle private file access and for users to understand how media is used and whether it can be deleted, as the issue referenced above shows.
The only thing we can remove is that we delete things automatically. At least it should be configurable as it is now (without a UI, though, and without any way to delete unused files in core).
Comment #22
Berdir
Postponed #2904842: Make private file access handling respect the full entity reference chain on this.
Comment #23
Berdir
Discussed a bit with @marcoscano, and he said that generically tracking all entities, all the time, is a huge overhead, almost impossible and not really worth it.
I agree with that after some thinking. But I still think that we need to consider tracking all media entities at least. And if we need that, then we probably want at least the API and storage to be able to track a generic entity -> entity usage. It's another question for what usages we'll then actually use it. I think that makes more sense than introducing a media_usage API that is a copy of file_usage.
But even with tracking, having magically working media entities with private files that automatically become visible when the content that uses that media is visible is.. extremely hard. And we need to think about what exactly we can and want to support. I think I already commented about this a few times, for example in the issue about hiding/adjusting the field type UI. If you want private files on content (e.g. the pdf e-paper version of articles on a newspaper site and only grant access if the user is allowed to see that article), then maybe you should not use media at all but stick to plain, non-reusable file/image fields. Which means that the answer is not that media fields replace image/file fields all the time, just most of the time, which affects how we show them on the field add UI among other things.
Comment #24
marcoscano
+1 to #23 :)
IMHO the hard part of tracking is to do it generically enough to be useful for all use-cases. For example, I'm not sure the solution for "tracking for access" needs to be the same as the solution for "tracking to ensure content integrity". Even the solution to ensure content integrity might be different depending if you need revisioning / workflow / multi-lingual or not, for example.
Comment #25
anavarre
And
Got me thinking. Could we track all / media entities only when it's needed? Examples below for media entities:
We could imagine a 'Find usage' action button like below:
And, upon trying to delete a media entity, a warning and a button to also find where the media entity is being used.
Comment #26
marcoscano#25 brings an interesting approach for the "content integrity" use-case of tracking (it wouldn't address the "access inheritance" use case).
I believe that for content related in entity_reference fields it could work. However, doing the same with entities embedded in WYSIWYG would be much harder. It would basically mean loading all text fields from all content entities and parsing their content in order to detect existing usages. Even if we use a batch process, that could be a huge overhead on sites with a lot of content.
Comment #27
mallezie
I think there are 3 different places where we need to count items. Mostly it could be up to the entity itself to track its own usage, so we don't need a parse-everything-afterwards approach (which would indeed be neither easy nor performant, see #26).
An entity reference field is probably the easiest to track; we might do that at field save time, which would probably be okay performance-wise.
An entity embed (contrib now) might also be feasible in the same way.
I think the most difficult one is direct linking in a text field (download this file / link to another page). This would indeed involve parsing the text field, which might be the trickiest part.
Not sure if deploy does something similar to find the dependencies. IIRC (D7-wise) only the first situation is handled there. The others needed custom functionality. But could be we need a similar solution for both problems (deploy dependencies and usage tracking).
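To make the "parse the text field" case from #26 and #27 concrete, here is a minimal, language-agnostic sketch in Python (not Drupal code). It assumes that embedded or linked entities are marked up with `data-entity-type` and `data-entity-uuid` attributes; the class and function names are hypothetical and only illustrate the idea.

```python
from html.parser import HTMLParser


class EmbedReferenceParser(HTMLParser):
    """Collects (entity_type, uuid) pairs from tags that carry both attributes."""

    def __init__(self):
        super().__init__()
        self.references = []

    def handle_starttag(self, tag, attrs):
        # Assumption: references are identified by these two attributes,
        # whatever the tag name is (embed element or plain link).
        attributes = dict(attrs)
        entity_type = attributes.get("data-entity-type")
        uuid = attributes.get("data-entity-uuid")
        if entity_type and uuid:
            self.references.append((entity_type, uuid))


def find_embedded_references(html):
    """Returns the references found in one text field value, in document order."""
    parser = EmbedReferenceParser()
    parser.feed(html)
    parser.close()
    return parser.references
```

Running something like this when a single text field is saved is cheap; the overhead discussed above comes from having to run it over every text field of every existing entity.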
Comment #28
catch
This isn't exactly my position.
I think that using the file_usage counter is unmaintainable as it is - each module is responsible for handling it correctly, any module getting it wrong can result in data-loss affecting all the other references, the only way to correct things is to both correct the logic and create an upgrade path to repair data.
So I think we need to add either an alternative or additional API, which allows modules to define a particular type of file usage, and way to collect it for a particular file (or to rebuild it like the node grants system). For entity references this is simple enough, it's not for things like wysiwyg embeds.
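As a rough illustration of the kind of API described in this comment, the following Python sketch (not Drupal code, and not an existing API) lets each module register a collector for one type of usage and gathers all usages for a given file on demand instead of maintaining a shared counter. `UsageRegistry`, `register` and `collect` are hypothetical names.

```python
from typing import Callable, Dict, List

# A collector takes a file ID and returns identifiers of the things using it.
UsageCollector = Callable[[str], List[str]]


class UsageRegistry:
    """Keeps one collector per usage type and queries them all for a file."""

    def __init__(self) -> None:
        self._collectors: Dict[str, UsageCollector] = {}

    def register(self, usage_type: str, collector: UsageCollector) -> None:
        self._collectors[usage_type] = collector

    def collect(self, file_id: str) -> Dict[str, List[str]]:
        # Ask every registered collector; keep only the types that found something.
        usages: Dict[str, List[str]] = {}
        for usage_type, collector in self._collectors.items():
            found = collector(file_id)
            if found:
                usages[usage_type] = found
        return usages
```

Because each usage type is computed by its own collector, a bug in one module's logic can be fixed and its results recomputed without touching data owned by the others.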
Comment #29
dsnopek
API-wise, this actually sounds a lot like a search index, e.g. search_api. An imaginary implementation of this API could define a 'type' of thing it gathers usage for and provide a method for finding all those things (i.e. a list of what needs to be indexed) and another method for processing them (i.e. generating the index). Then Drupal can provide a way to completely re-index the usage data for that method, doing batches during cron or via the Batch API or whatever. Like search_api, it'd probably need a table to track the list of things to index and whether they need to be reindexed.
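A minimal sketch of the index idea above, in Python rather than Drupal code and under stated assumptions: sources (for example nodes) are marked as needing reindexing, and a batch step, which could run on cron, extracts their targets into the index. `UsageIndex` and its methods are hypothetical names, and the in-memory dictionaries stand in for the tracking table.

```python
from typing import Callable, Dict, Iterable, List, Set


class UsageIndex:
    """Usage index with a pending list, processed in batches."""

    def __init__(self, extract_targets: Callable[[dict], Iterable[str]]) -> None:
        self._extract_targets = extract_targets
        # Stand-in for the tracking table: sources that still need (re)indexing.
        self._pending: Dict[str, dict] = {}
        # The index itself: source ID -> set of target IDs it uses.
        self._targets_by_source: Dict[str, Set[str]] = {}

    def mark_for_reindex(self, source_id: str, source: dict) -> None:
        self._pending[source_id] = source

    def process_batch(self, limit: int) -> int:
        """Indexes up to `limit` pending sources and returns how many were done."""
        batch = list(self._pending.items())[:limit]
        for source_id, source in batch:
            self._targets_by_source[source_id] = set(self._extract_targets(source))
            del self._pending[source_id]
        return len(batch)

    def usages_of(self, target_id: str) -> List[str]:
        return sorted(
            source_id
            for source_id, targets in self._targets_by_source.items()
            if target_id in targets
        )

    def pending_count(self) -> int:
        return len(self._pending)
```

Note that results from `usages_of` are only as fresh as the last processed batch, so a deletion warning built on such an index would need to account for sources that are still pending.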
Comment #33
Chris Matthews (as a volunteer and at City of Oaks Design)
Comment #36
Anybody
100% agree with @dsnopek. It is a bit like cache: modules can invalidate the index but don't maintain it themselves; that should be done by code. And I guess entity referencing modules will have to care for this index, as they're the ones that know about the entity usage.
Currently I think in most cases it's:
Most other modules like paragraphs, etc. depend on such basic modules.
Perhaps we could introduce / ("force") a Trait or Interface from core for modules that basically implement entity references to provide the required methods to update the index?
Comment #39
mlncn (as a volunteer and at Agaric for Drutopia, Portside, Sahara Reporters)
To help people looking for solutions to this problem now, and perhaps to better illustrate the problem space and possible ways core could handle this (or better support solutions):
Comment #40
joseph.olstad
There's a module called safedelete that I developed in conjunction with another developer. If the safedelete module is installed and content is referenced within body fields using the linkit module, safedelete will provide a report of to-be-orphaned references on the delete entity form. The idea is to prevent breaking references.
This approach has been in place for two instances I know of with about 8000->10000 pages of content on each of them.
Comment #43
larowlan
I opened an ideas queue issue for this too, as it has come up a few times in product/security/bug chats