Problem/Motivation
Drupal will upload files into one directory by default. Users can affect this by changing configuration of file/image fields, but real-world experience shows that they don't. Uploading thousands or millions of files into one directory is a very bad idea, which can have performance implications too. There are reports of sites going down just as a result of this issue.
Proposed resolution
Let's change default location configuration variable in file/image fields. This variable is currently empty, which results in all files being uploaded in a root folder (usually public://
). Since use of tokens is supported we can provide configuration that will dynamically adapt for different fields.
[field-storage:name]/[date:custom:Y]-[date:custom:m]
was preferred but couldn't make it in time for RC1 so we've decided to simplify the patch to minimize the disruption and file #2592251: Uploading files should have more robust default tokens instead.
We're now going with [date:custom:Y]-[date:custom:m]
only for all fields. On top of that we've decided to upload user avatars under pictures/[date:custom:Y]-[date:custom:m]
and to remove file_directory: field/image
completely from field.field.user.user.user_picture.yml
since it would be duplicating what the default is anyway.
Remaining tasks
Evaluate (and document in the issue summary) impact of showing internals like field machine names to end users whose browsers load an asset and expose the filename (and thus the path in the url)Attempt at answering this in #141- Validate the approach is confirm the rc target requirements
- Review change record
User interface changes
Site builders now see the default tokens in the File directory
field for files/images. This can still be completely customized to suite their needs.
API changes
None.
Original issue summary by @meba
It is a well known fact that if you upload thousands of files into a single directory you will have performance issues on certain systems such as Cloud systems using GlusterFS, NFS or other shared filesystems or simply slow hard drives. The reason being that list operations on that single directory take too much overhead over network.
The solution typically is to ensure there is no more than a couple of thousands of files in a single directory in your files/. This can be easily achieved through dividing directories by date:
30/12/2013/file.jpg
29/12/2013/file2.jpg
https://drupal.org/project/filefield_paths provides this functionality but many people do not install it when they create their website and migration is complicated.
I propose that we automatically add date based tokens to all file system paths in Drupal core and allow modules such as FileField Paths to override these tokens to provide advanced functionality.
Beta phase evaluation
Issue category | Task because it's not really new functionality, but it's not really a bug as nothing is broken. |
---|---|
Issue priority | Normal because it's not a blocker for anything. |
Prioritized changes | The main goal of this issue is usability. |
Disruption | This issue should have no disruption for any existing APIs, modules, or sites. |
Comment | File | Size | Author |
---|---|---|---|
#231 | drupal-n2128055-231.patch | 2.65 KB | DamienMcKenna |
#25 | Added_data_pattern_in_destination_folder-2128055-25.patch | 1021 bytes | estoyausente |
#28 | Added_data_pattern_in_destination_folder-2128055_2.patch | 1005 bytes | estoyausente |
#29 | added_data_pattern_in_destination_folder-2128055_28.patch | 1.02 KB | stijntilleman |
#42 | 2128055_42.patch | 1.09 KB | slashrsm |
Comments
Comment #1
attiks CreditAttribution: attiks commentedI don't think it has to be that many levels, we could use 20131230 so we only create one extra sub directory.
In the past we used something like
substr(md5($file_name), 0, 4);
, but maybe a date is easier for all use cases.Comment #2
Dries CreditAttribution: Dries commentedYesterday, a big Drupal site was down for several hours because Drupal uploaded hundreds of thousands of files into a single directory. The amount of files caused severe problems for the underlying filesystem, leading to data corruption. We should have safe defaults, so I propose providing a mechanism in core to prevent this kind of data corruption.
Comment #3
Dave ReidI think just
[current-date:custom:Y]/[current-date:custom:m]
(e.g. 2013/11) would be good.Comment #4
attiks CreditAttribution: attiks commented#3 We did a project last month where we imported 30.000 files, so Y/m isn't enough, can we settle for Ymd (without /)?
Comment #5
pdrake CreditAttribution: pdrake commentedOne problem with using dates in the URL for files is that users may assume that it is a timestamp of sorts for the file, which may lead to confusion about the freshness of content. The date in the URL has no relation to the creation time of the content itself, just the date on which the file was uploaded (or perhaps the date on which it was migrated). Additionally, as attiks said, certain situations (such as content migration) may result in the creation of too many files on a given day.
Comment #6
Crell CreditAttribution: Crell commentedI agree that we should have a dynamic default, but it should not be date-based. I would recommend something like the node type or field type (or both). That makes it very easy and predictable, but still helps to reduce the "crapload of files" problem. It doesn't help if you're going to have 300,000 nodes of the same type, all with files attached, but if you're building that scale site you should set up custom rules anyway. (Perhaps better help text in the latter case could be helpful?)
Comment #7
meba CreditAttribution: meba commentedYou would be surprised about the amount of sites that are not "enterprise" but have this many files.
Let's discuss *how* to make this happen and choose a pattern later. Right now we agree on the need for the pattern, we have the ability to choose it later.
Comment #8
saltednutI agree with #6 that sorting by date makes things less searchable. I am less concerned about whats suggested in #5 since we are going to have a confusing URL in making things granular in most cases. I would suggest a pattern of [entity]/[bundle]/[field]/[id]/my_snarky_meme.gif for a file. This allows developers to easily find files and should prevent too many files from being stored in a single directory. The edge case would be a node with a multifield with millions of attachments to that field. It doesn't seem a likely scenario but its possible someone could do it.
Comment #9
Dave ReidWe already have a way to make this happen - there is a setting in the file field configuration, but it's default is nothing, which means all files get dumped into the public directory. Discussing how to make this happen means choosing a pattern now.
Comment #10
Dave Reid@brantwynn: Always consider that adding files to new content means we don't have context such as ID. Also, we'd need to fix that the token replacement when the file uploaded has no context related to it's parent entity at all (see file_field_widget_uri() and nothing that calls this function uses the $data parameter.
Comment #11
saltednut@Dave Reid Could we get around this by storing newly saved files in a tmp directory and then moving them after the entity is saved?
Comment #12
Dave Reid@brantwynn: Sure, it would require some major changes to file fields, but it might be worth a try in order to get that context. I'm sure the maintainers of filefield paths would appricate that work going into core.
Comment #13
slashrsm CreditAttribution: slashrsm commentedI think we should think about this out of field context. We can create default pattern for fields, but we should not forget that files can come from other sources (there is support for WYSIWYG in D8 now, modules like Media, etc.).
Comment #14
Dave Reid@slashrsm Indeed. Also images could via contrib be attached to multiple entities, so I think the same cons for a date approach also apply to an entity/ID approach.
Comment #15
meba CreditAttribution: meba commentedAny pattern that is not going to be based on either semi-random / date information will fail the same way as no pattern. Remember - most sites have one major content type and one major image field.
Let's summarize:
Comment #16
attiks CreditAttribution: attiks commentedThere;s another use case, people using IMCE plugin for wysiwyg, they probably will have a hard time finding their images. No idea if these type of module still exists in Drupal 8, but in general if we start moving images, we will make it hard for all modules trying to access the file system directly.
Comment #17
Crell CreditAttribution: Crell commentedA value unique to the entity isn't going to help much. Instead of 500,000 image files in one directory, you get a directory with 500,000 directories in it, each of which has one file. This is not an improvement.
I think we need to be careful of our goal here; it shouldn't be to "solve" the performance problem. It should be to mitigate it. Whatever we do, we're going to have patterns that result in epic fail in some use cases. We should try to have a default pattern that isn't going to epic fail in the most common use cases, and *suggests to users that they can setup their own rules* based on their use case. ("Teachable moment", as they say in the Ed biz.)
Comment #18
slashrsm CreditAttribution: slashrsm commentedIt looks that we have two options here:
- We use something file-related for sub-folder token. Upload date seems the most appropriate in this group, but it has some obvious problems. Pros: semantically nice urls, easy to understand, seen elsewhere (Wordpress?); Cons: possibility of migration problems (not if timestamp is also migrated over), may cause confusion among users and less experienced developers.
- Pseudo-random sub-folder token. I personally like the approach that hash_wrapper takes, since it ensures good dispersion by design (seen it in action on a site with tens millions of files and it works great). We could also use parts of UUID or filesize's modulo. Those two methods also ensure quite good dispersion of files and seem easier to calculate (compared to hashing) and probably a bit easier to understand (compared to hash_wrapper). Pros: good dispersion in most use-cases, no problems with migrations; Cons: harder to understand, does not provide nice/semantic URLs, harder to directly browse file system.
I completely agree with @Crell about the "teachable moment". People should be aware about implications of huge file libraries and not simply follow defaults we provide. While this might work for smaller sites it is definitely not acceptable for high-end projects.
Comment #19
Fidelix CreditAttribution: Fidelix commentedHow about these:
{field_machinename}/{year}-{weekoftheyear}
{field_machinename}/{year}-{dayoftheyear}
{field_machinename}/{year}-{month}-{day}/
Comment #20
Crell CreditAttribution: Crell commentedfield name/year-month seems like a reasonable "common default case" to me. (Most sites won't need per-day, I suspect.)
Comment #21
Fidelix CreditAttribution: Fidelix commentedI proposed "week of the year" because it won't be recognizable by the average user looking at the folders thinking that's the image timestamp, neither it's too random/dirty like a MD5 or another type of hash.
But yeah, it we take that aside then field_name/year-month is good enough, I think.
Comment #22
askibinski CreditAttribution: askibinski commentedI would prefer not using field/entity machine names in the paths because files can be uploaded from outside a field (wysiwyg editor) or media module.
Year + Month + Day (eg. /2014/01/21/) should probably be enough for most cases.
Comment #23
slashrsm CreditAttribution: slashrsm commentedHow do we want to implement that? As as default configuration on image/file fields or on a lower level (more general approach)?
Comment #24
Crell CreditAttribution: Crell commentedJust a default on newly-created file fields, IMO. KISS. This is not quite a Novice issue, but not far from it.
Comment #25
estoyausenteSomething like this?
I put the next pattern: {field_machinename}/{year}/{month}/
If somewhere needs a better implementation for this topic because he has thousand of files in a month... If you want I can try other implementation. I accepts suggestion.
Comment #26
stijntilleman CreditAttribution: stijntilleman commentedShouldn't it be the other way around?
If the user set an upload location you add /{year}/{month} to that location. The way I understand this issue is that if nothing was entered by the user the token will automatically filled with {field_machinename}/{year}/{month}. That way the files root will not be cluttered with files.
Comment #27
estoyausente@stijntilleman mmm... I'm not sure. It's posible, yes. I'm going to change and test again.
Do you think that is a correct pattern?
Comment #28
estoyausenteI change the patch like #26 told me.
Comment #29
stijntilleman CreditAttribution: stijntilleman commented@estoyausente I tryed applying your patch but it didn't work. I manually created the patch and this one should apply.
Comment #30
Crell CreditAttribution: Crell commentedHm, this is doing the work at runtime. I was thinking we just set the pattern by default in filefield, so it shows in the UI. That helps suggest to people that it's changeable.
Also, remember to set the issue to "needs review" when you post a patch so that the testbot can find it.
Comment #31
estoyausenteThanks @stijntilleman (I'm novice and sometimes have problem with the patchs ^^).
@Cell done. Now it have need review status.
And... I'm not sure about your suggest. In the filetype fileld form, show an input with a pattern, for configure the fieldpattern? It sound great. But I'm not sure if I can do it. I Never work with form in D8. I will try it.
Comment #34
Anonymous (not verified) CreditAttribution: Anonymous commentedHas any decision been reached about how things will be done in D8?
I love this one! I think the cons are quite minor, as you should be handling files through Drupal and not directly through the filesystem. This way of doing it would also help to identify and not duplicate identical files, a nice benefit.
Comment #35
erwangel CreditAttribution: erwangel commentedFrom user's side experience, I can say that an alphabetical/lexicographical order on directory tree is more meaningful to a user than a day or any other number split, although this second option has some sens to news sites. I am operating both news and e-commerce sites.
- For news sites based on drupal I have this system :
$files/file-type/optional-content-type/year/month/day/filename.ext
- For user generated content sections like picture galleries I have this system :
$files/file-type/user/year/month/day/filename.ext
- For e-commerce sites based on Magento I use this architecture :
$files/optional-category-or-brand/firstletter/secondletter/filename.ext
where firstletter and secondletter derive from filename, i.e. "f" and "i" => f/i/filename.ext
So architecture make sens depending to the use case. We sometimes have to locate a file in a filesystem architecture and then it should be intuitive. There is no ideal situation, but we can give the choice in initial settings of uploading file path to use a numerical or an alphabetical subdivision system along with some tokens about file-type, user-name or content-type. I think this can be easy to configure and understand even to starters and it covers a large number of usage cases.
Comment #36
pbattino CreditAttribution: pbattino commentedPardon the question, but why this is not considered a "D7 also" issue? I know nothing about the main structural differences between D7 and D8 as far as files are concerned, but having a look at this patch seems no D8 specific.
Is is possible to have this issues backported?
Comment #37
batigol CreditAttribution: batigol commentedHeh, yea it was one of the biggest problem for me - multidomains and multilanguage markets site - real pain in the ass.
Media was totally out, I did use:
https://www.drupal.org/project/insert
https://www.drupal.org/project/imce_filefield
https://www.drupal.org/project/filefield_paths
https://www.drupal.org/project/filefield_sources
https://www.drupal.org/project/colorbox
and File (Field) Path settings was something... like this - [current-date:custom:Y-m]/[node:nid]-[node:language]-[node:domain:id]
Comment #38
kerios83 CreditAttribution: kerios83 commentedPlease remember that some images need to used twice or more times on the same page.
Comment #39
alimac CreditAttribution: alimac commentedMinor tag cleanup - please ignore.
Comment #42
slashrsm CreditAttribution: slashrsm commentedAttached patch follows the suggestion that @Crell gave many times in this thread. It simply adds a default file directory pattern on a file/image field. It uses a workaround for field name replacement as we don't have a token for that in core. Implementing that in this patch would be a) completely out of scope and b) would make the patch much, much bigger. If the workaround is not acceptable I'd rather simplify the default pattern instead.
Comment #44
slashrsm CreditAttribution: slashrsm commentedTest fixes.
Comment #45
Crell CreditAttribution: Crell commentedLet's go ahead and add the token if we can. This sort of code debt is a time bomb, and the sort of "good enough for now" that has gotten us in trouble before. :-) If we're going to do it, let's do it right.
I also don't really care for the year/month being in the default path. That feels unnecessary to me.
Comment #46
slashrsm CreditAttribution: slashrsm commentedI agree. That is a nasty hack :).
How do we want to approach this? Add proper token support for fields in a separate issue and postpone this until that gets in? Which field-related tokens we want?
Comment #47
Crell CreditAttribution: Crell commentedI'd have to see what field related tokens exist now; My knee-jerk thought is to add just what we need for this issue, in this issue. We'd also need a beta eval added...
Comment #48
slashrsm CreditAttribution: slashrsm commentedI believe there are none at the moment. That was the main reason for the nasty hack in #44. We only need field name I believe.
Comment #49
Crell CreditAttribution: Crell commentedLet's add it then. Hacks--
Comment #50
slashrsm CreditAttribution: slashrsm commentedDone.
Comment #51
Crell CreditAttribution: Crell at Palantir.net commentedDo we still need this part now?
This may be out of scope as well, but... \Drupal:: inside a class is a code smell. Always, without exception. What's this method doing that it needs to run a token replacement? It says "generate sample value", which... shouldn't need a service for that, should it? Is that a pre-existing design flaw in the field system?
Grab this service once and use it twice. Doesn't make a big difference in a test but it's a good habit to get into. :-)
Comment #52
slashrsm CreditAttribution: slashrsm commentedThank you. Glad to see we're moving this into the right direction. :)
This was added with field tokens. Token service needs field storage definition in order to process them and here we're making sure the service is always there.
Since this is static context we can't inject. We could just skip token replacement (as we do now), but that assumes we'll never use field definition with non-empy "file_directoy" configuration variable in here. Sounds wrong to me.
Done. Also noticed unnecessary file_load() few lines above. Fixed that too.
Comment #53
Crell CreditAttribution: Crell at Palantir.net commentedI'm not going to try and critique the design flaws of the token API at this time, so I guess let's move forward. :-) Thanks!
Adding beta evaluation, too. I don't think we need a change notice for this.
Comment #54
willzyx CreditAttribution: willzyx commentedString::checkPlain was deprecated (will be removed before Drupal 8.0.). Can we start to use
\Drupal\Component\Utility\SafeMarkup
's methods?Comment #55
alexpottRe #54 yep we shouldn't be adding new usages.
Also the issue summary could be updated to say what the patch is actually doing. Plus it might be good to have a site builder CR to tell people that this default is changing.
Comment #56
slashrsm CreditAttribution: slashrsm commentedFixed #54.
Comment #57
slashrsm CreditAttribution: slashrsm commentedComment #58
slashrsm CreditAttribution: slashrsm commentedIssue summary updated and change record draft created. This was technically already reviewed and was RTBC in the past.
Comment #59
Crell CreditAttribution: Crell at Palantir.net commentedI must have missed SafeMarkup becoming public in favor of String... Latest patch still RTBC IMO.
Comment #60
Berdir@Crell: #2454447: Split Utility\String class to support PHP 7 (String is a reserved word)
Patch looks fine to me, would be useful to have a "by default" or so in the issue title/commit message I think.
Comment #61
slashrsm CreditAttribution: slashrsm commented@Berdir: like this?
Comment #62
alexpott@Berdir: like this?
Comment #63
alexpottThis is an odd inline comment.
Comment #64
rteijeiro CreditAttribution: rteijeiro commentedFixed. Is this way correct? Could not find anything in [#1354]
Comment #65
toamit CreditAttribution: toamit commentedI did not see this issue until now, so I'd request you folks to take a look at this proposal as well for file handling.
https://www.drupal.org/node/2485807
Comment #66
toamit CreditAttribution: toamit commentedIn current patch, which tokens will be available to user besides year/month. Will they include system wide tokens like [nid], [uid] etc?
Year token is mostly useless, as we will get handful of folders. Month gives us 12 folders, a bit of improvement, but does this really solve the original problem?
field-storage/[uid]/[nid] approach will be more scalable for authenticated uploads (if anonymous uploads are permitted that is a different can of worm)
Comment #67
Crell CreditAttribution: Crell at Palantir.net commented#64 looks fine to me; not RTBCing myself because I'd prefer to have someone from token and/or entity API sign off on it. Who should we get for that?
Comment #68
woprrr CreditAttribution: woprrr as a volunteer commentedA small error i think ;)
Comment #69
jonhattanPatch doesn't apply any more due to core changes. Rerolled and fixed #68
Comment #70
Crell CreditAttribution: Crell at Palantir.net commentedGoing out on a limb...
Comment #71
webchickHm. A bit torn here. :\
I've definitely hit this problem myself on sites with tons of file uploads, and it's extremely painful to deal with if it happens to you.
However, I'm concerned that the beta evaluation may be underselling the disruption here, especially to end users using modules such as Media Browser or IMCE or what have you that provide a browser to add existing uploaded files to your content. These content authors (aka, the least technically savvy users of a Drupal site) are now going to have to click 3+ times into deeply nested folders, named firstly after system internal details (particular filefield/imagefield machine name, which is not exposed to them anywhere), and secondly after "what
daymonth did you upload it?" (as if anyone can remember that :P) in order to find files/images that they uploaded in the past.So I wonder if it'd be better to kick this to 8.1.x instead, and postpone it in on having a file browser built into core with a search box by file name?
Not doing that postponing yet so we can discuss it a bit more, but for now tagging for usability review, since the beta eval says this is ostensibly for usability and there's no one from the UX team who's chimed in here yet.
Comment #72
Crell CreditAttribution: Crell as a volunteer commentedwebchick: Honestly, I don't recall the last time I used the file system directly via any module, and I've recommended against doing so for the past 8 years or so. The internal organization of the files directory is a Drupal implementation detail, and anyone relying on that is still thinking in Drupal 5 terms at best. Drupal manages files for you, and provides categorization, tagging, and whatever other organization. That modules still exist that encourage you to break through that abstraction and dig into the file system directly is both surprising and depressing.
IMO, we should view the organization within the files directory the same way we do the DB schema now in D8: An implementation detail that you should not rely on, and if you do then you're on your own.
Comment #73
slashrsm CreditAttribution: slashrsm as a volunteer and at Examiner.com commentedI agree with @Crell. There are two additional arguments that are in favour of this patch IMO:
1. It is still possible to change file_directory pattern, which allows people that might be using IMCE or anything similar to simplify their folder structure.
2. If I have to choose between default configuration that works nicely with solutions that we're trying to avoid in the future anyway and avoiding potential serious performance problems I'd definitely pick the latter.
Comment #74
toamit CreditAttribution: toamit commentedThis patch is quite useful, not only it provides customization as earlier, but provides scalability those who need it. The usability issue can simply be handled with a short description in UI.
Its also high time that IMCE and its kin add a feature which uses restrictions imposed by content type rather than provide global browsable/upload solutions. Ofcourse those who need or like global files as spaghetti could change the path to /files, etc
Comment #75
webchickA second question would be if there are existing patterns we can pull from other CMSes for this. I have a feeling none of them would store files in top-level folders called field_txn_img (for example), but rather by date or whatever instead. That would make this a lot less controversial if it became a "bring Drupal inline with existing CMS standards" than "Let's make up a file directory structure that will result in fewer large file repos, but pushes system internals to content authors."
Comment #76
Crell CreditAttribution: Crell as a volunteer commentedSee #3-#17 for a discussion of what the default path should be. We already went over this 2 years ago.
The current path *is* storing by date, but also by field. Field is a decent substitute for "type" or "use case" in a typical setup. (All, no, but all setups was never the goal of this issue.)
Also, this default does not "pushes system internals to content authors." As noted in #72-#74, content authors should be blissfully unaware of this entire detail. If they somehow need to click through these directories to get to a file, it's because they're using a contrib module that is, quite frankly, doing it wrong and it's that contrib module's problem. Using just core, a content editor will never see this path, period, ever. So "what other CMSes do" is completely irrelevant here.
All this patch does, in the end, is this:
Everything else is just support code for changing the default value in the UI from an empty string to default that is less likely to result in giant directories (less likely, not guaranteed), and by extension shows site builders (not content editors, they're entirely irrelevant here) that they can (and should) set up a path structure for their files (which can be whatever the heck they want). That's it.
It only took us 2 years to add a default string because tokens. :-)
Comment #77
webchickIf it's the default, that means we are making a statement that we think it'll work for 80% of sites, and so I don't think asking for 10 minutes of research into what a couple of other CMSes do so we can (possibly) match them is out of line.
Comment #78
slashrsm CreditAttribution: slashrsm as a volunteer and at Examiner.com commentedWordpress uses [Y]/[m]/filename.ext as far as I know.
Comment #79
rootworkHere's what I've found:
uploads/Y/m
. sourceimages
. source Using the popular Joomla Content Editor it separates out all article-related uploads but puts them all inimages/stories
. sourceDigital Assets/[sitename]/[content_type]
but within the content types everything is in the same folder by default. sourceMedia Library
. sourcestorage/original/[mime_type]
and then stores files with a hash. Most users never refer directly to the filename. sourceuploads
, though I'm not completely sure if it creates subfolders. sourceHope that helps!
Comment #80
Crell CreditAttribution: Crell as a volunteer commentedThanks, rootwork! My read of #79:
* The hosted services use a CDN or similar with file name hashes.
* Several use one-big-dumping-folder, which we're trying to avoid here.
* Among those that use some sort of segmenting, dates, content type, and file mime type(!) are all used by somebody but there's no consistent convention.
* "Most users never refer directly to the filename" seems to be popular, and is what Drupal has moved to as well. If anything, that's the only attribute that even approaches consistency across them.
Based on that, I think the current proposal of field_type/date is reasonable and achieves the goal of "a typical user in the default case won't have a million files in one directory and make his OS cry".
Based on that, setting back to RTBC for webchick.
Comment #83
slashrsm CreditAttribution: slashrsm as a volunteer commentedLooks green.
Comment #84
Bojhan CreditAttribution: Bojhan commentedI assume in the ideal WYSIWYG file browsing integration the map structure is not directly tied to what you show in your browser. If we expect no issues on this front, I don't think we should keep this patch from going in. I would like to have some assurances on this front though, as we should indeed not introduce a very big UX problem this late in the cycle.
Comment #88
slashrsm CreditAttribution: slashrsm as a volunteer commentedReroll.
Comment #91
slashrsm CreditAttribution: slashrsm as a volunteer commentedComment #92
attiks CreditAttribution: attiks commentedSmall nit pick
Use single quotes
Comment #93
rootworkSmall nit pick addressed :)
Comment #94
DrupalRanger CreditAttribution: DrupalRanger commentedWill there be a back port for this addition for D7?
Comment #95
Pls CreditAttribution: Pls as a volunteer commentedYep, would be great to have it as default setting for new D7 installs. Is this possible?
Comment #96
anavarreDidn't we just deprecate SafeMarkup::checkPlain()?
Applied #93 and there's a problem with the [field-storage:name] token that is not being computed:
/var/www/html/head/sites/default/files/field/[field-storage:name]/2015/09/myfile.txt
Comment #97
anavarreComment #98
anavarreAttempting a straight reroll to chase HEAD.
Comment #99
anavarreComment #100
Pls CreditAttribution: Pls as a volunteer commentedWhy we are using only Year and Month as date tokens? Year/month/day format looks reasonable, especially when site is big and files are uploaded frequently (month directory can get big fast). Actually, I would be totally cool with Y/m/d (for ex. 2015/10/05) directory format, field name prefix just seems unnecessary and drupal'y :)
Why do we need field prefix? Does it make better directory separation?
Comment #101
anavarrere:#100 - #98 is a straight reroll only and there's more work to do as #93 pointed out.
To your comments, I agree with Y/m/d or even Ymd (if we're concerned with the number of folders Y/m/d would create in the long run), that's what I always do, too (and it's also what @attiks suggested in #4). I don't do field/field-name at all, mostly because it's ugly in the URL + reasons listed in #22
Large media sites will potentially upload hundreds of assets every day and from experience I've never seen issues with as many as ~2000 files per dir, which gives quite some room for a day-based organization. Let's make a guess: 2000 assets x 31 days = 62000 assets - If those were to sit under a Y/m organization, then this would not be optimal.
Migrations can definitely be causing issues too but I'd consider this to be an edge case that most likely should be planned ahead anyway. Also, we shouldn't wait on committing this based on edge cases IMHO. This is just too important for too many sites to have sane defaults for assets.
Comment #102
Pls CreditAttribution: Pls as a volunteer commentedRight, Y/m/d or Ymd looks reasonable choices. Can we vote on directory format selection? This should be settled before making a patch.
I think that if majority of interested drupalers agree on one format, then there is no need to make ugly default with field prefixes, just because we can.
Comment #103
anavarreHere's an opinionated approach that organizes files under /assets/YYYYMMDDEDIT: nope, that doesn't fly as well as assets/YYYY/MM/DD - Thanks @stefan.r for talking this through.
Comment #104
anavarre#102 - Quickly created https://docs.google.com/spreadsheets/d/1trOcEvW4Yl7iH8VJ1MTlrhlD0cYhQrDa... for people to vote.EDIT: issue is closed, Google Doc has been deleted because it doesn't serve any purpose any longer.
Comment #105
anavarreChanged priority because this *has* to land before D8 is out to the masses.
Comment #106
tstoecklerCan we move those to a new field.tokens.inc please? That would be great.
Comment #107
anavarreLet's try with #106. Also reverted to the below organization as it seems to be preferred for now:
'file_directory' => 'assets/[date:custom:Y]/[date:custom:m]/[date:custom:d]',
Comment #108
anavarreMessed up with the previous patch. Let's break this down in 2 steps. First, the new files reorg.
Comment #109
anavarreAnd now reverting changes to file.module to address #106. Hopefully the patch makes sense now?
Comment #111
webchickFWIW if this changes to just date parts vs. exposing internals (field machine names) to site visitors (in URLs) / content authors (in browse windows), many of my concerns go down a significant amount. I assume that option was thrown out for some kind of valid reason, but the issue summary doesn't really explain how the proposed resolution was arrived at.
Comment #112
anavarre#111 - As I understand things, #72 (using the file browser) and #79 (research on how other CMS'es do it) have tried to answer your concerns. Is there anything else that is still causing you concerns that we should address?
Also, stepping back a bit we have to take into account what is the current state of things and where we're headed:
We *might* be losing a bit of UX via the WYSIWYG's file browser (click click click) but this UI would anyway become moot at some point because simply listing files will no longer be possible when the file system chokes on too many assets to load. Also, we're just suggesting sane defaults that would satisfy the vast majority of sites (aka most common denominator) and also would help people who don't know what they're doing from having FS issues. It might not be in scope for this issue but ideally I would also like to enforce sane defaults for the maximum size of uploads (2M? 5M?).
If people are not happy with the default files structure, they can totally go and change this to something that suits them better. I think that's pretty reasonable considering the current state of files in D6/D7.
Comment #114
pwiniacki CreditAttribution: pwiniacki commentedHmm I remember at 2010 when I did create multilingual domain access page, that I used file_field_path module to exclude this drupal behavior - fixing it by YYYY/MM would be nice.
I did use [current-date:custom:Y-m]/[node:nid]-[node:language]-[node:domain:id] token. I also changed image folder to custom MEDIA inside drupal folder (shorter path ->yousite/media/token/picture).
Comment #116
anavarreComment #117
Crell CreditAttribution: Crell as a volunteer commentedI have no idea why we're adding day. For most sites that will spread files out so far you'll never be able to find them. The typical site is NOT uploading thousands of files a day. Also, not grouping by field name means that I can't easily tell when auditing the file system what node or field the file came from, and thus how to know if it's left over flotsam.
I'm very tempted to unsubscribe from this issue at this point, as a simple minor UX and performance improvement is now at 117 messages and 2 years, and now we're talking about polls and votes and other pointless crap. It's like a microcosm of why I'm less involved in the core queues these days. The ROI of even trying to help on issues like this one is frankly negative more often than not.
Comment #118
nielsvm CreditAttribution: nielsvm as a volunteer and at Acquia commented+1 for FIELDNAME/ prefix, this is absolutely required as it doesn't only help auditing, it also clearly groups files by functional role and for instance allows manual removal during development and allows users to do "du -hs FIELD/" to assess the upload size by field.
Hearing you crell, let's instead just cut a decision and get this in and dealt with.
FIELDNAME/YYYY/MM/
Let's not forget that the lack of this sane default, caused important sites to go down years after the fact badly reflecting on Drupal. Such a simple change, such a great improvement.
Comment #119
nielsvm CreditAttribution: nielsvm as a volunteer and at Acquia commentedAgain folks, we're talking defaults here.
High-profile news sites simply requiring their site builders to append DD/ to my proposed default...
everyone else should be A LOT safer!
Comment #120
anavarreBased on experience, many, many sites have been bitten by too many files under one directory (in regular day-to-day operations or with migrations). I'm not saying it's DD or nothing but YYYY/MM could not be enough for large Drupal sites. If people really think YYYY/MM is enough, well let's go for it for the sake of committing this sooner than later.
That is a very good point.
That's called democracy. Sorry you don't like it :-/
This issue is no small change, while it looks like it. Whether or not we do it right will impact so many sites for the foreseeable future and I'd rather have 50 more comments on this issue and still get it committed before the GA as long as we get this right once and for all. Honestly, we should all be rejoicing because we're so close to fixing one of the main pain points with Drupal sites.
Comment #121
nielsvm CreditAttribution: nielsvm as a volunteer and at Acquia commentedLet's get this accelerated:
https://www.drupal.org/files/issues/2128055-121.patch
https://www.drupal.org/files/issues/interdiff-109-121.txt
This patch:
[field-storage:name]/[date:custom:Y]/[date:custom:m]/[date:custom:d]
core/p...install/field.field.user.user.user_picture.yml
in line.core/p...install/field.field.node.article.field_image.yml
in line.Opinions on format still differ, but please let's focus on the patch code and go for consensus so that we can get this in before 8.0 gets out. Thing is, when 8.0 is out people already building production sites, will start off with messy files directory already risking the first harm..
So opinions aside, this fixes the problem.
Comment #122
anavarreApplied #121 successfully. Confirmed that the behavior works as advertised, not only for new files but for the default image field, too (which originally defaulted to field/image). Really like the approach. +1
Comment #123
YesCT CreditAttribution: YesCT commentedday seems too fine grained to me.
Comment #124
slashrsm CreditAttribution: slashrsm at Examiner.com commentedI agree with @YesCT.
Comment #125
YesCT CreditAttribution: YesCT commentedadded a remaining task to the summary to address the concern about exposure of machine names
Comment #126
anavarreAddresses @slashrsm, @YesCT and @Crell's concern (and probably others in this issue I've missed to mention). Back to [field-storage:name]/[date:custom:Y]/[date:custom:m] - Hopefully we have a consensus.
Comment #127
anavarreTested with Article's default image field + a new file field added to the Article content type. Here's what I see in the URL bar:
And from the FS:
In more details:
Comment #128
nielsvm CreditAttribution: nielsvm as a volunteer and at Acquia commentedI fail to see the 'field name exposure' concern, we're exposing machine names at many other places (e.g. in views markup) too after all. Even besides that, who would design a site with a field name containing a secret too big to expose?
Comment #129
anavarreBesides the concerns with exposing machine names (that I don't find to be problematic), I wonder if it's likely this will make it to core for 8.0.0 (with the rc eligible tag?) or if this should be tagged minor version target instead?
I honestly think we'd be missing a big opportunity to help users and companies help themselves by having this in ASAP and promote best practices. Else we're in for years of troubles to fix messed up file systems.
Comment #130
anavarreAlso tagging scalability as this does impact distributed file storage systems.
Comment #131
Fidelix CreditAttribution: Fidelix as a volunteer commentedI still think this is too deep: [field-storage:name]/[date:custom:Y]/[date:custom:m]/[date:custom:d]
Instead we should use [field-storage:name]/[date:custom:Y]-[date:custom:m] as a sane default.
That would work for the great majority of installations, and it's simpler.
Comment #132
anavarre#131 - Doing this would mean 12 folders per year x N years and nothing to be concerned with indeed. Sorry I had overlooked your previous attempt at suggesting this in #19.
Comment #134
anavarreComment #138
anavarreComment #139
anavarreComment #140
anavarreComment #141
anavarreTo answer @YesCT's concern in #125 about fields/machine names being exposed in the URL, here's the HTML source for the default image field:
<div data-quickedit-field-id="node/2/field_image/en/full" class="field field--name-field-image field--type-image field--label-hidden field__item quickedit-field"> <img property="schema:image" src="http://head.local/sites/default/files/styles/large/public/field_image/2015-10/photo.JPG?itok=YMZpC79M" alt="My image" typeof="foaf:Image" class="image-style-large" height="360" width="480">
field_image is already exposed in data-quickedit-field-id and didn't seem to be a concern before? IMHO this would be security through obscurity but there are always ways to know what you want to know (e.g. Views is exposing quite a lot)
Comment #142
anavarreComment #143
pwolanin CreditAttribution: pwolanin as a volunteer commentedFor this use case you would probably want to have a machine name token that gives you _ replaced with - ?
It's too bad FieldItemBase seems to be using \Drupal and not better dependency injection, so there maybe it not another option for this patch to get the token service.
It would be good to have some more test coverage here and to think about how to handle tokens with odd characters.
Comment #144
nielsvm CreditAttribution: nielsvm as a volunteer and at Acquia commentedIt's one simple
str_replace()
away, but I wonder if we should at all. All it does is adding slight runtime overhead and the machine names aren't exactly user input either... the UI-level input-validation is very agressive for machine names, only machine names through imported yml remains a vector: and that attack route allows a lot, lot, lot more danger.I gave it a stab to see if we can get rid of
\Drupal::token()->replace
, no way, unless we start doing setter injection deeper down, but that's way more disruptive than this in my opinion. Having that said, all higher level classes seem to use\Drupal
quite a lot.I'm trying to think of what more we need, happy to get the patch improved further.
Comment #145
nielsvm CreditAttribution: nielsvm as a volunteer and at Acquia commented@pwolanin just tested through the UI again and made a screenshot of it, there is no way that it lets through anything besides lowercase letters, numbers, and underscores. Even when clicking edit after the autogenerated machine name, the edit field gets validated.
This is good, as it means all ui-generated machine names are safe to use on the filesystem. The only open vector remains imported YML config, but as said before, that's not up to this patch.
Comment #146
Crell CreditAttribution: Crell as a volunteer commentedYet another reminder, since people still seem to be missing this part: All we're adding here is a default. We're adding a default value for a text field. That is all. A site still has complete freedom to organize its files however it wants using whatever black magic it can make tokens support. 70% is the target here, NOT "all sites big and small".
anavarre: No, Drupal is not a democracy, nor has it ever been, not even a little. We have a governance structure defined, and it's not democratic. In case of a bikeshed we should be deferring to the subsystem maintainer.
Comment #147
webchickAnd because it's a default, if we make the default just Y/M date parts, we are both more inline with what other CMSes do, and don't have to deal with any of the stuff in #143, etc. People like Crell can always add machine name into the directory pattern if they want. (FTR, I'm not concerned from a security aspects of field name, but the UX aspect. URLs are part of the UX of files, and legibility of URLs to non-technical people matters.)
I do think it's too late for this in 8.0.0 now and would recommend postponing to 8.1.x, but if you want to tag it "rc target triage" given the benefits to sites with massive amounts of files, feel free. Core commiters can then discuss committing it during RC.
Comment #148
Crell CreditAttribution: Crell as a volunteer commentedTagging as suggested. Note that as a side effect we're adding token definitions here. (Whether that makes it more or less likely to be accepted in RC I don't know.)
webchick: I really don't see any UX issue with the URL of the file containing the machine name. Image styles do more to make the URLs long and hard to copy already than anything we're doing here. :-)
Comment #149
anavarreOkay, taking it from here, I'm tentatively adding the rc target triage tag to hopefully make this issue eligible to RC's guideline item #5 "certain other issues with high impact and low disruption at committer discretion only" - Thanks for offering that path forward!EDIT: cross-posted with #148I've updated the issue summary yesterday to list the 3 current approaches. Here they are again for clarity:
I think/hope we've addressed everybody's concerns in terms of:
@webchick - You've expressed concerns about the UX but IDK if anybody here is really qualified to answer. Should we tag the issue accordingly? IMHO http://site.tld/sites/default/files/styles/large/public/field_image/2015... is not much worse than http://site.tld/sites/default/files/styles/large/public/field/image/phot...
So maybe too much is too much and it's "one more thing" that degrades the UX? I'm not competent to tell but just wanted to bring up that we might already have things to be concerned about from a UX standpoint. If we step back and think about what we have today, the file paths only make sense for experienced site builders anyway.
My next question is: what can we do that is still missing from the current patch to ensure we have something as solid as we can?
Comment #150
rootworkI'm supportive of any of the approaches, because they're all way better than the current situation. I would lean toward #126 but since these are just defaults and people can change them, I don't feel strongly about one versus another as long as something gets in. And I hope it does happen during 8.0.0-rc.
@webchick I do think having the field name in the pattern is helpful for (again, default) organization, and like anavarre I don't completely see the UX problem with those file names, though I'd love to see someone review that specifically. But where you say having just the Y/m date parts would be more in line with other CMSes -- I was surprised that in the research I did in #79 that only WordPress appears to use that pattern. The other CMSes either throw everything into one folder and put a hash on it, or group the files into much larger categories like mime type or (in Drupal terms) entity type of where it was uploaded.
I also agree with anavarre that it'd be great to strategize about what else could make the patch even stronger so when folks come back to review it it's solid. I suspect "pick an option" would make this a stronger case at least to new-to-the-issue reviewer. Given what crell said in #146 are there relevant maintainers that might be able to take a look and recommend one option over the others -- just for the purposes of putting forward a single patch?
Comment #151
Crell CreditAttribution: Crell as a volunteer commentedMaintainers.txt's listed file system maintainers are... sadly out of date. :-(
Comment #152
anavarre:-(
Filed #2586419: Aaron Winborn is still listed as File module and File system maintainer in MAINTAINERS.txt
Comment #153
webchickI do understand the desire to subdivide files beyond just date to something more semantically meaningful, so what about a compromise of
files/[entity:machine-name]/[date:custom:Y]-[date:custom:m]
. (Note that I don't know if [entity:machine-name] exists yet, but then again neither does [field-storage:name] so ;))So files uploaded to issues on D.o would get the default path:
files/issue/2015/10/patchy.patch
User pictures would get the path:
files/user/2015/10/stinkyfoot.png
Commerce products would get the path:
files/product/2015/10/buymenow.png
...and so on.
This is much better for end users because they would've seen those words in the UI already, when they created the content from node/add or whatever in the first place, and generally-speaking content types are named something in plain english vs. field machine names which might be anything (and generally start with an annoying "field_" prefix which is a waste of 6 bytes on every file path).
Comment #154
Crell CreditAttribution: Crell as a volunteer commentedThe only downside of #153 is that it's a larger change to the patch, I suspect, to expose a new token. :-) I could live with it, though.
Also, your suggestion uses a dash, but the examples use a /. Which did you intend? (I don't have a strong enough preference between 2015-10 and 2015/10 to fight for either one, I think.)
Comment #155
anavarreNote that [date:custom:Y]-[date:custom:m] was suggested in #19 and again in #132 + is listed as an option in the issue summary. The main advantage being to limit the number of directories that will have to be created. While I've never seen any issue with [date:custom:Y]/[date:custom:m] or [date:custom:Y]/[date:custom:m]/[date:custom:d] in the past, I also understand the need for simplicity and the low risk it represents makes [date:custom:Y]-[date:custom:m] a pretty good candidate.
Not opinionated on [entity:machine-name] either. Maybe it makes individual fields auditability a bit more challenging though?
Comment #156
Crell CreditAttribution: Crell as a volunteer commentedanavarre: I suppose it's a question of whether we want to default to field-based auditing or entity-based auditing. Both have valid cases, I suppose. I don't feel strongly enough either way there, as both are logical and both are more logical than pure date.
Comment #157
webchickSorry, I missed that detail; no strong feelings one way or another about "-" so that's fine. :)
Comment #158
timhilliard CreditAttribution: timhilliard at Acquia commentedI prefer #121 over #126 over #132.
I think using hyphen is unnecessary and most people are used to the extra directory for month. I also think adding the day as #121 does is worth it as I see the problem where a feed importer or something to the like might ingest thousands of files every day. This could easily add up to a very large number of files for a whole month but would be manageable if separated by day. I realize that if the client does end up with a feed imported into their site they should be able to adjust the file path but it is often over looked and only dealt with once it becomes a problem, why not just handle this with the default since one extra folder layer will not be much overhead.
[entity:machine-name] sounds great but no strong opinion of using that over [field-storage:name], especially given it doesn't yet exist. Maybe we go with [field-storage:name] since we already have that and don't have to wait for / build a new token. So #121 as is with #126 as is as second preference.
Comment #159
slashrsm CreditAttribution: slashrsm at Examiner.com commentedI'd just like to remind everyone once again: we are talking defaults here. Every site can choose to take it's own approach, every field can use something different, distributions can ship with their own defaults.
Let's try not to spend few more months discussing trivial questions. Please, please :)
Comment #160
webchickJust to be clear, unless we go with raw date parts (e.g.
YYYY-MM
, which is the simplest thing that could possibly work, but there was pushback on that), we're inventing a new token either way. This is why I was pushing for raw date parts, because that would cut this patch down to like one line (ok, maybe 10 lines ;)) and make it much more palatable as an RC target (because, as slashrsm says, these are just defaults).Comment #161
anavarre@webchick: are you saying that we should go with
YYYY-MM
defaults only if we want to see this land before the GA and we should pursue more solid defaults to target 8.1 instead? Or is there any chance it could make it 'as is' for the GA, still?Comment #162
slashrsm CreditAttribution: slashrsm at Examiner.com commentedI think that @webchick tried to say exactly that. It is much more likely this will go in if simplified and something is better than nothing IMO.
It is not optimal, but it is a start. We can still add token support for fields and improve default setting in follow-ups.
Comment #164
slashrsm CreditAttribution: slashrsm at Examiner.com commentedMeh... :)
Comment #166
timhilliard CreditAttribution: timhilliard at Acquia commentedLooks good. I still prefer YYYY/MM over hyphen but I'm not going to hold this up over that so RTBC.
Comment #168
slashrsm CreditAttribution: slashrsm at Examiner.com commentedThis patch is identical to the one on #162. Hopefully this fixes the mess that I created.
Comment #169
nielsvm CreditAttribution: nielsvm as a volunteer and at Acquia commentedAlthough I'm a strong advocate of the field name token, I'm backing the idea of 'getting it in' over 'getting it perfect'. For any high-traffic site, the #168 patch will do it, period.
As long as we're not shipping 8.0 with a ever-exploding
sites/default/files
by default, I'm more than happy to wait for even better defaults post 8.0.+1
Comment #170
nielsvm CreditAttribution: nielsvm as a volunteer and at Acquia commentedReviewed #168 once more and still looking good and sane, let's hope testbot doesn't take long either.
Comment #172
catchI just scanned the past 50 or so comments.
Just a general point that it's only core where a file is tied to a specific field name. It's quite possibly to reference the same fid from multiple fields, and when that happens the field name will not make any sense - the file may even switch from one field name to another if the original entity it was uploaded to was deleted.
For auditing files, you can do that with code by comparing the contents of the files directory to the database, I'm not sure how naming helps there at all.
Year and month seems great for the default.
Comment #173
anavarreFiled #2592251: Uploading files should have more robust default tokens if this issue gets committed.
Comment #174
alexpottDiscussed with @effulgentsia - we've had a problem with default file organisation since forever. It makes sense to provide some defaults that make it easier to manage sites especially for people new to Drupal who don't know about such settings. Considering we're just changing the defaults here it is not a risky change especially now we're doing the minimum possible change so let's do this in RC - it makes the out-of-box experience of Drupal better.
I was going to say I'm not sure about this change because user pictures are different - but pictures is hardly that helpful a name either.
Comment #175
webchickWhile I've seen the lack of separating folders within the files directory bring down sites, I've never encountered any real-world problems with having user pictures segmented off in their own folder before. So this seems like an unnecessary change, that might introduce more problems than it solves. Should probably be updated to pictures/YYYY-MM instead.
Comment #176
timhilliard CreditAttribution: timhilliard at Acquia commentedI rerolled with @webchick's suggestion as I think it makes sense. I also updated node article field_image to prefix with field/image as this was the directory previously and I think it makes sense to make it consistent with user pictures.
Comment #177
timhilliard CreditAttribution: timhilliard at Acquia commentedComment #180
timhilliard CreditAttribution: timhilliard at Acquia commentedComment #182
timhilliard CreditAttribution: timhilliard at Acquia commentedRe-uploading patches so interdiff doesn't trigger testing. Sorry for confusion.
Comment #184
anavarreI'm really confused now.
field/image/[date:custom:Y]-[date:custom:m]
makesfield_image
a special snowflake. Do we really, truly want that?Comment #185
timhilliard CreditAttribution: timhilliard at Acquia commentedok, re-rolled with the field_image snowflake removed.
Comment #186
timhilliard CreditAttribution: timhilliard at Acquia commentedShould we actually be omitting this since it is the same as the default?
Comment #187
anavarre#186: seems fair to do it. Tested removing
file_directory: '[date:custom:Y]-[date:custom:m]'
fromcore/profiles/standard/config/install/field.field.node.article.field_image.yml
and the default tokens are picked up fine in the UI and files are successfully uploaded in the expected location.Comment #188
timhilliard CreditAttribution: timhilliard at Acquia commentedRe-rolled with field_image override removed.
Comment #189
anavarreComment #191
effulgentsia CreditAttribution: effulgentsia at Acquia commentedI don't like the duplication of this line with a similar line in getUploadLocation(), especially since this doesn't fully duplicate it (it doesn't convert to plain-text).
Even though this works per #187, it's good practice to include the field defaults in the config/install .yml anyway, as it results in exported config not having extraneous differences from the initially imported config. And StandardInstallerTest tests for that via assertInstalledConfig(), which is the test failure in #188.
This patch addresses both of the above.
Comment #192
catch#191 looks great to me now, I think this is a good balance for both the default, the pictures and the article image field.
Comment #195
effulgentsia CreditAttribution: effulgentsia at Acquia commentedTicking credit boxes for people who helped to substantially move this issue forward. Apologies if I missed anyone.
Comment #197
effulgentsia CreditAttribution: effulgentsia at Acquia commentedTo give more background on this, when I first discussed this with @alexpott, I was initially against doing this during RC, because I don't think RC is the right phase in the development cycle to be changing default configuration away from what's been in use throughout all of Drupal 7's lifetime and Drupal 8's development phase. However, in this case, we have plenty of real world experience to know that the existing defaults cause problems for many site owners who don't know that that's what's causing the problem and how to improve their configuration to fix it. So with that and with our other efforts in "Fast By Default" issues, I think it does make sense to still allow this in. Therefore, pushed to 8.0.x and updated the change record to reflect the committed patch and published it. Thank you, everyone, who helped on this.
There were some comments on this issue that asked for a backport to 7.x, so moving to that queue for that to be discussed.
Comment #198
Crell CreditAttribution: Crell as a volunteer commentedShould we add a new issue about adding the tokens to core that this issue was doing originally? (Probably postponed for 8.1)
Comment #199
rootwork@crell I mentioned that in #2592251-4: Uploading files should have more robust default tokens -- so that could be the followup issue unless there needs to be a separate meta-issue for deciding on the tokens prior to an issue to actually create them.
Comment #200
mikeker CreditAttribution: mikeker as a volunteer commentedAdded followup: #2604602: ImageItem::generateSampleValue() throws "Call to a member function id() on a non-object" exception.
Comment #201
Anonymous (not verified) CreditAttribution: Anonymous commentedSo, I am running D8RC2 site and this change is fine with me and for the future it is really good choice. But one issue I have with this is that the default path field for File-based Field API fields has no token tree nor any mention of available tokens in its description. Is there an issue for this or we just assume the user is all knowing? Because to me this looks like a huge UX/UI fail.
Comment #207
vijaycs85Direct port of D8. Probably we might need an upgrade path? wondering if we need to move existing files. Surely the change would work for new fields.
Comment #208
rahul.nahar001 CreditAttribution: rahul.nahar001 at TATA Consultancy Services commentedHi @vijaycs85,
I have applied patch#207. it fixed the issue.
For new image and file fields, the image and file are uploaded in the proper location(files\year-month).
Thanks
Rahul Nahar
Comment #209
BerdirDefinitely not moving existing files or even changing existing settings. However, some tests would probably make sense to verify this works?
Comment #210
zopa CreditAttribution: zopa at Chromatic commentedY-m folder convention still causes too many files - we need files saved to Y/m/d folders.
Refactored patch #207 for Y/m/d file structure use case.
Comment #211
BerdirIt is a default configuration, big sites with many files can always change it. It's meant to be a sane default for most sites, which do not create dozens or hundreds of files per day. This is what has been discussed and agreed upon and how it has been implemented in Drupal 8. So lets keep that.
Still needs tests, so needs work for that.
Comment #212
garth.powell CreditAttribution: garth.powell commentedI have a major issue with the file being placed withing automatically generated folders by date
The problem we have with this is that when we update a file it remains it the folder it was created in, for example the may folder, and then places the new file into the July folder.
The issue we have with this is that all the links on the site to that file still points to the old location.
This means that when we updated files the links on pages will just point to the old file which is a major issue
We are mitigating this by using field redirect and linking to the parent node, not the file but users will always open the file and copy the URL from the file, not the node.
Has anyone else encountered this? If so is there a better solution?
Comment #213
ksenzeeThe patch at #207 does modify the existing tests to make sure that the upload location is correct. I'm not sure what else makes sense to test.
Comment #216
BerdirI am blind :)
Yes, #207 seems fine.
Comment #217
ajayg CreditAttribution: ajayg as a volunteer commentedThis has been RTBC for a while. What is preventing commit for this?
Comment #218
cilefen CreditAttribution: cilefen at Institute for Advanced Study commentedComment #219
marloAZ CreditAttribution: marloAZ commentedWhat about using YYYYW (year + week number, no separator) ; e.g. "file/201830/file01.txt"
In a 5 day week, this allows 500 uploads/day = 2500/folder.
This would suit *most* sites, with the benefit of giving an idea of the file (upload) date without suggesting to users that it's the file version. Whereas if you add hyphen or slash separators, it appears to users that it's the file date.
Comment #220
rootwork@marloAZ I'm going to quote @Berdir in #211:
This issue has gone a couple of hundred comments, and we're working on implementation. If you want to propose a new plan, you can create a follow-up issue.
Comment #221
rootworkComment #222
leventdal CreditAttribution: leventdal commentedI think this not the best arbitrary solution that is neither universal nor robust. Also it is misleading in many aspects.
It is implemented as if it is a photo album or a screenshot software. Using dates is a bit like an excuse for not using totally random hashes.
I also couldn't find any documentation to apply a custom folder other than using private folders.
Comment #223
MustangGB CreditAttribution: MustangGB commentedComment #224
joseph.olstadBumping to 7.70, this didn't make it into 7.60
** Edit ** I have no objections to putting this as 7.61,
for the file_entity module what we did is provide a helper message in the default folder setting but we did not enforce it, either way I think it's good.
see related file_entity commit:
https://cgit.drupalcode.org/file_entity/commit/?id=073d1c06a6dc83f48e6de...
this is already taken care of by the file_entity module version 7.x-2.x / 7.x-3.x in a soft way without actually putting in a default value
but I agree with this change to core, core should provide a good default.
** End Edit **
Comment #225
ajayg CreditAttribution: ajayg as a volunteer commentedWhy bump to 7.70? Why not 7.61?
Comment #226
joseph.olstadI'd love to see this in 7.69, however 7.70 might be just around the corner.
With that said, file_entity provides a helper message suggesting to do default foldername by date and provides the token to do it.
If you need this right away, just upgrade to the latest releases of file_entity.
Comment #227
joseph.olstadThose using file_entity get this already in recent releases but not all installations use file_entity. It is a good idea for core to do this also by default.
Comment #228
izmeez CreditAttribution: izmeez commentedPatch in #207 is the direct port from D8 using folder names Y-m for files.
Patch in #210 uses folder names Y/m/d for sites with large volume of files.
Meanwhile, comment in #212 identifies a problem,
Although it will not address the problem in comment #212, I was wondering if having a conf for settings.php might be useful to allow sites to decide whether they want to use 'Y-m' or 'Y/m/d' or leave empty '' for sites with fewer files that don't need this.
Comment #229
mcdruid@Fabianx and I had a look at this for D7; generally backporting the D8/9 approach seems good although it's not perfect there either.
This change would only affect new fields, and most D7 sites where the "everything in one directory" setting caused problems will most likely have addressed this some other way.
Comment #230
mcdruidOne thing to remove the patch - we shouldn't change a
hook_update_N
implementation.That could be fixed on commit, but ideally we'd test a patch without it.
#207 is preferred as it's a straight backport. Any sites that need slightly different token patterns can set those up for themselves.
Comment #231
DamienMcKennaA version of #207 without the system_update_7060() change, as requested.
Comment #233
mcdruidThank you!
Comment #234
izmeez CreditAttribution: izmeez commentedDoes this require or have a change record?
Comment #236
jigariusI wish the files were uploaded by default to a directory like "YYYY/MM" instead of "YYYY-MM". I'll see if I can find an open issue for that.