Objective
Status: Incomplete — This is just a first proposal.
-
Since PHP 5.4, stream wrappers have matured significantly; our entire Drupal-specific StreamWrapperInterface is most likely obsolete.
(cf. #2107287: PHP 5.4 calls a new stream_metadata() method on stream wrappers not implemented by Drupal) -
Most of the functions used for file operations are split between
file.incandfile.module. They can be grouped by purpose and consolidated as services inside ofDrupal\Core\Filenamespace:-
File API functions consuming
Settingsto ensure that created files are readable/writable:drupal_chmod() drupal_unlink() drupal_mkdir() drupal_rmdir()Drupal\Core\File\FileHandler
— or similar; the Drupal-specific File API helper functions from #2050759: Move drupal_chmod and other code in file.inc to a Drupal\Core\File\FileSystem class - OS/environment helpers:
drupal_dirname() drupal_basename() drupal_tempnam() drupal_move_uploaded_file() file_upload_max_size() file_directory_os_temp() file_create_filename() - Unmanaged file API:
file_unmanaged_copy() file_unmanaged_prepare() file_unmanaged_move() file_unmanaged_delete() file_unmanaged_delete_recursive() file_unmanaged_save_data()They are transformed by #2244513: Move the unmanaged file APIs to the file_system service (file.inc) into this interface:
namespace Drupal\Core\File\Handler; interface UnmanagedFileHandlerInterface extends FileHandlerInterface { public function copy($source, $destination = NULL, $replace = self::FILE_EXISTS_RENAME); public function delete($path); public function deleteRecursive($path, callable $callback = NULL); public function move($source, $destination = NULL, $replace = self::FILE_EXISTS_RENAME); public function prepare($source, &$destination = NULL, $replace = self::FILE_EXISTS_RENAME) { public function saveData($data, $destination = NULL, $replace = self::FILE_EXISTS_RENAME); } - htaccess helpers
file_ensure_htaccess() file_htaccess_lines() file_save_htaccess()Transformed by #2620304: htaccess functions should be a service into
namespace Drupal\Core\File; class Htaccess { public function ensure() { } public function save($directory, $private = TRUE, $force_overwrite = FALSE) { } } - URL generation
file_create_url() file_url_transform_relative()Transformed by #2669074: Convert file_create_url() & file_url_transform_relative() to service, deprecate it into
namespace Drupal\Core\File; class FileUrlGenerator { public function generate($uri, $relative = FALSE) { } public function transformRelative($file_url) { } } - Generic helpers:
Handled by #3035312: Move file_scan_directory() to file_system servicefile_scan_directory() file_get_mimetype() - Other…
-
-
For stream wrappers, we should arrive at this for D8:
Drupal\Component\StreamWrapper\StreamWrapperInterface [ex. PhpStreamWrapperInterface] Drupal\Component\StreamWrapper\StreamWrapperManager Drupal\Component\StreamWrapper\LocalStreamBase [sans getLabel()/getDescription()] ... Drupal\Core\StreamWrapper\StreamWrapperInterface [extends Component\StreamWrapperInterface, adding getLabel()/getDescription()] Drupal\Core\StreamWrapper\PublicStream Drupal\Core\StreamWrapper\TemporaryStream Drupal\Core\StreamWrapper\PrivateStream→ Our
StreamWrappermechanism turns into a re-usable component. -
settings.php file and (local) stream wrapper settings should turn into this:
$settings['file_chmod_directory'] = ...; $settings['file_chmod_file'] = ...; $settings['stream']['public'] = ...; $settings['stream']['private'] = ...; $settings['stream']['temporary'] = ...;
Comments
Comment #1
sunComment #2
sunComment #3
sunComment #4
larowlanWorth considering https://github.com/thephpleague/flysystem for the low-level filesystem interactions?
It has a Null implementation so we can mock for unit testing.
Plus pluggable backends would be useful for things like Media module.
Same kind of notion as storage api module, having seen that in work with a Riak backend - its pretty damn cool.
Comment #5
slashrsm commentedComment #6
slashrsm commentedWhile I not automatically oppose 3rd party libs I really think we should do it in a thoughtful way. How mature is that library? What happens when there is a security update? Would they be able so sync with our releases in such a case? Flysystem looks like a nice tool, but it shouldn't be just about that. I am not familiar with it so I can't really judge. I just wanted to point out this important questions.
We should really modernize file API (with or without introduction of flysystem). We probably need to do this before beta?
Comment #7
almaudoh commentedComment #8
thijsvdanker commentedSince #6 flysystem has taken of seriously; it has a 1.0 release, is used as the filesystem in Laravel 5 and is used/downloaded a lot (77k downloads in the last month).
Comment #10
jibranComment #12
20th commentedI have been messing with File API for a few weeks now, tested a lot of patches found in different issues. This API has a long and complex history and many design problems, some of which, I believe, I was able to identify:
(1)
Unmanaged and managed file APIs in Drupal are traditionally seen as complementary but they have long diverged:
Unmanaged file API consists of these functions:
While managed file API has these:
Delete is not even part of managed API, see below.
(2)
*_copy()and*_move()operations in these two APIs, for the last 6 years, have been maintained to have very similar function signatures. However, there is no real reason for that because they are completely incompatible and operate on different data types.For example, the
$sourceargument offile_unamanged_copy()function is a string and the function returns a string as well.file_copy(), on the other hand, received and returns a wholeFileentity. If these two functions are not compatible and are not interchangeable, there is absolutely no reason to keep this tradition of visual similarity between them.(3)
Delete operations are not even part of managed file API, but part of Entity API.
file_copy()andfile_move()are defined infile.module; they receive an instance ofFileInterfaceas their first argument.file_delete()is defined incore/includes/file.incand receives an integer$fid, and is a simple wrapper aroundentity_delete_multiple('file', $fids);file_delete()is not even used in core; and should not be used. There is nonode_delete(), norfield_delete(), nortaxonomy_term_delete(); there should be nofile_delete(). Managed file is an entity and the correct way to delete entity is with$file->delete();(4)
file_move()is cryptic, it tries to act like a wrapper aroundfile_unmanaged_move()and combine it with entity API; and looks completely broken. What it does:- Moves file on disk.
- For some reason, clones file entity object (not using the correct
createDuplicate()method).- If destination already exists, does some voodoo magic by manually setting entity
idanduuid. I guess, this hack does overwrite existing file entity but still leaves the old data in cache.- Does not clear usages of moved file.
- Does not delete source file entity, keeps it in DB with a broken URI that points to a file on disks that does not exist.
How move should be implemented, IMHO:
Overwriting another managed file is an invalid operation and should not be permitted at all. (Here is even a bolder statement, moving a managed file should not be allowed at all, the same way it is not allowed to move a node from one
nidto another.)(5)
The actual deletion of file on disk (as in
unlink()) is handled inside ofDrupal\file\Entity\Fileclass. File creation, however, is not. It is spread across all 4 functions:file_copy(), file_move(), file_save_data(), file_save_upload()with significant duplication of code.There is no single authority that would manage both files on disk and records in the DB. This is a design flaw and this spawns bugs like this one #1556396: Refactor file.inc to make it simpler and prevent PDO exceptions.
It has become so unclear how to actually manage the "managed file API" that
file.modulehas this code:This TODO is implemented in
Drupal\file\Entity\File::preDelete()method but even thefilemodule itself does not know about that.Futhermore, (6)
core/includes/file.inccontains a lot of other "helpers" that are used here and there, but are not part of any "official" API in Drupal. Some of them found their way intofile_systemservice but the contents of this service seems to be completely arbitrary.For example,
file_stream_wrapper_valid_scheme()was moved intoFileSystem::validScheme()method, butfile_valid_uri()was not. Their role seems quite similar to me, they should be treated similarly.On the other hand,
drupal_move_uploaded_file()was also moved intoFileSystemclass, despite obviously being part of Managed File API (because it is used only there). On a side note, this function does not even have a reason to exist anymore as it was added as a fix for some issue with 'safe_mode' which has been gone since 5.4.Comment #13
20th commentedThe only purpose of 'Managed file API' is to track what files Drupal has written to disk. Data in the database should directly correspond to the stats of those tracked files. To be able to do that, any modification of a File entity must be instantly reflected in filesystem and vice versa:
$file->delete()must directly correspond to deletion of file on disk,$file->rename()must change location, URI and name of file on disk and in DB, etc.Given all of the above and considering the number of abandoned and semi-abandoned patches in the issue tracker that have tried to unbreak file API (take a look at the number of child issues in the sidebar), I propose the following relatively large API changes:
file_save_data()as a factory constructor on File entity class:File::createData($data, []);file_copy()as a factory constructor on File entity class:File::createCopy($file, []);file_move()as an instance method on File class. Perhaps, give it a better name, likerename().file_copy()andfile_move()withFILE_EXISTS_REPLACEbecause this operation is invalid for managed files, leads to data losses and inconsistencies.file_delete()andfile_delete_multiple()for eventual removal in D9.file_save_upload()into afile_uploadservice.FileSystem::moveUploadedFile()methodfile.incMaking all these changes essentially allows to move from procedural code like this
to more entity-oriented code
Comment #14
jonathanshawWhat do you think of starting to use Flysystem (see #4) if the file code is being rearchitected? It's now used by Laravel, and has a thorough Drupal contrib implementation.
Comment #15
phenaproximaI'm generally in favor of ditching Drupal's unwieldy filesystem-related code in favor of a library. Working with files is often very painful in Drupal, whereas external libraries (in my experience) make it easy and pleasant. So from a DX perspective, +1. But from a practical, "how do we get this into core" perspective, it's a colossal pain in the butt. Just getting a new library into core is a long and contentious process by itself, and even then we'd have to rewrite all filesystem code on top of it to ensure perfect backwards compatibility, which might not always be possible. In other words, switching to Flysystem (or similar) looks more like Drupal 9 stuff to me. I think what is in reach is to refactor/reorganize the D8 file APIs first, which would potentially open the door to rebuilding it all on top of a library in D8.
Comment #16
20th commentedFlysystem is a good library but I agree that its inclusion in a minor 8.x release might introduce a lot of compatibility issues.
Right now Drupal filesystem API is not much more than thin wrapper around PHP functions that mostly normalizes paths and permissions, and calls methods on stream wrappers where necessary. And it is far from being complete. For example, there is no Drupal implementation of
file_exists()function, the size of a file is checked withfilesize()in several places in core and in many contrib modules, file modification time withfilemtime(), etc.Flysystem is a bit more complex. Out of its three core concepts from Flysystem documentation page, Drupal explicitly implements only one – Adapters – which corresponds to stream wrappers. Not sure how much Drupal depends on absolute paths, but it manipulates directories directly a lot. Flysystem also abstracts file permissions, directory traversal and handling of uploaded files by introducing streams API.
Drupal core can be updated to use all these new abstraction. Of course, Flysystem API must be used in its entirety, so all
filesize()calls must be replaced with API calls toFilesystem::getSize()which can be done in core fairly easily. But what about contrib modules? I have surveyed half a thousand most popular modules with D8 versions and found at least 30 that use PHP filesystem functions directly.On the other hand, Flysystem is already available to Drupal with a contrib module! Right now it only provides a bridge for stream wrappers but it can used to test the whole library. If all goes well, it can be included in D9.
Once we implement proper filesystem services in core,
flysystemmodule can simply override all services configuration using a standard technique. To be able to do that, Drupal and Flysystem must be somewhat compatible on API level. In other words, Drupal filesystem need to be modelled on Flysystem which I don't think is a bad idea.Aside from stream methods and addPlugin() method, Flysystem FilesystemInterface uses very generic method names that Drupal might as well use too.
Comment #17
jonathanshawI'm not sure this is true, though I'm out of my depth here.
What I do know is that I can use the fieldfield_paths and filefield_sources modules unpatched (which contain PHP direct filesystem calls like file_exists) with remote storage setup using the Flysystem module. So maybe it's not true that "all filesize() calls must be replaced with API calls to Filesystem::getSize() ".
A bigger question:
is tinkering with the existing unmanaged file API worth it? Or would it be simpler to create a new OO API (possibly with Flysystem) and then deprecate the existing one, removing it at D9? I'm not 100% sure which you're proposing.
Either way, I'm excited to see your thinking.
Comment #18
20th commented@jonathanshaw
My point is, if
Filesystemclass has agetSize()method, it is probably better to use it than not. If we look at the source code, we'll see that it calls$path = Util::normalizePath($path);. I have no idea what path normalization really does but it is probably important.But if you say that PHP
filesize()function still works correctly, it will make library testing even easier which is good!Comment #19
20th commentedYeah, my comments are quite long and not easy to follow, I guess.
I believe that it is exactly what #2244513: Move the unmanaged file APIs to the file_system service (file.inc) tries to do – create object-oriented interface for filesystem operations but keep it backwards compatible by reusing existing code.
Flysystem has been suggested several times as a possible replacement for filesystem API. But the general consensus is that it will be difficult to make it a core library and the benefits are not obvious.
That is why it will be better to test it completely in contrib, in order to discover potential problems and its advantages. If it matches all Drupal needs, the flysystem module can be merged into D9 and old unmanaged filesystem API completely removed.
Comment #21
20th commentedComment #25
volegerWe have a plan to release Drupal 9 in 2020, so I guess this requires IS update for this issue.
Comment #27
kim.pepper#2620304: htaccess functions should be a service just landed, so only #2669074: Convert file_create_url() & file_url_transform_relative() to service, deprecate it remains from the original list.
Comment #35
solideogloria commentedFrom SteamWrapperInterface.php:
This is something that can be updated, as there's no need to support PHP 5.2 anymore.