Objective

Status: Incomplete — This is just a first proposal.

  1. Since PHP 5.4, stream wrappers have matured significantly; our entire Drupal-specific StreamWrapperInterface is most likely obsolete.
    (cf. #2107287: PHP 5.4 calls a new stream_metadata() method on stream wrappers not implemented by Drupal)

  2. Most of the functions used for file operations are split between file.inc and file.module. They can be grouped by purpose and consolidated as services inside of Drupal\Core\File namespace:

    1. File API functions consuming Settings to ensure that created files are readable/writable:

      drupal_chmod()
      drupal_unlink()
      drupal_mkdir()
      drupal_rmdir()
      

      Drupal\Core\File\FileHandler
      — or similar; the Drupal-specific File API helper functions from #2050759: Move drupal_chmod and other code in file.inc to a Drupal\Core\File\FileSystem class

    2. OS/environment helpers:
      drupal_dirname()
      drupal_basename()
      drupal_tempnam()
      drupal_move_uploaded_file()
      file_upload_max_size()
      file_directory_os_temp()
      file_create_filename()
      
    3. Unmanaged file API:
      file_unmanaged_copy()
      file_unmanaged_prepare()
      file_unmanaged_move()
      file_unmanaged_delete()
      file_unmanaged_delete_recursive()
      file_unmanaged_save_data()
      

      They are transformed by #2244513: Move the unmanaged file APIs to the file_system service (file.inc) into this interface:

      namespace Drupal\Core\File\Handler;
      
      interface UnmanagedFileHandlerInterface extends FileHandlerInterface {
      
        public function copy($source, $destination = NULL, $replace = self::FILE_EXISTS_RENAME);
      
        public function delete($path);
      
        public function deleteRecursive($path, callable $callback = NULL);
      
        public function move($source, $destination = NULL, $replace = self::FILE_EXISTS_RENAME);
      
        public function prepare($source, &$destination = NULL, $replace = self::FILE_EXISTS_RENAME) {
      
        public function saveData($data, $destination = NULL, $replace = self::FILE_EXISTS_RENAME);
      
      }
      
    4. htaccess helpers
      file_ensure_htaccess()
      file_htaccess_lines()
      file_save_htaccess()
      

      Transformed by #2620304: htaccess functions should be a service into

      namespace Drupal\Core\File;
      
      class Htaccess {
      
        public function ensure() {
        }
      
        public function save($directory, $private = TRUE, $force_overwrite = FALSE) {
        }
      
      }
      
    5. URL generation
      file_create_url()
      file_url_transform_relative()
      

      Transformed by #2669074: Convert file_create_url() & file_url_transform_relative() to service, deprecate it into

      namespace Drupal\Core\File;
      
      class FileUrlGenerator {
      
        public function generate($uri, $relative = FALSE) {
        }
      
        public function transformRelative($file_url) {
        }
      
      }
      
    6. Generic helpers:
      Handled by #3035312: Move file_scan_directory() to file_system service
      file_scan_directory()
      file_get_mimetype()
      
    7. Other…
  3. For stream wrappers, we should arrive at this for D8:

    Drupal\Component\StreamWrapper\StreamWrapperInterface  [ex. PhpStreamWrapperInterface]
    Drupal\Component\StreamWrapper\StreamWrapperManager
    Drupal\Component\StreamWrapper\LocalStreamBase         [sans getLabel()/getDescription()]
    ...
    Drupal\Core\StreamWrapper\StreamWrapperInterface       [extends Component\StreamWrapperInterface, adding getLabel()/getDescription()]
    Drupal\Core\StreamWrapper\PublicStream
    Drupal\Core\StreamWrapper\TemporaryStream
    Drupal\Core\StreamWrapper\PrivateStream
    

    → Our StreamWrapper mechanism turns into a re-usable component.

  4. settings.php file and (local) stream wrapper settings should turn into this:

    $settings['file_chmod_directory'] = ...;
    $settings['file_chmod_file'] = ...;
    $settings['stream']['public'] = ...;
    $settings['stream']['private'] = ...;
    $settings['stream']['temporary'] = ...;
    

Comments

sun’s picture

Issue summary: View changes
sun’s picture

Issue summary: View changes
sun’s picture

Issue summary: View changes
larowlan’s picture

Worth considering https://github.com/thephpleague/flysystem for the low-level filesystem interactions?
It has a Null implementation so we can mock for unit testing.
Plus pluggable backends would be useful for things like Media module.
Same kind of notion as storage api module, having seen that in work with a Riak backend - its pretty damn cool.

slashrsm’s picture

Issue tags: +Media Initiative
slashrsm’s picture

While I not automatically oppose 3rd party libs I really think we should do it in a thoughtful way. How mature is that library? What happens when there is a security update? Would they be able so sync with our releases in such a case? Flysystem looks like a nice tool, but it shouldn't be just about that. I am not familiar with it so I can't really judge. I just wanted to point out this important questions.

We should really modernize file API (with or without introduction of flysystem). We probably need to do this before beta?

almaudoh’s picture

thijsvdanker’s picture

Since #6 flysystem has taken of seriously; it has a 1.0 release, is used as the filesystem in Laravel 5 and is used/downloaded a lot (77k downloads in the last month).

Version: 8.0.x-dev » 8.1.x-dev

Drupal 8.0.6 was released on April 6 and is the final bugfix release for the Drupal 8.0.x series. Drupal 8.0.x will not receive any further development aside from security fixes. Drupal 8.1.0-rc1 is now available and sites should prepare to update to 8.1.0.

Bug reports should be targeted against the 8.1.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.2.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

jibran’s picture

Version: 8.1.x-dev » 8.2.x-dev

Version: 8.2.x-dev » 8.3.x-dev

Drupal 8.2.0-beta1 was released on August 3, 2016, which means new developments and disruptive changes should now be targeted against the 8.3.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

20th’s picture

I have been messing with File API for a few weeks now, tested a lot of patches found in different issues. This API has a long and complex history and many design problems, some of which, I believe, I was able to identify:

(1)

Unmanaged and managed file APIs in Drupal are traditionally seen as complementary but they have long diverged:

Unmanaged file API consists of these functions:

file_unmanaged_copy()
file_unmanaged_move()
file_unmanaged_delete()
file_unmanaged_delete_recursive()
file_unmanaged_save_data()

While managed file API has these:

file_copy()
file_move()
file_save_data()
file_save_upload()

Delete is not even part of managed API, see below.

(2)

*_copy() and *_move() operations in these two APIs, for the last 6 years, have been maintained to have very similar function signatures. However, there is no real reason for that because they are completely incompatible and operate on different data types.

For example, the $source argument of file_unamanged_copy() function is a string and the function returns a string as well. file_copy(), on the other hand, received and returns a whole File entity. If these two functions are not compatible and are not interchangeable, there is absolutely no reason to keep this tradition of visual similarity between them.

(3)

Delete operations are not even part of managed file API, but part of Entity API.

file_copy() and file_move() are defined in file.module; they receive an instance of FileInterface as their first argument.

file_delete() is defined in core/includes/file.inc and receives an integer $fid, and is a simple wrapper around entity_delete_multiple('file', $fids);

file_delete() is not even used in core; and should not be used. There is no node_delete(), nor field_delete(), nor taxonomy_term_delete(); there should be no file_delete(). Managed file is an entity and the correct way to delete entity is with $file->delete();

(4)

file_move() is cryptic, it tries to act like a wrapper around file_unmanaged_move() and combine it with entity API; and looks completely broken. What it does:

- Moves file on disk.
- For some reason, clones file entity object (not using the correct createDuplicate() method).
- If destination already exists, does some voodoo magic by manually setting entity id and uuid. I guess, this hack does overwrite existing file entity but still leaves the old data in cache.
- Does not clear usages of moved file.
- Does not delete source file entity, keeps it in DB with a broken URI that points to a file on disks that does not exist.

How move should be implemented, IMHO:

if ($uri = file_unmanaged_move($source->getFileUri(), $destination)) {
  $source->setFileUri($uri);
  $source->save();
}

Overwriting another managed file is an invalid operation and should not be permitted at all. (Here is even a bolder statement, moving a managed file should not be allowed at all, the same way it is not allowed to move a node from one nid to another.)

(5)

The actual deletion of file on disk (as in unlink()) is handled inside of Drupal\file\Entity\File class. File creation, however, is not. It is spread across all 4 functions: file_copy(), file_move(), file_save_data(), file_save_upload() with significant duplication of code.

There is no single authority that would manage both files on disk and records in the DB. This is a design flaw and this spawns bugs like this one #1556396: Refactor file.inc to make it simpler and prevent PDO exceptions.

It has become so unclear how to actually manage the "managed file API" that file.module has this code:

/**
 * Implements hook_ENTITY_TYPE_predelete() for file entities.
 */
function file_file_predelete(File $file) {
  // @todo Remove references to a file that is in-use.
}

This TODO is implemented in Drupal\file\Entity\File::preDelete() method but even the file module itself does not know about that.

Futhermore, (6)

core/includes/file.inc contains a lot of other "helpers" that are used here and there, but are not part of any "official" API in Drupal. Some of them found their way into file_system service but the contents of this service seems to be completely arbitrary.

For example, file_stream_wrapper_valid_scheme() was moved into FileSystem::validScheme() method, but file_valid_uri() was not. Their role seems quite similar to me, they should be treated similarly.

On the other hand, drupal_move_uploaded_file() was also moved into FileSystem class, despite obviously being part of Managed File API (because it is used only there). On a side note, this function does not even have a reason to exist anymore as it was added as a fix for some issue with 'safe_mode' which has been gone since 5.4.

20th’s picture

The only purpose of 'Managed file API' is to track what files Drupal has written to disk. Data in the database should directly correspond to the stats of those tracked files. To be able to do that, any modification of a File entity must be instantly reflected in filesystem and vice versa:

$file->delete() must directly correspond to deletion of file on disk, $file->rename() must change location, URI and name of file on disk and in DB, etc.

Given all of the above and considering the number of abandoned and semi-abandoned patches in the issue tracker that have tried to unbreak file API (take a look at the number of child issues in the sidebar), I propose the following relatively large API changes:

  • Completely get rid of managed file API in its current form.
  • Reimplement file_save_data() as a factory constructor on File entity class: File::createData($data, []);
  • Reimplement file_copy() as a factory constructor on File entity class: File::createCopy($file, []);
  • Reimplement file_move() as an instance method on File class. Perhaps, give it a better name, like rename().
  • Do not reimplement behavior of file_copy() and file_move() with FILE_EXISTS_REPLACE because this operation is invalid for managed files, leads to data losses and inconsistencies.
  • Deprecate file_delete() and file_delete_multiple() for eventual removal in D9.
  • Refactor unmanaged file API into a service - #2244513: Move the unmanaged file APIs to the file_system service (file.inc).
  • Refactor file_save_upload() into a file_upload service.
  • Depricate FileSystem::moveUploadedFile() method
  • Create new URI and scheme parsing/formatting service for many function in file.inc

Making all these changes essentially allows to move from procedural code like this

$file = file_save_data($data, 'public://test.txt');
file_move($file, 'public://rename-test.txt');

$copy = file_copy($file, 'public://copy-test');

file_delete($file->id());

to more entity-oriented code

$file = File::createData($data, [
  'uri' => 'public://test.txt',
]);
$file->move('public://rename-test.txt');

$copy = File::createCopy($file, [
  'uri' => 'public://copy-test.txt',
]);

$file->delete();
jonathanshaw’s picture

What do you think of starting to use Flysystem (see #4) if the file code is being rearchitected? It's now used by Laravel, and has a thorough Drupal contrib implementation.

phenaproxima’s picture

What do you think of starting to use Flysystem (see #4) if the file code is being rearchitected?

I'm generally in favor of ditching Drupal's unwieldy filesystem-related code in favor of a library. Working with files is often very painful in Drupal, whereas external libraries (in my experience) make it easy and pleasant. So from a DX perspective, +1. But from a practical, "how do we get this into core" perspective, it's a colossal pain in the butt. Just getting a new library into core is a long and contentious process by itself, and even then we'd have to rewrite all filesystem code on top of it to ensure perfect backwards compatibility, which might not always be possible. In other words, switching to Flysystem (or similar) looks more like Drupal 9 stuff to me. I think what is in reach is to refactor/reorganize the D8 file APIs first, which would potentially open the door to rebuilding it all on top of a library in D8.

20th’s picture

Flysystem is a good library but I agree that its inclusion in a minor 8.x release might introduce a lot of compatibility issues.

Right now Drupal filesystem API is not much more than thin wrapper around PHP functions that mostly normalizes paths and permissions, and calls methods on stream wrappers where necessary. And it is far from being complete. For example, there is no Drupal implementation of file_exists() function, the size of a file is checked with filesize() in several places in core and in many contrib modules, file modification time with filemtime(), etc.

Flysystem is a bit more complex. Out of its three core concepts from Flysystem documentation page, Drupal explicitly implements only one – Adapters – which corresponds to stream wrappers. Not sure how much Drupal depends on absolute paths, but it manipulates directories directly a lot. Flysystem also abstracts file permissions, directory traversal and handling of uploaded files by introducing streams API.

Drupal core can be updated to use all these new abstraction. Of course, Flysystem API must be used in its entirety, so all filesize() calls must be replaced with API calls to Filesystem::getSize() which can be done in core fairly easily. But what about contrib modules? I have surveyed half a thousand most popular modules with D8 versions and found at least 30 that use PHP filesystem functions directly.

On the other hand, Flysystem is already available to Drupal with a contrib module! Right now it only provides a bridge for stream wrappers but it can used to test the whole library. If all goes well, it can be included in D9.

Once we implement proper filesystem services in core, flysystem module can simply override all services configuration using a standard technique. To be able to do that, Drupal and Flysystem must be somewhat compatible on API level. In other words, Drupal filesystem need to be modelled on Flysystem which I don't think is a bad idea.

Aside from stream methods and addPlugin() method, Flysystem FilesystemInterface uses very generic method names that Drupal might as well use too.

jonathanshaw’s picture

Of course, Flysystem API must be used in its entirety, so all filesize() calls must be replaced with API calls to Filesystem::getSize() which can be done in core fairly easily. But what about contrib modules? I have surveyed half a thousand most popular modules with D8 versions and found at least 30 that use PHP filesystem functions directly.

I'm not sure this is true, though I'm out of my depth here.

What I do know is that I can use the fieldfield_paths and filefield_sources modules unpatched (which contain PHP direct filesystem calls like file_exists) with remote storage setup using the Flysystem module. So maybe it's not true that "all filesize() calls must be replaced with API calls to Filesystem::getSize() ".

A bigger question:
is tinkering with the existing unmanaged file API worth it? Or would it be simpler to create a new OO API (possibly with Flysystem) and then deprecate the existing one, removing it at D9? I'm not 100% sure which you're proposing.

Either way, I'm excited to see your thinking.

20th’s picture

@jonathanshaw

My point is, if Filesystem class has a getSize() method, it is probably better to use it than not. If we look at the source code, we'll see that it calls $path = Util::normalizePath($path);. I have no idea what path normalization really does but it is probably important.

But if you say that PHP filesize() function still works correctly, it will make library testing even easier which is good!

20th’s picture

Or would it be simpler to create a new OO API (possibly with Flysystem) and then deprecate the existing one, removing it at D9? I'm not 100% sure which you're proposing.

Yeah, my comments are quite long and not easy to follow, I guess.

I believe that it is exactly what #2244513: Move the unmanaged file APIs to the file_system service (file.inc) tries to do – create object-oriented interface for filesystem operations but keep it backwards compatible by reusing existing code.

Flysystem has been suggested several times as a possible replacement for filesystem API. But the general consensus is that it will be difficult to make it a core library and the benefits are not obvious.

That is why it will be better to test it completely in contrib, in order to discover potential problems and its advantages. If it matches all Drupal needs, the flysystem module can be merged into D9 and old unmanaged filesystem API completely removed.

Version: 8.3.x-dev » 8.4.x-dev

Drupal 8.3.0-alpha1 will be released the week of January 30, 2017, which means new developments and disruptive changes should now be targeted against the 8.4.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

20th’s picture

Issue summary: View changes

Version: 8.4.x-dev » 8.5.x-dev

Drupal 8.4.0-alpha1 will be released the week of July 31, 2017, which means new developments and disruptive changes should now be targeted against the 8.5.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.5.x-dev » 8.6.x-dev

Drupal 8.5.0-alpha1 will be released the week of January 17, 2018, which means new developments and disruptive changes should now be targeted against the 8.6.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.6.x-dev » 8.7.x-dev

Drupal 8.6.0-alpha1 will be released the week of July 16, 2018, which means new developments and disruptive changes should now be targeted against the 8.7.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

voleger’s picture

We have a plan to release Drupal 9 in 2020, so I guess this requires IS update for this issue.

Version: 8.7.x-dev » 8.8.x-dev

Drupal 8.7.0-alpha1 will be released the week of March 11, 2019, which means new developments and disruptive changes should now be targeted against the 8.8.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

kim.pepper’s picture

Version: 8.8.x-dev » 8.9.x-dev

Drupal 8.8.0-alpha1 will be released the week of October 14th, 2019, which means new developments and disruptive changes should now be targeted against the 8.9.x-dev branch. (Any changes to 8.9.x will also be committed to 9.0.x in preparation for Drupal 9’s release, but some changes like significant feature additions will be deferred to 9.1.x.). For more information see the Drupal 8 and 9 minor version schedule and the Allowed changes during the Drupal 8 and 9 release cycles.

Version: 8.9.x-dev » 9.1.x-dev

Drupal 8.9.0-beta1 was released on March 20, 2020. 8.9.x is the final, long-term support (LTS) minor release of Drupal 8, which means new developments and disruptive changes should now be targeted against the 9.1.x-dev branch. For more information see the Drupal 8 and 9 minor version schedule and the Allowed changes during the Drupal 8 and 9 release cycles.

Version: 9.1.x-dev » 9.2.x-dev

Drupal 9.1.0-alpha1 will be released the week of October 19, 2020, which means new developments and disruptive changes should now be targeted for the 9.2.x-dev branch. For more information see the Drupal 9 minor version schedule and the Allowed changes during the Drupal 9 release cycle.

Version: 9.2.x-dev » 9.3.x-dev

Drupal 9.2.0-alpha1 will be released the week of May 3, 2021, which means new developments and disruptive changes should now be targeted for the 9.3.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Version: 9.3.x-dev » 9.4.x-dev

Drupal 9.3.0-rc1 was released on November 26, 2021, which means new developments and disruptive changes should now be targeted for the 9.4.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Version: 9.4.x-dev » 9.5.x-dev

Drupal 9.4.0-alpha1 was released on May 6, 2022, which means new developments and disruptive changes should now be targeted for the 9.5.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Version: 9.5.x-dev » 10.1.x-dev

Drupal 9.5.0-beta2 and Drupal 10.0.0-beta2 were released on September 29, 2022, which means new developments and disruptive changes should now be targeted for the 10.1.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

solideogloria’s picture

From SteamWrapperInterface.php:

 * Note that PHP 5.2 fopen() only supports URIs of the form "scheme://target"
 * despite the fact that according to RFC 3986 a URI's scheme component
 * delimiter is in general just ":", not "://".  Because of this PHP limitation
 * and for consistency Drupal will only accept URIs of form "scheme://target".

This is something that can be updated, as there's no need to support PHP 5.2 anymore.

Version: 10.1.x-dev » 11.x-dev

Drupal core is moving towards using a “main” branch. As an interim step, a new 11.x branch has been opened, as Drupal.org infrastructure cannot currently fully support a branch named main. New developments and disruptive changes should now be targeted for the 11.x branch, which currently accepts only minor-version allowed changes. For more information, see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Version: 11.x-dev » main

Drupal core is now using the main branch as the primary development branch. New developments and disruptive changes should now be targeted to the main branch.

Read more in the announcement.