Problem/Motivation

Every image on our site is served through the private:// scheme and stored on AWS S3 (we use s3fs).

Although all metadata come from the local database and/or cache (only the actual image content is downloaded from the remote S3 service), all images take several seconds to load (5s to 15s, sometimes more) and cause the DB and PHP processes to consume a lot of CPU.

Loading images should be quicker.

Steps to reproduce

Have a Drupal 10 site loading all its image assets from a non-standard filesystem, like S3.
Also, and this is mainly what makes the issue so bad in our case, keep all images in a single directory containing ~20,000 pictures.

Proposed resolution

It turns out this is caused by IMCE's implementation of hook_file_download: every time a file is downloaded, it checks whether the file is accessible, but it also scans the whole directory containing the file in the process.

In our particular case, that causes roughly 20K database queries each time a picture is downloaded.

Here is the bit of code that causes that (\Drupal\imce\ImceFolder::checkItem):

  public function checkItem($name) {
    if (!$item = $this->getItem($name)) {
      if (!$this->scanned) {
        $this->scan();
        $item = $this->getItem($name);
      }
    }
    return $item;
  }

I'm not sure I understand why the whole directory needs to be scanned in order to check a single item.

Could we not have something closer to what follows:

  public function checkItem($name) {
    if (!$item = $this->getItem($name)) {
      $this->loadItem($name);
      $item = $this->getItem($name);
    }
    return $item;
  }

But maybe I'm missing something. Besides, applying such a change is not as trivial as one may think because currently "$this->scan()" relies on "ImceFM::scanDir()" which itself relies on "$this->getConf('scanner', 'Drupal\imce\Imce::scanDir');" for populating entries.

So "$this->loadItem()" would ideally delegate to a new method "ImceFM::loadItem()", I guess, driven by a conf key similar to the one above, or something along those lines...
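
To make the difference concrete, here is a minimal standalone sketch that compares the two lookup strategies. It is a toy model, not IMCE code: "ToyFolder", "checkItemWithScan()", "checkItemSingle()" and the optional "$loader" closure are all hypothetical names, and each simulated "lookup" just stands in for whatever per-file work the real scanner does (in our case, roughly one query per file).

```php
<?php

// Toy model only: nothing here is part of the IMCE API. It mimics the two
// lookup strategies discussed above and counts per-file lookups.
final class ToyFolder {
  /** @var array<string, array> Items already loaded into this folder. */
  private array $items = [];
  private bool $scanned = FALSE;
  /** @var array<string, int> File names present in the directory, as a set. */
  private array $names;
  /** Hypothetical single-item loader, analogous to the 'scanner' conf key. */
  private ?Closure $loader;
  /** Number of simulated per-file lookups performed so far. */
  public int $lookups = 0;

  public function __construct(array $names, ?Closure $loader = NULL) {
    $this->names = array_flip($names);
    $this->loader = $loader;
  }

  /** Simulates the per-file cost (e.g. one metadata query). */
  private function stat(string $name): ?array {
    $this->lookups++;
    return isset($this->names[$name]) ? ['name' => $name] : NULL;
  }

  /** Behaviour described in the summary: a miss triggers a full scan. */
  public function checkItemWithScan(string $name): ?array {
    if (!isset($this->items[$name]) && !$this->scanned) {
      foreach (array_keys($this->names) as $n) {
        $this->items[$n] = $this->stat((string) $n);
      }
      $this->scanned = TRUE;
    }
    return $this->items[$name] ?? NULL;
  }

  /** Proposed behaviour: a miss loads only the requested item. */
  public function checkItemSingle(string $name): ?array {
    if (!isset($this->items[$name])) {
      $loader = $this->loader ?? fn(string $n): ?array => $this->stat($n);
      $item = $loader($name);
      if ($item !== NULL) {
        $this->items[$name] = $item;
      }
    }
    return $this->items[$name] ?? NULL;
  }
}

// A directory of 20,000 files, as in our case.
$names = array_map(fn(int $i): string => "img_$i.jpg", range(1, 20000));

$scanning = new ToyFolder($names);
$scanning->checkItemWithScan('img_42.jpg');

$single = new ToyFolder($names);
$single->checkItemSingle('img_42.jpg');

printf("scan: %d lookups, single: %d lookups\n", $scanning->lookups, $single->lookups);
```

In this model the scanning variant performs 20,000 lookups to answer one access check, while the single-item variant performs one. Whether the real scanner can be bypassed this cleanly depends on how the configured scanner populates entries, which is exactly the part I'm unsure about.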

What are your thoughts?

User interface changes

None.

API changes

None.

Data model changes

None I think, but I'm not sure :/

Comments

pacproduct created an issue.

  • ufku committed b9d7200f on 3.x
    Issue #3443768: Check file file path access without folder scan
    
ufku commented:

Version: 3.0.9 » 3.x-dev
Status: Active » Fixed

Committed a patch for checking file access without a folder scan.

Please let us know if it helps with the performance.

pacproduct commented:

Thank you @ufku for your fast reply. Your patch does 100% solve the issue we were having.

We're still facing performance issues with the file browser itself, but that is far less critical, and it makes sense for it to scan the entire directory in that case. I'll investigate, and if I need more help (or have a patch to suggest) I'll open a separate issue.

Thanks again! :)
Cheers.

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.