Problem/Motivation
Each image that loads on our site comes from the private:// scheme, stored on AWS S3 (we use s3fs).
Although all metadata come from the local database and/or cache (only the actual image content is downloaded from the remote S3 service), all images take several seconds to load (5s to 15s, sometimes more) and cause the DB and PHP processes to consume a lot of CPU.
Loading images should be quicker.
Steps to reproduce
Have a Drupal 10 site loading all its image assets from a non-standard filesystem, like S3.
Also, and this is mainly what makes the issue worse in our case: have all images in a single directory containing ~20000 pictures.
Proposed resolution
It turns out this is caused by the "hook_file_download" implementation in IMCE: every time a file gets downloaded, it checks whether the file is accessible, but the whole directory containing the file gets scanned in the process too.
In our particular case, that causes roughly 20K database queries each time a picture gets downloaded.
Here is the bit of code that causes that (\Drupal\imce\ImceFolder::checkItem):
  public function checkItem($name) {
    if (!$item = $this->getItem($name)) {
      if (!$this->scanned) {
        $this->scan();
        $item = $this->getItem($name);
      }
    }
    return $item;
  }
I'm not sure I understand why the whole directory needs to be scanned in order to check a single item.
Could we not have something closer to the following?
  public function checkItem($name) {
    if (!$item = $this->getItem($name)) {
      $this->loadItem($name);
      $item = $this->getItem($name);
    }
    return $item;
  }
But maybe I'm missing something. Besides, applying such a change is not as trivial as it looks, because "$this->scan()" currently relies on "ImceFM::scanDir()", which itself relies on "$this->getConf('scanner', 'Drupal\imce\Imce::scanDir');" to populate entries.
So "$this->loadItem()" would ideally be backed by a new "ImceFM::loadItem()" method, I guess, with a similar conf key like the one above, or something along those lines...
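To make the difference concrete, here is a rough standalone sketch. It is not IMCE code: the "FolderSketch" class, its "checkItemWithScan()" method and the private "loadItem()" helper are hypothetical names used only for illustration, and it reads a plain local directory rather than s3fs. It simply counts how many per-file lookups each strategy performs, assuming each lookup stands in for whatever work (e.g. a metadata query) the real scanner does per entry.

```php
<?php

/**
 * Standalone illustration only. FolderSketch is a hypothetical stand-in,
 * not \Drupal\imce\ImceFolder.
 */
class FolderSketch {

  /** Items already known, keyed by file name. */
  private array $items = [];

  /** Whether the whole directory has already been scanned. */
  private bool $scanned = FALSE;

  /** Number of per-file lookups performed so far. */
  public int $lookups = 0;

  public function __construct(private string $path) {}

  /** Current behaviour: a miss triggers a scan of the whole directory. */
  public function checkItemWithScan(string $name): ?array {
    if (!isset($this->items[$name]) && !$this->scanned) {
      foreach (scandir($this->path) ?: [] as $entry) {
        if ($entry !== '.' && $entry !== '..') {
          $this->loadItem($entry);
        }
      }
      $this->scanned = TRUE;
    }
    return $this->items[$name] ?? NULL;
  }

  /** Proposed behaviour: a miss only looks at the requested file. */
  public function checkItem(string $name): ?array {
    if (!isset($this->items[$name])) {
      $this->loadItem($name);
    }
    return $this->items[$name] ?? NULL;
  }

  /** Hypothetical helper: inspect one entry and register it if it is a file. */
  private function loadItem(string $name): void {
    $this->lookups++;
    $uri = $this->path . '/' . $name;
    if (is_file($uri)) {
      $this->items[$name] = ['name' => $name, 'size' => filesize($uri)];
    }
  }

}
```

With a directory of N files, "checkItemWithScan()" performs N lookups on the first miss, while "checkItem()" performs one per requested name. The trade-off is that the single-item variant never marks the folder as scanned, so code that needs the full listing would still have to call a real scan.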
What are your thoughts?
User interface changes
None.
API changes
None.
Data model changes
None I think, but I'm not sure :/
Comments
Comment #2
pacproduct commented
Comment #4
ufku commented
Committed a patch for checking file access without a folder scan.
Please let us know if it helps with the performance.
Comment #5
pacproduct commented
Thank you @ufku for your fast reply. Your patch does 100% solve the issue we were having.
We're still facing performance issues with the browser itself, but that is far less critical, and it makes sense for it to scan the entire directory in that case. I'll investigate, and if I need more help (or have a patch to suggest) I'll open a separate issue.
Thanks again! :)
Cheers.