Hey, great module! Just wondering if it would be better to scan all files instead of depending on the node form. For example, user profile pics or Media 1.x fields.

This patch uses hook_file_validate instead of the delegated validation handlers.

Comments

rooby’s picture

I agree with using hook_file_validate().

Is there a specific reason why it was not done that way initially?

rooby’s picture

Status: Active » Needs review

I will review this later.

sambonner’s picture

Status: Needs review » Reviewed & tested by the community

Have reviewed and tested this, works very nicely on standard image and file fields, as well as file browsers like IMCE. Suspect it was not implemented this way earlier as hook_file_validate was introduced in drupal 7 and the drupal 7 version of clamav appears to largely be a like-for-like conversion.

Recommend this patch be committed as it covers more use cases (see https://drupal.org/node/1814812) than the existing functionality and does so in a more logical way making use of new hooks available in drupal 7.

rooby’s picture

Yeah, it definitely cleans things up and should close a few other related issues too.

rooby’s picture

I have tested this too and it works great.

I do have a slightly unusual request though.
I think it needs a hook in there so that other modules can tell clamav to skip the scanning of this file as it is currently all or nothing.

It is a pretty big edge case but I have a site where one area is specifically to upload malware files.

There is already additional handling of these files prior to validation that means I can tell these files by their file name, which means the data available in hook_file_validate() is sufficient.

I don't think it hurts to have the hook there to avoid having to hack the module.

rooby’s picture

Issue summary: View changes
Status: Reviewed & tested by the community » Needs review
StatusFileSize
new3.13 KB

Here is a patch like the previous one that adds the new hook_clamav_file_is_scannable().

If any module returns FALSE from hook_clamav_file_is_scannable() then scanning of that file will be skipped and it will pass clamav validation.

Does anyone oppose this?

Note: feel free to suggest alternative hook names.

rooby’s picture

StatusFileSize
new3.2 KB

Here is a new version of the patch that just makes the message that appears when clamav is not available a bit more useful.

Also, I no longer require that hook that I added in the previous patch, so if anyone really opposes that hook I'm definitely not going to argue the point.

jastraat’s picture

Status: Needs review » Reviewed & tested by the community

The patch in #7 worked great for scanning files in IMCE (as well as regular file fields). I tested with a file following these directions: http://www.eicar.org/86-0-Intended-use.html

pkil’s picture

As with #8 tested with IMCE and working well

adammalone’s picture

StatusFileSize
new1.45 KB

And here's an interdiff between patch 0 and patch 7. Is there a reason not to use module_invoke_all for clamav_file_is_scannable? If at all possible, an addition to the README or a clamav.api.php would be nice for the newly created hook too.

rooby’s picture

I can't say I remember all the details of the creation of that patch in #7 except that I no longer have a need for that hook.

The only difference I see with this vs module_invoke_all() is that this will return as soon as any module returns false, since there is no need to continue, whereas module_invoke_all() would call all hooks even if a module already returned false.

Also, module_invoke_all() returns an array of results so the check for false values is made slightly more complex.

A clamav.api.php entry is definitely a good idea.

adammalone’s picture

Assigned: Unassigned » adammalone

Ok, good idea to keep the module_implements/module_invoke pairing then. I'll create a clamav.api.php and add to your patch.
Assigning to me to repatch.

adammalone’s picture

Assigned: adammalone » Unassigned
Status: Reviewed & tested by the community » Needs review
StatusFileSize
new5 KB
new1.8 KB

Ok here we go. Made a change to admin.inc to remove unused configuration and added in clamav.api.php.

pkil’s picture

#13 works for me.

pkil’s picture

Status: Needs review » Reviewed & tested by the community
rooby’s picture

Status: Reviewed & tested by the community » Needs work

With the removal of the configuration settings we should also do an update hook to delete the related variables.

adammalone’s picture

Status: Needs work » Needs review
StatusFileSize
new398 bytes
new5.39 KB

Oh good point. Repatched.

I'm going to create a follow-up issue to remove the other variables as part of a hook_uninstall()

rooby’s picture

Status: Needs review » Reviewed & tested by the community

RTBC based on previous other user's RTBC + interdiff.

mcrittenden’s picture

+1, #17 works great for me.

jordanmagnuson’s picture

Seeing as the last commit for this module was over two years ago, it looks like the project's badly in need of a new co-maintainer if we want to see any of these recent patches committed.

adammalone’s picture

Priority: Normal » Major

I've reached out to a couple of the maintainers but haven't really heard anything back. I would be happy to maintain this in a minimal manner in order to get old patches reviewed and in. If we don't hear anything in a couple of weeks, let's take it to the drupal.org webmaster queue.

adammalone’s picture

Assigned: Unassigned » mcdruid

Assigning to mcdruid for a review and merge.

mcdruid’s picture

Status: Reviewed & tested by the community » Needs work

I think this looks great, and I very nearly committed it.

However, one problem we might want to think about a little more is how people can exclude certain files / form elements from being scanned.

This patch adds a hook:

/**
 * Act on an uploaded file being scanned by ClamAV.
 *
 * Modules implementing this hook can prevent a file being scanned by ClamAV
 * during the file upload process.
 *
 * @param $file
 *   A file entity.
 *
 * @see hook_file_validate()
 */
function hook_clamav_file_is_scannable($file) {
  // Exempt files with the text/plain mimetype from being virus checked.
  if ($file->filemime == 'text/plain') {
    return FALSE;
  }
}

...which I think will satisfy some use cases, but I think perhaps there are a few which it won't AFAICS. For example:

  • exclude files attached to nodes of type foo
  • exclude all files attached to field bar

Unless I'm missing something, because hook_file_validate just passes the $file object around, there's not much context available about where this file came from and what's happening to it.

To illustrate this, I just did some testing with webform, and the file object passed to the hook looks like this:

stdClass Object
(
    [uid] => 0
    [status] => 0
    [filename] => eicar_test.jpg
    [uri] => /tmp/phpFzJHFW
    [filemime] => image/jpeg
    [filesize] => 69
    [source] => submitted_test_upload
    [destination] => public://webform/eicar_test.jpg
)

So there's the source property, which I believe is the $form_field_name as passed to file_save_upload, but I'm wondering if there's anything else we can do to make this better.

I had a look at whether #upload_validators would be any use... but I'm not sure that's going anywhere; file_validate calls callbacks from #upload_validators before invoking hook_file_validate, but as $file is not passed around as reference, I don't think it's possible to add additional metadata there.

Can anyone think of a decent way of providing more info to the hook which decides whether to exclude a file from scanning?

Other than that, I think this is ready to commit.

rooby’s picture

At the time of upload I don't think it's really possible to get much contextual information.

file_save_upload() happens before the file hits the database and it doesn't pass any contextual information to validators.

The only thing I can think of would be to go looking through post values and there might be something useful but it may not be reliable.

mcdruid’s picture

Status: Needs work » Fixed

At the time of upload I don't think it's really possible to get much contextual information.

I think that seems about right; if there are use cases for excluding files from scanning where the metadata available on the newly uploaded file object is insufficient (and the source property should usually be somewhat useful), developers will have to come up with some creative things to do in the hook_clamav_file_is_scannable in order to decide what to do.

Committed to the 7.x-1.x branch - thanks to all contributors.

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.

jamesrward’s picture

Not re-opening but this seems to have broken oembed support. I started an issue with oembed at https://www.drupal.org/node/2690329 but thought folks here should be aware.

rooby’s picture

Cross-posting here because it is important information:

When I originally wrote the hook hook_clamav_file_is_scannable() into this patch it was for a very specific purpose, which was a file field that was specifically for uploading viruses (into a very controlled environment).

Other than purposely wanting viruses uploaded into your site I cannot right now think of a good reason for using that hook.

Modules should not be flippantly implementing that hook or they could be (more likely than not) implementing a security hole.

Please think very carefully about your purpose if you are thinking of implementing that hook.

I have created a follow up issue relating to either documenting or removing this hook at #2690365: Add warning to the api docs for hook_clamav_file_is_scannable().