Problem/Motivation

Currently the only supported entry to the processing mechanism is by piping email to a Drush command (or calling the inmail.processor service from within Drupal). In many cases this is not possible or preferrable.

Proposed resolution

Implement a Fetcher concept. Maybe actually as another module?

Remaining tasks

Discuss where the fetcher concept belongs: in a separate module or within Inmail.
Implement.

User interface changes

None, really.

API changes

We'll see.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

miro_dietiker’s picture

The "original" idea was to put the fetcher concept into inmail and basically declare that the drush command / mail server pipe passing identifies similarly as a named fetcher.
Minimalistically, it's possibly just a string property in the incoming message that can be accessed.

For configurability of fetchers (imap / pop3 credentials), one might think about plugins with a Manager and UI... however this is possibly heavy over engineering.

EDIT: see also related issue
#2379909: Make the drush command a Deliverer

miro_dietiker’s picture

Priority: Normal » Major

Setting to major as most shared hostings will not be able to use this module as long as no (cron imap / pop3) fetcher is in place.

This is not critical to us (the initiators), but we are highly open to push this forward to make it broadly accessibly.

Arla’s picture

Version: » 8.x-1.x-dev
Status: Active » Needs review
FileSize
4.54 KB

Proof of concept. Not sure how/if this can be tested. I tested manually with a Gmail account. (For this, I had to enable IMAP for the account, and enable "Access for less secure apps" on myaccount.google.com.)

miro_dietiker’s picture

Yay! Looks awesome.

+++ b/inmail.module
@@ -255,3 +255,12 @@ function inmail_mail($key, &$message, $params) {
+function inmail_cron() {
+  // @todo Use queue to fetch and process.
+  $raws = \Drupal::service('inmail.fetcher')->fetch();
+  \Drupal::service('inmail.processor')->processMultiple($raws);

OK... with drupal core, every single cron run will trigger this fetcher.

For real sites with a high cron interval, we are now dependent to tools like ultimate cron. If we want to be "nice", we would need to offer an interval setting to avoid spamming the imap service too often.

Also, we should only call the service if $raws are not empty.

Arla’s picture

Now fetchers are plugins.

Trying a simpler approach than with handlers/analyzers, by putting config directly in "inmail.fetcher.$id" instead of introducing yet another config entity type. The drawback is that it does not support multiple configurations per existing plugin.

Maybe adapt to Ultimate Cron as a followup?

Status: Needs review » Needs work

The last submitted patch, 5: imap-2379889-5.patch, failed testing.

Arla’s picture

Status: Needs work » Needs review
FileSize
18.52 KB
6.63 KB

Introduced QueueWorkers for the cron job. 120 more code lines... but slightly nicer ;) I think it makes a difference if you have many fetchers with slow connections. Then the load will be balanced over multiple cron runs.

Also fixed test fails, left-over settings and moar @todo's.

miro_dietiker’s picture

Status: Needs review » Needs work
  1. +++ b/inmail.module
    @@ -270,15 +270,12 @@ function inmail_mail($key, &$message, $params) {
    +  foreach ($fetcher_manager->getInstances() as $fetcher) {
    ...
    +    \Drupal::queue('inmail_fetch')->createItem($fetcher);
    

    So with every single cron run inmail creates now an item per fetcher in the inmail_fetch queue. In case the workers are then slow, we still stack up new fetcher items here.
    Also for sake of priority, like everything, fetchers should also be ordered.
    I think the intended scaling does not work as expected here. This needs more black magic.
    I didn't look into cron invocation of the queues by core... Is there some black magic?
    Search API for instance creates a queue per index. Do we need a queue per fetcher? With such a queue per fetcher, we should only add a fetch request if the queue is empty. A fetcher then adds items per item fetched. And the processor then processes them until empty. I think this would scale much better.

  2. +++ b/inmail.module
    @@ -270,15 +270,12 @@ function inmail_mail($key, &$message, $params) {
    +    // @todo Is it iportant to avoid duplicates?
    

    iportant => important.

  3. +++ b/src/Plugin/QueueWorker/Fetch.php
    @@ -0,0 +1,61 @@
    + *   title = @Translation("Fetch email for processing"),
    

    email => emails

  4. +++ b/src/Plugin/QueueWorker/Fetch.php
    @@ -0,0 +1,61 @@
    +    return new static($configuration, $plugin_id, $plugin_definition, $container->get('queue')->get('inmail_process', TRUE));
    

    I would prefer to have the creation of the queue as a variable assignment on the line before.

  5. +++ b/src/Plugin/QueueWorker/Process.php
    @@ -0,0 +1,54 @@
    + *   title = @Translation("Process incoming messages"),
    

    missing "email".

  6. +++ b/src/Plugin/inmail/Fetcher/ImapFetcher.php
    @@ -25,6 +25,8 @@ class ImapFetcher extends FetcherBase {
    +    // @todo Return noisily if misconfigured...
    

    Yeah, in case it fails (such as password error), it should possibly stop retrying. Otherwise our implementation retries forever with a high rate... Unsure how to do this (detect such cases and disable spamming) best.

  7. +++ b/src/Plugin/inmail/Fetcher/ImapFetcher.php
    @@ -32,11 +34,13 @@ class ImapFetcher extends FetcherBase {
    +    // @todo Allow choosing A) only read UNSEEN and mark unread or B) read all and delete.
    

    and with delete you need an expunge option (by default on). Otherwise the messages are only marked for deletion, but deletion is never triggered.

  8. +++ b/src/Plugin/inmail/Fetcher/ImapFetcher.php
    @@ -32,11 +34,13 @@ class ImapFetcher extends FetcherBase {
         $unread_ids = imap_search($imap_res, 'UNSEEN') ?: array();
    

    Document our strategy in the inmail settings. So there is no other party allowed to connect to the imap account and mark stuff as seen, otherwise we don't process it anymore.

  9. +++ b/inmail.module
    @@ -255,3 +265,17 @@ function inmail_mail($key, &$message, $params) {
    +    \Drupal::queue('inmail_fetch')->createItem($fetcher);
    
    +++ b/src/Plugin/QueueWorker/Fetch.php
    @@ -0,0 +1,61 @@
    +        $this->processQueue->createItem($raw);
    

    Is this ordered? So guaranteed we first fetch and then process? Otherwise an item will need always 2 cron runs if accidentally first process and then fetch is called.

  10. +++ b/src/Form/InmailSettingsForm.php
    @@ -40,6 +68,22 @@ class InmailSettingsForm extends ConfigFormBase {
    +    if ($fetchers = $this->fetcherManager->getInstances()) {
    +      $form['fetching'] = array(
    

    So the whole fieldset doesn't even show up if none configured? How to add new one then?

  11. +++ b/src/Plugin/inmail/Fetcher/FetcherInterface.php
    @@ -0,0 +1,36 @@
    +   * @return string[]
    +   *   The fetched messages, in complete raw form.
    

    The messages have a key. So you're not exposing this IMAP message key at all? It is common to add an element of identification as key.

  12. +++ b/src/Plugin/inmail/Fetcher/ImapFetcher.php
    @@ -0,0 +1,117 @@
    +    $unread_ids = imap_search($imap_res, 'UNSEEN') ?: array();
    

    To suggest a C) we could also persist the last checked IMAP message number and only ask for newer (bigger) ones.

  13. +++ b/src/Plugin/inmail/Fetcher/ImapFetcher.php
    @@ -0,0 +1,117 @@
    +  public function buildConfigurationForm(array $form, FormStateInterface $form_state) {
    +    $form['host'] = array(
    

    This clearly needs more explanation.

Finally, i was scrolling 3 times through the whole code to find tests... without success...
oOo... ;-)

Berdir’s picture

+++ b/inmail.module
@@ -255,3 +265,17 @@ function inmail_mail($key, &$message, $params) {
+function inmail_cron() {
+  // Fetch new mail by adding each fetcher to the queue. The fetching queue
+  // worker in turn adds new messages to another queue for processing.
+  /** @var \Drupal\inmail\FetcherManager $fetcher_manager */
+  $fetcher_manager = \Drupal::service('plugin.manager.inmail.fetcher');
+  foreach ($fetcher_manager->getInstances() as $fetcher) {
+    // @todo Is it iportant to avoid duplicates?
+    \Drupal::queue('inmail_fetch')->createItem($fetcher);
+  }
+}

I don't think frequency is a big problem, if you just have core, then you're not going to run cron more than every few hours or so, because system_cron() eats your caches and so on.

Intervals and so on are only necessary if it is something that should only happen once a day or so (e.g. dropping daily statistics).

Core queue API has no way to have unique entries, it simply calls cron hooks and then processes queues as long as as it was specified.

This is not really the model that it was designed for, the idea is that cron creates N actual tasks (as in, a specific feed to process, a remote item to fetch and import, ..) that are then processed over time.

Setting up recurring tasks every time unconditionally on every cron run doesn't really make sense, as it just delays the processing by a few seconds.

I am also not sure that this makes sense like this in general. The issue summary says that the drush command should also be specified as a "passive" fetcher, so we know where things are coming from. But you wouldn't want to call that. So maybe have a passive/active flag, or just let specific plugins implement this for themself?

Berdir’s picture

Some feedback based on discussion with Miro:

* Maybe what I said in our discussion was misleading, but the current config pattern seems strange.. you're treating it like a config entity (multiple config files) but then not. It also limited to a single configuration per plugin, so impossible to fetch from two different accounts. (and the imap fetcher seems to be using a different, standalone configuration in the implementation?). I would suggest to use the same config entity pattern as we do elsewhere for now (The only viable alternative from my point would be a central config entity where you set up multiple instances of each plugin type as a single "thing" (process/queue/flow/... ?), but that's something for another discussion.

* We discussed that fetchers should have an active => true/false flag in the annotation, then we'd query for config entities with a plugin that is any of the active ones, and only execute those.

* The drush fetcher would then be active = false, we discussed that it would still be useful to possibly have multiple of them (you could provide one by default), and then you have to specify the ID (optionally?) in the drush command

* Remove the queue handling, just call them directly (they can still set up their own queue handling, if they want to).

* For now, in imap, just fetch the count of unread messages, process up to N (default=100) mails, then stop. then store the remaining amount (also if 0) and how many you processed in state, e.g. inmail.fetcher.$id.remaining/processed. That is simpler and at least as flexible for now (A queue would only really help if there would be a chance to process things in parallel). The state information can be displayed somewhere or we could also have monitoring integration for it.

* Naming: A passive fetcher is a bit a weird thing if you think about it, it is a fetcher that doesn't fetch but e.g. receives (see drush) We discussed various ideas, tending towards viewing from the perspective of inmal, then what these things do is deliver us mails:

- Deliverer (IMHO a good match as far as the meaning goes, but a weird word)
- MailProvider (provider is very generic... )
- Postman ("Delivers mails", kinda cool, but british english ;))
- Mailman (american postman, not so cool tho ;))

Hope that's all the topics that we discussed, not sure how much of this should be done here and what later, I guess the conversion of drush to a plugin/config so that we can identify it can happen later, but then we can not rely on it yet.

miro_dietiker’s picture

Additionally discussed this today with Arla.

Well, splitting it up is fine if it leads to quicker progress and fast partial review. We need all anyway ASAP. ;-)

Arla’s picture

Status: Needs work » Needs review
FileSize
40.28 KB
43.9 KB

Yes, would be nice to commit and create followups.

As per #10, I removed queue handling and change config structure to the same pattern as Analyzers and Handlers. This lead me to create a few generalized superclasses (PluginConfigEntity and PluginConfigurationForm). I also implemented the batch size (N messages at a time) and tested it manually.

Points in #8:
1-5, 9 invalidated by removing queue handling.
10 invalidated by config restructuring.
6-8, 12, 13 transformed to @todo comments for followups.
11. Ignored for now. With the IMAP fetcher, the "message number" could be used as a key. But other fetchers might have no similar concept. I'm not sure how keys could be useful.

Still not possible to create new fetchers, it's a @todo. An unconfigured IMAP fetcher is provided until adding fetchers is implemented.

The active annotation flag should be added in #2379909: Make the drush command a Deliverer ASAP I think.

I like the word Deliverer but I don't want to commit to it yet... But it would of course be cleaner to perform that name change in this issue, than later. Postman and Mailman IMO have the slight issue that they are more or less gendered.

New @todo comments now are:

+ * @todo Add "Configure new fetcher" action button.
+    // @todo Return silently if not configured.
+    // @todo Return noisily if misconfigured. Possibly stop retrying.
+      // @todo Inject logger.
+      // @todo Consider throwing an exception.
+    // @todo Introduce options for message selection:
+    // @todo Create a Monitoring sensor for this state key.
+    // @todo Add descriptions here and/or in FetcherBase.
+    // @todo Hide password field unless user checks to change password.

Do we do all this in followups, or get some of it done right now?

miro_dietiker’s picture

Status: Needs review » Needs work

Most followups are fine. I like to do as part of this issue:

  • Final name decision
  • Ability to create new instances. (No unneeded or "wrong" default instances anymore.)
  • Basic test coverage
Arla’s picture

Status: Needs work » Needs review
FileSize
46.3 KB
34.53 KB

Name change (to "Deliverer"), ability to create new instances, and basic test coverage done. Interdiff large because of the rename, but unlike last patch, at least it is not larger than the patch itself ;) Also fixed a few minor @todos: injection, form descriptions and password field.

miro_dietiker’s picture

Status: Needs review » Needs work

Looks great. I only checked the tests in detail though.

  1. +++ b/src/Tests/InmailWebTest.php
    @@ -53,6 +53,29 @@ class InmailWebTest extends WebTestBase {
    +    // Check Deliverer list.
    +    $this->clickLink('Mail deliverers');
    

    I'm a bit confused here that you don't use an explicit URL ever. It makes the test sequence a bit inaccessible for developers.

  2. +++ b/src/Tests/InmailWebTest.php
    @@ -53,6 +53,29 @@ class InmailWebTest extends WebTestBase {
    +    $this->assertText('There is no Mail deliverer yet.');
    ...
    +    $this->clickLink('Configure new deliverer');
    

    The pattern is clearly "Add xyz" instead of "Configure new...". I can understand your concerns a bit, but we follow these patterns to avoid further confusion. You can be more verbose if the empty message if you like.

  3. +++ b/src/Tests/InmailWebTest.php
    @@ -53,6 +53,29 @@ class InmailWebTest extends WebTestBase {
    +    $this->assertLink('Configure');
    +    $this->clickLink('Disable');
    +    $this->clickLink('Enable');
    

    Some more verbose comments in the test would be a great thing.

I'm missing triggering the cron. Let's create a followup to implement a test deliverer that verifies cron calls.

Go ahead and commit once the small fixes pass.

Arla’s picture

Status: Needs work » Needs review
FileSize
50.97 KB
7.49 KB

Things fixed. I took the opportunity to split up the tests and rename methods to camelCase.

miro_dietiker’s picture

Status: Needs review » Fixed

Yippie, looking good. Committed.

  • miro_dietiker committed c4f5b85 on 8.x-1.x authored by Arla
    Issue #2379889 by Arla, miro_dietiker, Berdir: Fetch email by IMAP/POP3
    
miro_dietiker’s picture

Assigned: Unassigned » Arla
Status: Fixed » Needs work

Arla, plz create followups and close again. :-)

miro_dietiker’s picture

Component: Code » Deliverer
miro_dietiker’s picture

Title: Fetch email by IMAP/POP3 » Fetch email by IMAP

Reducing title to IMAP, POP3 would be a (low prio) followup.

  • Arla committed a3d022c on 8.x-1.x
    Issue #2379889: Fetch email by IMAP; Added URLs of followup issues
    
Arla’s picture

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.