I've coded a custom condition for a project and was wondering whether it could be useful to everybody. Basically, it adds a new condition in the "Data" group that compares a string against a list of strings, i.e. a list<text>, which is an array of strings. If any item in the list matches the string, it returns TRUE. Here's the code I wrote. It could be improved: the comparison loop could be a do-while for better performance, there's no need to declare a $pattern variable for a single use, and the final test could drop the == 0 since in PHP 0 == FALSE. But it works very well :)

For context, I used it to automatically flag comments that contain offensive words when a comment is submitted or edited. That's why I used a regexp with \b, to be sure to match whole words only.

/**
 * Implements hook_rules_condition_info().
 *
 * Declare a new Rules condition : "Text contains any word in a list"
 * This condition is used to match comment content against the list of suspect words
 *
 * @return array the condition description
 */
function mymodule_rules_condition_info() {
  return array(
    'mymodule_text_contains' => array(
      'label' => t('Text contains any word in a list'),
      'group' => t('Data'),
      'parameter' => array(
        'text' => array(
          'label' => t('Text to search into'),
          'type' => 'text',
        ),
        'list' => array(
          'label' => t('List of words to match'),
          'type' => 'list<text>',
          'description' => t('The comparison will be case insensitive'),
        ),
      ),
    ),
  );
}

/**
 * Rules condition callback. Matches a text against a list of words.
 *
 * @param $text string Text to search into
 * @param $list array List of words to match
 * @return boolean TRUE if there's a match, FALSE otherwise
 */
function mymodule_text_contains($text, $list) {
  $match = 0;
  foreach ($list as $word) {
    // This pattern takes care of word boundaries, and is case insensitive
    $pattern = "/\b$word\b/i";
    $match += preg_match($pattern, $text);
  }
  return $match == 0 ? FALSE : TRUE;
}

Comments

xandeadx’s picture

Component: Rules Core » Rules Engine

+1

DjebbZ’s picture

I hope it can be useful to you !

webchick’s picture

This looks pretty handy. I am struggling hard with trying to envision a simple use case for Rules out of the box to explain it to people, and if this condition were there, a "spam filter" would be an obvious one.

fago’s picture

Status: Active » Needs work

Yep, sounds handy. Let's polish and include it.

$pattern = "/\b$word\b/i";

This would treat $word as regex. Maybe we should just use stripos() instead?

DjebbZ’s picture

The problem with stripos() is that it detects words inside longer words, which specifically wasn't my use case. The wording of this condition is "text contains any word in a list". stripos() would make it more like "text contains this sequence of characters from a list". I think both are completely OK and, even if close in meaning and code, each can exist on its own.

For someone who's searching for whole words, stripos() would bring false positives. For someone searching for a sequence of characters, stripos() is better. We could even make case sensitivity an option for both.
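
To make the difference concrete, here is a minimal sketch in plain PHP, with no Drupal dependencies. The names contains_sequence() and contains_word() and the $case_sensitive flag are hypothetical, made up for this illustration; they are not part of Rules.

```php
<?php
/**
 * Hypothetical helper: substring match, in the spirit of stripos().
 */
function contains_sequence($text, $needle, $case_sensitive = FALSE) {
  $pos = $case_sensitive ? strpos($text, $needle) : stripos($text, $needle);
  return $pos !== FALSE;
}

/**
 * Hypothetical helper: whole-word match using \b word boundaries.
 */
function contains_word($text, $word, $case_sensitive = FALSE) {
  // preg_quote() keeps characters in $word from being read as regex syntax.
  $pattern = '/\b' . preg_quote($word, '/') . '\b/' . ($case_sensitive ? '' : 'i');
  return preg_match($pattern, $text) === 1;
}

$text = 'The spammer left';
var_dump(contains_sequence($text, 'spam')); // bool(true): found inside "spammer"
var_dump(contains_word($text, 'spam'));     // bool(false): no standalone word "spam"
```

The substring version flags "spammer" when looking for "spam", which is the false positive described above; the word-boundary version does not.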

fago’s picture

I see. I'd agree that it should check for words only.
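
Both concerns could probably be met at once: keep the \b word boundaries, but escape each word so it is matched literally rather than as a regex. A rough sketch in plain PHP follows; the function name text_contains_any_word() is hypothetical, and this is one possible shape for the callback, not the code that Rules ships.

```php
<?php
/**
 * Hypothetical variant of the condition callback: whole words, matched literally.
 */
function text_contains_any_word($text, array $list) {
  foreach ($list as $word) {
    $word = trim($word);
    // An empty entry would produce /\b\b/, which matches almost any text.
    if ($word === '') {
      continue;
    }
    // preg_quote() escapes regex metacharacters and the "/" delimiter.
    $pattern = '/\b' . preg_quote($word, '/') . '\b/i';
    if (preg_match($pattern, $text) === 1) {
      // Stop at the first hit; the remaining words need not be checked.
      return TRUE;
    }
  }
  return FALSE;
}

var_dump(text_contains_any_word('abc', array('a.c')));               // bool(false)
var_dump(text_contains_any_word('Use and/or here', array('and/or'))); // bool(true)
```

One caveat: \b only sits between a word character and a non-word character, so a list entry that starts or ends with punctuation may not match the way one would expect.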

mitchell’s picture

Title: New condition : "Text contains any word in a list" » Condition : "Text contains any word in a list"
Issue tags: +data transforms

ressa’s picture

Component: Rules Engine » Rules Core
Status: Needs work » Active

Thanks for sharing @DjebbZ, it was just what I was looking for. Any chance this might make it into the official version? I have tested it, and it seems to work just fine.

Jarviss’s picture

Wolfgang, can this be added to a Rules release? In addition, we could change the code to use a Vocabulary as the spam word list!

Jarviss’s picture

Wolfgang, can you help change this code so that the Rules condition compares against the terms of a spam Vocabulary, based on a Vocabulary ID provided to the condition?

Jarviss’s picture

Issue summary: View changes

Here is code for Rules that takes a Vocabulary ID, the vocabulary being the spam filter word list:

<?php
/**
 * Implements hook_rules_condition_info().
 *
 * Declare a new Rules condition : "Text contains words from Vocabulary"
 * This condition matches comment content against the term names of a spam vocabulary
 *
 * @return array the condition description
 */
function rules_spam_rules_condition_info() {
  return array(
    'rules_spam_text_contains' => array(
      'label' => t('Text contains words from Vocabulary'),
      'group' => t('Data'),
      'parameter' => array(
        'text' => array(
          'label' => t('Text to search into'),
          'type' => 'text',
        ),
        'list' => array(
          'label' => t('Vocabulary ID'),
          'type' => 'text',
          'description' => t('ID of the vocabulary whose term names are matched. The comparison will be case insensitive'),
        ),
      ),
    ),
  );
}

/**
 * Rules condition callback. Matches a text against the term names of a vocabulary.
 *
 * @param $text string Text to search into
 * @param $list string Vocabulary ID whose term names are the words to match
 * @return boolean TRUE if there's a match, FALSE otherwise
 */
function rules_spam_text_contains($text, $list) {
  $match = 0;
  $terms = taxonomy_get_tree((int) $list);
  foreach ($terms as $term) {
    // Escape the term name so it is matched literally. The pattern takes care
    // of word boundaries, is case insensitive and treats the text as UTF-8.
    $pattern = '/\b' . preg_quote($term->name, '/') . '\b/ui';
    $match += preg_match($pattern, $text);
  }
  return $match == 0 ? FALSE : TRUE;
}

arruk’s picture

This looks great! Has anyone used it with Reservation Conflict?

mpotter’s picture

Would really like to see a generic "contains" string data comparison. Tried something like the OP but I don't see anything in the "Comparison Operator" field for data.

The use case: I want to compare the email address of a new user to see if it contains certain text in order to automatically assign it to an OG Group.

Even better would be a regex comparison operator. Does this already exist somewhere in contrib?

mpotter’s picture

Never mind, I'm an idiot. I was looking at the operators within Data comparison and missed the Text comparison conditions entirely.

Québec’s picture

Hi.

I'm trying to make a simple antispam to prevent humans from manually creating unwanted content. But I just cannot find the way to make Rules check into a list. It seems to work if I make one condition per word. But if I make a list — one word per line or comma separated — it does not work.

So I saw all that PHP code (#0 and #11). Does this mean that it is not possible to make Rules match content against a list of words? Do I need this code? Where do I put it? How do I make the list: comma separated?

Thanks.

TR’s picture

Still in need of an actual patch here.
Also needs a test case.

TR’s picture

Version: 7.x-2.x-dev » 8.x-3.x-dev

Moving to 8.x-3.x. There doesn't seem to be much interest in adding this, but if it is added it should go into the current version of Rules first. It can then be backported to 7.x-2.x if there is community interest.

TR’s picture

Issue tags: +D8RE

Tagging issues that will be fixed by code moved from D8RE.

TR’s picture

Component: Rules Core » Conditions