I wrote a small custom module that deletes nodes of a given type once they are older than 30 days and have no taxonomy terms assigned. I called it "Flush Feeds" because the node type is a feed item. This is the code:


function flush_feeds_cron() {

  flush_feeds('feed_item');

}


function flush_feeds($contenttype) {

  $expired_time = strtotime('-30 days');

  $query = db_query("SELECT nid, title FROM {node} WHERE type = '%s' AND created <= %d ORDER BY nid ASC", $contenttype, $expired_time);

  while ($del = db_fetch_object($query)) {

    // Check if node has a taxonomy term assigned
    // (counted in SQL, so it works with any database driver, not only mysql)
    $check = db_result(db_query("SELECT COUNT(*) FROM {term_node} WHERE nid = %d", $del->nid));

    if ($check == 0){

      // Solve permission problem since cron runs as anonymous user
      global $user;
      $temp_user = $user;
      $user = user_load(array('uid' => 1));

      // Delete node
      node_delete($del->nid);

      // Set watchdog message
      watchdog('Flush Feeds', 'Feed item deleted: %title (@nid)', array('%title' => $del->title, '@nid' => $del->nid));

      // Set back $user to what it was before
      $user = $temp_user;

    }

  }

}

Since node_delete() checks access permissions and cron runs as the anonymous user, I temporarily switch the global $user to uid 1 to get delete permission. Now I'm wondering whether that is safe. I'd appreciate it a lot if someone with more knowledge could tell me.

Comments

yuriy.babenko

I've seen this user_load() trick done before, and have even used it, but I don't like it. Too much opportunity for something to go wrong and end up assigning UID1 permissions to heck knows what/whom.

Safe or not, the code can be improved :).

1. You can just run one JOIN'ed query instead of two separate ones.
2. The user_load() and temporary user switch should happen outside the while() loop (no need to do this for every node you get in the query).
3. As an alternative to the user_load() trick, I would just copy the code from node_delete() and remove the access check:

$query = "
  SELECT n.nid, COUNT(tn.nid) AS count
  FROM {node} n
  LEFT JOIN {term_node} tn ON n.nid = tn.nid
  WHERE n.type = '%s'
    AND n.created <= %d
  GROUP BY n.nid
";

$result = db_query($query, 'article', strtotime('-30 days'));

while ($row = db_fetch_object($result)) {
  if ($row->count == 0) {
    // Clear the cache before the load, so if multiple nodes are deleted, the
    // memory will not fill up with nodes (possibly) already removed.
    $node = node_load($row->nid, NULL, TRUE);

    db_query('DELETE FROM {node} WHERE nid = %d', $node->nid);
    db_query('DELETE FROM {node_revisions} WHERE nid = %d', $node->nid);

    // Call the node-specific callback (if any):
    node_invoke($node, 'delete');
    node_invoke_nodeapi($node, 'delete');

    // Clear the page and block caches.
    cache_clear_all();

    // Remove this node from the search index if needed.
    if (function_exists('search_wipe')) {
      search_wipe($node->nid, 'node');
    }
    watchdog('content', '@type: deleted %title.', array('@type' => $node->type, '%title' => $node->title));
    drupal_set_message(t('@type %title has been deleted.', array('@type' => node_get_types('name', $node), '%title' => $node->title)));
  }
}

---
Yuriy Babenko | Technical Consultant & Senior Developer
http://yuriybabenko.com

Jaypan

You could set up a grants system using hook_node_access_records() and hook_node_grants() that gives the anonymous user node deletion grants only under specific conditions, and then make sure those conditions hold during the cron run. The advantage is that a regular anonymous user has no node edit grants, so they could never reach the screen with the delete button and could not delete anything by accident. Cron does not need that screen, because it calls node_delete() directly.

All theory though - I've not done this.
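
Untested, like the idea itself, but a minimal Drupal 6 sketch of such a grants setup might look like the following. The realm name flush_feeds_cron, the grant ID 1 and the global flag $GLOBALS['flush_feeds_cron_running'] are made-up names for illustration. The assumption is that flush_feeds_cron() sets the flag before calling node_delete() and clears it afterwards. The extra 'all' record assumes feed items should stay viewable as before, since Drupal only writes its default view grant when no module returns records for a node.

```php
<?php
/**
 * Implements hook_node_access_records().
 *
 * Writes a delete grant in a private realm for every feed item.
 */
function flush_feeds_node_access_records($node) {
  if ($node->type != 'feed_item') {
    return array();
  }
  return array(
    // Assumption: keep the default "everyone may view" grant for feed items.
    array(
      'realm' => 'all',
      'gid' => 0,
      'grant_view' => 1,
      'grant_update' => 0,
      'grant_delete' => 0,
      'priority' => 0,
    ),
    // Delete grant that is only handed out while the cron flag is set.
    array(
      'realm' => 'flush_feeds_cron',
      'gid' => 1,
      'grant_view' => 0,
      'grant_update' => 0,
      'grant_delete' => 1,
      'priority' => 0,
    ),
  );
}

/**
 * Implements hook_node_grants().
 *
 * Hands out the delete grant only while the (hypothetical) cron flag is set.
 */
function flush_feeds_node_grants($account, $op) {
  $grants = array();
  if ($op == 'delete' && !empty($GLOBALS['flush_feeds_cron_running'])) {
    $grants['flush_feeds_cron'] = array(1);
  }
  return $grants;
}
```

Existing nodes would only get these records after the node access table is rebuilt with node_access_rebuild(), and as far as I know grants are only consulted for published nodes, so this is worth trying on a copy of the site first.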

yan

Great, thanks a lot!

Heine

See "Safely Impersonating Another User" for general info on user switching (not saying it is the best approach in this case).

Ceterum censeo, API functions shouldn't check user perms.
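
For reference, the pattern on that page boils down, as I understand it, to turning off session saving while the user is switched, so that a failure midway cannot write uid 1 into the current session. A minimal Drupal 6 sketch follows. flush_feeds_run_as_admin() is a hypothetical helper name, not a core function. In Drupal 7, session_save_session() is called drupal_save_session().

```php
<?php
/**
 * Runs $callback as uid 1, then restores the original user.
 *
 * Hypothetical helper for illustration; not part of Drupal core.
 */
function flush_feeds_run_as_admin($callback, $args = array()) {
  global $user;

  // Remember the current user and whether session saving was enabled.
  $original_user = $user;
  $original_session_state = session_save_session();

  // Disable session saving while impersonating, so an error in the
  // callback cannot leave uid 1 stored in the visitor's session.
  session_save_session(FALSE);
  $user = user_load(array('uid' => 1));

  $result = call_user_func_array($callback, $args);

  // Switch back to the original user and session handling.
  $user = $original_user;
  session_save_session($original_session_state);

  return $result;
}
```

With that in place, the loop could call flush_feeds_run_as_admin('node_delete', array($del->nid)). Wrapping the whole loop in a single call would avoid reloading uid 1 for every node.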

yan

This is the D7 code I use:

/**
 * Implements hook_cron().
 */
function flush_feeds_cron() {
  flush_feeds('feed_item', 'field_data_taxonomy_vocabulary_1');
}


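/**
 * Deletes nodes of $content_type that have no row in the field data table
 * $field_name and are older than 'flush_feeds_time_ago' days (default 30).
 * At most 'flush_feeds_threshold' nodes (default 10) are deleted per call.
 */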
function flush_feeds($content_type, $field_name) {

  $expired_time = strtotime('-' . variable_get('flush_feeds_time_ago', 30) . ' days');
  $threshold = variable_get('flush_feeds_threshold', 10);

  $query = db_select('node', 'n');
  $query->leftJoin($field_name, 't', 'n.nid = t.entity_id');
  $result = $query
    ->fields('n', array('nid', 'title'))
    ->condition('n.type', $content_type)
    ->condition('n.created', $expired_time, '<=')
    ->isNull('t.entity_id')
    ->range(0, $threshold)
    ->orderBy('n.created', 'ASC')
    ->execute();

  foreach ($result as $record) {
    node_delete($record->nid);
    watchdog('flush_feeds', '%type deleted: %title (%nid).', array('%type' => node_type_get_name($content_type), '%title' => $record->title, '%nid' => $record->nid));
    drupal_set_message(t('%type %title (%nid) has been deleted.', array('%type' => node_type_get_name($content_type), '%title' => $record->title, '%nid' => $record->nid)));
  }

}

Note: In my case, the content type to delete is called feed_item, and field_data_taxonomy_vocabulary_1 is the data table of the taxonomy field (taxonomy_vocabulary_1).