Hi,

I'm looking for the best method to exclude nodes based on field settings.

A simple example: if a page is unpublished, its node does not appear in the sitemap.

What I need is similar, but instead of requiring the page to be unpublished, the exclusion needs to depend on a field value.

For example, take an event. If the event's end date hasn't passed, the node needs to be in the sitemap, but once the end date has passed, the node needs to be excluded from the sitemap.

The date is a field on the node's custom content type.

What would be the best method to achieve this?

Can it be configured with Rules or some other extension module?

Or should I build a module to achieve this?

If a module, where should I hook in? With hook_query_xmlsitemap_generate_alter()?

Thanks

Comments

haggins commented:

Today I implemented something similar: including only the commerce_product displays of products whose date fields meet a special condition.

I ended up creating a custom module and excluding the relevant node type from the sitemap. That let me query nodes with my own conditions and add them manually to {xmlsitemap} with:

function yourmodule_xmlsitemap_index_links($limit) {
  // Replace with your own query.
  // @see xmlsitemap_node_xmlsitemap_index_links().
  $nids = db_query_range("SELECT n.nid FROM {node} n ....", 0, $limit)->fetchCol();

  $nodes = node_load_multiple($nids);
  foreach ($nodes as $node) {
    // Build the sitemap link and force it to be included, overriding the
    // node type's "excluded" default.
    $link = xmlsitemap_node_create_link($node);
    $link['status'] = 1;
    $link['status_override'] = 1;
    xmlsitemap_link_save($link);
  }
}

Implement hook_cron() with two callbacks:
The first finds new nodes that need to be indexed (see the snippet above).
The second removes nodes from the index that no longer meet the conditions. Here's my example code for that:

/**
 * Remove links from {xmlsitemap} that should no longer be included.
 */
function yourmodule_purge_links() {
  if (!db_table_exists('xmlsitemap')) {
    return FALSE;
  }

  // Get user ignore list.
  $exceptions = variable_get('yourmodule_exceptions', '');
  $ignore = explode("\r\n", $exceptions);

  // Get events whose latest appointment was more than a week ago, that are
  // unpublished or canceled, or whose owner is on the ignore list.
  $date = new DateObject();
  $date->modify("-1 week");
  $max_date = date_format_date($date, 'custom', DATE_FORMAT_DATETIME);
  $nids = db_query("SELECT DISTINCT n.nid "
                  ."FROM {node} n "
                  ."LEFT JOIN {xmlsitemap} x ON x.type = 'node' AND n.nid = x.id "
                  ."INNER JOIN {field_data_event_ref} fder ON n.nid = fder.entity_id "
                  ."INNER JOIN {commerce_product} cp ON cp.product_id = fder.event_ref_product_id "
                  ."INNER JOIN {appointments} a ON a.product_id = cp.product_id "
                  ."INNER JOIN {users} u ON u.uid = cp.uid "
                  ."WHERE (x.id IS NOT NULL AND n.type = 'event_display') AND (cp.status = 0 OR a.date < :max_date OR a.canceled = 1 OR u.name IN (:ignore)) "
                  ."ORDER BY n.nid DESC", array(':max_date' => $max_date, ':ignore' => $ignore))->fetchCol();

  if (!empty($nids)) {
    // Flag the sitemap for regeneration, since links are about to be removed.
    variable_set('xmlsitemap_regenerate_needed', TRUE);

    // Don't use xmlsitemap_link_delete_multiple() because it doesn't allow
    // setting the condition operator the way we need it.
    db_delete('xmlsitemap')
      ->condition('type', 'node')
      ->condition('id', $nids, 'IN')
      ->execute();
  }
}

Your hook_cron() will then look like this:

function yourmodule_cron() {
  yourmodule_xmlsitemap_index_links(xmlsitemap_var('batch_limit'));
  yourmodule_purge_links();
}

Be careful with $limit: with this solution your module doesn't share it with xmlsitemap_node.module, so each module applies the limit on its own during a cron run.

You can also react to node CRUD operations. Look at xmlsitemap_node.module to see how this can be done (e.g. hook_node_insert() etc.).
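As a rough illustration of the node CRUD approach for the event case from the question, here is a minimal sketch rather than a tested implementation. It assumes a content type with the machine name event and a Date field named field_event_date whose end value (value2) is stored as a UTC datetime string. Those names and all the yourmodule_* helpers are hypothetical. The only XML sitemap calls used are xmlsitemap_node_create_link() and xmlsitemap_link_save(), the same two as in the snippet above.

```php
<?php

/**
 * Hypothetical helper: TRUE while the event's end date has not passed.
 *
 * Both arguments are Unix timestamps, so this has no Drupal dependency.
 */
function yourmodule_event_is_current($end_timestamp, $now) {
  return $end_timestamp >= $now;
}

/**
 * Hypothetical helper: read the event's end date as a Unix timestamp.
 *
 * Assumes a field named field_event_date whose 'value2' column holds a UTC
 * datetime string. Returns NULL when no usable end date is set.
 */
function yourmodule_event_end_timestamp($node) {
  $items = field_get_items('node', $node, 'field_event_date');
  if (empty($items[0]['value2'])) {
    return NULL;
  }
  $timestamp = strtotime($items[0]['value2'] . ' UTC');
  return $timestamp === FALSE ? NULL : $timestamp;
}

/**
 * Shared logic: include or exclude the node's sitemap link.
 */
function yourmodule_sync_sitemap_link($node) {
  // 'event' is an assumed content type machine name.
  if ($node->type != 'event') {
    return;
  }
  $end = yourmodule_event_end_timestamp($node);
  $include = $node->status && $end !== NULL
    && yourmodule_event_is_current($end, REQUEST_TIME);

  // Same calls as in the cron snippet; only the status value differs.
  $link = xmlsitemap_node_create_link($node);
  $link['status'] = $include ? 1 : 0;
  $link['status_override'] = 1;
  xmlsitemap_link_save($link);
}

/**
 * Implements hook_node_insert().
 */
function yourmodule_node_insert($node) {
  yourmodule_sync_sitemap_link($node);
}

/**
 * Implements hook_node_update().
 */
function yourmodule_node_update($node) {
  yourmodule_sync_sitemap_link($node);
}
```

This only fires when a node is saved, so an event that expires without being edited would still need the cron purge above. It also assumes your module's hooks run after those of xmlsitemap_node.module, which saves its own link on the same events.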

I don't know whether there's another, cleaner way to achieve this, but it works for my case and I hope it gives you an idea.