I am currently (and successfully) using apachesolr on a project with a custom facet on a CCK date field.
Since my changes were made inside the apachesolr module itself, I want to move them into a separate module (so I can update apachesolr without patching the code again).

I am using the hook for (standard) CCK fields, hook_apachesolr_cck_fields_alter(&$mappings), and this is enough to get a facet.
But to make it work like the date facets (for the node created and node changed fields) I had to change
function apachesolr_search_add_facet_params(&$params, $query)
and
function apachesolr_search_block($op = 'list', $delta = 0, $edit = array())

There is one more change that I've made. It is in function apachesolr_search_date_range($query, $facet_field), and it handles dates that are stored in ISO format (string) in the database.
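
To illustrate the ISO handling described above, here is a minimal sketch rather than the code from the attached diff. The helper name example_iso_to_solr_date() is hypothetical, and it assumes the stored ISO string is already in UTC.

```php
<?php
// Hypothetical helper (not part of apachesolr): converts an ISO date string,
// as stored by a CCK date field, into the UTC timestamp format Solr expects.
function example_iso_to_solr_date($iso) {
  // Assumption: the stored value is already in UTC, so no offset is applied.
  $date = new DateTime($iso, new DateTimeZone('UTC'));
  // The backslashes keep the literal "T" and "Z" in the formatted output.
  return $date->format('Y-m-d\TH:i:s\Z');
}
```

A real implementation would also need to account for the field's own timezone settings, which this sketch ignores.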

I am attaching a diff.

Is support planned for additional CCK fields that act like date facets (and if so, when)? Is there a plan I should take into account when adding support for date facets from CCK fields?

I was thinking of automatically using the date facet functions whenever the CCK field is of type date, datetime, or timestamp...
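
A rough sketch of that idea, assuming the field definition is an array with a 'type' key and that the Date module's field types are named 'date', 'datetime' and 'datestamp' (the function name is made up for illustration):

```php
<?php
// Hypothetical check (not part of apachesolr): decides whether a CCK field
// definition should be handled by the date facet code.
function example_is_date_field(array $field) {
  // Assumption: these are the type names the Date module registers.
  $date_types = array('date', 'datetime', 'datestamp');
  return isset($field['type']) && in_array($field['type'], $date_types, TRUE);
}
```

With something like this, the date facet functions could be selected by field type instead of by a hardcoded field name.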

Mihajlo
Kontrola

Comments

moshe weitzman’s picture

Any chance we can get basic CCK date field support in core apachesolr? If not, let's try to clear these blockers. I have not tried the code, but will probably need this soon.

ximo’s picture

Version: 6.x-1.0-rc2 » 6.x-1.x-dev
Category: support » feature

I'm very interested in this. Would this be possible for 1.x-rc4, or will this have to go into 2.x? The patch seems simple enough.

jpmckinney’s picture

Status: Active » Needs work

The patch hardcodes the name of the CCK field. There needs to be more abstraction.

robertdouglass’s picture

I'm working on this. It's a big patch and a custom module. Will post soon.

obrigado’s picture

Subscribing.

robertdouglass’s picture

Just want to confirm ongoing work and the imminent propinquity of the first patch + new module to try.

Maikel’s picture

Subscribing

robertdouglass’s picture

Version: 6.x-1.x-dev » 6.x-2.x-dev
Attached file: new, 41.84 KB

Here's a patch against 6.2 that adds a new module, apachesolr_date.

It also refactors much of the CCK handling code altogether. Notably, CCK facet definitions must always define an indexing callback.

I'm eager to have people try it but the code needs more cleaning up.

BenK’s picture

Subscribing...

DenRaf’s picture

I have tested the patch on my system.

When rebuilding the index only the 'tdate' types are being indexed.

When rebuilding the index using a patched version of nd_search, everything works just fine. The only difference is that those 'tdate' types are now 'date' types.

Will have a deeper look into it, to get things working without nd_search.

robertdouglass’s picture

What is nd_search?

Thanks for testing. I have an updated patch for later today.

DenRaf’s picture

nd_search is a contrib module for display suite.

Good, cause I'm unable to get those facets as you showed them on Fosdem.

robertdouglass’s picture

Status: Needs work » Needs review
Attached file: new, 25.73 KB

Here's a new version. I want to commit this soon to get wider testing.

DenRaf’s picture

Status: Needs review » Needs work

This new patch doesn't contain anything of the apachesolr_date module. Still have it from the older patch.

Without nd_search, only those tdates fields are being indexed.

The problem in building the facets is that $response->facet_counts->facet_dates are empty.

haxney’s picture

This issue is related to #664896: Automatic CCK introspection, especially the changes that the patch in comment 8 makes to indexing.

I do think that indexing_callback shouldn't be required for fields that can be indexed with a default indexer, such as text and number fields. A lot of CCK fields have a single value attribute which stores the value of the field.

I'm working on merging these two patch sets (the ones from #664896: Automatic CCK introspection and the one here) together in a Git repository at http://github.com/haxney/apachesolr . I'll be pushing all of my updates there, and when I have something presentable, I'll submit a patch here.

robertdouglass’s picture

Attached file: new, 43.31 KB

@haxney - that sounds great. @DenRaf - here's the patch I tried to submit before.

DenRaf’s picture

Still the same issue: $response->facet_counts->facet_dates is empty. (apachesolr_date.module:347)

Are there maybe changes required to the solrconfig or the schema?

robertdouglass’s picture

Well, I'm using the new tdates:
schema.xml:

    <!-- A Trie based date field for faster date range queries and date faceting. -->
    <fieldType name="tdate" class="solr.TrieDateField" omitNorms="true" precisionStep="6" positionIncrementGap="0"/>

Did you launch Solr with the schema.xml from the latest 2.x-dev version? If so it should be fine.

DenRaf’s picture

Yes, and I checked that immediately when I saw you were using the tdate type.

Any idea why facet_dates is empty?

haxney’s picture

@robertDouglass I've updated #664896: Automatic CCK introspection, and have new code at my GitHub project. I've changed some of how hook_apachesolr_cck_fields_alter() works (not dramatically; it is pretty much exclusively redundancy elimination), so you could probably save yourself some time by not having to write a bunch of identical callbacks (if it can handle them automatically).

I'm also definitely planning on using your system of having parallel Solr fields for complex CCK fields (like 'date' and 'date_end'). Hopefully, I'll be able to make your life a bit easier, too. :)

robertdouglass’s picture

@haxney - excellent. Keep up the work. I'll be able to review this coming weekend.

Macronomicus’s picture

Thanks haxney .. I will test this weekend too!

mcarbone’s picture

Looks like you didn't add date fields to the introspection code? I tried adding it explicitly myself but they don't seem to be working. Can you give me an example hook_apachesolr_cck_fields_alter for a date field to test this?

robertdouglass’s picture

Attached file: new, 45.2 KB

Progress. The facet blocks are *generally* not working correctly, but the date stuff seems to be in place. Any help debugging greatly appreciated.

mcarbone’s picture

I'd like to help debugging but I want to make sure I'm configuring things correctly before I do.

I applied your patch and uninstalled and then re-enabled the search and date solr modules on a sandbox with a nodereference to page, a select text field, and a date field on the story type. The noderef and text field were autodiscovered, but only the noderef filter is appearing. As for the date field, I added this hook to a custom module:

function customsolr_apachesolr_cck_fields_alter(&$mappings) {
  $mappings['per-field']['field_date'] = array('callback' => '', 'index_type' => 'string');
}

It then appeared as a filter, but didn't appear as a facet block when I enabled it. (The blocks are enabled, and, as I said, the noderef one is appearing correctly.) Is that not the right way to map the date field?

robertdouglass’s picture

mcarbone - It was my intention that you wouldn't need to map the date field at all - that the module would make facets for starting and ending dates without you doing anything special.

It's been quite frustrating though, because the behavior that other people see when they test it often diverges from what I was seeing during development.

For the patch I attached, I was seeing these things working:
1. Date fields of all types get recognized automatically and make separate facet blocks for starting and ending dates.
2. You can drill down into any date facet down to the hour.

What wasn't working was the interplay between other facet blocks. Clicking a facet link was making all other facet blocks disappear, despite the fact that the right search results were being fetched.

mcarbone’s picture

Aha. It appeared when I switched the date widget type from popup to select or text. Seems like you need to add 'date_popup' in addition to 'date_select' and 'date_text.'

The date facet block is now appearing (the text select one still isn't), although it's timing out when I click one of its options. The noderef facet block works fine.

mcarbone’s picture

Huh, the date facet drilldown worked for me briefly. And it worked in conjunction with the noderef facet. Now it's timing out again -- very inconsistent. (The noderef facet works consistently.)

robertdouglass’s picture

mcarbone: it's exactly these reports of inconsistent behavior that have been haunting me on this patch. Thanks for your testing. Let me know if you find anything. Did you look at the Solr logs for possible errors?

mcarbone’s picture

After doing some digging:

First, it seems like the reason my field_options (text with select widget) facet isn't working is that it's not being indexed, and it's not being indexed, I think, because it doesn't have an indexing_callback. E.g.,

Array ( [field_name] => field_page [indexing_callback] => apachesolr_cck_nodereference_indexing_callback )
Array ( [field_name] => field_date [indexing_callback] => apachesolr_date_date_field_indexing_callback )
Array ( [field_name] => field_options [indexing_callback] => ) 

As such, the following conditional on line 119 of apachesolr.index.inc is false for field_options:

        if ($cck_info['indexing_callback'] && function_exists($function)) {

Without the patch, it gets indexed fine and the facet works.
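
One way around this would be to fall back to a default indexer when no callback is set. The following is only a sketch of that idea, not what the patch does: both function names are hypothetical, and it assumes each field item stores its data under a 'value' key.

```php
<?php
// Hypothetical default indexer: collects the plain 'value' of each field item.
function example_default_indexing_callback(array $items) {
  $values = array();
  foreach ($items as $item) {
    // Skip items with no stored value so empty strings are not indexed.
    if (isset($item['value']) && $item['value'] !== '') {
      $values[] = $item['value'];
    }
  }
  return $values;
}

// Hypothetical resolver: uses the mapped callback when it exists and
// otherwise falls back to the default indexer above.
function example_resolve_indexing_callback(array $cck_info) {
  $function = isset($cck_info['indexing_callback']) ? $cck_info['indexing_callback'] : '';
  if ($function && function_exists($function)) {
    return $function;
  }
  return 'example_default_indexing_callback';
}
```

With this, a mapping like the field_options one above (empty indexing_callback) would still resolve to a usable function.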

Now, as to why the date facets are timing out, I'm still not entirely sure. I'm seeing this in watchdog: "0" Status: Request failed. I didn't set up the Solr instance, so I'm not sure if I'm looking at the right logs, but my sysadmins told me to look in /var/log/daemon.log -- is that the right log to look in? If so, I saw some Solr-related lines in there, but nothing that indicated an error.

The freezing seems to be happening on line 377 of apachesolr_search.module:

$response = $solr->search(htmlspecialchars($query->get_query_basic(), ENT_NOQUOTES, 'UTF-8'), $params['start'], $params['rows'], $params);

It's an empty query, and here is what's in the $params variable:

Array
(
    [fl] => id,nid,title,comment_count,type,created,changed,score,path,url,uid,name,teaser
    [rows] => 10
    [facet] => true
    [facet.mincount] => 1
    [facet.sort] => true
    [facet.date] => Array
        (
            [0] => tds_cck_field_date
        )

    [f.tds_cck_field_date.facet.date.start] => 2010-02-08T04:00:00Z/HOUR
    [f.tds_cck_field_date.facet.date.end] => 2011-02-14T04:00:00Z+1HOUR/HOUR
    [f.tds_cck_field_date.facet.date.gap] => +1HOUR
    [facet.field] => Array
        (
            [0] => is_cck_field_page
            [1] => ss_cck_field_options
        )

    [facet.limit] => 20
    [bf] => Array
        (
            [0] => recip(rord(created),4,30,30)^200.0
        )

    [start] => 0
    [q.alt] => tds_cck_field_date:[2010-02-08T04:00:00Z TO 2011-02-14T04:00:00Z]
)

Not sure if that's helpful, but I'm fairly new to this and am not sure how to debug the actual search call.
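
For reference, the date-related entries in that dump follow a regular pattern. This sketch shows how they could be assembled; the function name and its arguments are made up for illustration and this is not the module's actual code.

```php
<?php
// Hypothetical builder for the per-field date facet parameters seen in the
// $params dump above. $unit is a Solr date math unit such as HOUR or DAY.
function example_date_facet_params($field, $start, $end, $unit) {
  return array(
    'facet.date' => array($field),
    // Round the start down to the unit boundary.
    "f.{$field}.facet.date.start" => "{$start}/{$unit}",
    // Push the end one unit further, then round down, so the last bucket is included.
    "f.{$field}.facet.date.end" => "{$end}+1{$unit}/{$unit}",
    "f.{$field}.facet.date.gap" => "+1{$unit}",
  );
}
```

Calling it with 'tds_cck_field_date', '2010-02-08T04:00:00Z', '2011-02-14T04:00:00Z' and 'HOUR' gives the same four date facet entries as in the dump.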

robertdouglass’s picture

Very helpful. Tell your sysadmin that the log of interest is catalina.out in Tomcat (if that's what you're using).

drewish’s picture

robertdouglass’s picture

@mcarbone can you look and see if you are seeing queries like these take exceptionally long to execute?
SELECT MIN(cck.field_mt_dates_facet2_value2) FROM content_field_mt_dates_facet2 cck INNER JOIN node n WHERE n.status = 1;

robertdouglass’s picture

So on the problem server which is hanging, the SELECT MIN() query with the INNER JOIN takes minutes to return, but this returns instantly:
SELECT MIN(cck.field_mt_dates_facet2_value) FROM content_field_mt_dates_facet2 cck;

Anyone know how to optimize the original query to get better performance?

robertdouglass’s picture

AHH! I'm missing the ON part of the join and it is therefore doing massively stupid things at the database level.
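
For anyone following along, a sketch of what the query looks like with the join condition in place. It assumes the CCK table has the usual vid column to join against node; the function name is hypothetical and the real patch may build the query differently.

```php
<?php
// Hypothetical query builder illustrating the missing ON clause.
// Assumption: $table and $column come from trusted field metadata, never
// from user input, since they are placed directly into the SQL string.
function example_min_date_query($table, $column) {
  return "SELECT MIN(cck.{$column}) FROM {$table} cck "
    . "INNER JOIN node n ON cck.vid = n.vid WHERE n.status = 1";
}
```

Without the ON clause, the INNER JOIN pairs every CCK row with every node row, which is why the original query took minutes.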

robertdouglass’s picture

Status: Needs work » Needs review
Attached file: new, 47.81 KB

robertdouglass’s picture

Attached file: new, 47.51 KB

drewish’s picture

There's packaging script crud in all the .info files.

pwolanin’s picture

Status: Needs review » Needs work

There's a bunch of cruft in your patch, like:

Index: apachesolr.info
===================================================================
RCS file: /cvs/drupal-contrib/contributions/modules/apachesolr/apachesolr.info,v
retrieving revision 1.1.2.1.2.8
diff -u -p -r1.1.2.1.2.8 apachesolr.info
--- apachesolr.info	4 Jun 2009 13:33:36 -0000	1.1.2.1.2.8
+++ apachesolr.info	21 Mar 2010 19:07:23 -0000
@@ -5,3 +5,10 @@ dependencies[] = search
 package = Apache Solr
 core = "6.x"
 php = 5.1.4
+
+; Information added by drupal.org packaging script on 2010-03-10
+version = "6.x-2.x-dev"
+core = "6.x"
+project = "apachesolr"
+datestamp = "1268179277"
+
robertdouglass’s picture

Status: Needs work » Fixed

I cleaned the info files and committed. Further testing and review welcome - but there's great value in getting the code out there.

#558160 by robertDouglass, mihha | DenRaf, mcarbone, haxney: Added date facet for cck fields.

robertdouglass’s picture

Attached file: new, 2.43 KB

Followup to fix broken text indexing.

mcarbone’s picture

That patch works for me and the text facet now appears. However, and this was happening before but I was waiting to report it until the other missing facet issue was resolved, the date facet block disappears once I filter using either one of my other non-date facets.

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

MickC’s picture

Hi,

Is this patch now committed to the current dev or do we still have to apply it?

Also, there is another patch in this thread http://drupal.org/node/920880

Which is the right one to use to get CCK date facets working correctly?

Thanks

jpmckinney’s picture

#41 has been committed, yes.

MickC’s picture

Hi - I'm still getting problems after using the latest dev as per below. Do I need to reindex before it's fixed? MC

Filter by Date

2009 (15)
2009 (15)
2009 (15)
2009 (11)
2009 (11)
2009 (11)
2009 (11)
2009 (10)
2009 (10)
2009 (9)

alex72rm’s picture

The same issue here: I've an output for cck date (latest -dev) that isn't usable (sorting is so curious!)

2009 (2)
2002 (1)
2005 (1)
2009 (1)
2009 (1)

Afterwards, when I click on a link, I obtain directly some hour instances:

(-) 8:52 AM
8:52 AM (1)
2:33 PM (1)

I've re-indexed without solving the issue.

mihha’s picture

See the thread: #920880: facet_block_callback not propagated
There is patch in http://drupal.org/node/920880#comment-3543678

One part is in DEV, the other part is not. My sites behave the same way without the second part of the patch... Give me a few hours to make the patch again and I'll post it.

mihha’s picture

Attached file: new, 962 bytes

It's been more than a few hours...

And maybe we should continue in the other issue...

alex72rm’s picture

Status: Closed (fixed) » Patch (to be ported)

@mihha: thank you very much! Your patch is the only one needed to be applied to the latest 2.x-dev version to solve this issue.

MickC’s picture

@mihha: brilliant! Well done - the last patch seems to have fixed the CCK date formatting, now showing years at the top level, drilling down into months and days.

I now need to figure out how to control the index so it only offers upcoming dates, and optionally past dates. Any clues how to do that? One thought was to have a separate, maybe calculated, field with values such as 'Today', 'Tomorrow', 'This Week', 'Next Week', etc.
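
One possible direction, sketched under assumptions: instead of changing the index, restrict results at query time with a Solr range filter that uses date math (NOW). The function name is hypothetical, and how the filter gets attached to the query in apachesolr is left open here.

```php
<?php
// Hypothetical helper: builds a Solr range filter on a date field.
// By default only dates from the current moment onwards match; passing TRUE
// for $include_past opens the lower bound so past dates match as well.
function example_upcoming_filter($field, $include_past = FALSE) {
  $from = $include_past ? '*' : 'NOW';
  return $field . ':[' . $from . ' TO *]';
}
```

For a field indexed as tds_cck_field_date this yields tds_cck_field_date:[NOW TO *], which could be applied as a filter so the facet only counts upcoming dates.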

Thanks, MC

nick_vh’s picture

Version: 6.x-2.x-dev » 6.x-3.x-dev

Moving to 6.x-3.x. Will close once facetapi has this working for Drupal 6

nick_vh’s picture

Status: Patch (to be ported) » Closed (won't fix)

Closing, 6.x-3.x is working and facetapi is also semi working.