You need to be able to search which entities have a given file id in a filefield. Similarly, you need to be able to look up which entities refer to another. Accidentally, these are the very fields that the SQL storage engine needs an index on. Some other engines might need to provide a by-attribute lookup etc.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

drewish’s picture

subscribing since #391330: File Field for Core is blocked on this.

Damien Tournoud’s picture

Because we have pluggable field storage, all the querying needs to be specific to the field storage. At this time, this is done in view by the view_query class, and we need to reimplement that in the field storage engine itself.

Basically: we need to add the logic that, from set of filters (a concept that we don't have at this time yet), returns a set of entities ids.

drewish’s picture

#293841: Allow different back-ends for Views seems a bit relevant to this.

bjaspan’s picture

Here are two function specifications:

/**
 * Return all object types and ids containing field data items matching a query.
 *
 * @param $field_name
 *   The name of the field to query.
 * @param $conditions
 *   An array of query conditions on the field columns.  Keys can include any of the columns for $field_name's field type.  
 *   Values must be of the same data type as the column.  Not all storage engines are required to support queries on all column types.
 *   Also, if the 'revision id' key is specified, only that revision is searched; otherwise, only the current revision is searched.  
 * @return
 *   An array mapping object types (e.g. 'node' or 'user') to an array whose keys are the ids of objects with a field data item matching 
 *   the conditions and whose values are TBD.
 */
function field_attach_query($field_name, $conditions);

/**
 * Exactly like the previous function, but all revisions are searched.
 */
function field_attach_query_revisions($field_name, $conditions);

Notes/questions:

1. Does these functions provide the necessary capability?
2. The functions in principle might query multiple data sources, e.g. SQL, a web service, and a cloud database.
3. The functions can return an arbitrarily large array; an empty $conditions array() will return all object ids in the system!
4. The functions can return just object ids (as currently specified) or return all of the field data columns. However, if we return all the field data columns, we have to do so for all deltas. This makes the returned arrays even larger.

bjaspan’s picture

If $conditions can also include 'id', and if the returned arrays contains all field data items in the standard format, then this function can also be used to load individual fields for an object (whereas field_attach_load() always loads all fields for an object), which is a capability I think we've previously determined we need for a use case I cannot recall. However, if 'id' can only be a scalar, we can only support single-entity load. If 'id' can be an array we can support multiple-load, but now we're getting into the kind of polymorphic API functions that we really want to avoid.

yched’s picture

This should probably be able to handle conditions on entity type, bundle name, in addition to actual field data columns, at minimal cost.
After that, what worries me is the open door to feature creep and "let's duplicate dbtng api / views query builder' requests : array conditions instead of scalars (value IN (x, y) instead of value = z), condition must be true for all value items or just one, cross field queries...
So, yes, the proposed specs should provide the necessary capability for the example use cases in the initial post above

More generally, abstracting out field storage to storage engines has a nasty consequence that we've black-boxed the data :-(.
- Right now field_attach_load() is the only way to access field data. It has a delimited set of use cases and features : in SQL lingo, SELECT all field data WHERE object_id IN ().
- The proposed field_attach_query() would be another way, with its own delimited set of use cases and feature scope : SELECT (TBD) WHERE col1 = value1 AND col2 = value2 AND ....
- it could be used to do SELECT (single field value) WHERE object_id = x (comment #5)
- For more advanced stuff, we are in Views land, not clearly shaped for now.

I think the last entries from David Strauss in the FiC sprint spreadsheet (http://spreadsheets.google.com/ccc?key=pzBnXLotGBqu-mcAObjUaSw, 'Stories and tasks' sheet, lines 129-130, 136-140) are just about this : full-fledged query interface. Scary :-(

bjaspan’s picture

re #6: Yes, this will add limited query capability to fields, and yes, feature creep is totally possible. I thought it was a horrible idea until I was reminded that node_load() and user_load() both support exactly this kind of limited query capability, and the world has not ended, nor have those functions' query capability feature-creeped.

yched’s picture

re #7 : true

re #4 point 4 (return just ids of the objects that match the condition, or return the field data as well) : returning actual values raises the issue discussed and unresolved in http://groups.drupal.org/node/17897. In a regular field_attach_load(), hook_field_load() (for the field-type module) and hook_field_attach_load() (free for all) can alter the field values after they get loaded by the storage engine.
I think we can state that the $conditions in field_attach_query() only apply to the 'raw' values that are actually stored. But if the function returns the field values, then for consistency it also needs to call a single-field version of hook_field_attach_load(), so that the results match what you'd get from a regular field_attach_load(). Right now we have no API designed for that.

bjaspan’s picture

Very good point.

I actually do think we need single-value versions of all Field Attach functions because I want some code to be able to say "please load just field foo for this object right now please" and the same for other attach ops. For example, this is necessary for a module like Taxonomy Fields which I would like to remain possible. I don't especially want a completely duplicate field of field_attach_*_single() functions or something like that, but I haven't thought yet about another way to do it.

chx: You created this issue. What do you need, and why?

sun’s picture

catch’s picture

subscribe. #412518: Convert taxonomy_node_* related code to use field API + upgrade path will either need to keep the term_node table or have something like this to rely on.

yched’s picture

On a related note, I started #415044: Indexes for field storage..

yched’s picture

Status: Active » Needs work
FileSize
9.12 KB

Initial stab at this. Still has many TODOs and open questions; notably, I didn't even start to address the hook_field_load() and hook_field_attach_load() related issues mentioned in #8 / #9 above. The query part is working, though.

Will expand a little later, posting what I have before heading to bed :-).

chx’s picture

Major issue: we need to support ranges, too, not just equals. Think events. Can be followup but it needs to be done. Minor gripe, $row->{_field_sql_storage_columnname($field_name, $column)}; the {} needs to be gone, move the contents to some $property , thanks.

chx’s picture

Oh and big big thanks for working this.

yched’s picture

FileSize
9.88 KB

Rerolled for the $row->{_field_sql_storage_columnname($field_name, $column)}; thing mentioned by chx.
Also documents the fact that array of values are supported ('column IN (values)').

About range conditions : I can see scope creep coming... The risk is to have to reinvent a whole new query syntax, different from dbtng, and different from Views...
Well, maybe we could inspire from db API's $query->condition() method, and accept a 'condition' as an array of (column, value, operator):

$conditions = array(
  array('bundle', array('article', 'page'), 'IN'),
  array('value', 10, '>'),
  array('value', 15, '<'),
);

The db API can rely on fairly standard SQL operators, though. We cannot assume SQL, but we can probably take the operators as tokens. Other storage backends will need to translate accordingly, just like they will have to translate schema-like 'columns' definitions.
Thoughts ?

chx’s picture

You can easily drop $item[$column] = $row->{$column_name}; the {}. Let's move range discussion to next patch.

yched’s picture

Status: Needs work » Needs review
FileSize
25.64 KB

Updated patch :
- Adds hook_field_attach_pre_query() for modules that bypass the storage engine (PBS...)
- Adds two result modes : return ids only or return field values.
- Adds hook_field_load() and hook_field_attach_load() calls for the 'field values' result mode, as discussed in #8 / #9..
- Adds a first set of tests
- Improves PHPdocs

A few words on _load hooks :
- Single-field hook invocation :
_field_invoke() remains the single entry point for hook_field_[op]() invocations. It now accepts an $options array parameter :
$options['field_name'] = 'field_foo' will call the hook only for field_foo field.
$options['default'] = TRUE replaces the previous $default parameter used by _field_invoke_default().
This $options array might also come handy for other stuff later (translatable fields, if/when it lands)

- Both hook_field_load() and hook_field_attach_load() need an $object parameter. Unlike other field_attach_*() functions, field_attach_query() does not receive an $object from the outside. We thus need to build one from the raw storage values retrieved by the query.
The patch introduces a new function field_attach_create_stub_object() (suggestions welcome for a better name), that creates an $object of a given type from given bundle, id and revision keys. Of course, the function can only create pseudo-(nodes|users|...): we can only populate properties known by the Field API. Here again, we lack a Data API, and Field API is the closest we have.
The patch adds a note in PHPdocs for hook_field_load() and hook_field_attach_load(), stating that they should not assume a full-fledged $object. I don't think it should be a problem. hook_field_load() works on its $items param rather than on $object. hook_field_attach_load() works on $object, but should be entity-type agnostic - if you want to do node-specific stuff, use hook_node_load().

Still a few TODOS to address :
- From the code snippet in Barry's comment #4 : 'Not all storage engines are required to support queries on all column types'. Seems like a serious restriction. How can a module providing 'X as field' safely build on this, then ?
This also applies to the 'condition operator' feature discussed in #14 / #16: '<' and '>' can probably be assumed to exist on non SQL storage backends, but what about 'LIKE' ?
How usable or reliable is field_attach_query() if you don't know for sure what will be supported by the storage engine ?
- Not sure how to handle the case of field types that don't define field 'columns' and use hook_field_load() to handle their own storage. In D6, this mainly involves Eaton's Amazon field, AFAIK. Right now, those fields are not queriable.
- tests missing : field_attach_query_revisions(), 'field values' result mode, check hook_field_load(), hook_field_attach_load(), hook_field_attach_pre_query() invocations...

Setting to needs review, because reviews is what this needs now.

Status: Needs review » Needs work

The last submitted patch failed testing.

yched’s picture

Status: Needs work » Needs review
FileSize
21.7 KB

Updated patch , fixes exceptions caused by an incomplete variable rename.

yched’s picture

FileSize
73.7 KB

Patch adds 'operators', as discussed in #14 / #16. Mirroring the 'autocomplete text field' feature in profile.module will require operators...
Sample code :

$conditions = array(
  array('bundle', array('article', 'page')),
  array('value', 'foobar, 'starts_with'),
);
$results = field_attach_query('field_foo', $conditions);

Allowed operators : =, !=, IN, <, <=, >, >=, starts_with, contains
'=' or 'IN' will be assumed if operator is omitted.

bjaspan’s picture

Status: Needs review » Needs work

This is a nice piece of work, good job. Your patch seems to remove/rewrite a substantial portion of field.test and I can't think of why it would need to, so I did not review the tests at all.

* Nit: "The field_attach_query_revisions() function additionally supports:" should be out-dented.

* Re: "how can a module providing 'X as field' safely build on this"? If there is a field type whose data schema includes a blob column (perhaps precomputed_random_bytes field type), and a site is using the WackyCoolDB Field Storage Engine, and it turns out that WackyCoolDB does not support queries on binary data, and if you then use a module/build a view/whatever that requires queries on the blob column of a field, that module/view/whatever just won't work. Presumably the WackyCool Field Storge Engine module would document this limitation, maybe throw an exception when it receives such a query, or whatever.

* Re: "what about field types that handle their own storage?" Why *wouldn't* we provide a hook_field_query()? We provide hook_field_load(), and hook_field_query() is really just a more general case of load.

* If we provide a hook_field_attach_pre_query(), why wouldn't we provide a hook_field_attach_query()?

* Documentation for field_attach_query() should say that if you query on revision id, the return array is keyed by revision id, not object id (assuming I read the code correctly).

I should implement pbs_field_attach_pre_query() to make sure it is possible. :-)

yched’s picture

FileSize
23.52 KB

Yes, I usually comment out unrelated tests so that I can run the test being written faster, but forgot to uncomment them before rolling the previous patch. The patch doesn't affect existing tests.

- Fixed the comment indentation you mentioned
- Added a FieldQueryException class, and a note about it in the yet-to-finalize PHPdoc for hook_field_storage_query()
Not sure yet of the format to use for FieldQueryException::$errors.
- Documented the id / vid subtlety for field_attach_query_revision(). I also slightly changed the FIELD_QUERY_RETURN_IDS format so that the key is id or vid depending on the $age param, but the value is always id. When querying on revisions, you'll probably want to know both.

yched’s picture

Thinking a little more about hook_field_query():

In my mind, use cases where two-fold:
1) field types that handle their own storage (AFAIK in D6 that's only Amazon module)
2) reference field types, that add to the field values fetching from another table (filefield)

In the case of 2), the role of hook_field_query() would be to allow filefield to support queries like 'give me all the objects where file size > X', 'where file mime is Y'.
Problem : a query like the above would first have the storage engine return *all* values (no conditions on base columns), in a potentially huge PHP array, which hook_field_query() would then filter out. Performance + memory issues.
On the opposite, if we make the storage engines return an empty array when there are conditions on non base columns, then
a) hook_field_query() would need to extract the values itself for records that match the 'external' condition. How ? re-query storage modules :-) ?
b) this doesn't allow mixing conditions on base and non-base columns ('where bundle is X and file mime is Y')
There has to be a PHP bridge between 'abstract storage engine' (base columns) and SQL (columns of referenced tables). It just occurs to me that this is exactly what Crell says in http://www.palantir.net/blog/remote-data-drupal-museums-and-web-2009#com... "Searching across separate datasets has no solution that anyone has found that does not involve first caching everything in an intermediary server and then searching that".
So 2) looks like it's out of reach.

For the same reasons (no querying on separate datasets), supporting 1) means that we have to defer the whole query to hook_field_query() when the field has no 'columns', and hook_field_query() has to support conditions on type, bundle, entity id, revision id:

if ($field['columns']) {
  // Ask '_pre storage' modules (hook_field_attach_pre_query) or fallback to storage engine (hook_field_storage_query)
  // Let the field type module complete/alter the returned values (hook_field_load)
}
else {
  // Ask field type module (hook_field_query)
  // hook_field_load() is probably not needed in this case, hook_field_query() should be able to populate all the values
}
//  let all modules have their say on the returned values (hook_field_attach_load)

About hook_field_attach_query(): I don't see that hook_field_attach_pre_query() implies a hook_field_attach_query(). In my mind, _pre hooks are only related to the existence of a storage hook. hook_field_attach_X() is more the 'open for all' equivalent of the field-type hook_field_X(). This probably means that the hook_field_attach_pre_X name pattern is not right, BTW. Maybe we do need a hook_field_attach_query() all the same, but I'm not clearly seeing the requirements for now.

yched’s picture

FileSize
28.19 KB

Updated:
- reroll
- stop calling hook_field_attach_pre_query() hooks as soon as a module claims he handled the query.
- moved forward on PHPdocs
- settled on a basic format for FieldQueryException::$errors : array of error messages. Not sure what the best practices are in this regard, suggestions welcome.

Remarks:
- Field API support for field types that handle their own storage is not clear at the moment.
I suggest we defer this topic to a separate task, and not burden this patch with the hook_field_query() discussed in #24 for now.

- Similarly, remaining tests (result formats, field_attach_query_revisions) could be added in a separate patch ? Would be a cool task for a 'FiC initiative' contributor :-)

- We'll be able to optimize the hook_field_load/hook_field_attach_load calls when #362024: Field Attach: Make hook_field_load() be 'multiple' like field_attach_load() lands. We'll have an issue with querying revisions, though:
field_attach_query_revisions() generates arrays keyed by revision id. Currently, the 'load multiple' features in core assume arrays keyed by object ids, meaning no multiple load on several revisions of the same object. The patch in #362024 stays inline with this. Changing it to support arrays keyed sometimes by id, sometimes by revision id sounds messy.
Maybe when querying all revisions, we'll need to turn multiple-object invokes into a series of single-object invokes. field_attach_query_revisions() + FIELD_QUERY_RETURN_VALUES would be less performant, but I think that's an edge case.

So, back to 'needs review'.

yched’s picture

Status: Needs work » Needs review
Berdir’s picture

Status: Needs review » Needs work

Just some minor stuff..

+  $query->fields('e', array('type'))
+    ->condition('deleted', 0)
+    ->orderBy('delta');

The first fields call should be on the next line

+    // Prevent warning if $operator is implicit.
+    @list($column, $value, $operator) = $condition;

I don't like "@"'s :) Also, I don't understand why it is needed (I'm just looking at the patch and have 0 knowledge about fields internals ;))

yched’s picture

Status: Needs work » Needs review
FileSize
29.63 KB

- Added tests for result formats
- Fixed Berdir's 1st point in #27
- About the second point: Added a clearer comment above the @list call. No biggie, but I still advocate for this code to be a legitimate use case for @, allowing terser code than a tedious

$column = $condition[0];
$value = $condition[1];
$operator = isset($condition[2]) ? $condition[2] : NULL;

If this bugs Dries or webchick, we can switch to that.

catch’s picture

Maybe when querying all revisions, we'll need to turn multiple-object invokes into a series of single-object invokes. field_attach_query_revisions() + FIELD_QUERY_RETURN_VALUES would be less performant, but I think that's an edge case.

I don't think we need to worry about making querying revisions fast - especially not if that means having _load() stuff worrying about multiple revisions of the same object, so seems better to make those singular.

chx’s picture

The reason I used mainly isset/empty while making D6 notice free is that @ is relatively slow however in this case it might be microoptimization to fix it.

catch’s picture

This patch includes the field_attach_load() multiple changes, is that intentional?

yched’s picture

@catch: no it doesn't :-)
What it does is add a 'single field' mode to _field_invoke()

catch’s picture

Whoops, sorry yched.

yched’s picture

Which outlines that I failed to reflect those changes in the PHPdoc for _field_invoke().

yched’s picture

FileSize
30.47 KB

Er, and here's the patch.

Status: Needs review » Needs work

The last submitted patch failed testing.

yched’s picture

Status: Needs work » Needs review
FileSize
40.27 KB

Completed tests and PHPdocs. This should be ready for final reviews.

Summary of what the patch does - see #4, #18 and (fairly technical) #24 for more insight on design decisions.
- Adds field_attach_query() and field_attach_query_revisions() functions, letting modules do stuff like 'give me the objects whose values for field_foo match these conditions'. Amongst other things, this will be notably needed by 'reference' field types (noderef, filefield, even text field as 'reference to a text format') to clear stale cache data.
The functions can return results in two formats: ids only, or full field values. See the PHPdocs for the syntax to specify conditions.
Conditions across different fields are not supported, we're not rebuilding Views ;-)
- Adds hook_field_storage_query() and hook_field_pre_query() hooks for storage modules to handle the queries.
- Adds a FieldQueryException exception class, to be used by storage modules that might not support the full range of the query syntax.
- Adds tests for field_attach_query() and field_attach_query_revisions()

The 'full field values' result mode required a few additions:
- A field_attach_create_stub_object() function to create an $object from object properties known by the Field API. Can be seen as a reciprocal to field_attach_extract_ids().
- To get accurate results, the raw data handed by storage modules need to go through the same 'load' hooks than in field_attach_load().
This is done through the regular field_invoke() mechanism. field_invoke() and field_invoke_multiple() have been modified to accept a 'run on a single field' mode, for which other use cases have been identified in the past (display form for one field only, display a single isolated field...). The $default parameter for field_invoke() has been turned into an $options array: $options['default'], $options['field_name']. #367595: Translatable fields might use an $options['language'] later on.

Status: Needs review » Needs work

The last submitted patch failed testing.

yched’s picture

Priority: Normal » Critical
Status: Needs work » Needs review
FileSize
40.04 KB

Rerolled after _field_sql_storage_tablename() signature change in #364620: Allow creating a field with a deleted field's name
+ bump, and raising to critical. Followup work on reference fields (including text fields) now depend on this.

yched’s picture

FileSize
40.04 KB

chx noticed a mixup in the PHPdoc for _field_invoke $option['default']. Fixed now.

yched’s picture

FileSize
40.04 KB

And that PHPdoc was present in two places :-p

chx’s picture

Status: Needs review » Reviewed & tested by the community

I spent hours reading this patch back and forth. I think it's OK.

webchick’s picture

First a couple very simple things from a "Coder module" pass:

+/**
+   * Test field_attach_query_revisions().
+   */

Need to fix indentation.

 /**
+ * Exception class thrown by field_attach_query() when the query is not
+ * supported by the storage engine.
+ */

The first line of PHPDoc should always fit into 80 chars, with more details put on lines below it.

+ * API knows about : bundle, id, revision id, and field values (no node title,

about:, not about :

+define('FIELD_QUERY_RETURN_VALUES', 'FIELD_QUERY_RETURN_VALUES');
+/**

Add a newline between these lines, please.

+    // A condition is either a (column, value, operator) triple, or a
+    // (column, value) pair with implied operator.
+    @list($column, $value, $operator) = $condition;

This does bug me, but I read in #28 the reasoning so that's fine.

+    $id = ($load_current || empty($entity_type['revision key']))  ? $row->entity_id : $row->revision_id;

One too many spaces before the ?

I want to spend a bit of time reviewing http://views2.logrus.com/doc/html/query_8inc-source.html to see how this is similar or different to Views's query object. If we can, it would be wonderful to have Earl review this patch himself, since he's obviously someone with a lot of experience writing query abstraction.

bjaspan’s picture

Status: Reviewed & tested by the community » Needs work

I moved discussion about hook_field_query() to #477124: Supporting querying field data outside field storage.

In _field_invoke():

There is an incorrect elseif clause:

      if (drupal_function_exists($function) {
        // do stuff
      }
      elseif (isset($result)) {
        $return[] = $result;
      }

Shouldn't the code following the "populate field values back" comment only occur inside the if (drupal_function_exists($function)) block?

In field_attach_query():

The paragraph in the PHPdoc that begins "Note that the field values..." will make more sense if you reverse the order of the sentences, I think.

Docs for $conditions: Field Attach functions have an $obj_type argument. This function accepts queries on 'type' instead of obj_type or entity_type, and then also accepts an 'entity_id'. Should we unify this now, or wait for #470274: Use the term 'entity' instead of 'object'?

Perhaps "not all storage engines" comment should be on the bullet item listing columns for $field_name's field type, since those are the columns the storage engine might not support.

"module claims he handled" -> "module claims it handled."

if ($age = FIELD_LOAD_CURRENT)

needs ==, and as webchick points out, we need more test coverage.

When querying revisions, why do we create a single-object array and call _field_invoke_multiple('load') instead of calling _field_invoke('load') on the single object?

In field_sql_storage_field_storage_query(), why orderBy('delta')? The docs specifically say items may not be in correct delta order.

I need to implement hook_field_attach_pre_query() in PBS.

webchick’s picture

Ok, now the more thorough review with lots of questions:

+class FieldQueryException extends FieldException {

I grepped core for Exception and noticed that in most places (DBTNG, mostly) we just do:

class BlahDeeBlahException extends Exception { }

Why does this one need a constructor? I asked this in IRC, and Barry pointed out that field.attach.inc defines a FieldValidationException which also defines a constructor, but that one structures error messages in a specific way with error code strings attached, where this just appears to be a bunch of strings.

+function _field_invoke($op, $obj_type, &$object, &$a = NULL, &$b = NULL, $options = array()) {

That $op and $a and $b crap throws me for a loop every time I review a patch that touches that function! I created #477132: Refactor _field_invoke() to not use $op, $a, and $b to discuss this.

But within the function itself, there's an awful lot going on and not a lot of comments to help someone unfamiliar with the Field API follow along. Could you do the "~one 1-line comment per every 4-5 lines of code" treatment to this function?

+function _field_invoke_multiple($op, $obj_type, &$objects, &$a = NULL, &$b = NULL, $options = array()) {

That one too. :)

  * Invoke field.module's version of a field hook.

As opposed to what? (this isn't something that needs to be changed in the patch, I'm just curious)

+ *   Example values for $conditions:
+ *   @code
+ *   array(
+ *     array('type', 'node'),
+ *   );
+ *   array(
+ *     array('bundle', array('article', 'page')),
+ *     array('value', 12, '>'),
+ *   );
+ *   @endcode

Nice. I appreciate that this mirrors fairly well conditional clauses in DBTNG.

I notice that in DBTNG you have to specify "IN" in an array clause because you can also pass in "BETWEEN" if there are two of them. I know in #16 you said this was scope creep, but I wasn't sure if knowing that DBTNG supported this would change your mind.

Also if "startswith" is a valid operator, does it make sense to make "endswith" one too? Or too scope creepish?

That's about all that sticks out to me right now. I'll want to make another pass at this before committing. (though am perfectly fine if Dries beats me to it)

And ideally, I would love to get some feedback first from people like Earl, eaton, drewish, KarenS, etc. Basically, people with experience building "reference" field types and Views. This isn't a pre-requisite necessarily because I know this patch is holding up a lot, but it would make me more at ease pressing the commit button if it had a few more positive architectural reviews.

yched’s picture

FileSize
42.03 KB

Updated patch for comments in #43 / #44 for now, keeping #45 for later :-)

re: webchick #43:
- Add a newline between these lines, please.

+define('FIELD_QUERY_RETURN_VALUES', 'FIELD_QUERY_RETURN_VALUES');
+/**

This 'missing newline' was used to 'groups' constants that flock together. menu.inc uses @name docblocks, so I went for this.
Shall I fix field.module's other existing constants in this patch ?

re: bjaspan #44:
- $obj_type argument / 'type', 'entity_id' conditions.
Hm. The names for the conditions are directly taken from the actual column names used by field_sql_storage.module's data tables (and {field_config_entity_type} for 'type').
Those names are purely internal to that storage engine, so we could use different ones and have field_sql_storage_field_storage_query() translate. For instance: 'obj_type' and 'id' for the 'conditions' names ? This would make us at least consistent with the current 'obj / entity' inconsistency :-). Then, yes, I think the overall inconsistency would be best handled in #470274: Use the term 'entity' instead of 'object'
I'll roll the changes if we agree on this (or on something else).

- "Perhaps "not all storage engines" comment should be on the bullet item listing columns for $field_name's field type, since those are the columns the storage engine might not support."
Not sure what you mean. If there's one thing we can be sure, it's that storage engines store all the field columns, so I'd rather expect the uncertainty to be about meta 'columns' ('bundle', 'type'...), which are more dependent on how the module decides to structure its storage ?

- "When querying revisions, why do we create a single-object array and call _field_invoke_multiple('load') instead of calling _field_invoke('load') on the single object?"
Because hook_field_load() is a 'multiple' hook, that's hardcoded in its signature. Only _field_invoke_multiple() knows how to build the matching function call, and _field_invoke_multiple() expects an array of objects.
We could have _field_invoke_multiple() accept a single object and build the array itself, as syntactic sugar, but it wouldn't help here: we also call hook_field_attach_load() which also expects an array of objects. The same will be true for any other 'multiple' op we might have in the future : the "field invoke" will most likely be followed by a hook_field_attach_[op] invoke...

- "if ($age = FIELD_LOAD_CURRENT) / as webchick points out, we need more test coverage."
The typo wasn't caught because testFieldAttachQueryRevisions() happened not to test FIELD_QUERY_RETURN_VALUES with two matching revisions. It does now.
Aside from that, the tests are already fairly extensive IMO, but suggestions and help welcome ;-)

- "In field_sql_storage_field_storage_query(), why orderBy('delta')? The docs specifically say items may not be in correct delta order."
What the docs mean is that you'll only get a subset of the actual values (those matching the conditions), thus not with their actual delta: if value0 doesn't match the query and value1 does, value1 will appear keyed with delta 0. But order is preserved. I tried to make the doc a bit clearer.

yched’s picture

Status: Needs work » Needs review

Updated for webchick's comments in #45:

- removed the FieldQueryException::$error property, and the constructor.

- added code comments for _field_invoke() and _field_invoke_multiple()

- "field_invoke_default() : Invoke field.module's version of a field hook - as opposed to what ?"
This means field_default_[op] as opposed to hook_field_[op]. I tried to make the PHPdocs a little clearer.
If our API was OO, this would be good old inheritance / override : field_default_[op]() would be Field::[op]() and hook_field_[op] would be TextField::[op](), calling parent::[op]() if needed, and we wouldn't need two 'invoke' and 'invoke default' concepts. But our API is not OO :-/.

- Added 'ends_with', no big deal. Added 'BETWEEN', I wasn't aware that was a standard SQL, DBTNG-supported construct, so sure, why not. It's up to other, non-SQL storage engines to chose which set of operators they can / want to support, I guess.

- To keep consistency, moved all operators to caps : STARTS_WITH, ENDS_WITH, IN, BETWEEN...

yched’s picture

FileSize
44.32 KB

Forgot the patch.

bjaspan’s picture

FileSize
45.59 KB

hook_field_attach_pre_query() needs a $age argument. New patch attached.

I mostly implemented hook_field_attach_pre_query() in pbs.module. Except for the missing $age argument, I see no problems with the API. However, I'm about to create a separate issue to explain why I think pbs.module should not implement this hook. :-)

bjaspan’s picture

Status: Needs review » Reviewed & tested by the community
FileSize
45.57 KB
yched’s picture

Status: Reviewed & tested by the community » Needs work

The last submitted patch failed testing.

webchick’s picture

Status: Needs work » Reviewed & tested by the community

HEAD broke.

webchick’s picture

Status: Reviewed & tested by the community » Needs work

OK, looked at this one more time and this looks good, based on my understanding. Committed to HEAD! Now we should be able to get rolling on some other patches that depend on this.

However, could someone please file a quick follow-up patch for:

+    $this->assertEqual($result, $expected, t('FIELD_QUERY_RETURN_VALUES result format returns the expect result'));
+
+    // TODO : test

Test what? :)

Also:

+    switch ($operator) {
+      case NULL:
+        $operator = is_array($value) ? 'IN' : '=';
+        break;

The only problem with this nicety is that DBTNG does not do this, and people are going to expect both APIs to behave consistently. Discussed with Crell in IRC and he has no problem with adding this there. Let's make a follow-up issue for that as well.

webchick’s picture

yched’s picture

Status: Needs work » Needs review
Issue tags: +Quick fix
FileSize
2.2 KB

- hotfix for a remaining _field_attach_extract_ids(), while the function doesn't exist anymore after
I guess this broke tests :-(. My bad. Not sure why this wasn't detected in #51.

- removed the stray TODO in testFieldAttachQueryRevisions(). I know it's an easy excuse ;-), but simply forgot to remove it after actually doing the work (test field_attach_query_revisions() with a result set containing several revisions of the same object)

- Since DBTNG now handles the default $operator (= or IN), field_sql_storage_field_storage_query() doesn't need to do it anymore.

webchick’s picture

Status: Needs review » Fixed

K, committed this. Hopefully it should be safe to turn testing bot back on again. Very strange that this wasn't caught earlier. Hrm.

Status: Fixed » Closed (fixed)
Issue tags: -Quick fix

Automatically closed -- issue fixed for 2 weeks with no activity.