At the suggestion of Catch, I've broken up the task #412518: Convert taxonomy_node_* related code to use field API + upgrade path into pieces that may be easier to evaluate, test and commit.
Term.module is basically a copy of list.module that uses taxonomy functions to gather allowed values. No other magic happens.
I could have used list module with its allowed values function, but I couldn't see a way to extend list.
Options.module provides the widgets. It also requires a patch.
I tried writing tests, but it was a disaster. I'll post them separately tomorrow so the patch isn't overlooked on the basis that bad tests were included. I do want to get it right, though! If only list.module had some tests I could rip off...
Comment | File | Size | Author |
---|---|---|---|
#90 | term_field_changelog-491190-90.patch | 728 bytes | Anonymous (not verified) |
#87 | d7-term-field_491190_86.patch | 12.84 KB | Anonymous (not verified) |
#80 | d7-term-field.patch | 12.41 KB | Anonymous (not verified) |
#67 | d7-term-field.patch | 12.19 KB | Anonymous (not verified) |
#66 | d7-term-field.patch | 12.34 KB | Anonymous (not verified) |
Comments
Comment #1
Anonymous (not verified) CreditAttribution: Anonymous commentedThere's a patch in the CCK issue queue to allow term.module fields to be configured in CCK. #491196: CCK Field UI support for term fields. This also requires #491130: cck_field_update_field does not generate the 'data' field before calling drupal_write_record to be fixed, and there's a patch for that too.
Comment #2
Dries CreditAttribution: Dries commentedDuplicate of #412518: Convert taxonomy_node_* related code to use field API + upgrade path?
Comment #3
Anonymous (not verified) CreditAttribution: Anonymous commentedIt is not an intentional duplicate. That issue describes the scope of the problem and it has a few patches that are still in heavy development.
THIS issue is a patch that needs review.
And here's a new one: previously I was creating a module for term fields. This creates NO NEW MODULES. It only patches taxonomy.module and list.module.
To use it, you must create a field along this pattern:
You must also use these formatters for your instances: term_link OR term_plain.
The aforementioned CCK patches will let you easily create a term field for nodes.
This version of the patch is much much better.
Comment #4
yched CreditAttribution: yched commentedGreat to tee you embrace this, bangpound !
Sticking this on the generic 'integer list' field type seems odd, though. Just like we wouldn't cram 'node reference' field type on the 'list' field type, I don't think we can spare a dedicated 'term reference' field type.
Comment #5
Anonymous (not verified) CreditAttribution: Anonymous commentedyched: You don't think we can spare a dedicated term reference field type?
I also created a term reference field the first time around, but my term reference field was 100% identical to the list module except for the allowed values function, which is pluggable in list module. As far as I can tell, if we introduce a special term field type, we are essentially forking list module and we'll lose the ability to re-use list module widgets with this field.
A taxonomy vocabulary is just a list of names, no?
Comment #6
webchickThe problem I have with this hunk:
Is that if Organic Groups module wants to expose "Group selector as a field" (which is entirely possible), they have to provide a patch to list.module in core. Big ol' -1 to that.
I understand the concerns about code duplication, but we ultimately need figure out a way to make whatever works here work for both core and contrib.
Comment #7
Anonymous (not verified) CreditAttribution: Anonymous commentedWebchick:
The patch to list.module actually isn't necessary. I thought it was, but... nope!
This can be accomplished by only patching taxonomy.module.
And I've now implemented taxonomy_field_attach_load so that the whole set of terms can be loaded at the same time.
I am not going to work on this patch more today. I would appreciate some help from others about the question of reviving term.module or using an unmodified list.module.
Comment #8
Anonymous (not verified) CreditAttribution: Anonymous commentedSome thoughts about using list.module vs. a term-specific module.
What is not possible with list.module is the creation of new values. If we want all of our term fields to support widgets who let users optionally add new terms, we'll need a new field module devoted to terms. I could add support for adding new allowed values to list.module, too.
We still need an autocomplete widget. I see no reason why an autocomplete widget wouldn't also be useful for list fields.
I briefly discussed an idea of separating tags into a separate module. It would still rely on taxonomy. It would still be a field. However, tags would be a one-off field with an autocomplete widget that also supports adding new tags.
I can see creating a term field module before D7 is frozen, but I know for a few more weeks, that module would be identical to list and that list may continue to develop and change.
All of that said, I'm eager for feedback and advice. I'm happy to be working on this task!
Comment #9
yched CreditAttribution: yched commentedI see the arguments about 'term_ref field almost equals list field'. Right now in D6, node_ref and user_ref are maybe like 70% code duplicates, but the differences are not easily mergeable within the current field concepts. An OO model where a field type can subclass another would possibly alleviate some of this, but that's not likely to happen in D7.
More specifically on 'taxo as field':
- The problematic hunk
+ 'settings' => array('allowed_values_function' => '', 'vid' => ''),
is gone, but taxonomy_field_allowed_values() still refers to$field['settings']['vid']
.'vid' becomes an 'unofficial setting'? Works, I guess, but feels odd, and could easily break in the future.
Could possibly be worked around with a generic 'allowed_values_args' setting.
- the point raised in #8 about specific widget needs also applies to formatters : the current approach makes the 'Term: link', 'Term: plain text' formatters available for all 'list' fields, while they don't make sense for 'regular' list fields. This approach thus cannot be extended to other 'almost list field' field types without clobbering the list of display options with irrelevant choices. If the approach doesn't scale to similar use cases, I'm not sure it's a good approach.
- In #111127-207: Cache node_load we defined a generic pattern for 'reference' field types. For 'taxo as field', it translates to:
For performance, actual term data needs to be multiple-loaded at field_attach_load() time and get saved in the field cache, instead of having separate taxonomy_term_load() calls in the formatters (note : it seems to be the role of taxonomy_field_attach_load(), but I'm not sure what the function actually does).
Then when a term definition is changed or a term is deleted (hook_taxonomy_term_[update|delete]), the field cache needs to be cleared, using field_attach_query() : "find all objects that have value $tid on a 'term_ref' field, and clear their field cache"
With the current approach the latter needs to become "find all objects that have value $tid on a 'list' field with 'allowed_values_function' == 'taxonomy_field_allowed_values' ". Ugh.
- Usability: how do you explain to end-users "you want to add this vocab to this node type ? sure, add a 'list' field, and make sure to provide the right allowed_values_function" ?
I'm pretty much convinced this deserves a dedicated field type.
Comment #10
catchSo taxonomy_field_attach_load() currently just pre-loads the terms into memory - then taxonomy_term_load() is pulling them from the static instead of the database individually. But agreed it'd be better to have these in the actual field cache instead. I think yched's making good arguments in favour of a term reference field, there's not a lot of code duplication really between the two patches, and I can see this extending a lot.
Just a side note, I'd like to be able to use autocomplete for non-tagging vocabularies - for both interface and performance reasons with huge vocabularies - so it'd be good to have the autocomplete widget and actual term creation separated (similarly if hierarchical select becomes a field formatter and lets you create terms from select lists).
This is looking really nice though :)
Comment #11
Anonymous (not verified) CreditAttribution: Anonymous commentedthanks for the feedback yched. i'll post a new patch with term.module separated tonight.
i was also trying to demonstrate and experiment with list.module... and i was trying to avoid writing tests. ;-) does anything in core use list.module yet?
given the lack of a UI for adding fields to bundles in Drupal 7 right now, i am not sure how i can tell you what end users should do.
Comment #12
yched CreditAttribution: yched commented@bangpound:
- No, nothing in core uses list fields yet. 'Node body' is the only core field, right now...
- CCK HEAD is updated to work with D7 Field API (but I guess you know that, you posted a patch over there)
Comment #13
chx CreditAttribution: chx commentedNo! Dont let them! Benchmarks! What is this craziness of putting vocabularies in titles when the issue is about terms and not just vocabularies?? That's like trying to make aggregator items nodes when the issue title says make aggregator feeds nodes.
Comment #14
Anonymous (not verified) CreditAttribution: Anonymous commentedHere's a new patch.
Term.module is now a separate thing. I'm no longer piggy-backing on list.module but it is more or less a straight copy right now. Options module is patched so we have some widgets for term fields.
I did my best at implementing the cache invalidation and field value integrity functions as described by yched in #9. It's still a little bit mysterious, but I think I manage to do it without breaking the multiple loading feature on fields.
I chatted with chx in IRC briefly about his concern about the titling of this issue, and I'm going to be maintaining a daily presence in IRC while I work on this and related patches. You can also tweet @bangpound to get my attention if I've got my head down.
Comment #15
catchRather than just unsetting the ID in term_field_load() we should also stuff the term object in there - hook_field_load() is cached - and won't be run each request, so we actually have to do this if we want to pre-load terms at all, I forgot this when talking about it in irc :(. text_field_load() does similar. This probably means iterating over each delta rather than the array_search, but at least all the work will be cached. Not sure where to put the term object though - can this be in 'value'? Or we need a separate array key for the full term object?
I'm a bit concerned about the formatters:
That means running term_allowed_values() for every term shown on every page - since we're removing invalid tids in hook_field_load() that should be enough no? Even if it's not I think the formatters should be dumb and do the minimum amount of work possible - allowed values should validate input, can't be responsible for already saved data if admins change it later.
The cache clearing looks good. If/when #333171: Optimize the field cache goes in we should be able to make that an array of cache IDs to clear instead of one by one deletes.
Comment #16
chx CreditAttribution: chx commentedfunction term_field_schema is weird with a default only switch. I think we want tests too.
Comment #17
yched CreditAttribution: yched commentedRe catch #15: The additional data loaded in hook_field_load() should be put in additional keys. D6 filefield merges data from the {file} table into a flat array with keys fid (the original, raw field value), filename, filepath, etc...
Didn't think about this earlier: with terms now being fieldable, calling taxonomy_term_load() in term_field_load() triggers field_attach_load() on terms and introduces potential infinite loops. More general issue with any 'foo reference' field if 'foo' entity is fieldable. Think "term with a term field" (odd but not forbidden) or "node references term which references node".
I'm still convinced that loading referenced data in hook_field_load() is the right approach, but I'm not sure of the best way to avoid infinite loops.
Maybe hook_field_load() should only load base properties: hardcoded, predictable set of properties, which are also the only ones needed by most common formatters, so this is good enough for us.
But:
- partial loads should be multi-loadable too
- they can not pollute the static cache for 'regular loads'
- yet having a static cache for partial loads would be handy too :-(
Agreed with catch's remark about invalid values and formatters - this should be changed also in list.module (not in this patch, but would be cool if someone created a separate task for that)
Comment #18
Anonymous (not verified) CreditAttribution: Anonymous commentedI asked bjaspan today in IRC about adding more stuff to the field item's array. He said that hook_field_load gets to set the agenda, and all the hook_field_attach_load implementations have to respect it. As such, we can set our index in the array to be anything we want. I chose term.
Terms are now loaded with the field and cached.
I can see the elegance of adding the whole term object to the value. What do others think?
Do I need to be unsetting the value in the item now? Shouldn't I just ignore it now?
I'm still playing with the settings array on the field. I've made the 'vid' value into an array. This opens the possibility of using multiple vocabularies in one field, which is a desirable feature that's easy to support. However it does raise some UI questions, which I'm posting on the original issue for this task.
The structure of hook_field_schema can be changed. I was seeing warnings about missing 'indexes' index for in field sql handler module, so I cut and paste the schema from list just to prove that my eyes weren't deceiving me.
Tests will come soon. There aren't many examples of field modules with tests that I can find, and my first attempt wasn't so good.
Comment #19
Frando CreditAttribution: Frando commentedRegarding the infinite loop problem. One way to somehow solve this is build modes IMO. Say you do a field_attach_load on a node which loads the terms with a build mode of "minimal" or "as field" or "no attach" or something, which doesn't even trigger field_attach_load on the terms, while on a term view page, a term is loaded with a build mode of "full", which then triggers field_attach_load. This of course implies that a) the build mode overhaul patch gets in, b) build modes are properly supported not only by nodes but also by other entities and c) build modes are already passed to the _load_multiple functions (which currently call field_attach_load). Especially the latter could be a great improvement by its own (making loading and _load hooks build mode aware).
Comment #20
Anonymous (not verified) CreditAttribution: Anonymous commentedyched: i think it's a safe assumption that the initial offering of reference fields will be fields that reference bundles. this is a hard one, for sure. i'm thinking about it...
Comment #21
yched CreditAttribution: yched commentedre Frando #19: build modes are currently of no help here because they're a *display time* property ('build') - and I think it should stay that way: node_load() loads full raw data, there's no variants. Once loaded, you might want to do different things with the object, including displaying it in several variants, but having several levels or flavors of 'load' would add much confusion IMO.
#409750: Overhaul and extend node build modes takes this a step further by removing the $node->build_mode property and extending/renaming the current (bool) $teaser param of node_view() and friends into (string) $build_mode param.
Comment #22
catchIs there somewhere in field api, after field_attach_load() where we can still have access to $objects - then we could have hook_field_attach_pre_render() or hook_field_attach_after_load() or something - do taxonomy_term_load_multiple() in there (not cached) - and this wouldn't call field_attach_load() on the referenced terms.
Comment #23
yched CreditAttribution: yched commented@catch: right now we don't.
Comment #24
Anonymous (not verified) CreditAttribution: Anonymous commentedthis might be a naive approach to the problem, but it does stop circular references from crashing the request. The function keeps a static tally of which object types and IDs it is currently loading fields for. If the function is called again to load fields for an object which is already loading, it simply doesn't try to load more terms.
i was able to deliberately create a circular reference like this: node X -> term Y -> term Z -> term Y. I cleared the cache on node X, and node X turned out fine. I could link to each of the terms including the circular reference. Then I cleared the cache while viewing term Y and then term Z.
I wish I could write tests...
Comment #25
Anonymous (not verified) CreditAttribution: Anonymous commentedthe last patch did it wrong for sure. it was unsetting the flag in the static loading array if the bundle's fields were loaded a second time, which meant the third circular reference would cause a problem.
Comment #26
yched CreditAttribution: yched commentedWell, if I'm not mistaken, this makes it impossible to predict what can be found in $obj->taxo_field[0] : 'tid' ? 'tid' + 'term' ? And thus sprinkles uncertainty on the rest of the field lifetime.
+ I don't think that approach works if you try to separately load objects referencing the same terms in the same request (think : in a test). Static as hysteresis factor.
My proposal to 'Load only base properies' in #17 still stands, IMO.
Comment #27
catchIf we only load base properties, while that's better than doing it in formatters, I can see a lot of formatters still having to do node_load() individually to get missing pieces. Opened #493314: Add multiple hook for formatters to discuss whether it's possible to have our cake and eat it too. Until that's resolved, I agree with yched that just a direct query to get $term->name and similar is the safest bet.
Comment #28
Anonymous (not verified) CreditAttribution: Anonymous commentedCatch is right to point out that this is going to be more severe for nodes than it is for terms in terms of the awkward constraints on caching and the need to load the referents if you want to use non-base-table data in your formatter.
Here's a new patch. The "term load multiple" cut-n-paste job can probably be made even smaller, but here's a start.
Comment #29
Anonymous (not verified) CreditAttribution: Anonymous commentedNow with 25 tests that check the validator, widget and formatter.
I reused text.test. My main dilemma was to understand what I needed to test. Field API tests its own code with made-up bundles and made-up fields that don't exist outside the test framework. By now I've figured out that I need to test the code in the module, and I can see how that's done with text.test.
Someone should check how I'm testing the select widget. Something is making me doubt the absence of [0] in the element name.
I'm testing the validator on the most basic settings. A term from the field's vocabulary passes validation. A term from another vocabulary triggers an error.
The options select widget appears.
The output of the formatter matches the term's name.
I can see where more tests could be written: link vs plain text formatter, multiple values, fields with multiple vocabularies, different widgets, etc.
Where should I be looking for code examples for Field API tests? field.test?
Comment #30
catchBecause term_field_load() is going to be cached in the field cache (and in any fieldable entity which maintains a static cache during the request) I think we can drop the static and query building and just do a straight SELECT * FROM taxonomy_term_data WHERE tids IN (:tids); The 'multiple formatter hook' discussion suggests we'll need to keep that hunk of code even if it's implemented in a different hook eventually, which ought to work out pretty well. Will try to have a look at the tests later - IMO as long as we have roughly equivalent test coverage to the other field modules we're alright - a lot of taxonomy tests will need to be rewritten when converting the nodeapi stuff to fields in that issue - so no point doing the same work twice.
Comment #31
Anonymous (not verified) CreditAttribution: Anonymous commentedThis patch removes the cut-n-paste of taxonom_term_load_multiple into term_field_load. Now I'm just doing what Catch suggests in #30.
Comment #32
moshe weitzman CreditAttribution: moshe weitzman commentedsubscribe
+ * TODO access control?: I think this should be removed. access control for terms is not implemented in core. let contrib modules deal with it using form_alter and node_build_alter.
+ * Theme function for 'default' term field formatter. : this text is repeated even for the plain formatter.
Comment #33
Anonymous (not verified) CreditAttribution: Anonymous commentedMoshe:
You mean the comment text! Aha! Yes I will fix that.
TODO Access control: It's not just about viewing and changing term field values, actually. Tagging widgets will allow users to create new terms. I remember a slide at Drupalcon DC where Catch showed us that vocabularies will have more granular permissions, but now I don't see that in core. Either way, creating a term is one permission I wanted to consider, but I agree that the full range of possibilities just need to be opened up for contrib and not actually implemented. Would we be doing that with node_build_alter and form_alter even with field api?
Comment #34
Anonymous (not verified) CreditAttribution: Anonymous commentedNew patch incorporates Moshe's change to the "plain" formatter's comment.
The TODO in "term_allowed_values" function has been changed.
For folks who are not very familiar with the various field and widget modules in D7 core, FIELDMODULE_allowed_values is called by options.module widgets. It's implemented only by list.module so far. In D6, new terms can only be created using the "tagging" autocomplete widget. A straight port would mean that I (or someone else) will write an autocomplete/add widget for term.module only.
If people want to use different widgets and still be able to add a new term, it means that I'm going to shake up options.module a little bit or I'm going to basically copy it to be a term-specific widget module that supports one new feature. So this is one of the next questions I need to deal with when I start dealing with tagging. #495774: Modularize tagging is where you can get your pies lifted off the ground around this problem.
Comment #36
Wim LeersSubscribing.
Comment #37
catchComment #38
catchI'll do an in-depth/nit-pick review later, couple of quick (but big) things.
1. Why do we have to change options module? Is there a way we could add an alter hook there or something if it's going to be a general problem?
2. I still think this should be taxonomy.module providing the fields instead of a new module - once field conversion is done on taxonomy_nodeapi, taxonomy.module would be pretty much useless without this for 99% of Drupal sites - so it's just extra stuff to add to the modules page.
Comment #39
Anonymous (not verified) CreditAttribution: Anonymous commentedCatch:
We have to change options.module (as far as I know) because the widgets need to define which fields they can work for. Is there another way?
My reasoning for keeping term.module outside taxonomy.module is this:
With all that said, I'm not intransigent on this issue. It's been easier to work with a separate term.module for this patch because my fork of taxonomy.module has some of this nodeapi to field work started but not ready to submit. I'm only trying to follow patterns I read in Drupal 7's code.
Comment #40
catchOK fields provided by individual modules - I guess the question is - would we add nodereference to node.module, probably we wouldn't. So I can live with a separate module.
I think we probably need a hook_field_widget_info_alter() - but that needs barry or yched to chime in - let's leave it as modifying options module for now.
Comment #41
yched CreditAttribution: yched commentedWe indeed need a hook_field_widget_info_alter() - more generally, we need a way to alter field_info, widget_info, formatter_info, and to alter corresponding settings form. 3 use cases in the latest 24 hours. I'll create an issue shortly.
Until then, we're left with changing options.module, but that's clearly not future proof. Contrib field types won't have that option.
I'd like to take a step back on the current approach:
a taxo field picks terms in one (or more ?) vocabularies, and several fields can possibly use the same vocabulary.
This means that the information previously stored in {term_node} is now scattered in several tables - possibly even several tables for a given vocabulary.
Similarly, semantics change from "node A has terms a, b, c" to "node A has term a in taxo field foo and terms a, b in taxo field bar"
Dumb example: vocab 'Rate', with terms 'good', 'average', 'poor'. Node type 'restaurant', with three taxo fields 'Food', 'Service', 'Price' using the same vocab.
The 'rating' flavor this example is intended: it's a semantic shift quite like the one from 'fivestar as a nodeapi module' to 'fivestar as CCK field', with terms instead of notation.
I have no real objection myself, just making sure everyone goes there willingly :-)
This raises a couple questions on taxonomy listing pages though:
- do we list all nodes that are tagged with the term in *any* field ? That's one query per field using the vocab - problem: with the field storage abstraction, sort and merge results have to be done in PHP and I'm not even sure pagination is solvable.
do we list all nodes that are tagged with the term in a given field, specified in the URL ?
- not just nodes can have taxo fields; but then how do we display taxo field values on a user object ? as a link to what ?
Looks like taxonomy/term/[tid] needs to become taxonomy/term/[field_name]/[obj_type]/[tid]
More generally: IMO it would be worth re-reading the discussion in http://groups.drupal.org/node/8793 - speculative 'taxo as field' discussion after the data API design sprint in winter 2008. I don't remember what the conclusion was back then (if any), and large parts of it are possibly stale now, but there might be good ideas or remarks.
Comment #42
catch@yched I agree with all these questions (not sure on the answers yet) - but I think they're more for #412518: Convert taxonomy_node_* related code to use field API + upgrade path. IMO we should focus on this issue as offering a 'term reference' field type which can be used orthogonally to the current taxonomy_node_*() integration. That lets you use terms with other entity types, and do stuff with views with the field, link the formatters to taxonomy/term/type/tid and stuff like that. Then we need to look at replacing taxonomy_node_*() integration in core to use that reference field, which is where all the really nasty, sticky issues are.
Comment #44
Anonymous (not verified) CreditAttribution: Anonymous commentedThis new patch avoids changing options.module and now relies on the hook_field_widget_alter provieded in #502522: Allow drupal_alter() on the various Field API declarative hooks.
Comment #45
RobLoachI really like where this patch is going. It demonstrates how powerful the Fields API is and is the direction we should go with how the Taxonomy is used on the node edit form. One concern I have, however, is that its an entirely separate module. I understand the argument that this is a CCK/FieldsAPI addition, but wouldn't it make sense to add it to the Taxonomy module itself since the Fields API is part of core now anyway? Maybe in a taxonomy.field.inc file?
Comment #46
joshmillerThis is a review for the patch at #44. Tests look nice. Just found some doxygen formatting issues. No spelling errors!
Doxygen standards say it should be "Implements" -- found throughout patch.
Missing Doxygen @file declaration
Readability: perhaps a whitespace line between comment and code?
Like I said, just a quick check of the spelling and code syntax.
Josh
Comment #47
Dave Reid@joshmiller: The current Drupal 7 coding standard for hook docblocks is "Implement hookname()." until we can decide on the best format (another patch out there).
See example http://api.drupal.org/api/search/7/_block
Comment #48
Anonymous (not verified) CreditAttribution: Anonymous commentedI've abolished term.module. Catch has advised to keep all the field api hook implementations in taxonomy.module, which hopefully is going on a diet very soon when we trim away nodeapi.
Please check how I've added _taxonomy_clean_field_cache() to the save and delete operations for terms.
Your reviews are always helpful. Thank you!
Comment #49
catchHere's a nitpick pass:
if (count($allowed_values) && !array_key_exists($item['value'], $allowed_values))
Seems like this could just be !isset($allowed_values[$item['value']]); ?
(string)$item['value']
I think the casting should be string $item - with the space, but have a feeling this is inconsistent in core as it currently is anyway.
While these are all potentially useful - they're not @TODO - this is broken, they're @TODO this might be nice, so I think we can remove them (but good to document in followup issues maybe).
No longer users taxonomy_term_load_multiple() so that just needs removing from the docs.
I think the query should have
$query->addTag('term_access');
added to it - then it'll respect modules like tac lite.
Needs some phpdoc - I defer to yched on field_attach_query() implementation but that looks good.
However
We can now pass an array of cache ids to cache_clear_all, so this could be
then cache.inc will do DELETE FROM 'cache_field' WHERE cids IN ($cids); internally rather than a one-by-one delete.
Not needed.
Term $term->tid doesn't
You can use t() for the assertions and therefore put $term->tid into a placeholder. Or you probably don't need to put the actual $term->tid into the assertion text - just "Term does not cause validation error" is usually enough.
Everything else looks great to me.
Comment #50
yched CreditAttribution: yched commentedThen we probably don't need a switch ;-)
minor : it's related to the 'taxo field' part of the module, so maybe it should be named taxonomy_field_allowed_values() ?
Capitalization, trailing dot + the function doesn't return anything, actually ;-)
_taxonomy_clean_field_cache() is fine (once it uses the new 'multiple clear' mentioned by catch), but any other 'reference' field type will need similar code. Would be cool to abstract this into a generic function in field.module, as outlined in #111127-207: Cache node_load. At worst, could go in a followup patch, but we'll need it anyway at some point.
Also, note that there's a discussion pending in #367753: Bulk deletion to refactor field_attach_query() a bit, because currently it has the potential to completely blow PHP memory on large sites. Not something this specific patch should worry about for now, though.
The current code just lacks a check for 'is object type cacheable' before trying to clean the cache. Something like:
Actually, it might even be better to parse all object types for non-cacheable ones (in most cases there won't be any), then run field_attach_query() with a 'object_type NOT IN (non cacheable types)' condition.
The NOT IN operator in f_a_query() is currently not documented, but actually supported (I just created #511486: field_attach_query also supports 'NOT IN' to document it).
Other than that, as I wrote in #491196: CCK Field UI support for term fields: is it really a good idea to support 'taxo fields' spanning over several vocabs ? On 'object edit form', they will show up as one unique select widget merging terms from different vocabs, I'm not sure about this.
Comment #51
catchI'd like to be able to have an autocomplete on the node form, which allows assigning terms from combinations of different vocabularies. That way users think they're just tagging, but you get the convenience of multiple vocabularies for everything else (management, views, theming, etc.). However that doesn't need to be dealt with by this patch, at all.
Comment #52
Anonymous (not verified) CreditAttribution: Anonymous commentedyched:
Are you concerned about the UI and sensible defaults for this feature? Do you think it's fundamentaly a bad idea? I would be happy to keep this feature out of the headlines and "marginalize" it in the UI (as an advanced setting? custom module field creation only?). For most users, 1 field per vocabulary, 1 vocabulary per term field is the design we'll want to use in examples and defaults. In fact, good speedy taxonomy/term/[tid] pages might depend on this kind of design.
However, there are a few use cases for allowing term fields to rely on multiple vocabularies for options. Catch has already mentioned one related to combining controlled and tagging vocabularies in one field. In addition to that one, I think the proliferation of auto-tagging services and modules could this case even more compelling to consider. For me, I need to be able to combine topic, toponym, and chronology vocabularies into a single field called "coverage" (the way Dublin Core uses that word).
The alternative is to force $settings['vid'] to be a single value and see field modules appear in contrib that introduce this one difference. Overloading field settings with undocumented or unoriginal values has already been mentioned as risky and bad, so the only sanctioned path is either contrib or let it be an array that usually has a single value. I see this as low-hanging fruit that feeds use cases already identified by contrib (see Content Taxonomy module for example).
Some problems are bubbling up though. For tags where the user could add new terms, which vocabulary should get the new term if 4 different ones are giving options to the field? If there are use cases for multiple vocabularies, are there also use cases for subtrees? Multiple subtrees? Multiple subtrees from different vocabularies?
Here's an alternative in this new patch: each field has required 'vid' and 'parent' settings. If the term field's widget is configured for tagging (not real yet), the new terms will be added in that vocabulary, under that parent term.
Additional allowed values are defined as repeated arrays with 'vid' and 'parent' keys. These are simple unions. There's no complex set mathematics going on here.
taxonomy_allowed_values would combine the top 'vid' with the ones under 'allowed_values,' but new terms would be added only following the primary settings.
It's also important to state that allowing fields to pull options from multiple vocabularies does not "break" the ... theoretical? informatic? underpinnings of taxonomy.module. The vocabularies themselves are allowed to remain ontologically "pure" and term fields provide the flexibility of expanding the field's domain across mutiple vocabs. Nobody has to bend their controlled vocabularies to fit their site's content model and UI goals.
yched suggested renaming 'taxonomy_allowed_values' to 'taxonomy_term_allowed_values', but this would require that the field's module be named 'taxonomy_term.' Options.module calls the field's MODULE_allowed_values. Can I set a different module in the field info?
_taxonomy_clean_field_cache.. umm.. yeah. I really can't talk about it now. I'm dizzy from all the cache clearing. I think it's extra loopy because of the way the available options subtrees are defined. However, I documented it quite a bit with comments.
Comment #53
Anonymous (not verified) CreditAttribution: Anonymous commentedComment #54
catchThe allowed values stuff seems to be commented out? If bangpound's approach with the extra array for allowed values works that seems like a good compromise to me - default to a field per vocabulary but leave things open for other uses.
I keep getting told not to use abbreviations in code comments - also extra space here which isn't needed.
!empty
is enough there I think.Comment #55
Anonymous (not verified) CreditAttribution: Anonymous commentedThe code is commented because the 'allowed_values' should by default be an empty array. Normal term fields will only define settings[vid] and settings[parent] will default to 0. Additional options for the field are additional arrays of 'vid' and 'parent' in that 'allowed_values' array. I've removed the comment. Where/how do I document the field's settings?
Comment #57
Anonymous (not verified) CreditAttribution: Anonymous commentedTest bot, I'm on to you...
Comment #58
yched CreditAttribution: yched commentedMultiple vids / parent terms in a field: the consequences are a bit over my head, but the approach looks sound.
_taxonomy_clean_field_cache(): OK, hard to abstract to a generic function because you're using extra logic specific to "taxonomy field allowed values" to avoid querying on fields that won't contain the tid anyway. Which is cool, I guess. We'll see about a generic function in a separate issue (when adressing 'text input format changed, clear text fields cache')
#512236: Design flaws in field_attach_query() got (partially) committed, and you updated the code to use FIELD_QUERY_NO_LIMIT. This won't fly in the long run (PHP memory explosion), but we can keep it until that issue settles.
if (count($cids))
could be justif ($cids)
, since $cids is initialized as an empty array anyway.Other case for cache clear: when a vocab gets deleted :-(. We'll want to clear field cache for any object that uses a field referencing that vocab. No need to check for individual tids, it's ok if we delete 'too many' cache entries. But what we want to avoid is 'run _taxonomy_clean_field_cache($tid) on each $tid in the vocabulary' (PHP timeout pretty fast). We'd need to ensure that taxonomy's workflow on vocab deletion doesn't lead to this case.
Comment #59
yched CreditAttribution: yched commentedAlso re #55 "where do we document the field settings ?".
This is actually #464548: Document field types specifics (settings, columns...) - in short: there should be a PHPdoc block at the beginning of each field-type module that documents field, widget and formatters settings. I'm not familiar enough with PHPdoc conventions and how they translate to api.drupal.org to know precisely how it should be structured...
Comment #60
Anonymous (not verified) CreditAttribution: Anonymous commentedyched:
All of the sudden I've become dissatisfied with the method. I think it's a design flaw to impose this constraint (options come from any vocab and parent but only one vocab and parent can receive a new term) on widgets. There's no reason a widget couldn't support more versatile "add a term" features. I need a few hours to think about this, because I don't want to break the reliability of the options.module widgets.
When a vocabulary is deleted, fields and field instances also persist. This problem is a little bit deeper than you've realized. ;-)
I understand your points about FIELD_QUERY_NO_LIMIT.
I'm not sure what to do about documentation. Of course I want to document, but I also don't know if I'd be setting the standard or if I'd be evaluated against one that I haven't found... does anyone have some specific formatting hints or a template so I can write it well the first time?
Comment #61
Anonymous (not verified) CreditAttribution: Anonymous commentedOk. I'm happy with this now. The field settings include an 'allowed_values' array which is an array of taxonomy trees (vid and parent term). For most of us, a single tree scope definition will be fine, and usually that tree scope will be just the vocabulary ID and parent term 0.
I will be posting the documentation for this new field at #464548: Document field types specifics (settings, columns...) so that this patch doesn't get held up because we're discussing the way fields should be documented.
Comment #62
Anonymous (not verified) CreditAttribution: Anonymous commentedI'm glad I looked at that again. That would have been messy ... to keep adding conditions to the field_attach_query.
Comment #63
yched CreditAttribution: yched commented@bangpound: I think you can post the doc about the field settings in this patch's PHPdoc for taxonomy_field_info().
At least it will be somewhere where it can be picked up later :-)
Comment #64
Anonymous (not verified) CreditAttribution: Anonymous commentedI hope you're right yched. Here's a new patch with settings documentation on the implementation of hook_field_info().
Comment #65
catchThis is looking great, minor stuff:
The new docs in taxonomy_field_info() look double indented?
#50 where?
There's phpdoc for formatter tests, but no tests underneath, best to hide that away.
Nice work!
Comment #66
Anonymous (not verified) CreditAttribution: Anonymous commentedNo wonder my to do list only ever grows. #50 was referring to yched's comment #50.
Comment about formatter tests removed. We're actually testing a formatter already. After code freeze, I'd like to make some boilerplate field tests that do a more thorough job of testing all facets of fields and widgets. We don't really have great examples in core yet.
Indentation could be still bad. I'm trying to indicate that the value of allowed_values is an array of arrays.
Comment #67
Anonymous (not verified) CreditAttribution: Anonymous commentedThis new patch fixes some code style and phpdoc format issues. It also merges the _testTermfieldWidgets() function into testTermfieldWidgets().
Comment #68
webchickI looked at this and only found pretty minor stuff in my first pass. A few comments not conforming to standards, an extra /** and a missing /** here and there, and had a couple field API-related questions for bangpound and yched. Overall this looks really good. I'd like to play with it a bit to see if there are more substantiative comments, but overall I found the code pretty easy to follow.
Comment #69
Anonymous (not verified) CreditAttribution: Anonymous commentedI rolled patch #67 after chatting with webchick in IRC, and it no longer has the issues she describes. Let me know if there are new questions after you've had a chance to actually use it.
Comment #70
Dries CreditAttribution: Dries commentedI spent 10 minutes skimming this patch while at the airport. In short: I'm cool with it. It could use a more in-depth review, but it sounds like people have scrubbed it quite a bit already. It would be _really_ helpful if we could focus on the CCK UI -- that would make it easier to play around with patche like this. I think we should give that some priority (not in this patch).
Comment #71
Anonymous (not verified) CreditAttribution: Anonymous commentedThe current HEAD of CCK Field UI contains code to create fields and instances of this type.
Comment #72
yched CreditAttribution: yched commentedDries: see #516138: Move CCK HEAD in core - I'll be rolling an initial patch asap.
Comment #73
catchOK I read through this and couldn't find anything else to complain about. Additionally I applied this patch, #516138: Move CCK HEAD in core and c&ped http://github.com/benjamin-agaric/drupal-taxonomy-as-fields/raw/11ebd8b2... into taxonomy.module and was able to attach a term field to a user, which is wonderful.
There's going to be followup work required - this only gives us the 'term reference' field, it doesn't replace taxonomy_node_*() in any way (that's #412518: Convert taxonomy_node_* related code to use field API + upgrade path), but IMO this exactly what we need for the initial patch - any more would be too much to review properly - and it opens up a lot of other things as well (related terms and synonyms as a field? replacing profile autocomplete and keyword listings with a taxonomy field?) which don't rely on 412518.
Assuming #67 fixes everything webchick mentioned in #68, I think we're ready to go.
Comment #74
webchickI actually sat down with CCK module and this patch to dink around with it rather than doing a straight-up code review, which I already did earlier. A couple of things were a little bit funny, but overall this patch is completely awesome!
My first problem was I had a bit of difficulty locating the new field type in the list:
We could chalk this up to me searching for the keyword "Taxonomy" and not finding it, or me expecting "Text" to be the short "T" word at the bottom and thus ignoring it, etc. But it tripped me up, and I figured I'd mention it.
Additionally, although with "Node" we mask everywhere as "Content", we do not (at least so far) do that with "Taxonomy." And while "Taxonomy" is visible in the admin panel, and "Vocabularies" can be visible if there are more than one taxonomy vocabulary attached to a content type, I don't think "Term" is something commonly associated with taxonomy to the average person (more apt would be "Category" which we don't use), and it's a "normal" English word that means something relating to definitions, so I'm not sure the mental model quite meshes.
It's also possible that I'm just really exhausted and this would not trip up anyone else, including me on a day where I had more than 4 hours of sleep. ;)
The second thing to trip me up is that autocomplete was not in the selection of widgets to choose from. This was particularly weird given that core *does* ship with a Tags vocabulary for just this purpose:
I talked to bangpound about this and he informed me that this was next on the list of things to do, but the initial pass was just for normal categories. Fair enough.
The interim settings screen further scrambled my brains on the naming:
And finally, it'd be nice if there were some smarts passed along from the vocabulary chosen to the field settings:
I also discovered you cannot delete fields from CCK Field UI. Whee!
Talking over with bangpound, most of these issues identified here are CCK things. Really the only thing that affects this patch are the names. Posting this review so we can talk about it more.
Comment #75
webchickIn talking with bangpound a bit more, it probably makes sense to go ahead with the names/labels as-is for now and see how the follow-up work shakes out and whether it makes sense to rename at that point.
I found a few more minor things reading through (for example testTermfieldWidgets -> testTermFieldWidgets), but I can fix them prior to commit.
I would love to commit this tonight, but my brains are about to leak out all over the floor courtesy of a splitting migraine. So I'll look at this mid-day tomorrow and, assuming no one else has freaked out over anything, do final touch-ups and commit it then. Thanks for your patience. This week is a little rough. :\
Comment #76
Anonymous (not verified) CreditAttribution: Anonymous commentedThis patch is really supposed to be the first of a few that break taxonomy.module's relationship to nodes through node API and replace it with Field API. I'm happy to see that webchick expected to see things that aren't there yet.
Tags and tagging are the next problem I'm going to work on. The most basic requirement to bring tagging through Term Fields is to create an autocomplete widget that also adds new terms to the tag vocabulary. I've also solicited input from contrib module maintainers and the wider community about what tagging looks like in Drupal 7. #495774: Modularize tagging Much of the discussion there is a little free-wheeling, so rest assured that I know the priorities despite opening the issue like a brainstorming session. As a free gift from our sponsors, any autocomplete widget for tagging will also work with list module fields, and adding new terms will be optional per term field instance (list module won't support new allowed values).
Vocabulary settings such as nodes, multiple, required and tags are all eligible for being cut. 'Tags' could be an exception; see #495774: Modularize tagging discussion about UI for a tagging vocabulary versus a controlled vocabulary. 'Multiple' and 'required' are field or field instance settings. They're nothing to do with what a vocabulary really is.
Each new vocabulary will automatically create a new field. Although the Field UI is not settled, I believe that site builders will probably be adding existing fields to their "bundles" that will have a clear descriptive label: "Taxonomy: Topic vocabulary terms" or something like that. I defer to others to decide what's best here, but I just want to make sure you know that most use cases will not require setting the vocabulary for a new term field.
I'm unable to give you perfect answers for what this field should be called and how it should be described, though I'm happy to provide my reaction to other ideas! For me, this field is for making relational links to names of things in a schema or vocabulary. That's not a friendly definition, but it's a very powerful concept for organizing information on the web.
The name of the field is 'term' simply because it's what made sense to me. I think 'taxonomy' is not great as the name for the field because (to me) it's a very specific meaning that isn't valid for the values of a field.
The points about 'term reference' are taken. I think I was just making a search and replace there. That's part of the CCK Fields UI patch, so it was made primarily to ease the testing of the patch. It still needs some work!
Comment #77
catchOn naming, I think this'll be easier to do with CCK UI in core, and in followup patches, but I'd probably go for 'Taxonomy term' as the field name - it's provided by taxonomy module, and it's not a vocabulary - so it gives you a bit of a clue if you managed to find term fields before ever going to admin/build/taxonomy.
Completely agreed with bangpound on vocabulary settings - we want to cut that out of vocabulary creation and storage entirely, so they become just loose containers for terms - and all the help text/tags/multiple etc. is handled via the field UI. But all that happens in the other patch.
Comment #78
Anonymous (not verified) CreditAttribution: Anonymous commentedI have editor windows open for another patch on this issue.
So far I have fixed some funky test code that was a hornets nest of curly and square brackets. Chx pointed out how unreadable the code was.
Please find me in IRC to direct me to make any other changes.
Comment #79
Anonymous (not verified) CreditAttribution: Anonymous commentedOn second thought: 'no let's fix it in a follow up patch.' (per Catch)
Comment #80
Anonymous (not verified) CreditAttribution: Anonymous commentedField changes:
Term becomes Taxonomy term.
Description is now "This field stores a reference to a taxonomy term."
Field info changes:
term becomes taxonomy_term
Formatter info changes
term_default becomes taxonomy_term_default
term_plain becomes taxonomy_term_plain
Test changes:
TermFieldTestCase becomes TaxonomyTermFieldTestCase
testTermFieldValidation becomes testTaxonomyTermFieldValidation
testTermfieldWidgets becomes testTaxonomyTermFieldWidgets
monstrosity $entity->{$this->field['field_name']}[0]['value'] gets a little less monstrous as $entity->{$this->field_name}[0]['value']
Comment #81
RobLoachSubscribing........ Umm, nevermind, wrong issue.
Comment #82
drewish CreditAttribution: drewish commentedsub
Comment #83
webchickJust a note that I discussed this issue with Dries over dinner the other night, but he said he wanted another crack at it before committing.
Comment #84
catchFollow-up issues:
#526122: Autocomplete widget for taxonomy term fields
#526120: Remove related terms from core now that terms can have a term reference field
#412518: Convert taxonomy_node_* related code to use field API + upgrade path
Also this is likely to be useful for the 'list users with this profile field' component of #394720: Migrate profile module data to field API.
Comment #85
Anonymous (not verified) CreditAttribution: Anonymous commentedHooray! Dries' laptop is fixed! (cough)
BTW: This patch will effectively fix #275352: Taxonomy throwing odd errors when previewing node and all kinds of other issues related to taxonomy's legacy and exceptional behaviors.
Comment #86
Dries CreditAttribution: Dries commentedMy laptop isn't fixed yet, but I had a chance to look into this:
1. Why is field_formatter_taxonomy_term_default called field_formatter_taxonomy_term_default and not field_formatter_taxonomy_term_link?
2. The documentation of taxonomy_field_load mentions there being one object but it actually takes multiple objects.
3. taxonomy_field_load needs some code comments, IMO.
4. I expected taxonomy_field_widget_info_alter to be unnecessary and these things to be adjusted automatically but maybe I'm not 100% speed on that part yet. I'd need to dig a bit deeper into the code, or the documentation could be a bit more explicit/verbose about what is going on.
5. t("Test the creation of term fields.") should use single quotes.
Comment #87
Anonymous (not verified) CreditAttribution: Anonymous commentedHere's a new patch.
Thanks to tha_sun for helping me add useful comments in taxonomy_field_load. He also pointed out an unnecessary call to drupal_schema_fields_sql.
The hook_field_widget_info_alter implementation exists because the core field and widget modules explicitly mention support for each other but not for optional core fields, which admittedly don't exist yet. The first few comments on the issue were related to the need to patch options.module, and hook_field_widget_info_alter was developed as a result.
This was marked RTBC before. I apologize if I'm breaking the rules about re-setting that status. LMK.
Comment #88
Dries CreditAttribution: Dries commentedCommitted to CVS HEAD.
This needs a CHANGELOG.txt entry, IMO.
I'd also like to see a list of follow-up issues.
Comment #89
catchFollowups:
#526122: Autocomplete widget for taxonomy term fields
#526120: Remove related terms from core now that terms can have a term reference field
#412518: Convert taxonomy_node_* related code to use field API + upgrade path
#538256: Use a text field for taxonomy term synonyms
Comment #90
Anonymous (not verified) CreditAttribution: Anonymous commentedThis patch adds a changelog entry.
Comment #91
webchickCommitted #90 to HEAD. Thanks!