The development of Drupal Field API obligates us to look at taxonomy.module to keep it relevant and modern. The concerns At Drupalcon DC, Benjamin Melançon (ben-agaric) and Benjamin Doherty (bangpound) began to look at this problem and untie it. Here's what we propose.

Module reorganization

Taxonomy.module should primarily contain features for the management of controlled vocabularies: CRUD of vocabularies and terms, describing the hierarchy of terms, and other features that may or may not be fully developed yet: synonymy, term relationships.

A new module called term.module will be the field module that will replace all or most of taxonomy.module's nodeapi functionality. Hopefully it will do more, too.

Field widgets also need some development. Currently, we're using Options.module select widgets, but we shouldn't limit ourselves only to that widget. Tagging autocompletion widget will need to be built.

Tagging features live in taxonomy.module because there had been a significant overlap in the functional, UI and data requirements with taxonomy.module's support for controlled vocabularies. However, this overlap will become much less significant. I'd like to solicit the community's ideas about how to move tagging into a rational place in Drupal 7 and this proposal. We certainly do not propose abandoning tagging at all. We simply need to know where it fits in a Drupal 7 where taxonomy.module lets us manage vocabularies and term.module lets us put terms on things.

Should-be invisible functional changes

Currently, the vocabulary definition contains properties that will need to be expressed in a term field's definition. Without a UI for adding fields to content types, the vocabulary admin form's submit handler needs to call the field functions for creating new fields and field instances. The same choices of multiple values and required translate easily to Field API instance properties.

All the functions in taxonomy.module that load and save term-node relationships are probably going to disappear or be put on a serious diet. All of these data transactions are going to be handled by Field API's own functions or the new term.module's implementation of Field API hook.

taxonomy_node_load
taxonomy_node_view
taxonomy_node_delete
taxonomy_node_delete_revision
taxonomy_node_validate
taxonomy_node_update
taxonomy_node_insert
taxonomy_node_save
taxonomy_node_update_index
taxonomy_node_rss_item

These functions are outliers. They work with node-term relationships, but they don't have an analog in the Field API.

taxonomy_node_get_terms_by_vocabulary
taxonomy_get_tids_from_nodes
taxonomy_term_count_nodes
taxonomy_select_nodes

Major repercussions of making terms into fields

The current field storage schema is a huge step away from the way term_node table is structured. Most importantly, term_node holds all of these relationships in one table, while Field API defaults to storing field values by bundle. This means that values of term fields of different vocabularies will be stored in separate tables and perhaps even in multiple tables if there are multiple fields made from the same vocabulary.

There's incredible value to easily selecting all nodes with a particular term or all terms for a particular node, so we propose maintaining the term_node table in addition to the Field API storage mechanism. The term_data table would be 100% redundant to the Field API storage table. Dropping term_node entirely would make previously easy tasks much more difficult. This isn't a stubborn commitment to maintaining the term_node table, so please comment if you have some alternatives.

Core dependencies on taxonomy.module

Changes to forum.module will be necessary.

Windows of opportunism

The initial motivation to undertake this project — rather the "final straw" against bangpound's indolence — was Stéphane Corlosquet's proposals for RDFa in Drupal. In this model, nodes are subjects, terms are objects, and dc:subject is the predicate. Currently, nodes and terms are doing something with each other, but we don't really know what they're doing. The RDFa proposal requires us to infer the nature of the relationship, and dc:subject is a safe default. If fields can be made out of terms, we will have better opportunities in core to define the "predicate" or nature of the relationship between nodes (or other things) and terms.

We will also be able to re-use the same vocabulary multiple times on the same thing. Benjamin Melançon gave us an example where nodes describe some aspect of international movement of people or goods. The same vocabulary of geographic place names will need to be used for fields capturing "country of origin" and "country of destination" information.

If user profiles will be comprised of fields, taxonomy terms have a sensible and exciting role to play there.

Development work

So far, the patch changes the taxonomy_form_vocabulary form and its submit handler to create fields and instances. We also needed to change one line in options.module to support the new term.module.

No tests yet. I know it's a requirement.

Files: 
CommentFileSizeAuthor
#199 taxonomy.patch121.27 KBcatch
Passed: 13441 passes, 0 fails, 0 exceptions View
#197 taxonomy.patch121.25 KBcatch
Failed: 13666 passes, 0 fails, 13 exceptions View
#193 taxonomy.patch121.25 KBcatch
Failed: Invalid PHP syntax in modules/forum/forum.admin.inc. View
#189 drupal.taxonomy-field-revamp.patch121.39 KBsun
Passed: 13687 passes, 0 fails, 0 exceptions View
#183 taxonomy-412518-183.patch121.47 KBAnonymous (not verified)
Passed: 13645 passes, 0 fails, 0 exceptions View
#181 term-edit.png12.59 KBandypost
#174 taxonomy.patch121.6 KBcatch
Failed: Invalid PHP syntax in modules/forum/forum.admin.inc. View
#172 taxonomy.patch121.62 KBcatch
Failed: 13436 passes, 0 fails, 69 exceptions View
#168 taxonomy.patch121.52 KBcatch
Passed: 13673 passes, 0 fails, 0 exceptions View
#166 taxonomy.patch118.72 KBcatch
Failed: Failed to apply patch. View
#163 taxonomy.patch118.69 KBcatch
Failed: Failed to apply patch. View
#161 taxonomy.patch118.58 KBcatch
Failed: Failed to apply patch. View
#155 taxonomy.patch118.97 KBcatch
Failed: Failed to apply patch. View
#151 taxonomy.patch117.95 KBcatch
Failed: Failed to apply patch. View
#146 taxonomy.patch115.7 KBcatch
Failed: Failed to apply patch. View
#144 taxonomy.patch92.73 KBcatch
Failed: Failed to apply patch. View
#129 taxonomy.patch92.95 KBcatch
Failed: Failed to apply patch. View
#127 1-head.png629.93 KBcatch
#127 1-patch.png567.81 KBcatch
#127 2-head.png709.9 KBcatch
#127 2-patch.png598.81 KBcatch
#127 3-head.png688.12 KBcatch
#127 3-patch.png571.98 KBcatch
#127 4-head.png548.06 KBcatch
#127 patch-4.png477.65 KBcatch
#123 taxonomy_diet-412518-123.patch92.93 KBAnonymous (not verified)
Failed: Failed to apply patch. View
#120 taxonomy_diet-412518-120.patch93.17 KBAnonymous (not verified)
Failed: Failed to apply patch. View
#118 taxonomy_diet-412518-118.patch92.81 KBAnonymous (not verified)
Failed: Failed to apply patch. View
#116 taxonomy_diet.patch92.68 KBcatch
Failed: Failed to apply patch. View
#112 taxonomy_diet.patch91.98 KBcatch
Failed: 12364 passes, 5 fails, 2 exceptions View
#105 taxonomy_diet-412518-105.patch93.23 KBAnonymous (not verified)
Failed: Failed to apply patch. View
#104 taxonomy_diet-412518-99a.patch95.98 KBAnonymous (not verified)
Failed: Failed to apply patch. View
#99 taxonomy_diet-412518-99.patch96.57 KBAnonymous (not verified)
Failed: 12300 passes, 21 fails, 40 exceptions View
#97 taxonomy_diet-412518-97a.patch94.98 KBAnonymous (not verified)
Failed: 12335 passes, 28 fails, 75 exceptions View
#95 taxonomy_diet-412518-95.patch86.75 KBAnonymous (not verified)
Failed: 11960 passes, 279 fails, 103 exceptions View
#93 taxonomy_diet-412518-93.patch87.06 KBAnonymous (not verified)
Failed: Failed to apply patch. View
#90 taxonomy_diet-412518-90.patch86.03 KBAnonymous (not verified)
Failed: Failed to install HEAD. View
#88 taxonomy-diet.patch90.51 KBAnonymous (not verified)
Failed: Failed to apply patch. View
#85 taxonomy-diet.patch85.85 KBAnonymous (not verified)
Failed: 11934 passes, 295 fails, 121 exceptions View
#82 forumihateyou.patch76.54 KBcatch
Failed: Failed to apply patch. View
#80 taxonomy_diet.patch75.71 KBcatch
Failed: 11889 passes, 138 fails, 133 exceptions View
#76 taxonomy_diet.patch82.19 KBcatch
Failed: 12063 passes, 138 fails, 147 exceptions View
#71 taxonomy_diet.patch73.24 KBcatch
Failed: 11998 passes, 242 fails, 181 exceptions View
#69 taxonomy_diet.patch70.25 KBcatch
Failed: 11947 passes, 275 fails, 192 exceptions View
#67 taxonomy_diet.patch54.75 KBcatch
Failed: 11661 passes, 242 fails, 204 exceptions View
#65 slimmer_taxonomy.patch50.9 KBcatch
Failed: Failed to install HEAD. View
#61 taxonomy_upgrade.patch6.77 KBcatch
Failed: Failed to install HEAD. View
#30 taxonomy-terms-as-fields-03.patch16.88 KBAnonymous (not verified)
Failed: Failed to apply patch. View
#12 taxonomy-terms-as-fields-02.patch20.79 KBAnonymous (not verified)
Failed: Failed to apply patch. View
#1 taxonomy-terms-as-fields-01.patch16.66 KBAnonymous (not verified)
Failed: Failed to apply patch. View

Comments

Anonymous’s picture

FileSize
16.66 KB
Failed: Failed to apply patch. View

Here's the first patch.

Anonymous’s picture

Title: Taxonomy.module vocabularies should be field » Taxonomy.module vocabularies should be fields
mlncn’s picture

Git repo

Because ripping out the guts of taxonomy and putting them back in breaks a lot of things, while we are very committed to not killing kittens and keeping patches as bite-size as possible, for development of this patch we created a git repository:

github.com/benjamin-agaric/drupal-taxonomy-as-fields

Developers interested in participating in developing this patch to a committable state are invited to use or fork the git repo or to post patch files to this issue queue. Thanks!

benjamin, Agaric Design Collective

catch’s picture

Status: Active » Needs work

subscribe.

catch’s picture

Initial feedback:

Why is vocabulary weight removed? Is this because term display will be per-field-instance? If so I guess we don't need the nasty order by vocabulary weight in taxonomy_node_get_terms() - and possibly could kill taxonomy_node_get_terms() altogether? If that's the case, then great! But want to make sure I understand the reasoning.

An answer to one @TODO - hierarchy is always on, if you add two parents to a taxonomy term it turns on multiple hierarchy - this was a very late change in D6 when drag and drop went in - probably needs re-refactoring.

Also a heads up that I just opened the opposite side of this issue #413192: Make taxonomy terms fieldable.

Jeff Burnz’s picture

subscribe.

Anonymous’s picture

Term display will be per-field-instance.

Many of the properties of vocabularies are like field instance properties, and I've had a difficult time knowing how ruthless I can be in removing things that should/would be managed by any bundle admin tool that might come up. In particular, weight would correlate to the weight property of a field instance. However until now, the weight and other properties have been attached to the whole vocabulary (which kinda correlates to the field itself).

If we remove weight, why not remove "multiple" and "required" too? Why even define the field name for the whole vocabulary? At some point we're faced with building our own field manager inside taxonomy module simply to make sure the features remain neatly consistent OR we agree to prune all these features to the bare minimum because whatever Field API bundle tools will not be saddled with this baggage.

Taxoman’s picture

Ref.

"The term_data table would be 100% redundant to the Field API storage table. Dropping term_node entirely would make previously easy tasks much more difficult."

Without having given this transition deep thought, the first question that comes to mind is this:

- How would this affect security (ACL)? I imagine it would pose significant challenges in the intersection between taxonomy based access control and the Views module, for example.

I wonder if the sheer amount of work and discussions and testing needed for this to land properly without jeopardizing security or delaying Drupal 7, might suggest that this is a Drupal 8 task...

(Drupal 7 will not be usable before its security scheme and the Views module works well together, given how crucial the Views module is becoming to many Drupal sites.)

catch’s picture

re: #7, weight being a field instance property sounds great to me - it's more flexible than vocabulary weight so why keep an inflexible system in place.

@Taxoman: afaik both term_data and term_node aren't going anywhere. It's completely fine for field modules to handle some additional storage by themselves, and term_node is the only thing required for taxonomy based permissions to work. We need to make much bigger changes than this to Drupal 7 before a release, and we've got 5 months before code freeze, plenty of time to get this right. It'd be a real shame to ship Drupal 7 with the Field API in core and no core modules taking advantage of it.

catch’s picture

Regarding the vocabulary properties like multiple select, required etc. - these should definitely move to field instance properties IMO - however doing that now also means we'll have no core UI for managing that stuff until a Field API patch lands. This is a general issue with any conversion (profile, poll etc.), so I've opened another issue at #414328: Converting core modules to provide fields leaves them with no UI to stop any discussion about that from polluting this one.

Xano’s picture

Subscribing.
(But WHY the git repo? :-( I am having trouble with CVS already :-P )

Anonymous’s picture

FileSize
20.79 KB
Failed: Failed to apply patch. View

CVS is a lot of trouble, Xano. I don't know what to tell you.

Here's a new patch that:
1. removes all taxonomy_vocabulary_node_type table transactions in taxonomy module.
2. fixes the term field module so it actually works.
3. further refines the awkward process of creating fields and instances from vocabularies.

The next patch coming from me may be a few weeks away. I have a major project to finish this week and then I'm training my client the week after. My next goal is to remove all the Node API stuff from taxonomy.module and start writing the code to either maintain term_node table or make the absence of term_node invisible to Drupal core.

dodorama’s picture

I'm not sure if this is the right place to ask for this, but since we're rewriting the module from the ground up can it be useful to link terms to the user who created them so that we can have, for example, personal vocabularies as stated here?

Anonymous’s picture

dodazzi this is the wrong queue for that feature request.

you may have some openings for this feature at #413192: Make taxonomy terms fieldable. adding a user reference field to a term object would be a foundation on which to develop this feature.

Owen Barton’s picture

Exciting stuff - subscribing :)

scor’s picture

subscribing

mlncn’s picture

Updated git repo to use and work with the latest D7. Trying to get this working for the RDF in core code sprint tomorrow.

benjamin, Agaric Design Collective

Dries’s picture

Issue tags: +Favorite-of-Dries

Taggging. Drupal-asm. :-)

catch’s picture

Benjamin showed me the latest status of this when he was in London.

I think this is an accurate summary of the remaining issues to resolve:

1. Upgrade path - we probably need to create a new field per vocabulary, and a new field instance per content type it's assigned to.

2. UI. Since there's no field UI in core, we might need to keep the checkboxes on admin/build/types - if it's possible to plug those directly into Field API and only have the minimum bridging code in taxonomy.module to do this that'd be great. Then if you want to do anything sensible, for now you'd have to install CCK HEAD, and we can rip out the bridging code later.

3. All of the term_node stuff in Drupal 6 is really hard to do in D7 without Views in core. I think we need to maintain both the field storage and term_node until patches like scalar searching etc. are committed. Again - just to get the initial patch in, if we can remove some of the term_node stuff later that's great.

Anonymous’s picture

great. I'll be able to get back on this patch next week.

skilip’s picture

Subscribing

moshe weitzman’s picture

subscribe ... scalar searching of fields is in.

catch’s picture

Last patch posted here was a month ago and I'm pretty sure work was done since then - no-one is going to look at a git repo, please post an updated patch if you've got time to do so.

catch’s picture

I had another look at the last patch and I think we might be able to get this in easier in two issues:

1. Add term.module which provides the field type - that gives us the taxonomy as field feature (attach it to users, attach it to nodes on new sites) in core - works great for new sites

2. Remove taxonomy_node_*() and all the spaghetti code around that, do the upgrade path, deal with the performance issues.

They're really two separate issues, and it'd mean the initial patch being very tiny, and the followup patch being entirely focused on removing stuff and the upgrade path. There'd be no reason not to work on them in parallel, but just a lot less to review at once.

Xano’s picture

So we basically split everything in Taxonomy API and Term field modules?

catch’s picture

That's what the current patch does, I'm not really sure why we need a separate module though - seems like those functions could just be added to taxonomy.module

Anonymous’s picture

Catch:

I will split the patch in two tomorrow.

Anonymous’s picture

A lot of Field API stuff changed in the last month. Pretty much nothing works in the code now.

I'll be working on it this weekend.

Term and Taxonomy module are separated because there's no requirement to use them both. Term is optional even if Taxonomy module is enabled.

andypost’s picture

Now terms have fields #413192: Make taxonomy terms fieldable

Is this issue going just to make term_node to be splited into field storage?

Anonymous’s picture

FileSize
16.88 KB
Failed: Failed to apply patch. View

I've brought the patch up to date with HEAD. More or less the same things work as with the previous patch.

I have some questions that are coming up though.

#413192: Make taxonomy terms fieldable brought in a 'machine_name' for vocabularies similar to the node type... even with the same UI. I too had copied this code for defining the field name. Is it redundant now? Is it OK to use (switching to my notes so I get it right) the same name for the vocabulary's term bundle AND for the vocabulary's field name?

Now that I have machine_name stored with the vocabulary, I no longer loop through term field instances to see if the field's vid setting == the vocabulary's vid. Instead I just grab fields whose field name is the vocabulary's machine name. Should I backpedal? I probably should because renaming bundles is possible now, but renaming fields isn't as far as I can tell.

I need some help with formatters and widgets. Right now, I'm using options widgets, but this will be no good for tags. The default formatter makes terms into links as in Drupal 6. Should there be more vocab field formatters than this in core?

Here's a patch. It also includes the patch at #490432: taxonomy_form_vocabulary always returns error when editing existing vocabularies. I'll be working throughout the weekend as well. See you in IRC!

catch’s picture

Can't comment on the field/bundle rules but that sounds awkward, seems like we need Barry or yched on that bit.

Would be good to have an autocomplete widget, other widgets or formatters might be nice to have but don't see any need to introduce them with the initial patch.

Anonymous’s picture

Part of this task is now posted at #491190: Provide a taxonomy term field type. Please turn your attention there. This issue should remain open, because there's still more code to write.

Anonymous’s picture

Title: Taxonomy.module vocabularies should be fields » Taxonomy.module vocabularies become fields, deprecate taxonomy-node functions, develop an upgrade path
chx’s picture

Title: Taxonomy.module vocabularies become fields, deprecate taxonomy-node functions, develop an upgrade path » make taxonomy terms fields

Mentioning "vocabularies" in the title is pooling wool over the unsuspecting eyes. If I have not asked in IRC "ok, so vocabularies are fields but what are then terms" turns out that this issue tries to sneak in terms-as-fields under a false title. Do not even try do that. This issue MUST be VERY VERY carefully benchmarked and only then can it be considered.

catch’s picture

Title: make taxonomy terms fields » Convert taxonomy_node_* related code to use field API + upgrade path

Clarifying issue titles between this and #491190: Provide a taxonomy term field type.

Anonymous’s picture

Talk in #drupal is that a term field should be automatically created with each vocabulary. On some level this seems obvious to me, because in almost every use case, you want your vocabulary to relate to some other thing on your site. The place where this relationship happens is the field.

This doesn't really address the extent of the UI and usability questions for this task. With Field API, we get to completely disconnect taxonomy.module from the node system. The "content type" checkboxes on the vocabulary settings form really need to go away! So even if we make a field for each vocabulary, we still have to make field instances for those fields on our node types.

When we create new node types, we don't want to automatically create a node reference field that's constrained to that new node type. Or do we?

Anonymous’s picture

I've created a new issue related to this project. I'd very much appreciate feedback from contrib module maintainers and anyone who has something to say about tagging in Drupal 7.

#495774: Modularize tagging

Any work I do on tagging is going to wait until term.module is committed, so I think you have a good amount of time to think about it and share ideas.

Xano’s picture

Catch and I discussed Term.module this week and we came to the conclusion there is not much reason to create a separate module next to Taxonomy.module as it may cause confusion for end users ("Do I need _another_ module to be able to categorise my content?!") and with the registry adding it to Taxonomy.module doesn't really cause overhead.

Anonymous’s picture

yched has realized some of the problems we will face as we migrate from nodeapi-terms to fieldapi-terms. I suggest everyone read it. Importantly he reminds that there has already been some discussion about this topic. It's probably better to continue this discussion here, because this issue will remain open until the various tasks for this project are completed.

Generally, I think the semantic changes are vital. We'll be able to name the predicate explicitly, while in the past, the node (subject) -> term (object) relationship was either ambiguous or deduced.

I see this problem having two faces:

  • Upgrading Drupal 6 sites to Drupal 7 will require us to translate the ambiguity into an arrangement of fields and instances that can either be elaborated by the site maintainers or left ambiguous. We'll want sites that upgrade to D7 to express term-node relationships in an identical or almost identical way as D6.
  • The "create a vocabulary" workflow(s) need to be assessed (and tested?) so that we're not leaving users out at sea with vague or presumptuous semantics. (scor has proposed that terms are "dc:subject" for nodes in his presentations on RDFa, and this sounds like a sensible default to start from.)

This leads into the next problems of what happens to at taxonomy/term/[tid] paths. If all else fails, do we implement hook_field_storage_write so we can continue to maintain a term_node table for these paths? (Yipes!) If users and terms can be classified with taxonomy terms, does that mean they too should appear in these lists? Open questions for discussion.

Answers to these questions are probably dependent on other changes in D7, so rather than answer them all now, I'm keeping an open mind as I learn more about what's going on in D7. As always, I appreciate feedback, suggestions, help, etc.

yched’s picture

"object - term relations can now have different predicates" is a nice way to describe the semantics change of "taxo as field". Let's be clear that those are 'mental' predicates, though, only known by humans (site admins and users). RDF can then be used to apply actual "machine" predicates.

IMO the upgrade path require we create one default field for each vocab, and one instance per node type using the vocab. Those fields will receive the D6 node-term relations. Site admins can then add other taxo fields for a given vocab using Field UI (a bit like what you can do with Content Taxonomy in D6), but the default fields should be good enough for most sites.

Theoretically we *can* bypass the regular field storage (one table per field) and have all data for object-term relations in a single, separate table (see hook_field_storage_pre_*() hooks). This term_object table must then have columns for obj_type, bundle, field_id, delta
Not sure exactly of all pros and cons, so I'm not saying this is what we actually want to do, but this is the only way to keep a "list all [obj_type] objects having term [tid] in any taxo field" page (taxonomy/term/[tid]/[obj_type]). Other than that, the best we can do is "list all [obj_type] objects having term [tid] in field [field_name]" (taxonomy/term/[tid]/[obj_type]/[field_name]) - which might be what we want indeed.

Note that the historical taxonomy/term/[tid] path is dead anyway (or it defaults to 'nodes'). There's no way to grab an ordered, paginated list of mixed object types (nodes + users + contrib entity type...)

catch’s picture

I reckon we have a few options here. {term_objects} with a type column in addition to field storage to allow some semblance of what we could do in D6. That seems to me like it should probably be an additional table rather than completely replacing field storage, not sure. {term_node} isn't exactly efficient as it is, but as long as we don't make things significantly worse that'd help. We could also consider generating term_$entity tables dynamically. Something like this seems necessary though.

For taxonomy/term/% itself - I think there's a few things we can look at:

Remove the 1+2,3,4,5,6/8 behaviour from core - Views handles that effectively in contrib now as of Views 2, and we have too much special casing in taxonomy_term_page() to try to handle it without having views in core to make it more abstracted. I may open a separate issue for this anyway. So taxonomy/term/% becomes taxonomy/term/%taxonomy_term with a proper menu loader.

For the listings themselves, we could probably go for taxonomy/term/%/content taxonomy/term/%/users taxonomy/term/%/terms etc. etc. - if you only have a vocabulary assigned to content, you get the same as D6 behaviour. Doesn't seem overly hard to implement if we have the dual storage there in the first place. I don't think a mixed page of entities is desirable anyway - at least we shouldn't worry about that.

Then just like now you can override those pages with whatever you want, but the default behaviour core provides actually makes a lot of sense for most cases - doesn't mean that we can't show terms grouped by field etc. on node rendering itself and link to different views in the formatters (maybe replacing hook_term_path())?

Anonymous’s picture

don't forget depth!

catch’s picture

bangpound - yeah I think we could consider dropping depth from core as well - just install views if you need it.

scor’s picture

For taxonomy/term/% itself - I think there's a few things we can look at:

yes, we should really keep a URI for each term, the RDF patches use it. It took many years to get an instance path for each comment (#26966: Fix comment links when paging is used.), let's not lose it for terms now!!!

catch’s picture

Posted #503456: Remove multiple tid and depth handling for core taxonomy paths which simplifies all the code around the default term page.

yched’s picture

@scor:
I might very well be missing it, but I don't see how this compares to comments permalinks.

The moment we enable terms on non-nodes, there cannot be a single 'URI' for a term, listing all objects tagged with that term. You cannot build a SQL query that lists heterogeneous objects sorted by "{node}.timestamp or {user}.timestamp or {foo}.bar". It has to be several pages filtering by object type.

To avoid link rot, taxonomy/term/%term would need to default to nodes, but can this path be considered "the" term's URI ? What's the meaning of attaching this URI to the RDF info added to the term when it's applied to a user ?

Anonymous’s picture

The term is content in its own right, so we can default to listing nodes as is done now, but the term page also shows the description, other fields, and perhaps some representations of hierarchy. Hypothetically, If we listed no nodes and kept taxonomy/term paths otherwise to show only the term and its bundled fields would this suffice for the RDF features?

catch’s picture

When we discussed comment permalinks, someone suggested having a comment/%/view url which just shows the individual comment.

So we could have taxonomy/term/%/content taxonomy/term/%/user, but also taxonomy/term/%/view - the last one doesn't show any lists of associated content by default, but maybe contains information about parents / children / relations / synonyms etc.

Other options are keeping taxonomy/term/% pages and adding a block/column for each object type so it stays as a single page - but that would require block module not to be crap and probably views in core so you could customise easier.

scor’s picture

@yched: we don't necessarily need to have all the content which is tagged with a term (though it would be nice for nodes at least as it is now in D6), but having simply the name of a term and some basic metadata about it would be great (as bangpound and catch said). We can let Views/contrib dealing with listing more complex instances types like users...

yched’s picture

re #49: well, the default output for 'taxo field' values cannot depend on Views.
When a taxo field is displayed on a user, it makes little sense to display it as a link to 'list of nodes tagged with that term', it would need to be a list of users.
OTOH, 'taxo as field' lets you add taxo terms to any fieldable type, but taxonomy.module cannot be expected to know how to build a list of [arbitrary fieldable type]. That would require some common 'data API': at least #460320: Standarized, pluggable entity loading (nodes, users, taxonomy, files, comments), and probably an equivalent on the 'build / display' side. Tricky.

Other than that, agreed with what bangpound and catch said.
It should be noted, though, that while Fields on terms has awesome use cases, IMO most terms on most sites will still be just a term name and at best a 'description', which makes the common 'only term data, no content listing' page pretty sparse. It's probably useful for RDF 'URI', but I'm not sure it makes much sense to have upfront in the UI.

Anonymous’s picture

New questions:

hook_field_widget_info() has no documentation. Do widgets have configurable settings?

We'll hopefully be able to define multiple vocabularies/subtrees as allowed values for a taxonomy term field. For tagging widgets where the user has permission to create new terms in a vocabulary, we need to define which vocabulary and parent term for new terms. Will this be a field setting or a field instance setting?

What will replace taxonomy/term/%taxonomy_term paths given the new problems we've created? If we create 1 field per vocabulary to mimic Drupal 6 behavior, will that page list nodes where that field has that term? Or do we want it to include all term fields? (I'm aware that we'd need #460320: Standarized, pluggable entity loading (nodes, users, taxonomy, files, comments) to be truly flexible.) It would be nice to write paths like view/[bundle]/[field]/[value] where [bundle] and [value] could be "all."

$node->taxonomy goes away?

yched’s picture

Do widgets have configurable settings?

Yes, look in text_field_widget_info()

For tagging widgets where the user has permission to create new terms in a vocabulary, we need to define which vocabulary and parent term for new terms. Will this be a field setting or a field instance setting?

I can't think of a reason that would enforce this setting to be the same for all instances, so that could be an instance setting, I guess ?
Or, since it only relevant for widgets that let you create new terms, perhaps a widget setting ?

What will replace taxonomy/term/%taxonomy_term paths given the new problems we've created? If we create 1 field per vocabulary to mimic Drupal 6 behavior, will that page list nodes where that field has that term?

I think we should definitely create fields and instances to match current D6 behavior. Then, to preserve existing links, taxonomy/term/%taxonomy_term needs to list nodes tagged with this term in the default field.

Anonymous’s picture

I did look at text_field_widget_info but I didn't see the settings. Now I do! Thanks.

I'm leaning toward widget settings, because there will be widgets that support adding terms and widgets that don't (such as options widgets). That effectively makes it like instance settings in that each bundle with that field could be adding terms to different vocabularies for multi-vocab fields.

The taxonomy/term paths then will "break" if people delete the default fields for a vocabulary. Or do we have a way to prevent a field from being deleted?

We'll have to be very careful about the documentation of this.

yched’s picture

"The taxonomy/term paths then will "break" if people delete the default fields for a vocabulary. Or do we have a way to prevent a field from being deleted?"
This area in Field API is still kind of grey, for lack of specific use cases so far.
The main issue is what people are able to do with module-defined fields and instances in Field UI. I'm not sure we can or want to prevent silly uses of field_delete_field() or field_delete_instances().
At most, admins will be able to remove *instances* (== "this vocab doesn't apply to this node type anymore" in D6), but not delete the field itself.

We'll probably have a need for module-defined instances to be 'undeletable through Field UI', but it's not implemented yet.

Anonymous’s picture

Rather than lose detail in the issue queue, I've created a short wiki page to document what migration from D6 to D7 taxonomy could look like. Please update it if you have ideas or details.

http://groups.drupal.org/node/24145

Anonymous’s picture

I have a big new question today.

Currently, when terms appear on a node, they are rendered as a link to the term's path. When the term is part of a vocabulary with a MODULE property ≠ 'taxonomy', taxonomy_term_path calls MODULE_term_path. Of course, people override this behavior in two ways: hook_link_alter and hook_term_path.

Now that the term is/should be a field, we have field API structures to reckon with: Fields, field instances and formatters. For example, you can choose a term field formatter that renders a term as plain text! No link at all.

Ignoring the menu callbacks for just a moment, please consider the term's path. Is it:

  1. An intrinsic property of the term itself? (Like it is now, kinda sorta.)
  2. A property of the field?
  3. A property of the field instance?

If you choose either of the last two, a term's path is specific to its context in content, it's a field formatter problem and #113614: Add centralized token/placeholder substitution to core has some interesting relevance.

Now returning to the menu callbacks. Taxonomy module will still have callbacks for taxonomy/term/% list pages. If we make the link to these pages into a property of the field or field instance, taxonomy.module's hook_menu/hook_menu_alter implementation should do something like this:

  1. Read all term fields.
  2. Iterate through them looking for their link path equal to 'taxonomy/term/%taxonomy_term'
  3. Any field (instance) which links to that path will return to hook_menu in an array of field names (and bundles).
  4. This array of field names (and bundles) will be passed as arguments to to taxonomy_term_page so it can generate its list of nodes (and possibly non-nodes).

If all this messing about is done in hook_menu and hook_menu_alter, it's cached. If taxonomy_term_page knows the field value, field name and bundle, it doesn't need to go looking for that information again.

What about field permissions? If a field can only be read by authenticated users, should anonymous users see a node where that field matches in the list?

I see another problem with this. If a term's link is specific to a field, what is a term's path according to hook_term_path for? And who on Drupal's green earth would expect field info to be used in generating menu structure? Is it weird and wrong to think we should be implementing menu callbacks on behalf of fields where their instance values are page arguments?!

catch’s picture

The only situation where we could ever think about using multiple field tables to produce taxonomy/term/%taxonomy_term is with materialized views in core, which doesn't seem likely at this point. {taxonomy_term_node} isn't exactly a performance star either but if we regress from there then someone will mark this issue won't fix very quickly and it might even be me.

At the moment, I'm thinking we should use parallel storage - i.e. each term field gets it's own field table as usual, but we also provide {taxonomy_term_$entity} tables with object_id and term_id - these would be created when an instance is attached to the entity type.

taxonomy_menu() could then provide taxonomy/term/%/$entity paths as tabs, defaulting to node. Then any other options we can leave to contrib (Views, Panels and then the various taxonomy_* modules) - but that gets us standard field storage and object structure when displaying entities, but also our legacy tables for looking up object ids.

Anonymous’s picture

Catch:

Got it.

In #52 yched alluded to the possibility of using the one-default-field-per-vocabulary as the source of taxonomy/term/[tid] results. This means that taxonomy/term/[tid] would return entities where this original/default field instance value is [tid].

I think you're suggesting that taxonomy/term/[tid] needs to return all entities (of the same kind) where any field has [tid] for its value.

Terms are only in one vocabulary at a time, so only one field will need to be queried value. If we cache the field names in the menu system and pass them as an argument with the term ID, there's no need for parallel storage if we follow yched's suggestion, is there? It's one query either way.

I think yched's idea is simpler that relies on some sensible default fields and instances that TOGETHER behave similarly to Drupal 6 taxonomy. Your idea (if I understand it correctly) is all about getting Drupal 7's taxonomy/term/[tid] callbacks to behave like they do in Drupal 6 regardless of how many instances a given vocabulary's term field might exist in any given bundle.

catch’s picture

I could happily live with yched's approach as well (but had sadly forgotten it when thinking of the above). It'd both be simpler and potentially a bit faster since field tables don't hold data for older revisions and we could just query on nid instead of vid - along with smaller table sizes.

One other thing is forum module will have to play nicely with this as well.

catch’s picture

catch’s picture

FileSize
6.77 KB
Failed: Failed to install HEAD. View

OK here's a fresh patch now that terms as fields and the autocomplete widget are in - just working on the upgrade path for now although this makes some assumptions on how things will work in general.

# We use yched's suggestion of a field per vocabulary - also discussed this with merlinofchaos in irc and he'd prefer field per vocab vs. term_node since it simplifies some queries.
# The upgrade path creates a field for each vocabulary, and an instance for each node type the vocabulary is assigned to (maintaining settings where they map to field API settings).
# Doing a direct database query for each vocabulary to migrate the data, unfortunately due to 'delta' we can't do an INSERT INTO SELECT FROM - so this is going to have to be batched at least for the inserts. Batch API is still a TODO - but posting what I have since it works in testing so far.

# Dropping all 'widgety' settings from {taxonomy_vocabulary} and the entire {vocabulary_node_type} table, so far left vocabulary.weight - not sure if there's use cases for that outside field conversion (we order the vocabulary listings by weight and allow you to drag and drop them for example).

There may well be issues with the upgrade path here apart from doing the batch, but the primary job to do here is to move taxonomy/term/% listings to use field_attach_query() - we'll only load terms from the default per-vocabulary field so the query should be as simple as the one in HEAD currently, just via field API instead. Then it's a case of ripping out taxonomy_node_* and supporting code once we have alternatives in place, and fixing tests, and field UI in core...

A few questions which I think have been raised here before, but reiterating:

formatters - the default term formatter points to taxonomy/term/% - but if we only show values there for per-vocabulary fields, that might be a bit strange. Do we default formatters to plain text rather than the link? What to do about taxonomy_term_path()?

taxonomy/term/% only applies to nodes - do we need to add per-entity tabs here? (so taxonomy/term/%/$entity_type) - or do we assume that site builders attaching terms to users will build their own listings with Views or whatever (I'd tend towards the latter).

moshe weitzman’s picture

I agree that we should let contrib add tabs (ornother UI) for non-node term listings.

batch api example module - http://cvs.drupal.org/viewvc.py/drupal/contributions/docs/developer/exam...

IMO taxonomy_term_path() should go and we should do #320331: Turn custom_url_rewrite_inbound and custom_url_rewrite_outbound into hooks. not sure we can count on that tho.

webchick’s picture

Subscriiiiiiiiibe!

Anonymous’s picture

@moshe in #62: Links for terms are now open to a whole bunch of interpretations. Term links displayed through term field formatters could possibly be instance-specific; they could even use tokens (?). Otherwise, taxonomy/term/% is simply the term equivalent of node/%: a definitive menu callback. The only uses of hook_term_path I've seen are image module and forum module. I'm sure there are others. With this in mind, I agree that it could be handed off to something like the hook proposed in #320331: Turn custom_url_rewrite_inbound and custom_url_rewrite_outbound into hooks. As it is, it's conceptually muddled for module developers and a reminder of the strange incompleteness of module-owned vocabs.

@catch in #61: The menu callback, field_attach_query business should be documented as an ideal example of this fundamental tactic, and I don't know if it's necessary to push it on users, other terms, whatever bundles yet in core. At some point, it becomes search, doesn't it?

catch’s picture

Status: Needs work » Needs review
FileSize
50.9 KB
Failed: Failed to install HEAD. View

Kept going with this, approaching a working patch, still very much in progress though and haven't even opened up taxonomy.test yet.

Status so far:

Upgrade path works, but needs batch API added in.

All taxonomy_node_* and supporting code is removed, with the exception of taxonomy_select_nodes() which we leave in modified form to allow for taxonomy/term/% listing pages.

Hit one major snag with #552716: Impossible to join or order by on field tables without using Views - explanation is there on the limitation in field API, we have two options to get around it here:

a. First get all nids attached to a term, then do a pager query on them with IN ($nid). This should work out faster than our current taxonomy_select_nodes() query (which causes a temp table/filesort) - but if a term has 100,000 records associated with it, we're loading all that data into memory , then risking max_allowed_packet errors in the IN() query.

b. Use hook_field_attach_insert() or hook_field_insert() + field schema + field indexes to add $entity->status $entity->timestamp and $entity->sticky columns to vocabulary field tables. This combined with the $pager option to field_attach_query() would let you do the same query, although while writing this, it still doesn't help with addTag('node access') so we're more or less at a dead end there.

c. Maintain term_$entity tables as well as field storage and use those for the listing pages - gets us back to the old-style query which was bad anyway.

The potential memory issues with taxonomy_select_nodes() are less than we had with taxonomy_get_tree() on that same page before that was removed, or some other situations in core, so I'm sticking with (a) until someone comes up with a better idea - and the query will at least be indexed in both cases, whereas it isn't now.pwolanin, bjaspan and bangpound sanity checked all this on irc - both bjaspan and pwolanin also thought about the IN() query which I'd previously discounted, but doesn't seem all that bad considering current alternatives.

Major TODOs:

1. Manual testing of the upgrade path, probably with field UI installed.
2. Benchmark taxonomy_term_page() and normal term listings with a before/after reasonably large dataset.
3. Figure out what to do with term paths/formatters if a term isn't in the default vocabulary fields
4. Actually create default vocabulary fields on vocabulary creation as well as in the upgrade path.
5. Consider whether we want to leave any of the other helper functions in modified for field API, like taxonomy_term_count_nodes().
6. Fix forum.module
7. fix blogapi.module
8. fix tests.
9. Help text needs a complete rewrite (string freeze rather than code freeze please)

Marking to CNR just to see how bad the test breakage is.

Status: Needs review » Needs work

The last submitted patch failed testing.

catch’s picture

Status: Needs work » Needs review
FileSize
54.75 KB
Failed: 11661 passes, 242 fails, 204 exceptions View

Fixed taxonomy.install and default profile, which meant implementing #4 from the TODOs above. Hopefully test bot will fail a bit more productively this time.

Status: Needs review » Needs work

The last submitted patch failed testing.

catch’s picture

FileSize
70.25 KB
Failed: 11947 passes, 275 fails, 192 exceptions View

Fixed up a lot more tests, need to work out the structure for submitting option selects and taxonomy autocomplete forms for the last one.

Anonymous’s picture

Catch:

On your patch:

  1. get rid of $form['settings'] in taxonomy_form_vocabulary.
  2. weight should ultimately be dropped.
  3. upgrade did not work at all for me, but i didn't realize it and kept no error messages.
  4. _taxonomy_term_select is a horrible scar but also an object lesson about avoiding procrastination.

I'm looking at forum.module. What a sicko.

The forum_nav_vocabulary variable could be more reasonably expressed as a hard coded field name. Or we could stash the name of the field in this variable. The nodes which are eligible to be in a forum are determined by the nodes in a vocabulary object, so I'd really like to see this vocabulary's field get a name like 'forum_'. $vocabulary->machine_name.

_forum_node_check_node_type and all of this jiggery-pokery can go away:

  $vid = variable_get('forum_nav_vocabulary', 0);
  $forum_terms = taxonomy_node_get_terms_by_vocabulary($node, $vid);

All of the tricks to list 'active topics' happen with joins on node_comment_statistics. Is this the same as #552716: Impossible to join or order by on field tables without using Views? see forum_get_forums, forum_block_view, _forum_topics_unread.

Given the fact that there are no non-SQL field storage engines yet and we have no idea how a non-SQL field storage engine could support something like joins, sorts and conditions, can't we add a dependency on field_sql_storage to forum and use _field_sql_storage_tablename and _field_sql_storage_revision_tablename to construct these queries? Or is this a big lame bangpound--? I shouldn't have said anything, I know...

catch’s picture

Status: Needs work » Needs review
Issue tags: +DIE
FileSize
73.24 KB
Failed: 11998 passes, 242 fails, 181 exceptions View

New patch:

Vocabulary settings completely removed!

Taxonomy tests are all passing apart from notices in taxonomy_field_is_empty() (undefined string offset 0) - I think this is when submitting a multi-value select.

_taxonomy_term_select() we can remove if we convert term->parents to a field (in terms of storage it's more or less the same, but I have a feeling field_attach_query will make it impossible to actually do without crazy refactoring we don't have time for...) - not going to die in this patch either way :(

Forum - yeah let's make it a special forum field referring to whichever vocabulary vocabulary, I'm a bit scared about forum containers too although the existing mechanism for those is disgusting, so it'll be hard to make it worse as long as it works.

Those node_comment_statistics() join queries are evil - so even if we have to replace them with a similar thing to what we're doing in taxonomy_term_page() it might be more scalable (more queries, but all indexed ones). The other option is we make forum.module maintain it's own denormalized table with the latest nodes, comments, authors and etc. for each forum so it can query that again on run-time. David Strauss has something like this already I think for Drupal.org.

Status: Needs review » Needs work

The last submitted patch failed testing.

Anonymous’s picture

_taxonomy_term_select() we can remove if we convert term->parents to a field (in terms of storage it's more or less the same, but I have a feeling field_attach_query will make it impossible to actually do without crazy refactoring we don't have time for...) - not going to die in this patch either way :(

Term parents are also the sideshow that makes Taxonomy module buckle with deep hierarchies. It's also the only typed relationship in Drupal core. It has its own problems, too... multiple parents is a big one! As much as I'd like to see it be a field, too, I think it's something we need to live with for a release cycle. RDF, object relations, alternative field storage, and all of Field API need to come to bear on this problem in Drupal 8.

catch’s picture

i'm wondering if we could do a half-arsed conversion to a field, keeping current storage and queries, but re-using the field widgets just to clean up the code / UI. Either way, doesn't need to be done here.

bjaspan’s picture

subscribe

catch’s picture

Status: Needs work » Needs review
FileSize
82.19 KB
Failed: 12063 passes, 138 fails, 147 exceptions View

Including changes from #554164: Join on {forum} instead of {term_node} for forum queries for now, a few more bits and pieces cleaned up as well.

Didn't look at providing a forum field yet, but that's probably the next step for reducing failures there. CNR for the bot.

Status: Needs review » Needs work

The last submitted patch failed testing.

webchick’s picture

Status: Needs work » Needs review

Oh, testing bot...

Status: Needs review » Needs work

The last submitted patch failed testing.

catch’s picture

Status: Needs work » Needs review
FileSize
75.71 KB
Failed: 11889 passes, 138 fails, 133 exceptions View

Forum patch is in, so re-rolled for that change. Also fixed last remaining notices in taxonomy.module.

So taxonomy.module, apart from a couple of minor TODOs is now fullly functional - we just need to fix all the bits of core which are using old APIs, which is unfortunately quite a lot. CNR for the bot only.

Status: Needs review » Needs work

The last submitted patch failed testing.

catch’s picture

FileSize
76.54 KB
Failed: Failed to apply patch. View

Started taking a crack at forum, what a horrible mess it is in there. This is nowhere near working, at all, and I ran out of steam about 10% of the way through.

We probably need to put in shadow handling back after #554164: Join on {forum} instead of {term_node} for forum queries broke it - I'm thinking a boolean 'shadow' field on the forum table, but didn't implement the schema changes or work out exactly how that would work ye, although there's some draft code for handling it everywhere apart from forum listings themselves.

Either we have to keep variable_get('form_nav_vocabulary', 0) or introduce a helper function to get the $vid allowed value from the field definition, thats a bit of a pain at the moment.

forum_containers is kept in a variable?? Who knew!

field_forum_navigation is too long for a fied name and a pain to type, that should probably be $node->field_forum instead.

While I was in there I realised forum_topic_navigation was still in core, had forgotten that existed and got a nasty reminder, so posted #556136: Remove theme_forum_topic_navigation() to just get rid of the thing.

This patch also updates default.profile for the default.install change. I'll probably not be able to look at this again until Sunday at the earliest and not too much after that either.

yched’s picture

Just wondering why patches #80 and #82 are approx. same size, while #82 is supposed contain lots of additional forum changes ? Are there missing bits in #82 ?

catch’s picture

Damn, looks like it's missing the hook_field_is_empty() and a couple of other small changes. I didn't touch the taxonomy hunks in the last patch though - so just applying the forum bits from #82 and everything else from #80 gets things as far as they are.

Anonymous’s picture

FileSize
85.85 KB
Failed: 11934 passes, 295 fails, 121 exceptions View

I did my best to merge the two patches according to what Catch said.

No changes though.

I'm going to be working on this tomorrow.

catch’s picture

Status: Needs work » Needs review

Looks like that fixed the broken patch, sorry :(

Marking CNR to see where we are with test failures.

Status: Needs review » Needs work

The last submitted patch failed testing.

Anonymous’s picture

FileSize
90.51 KB
Failed: Failed to apply patch. View

I'm blocked on an issue with #367595: Translatable fields. The issues are #557932: Taxonomy term field autocomplete widgets never validate, always lose values. and #557924: Options widget broken.

The patch here begins to fix blogapi module. It's against HEAD as of now (after the removal of the function registry).

It adds a tags module. It's probably not necessary, but I need some replacement for $node->taxonomy, and $node->taxonomy_vocabulary_tags is the winner. It would be great to just have $node->tags.

catch’s picture

I think node->taxonomy_vocabulary_tags we might be able to get away with shortening those to node->taxonomy_tags (that would also mean some ugly substr code I put in to stop field exceptions could die).

Anonymous’s picture

FileSize
86.03 KB
Failed: Failed to install HEAD. View

I just want to get the tests to run again. This patch removes consideration for blogapi which is no longer in core.

Anonymous’s picture

Status: Needs work » Needs review

Status: Needs review » Needs work

The last submitted patch failed testing.

Anonymous’s picture

Status: Needs work » Needs review
FileSize
87.06 KB
Failed: Failed to apply patch. View

New patch. Many tests still fail.

Status: Needs review » Needs work

The last submitted patch failed testing.

Anonymous’s picture

Status: Needs work » Needs review
FileSize
86.75 KB
Failed: 11960 passes, 279 fails, 103 exceptions View

grrr... one more time.

Status: Needs review » Needs work

The last submitted patch failed testing.

Anonymous’s picture

Status: Needs work » Needs review
FileSize
94.98 KB
Failed: 12335 passes, 28 fails, 75 exceptions View

I am nearly done with this patch, but I have unfortunately saved the most difficult part for last... the active topics block.

The entity loading patch that went in earlier today ... yipes. I hope I did all this right.

Status: Needs review » Needs work

The last submitted patch failed testing.

Anonymous’s picture

Status: Needs work » Needs review
FileSize
96.57 KB
Failed: 12300 passes, 21 fails, 40 exceptions View

Maybe all tests pass?

Status: Needs review » Needs work

The last submitted patch failed testing.

Anonymous’s picture

Simpletest's own tests are failing. When I investigated, the errors are:

SQLSTATE[42000]: Syntax error or access violation: 1103 Incorrect table name 'simpletest672694simpletest525064field_data_taxonomy_vocabulary_tags_2': CREATE TABLE {field_data_taxonomy_vocabulary_tags_2} ( `etid` INT unsigned NOT NULL COMMENT 'The entity type id this data is attached to', `bundle` VARCHAR(32) NOT NULL DEFAULT '' COMMENT 'The field instance bundle to which this row belongs, used when deleting a field instance', `deleted` TINYINT NOT NULL DEFAULT 0 COMMENT 'A boolean indicating whether this data item has been deleted', `entity_id` INT unsigned NOT NULL COMMENT 'The entity id this data is attached to', `revision_id` INT unsigned DEFAULT NULL COMMENT 'The entity revision id this data is attached to, or NULL if the entity type is not versioned', `delta` INT unsigned NOT NULL COMMENT 'The sequence number for this data item, used for multi-value fields', `language` VARCHAR(32) NOT NULL DEFAULT '' COMMENT 'The language for this data item.', `taxonomy_vocabulary_tags_value` INT unsigned DEFAULT NULL, PRIMARY KEY (`etid`, `entity_id`, `deleted`, `delta`, `language`), INDEX `taxonomy_vocabulary_tags_value` (`taxonomy_vocabulary_tags_value`) ) ENGINE = InnoDB DEFAULT CHARACTER SET UTF8 COMMENT 'Data storage for field 2 (taxonomy_vocabulary_tags)'; Array ( )

The table name simpletest672694simpletest525064field_data_taxonomy_vocabulary_tags_2 sure doesn't seem right, but it also doesn't look like it's really a problem with this patch.

yched’s picture

field_data_taxonomy_vocabulary_tags_2 is really long. It's within the length allowed by Field API, but if the field is part of a simpletested install, the simpletest prefix pushes us over MySQL's maw chars for tablenames.
It's strange that we seem to have 2 simpletest prefixes here, but maybe you could try naming the fields just 'a_[vocab_id]', just to see if it makes things better.

yched’s picture

oh, 2 prefixes is probably 'normal' in tests where simpletest is testing its own behavior :-(. Er, that's problematic...

Anonymous’s picture

Status: Needs work » Needs review
FileSize
95.98 KB
Failed: Failed to apply patch. View

I'm removing the string 'vocabulary_' from field names. It seemed needless anyway

Anonymous’s picture

FileSize
93.23 KB
Failed: Failed to apply patch. View

that last patch is going to fail to apply. this one should be better.

Status: Needs review » Needs work

The last submitted patch failed testing.

Anonymous’s picture

Those four test failures happen on a clean HEAD install. Someone committed bad code?

catch’s picture

Status: Needs work » Needs review

Putting back for a re-test in the hope it'll pass when HEAD is fixed.

Status: Needs review » Needs work

The last submitted patch failed testing.

Anonymous’s picture

Status: Needs work » Needs review
yched’s picture

Congrats on tests ;-)
Taxonomy and forum changes are largely over my head, I mainly skipped those parts, this review is only Field-oriented. Not much.

+++ modules/forum/forum.install	26 Aug 2009 12:51:37 -0000
@@ -26,7 +26,7 @@ function forum_enable() {
-    $vocabulary->nodes['forum'] = 1;
+    //$vocabulary->nodes['forum'] = 1;

leftover ?

+++ modules/taxonomy/taxonomy.module	26 Aug 2009 12:51:42 -0000
@@ -63,6 +63,56 @@ function taxonomy_entity_info() {
+  $result = reset(field_attach_query($field_info['id'], $conditions, FIELD_QUERY_NO_LIMIT));

FIELD_QUERY_NO_LIMITS can easily explode PHP memory. I(=t's mainly there for testing purposes, but should be avoided by most actual code. You should use the $count and $cursor param to loop through limited sets of results.

+++ modules/taxonomy/taxonomy.module	26 Aug 2009 12:51:42 -0000
@@ -437,6 +448,31 @@ function taxonomy_check_vocabulary_hiera
+function taxonomy_vocabulary_create_field($vocabulary) {
+  $field = array(
+    'field_name' => 'taxonomy_' . $vocabulary->machine_name,

Field names are 32 chars max, so we should document that vocabs machine names shouldn't be longer than 32-9 = 21 chars :-(, and add a validation step on the 'create vocab' form.

+++ modules/taxonomy/taxonomy.test	26 Aug 2009 12:51:42 -0000
@@ -483,22 +420,26 @@ class TaxonomyTermTestCase extends Taxon
+    file_put_contents('/tmp/test.txt', print_r($instance, 1));

Leftover ? + we have debug() now ;-) (and devel has dd())

Beer-o-mania starts in 4 days! Don't drink and patch.

catch’s picture

FileSize
91.98 KB
Failed: 12364 passes, 5 fails, 2 exceptions View

Addresssed yched's points apart from FIELD_QUERY_NO_LIMIT. 1. I need to check how the $limit and $cursor params work in practice 2. We already have to use that in a couple of other places, where it can't be removed unless we duplicate a lot of data somewhere. We could fix the usage here, but it's not actually going to improve things on a real site, since that site is bound to run into one of the other queries somewhere anyway.

Also found a few more things to cleanup.

I'm not sure how update.php is doing at the moment, but I'll try to generate some data and do benchmarks on taxonomy/term and if possible /forum later on to see how we're doing before and after.

diffstat: 11 files changed, 411 insertions(+), 1248 deletions(-)

Status: Needs review » Needs work

The last submitted patch failed testing.

Anonymous’s picture

How do we deal with forum shadow topics?

Here's my idea. It's based in imagination so if it doesn't/can't work, I'll hear your ideas.

The forum table has nid, (node) vid and tid. The tid is the real forum term.

The field for the forum vocabulary (taxonomy_forums) has a cardinality of unlimited values.

The forum for a topic is the tid from the forum table. This part already works. The breadcrumb is made from the tid value in the forum table.

Shadow topics are those term values in taxonomy_forums that are not the tid from the forum table.

The way to handling shadow topics is to keep taxonomy_forums field with unlimited cardinality. When the node form is generated, stash these values in the form. Set the taxonomy_forums element's #multiple = FALSE and #default_value = the tid value from the forum table.

When the form is submitted, compare the stashed values of taxonomy_forums field, the shadow checkbox, and the submitted single value for taxonomy_forums. Change the taxonomy_forums field to have all of its values that were set aside earlier.

If the submitted value of taxonomy_forums is different, set the $node->forum_tid to be this new value.

if the shadow checkbox is FALSE, remove the original forum_tid from the taxonomy_forums values.

Is this making sense?

moshe weitzman’s picture

IMO, shadow topics are very esoteric and can be dropped as a feature.

catch’s picture

FileSize
92.68 KB
Failed: Failed to apply patch. View

Unbroke forum tests.

Shadow topics are supported by phpbb, so I'd rather retain support if possible.

Anonymous’s picture

Status: Needs work » Needs review

Let's see those test failures... if any.

Anonymous’s picture

FileSize
92.81 KB
Failed: Failed to apply patch. View

This patch implements shadow topic support. Basically the taxonomy_forums field has multiple cardinality but the widget-element is altered to be single valued. I really don't know why it works, but it does. The eyeball test and the simpletests verify this works...

Status: Needs review » Needs work

The last submitted patch failed testing.

Anonymous’s picture

FileSize
93.17 KB
Failed: Failed to apply patch. View

Let's try this one.

Anonymous’s picture

Status: Needs work » Needs review

Status: Needs review » Needs work

The last submitted patch failed testing.

Anonymous’s picture

Status: Needs work » Needs review
FileSize
92.93 KB
Failed: Failed to apply patch. View

Ok. Finally!

Anonymous’s picture

The shadow topic feature of forums is present but only partially functional. You can move a topic from one forum to another and the topic "remembers" its old forum but the shadowed forums won't list the topic. Only the current forum will list the topic.

I'm committed to keeping this feature, and I know what needs to be done (see taxonomy_select_nodes() in the patch).

(bangpound pleads for mercy and the kind, thoughtful discretion of our beloved committers.)

catch’s picture

Just to clarify on shadow forum topics, they're broken in HEAD due to #554164: Join on {forum} instead of {term_node} for forum queries and the current state of this patch doesn't make them any worse at all. We used to join on both {forum} and {term_node} then use the disparity to work out if a topic was shadow or not - doing that is one of the reasons for forums performance, and I think we can put it back a lot faster (given the constraint of never being able to join on a field storage table). I'm a bit worried about doing that in this patch though given impending deadlines, and the number of API functions this patch changes and removes along with $node structure changes - whereas shadows will be a straight bugfix post-freeze. Ideally, we'd open a separate issue and start working on that now - so there's progress there without adding noise to this issue so the other changes can be reviewed properly.

Anonymous’s picture

catch’s picture

FileSize
477.65 KB
548.06 KB
571.98 KB
688.12 KB
598.81 KB
709.9 KB
567.81 KB
629.93 KB

Here's some preliminary benchmarks for taxonomy term pages. Note we're benchmarking both the change in query to get the list of nodes attached to a term, and the change in how the terms on those nodes are loaded and rendered, so a few variables to consider.

Generated 20,000 nodes and 10,000 taxonomy terms. D6 upgrade path and devel generate currently aren't working with HEAD to add term_node records, so did the following manipulating it directly:

Associated all 20,000 nodes with term/1

Associated 5,000 nodes with term/2

Associated 200 nodes with term/3

Associated 10,000 nodes with tids 1-10,000

mysql> SELECT COUNT(nid) FROM taxonomy_term_node;
+------------+
| COUNT(nid) |
+------------+
|      34995 | 


mysql> SELECT COUNT(nid) FROM taxonomy_term_node WHERE tid = 1;
+------------+
| COUNT(nid) |
+------------+
|      20000 | 
+------------+
1 row in set (0.01 sec)


mysql> SELECT COUNT(nid) FROM taxonomy_term_node WHERE tid = 2;
+------------+
| COUNT(nid) |
+------------+
|       4999 | 
+------------+
1 row in set (0.00 sec)


mysql> SELECT COUNT(nid) FROM taxonomy_term_node WHERE tid = 3;
+------------+
| COUNT(nid) |
+------------+
|        196 | 
+------------+
1 row in set (0.00 sec)

mysql> SELECT COUNT(nid) FROM taxonomy_term_node WHERE tid = 4;
+------------+
| COUNT(nid) |
+------------+
|          1 | 
+------------+
1 row in set (0.00 sec)

Then ran ab -c1 -n1000 on each of these pages. Then applied the patch, ran update.php and did the same thing:

Results:

taxonomy/term/1
HEAD:
Document Path:          /taxonomy/term/1
Document Length:        14797 bytes

Concurrency Level:      1
Time taken for tests:   482.690 seconds
Complete requests:      1000
Failed requests:        0
Write errors:           0
Total transferred:      15262000 bytes
HTML transferred:       14797000 bytes
Requests per second:    2.07 [#/sec] (mean)
Time per request:       482.690 [ms] (mean)
Time per request:       482.690 [ms] (mean, across all concurrent requests)
Transfer rate:          30.88 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.0      0       0
Processing:   449  483  19.8    480     604
Waiting:      432  464  19.3    461     584
Total:        449  483  19.8    480     604


Patch:

Concurrency Level:      1
Time taken for tests:   1105.669 seconds
Complete requests:      1000
Failed requests:        0
Write errors:           0
Total transferred:      11338000 bytes
HTML transferred:       10873000 bytes
Requests per second:    0.90 [#/sec] (mean)
Time per request:       1105.669 [ms] (mean)
Time per request:       1105.669 [ms] (mean, across all concurrent requests)
Transfer rate:          10.01 [Kbytes/sec] received


taxonomy/term/2

HEAD:
Concurrency Level:      1
Time taken for tests:   485.358 seconds
Complete requests:      1000
Failed requests:        0
Write errors:           0
Total transferred:      17839000 bytes
HTML transferred:       17374000 bytes
Requests per second:    2.06 [#/sec] (mean)
Time per request:       485.358 [ms] (mean)
Time per request:       485.358 [ms] (mean, across all concurrent requests)
Transfer rate:          35.89 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.0      0       0
Processing:   457  485  17.2    483     599
Waiting:      439  466  16.7    464     579
Total:        457  485  17.2    483     599

Patch
Concurrency Level:      1
Time taken for tests:   370.349 seconds
Complete requests:      1000
Failed requests:        0
Write errors:           0
Total transferred:      11528000 bytes
HTML transferred:       11063000 bytes
Requests per second:    2.70 [#/sec] (mean)
Time per request:       370.349 [ms] (mean)
Time per request:       370.349 [ms] (mean, across all concurrent requests)
Transfer rate:          30.40 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.0      0       0
Processing:   341  370  17.1    368     464
Waiting:      330  358  16.6    355     450
Total:        341  370  17.1    368     464


taxonomy/term/3

HEAD:
Concurrency Level:      1
Time taken for tests:   246.607 seconds
Complete requests:      1000
Failed requests:        0
Write errors:           0
Total transferred:      18597000 bytes
HTML transferred:       18132000 bytes
Requests per second:    4.06 [#/sec] (mean)
Time per request:       246.607 [ms] (mean)
Time per request:       246.607 [ms] (mean, across all concurrent requests)
Transfer rate:          73.64 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.0      0       0
Processing:   223  247  15.2    244     333
Waiting:      207  228  14.6    226     314
Total:        223  247  15.2    244     333


Patch:

Concurrency Level:      1
Time taken for tests:   105.471 seconds
Complete requests:      1000
Failed requests:        0
Write errors:           0
Total transferred:      11435000 bytes
HTML transferred:       10970000 bytes
Requests per second:    9.48 [#/sec] (mean)
Time per request:       105.471 [ms] (mean)
Time per request:       105.471 [ms] (mean, across all concurrent requests)
Transfer rate:          105.88 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.0      0       0
Processing:    84  105 195.6     94    6038
Waiting:       78   98 189.3     87    5848
Total:         84  105 195.6     94    6038


taxonomy/term/4

HEAD:
Concurrency Level:      1
Time taken for tests:   213.570 seconds
Complete requests:      1000
Failed requests:        0
Write errors:           0
Total transferred:      8562000 bytes
HTML transferred:       8097000 bytes
Requests per second:    4.68 [#/sec] (mean)
Time per request:       213.570 [ms] (mean)
Time per request:       213.570 [ms] (mean, across all concurrent requests)
Transfer rate:          39.15 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.0      0       0
Processing:   193  213  13.5    210     325
Waiting:      177  196  13.1    194     302
Total:        193  214  13.5    210     325


Patch:
Concurrency Level:      1
Time taken for tests:   68.467 seconds
Complete requests:      1000
Failed requests:        0
Write errors:           0
Total transferred:      7045000 bytes
HTML transferred:       6558000 bytes
Requests per second:    14.61 [#/sec] (mean)
Time per request:       68.467 [ms] (mean)
Time per request:       68.467 [ms] (mean, across all concurrent requests)
Transfer rate:          100.48 [Kbytes/sec] received

Query log / page execution time / memory usage screenshots attached for all four pages, HEAD and patch.

I'm fairly sure that the HEAD queries will get slower the higher table cardinality (and result sets, due to temp table/filesort), and use less memory. Whereas the patch should only get slower and worse memory usage with high result sets since they're better indexed, but more bloated in terms of result sets.

yched’s picture

The performance of the filed_attach_query() part (the hook_field_storage_query() line in the requests table) is dragged down by the fact that indexes on field_sql_storage.module data tables are sub-optimal. Notably missing an index on 'bundle'.
We probably need to index all meta columns (et_id, entity_id, deleted, etc...) separately - indexing of the 'data columns' is left to hook_field_schema().

[edit: happens in _field_sql_storage_schema()]

catch’s picture

FileSize
92.95 KB
Failed: Failed to apply patch. View

And some fixes to upgrade path found when sorting this out.

pwolanin’s picture

applying the patch - I find the manage fields tab confusing on taxonomy vocabularies - what am I supposed ot be doing here?

Adding a vocabulray is a bit counter-intuitive - I have to create a field on the node, and then pick the vocabulary. Can I ad the same vocabulary multiple times to the same node type?

Also, apparently I control wither this is free-tagging when adding the field (by picking the widget)?

So really, (guess this is to be expected) there is not much action on the vocabulary admin page.

pwolanin’s picture

Term links doesn't seem to work, e.g. taxonomy/term/1

Also, I can indeed add the same vocabulary to a single node type as multiple fields using multiple widgets - I guess this could be useful, but also seems potentially confusing.

catch’s picture

Ran explain on the various queries involved:

HEAD:


<code>
mysql> EXPLAIN SELECT DISTINCT n.nid AS nid, n.sticky AS sticky, n.created AS created FROM node n INNER JOIN taxonomy_term_node tn ON n.vid = tn.vid WHERE (n.status = 1) AND (tn.tid IN (1)) ORDER BY n.sticky DESC, n.created DESC LIMIT 0, 10;
+----+-------------+-------+--------+----------------------+------------------+---------+----------------+------+----------------------------------------------+
| id | select_type | table | type   | possible_keys        | key              | key_len | ref            | rows | Extra                                        |
+----+-------------+-------+--------+----------------------+------------------+---------+----------------+------+----------------------------------------------+
|  1 | SIMPLE      | n     | ref    | vid,node_status_type | node_status_type | 4       | const          | 9392 | Using where; Using temporary; Using filesort | 
|  1 | SIMPLE      | tn    | eq_ref | PRIMARY,vid          | PRIMARY          | 8       | const,d7.n.vid |    1 | Using index; Distinct                        | 
+----+-------------+-------+--------+----------------------+------------------+---------+----------------+------+----------------------------------------------+
2 rows in set (0.00 sec)

taxonomy_get_tids_from_nodes() also causes a filesort.

Patch:

mysql> EXPLAIN SELECT t.bundle AS bundle, t.entity_id AS entity_id, t.revision_id AS revision_id, e.type AS type FROM field_data_taxonomy_tags_2 t INNER JOIN field_config_entity_type e ON t.etid = e.etid WHERE (taxonomy_tags_value = 1) AND (type = 'node') AND (deleted = 0) ORDER BY t.etid ASC, t.entity_id ASC;
+----+-------------+-------+-------+-----------------------------+---------+---------+-------+-------+-----------------------------+
| id | select_type | table | type  | possible_keys               | key     | key_len | ref   | rows  | Extra                       |
+----+-------------+-------+-------+-----------------------------+---------+---------+-------+-------+-----------------------------+
|  1 | SIMPLE      | e     | const | PRIMARY,type                | type    | 767     | const |     1 | Using index; Using filesort | 
|  1 | SIMPLE      | t     | ref   | PRIMARY,taxonomy_tags_value | PRIMARY | 4       | const | 17683 | Using where                 | 
+----+-------------+-------+-------+-----------------------------+---------+---------+-------+-------+-----------------------------+
2 rows in set (0.03 sec)

Looks like indexes could be improved in field_sql_storage.module as yched points out.

mysql> EXPLAIN SELECT n.nid AS nid, n.sticky AS sticky, n.created AS created FROM node n WHERE (n.status = 1) AND (n.nid IN (1)) ORDER BY n.sticky DESC, n.created DESC LIMIT 0, 10;
+----+-------------+-------+-------+--------------------------+---------+---------+-------+------+-------+
| id | select_type | table | type  | possible_keys            | key     | key_len | ref   | rows | Extra |
+----+-------------+-------+-------+--------------------------+---------+---------+-------+------+-------+
|  1 | SIMPLE      | n     | const | PRIMARY,node_status_type | PRIMARY | 4       | const |    1 |       | 
+----+-------------+-------+-------+--------------------------+---------+---------+-------+------+-------+
1 row in set (0.00 sec)

And there's no taxonomy_get_tids_from_nodes() or taxonomy_term_load_multiple() because that's handled by the field cache.

@pwolanin

Manage fields on the taxonomy vocabulary page is for adding fields to terms - see #413192: Make taxonomy terms fieldable.

Free-tagging is now a widget setting, yes - a vocabulary doesn't need to know itself if it uses an autocomplete/create widget or not.

All vocabularies create a default field for that vocabulary, however bangpound designed it so you can indeed add more than one field per vocabulary - this allows for use cases like "Country I was born in" "Country I live in" "Countries I've visited" with separate form widgets and display settings, but all using the same countries vocabulary - which is handy, and also lends itself to defining RDF triples.

taxonomy/term/1 paths are working for me using upgraded content - however we only support taxonomy/term/1 for the default provided fields per-vocabulary - if you add additional fields, the nodes won't show up in that listing, same as we can't list users, or other terms, or comments there either.

yched’s picture

#562816: Poor performance on field_attach_query() should fix the performance issues for the field_attach_query() part (filesort + missing indexes)

catch’s picture

I opened #562824: Optimize term_node and forum topic queries - as mentioned there, may or may not be a prerequisite for a commit here, but this issue is already very long, and we need to do that anyway.

catch’s picture

Priority: Normal » Critical

Bumping this to critical - if we don't actually convert taxonomy_node_* stuff to field API this release, then we probably have to revert all the taxonomy_field patches in so far to avoid major usability wtfs.

Status: Needs review » Needs work

The last submitted patch failed testing.

catch’s picture

Status: Needs work » Needs review

Now now, test bot.

mlncn’s picture

What's the barrier for getting last-minute essential features committed to head?

That setting "Number of values: Maximum number of values users can enter for this field" actually sets a limit on how many autocomplete-style tags are saved is counter-intuitive to anyone used to D6 CCK, where more values mean more form fields, but, well, wow! That's awesome.

A bug that could be handled in a follow-up is that the autocomplete doesn't stop you from entering more terms than this restriction.

Tracking down why the taxonomy/term/[tid] listing pages being broken, a basic backtrace drupal_set_message('

'. var_export(debug_backtrace(),TRUE) .'

'); and a little more looking indicates that field_info_field() isn't returning anything in taxonomy_select_nodes(), and that's causing the error in field_sql_storage_field_storage_query().

ben, agaric

mlncn’s picture

Ah, OK-- term listing pages don't throw errors with upgraded, or all-new, terms. However:

* There is no interface for making a new taxonomy the default for a content type?

Very minor: The option of choosing which vocabulary goes with a taxonomy field is given twice.

catch’s picture

With #562824: Optimize term_node and forum topic queries we'll have (an improved version of) term_node back again - which will mean taxonomy/term/n could list all nodes attached to a term regardless of vocabulary. That'd solve the issue of content not showing up there because it's not attached to the default field.

mlncn’s picture

Status: Needs review » Reviewed & tested by the community

Marking this committable – it does the taxonomy conversion – with crazy follow-up work needed in other issues to make Field API fully up to the task, namely abovementioned #562824.

catch’s picture

I've updated #562824: Optimize term_node and forum topic queries with an in-progress patch. We should end up with both forum and taxonomy queries properly indexed when that's ready. I'm currently using the patch here as the basis for that patch - depending on whether this gets committed or not, I'll merge this back here, or re-roll afterwards.

webchick’s picture

Status: Reviewed & tested by the community » Needs work

This is a relatively cursory review; we're just on our way out to dinner so I had to cut it short before I could finish. However, I tried to focus on broken stuff and am pretty sure that once these are resolved, I've no other complaints about this patch.

+++ modules/forum/forum.admin.inc	28 Aug 2009 11:24:48 -0000
@@ -7,6 +7,7 @@
 
 function forum_form_main($type, $edit = array()) {
+  $edit = (array) $edit;
@@ -28,6 +29,7 @@ function forum_form_main($type, $edit = 
 function forum_form_forum(&$form_state, $edit = array()) {
+  $edit = (array) $edit;

What is this crap? :P

Also, while we're here, can we remove that blank line above the function so the PHPDoc shows up?

+++ modules/forum/forum.install	28 Aug 2009 11:24:48 -0000
@@ -24,9 +24,7 @@ function forum_install() {
-    // Existing install. Add back forum node type, if the forums
-    // vocabulary still exists. Keep all other node types intact there.
-    $vocabulary->nodes['forum'] = 1;
@@ -35,17 +33,26 @@ function forum_enable() {
-      'nodes' => array('forum' => 1),

Does this not re-introduce #199373: Enabling forums after being disabled loses vocabulary association?

+++ modules/forum/forum.module	28 Aug 2009 11:24:48 -0000
@@ -164,28 +145,20 @@ function forum_init() {
 /**
- * _forum_node_check_node_type
+ * Check whether a content type can be used in a forum.
+ *
+ * @TODO DIE.

Er. DIE? This is an important forum feature.

+++ modules/forum/forum.module	28 Aug 2009 11:24:48 -0000
@@ -224,15 +188,10 @@ function forum_node_view($node, $build_m
+      $node->taxonomy_forums[0]['value'] =  arg(3);

I don't like this; it ties the forum to the forum path. Can we derive this another way?

+++ modules/taxonomy/taxonomy.module	28 Aug 2009 11:24:49 -0000
@@ -666,331 +686,6 @@ function taxonomy_vocabulary_get_names()
 /**
- * Implement hook_form_alter().

The minus signs in this patch are GLORIOUS*!&@(*&*!(&@(

+++ modules/taxonomy/taxonomy.test	28 Aug 2009 11:24:50 -0000
@@ -276,27 +270,22 @@ class TaxonomyVocabularyUnitTest extends
-    $vocabulary_node_type = current(taxonomy_vocabulary_load_multiple(array($vocabulary1->vid), array('type' => 'article')));
-    $this->assertEqual($vocabulary_node_type, $vocabulary1, t('Vocabulary with specified node type loaded successfully.'));
+    $this->assertTrue(current(taxonomy_vocabulary_load_multiple(array($vocabulary1->vid), array('name' => $vocabulary1->name)))->vid == $vocabulary1->vid, t('Vocabulary loaded successfully by name and ID.'));

Ick. The old code was much easier to read. Can you please split them back up to separate lines?

+++ modules/taxonomy/taxonomy.test	28 Aug 2009 11:24:50 -0000
@@ -422,7 +366,7 @@ class TaxonomyTermTestCase extends Taxon
+    $edit["parent"] = $term1->tid;

single quotes.

+++ modules/taxonomy/taxonomy.test	28 Aug 2009 11:24:50 -0000
@@ -454,7 +398,7 @@ class TaxonomyTermTestCase extends Taxon
+    $edit[$this->instance['field_name'] .'['. $langcode .'][value][]'] = $term1->tid;
@@ -463,19 +407,11 @@ class TaxonomyTermTestCase extends Taxon
+    $edit[$this->instance['field_name'] .'['. $langcode .'][value][]'] = $term2->tid;
@@ -483,22 +419,25 @@ class TaxonomyTermTestCase extends Taxon
+    $edit[$instance['field_name'] ."[$langcode][value]"] = implode(', ', $terms);

watch your concatenation.

+++ modules/taxonomy/taxonomy.test	28 Aug 2009 11:24:50 -0000
@@ -483,22 +419,25 @@ class TaxonomyTermTestCase extends Taxon
+    $instance = $this->instance;
+    $instance['widget'] = array('type' => 'taxonomy_autocomplete');
+    $instance['bundle'] = 'page';
+    field_create_instance($instance);

Adding tags to 'page'?

This review is powered by Dreditor.

catch’s picture

FileSize
92.73 KB
Failed: Failed to apply patch. View
+++ modules/forum/forum.admin.inc	28 Aug 2009 11:24:48 -0000
@@ -7,6 +7,7 @@
 
 function forum_form_main($type, $edit = array()) {
+  $edit = (array) $edit;
@@ -28,6 +29,7 @@ function forum_form_main($type, $edit = 
 function forum_form_forum(&$form_state, $edit = array()) {
+  $edit = (array) $edit;

What is this crap? :P

This is the crap we have to introduce because we dropped this crap:

-  $items['admin/structure/forum/edit/%forum_term'] = array(
+  $items['admin/structure/forum/edit/%taxonomy_term'] = array(

And this:

 /**
- * Fetch a forum term.
- *
- * @param $tid
- *   The ID of the term which should be loaded.
- *
- * @return
- *   An associative array containing the term data or FALSE if the term cannot be loaded, or is not part of the forum vocabulary.
- */
-function forum_term_load($tid) {
-  return db_select('taxonomy_term_data', 't')
-    ->fields('t', array('tid', 'vid', 'name', 'description', 'weight'))
-    ->condition('tid', $tid)
-    ->condition('vid', variable_get('forum_nav_vocabulary', 0))
-    ->addTag('term_access')
-    ->execute()
-    ->fetchAssoc();
-}

However I worked out we only need to introduce one cast to array there, not two, so it's 1000% nicer than HEAD now instead of 500% :p

Also, while we're here, can we remove that blank line above the function so the PHPDoc shows up?

Yep.

Does this not re-introduce #199373: Enabling forums after being disabled loses vocabulary association?

It shouldn't do, because we replaced $vocabulary->nodes with field instances.

+++ modules/forum/forum.module	28 Aug 2009 11:24:48 -0000
@@ -164,28 +145,20 @@ function forum_init() {
 /**
- * _forum_node_check_node_type
+ * Check whether a content type can be used in a forum.
+ *
+ * @TODO DIE.

Er. DIE? This is an important forum feature.

Nice feature, but wasn't a very pretty helper function. However it's used in a lot of places, so I tidied up the PHPDOC a bit and removed the DIE.

+++ modules/forum/forum.module	28 Aug 2009 11:24:48 -0000
@@ -224,15 +188,10 @@ function forum_node_view($node, $build_m
+      $node->taxonomy_forums[0]['value'] =  arg(3);

I don't like this; it ties the forum to the forum path. Can we derive this another way?

Not invented here, and again it's an improvement on HEAD:

-      $node->taxonomy[arg(3)] = (object) array(
-        'vid' => $vid,
-        'tid' => arg(3),
-      );

So can we do it in a separate issue?

The minus signs in this patch are GLORIOUS*!&@(*&*!(&@(

We'll have to add some code back in #562824: Optimize term_node and forum topic queries but it's code which should've been there 2-3 releases ago, even with that, it's still a lovely diffstat.

Ick. The old code was much easier to read. Can you please split them back up to separate lines?

Done.

single quotes.

Done.

watch your concatenation.

Fixed.

Adding tags to 'page'?

Not quite, adding a new autocomplete widget for our test vocabulary to page. I left this how it was.

New patch.

catch’s picture

Status: Needs work » Needs review

Forgot status change.

catch’s picture

FileSize
115.7 KB
Failed: Failed to apply patch. View

I got #562824: Optimize term_node and forum topic queries to a point where it's got less bugs than this version, brings back shadow forum topics, and also solves the performance/memory issues in both HEAD and the current patch, so I think it's in a sufficient state to merge back to here. There'll be some cleanup to do regardless of how the patches go in, but I think if we can do this at once, it'll be straight bugfixing and tweaks more than any more huge changes. #144 is still good to go otherwise, I'll just need to re-roll this if we decide to commit that first.

summary of the new patch:

It creates taxonomy_index and forum_index tables owned by their respective modules.
These tables hold enough data, denormalized, to build the most expensive queries without adding extra joins.
This removes temporary tables, filesorts etc. from taxonomy/term/n, forum blocks, and forum topic lists. There are other potential queries which should be optimized in a followup patch.
In the process, it adds back some node specific code back to taxonomy.module - but with the crucial difference that this is only required to allow taxonomy module to provide out of the box listings of nodes like Drupal 6 does - i.e. we can eventually remove it when either materialized views or views or both is in core. The new code is specific to providing those listings efficiently, and not mixed in with taxonomy API functions like it used to be.. It also fixes long-standing and critical peformance issues with both modules, which have open issues elsewhere, and custom code on d.o to avoid them falling over.

int’s picture

Issue tags: +Exception code freeze

add tag

pwolanin’s picture

@catch - if we get access to SQL table info would we be able to remove this extra table info?

catch’s picture

pwolanin - we can, but keeping denormalized tables like this lets us fix some very nasty queries in the process. Also it doesn't solve the "showing all nodes tagged with a term regardless of which field was used" issue which you discovered on taxonomy/term/%term

Anonymous’s picture

Everyone:

This patch is postponed until #569224: Expose field storage details through field attach. is committed. All the panicked efforts to make this patch "work" before code freeze can be undone and made WAY more elegant and high performing after that.

catch’s picture

FileSize
117.95 KB
Failed: Failed to apply patch. View

Discussed this with bangpound and chx last night - while I think adding those hooks for field storage tables is a good thing, they don't help us much here:

1. Our existing queries for term listings and forum topics are some of the worst performing queries in HEAD.

2. To get taxonomy/term/%term listings working the same as D6, we'd have to join across all possible field tables which might reference a term ID (and taxonomy fields can potentially span multiple vocabularies or subtrees of vocabularies, so this would take quite some logic to determine which fields might potentially reference a term) - this would make things even worse than they are now

3. If we special-case to one vocabulary and do a join on just one field table, we get the same usability issue pwolanin ran across in http://drupal.org/node/412518#comment-1982796 of it 'not working'.

However, chx is very concerned that sites which actually do want to use pluggable field storage for term fields, say couchDB, can do so without having built-in overhead in taxonomy.module to maintain the denormalized tables - which is an extra insert on node_save() at least. This is fair enough.

So... new patch, which keeps the schema, has taxonomy_select_nodes() return early if using pluggable field storage (our answer to sites doing that is they should know how to use views or roll their own term listings in some other way), and the table is only updated if field_sql_storage.module is in use, via a conditional include. We don't support data migration between field engines or anything, so no need to do that here either.

While it's not too hard to do the same thing for forum.module, we'd pretty much have to add checks for this everywhere in forum.module - since all it really does is provide term listings and a few templates - so I think a hard requirement for field_sql_storage or at least these denormalized tables is in order there. However, if the same arguments apply, I'll do my best to move it out of there too.

chx and bangpound also mentioned on irc that we shouldn't try to introduce materialized views piecemeal into core without a centralized API. I vehemently disagree with this. Materialized views isn't even in contrib CVS yet, making it inaccessible to the very large majority of Drupal users, addtiionally David Strauss has said he's not yet happy enough with the maturity of the API to try to push it into core, and we've already introduced one lookup table with #105639: Tracker performance improvements. Either taxonomy.module provides indexed listings of nodes, or we remove that capacity from core (and tracker.module) and let mv and views handle it in contrib, but we have to stop shipping core Drupal versions with site-hosing queries once yet get above a few thousand items - especially when fixing them really isn't that hard. Since maintaining the tables is now optional, I hope that's enough of a compromise.

yched’s picture

catch: any idea if / how the special 'field_sql_storage' handling applies after #443422: 'per field' storage engine ? (I'm still fighting with the reroll, but there's a general agreement that we should get it in)

catch’s picture

yched, I'm not sure. We could potentially set/get any variable as to whether we want taxonomy module to denormalize field data or not, as opposed to the field_storage one. That would allow sites with their own materialized views to not use the core one, for example. That's a two-line change to the existing patch though, so would love feedback on the general idea.

yched’s picture

@catch: I'm not sure we're talking about the same thing. #443422: 'per field' storage engine is not about denormalization, but about field_a being stored by field_storage_sql and field_b stored by field_storage_couchdb instead of the current 'one storage backend rules them all'.

catch’s picture

FileSize
118.97 KB
Failed: Failed to apply patch. View

So if taxonomy_field_a can be in default storage, but taxonomy_field_b can be in couch_db, then the approach in the latest patch would still apply if we want taxonomy/term/%term to function - since joining between MySQL and couch DB we're not going to support if it's even theoretically possible, but I'd make it a variable_get('taxonomy_maintain_index', TRUE); instead of checking the field storage module variable. I've changed that in this patch, along with renaming the include to taxonomy.index.inc. Now any contrib module or site can stop taxonomy from maintaining this table and do this a different way - whether they're using views to provided different listings, materlaiized views to maintain their own lookup tables, or some other method - but core keeps the listings the same as D6 with much better performance.

Also took a first stab at the upgrade path, but getting batch API notices in 7004 which suggests update.php still doesn't work?

yched’s picture

OK, so when #443422: 'per field' storage engine lands, functions in taxonomy.index.inc will be wrapped in if ($field['storage']['type'] == 'field_sql_storage') { ? Or will the index include non local fields too ?

+ minor: it would be best IMO if all taxonomy_field_*() hooks were kept together in one place, instead of some in taxonomy.module and some loaded conditionally in taxonomy.index.inc.

batch API notices in 7004 ('simple', single pass update) ? or in 7005 (multistep) ?

catch’s picture

OK, so when #443422: 'per field' storage engine lands, functions in taxonomy.index.inc will be wrapped in if ($field['storage']['type'] == 'field_sql_storage') { ? Or will the index include non local fields too ?

Again, it depends how much we want to support Drupal 6 style taxonomy/term/%term listings of all nodes associated with a term (IMO they're useful and we probably do, but it'll be nicer when we generate them with default views). If the general agreement is that yes we do want to support those listings, then we need to maintain the index for all possible fields regardless of where the canonical data is stored, and let sites/modules disable this functionality if they have specific needs.

+ minor: it would be best IMO if all taxonomy_field_*() hooks were kept together in one place, instead of some in taxonomy.module and some loaded conditionally in taxonomy.index.inc.

Well we could put that variable check inside all the functions which were moved to taxonomy.index.inc. Hmm, looking at it it's < 10 functions so that might not be too horrible.

Batch API notices were in 7004.

Status: Needs review » Needs work

The last submitted patch failed testing.

chx’s picture

I am unhappy with only the variable and no table checks coming from #569224: Expose field storage details through field attach.. However, the Bible says

No patch can save the world, not even a Drupal core patch. If it works and does something useful on its own, then it is good to go. One of the worst things you can do is elaborate on other, equivalent approaches and suggest complex extensions to the patch. Additional features can be added later. A scalpel is often better than a Swiss Army Knife.

go ahead.

catch’s picture

Yeah I'm fine adding that additional check in when the patch is in, but don't want that issue to be a dependency.

Attached patch fixes the conflicts in forum.test, moves the taxonomy index stuff back into taxonomy.module, but with a new @defgroup and associated documentation block to explain what's going on, which also seemed an appropriate place to try to document the variable since we don't have obvious ways to do this otherwise.

The forum indexing should also be controlled by a variable probably, and that may need some cron indexing for sites which disable forum module, have people posting comments on the forum content, then switch it on again, which taxonomy doesn't - but webchick indicated in irc that this could probably be done in a followup. There's also a bunch of forum queries which could probably use the new table introduced to forum, but don't yet - however that's also performance tuning/internal factoring which we can do without the API changes here (and forum module works at least as well as in HEAD with this patch applied)

catch’s picture

Status: Needs work » Needs review
FileSize
118.58 KB
Failed: Failed to apply patch. View

Sorry, patch.

Status: Needs review » Needs work

The last submitted patch failed testing.

catch’s picture

Status: Needs work » Needs review
FileSize
118.69 KB
Failed: Failed to apply patch. View

Cross posted with the final dbtng patch... Should apply for real this time, and forum/taxonomy tests all pass.

Status: Needs review » Needs work

The last submitted patch failed testing.

catch’s picture

Status: Needs work » Needs review

Re-rolled.

catch’s picture

FileSize
118.72 KB
Failed: Failed to apply patch. View

And the patch, a week later.

Status: Needs review » Needs work

The last submitted patch failed testing.

catch’s picture

Status: Needs work » Needs review
FileSize
121.52 KB
Passed: 13673 passes, 0 fails, 0 exceptions View

rawhide :(

Dries’s picture

#443422: 'per field' storage engine has now been committed.

catch’s picture

After a lot of discussion with chx in irc, we agreed that the management of indexed tables needs to be handled by a variable (since you also might not want that if using materialized view API or similar) - however I think chx also wants an && $using_sql_storage in the conditions - so presumably we could get that information now per-field storage is in. The behaviour would then be, that we maintain those tables unless you disable it manually, or the field you're using doesn't use SQL storage - which would just be convenience for sites using pluggable storage so they don't need to change the value themselves. However it needs to be noted that it would then not be possible to both use these centralized lookup tables, and also store your fields outside SQL - I can't think of obvious use cases for that but would rather someone else agreed with me before another revision of this patch gets posted.

catch’s picture

FileSize
121.62 KB
Failed: 13436 passes, 0 fails, 69 exceptions View

New patch checks for field_sql_storage() on insert and update hooks. Since there's no hook_field_attach_delete(), I left the delete queries in - a no-op is going to be no-worse than looping through every single field to check the storage.

Status: Needs review » Needs work

The last submitted patch failed testing.

catch’s picture

FileSize
121.6 KB
Failed: Invalid PHP syntax in modules/forum/forum.admin.inc. View

Fix stupidity.

catch’s picture

Status: Needs work » Needs review

:(

yched’s picture

As a followup to #443422: 'per field' storage engine, we'll need to extend hook_field_storage_info() with a boolean 'local_sql' property, and remove the hardcoded checks against $field['storage']['type'] == 'field_sql_storage'. 'field_sql_storage' might not be the only local SQL storage backend.

Doesn't need to be done in this patch, though.

catch’s picture

That was one of my arguments against putting these checks in at all, and just putting in the variable. I leave it up to others whether we stick with the current patch and add the local_sql in later, or go back to the variable and at it in later.

Reviews here would be appreciated, less than two weeks of exception time left.

yched’s picture

Just throwing an explicit reminder that I'm not where I can easily review large patches, and also that I'm probably not the most qualified to greenlight this patch at this point. Last time I checked, it was OK from a Field API point of view.

So *other* reviewers needed :-p

catch’s picture

Yeah I didn't mean yched specifically ;)

Status: Needs review » Needs work

The last submitted patch failed testing.

andypost’s picture

FileSize
12.59 KB

Last patch #174 applies with offset

1) setup with normal profile successful, Tags created
2) New vocab created fine, tested text field addition
3) I found no ability to assign vocabulary to node-type
4) Trying to add existing text field to Tags vocab throw error when save field setting (/admin/structure/taxonomy/1/fields/field_test/edit?destinations[0]=admin/structure/taxonomy/1/fields)

FieldUpdateForbiddenException: field_sql_storage cannot change the schema for an existing field with data. in field_sql_storage_field_update_forbid() (line 226 of /var/www/d7/modules/field/modules/field_sql_storage/field_sql_storage.module).

5) Editing term form looks strange, theme changed from admin to garland and fields displayed in strange order (screen attached)
6) after saving new term and after editing no redirect and form fields not cleared

catch’s picture

3) I found no ability to assign vocabulary to node-type

This should be at admin/structure/types same as any other field.

4) Trying to add existing text field to Tags vocab throw error when save field setting (/admin/structure/taxonomy/1/fields/field_test/edit?destinations[0]=admin/structure/taxonomy/1/fields)

FieldUpdateForbiddenException: field_sql_storage cannot change the schema for an existing field with data. in field_sql_storage_field_update_forbid() (line 226 of /var/www/d7/modules/field/modules/field_sql_storage/field_sql_storage.module).

That's not good, but adding fields to terms isn't part of what this patch does - can you verify whether that error occurs in HEAD too?

5) Editing term form looks strange, theme changed from admin to garland and fields displayed in strange order (screen attached)
6) after saving new term and after editing no redirect and form fields not cleared

Same thing here - please confirm that these aren't existing bugs in HEAD.

Anonymous’s picture

Status: Needs work » Needs review
FileSize
121.47 KB
Passed: 13645 passes, 0 fails, 0 exceptions View

#570900: Destroy remnants of update_sql() meant the hook_update_Ns in this patch were old fashioned and (even worse) big failures. However, in my usage, update.php still showed some meaningless output when the updates succeeded. I don't know if this is the new Drupal way or if this is an issue with this patch. Please let me know.

This patch applies to HEAD and the relevant tests pass. Hopefully testbot will say the other tests pass, too.

andypost’s picture

@catch - previous test p.4-5 have trouble with core, so not related to patch

Another review #183

1) new vocab creation fine
2) new terms fine, except form fields still have old values (but this is a core bug - no time to search issue)
3) assigning to node type (Page) a field type taxonomy (checkboxes widget) - works but submit on third step (page settings) brings

    *  Notice: Array to string conversion in drupal_validate_utf8() (line 1152 of /var/www/d7/includes/bootstrap.inc).
    * Warning: preg_match() expects parameter 2 to be string, array given in drupal_validate_utf8() (line 1158 of /var/www/d7/includes/bootstrap.inc).
    * Notice: Array to string conversion in drupal_validate_utf8() (line 1152 of /var/www/d7/includes/bootstrap.inc).
    * Warning: preg_match() expects parameter 2 to be string, array given in drupal_validate_utf8() (line 1158 of /var/www/d7/includes/bootstrap.inc).

4) node/add/page - works

Error message
Warning: Invalid argument supplied for foreach() in options_buttons_elements_process() (line 135 of /var/www/d7/modules/field/modules/options/options.module).

5) nodes are created and navigates fine

6) new vocab, addition existing field (taxonomy from node-page) but with autocomplete widget - excelent, no warnings
7) new term - widget for term-field below Save button :(
8) Addition of term field to vocabulary works fine

Everything works fine except warning and notices

bensnyder’s picture

subscribe.

Anonymous’s picture

@andypost

The issues you're describing (in #184, nos. 3 and 4) are not related to this patch AFAIK. Can you open a new issue about them? I can have a look after this patch is committed as I'm pretty familiar with the widget and the term field.

andypost’s picture

@bangpound Summarized #595702: Warnings when adding taxonomy term-field with checkboxes wodget

So about this patch +1 to RTBC!
we found more comfortable to assign vocabularies to node-types with fields oposit old-style (editing vocabulary)

moshe weitzman’s picture

Status: Needs review » Reviewed & tested by the community

I read through this patch and I found the flow easy to follow and the code high quality. I added/edited/deleted content and terms without problem.

We are in a terrible twilight zone right now with taxonomy and I think it is important to commit this patch. We should performance optimize the upgrade path at a later time.

sun’s picture

FileSize
121.39 KB
Passed: 13687 passes, 0 fails, 0 exceptions View

This is incredible! The patch, indeed, looks totally awesome, and we should tackle any resulting/remaining issues in follow-ups.

Testbot doesn't seem to come back from re-testing. Same patch. Don't credit me.

Anonymous’s picture

I verified that the patch sun posted in #189 is identical to #183. (Excessive caution?) Thanks for the re-roll!

catch’s picture

Just a note on the upgrade path - it's already multi-pass, the issue is whether we can make it an INSERT INTO .... SELECT FROM instead, which requires finding a way to handle field deltas during that process if so, we should also look at doing the same for the body as field update (and that's a lot simpler since it doesn't have to worry about deltas).

These are the two main follow-up issues: #597736: Do forum indexing on cron, #597742: Optimize forum.module query performance.

catch’s picture

Removed a dead comment, a duplicate comment, and an spare apostrophe.

catch’s picture

Issue tags: +Favorite-of-Dries, +Fields in Core, +taxonomy fields, +DIE, +Exception code freeze
FileSize
121.25 KB
Failed: Invalid PHP syntax in modules/forum/forum.admin.inc. View

Removed a dead comment, a duplicate comment, and an spare apostrophe.

webchick’s picture

Ok. Spent a good hour doing a deep-dive on this patch. I had a few questions, and found a few very minor things. None were enough to hold up this patch, though. As soon as catch posts a quick re-roll and it gets green from the testing bot, this is going in.

Here were my comments from IRC:

Oct 08 01:09:06 <webchick>	$vocabulary->nodes['forum'] = 1;
Oct 08 01:09:10 <webchick>	What does that get replaced by?
Oct 08 01:09:12 <webchick>	the 'bundle' ?
Oct 08 01:09:55 <catch>	webchick: yes the bundle.
Oct 08 01:10:09 <webchick>	catch, So bundle is a single key? What if I want "tags" on both "Article" and "File"?
Oct 08 01:10:36 <catch>	webchick: field api allows you to attach a field Oct 08 01:10:41 <webchick>	catch, I guess I'd do two field_create_instance()s?
Oct 08 01:10:46 <catch>	yep.

...

Oct 08 01:20:22 <webchick>	hunks lke this make me smile:
Oct 08 01:20:23 <webchick>	 /**
Oct 08 01:20:23 <webchick>	- * _forum_node_check_node_type
Oct 08 01:20:23 <webchick>	+ * Check whether a content type can be used in a forum.
Oct 08 01:20:34 <webchick>	On what fricking planet is _forum_node_check_node_type acceptable PHPDoc

...

Oct 08 01:24:47 <webchick>	+    // If the node has a shadow forum topic, update the record for this
Oct 08 01:24:47 <webchick>	+    // revision.
Oct 08 01:24:47 <webchick>	+    if ($node->shadow) {
Oct 08 01:24:54 <webchick>	OMG! You maintained shadow topic support!*@&(
Oct 08 01:25:07 <catch>	webchick: actually this puts it back ;)
Oct 08 01:26:27 <webchick>	catch, oh, I remember that now. tee hee.

...


Oct 08 01:27:51 <webchick>	+/**
Oct 08 01:27:51 <webchick>	+ * Update the taxonomy index with the most recent o
Oct 08 01:27:51 <webchick>	 
Oct 08 01:27:54 <webchick>	that looks like a bug.
Oct 08 01:28:03 <catch>	The most recent o?
Oct 08 01:28:14 <webchick>	Yes. As opposed to that old crappy o.

...

Oct 08 01:29:18 <webchick>	+ * Comment module doesn't call hook_comment_unpublish() when saving individual
Oct 08 01:29:18 <webchick>	+ * comments so we need to check for those here.
Oct 08 01:29:18 <webchick>	+
Oct 08 01:29:24 <webchick>	catch, why don't we fix comment module..?
Oct 08 01:30:05 <catch>	webchick: this code is almost 1-1 for the tracker code, and yes we need to fix that.
Oct 08 01:30:15 <catch>	webchick: one week left!

// Note: Since this flew for tracker module, I'm going to let it fly here. But please file a follow-up about this.

...

Oct 08 01:32:36 <webchick>	catch, On a related note, why are we calling update_index directly?
Oct 08 01:32:43 <webchick>	Or wait, this is some other index that's not the search index.
Oct 08 01:33:05 <webchick>	This is that materialized views stuff, isn't it?
Oct 08 01:33:11 <catch>	webchick: yes.

...

Oct 08 01:36:40 <webchick>	catch, boy. there sure is some code in forum module that really makes absolutely no sense, huh?
Oct 08 01:36:51 <catch>	webchick: it's crazy in there.
Oct 08 01:37:02 <webchick>	I should apoligize to user module.
Oct 08 01:37:20 <catch>	webchick: in forum.test, there's some code which creates about 10 nodes, then views them in a loop, to test 'active forum topics'
Oct 08 01:37:28 <catch>	And active forum topics is about comments, not page views.
Oct 08 01:38:34 <webchick>	catch, LOL

...

Oct 08 01:38:22 <webchick>	catch, question about +function _forum_update_forum_index($nid) {
Oct 08 01:39:02 <webchick>	catch, Should this be (not in this patch) revamped to some generic "node statistics generator thingy" function?
Oct 08 01:39:11 <webchick>	catch, or is it pretty specific to forum?
Oct 08 01:39:50 <catch>	webchick: that's copied from tracker.module
Oct 08 01:40:00 <webchick>	Then.."yes" ;)
Oct 08 01:40:23 <catch>	webchick: yes. But probably when materialized views goes into core.
Oct 08 01:40:41 <webchick>	snicker. ;P
Oct 08 01:40:50 <webchick>	catch, so forum_index is not a direct copy/paste of tracker_index?
Oct 08 01:40:59 <webchick>	and they do make sense as separate functions?
Oct 08 01:41:32 <catch>	webchick: the tables are very different and only share a few fields.
Oct 08 01:41:43 <webchick>	catch, ok cool. thanks!

...

Oct 08 01:41:32 <webchick>	+      $edit["taxonomy_forums[$langcode][value]"] = $this->root_forum['tid']; // Assumes the topic is initially associated with $forum.
Oct 08 01:41:35 <webchick>	duplicated comment.
Oct 08 01:42:44 <catch>	webchick: mv would handle deciding whether to use one table or three, but it's better to keep this self-contained for now.
Oct 08 01:42:52 <webchick>	catch, yep, wfm.

...

Oct 08 01:42:30 <webchick>	catch, what's up with this hunk in node_revision_revert_confirm:
Oct 08 01:42:30 <webchick>	-  if (module_exists('taxonomy')) {
Oct 08 01:42:30 <webchick>	-    $node_revision->taxonomy = array_keys($node_revision->taxonomy);
Oct 08 01:42:30 <webchick>	-  }
Oct 08 01:43:03 <catch>	webchick: node->taxonomy sometimes has tids, sometimes term objects.
Oct 08 01:43:11 <catch>	webchick: all that stuff just dies when it's fields.
Oct 08 01:43:26 <webchick>	catch, But, this looks like a regression. Won't it fail to version the terms?
Oct 08 01:43:33 <webchick>	Or does this now get handled explicitly by field API
Oct 08 01:43:35 <catch>	webchick: field api does that for us
Oct 08 01:43:35 <webchick>	?
Oct 08 01:43:40 <webchick>	Oh! Nifty!

...

Oct 08 01:44:53 <webchick>	catch, 
Oct 08 01:44:53 <webchick>	+    // Restrict machine names to 21 characters to avoid exceeding the limit
Oct 08 01:44:53 <webchick>	+    // for field names.
Oct 08 01:45:02 <webchick>	how the heck was 21 of all possible lengths decided for Field API :P
Oct 08 01:45:04 <webchick>	do you know? ;)
Oct 08 01:45:16 <catch>	webchick: maximum limit for MySQL table names.
Oct 08 01:45:19 <catch>	I think.
Oct 08 01:45:20 <webchick>	Ahhh.
Oct 08 01:45:21 <webchick>	Makes sense.

...


Oct 08 01:50:56 <webchick>	catch, someone tested the upgrade path? Or it's a theoretical one? ;P
Oct 08 01:51:13 <moshe_work>	webchick: can't be tested. too slow.
Oct 08 01:51:18 <catch>	webchick: thoroughly tested. But not by me for a month - although bangpound did.
Oct 08 01:51:21 <webchick>	+ * This function requires taxonomy module to be maintaining it's own tables,
Oct 08 01:51:21 <webchick>	 <-- ITS
Oct 08 01:51:57 <moshe_work>	apostrophe police

...


Oct 08 01:53:17 <webchick>	+function taxonomy_vocabulary_create_field($vocabulary) {
Oct 08 01:53:24 <webchick>	looks like the indentation in that function needs to be fixed.

...

Oct 08 01:55:36 <webchick>	Oh yay! 25 screen-fulls of - signs!
Oct 08 01:55:39 <webchick>	This is the best patch ever. :D

Catch was able to address all of my "big" questions and the re-roll should take care of the tiny issues.

Status: Reviewed & tested by the community » Needs work

The last submitted patch failed testing.

webchick’s picture

Oh, well shoot. In my eagerness to see this committed, I forgot that I haven't yet checked the UI-related changes. And when I did, I found some bugs.

1. After running update.php, my "Tags" vocabulary on the Article type moved from the top of the list to below body. Possibly some kind of weighting issue in the update path.

2. Things get really crazy though, with Forum module:

Ick

As you can see, this is horribly broken. :P

The best thing I can figure out is that it somehow thinks "Tags" is the forum vocabulary, since all 3 of those things it's attempting to create links to have that vocabulary assigned to them.

We should limit this list to only node types, not other entities. If this is a huge hassle, we can save it for another patch.

catch’s picture

Status: Needs work » Needs review
FileSize
121.25 KB
Failed: 13666 passes, 0 fails, 13 exceptions View

I can't reproduce the syntax error in forum.admin.inc locally with php -l. This should fix the "post new $type" thing - for some reason that was checking the tags field, although I didn't yet restrict it to nodes only.

Status: Needs review » Needs work

The last submitted patch failed testing.

catch’s picture

Status: Needs work » Needs review
FileSize
121.27 KB
Passed: 13441 passes, 0 fails, 0 exceptions View

Grr, this one should pass.

webchick’s picture

Status: Needs review » Fixed

Excellent. Committed to HEAD!!

We need follow-ups for (at least):
* Limit types of things you can post to forums to nodes only, and add tests for this feature.
* Clean up comment.module so it fires hook_unpublish on single-saves as well.

catch’s picture

Status: Fixed » Needs review
catch’s picture

Status: Needs review » Fixed

Didn't mean to change status!

I also added a note to the upgrade docs. Since we removed hundreds of lines of code here, this just says "we removed a lot of stuff, please look at the issue or field API docs" at the moment.

giorgio79’s picture

Fantastic stuff! Looking forward to this.

This may not be the best place to ask, but it is organically following this issue so I Thought it is worth asking.

With this upgrade, on taxonomy term edit pages, will we be able to replace the large select boxes for selecting related terms and parent terms (in the advanced options) with CCK autocomplete for example?

At the moment the term edit page just loads all terms from the vocab in the select boxes which can be a performance and usability issue with larger vocabs for example.

catch’s picture

Related terms are no longer in HEAD.

Parent items, sadly, we didn't have time to convert to a field in Drupal 7 (or remove the horrible form code which generates that select list). We should look at performance improvements in D7 though, and term hierarchy is going to be a major TODO for Drupal 8.

giorgio79’s picture

Cool, thanks Catch.

I just found this issue related to to what we just talked about:
#526120: Remove related terms from core now that terms can have a term reference field

Anonymous’s picture

Parent items are going to be one tough job for Drupal 8.

I've heard people talk about a hierarchical field type. It took me a while to really understand what this means, and I still can't put it into a brief comment on an issue.

People are very dissatisfied with the performance of deep hierarchies in taxonomy.module vocabularies.

I predict that Drupal 8 taxonomy module needs to provide or use a field that has a different kind of storage pattern that it can somehow impose on the bundles it's attached to. We already have a simple way of expressing hierarchy, but it's inefficient. There are other ways that are more suitable for deep hierarchies, and they don't require recursion or multiple SQL queries. There are database models that are inherently hierarchical, and Field API is supposed to be agnostic about where field values are stored.

However, we still have to finish Drupal 7.

mfb’s picture

Follow-up bug fixes for forum.install and forum.module at #601938: Forum exceptions

Xano’s picture

Just poking a bit: a lot of contrib depends on taxonomy_term_count_nodes(). Any chance we can get it back?

catch’s picture

We could probably use the term_index table for that yeah. Can you open a new issue? It'd be putting back an old API rather than a new one, so if you can argue it's a regression probably OK for the spit and polish phase.

Xano’s picture

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

Gábor Hojtsy’s picture

Status: Closed (fixed) » Needs work

@webchick, @catch: seeing that no category information is printed with node feeds currently and no Feed UI exists (that I could find) to set that up, I was curiously looking for the patch that removed this functionality. Looking at this patch that was committed, it looks like the RSS category addition functionality was removed and no facility was added back in to reproduce it. Or am I missing something?

jessebeach’s picture

Here is the RSS 2.0 specification: http://www.rssboard.org/rss-2-0-1-rv-6

catch’s picture

Status: Needs work » Closed (fixed)

Gabor, please open a new issue for that if you think it's a regression, no need to re-open 200 comment issues.