The straight conversion of TOC from D6 to D7 does not utilise some of the major benefits of D7, particularly the Field API which allows a far better integration of 3rd party plugins into the management and handling of fields - with Body now being just another field.

So, I would like to suggest planning a complete refactoring.

Here's the off-the-top-of-my-head suggestions:

  • Field API offers hook_field_attach_view_alter() which allows the modification of field output. This is the best place to put the TOC processing (and adding the necessary JS & CSS, which is currently dumped on every page).
  • This will also mean that TOCs can be applied to any textarea type field on any entity.
  • We can hook into the field settings form and add any settings we require on a per-field basis.
  • One issue I came across because I use Display Suite is that TOC assumes there are only "node" and "teaser" view modes but there's RSS and Search as well, and DS can create others.
  • If it's feasible I'd like to get away from having an embedded [toc] completely. (But is that a good idea?)
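
For anyone following along, here is a minimal sketch of what hooking the TOC into hook_field_attach_view_alter() might look like. The hook signature and the '#field_type', '#markup' and '#attached' keys are standard D7; the module name "mymodule", the helper mymodule_build_toc() and the asset file names are assumptions made up for illustration, and the helper is only a placeholder that lists heading text without links.

```php
<?php

/**
 * Placeholder TOC builder (hypothetical): lists the text of each <h2>/<h3>.
 */
function mymodule_build_toc($html) {
  if (!preg_match_all('#<h[23][^>]*>(.*?)</h[23]>#is', $html, $matches)) {
    return '';
  }
  $items = '';
  foreach ($matches[1] as $heading) {
    $items .= '<li>' . strip_tags($heading) . '</li>';
  }
  return '<ul class="toc">' . $items . '</ul>';
}

/**
 * Implements hook_field_attach_view_alter().
 */
function mymodule_field_attach_view_alter(&$output, $context) {
  // This sketch only handles the "full" view mode.
  if ($context['view_mode'] !== 'full') {
    return;
  }
  $text_types = array('text_long', 'text_with_summary');
  foreach ($output as $name => &$field) {
    // Skip anything that is not a rendered textarea-type field.
    if (!is_array($field) || !isset($field['#field_type'])
      || !in_array($field['#field_type'], $text_types, TRUE)) {
      continue;
    }
    $found = FALSE;
    foreach ($field as $delta => &$item) {
      // Integer keys are the field deltas; '#...' keys are properties.
      if (is_int($delta) && isset($item['#markup'])
        && strpos($item['#markup'], '[toc]') !== FALSE) {
        $toc = mymodule_build_toc($item['#markup']);
        $item['#markup'] = str_replace('[toc]', $toc, $item['#markup']);
        $found = TRUE;
      }
    }
    unset($item);
    if ($found) {
      // Attach assets only where a TOC was actually inserted.
      $path = drupal_get_path('module', 'mymodule');
      $field['#attached']['css'][] = $path . '/mymodule.css';
      $field['#attached']['js'][] = $path . '/mymodule.js';
    }
  }
  unset($field);
}
```

Because the assets hang off the field's own render array, they only reach pages where the field is rendered with a TOC in it, and the same code covers any entity type that carries a textarea-type field.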

An example of what can be done with fields is here http://drupal.org/sandbox/tedbow/1208762 - Ted put it on d.o but I wrote it. It allows you to encrypt any field by hooking into the Field API.

What else?

Comments

adaddinsane’s picture

Hm, can't edit the original.

Okay in the last couple of hours I've been hacking the module to pieces and proved to myself that the hook_field_attach_view_alter() approach is valid. I removed all the input filter handling and put the creation of the TOC into the hook only. No frills, it just inserts the TOC wherever it finds [toc], and only if it's a "full" node.

Works a treat.

So, it seems to me that the action of the input filter should be to only insert the [toc] in the correct position with the required parameters.

The hook then decides whether to display or not, and uses the embedded parameters to style up the TOC - it can store the TOC using a standard D7 entity_type/bundle/field/delta/id-type table, with an MD5 marker based on the text to see if it's changed.
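
To make the "MD5 marker" idea concrete, here is a rough sketch of a cache ID built from the entity_type/bundle/field/delta/id tuple plus a hash of the text. The function names and the ':' separator are assumptions for illustration, not what the module actually does.

```php
<?php

/**
 * Builds the location part of a TOC cache ID (hypothetical helper).
 */
function mymodule_toc_cache_prefix($entity_type, $bundle, $field_name, $delta, $entity_id) {
  return implode(':', array($entity_type, $bundle, $field_name, (int) $delta, (int) $entity_id));
}

/**
 * Builds the full cache ID: location plus an MD5 of the field text.
 *
 * If the text changes, the MD5 changes, so the old entry is simply
 * never found again and a fresh TOC gets built.
 */
function mymodule_toc_cache_id($entity_type, $bundle, $field_name, $delta, $entity_id, $text) {
  $prefix = mymodule_toc_cache_prefix($entity_type, $bundle, $field_name, $delta, $entity_id);
  return $prefix . ':' . md5($text);
}
```

An ID like this could be passed to cache_get() and cache_set(); on a miss, entries sharing the same prefix are the stale ones and can be cleared.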

I'm omitting the block option, and the comments. I'm thinking that running the output through a "drupal_alter()" before rendering would allow other modules to do their thing.

OTOH: Since even the main content is now in a block, and blocks can be made to show only on specific node pages, perhaps the correct action is to put the TOC in a block always, the user can then configure it to be above/below or indeed anywhere.

Anyone else have any thoughts on this?

adaddinsane’s picture

Attachment (status: new): 21.78 KB

Right, I've attached a zip file of my mostly reconstructed tableofcontents module for D7.

Major changes:

  • Uses hook_field_attach_view_alter() to add the tableofcontents but still only adds it to a field called "body" and delta 0 only.
  • Works with any entity/bundle that has that field;
  • Replaced global "$toc" variable with global drupal_static();
  • Totally reworked the $toc variable and the admin page so they are the same structure instead of one being an array and the other being an object: which means (1) no longer pollutes the variables table with over twenty variables per input filter (now just one per filter); (2) easier to understand and modify; (3) removes massive chunks of variable_gets;
  • Got rid of the block module for now - but added a setting that renders the TOC but doesn't put it into the output which means a block module could now just grab the content from the cache and output it;
  • Changed the caching to a proper D7 cache;
  • Got rid of the "don't display on print page" code - that should be controlled by CSS;
  • Only puts the JS and CSS on the pages that need them, instead of every page;
  • All theme functions separated out into their own file;
  • All filter functions separated out into their own file;

Not working:

  • Can't specify which entities/bundles - this may not be necessary anyway.
  • Only renders the TOC for "full" view modes;
  • Modified JS (wasn't even standard D6) is not actually functioning right now;
  • Haven't added a drupal_alter() for modifying the TOC as yet;

That should keep you going.

If you think it's worth it, Alexis you might want to add it as a 7.x-2.x-alpha? Or something. Obviously it needs more testing than me just bashing it out in one website.

adaddinsane’s picture

Attachment (status: new): 21.8 KB

Turns out that version was a bit buggy. Been using it in a real site I'm developing and got those bugs out...

mwallenberg’s picture

I'm currently evaluating this version, and will post back here in case I run into any bugs, but so far it looks stable.

EDIT:
I'm starting a list of issues/fixes I find. Adaddinsane, feel free to roll them up into your zip, since I think you are the de-facto maintainer for now :)

  • in tableofcontents.filters.inc, line 18, tips callback has a typo in the function name. Change to: 'tips callback' => '_tableofcontents_filter_tips',
  • in tableofcontents.js, lines 12, 21, and 22, the parameter "self" used in the selection statements caused the show/hide link to not appear. Removing "self" from said lines fixes the issue. This also allows line 10 to be removed completely.

dmitriy.trt’s picture

Adaddinsane, thanks for your amazing work on this. I have a few fixes:

  • Hide/show link is fixed and works now.
  • ScrollTo and LocalScroll plugins are included only if scroll feature is enabled.
  • Limit LocalScroll to context and protect it with .once()
  • Tip callback typo is fixed, thanks to Mwallenberg.

Attaching full module and diff from #3.

adaddinsane’s picture

Thanks for your input guys - I'll add it to my to-do list :-/

Trouble is I really don't like what it's doing here, it's so hacky it's criminal.

There's got to be a better way.

What if it wasn't an input filter at all? What if it's a field and in the field settings you specify which other field you want to create the table of contents for. If it's a separate field then the developer/configurer has full control over where it goes using standard tools.

Trouble is it still has to read and modify the field it's TOCing. Aaargh. Hacky.

dmitriy.trt’s picture

What about splitting this module into multiple?
1. TOC API - basic API to

  • Get list of TOC settings profiles. There are way too many options to set them all again and again for each filter (automatic TOC, render/save to cache, min/max headers level, numbering mode, etc.)
  • Get settings for selected profile.
  • Generate HTML TOC from text content based on profile settings extended by inline [toc] settings
  • (a) replace [toc] with HTML TOC or (b) save it to static cache for other sub-modules and replace [toc] with harmless HTML comment indicating (1) that automatic TOC should NOT be added to the text and (2) containing ID to retrieve it from static cache.
  • Check if text content has [toc] tag or TOC was already generated from this text content

2. TOC Profiles UI. Settings forms to edit TOC settings profiles.

3. TOC Input Filter. It could just use the basic API: allow selecting a profile and what to do with the generated HTML TOC, then apply the profile to content at the display step.

4. TOC Field Formatter. It could alter some text formatters (using hook_field_formatter_info_alter() to inject our settings + hook_form_alter() to add a sub-form to the formatter settings form) or add a custom formatter. There will be no problems with the list of supported display types (the admin will just select different formatters for each display type). Then, apply the profile to field content if a TOC was not already generated for this text.

5. TOC Block. Just extracts all or first TOC from cache and renders it to block.

6. TOC Panels Pane. The same as block, but renders Panels content pane.

It's just a suggestion. What do you think about it?

adaddinsane’s picture

Hi Dmitriy

I'm liking that - profiles yes, separate blocks, & panes all good. Use HTML comments - definitely.

I can't quite get my head around the formatter idea, processing with formatters was something I toyed with but as we can't chain formatters I don't think it's a valid way to go (though that might be because I can't see what you're saying).

However we do need Input Filter action to process the headers in the body text. It is the correct place to do it. (Thinking about it, should each separate type of header process be a separate Input Filter? One to process anchors, another one to do the header numbering? Separation of function is always a good way to go.)

If we can break up the processing pipeline better then we can probably get somewhere positive:

1. Process field text headers to create unique anchors (add numbering, whatever) and include TOC meta-data using the HTML5 data-x attribute;
2. Build and cache a renderable array representing the TOC;
3. Use the array in some setting;

Step 1: Occurs in the input filter;
Step 2: Occurs in hook_field_attach_view_alter (probably);
Step 3: Occurs wherever it's required;
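
As a rough sketch of step 2, assuming step 1 has already produced a flat, document-ordered list of headings (each with an 'id', a 'level' and a 'text' key, all names made up here), the TOC can be built as a standard D7 'item_list' render array with nested 'children'. The function names are hypothetical.

```php
<?php

/**
 * Recursively nests a flat heading list by level (hypothetical helper).
 *
 * $i is the shared read position in $headings.
 */
function mymodule_toc_nest(array $headings, &$i, $parent_level) {
  $items = array();
  // Everything deeper than the parent level belongs under the parent.
  while ($i < count($headings) && $headings[$i]['level'] > $parent_level) {
    $h = $headings[$i++];
    $link = '<a href="#' . htmlspecialchars($h['id'], ENT_QUOTES, 'UTF-8') . '">'
      . htmlspecialchars($h['text'], ENT_QUOTES, 'UTF-8') . '</a>';
    $items[] = array(
      'data' => $link,
      'children' => mymodule_toc_nest($headings, $i, $h['level']),
    );
  }
  return $items;
}

/**
 * Builds a renderable array for the TOC (step 2 of the pipeline).
 */
function mymodule_toc_render_array(array $headings) {
  $i = 0;
  return array(
    '#theme' => 'item_list',
    '#type' => 'ol',
    '#items' => mymodule_toc_nest($headings, $i, 0),
    '#attributes' => array('class' => array('toc')),
  );
}
```

Since the result is an ordinary render array, step 3 can drop it into a field, a block or a pane and leave the final markup to the theme layer.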

I think we're making progress...

dmitriy.trt’s picture

Hi Steve,

Yes, we can't chain formatters, but we can inject custom form elements into any formatter settings form and then use these settings in hook_field_attach_view_alter() to generate the ToC. All required meta-info is already selected when the user sets up the formatter: entity, bundle, field instance and display mode. But it is probably too dirty a hack, I'm not sure. Maybe there is a better place for the settings form of this layer.

I've tried to use your version of the module, but had to modify it because of a problem introduced by the current "do each step on a separate layer" design. We're migrating from D6 and have a lot of content using an input format with the ToC filter (automatic ToC enabled). And "content" here means not only nodes, but also blocks and potentially a lot of other rich text data not related to entity fields. So, in addition to the expected result, we got blocks showing huge [toc ...] tags with all settings, because they were added at the filter layer and there was no second layer to replace them with HTML tags (or just remove them). It becomes impossible to use this input filter anywhere except for entity fields. What's more, we would have to restrict the text format to this single use-case, so life would become too complex.

Because of the problems described above I suggest going back to a "do the whole job on a single layer" design, but layers must communicate to prevent multiple automatic ToCs and multiple passes of header processing. Introducing multiple filters can make an administrator's life harder too. What about improving the internal design instead? Grouping text processing functions (the 1st, 2nd and probably 3rd step you've described) into a single class could help to get rid of (1) global variables and (2) functions communicating through static cache, including callbacks passed to preg_replace_callback() and similar functions. Instead of the described workarounds we could have a single class instance: initialize it with a profile name and the text for processing, then use it for checks, formatting, ToC extracting, caching, etc. What do you think?
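
A very rough sketch of the single-class idea, to show how the preg_replace_callback() callback can keep its state on the instance instead of in globals or a static cache. The class name, the method names and the 'min_level'/'max_level' settings keys are all invented for illustration; profile loading is left out, and the regex only handles headings without attributes.

```php
<?php

/**
 * One instance per piece of text (hypothetical class).
 */
class MymoduleTocProcessor {
  protected $profileName;
  protected $settings;
  protected $text;
  protected $headings = array();
  protected $processed = FALSE;

  public function __construct($profile_name, $text, array $settings = array()) {
    $this->profileName = $profile_name;
    $this->text = $text;
    // Assumed profile settings; a real module would load them by name.
    $this->settings = $settings + array('min_level' => 1, 'max_level' => 6);
  }

  /**
   * Checks whether the text contains a [toc ...] tag.
   */
  public function hasMarker() {
    return (bool) preg_match('/\[toc[^\]]*\]/', $this->text);
  }

  /**
   * Adds ids to headings once and returns the processed text.
   */
  public function process() {
    if (!$this->processed) {
      $this->text = preg_replace_callback('#<h([1-6])>(.*?)</h\1>#is',
        array($this, 'anchorHeading'), $this->text);
      $this->processed = TRUE;
    }
    return $this->text;
  }

  /**
   * Callback: records one heading on the instance and adds its id.
   */
  public function anchorHeading(array $m) {
    $level = (int) $m[1];
    if ($level < $this->settings['min_level'] || $level > $this->settings['max_level']) {
      return $m[0];
    }
    $id = 'toc-' . count($this->headings);
    $this->headings[] = array('id' => $id, 'level' => $level, 'text' => strip_tags($m[2]));
    return '<h' . $level . ' id="' . $id . '">' . $m[2] . '</h' . $level . '>';
  }

  /**
   * Returns the headings found, processing the text first if needed.
   */
  public function getHeadings() {
    $this->process();
    return $this->headings;
  }
}
```

Because process() runs at most once per instance, a filter and a field hook sharing the same object would not number or anchor the headers twice.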

mattbk’s picture

I'm getting this error:
PDOException: SQLSTATE[42S02]: Base table or view not found: 1146 Table 'drupal.cache_tableofcontents' doesn't exist: DELETE FROM {cache_tableofcontents} WHERE (cid LIKE :db_condition_placeholder_0 ESCAPE '\\') ; Array ( [:db_condition_placeholder_0] => d56fd7c8888a7948ffffeb71e485e8d3ca7c8df94c15cfce94a7dbcf24820e65% ) in tableofcontents_cache_get() (line 196 of /var/www/drupal/sites/all/modules/tableofcontents/tableofcontents.module).

I can eliminate it by commenting out the requisite lines (192-197 of tableofcontents.module, below), but I imagine that doesn't do well for my database.

  // We didn't find it, so the text may have changed,
  // remove any other version that may exist
//  list($cid, ) = explode(':', $cid);
//  db_delete('cache_tableofcontents')->condition('cid', "$cid%", 'LIKE')->execute();
  return FALSE;

Any thoughts?

adaddinsane’s picture

Yes. I expect you dropped it in as a direct replacement of the other version. It isn't. They don't use the same DB tables.

Your best option is to replace this newer version - which can't even be regarded as an alpha - with the old version, disable and uninstall that one, then load this one and install from scratch.

silurius’s picture

Subscribing. (Nice work, guys.)

adaddinsane’s picture

Sorry guys, been doing paying work. And found an irritation with comments. I have a website where the TOC filter is enabled as part of a Wiki set-up, and it defaults to this.

Which means that comment text gets the whole [toc] definition which doesn't get processed and does get displayed - that's a definite argument for putting it in an HTML comment.

marcoka’s picture

Tested. It's working here.

kalabro’s picture

Thanks for your work! I'm thinking about using my own module instead of this, because I can't understand:

  • Why do we need to attach all TOC properties to the Filter? I see that we have to modify (filter) the field value to add anchors. But display settings for the TOC can be moved closer to the entity, because the structure of the TOC doesn't depend on the input filter.
  • How does the property “If no [toc ...], create and cache it but don't put it on the page” work? Right now it seems that I have to parse and cache the TOC myself when preparing a block.

adaddinsane’s picture

I completely agree. It's not appropriate to have it all as an input filter.

That's why we've been discussing where to put it.

Anonymous’s picture

I was looking for some post about Panels integration, and I landed here.

I think maybe TOC isn't prepared for it? I have a page Panel overriding the node/%node path, so some of my nodes are rendered through Panels. The problem is that the node itself has a couple of H2s, but in other panel parts (some views, and other things) I have other H2s that are not picked up when I use the TOC block.

Is there some way to make it work with Panels?

alekth’s picture

Thanks so much for working on the D7 version of this, there really isn't an alternative for this module.

Using the version that Dmitriy.trt posted, I've run into a problem with a setting that would allow users to select whether they want the TOC or not (in node, not using block).

With no automatic TOC in the input filter settings and in the node type settings (let user choose) when authoring/editing a node, I get this error:
Notice: Undefined property: stdClass::$toc_automatic in _tableofcontents_node_form_alter() (line 380 of /home/nihongaku/www/drupal/sites/all/modules/tableofcontents/tableofcontents.admin.inc).

It results in no TOC whatever option is chosen.

If automatic TOC is on in the filter settings, the TOC still renders. (if user choice is left, error is there too)
If it's off in the filter but on in the node type settings, however, there is no TOC. (no error output)

I hope this is a typo that won't be difficult to fix, any insights?

omaster’s picture

Has anyone got this working with panels?

Edit: Actually I solved this. You use Entity- Rendered Node :)

btopro’s picture

I wish this was the official D7 version of the module as it's at least working for me. Could be very appealing for inclusion in the MOOC distribution. Most likely I'll end up forking this project and calling it something like mooc_toc or something (willing to help get some of this committed as it's a really cool project)

Here's a helper function for rendering the TOC based on any kind of output. This is a modification of the hook_field_attach_view_alter implementation in the code posted to this thread. It strips all of the logic out of the rendering of the table of contents to the screen. Enjoy.

/**
 * Helper to return a table of contents on anything.
 */
function YOURMODULENAME_tableofcontents(&$body) {
  $text = $body;
  $toc = _tableofcontents_toc_extract($text);
  // Set the [toc] globally.
  tableofcontents_toc($toc);
  // Process the headers on this page (we have to do this).
  module_load_include('inc', 'tableofcontents');
  $toc =& tableofcontents_toc();
  // Add the headers.
  $text = _tableofcontents_headers($text);
  // Theme the TOC output.
  $html = theme('tableofcontents_toc', array('toc' => $toc));
  // Insert the rendered [toc] in the right place.
  if ($toc['on_off']['automatic'] != 3) {
    // Automatic "3" means don't put it on the page (it may go into a block).
    $body = preg_replace(TABLEOFCONTENTS_REMOVE_PATTERN, $html, $text);
  }
  // Add the styling and controls.
  $settings = array('tableofcontents' => array(
    'collapse' => !!$toc['box']['collapsed'],
    'scroll' => !!$toc['back_to_top']['scroll'],
  ));
  drupal_add_js($settings, 'setting');
  $path = drupal_get_path('module', 'tableofcontents');
  if (!empty($toc['back_to_top']['scroll'])) {
    drupal_add_js($path . '/js/jquery.scrollTo-min.js');
    drupal_add_js($path . '/js/jquery.localscroll-min.js');
  }
  drupal_add_js($path . '/js/tableofcontents.js');
  drupal_add_css($path . '/tableofcontents.css');
  // Remove any leftover [toc].
  $body = preg_replace(TABLEOFCONTENTS_REMOVE_PATTERN, '', $body);
  if (strpos($body, '[toc') !== FALSE) {
    $body = preg_replace(TABLEOFCONTENTS_REMOVE_PATTERN, '', $body . ']');
  }
}

btopro’s picture

Status: Active » Needs work

The code in this issue needs some additional work but I've gotten maintainership access so I'd be happy to push this into a 7.x-2.x branch since this changes the original scope of the project to work with entity fields of any kind. I'll also incorporate a modified version of the function in the previous comment which allows for programmatically applying the TOC to any text area that gets processed by an input filter that has TOC enabled. This allows you to run TOC against non-entities (such as a page's output that you defined as a custom menu path).

btopro’s picture

Version: 7.x-1.x-dev » 7.x-2.x-dev
btopro’s picture

Status: Needs work » Needs review
Attachment (status: new): 2.09 MB

The code in this thread has been added to the branch, as well as the function above and associated API documentation on usage. This has been added to a 2.x branch since this is all entity based and very different from the initial scope. Please review the code and submit issues for any problems you find. I haven't done a ton of testing but my initial tests worked very well the other day and I plan on using this in production in the coming weeks in the MOOC distribution.

btopro’s picture

Attachment (status: new): 79.56 KB

Didn't realize the file was so big.
This is a screenshot showing this function applied to a non-entity text field:
[Screenshot: TOC applied to a menu callback]

adaddinsane’s picture

Thanks for taking this on - I was feeling very guilty for doing nothing, I have so little time.

btopro’s picture

No problem, wanted to at least get it pushed along a bit further as the code-base in this thread is much more flexible than the original 1.x branch. Won't publish a full release until there are a lot more eyes on this version / bug fixes (not all the settings seem to modify how it works) but it is at least working.

neRok’s picture

Attachment (status: new): 3.36 KB

I haven't looked at TOC for a while, and thought about trying to implement it on my site once more. It seems you have made some improvements with this branch, but I have some suggestions to change the whole thing up again.

The main problem I feel is using input filters and having the TOC in the 'body', which seems a hard way of going about it. The TOC should just be a block/display-suite-field/panel, so the site admin can configure its location etc and it appears in the same spot when required. There could also be an option to just insert TOC at the beginning/end/both of the node content.

Next is creating the TOC. I didn't even bother trying to understand the current code, it looks massive. I figured creating a TOC from HTML using PHP must be done a lot. A quick Google search revealed a few solutions; regex apparently has a few drawbacks for this kind of thing, but DOMDocument provides a powerful solution. I tweaked some code and have the basics working. You pass it the body field after it has been passed through input filters etc. (this seems to be called the safe_value) and it adds IDs to all the headings and makes a list from the results. It doesn't modify anything, just provides the new data, which can then be used by the calling hook/function. The code has been refined and moved to a sandbox project and can be viewed there. See http://drupal.org/sandbox/neRok/1912730

The function lets you do everything I said above. You can call it in hook_node_view_alter, send it the marked-up body field, i.e. $node->body['und'][0]['safe_value'], then replace it with the returned ['html']. You can add the ['toc'] at the start/end depending on options. Any block/panel/DS-field/etc can be coded to do a menu_get_object if required and just send the body in the same way, then give back the ['toc'] as its output.
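
For readers who want a feel for the DOMDocument approach without opening the sandbox, here is a minimal sketch. It is not neRok's code: the function name, the 'toc-N' id scheme and the 'toc-level-N' classes are assumptions, and it returns an array with 'html' and 'toc' keys as described above.

```php
<?php

/**
 * Adds ids to headings and builds a flat TOC list (hypothetical helper).
 *
 * @return array
 *   'html' is the markup with ids added, 'toc' is a <ul> of links.
 */
function mymodule_toc_from_html($html) {
  // Filtered field text is a fragment, so wrap it in a full UTF-8 document
  // and silence libxml warnings about sloppy markup.
  $previous = libxml_use_internal_errors(TRUE);
  $doc = new DOMDocument();
  $doc->loadHTML('<!DOCTYPE html><html><head>'
    . '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">'
    . '</head><body>' . $html . '</body></html>');
  libxml_clear_errors();
  libxml_use_internal_errors($previous);

  $xpath = new DOMXPath($doc);
  $toc = '';
  $index = 0;
  foreach ($xpath->query('//h1|//h2|//h3|//h4|//h5|//h6') as $heading) {
    // Keep an existing id, otherwise generate one.
    if (!$heading->hasAttribute('id')) {
      $heading->setAttribute('id', 'toc-' . $index);
    }
    $index++;
    $level = (int) substr($heading->nodeName, 1);
    $toc .= '<li class="toc-level-' . $level . '"><a href="#'
      . htmlspecialchars($heading->getAttribute('id'), ENT_QUOTES, 'UTF-8') . '">'
      . htmlspecialchars($heading->textContent, ENT_QUOTES, 'UTF-8') . '</a></li>';
  }

  // Serialize only the children of <body>, dropping the wrapper document.
  $body = $doc->getElementsByTagName('body')->item(0);
  $out = '';
  foreach ($body->childNodes as $child) {
    $out .= $doc->saveHTML($child);
  }
  return array(
    'html' => $out,
    'toc' => $toc === '' ? '' : '<ul class="toc">' . $toc . '</ul>',
  );
}
```

The per-level classes make it possible to hide deeper levels with CSS alone. Generated ids are not checked against ids already present in the text, which a real implementation would need to handle.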

Thoughts on this? Quickly whipped-up module attached (works for adding a panel TOC or DS TOC field. Haven't done block yet, don't have block enabled on my server). It needs enabling per content type under the "Display Settings" tab, and my quick tests show no adverse effects on speed.

Please don't use the rough module attached. Instead, try out the greatly refined one at http://drupal.org/sandbox/neRok/1912730

btopro’s picture

Status: Needs review » Needs work

Hmm... I see where you are coming from with this. The current setup has the advantage of working off of any entity type. I also added a function to the package that allows for applying it to any amount of text so long as the text has a [toc] tag appended to it. I like the notion of the TOC in a block / DS / Panel wrapper (really just needs to be a block at that point I think) but I'm not entirely sure that the proposed solution is more flexible than having an input filter.

I'm speaking strictly from my usage of this so far but i'd almost want some way in which I could just place a block on the page and if the block is present, it attempts to pull out the associated ToC from page headers.

I won't reject this type of a rewrite, I'd just like the conversation to play out a bit more as to what direction to go with it. I ran into lots of issues getting things to work via input filter and if it could be implemented only if a block / call appears I think that'd be ideal.

Quickly, I think it should play out this way:

  • Base ToC "API" / library, provides no usable functionality by default (this will allow for custom implementations like what you have proposed)
  • As an input filter -- module by itself, allows for implementation via this method (this works on any entity field then)
  • As a block -- module by itself, allows for block / DS / panel placement (only works against node bodies / some other targeted approach)

This would allow for flexibility w/o forking the project branch again by changing the scope completely. I agree that in glancing at how it generates the outline it feels very inefficient though I'm not positive of this fact.

neRok’s picture

I still don't believe that parsing a field and building a TOC requires such an involved module as this one. I have gone ahead and made the rough module I attached above into a (currently sandbox) project at http://drupal.org/sandbox/neRok/1912730.

Keeping the TOC simple (KISS) improves its functionality IMHO. For example:

  • There is an option for a collapsible TOC in this module. That can be achieved in my module by using a block/panel collapsible style.
  • In this module, putting [[toc]] into the field will print the TOC on the page. This can be achieved with my module by having an on/off field on the node edit form and using Context or similar to handle the conditional display.

Also, rather than parsing through with regex, mine passes through using DOMDocument, which all reports indicate is a far superior method. I would take the above into consideration when advancing this module in the future.

With regards to API, my module has 1 function that takes a HTML field, modifies it and outputs the new HTML and TOC. I am pretty sure any module could call this function for any field and do what it wants with the return.

btopro’s picture

Don't like the notion of reliance on other projects (especially with the bloat of Panels) to do something as simple as collapse. I'm not sure why that wouldn't be kept as an option.

On off switch on the node form assumes it's only used for nodes. I do like the idea over the whole [[toc]] input filter method.
Not sure of the best way of doing this. It would be nice to have all the fields on the page and then process the entire page to generate the ToC. Another possible way of handling this is a block that, if it's loaded on the page, processes the rendered page content. Context could then be used to place it only on the pages where it is desired (or core block placement). I think this would be the simplest way of handling this, personally.

Single function API is most desirable but might not even be needed w/ the possible block implementation method.

I don't care for the million options in the original scope of ToC w/ the input filter. The thing I DO like associated with the current project is the "start at heading ..." and "end at heading..". This would allow you to ignore h1 and drill down as far as h4 as an example.

neRok’s picture

I thought about your comment re the entire page content before my last comment but forgot to write it. I believe this could cause some undesirable results as you may get huge TOCs and TOCs that are illogical (i.e. you can't rely on other modules/blocks/fields etc. to have decent headings and structure). But it is definitely doable.

Re start and end at heading, I had started to implement that as well, but decided not to. My use case will be for wiki pages, and I have set the input filter to only allow H3-H6, because the page title etc uses H1 and H2 already. I did code in some CSS to the organised list that is output as the TOC. This allows me to hide all the H6 lists in the TOC for example.

So I can achieve the same results, just in different ways. I also believe it is essential to have display options other than blocks, as I personally have stopped using blocks altogether. I can achieve layouts a lot more easily using panels, views, etc. and have found no significant speed difference. But that is for another discussion.

adaddinsane’s picture

Rather than diving straight into implementation I think there needs to be some stepping back and thinking about what a TOC is for.

The idea of a TOC is to provide the automatic construction of a list of key items in the main content of a page so that navigation to those key items can be achieved with a simple link click rather than scrolling and visually searching for the required item.

We could create something which simply searches the main content looking for key items and creating a structure based on that. The problem with this is that there is no guarantee the main content will provide the necessary hooks to know (a) what should be linked to [though that is usually headers]; or (b) the ability to link to those items when found [usually anchors].

The advantage of that is that it does not require the intervening in the structure of the main content.

Other issues here are that different types of content may require different settings for the TOC.

So we really need some way of processing the main content to ensure it conforms to what's needed by the TOC.

How many different ways of doing that are there?

(a) Client-side JavaScript. (Non-starter for numerous obvious reasons.)

(b) Input format to post-process the content, we already know this can work.

(c) Intercept preprocess_page - since we're talking D7 (and D8) this is an array structure so it should be feasible to identify and modify the main content. But it's still not good.

(d) New textarea field type with widget and formatter

Then there's the question of where the TOC is created.

(a) As part of the body of the field.

(b) As an external block.

(c) As a separately formatted part of the new textarea field type

You'll notice I've added something nobody seems to have suggested before: a new textarea field type.

This allows individual formatting of different fields and permits those fields to be processed before rendering to allow the TOC to be constructed correctly. It may even be possible to provide options to create the TOC for a field as a block.

How about that?

btopro’s picture

I think you are illustrating that there are many different ways to think of a table of contents within the content of a page and thus it might be very hard to narrow down scope without providing a multitude of implementation options via sub-modules / features that are all using a common toc core library of functions.

adaddinsane’s picture

Yes, that was the point. There are different ways to achieve the result - but which fits Drupal best?

I hadn't thought of the formatter idea until I was writing that post, but the more I think about it the more I think it's the right way to go. (I've been creating a lot of custom formatters in my current contract to fulfill specific needs so I know how useful they can be.)

Especially as we could have a set of (say) three formatters: Field + TOC, Field alone, TOC alone. Which then permits us to render the field content and TOC separately, so they can be put into different display objects as needed. (In fact I can think of all sorts of games you could play by simply doing this. For example, having a Views page of TOCs for different nodes which jump to the right position on the right page.)

Using a formatter avoids the messiness that creating the TOC using an Input Format caused (and it was really unpleasant - remember I was the first person to make the D7 version actually work). And attempting to intercept the content in a preprocess function is also very unpleasant.

Summarising: the Input Format method works but has many drawbacks; a formatter version should work more elegantly, is versatile and has huge advantages.

EDIT: I've just done a very quick test and can confirm the formatter route is very doable.

neRok’s picture

Also something to consider: I could see advantages to not generating the TOC on the fly, but creating it upon node save/update. Then it could be quickly called upon in a lot of different ways. This could be set up as a 'field' similar to URL Alias, i.e. a checkbox with a text box to allow manual override.

adaddinsane’s picture

I agree the on-the-fly creation is not ideal. I was thinking of simply caching the TOC for a given entity/field combination - and using an MD5 key in case the content is modified.

However I was running through various scenarios in my head on my walk to work and, related to your comment, was the fact that there is a huge overhead in processing the field content in preparation for building a functioning TOC.

While I think the old system is overly complicated it does do necessary processing (creating the anchors and so forth) - whether we also want it to auto-number is another matter completely - but doing the pre-processing stage is imperative and costly in time.

In which case there is an argument for an Input Filter as well, the sole purpose of that is to process the content so the TOC can be created. EDIT: Because field content is automatically cached after Input Filter processing (until modified).

(Another consideration is that a text field could have multiple parts (multiple deltas) but the TOC should treat them all as one item - this is easily achieved but just needs to be borne in mind.)

EDIT 2: Spent my lunch re-reading the whole thread (apologies to all the good people up the thread who made excellent suggestions some of which I clearly absorbed and then thought were my ideas) and particularly the recent additions. The use of DOM in your work, neRok, is definitely the way forward, and I like what you've done.

In regard to outputting TOC as a block, my feeling is that a support module (toc_extras?) which provides a range of alternative ways of getting a TOC on the page separately could be created, and provide an input filter to do auto-numbering of headers (for example). Keep the core TOC as plain and simple as is reasonable, provide the other module for cleverness.

btopro’s picture

"In regard to outputting TOC as a block, my feeling is that a support module (toc_extras?) which provides a range of alternative ways of getting a TOC on the page separately could be created, and provide an input filter to do auto-numbering of headers (for example). Keep the core TOC as plain and simple as is reasonable, provide the other module for cleverness."

To this I'd suggest TOC being just an API and the other options are added in via sub-modules. This keeps TOC core very small as it's just the most efficient method of getting a ToC (DOM) built off of a textblock. Submodules can then implement this text processing method the way they want, whether that's entity body, entity fields, block, final rendered text, a series of nodes mashed together, whatever the case may be.

adaddinsane’s picture

But which part would be just an API? While I agree this is a good way to go in many cases, I'm not 100% sure it's the right approach here.

I think the one thing this module must do is modify the field text to include the necessary anchors in the headers, with those anchors providing sufficient information for a TOC generator. It should also provide the basic TOC generator, in the form of a field formatter.

So I'm proposing:

  • Input filter which processes the field content to insert the necessary anchors and any support information we may need (depth? parent ID?)
  • Text formatter which generates a basic TOC before the field content

I think we must provide enough for someone to install a single module and just start.

How does that sound?

adaddinsane’s picture

Attachment (status: new, size: 3.04 KB)

Actually, I've thrown it together. And it works.

I borrowed the DOM manipulation code from neRok. It doesn't quite match my spec above because neRok's code sets up the field content and generates the TOC at the same time. Those could do with being separate functions, because as it stands I have to cache the TOC in the Input Filter and then recover it in the formatter. Not nice. (We can still cache the TOC to prevent too much processing but it's healthier to do it in the formatter since that's where it should be generated.)

Also, ideally, the TOC should be generated as a renderable array rather than final HTML. Generating the TOC in the input filter also prevents creating a TOC that spans multiple deltas.

I wonder what the jQuery TOC needs...

neRok’s picture

Just a thought re the input filter method: won't this cause the mods to happen every time the field is displayed, including in views etc.? Is this desirable?!

FYI, I have tweaked the DOMDoc code in my module so it works a little better, and also has "back to top" functionality built in. Check out the sandbox linked in comment #29 rather than the module attached to #27.

adaddinsane’s picture

Yeah, got that version of your DOMDoc code - it's very smooth :-) we could make the "back to top" a filter option.

As for processing, no, it doesn't cause a problem. All input filters have a "cache" option. You combine input filters into an input format, and as long as they all say "okay to cache" the text is processed once and then cached. When it's edited, it gets re-processed and re-cached.

If any input filter says "no cache" then that input format will never be cached. This is noted in the documentation. It's totally fine for our input filter to allow caching, so we're not causing a problem :-)
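As a rough illustration of that rule (a Python sketch, not Drupal code; the `Filter` class and the filter names are made up for the example), a format is cacheable only when every filter in it allows caching:

```python
from dataclasses import dataclass


@dataclass
class Filter:
    """Hypothetical stand-in for an input filter definition."""
    name: str
    cacheable: bool = True


def format_is_cacheable(filters):
    """A format's output can be cached only if no filter opts out."""
    return all(f.cacheable for f in filters)


# Example format: the TOC anchor filter allows caching, so it does not
# change the cacheability of a format that was already cacheable.
html_format = [Filter("html_corrector"), Filter("toc_anchors")]
dynamic_format = html_format + [Filter("live_tokens", cacheable=False)]
```

Here `format_is_cacheable(html_format)` is true, while adding a single opted-out filter makes `format_is_cacheable(dynamic_format)` false.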

EDIT: Any chance you could split your DOMDoc code into a piece that processes the text to add the necessary IDs and add the "back to top" items, and a second that processes HTML to generate a renderable array representing the header structure, with options to define the "h" to start with and the "h" to end with (e.g. h2 to h5)?

adaddinsane’s picture

Attachment (status: new, size: 5.56 KB)

Soooo ... I had a bit of time and I've fixed things up a bit.

The attached zip contains the module using the "toc" name - except we can't use that because the jQuery TOC project has nabbed it, I just didn't want to type tableofcontents every time.

So what have we got?

An input filter that finds the headers in a configurable range and adds IDs as needed (note to neRok: the code you have creates multiple IDs and you can't do that I'm afraid). It adds "Jump to top" links as configured (configuration could be improved for that). It also lets you configure the jump to top text. This input filter is best applied as the very last input filter in a format.

And then we have the Formatter which builds the TOC (I'm not currently caching it). The formatter options allow you to specify absolute URLs (which means a TOC could be put in an RSS feed and the links would jump to the right page). You can specify the title of the TOC and add custom classes for CSS. You can also switch off either the TOC or the Main text (or both).
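A rough sketch of the ID-adding step, written in Python with regular expressions purely for illustration (the attached module works on a DOM, and the names `add_heading_ids` and `slugify` are hypothetical). It leaves existing IDs alone, skips headings outside the configured range, and makes generated IDs unique:

```python
import re

# Matches <hN ...>...</hN>; a DOM parser would be more robust than a regex.
HEADING_RE = re.compile(
    r"<h([1-6])((?:\s[^>]*)?)>(.*?)</h\1\s*>", re.IGNORECASE | re.DOTALL
)
# Matches id="..." but not data-id="..." or similar.
ID_RE = re.compile(r"""(?<![\w-])id\s*=\s*["']([^"']+)["']""", re.IGNORECASE)


def slugify(text):
    """Turn heading text into a lowercase, hyphenated ID candidate."""
    text = re.sub(r"<[^>]+>", "", text)  # drop inline markup
    slug = re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")
    return slug or "section"


def add_heading_ids(html, min_level=2, max_level=5):
    """Add an id to each heading in range that lacks one; keep existing ids."""
    used = set(ID_RE.findall(html))  # ids already present anywhere in the text

    def replace(match):
        level, attrs, inner = int(match.group(1)), match.group(2), match.group(3)
        if not (min_level <= level <= max_level) or ID_RE.search(attrs):
            return match.group(0)  # out of range, or already has an id
        base = slugify(inner)
        candidate, n = base, 2
        while candidate in used:
            candidate = f"{base}-{n}"
            n += 1
        used.add(candidate)
        return f'<h{level}{attrs} id="{candidate}">{inner}</h{level}>'

    return HEADING_RE.sub(replace, html)
```

Because headings that already carry an ID are returned unchanged, running the function a second time over its own output makes no further changes.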

The most important/interesting feature is that if you have a header structure with breaks in it (such as an H3 block containing only H5s and no H4s) there is an option to clean up the structure so that empty levels of the hierarchy are removed. The TOC is output as a renderable item_list array.
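One way to picture that clean-up, as a Python sketch under my own assumptions (the function name `build_toc`, the `(level, title, id)` input tuples and the dict output are all hypothetical, and the attached module may do it differently): build the nested list with a stack, and either attach a deeper heading directly to the nearest shallower one or insert empty placeholder levels for the gap.

```python
def build_toc(headings, collapse_gaps=True):
    """Nest (level, title, anchor) tuples into a tree of dicts.

    With collapse_gaps=True an H5 directly under an H3 becomes its child;
    with collapse_gaps=False an empty H4 placeholder is inserted between them.
    """
    if not headings:
        return []
    # The root sits one level above the shallowest heading in the text.
    root = {"title": None, "id": None,
            "level": min(h[0] for h in headings) - 1, "children": []}
    stack = [root]
    for level, title, anchor in headings:
        # Climb back up until the top of the stack is a shallower heading.
        while stack[-1]["level"] >= level:
            stack.pop()
        if not collapse_gaps:
            # Fill each skipped level with an empty placeholder item.
            while stack[-1]["level"] < level - 1:
                filler = {"title": None, "id": None,
                          "level": stack[-1]["level"] + 1, "children": []}
                stack[-1]["children"].append(filler)
                stack.append(filler)
        node = {"title": title, "id": anchor, "level": level, "children": []}
        stack[-1]["children"].append(node)
        stack.append(node)
    return root["children"]
```

For the H3/H5 case described above, the collapsed tree puts the H5 items directly under their H3, while the uncollapsed tree keeps an untitled H4 item between them.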

The Filter and Formatter are independent so you could add the formatter to text that you know has suitable IDs, or use a different formatter but apply the input filter to ensure the IDs are in place, and the jump to top links.

Things to be done as I see it:
* Change the module prefix back to tableofcontents
* Add the TOC caching
* Add another input filter to do the auto-numbering of headers. (It's useful.)
* Add the processing of multiple deltas.

neRok’s picture

Good point re multiple IDs. How will you handle it? You don't want to override an existing ID; it may be needed for something else.

adaddinsane’s picture

It's not important. Either the heading has an ID or it doesn't; if it doesn't, the input filter puts one in. The formatter doesn't care what the ID is as long as there is one.

It does mean we do an analysis of the content structure twice - which is not ideal - but there's no way around that if we're splitting the functionality.

rosemeria’s picture

Hi - love your TOC module the best!
FYI - Many of us at Drupal Commons 7.x-3.x-dev want your module in the Commons wiki content type.

Commons wiki Issues: http://drupal.org/node/1541086 and http://drupal.org/node/1932528

-Rose

neRok’s picture

Hi adaddinsane. FYI, someone found a 'bug' in the DOM code I created in my module and you subsequently used. You may like to check out the issue #1955782: DOMDocument::createElement(): unterminated entity reference error. It is in regards to special characters (such as &) being used in the headers.

adaddinsane’s picture

Cool, thanks.

WorldFallz’s picture

How does this version of the module fit in with regards to the posted d7.2 and d7.3 versions listed on the project page? Are they completely separate? Has any of this code been committed?

btopro’s picture

2.x branch was started based on code up to this comment https://drupal.org/node/1424896#comment-6908244

Additional discussion in this thread hasn't made it into any branch, though it would probably be a 3.x branch, just based on the way it was leading, if it makes it to code.

jrockowitz’s picture

Hi,

I am the maintainer of the TOC filter module and have begun a D8 implementation for this module. With D8's new architecture, it made sense to start implementing an OO TOC API. I reviewed and was inspired by a lot of features from this module. At this point I have ported about 75% of this module's functionality to Drupal 8. I would like to collaborate and set up a shared TOC API module with several TOC implementation-specific submodules for D8.

I have read through all the comments and below are my notes.

Key Comments

Possible TOC Modules/Implementations

  • TOC: An API for parsing and building TOCs
  • TOC profiles: Configuration entity used to manage TOC options
  • TOC block: A block that displays the TOC for the current request
  • TOC field formatter: Field formatter that converts HTML to a table of contents
  • TOC extras: Provides custom JS plugin behaviors including hide/show, smooth scroll, etc.

Some Notes/Conclusions

  • TOC options need to be simplified.
  • TOC block provides the most flexibility when it comes to placing a TOC on a page, in a pane, etc…
  • TOC is best namespace for a TOC API but this namespace is occupied by a jQuery TOC implementation.

My Thoughts…

  • My number one goal is to implement a TOC API that I can re-use for all my projects even if the project requires a very customized TOC implementation like a ScrollSpy widget.
  • I am a big fan of input filters.
  • TOC API should be focused on configuration, parsing, building, and theming.
  • TOC profiles should use Configuration Entities, this is something that still needs to be done.
  • A TOC block might have to be part of the TOC API with submodules being able to initialize and populate it.
  • Establishing a TOC namespace is the first step, so I am going to ask the TOC module maintainer if they would be open to allowing the D8 TOC API to move into their module.

btopro’s picture

added as comaintainer; you've got a good track record and a well thought out plan. If you plan on doing a D7 version as you say then I'd make each run on a 3.x branch

yukare’s picture

Do you have a sandbox or something like this that I can help with? I want this on my site.

jrockowitz’s picture

I am actively working on this code in my TOC filter module. This module is still under development but does work. If you were to install it, it would be at your own (minimal) risk. Until there is a stable/beta release, you will definitely have to reinstall and reconfigure it every time you update the code. Also, this code is most likely going to live in either the toc or the tableofcontents namespace.

http://cgit.drupalcode.org/toc_filter/log/?h=8.x-2.x

Anonymous’s picture

jrockowitz, your table of contents module is the best I have come across. But it requires manual insertion on each page, whereas I prefer to attach it to the top of the page for a certain content type to save a lot of work.

I am currently using this module until hopefully you will implement this into your (Drupal 7) module.

jrockowitz’s picture

Thank you for granting me co-maintainer access.

To get everyone on the same page, below are two screencasts demoing my TOC filter module's implementation for site builders and the TOC API for developers.

I have started a discussion with the maintainer of toc.module exploring the possibility of using the toc namespace for the TOC API: #2632118: Plans for Drupal 8 implementation of TOC module. If we decide not to use the 'toc' namespace, I will just use the 'toc_api' namespace.
Once the toc (or toc_api) namespace is set up, we can move some of this discussion over to that module's issue queue.

@freddi I am no longer maintaining any D7 websites, so my sole focus is the D8 implementation of the TOC API and TOC filter module.

tunprog’s picture

Hi @btopro,

I want to know why the development on this module has ceased (D7),
I am also interested in helping on updating this module to Drupal 8.
I would be happy to have a co-maintainer access or at least to help on the issue queue for now.

btopro’s picture

@Tunprog I don't use this anymore. I'll add you as a co-maintainer.

tunprog’s picture

Hi @btopro,

Thanks a lot for the quick reply and for the co role. I will spend some time on understanding the code and then I will see what I can do.

Cheers.

vladimiraus’s picture

Status: Needs work » Fixed

vladimiraus’s picture

Status: Fixed » Needs work

Sorry for changing status. Came here from toc_api module README file.
Do we need a D10 version of this or should we deprecate this module?