We're extensively using panel pages to build a site due to the need to have many different layouts for each page.

When running the link checker with the "Panels" content type checked, it does not find broken links.

Steps to reproduce:
1 Create a panel page with a broken link in one of the regions of the panel.
2 Run link checker.
3 The broken link is missed.

Comments

hass’s picture

Category: bug » feature
Status: Active » Postponed (maintainer needs more info)

There is no explicit Panels support implemented. How do you create your content? Normally you have n nodes and assign these nodes to a panel. This way you can check the nodes themselves and you get the broken links in the nodes assigned to a panel. If panels were analysed as well, you would get duplicates... Can you explain how you assign content to panels and why the above approach does not work for you?

harking’s picture

Latest panels module adds Panel Pages that act just like nodes. Panel pages show up in the search index and have node ids and that's why we're using them.

You're right that this will cause an issue with the same broken link being identified twice if the panel brings in a node.

No rush, in the meantime I've used the W3C::checklink Perl script. See http://search.cpan.org/dist/W3C-LinkChecker/

hass’s picture

If "panel page" is a node type you should be able to activate it... I have never tried it, sorry. Is this currently not possible?

hass’s picture

You are talking about "Panel nodes", not "Panel pages", aren't you? Panel pages have normal nodes assigned. There is no need to add any extra support for them.

hass’s picture

I don't get it. In the link checker settings you can enable link checking for "Panel" and then the teaser text is evaluated for links. On the other hand, you assign existing nodes to layout columns. These nodes also have a node type, and all links inside them can be checked just like in any other node. There is nothing to do: panel pages, panel nodes and mini panels must work out of the box with linkchecker.

The only text that may not be checked for links is "custom content", as this is saved in a very special way in the panels_pane table and I currently have no idea how to grab this data through an API. Can you verify whether you are talking about this type of content, please?

If someone is willing to write a patch I can review.

hass’s picture

Title: No support for panel pages » Panels: "custom content" is not checked for broken links
Version: 6.x-2.4 » 6.x-2.x-dev

Changing title.

harking’s picture

@hass,

Yup, the "custom content" is the item whose content is not being checked for links.

I'll work on a patch for it.

hass’s picture

Status: Postponed (maintainer needs more info) » Active

Let's write a patch for it and get it in...

hass’s picture

Version: 6.x-2.x-dev » 7.x-1.x-dev
Priority: Normal » Minor

Any news about the patch?

Argus’s picture

Also entities created with Fieldable panel panes (http://drupal.org/project/fieldable_panels_panes) are not supported. Do you want me to add another issue for that? I suppose it is related...?

hass’s picture

If you can provide a patch... I don't mind if we make this a general Panels integration issue.

Personally I have no plans to work on this "custom content" integration and it may therefore never be done, as it's too special for a generic integration. I guess Fieldable panel panes may be much easier to integrate as it uses the generic core field mechanism, but you need to try it yourself.

Argus’s picture

Shouldn't you mark it 'won't fix' then? :)

hass’s picture

Status: Active » Closed (won't fix)

azinck’s picture

Priority: Minor » Normal
Status: Closed (won't fix) » Active

It seems to me that the better way to solve this for a wide range of use cases, not just Panels, is to process the rendered entity for links rather than the individual fields. I'm re-opening this for discussion along those lines.

In my specific case I need to support text added to a node in Panelizer but this would also allow Link Checker to support field types other than the hard-coded few (text_with_summary, text_long, text, and link_field).

hass’s picture

I'm not sure what you mean and what the benefit is. Without reading the fields we are also not able to update them. For memory and reliability reasons it is good to know what we are extracting and from where, and not to waste time on fields we don't know. Without this information we may not able to construct an url.

azinck’s picture

Thanks for the response, hass.

First, to be clearer about what I'm suggesting: during node insert/update, instead of parsing all supported fields for links, we should render the node using a user-selectable view mode and parse the rendered result for links. This would allow Link Checker to discover links in nodes using Panelizer, Field Collections, unsupported custom/contrib field types, and any other novel, non-field-based way of adding content to a node.
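A minimal sketch of the parsing half of that idea, for illustration only. The helper name `example_extract_links_from_html()` is hypothetical and is not part of Link Checker; it only shows that links can be pulled out of already-rendered HTML with PHP's DOM extension. How the HTML is produced (for example via `node_view()` and `drupal_render()` in Drupal 7) is an assumption noted in the comments, and in practice the module's own `_linkchecker_extract_links()` would presumably do the extraction.

```php
<?php
/**
 * Hypothetical helper: collects unique href values from rendered HTML.
 *
 * Assumption: in Drupal 7 the HTML could be produced with something like
 *   $build = node_view($node, $view_mode);
 *   $html = drupal_render($build);
 * This function itself has no Drupal dependency.
 */
function example_extract_links_from_html($html) {
  $links = array();
  if (trim($html) === '') {
    return $links;
  }

  $doc = new DOMDocument();
  // Rendered output is rarely perfectly valid markup, so collect libxml
  // warnings internally instead of emitting them.
  $previous = libxml_use_internal_errors(TRUE);
  $doc->loadHTML($html);
  libxml_clear_errors();
  libxml_use_internal_errors($previous);

  foreach ($doc->getElementsByTagName('a') as $anchor) {
    $href = trim($anchor->getAttribute('href'));
    if ($href !== '') {
      // Key by URL so a link that appears twice is only reported once.
      $links[$href] = $href;
    }
  }
  return array_values($links);
}
```

Because the function only sees final markup, it would pick up links regardless of whether they came from a field, a Panelizer pane or custom content, which is the point of the proposal. It also shows the trade-off: nothing here records which field a link came from.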

I'll be the first to admit that I'm new to Link Checker so I'm sure I'm not familiar with all the functionality and use cases this module was designed to serve. Your comment rightly points out some downsides to my suggestion. Indeed, after your comment about updating fields I took a closer look and noticed the Link Checker functionality that updates permanently moved links. I hadn't noticed that before! Very handy but obviously not compatible with what I'm suggesting.

I'd argue that Link Checker will never be able to be smart enough to fully understand and interact with all the possible ways data can be attached to a node, but it *can* be smart enough to find any links on a rendered node, and the module should focus on that core competency. If Link Checker were to switch to using rendered entities you could keep the 301-fixing behavior for fields but change the description of the feature to:

If enabled, outdated links in content providing a status Moved Permanently (status code 301) are automatically updated to the most recent URL when possible...

Performance/memory consumption should be considered but I'd be interested in seeing the actual benchmarked difference. The node loading/rendering pipeline has been fairly closely scrutinized, optimized, and cached so I wouldn't necessarily assume it's going to be an enormous penalty (though perhaps it will be!). As it stands, most of the sites I build can't really make use of Link Checker just due to its incompatibility with Field Collection and Panelizer, let alone other more exotic configurations. What if the existing behavior were left as the default, but an option were added to support scanning for links in rendered nodes?

As for your last comment: "Without this information we may not able to construct an url." -- I don't follow. For what specific Link Checker functionality would this be true?

hass’s picture

Priority: Normal » Minor
Status: Active » Closed (won't fix)

I do not see any benefit in changing all this stuff. Link checker works extremely reliably, new field types can be added quite easily, and I'm not crazy about changing it in a way that lowers code quality or reliability. See #1888102: Field Collection compatibility for a half-finished patch for field collections.

What Panels does with "custom content" is extremely rare and this is why this case is minor. Other Panels stuff works well, just try not using custom content. I'm not aware of another module that does not implement fields. If you'd like to write the Panels integration code I can review it. But I was rather hoping that Panels would change.

chrisgross’s picture

Issue summary: View changes
Priority: Minor » Normal
Status: Closed (won't fix) » Active

I can't quite figure out why this request has been dismissed so readily. The use of custom content or fieldable panel panes is not an edge case anymore. Maybe it's simply because Panels has progressed so much in the past few years, but I think it is absolutely reasonable to think that this module could work with all panel panes. For example, the Apache Solr Search module can index content on Panelized nodes because it has an option to scan the entire page for text instead of needing to read each field individually.

I like the idea of checking all entities and not just nodes, but in the worst case it should be possible to use something like file_get_contents to scan for links, though this might have performance implications. The former would be better, by modifying _linkchecker_extract_node_links.

If no one else is willing to look into it, perhaps I will, if I can find the time.

azinck’s picture

@chrisgross: hass has consistently been resistant to efforts along these lines. I, of course, agree with you, but in the interim our team wrote a patch that we're using to support Panelizer: https://www.drupal.org/node/1946252#comment-8158909

chrisgross’s picture

@azinck: Interesting. Any chance you'd be willing to share the code, or an example of the code, you used specifically to get this to work with Panelizer? I looked at the example in that issue, but am not quite sure how to use it in the case of my panel pages.

azinck’s picture

@chrisgross: here's the code we're using, more or less.

/**
 * Implements hook_linkchecker_node_links_alter().
 *
 * Scans for links in rendered panelizer nodes.
 */
function custommodule_linkchecker_node_links_alter(&$links, $node, $context) {
  global $user;

  // If this is a field permissions check, just let linkchecker do its thing. We
  // can't comment on the access level of rendered panelizer nodes.
  if (!empty($context['return_field_names'])) {
    return;
  }

  // If this node is panelized, render it through panelizer and scan the
  // panelized result.
  if (isset($node->panelizer)) {
    // Render as the anonymous user. Session saving is switched off while
    // impersonating so the swapped account is never written to the session.
    $current_user = $user;
    $old_session_state = drupal_save_session();
    drupal_save_session(FALSE);
    $user = drupal_anonymous_user();

    $handler = panelizer_entity_plugin_get_handler('node');
    // The process of rendering the panelizer portion of the node can result in
    // some attributes of the display getting modified, so we make a copy.
    $node_copy = node_load($node->nid, $node->vid);
    $info = $handler->render_entity($node_copy);

    // Restore the original user and session-saving state.
    $user = $current_user;
    drupal_save_session($old_session_state);

    $rendered_panelizer = $info['content'];
    $links += _linkchecker_extract_links($rendered_panelizer, $context['path']);
  }
}

chrisgross’s picture

@azinck: I will give it a shot. Thanks so much!

chrisgross’s picture

@azinck: It worked! For my particular case I just had to render the entity in a different view mode:

$info = $handler->render_entity($node_copy, 'page_manager');

and remove the field permissions check, since I'm not using that module, and that did the trick!

Thanks so much, you're a lifesaver!