Support from Acquia helps fund testing for Drupal Acquia logo

Comments

lilou’s picture

Patch re-rolled.

lilou’s picture

Title: Prevent php error from cron.php » Taxonomy : prevent php error from cron.php
catch’s picture

Status: Needs review » Needs work

Should we not clean up the APIs so we're not constantly casting between ints and arrays? Either way this could use a test case to prevent regressions.

pwolanin’s picture

Status: Needs work » Needs review
FileSize
2.86 KB

A little simpler patch, with test. Test has two fails in the absence of the patch to taxonomy module.

catch’s picture

Status: Needs review » Reviewed & tested by the community

Looks great. We should forget about array/int casting and just get the vocabularies as fields patch in.

Dries’s picture

I'm a bit confused by this -- aren't warnings already reported as exceptions? Is the error handling in this patch strictly necessary? (If it is, it might also be useful to abstract it some more, and to make it available as a service/API to other tests.)

pwolanin’s picture

Hmm, good question - looking at the drupal_web_test_case.php code, it looks like it should show any warning as an exception.

Indeed, running a simplified version of the test, I see 3 exceptions, such as:

Invalid argument supplied for foreach()	Warning	taxonomy.module	1642	taxonomy_node_update_index()

So, if exceptions (vs. fails) are good enough, we can go that route.

pwolanin’s picture

here's such a version of the test

catch’s picture

Looks great, again.

Dries’s picture

Status: Reviewed & tested by the community » Fixed

Committed to CVS HEAD. Thanks!

cwgordon7’s picture

Version: 7.x-dev » 6.x-dev
Status: Fixed » Needs review
FileSize
826 bytes

The same problem exists in Drupal 6. The attached patch is a backport of pwolanin's patch in #8 without the test case.

agerson’s picture

Thanks cwgordon7. I was unable to run cron with search turned on due to this issue and this patched fixed it for d6.

Anonymous’s picture

How does one apply this patch, I too am having trouble with this error.
Gary

Damien Tournoud’s picture

Version: 6.x-dev » 7.x-dev
Status: Needs review » Needs work

Can someone explain what is the exact problem here?

The taxonomy_node_load() loads $node->taxonomy unconditionally, so the taxonomy module can safely assume that $node->taxonomy is set when taxonomy_node_update_index() is called.

The patch, and the test, fall in the "don't babysit broken code" zone. Please revert.

pwolanin’s picture

Damien - the error is occurring often in 6.x sites it seems. I'm not sure if the root bug is in core or contrib, so I don't think we should consider reverting unless someone is able to get more insight into that.

Damien Tournoud’s picture

Status: Needs work » Needs review
FileSize
2 KB

@pwolanin: I never personally saw that error. I'm asking for a revert based on two facts:

  • we are now, in TaxonomyTermUnitTest, voluntarily introducing a bug (someone is unsetting $node->taxonomy) and testing that we are shielded against it
  • we are completely hiding the error message in taxonomy_node_update_index(), so that the bug (wherever is the root cause, core or contrib) is now completely hidden

Here is a patch.

catch’s picture

Status: Needs review » Reviewed & tested by the community

Looking at this again I agree with Damien. We should revert, then this is active (needs more info) until Earnie can reproduce the error with just core.

Dries’s picture

Priority: Critical » Normal
Status: Reviewed & tested by the community » Fixed

After studying the program flow, I have to agree. Rolled back, but we can discuss this further if desired.

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

cyberswat’s picture

I fully blame the data I'm working with but I'm going to document a little more about how this error is showing up on one of our d6 sites and why I believe Damiens statement about someone unsetting $node->taxonomy is incorrect.

When I get the error I can output the data that is being passed into taxonomy_node_update_index() and verify that it does not remotely resemble the structure of a node:

stdClass Object
(
    [build_mode] => 2
    [body] => <span class='print-link'></span>
    [readmore] => 
    [content] => Array
        (
            [print_links] => Array
                (
                    [#weight] => -101
                    [#value] => <span class='print-link'></span>
                    [#title] => 
                    [#description] => 
                    [#printed] => 1
                )

            [body] => Array
                (
                    [#weight] => 0
                    [#value] => 
                    [#title] => 
                    [#description] => 
                    [#printed] => 1
                )

            [#title] => 
            [#description] => 
            [#children] => <span class='print-link'></span>
            [#printed] => 1
        )

)

Doing a debug_backtrace shows me that _node_index_node() received a nid of XYZ ... I can verify that this node is a forum node that exists in the node table. However, doing a node_load(array('nid' => XYZ)) does not return a node object. I'm still researching the root cause and will post again but figured I'd start sharing my findings here.

cyberswat’s picture

Status: Closed (fixed) » Needs review
FileSize
1.99 KB

Obviously there is issues with the data that is in the db I am working with. That being said here is what is happening. node_update_index() is running a query that is doing a left join between the node and search_dataset tables for nids that need to be indexed. It then passes each of these nids to _node_index_node() which begins by doing a node_load ultimately calling node_invoke_nodeapi() which calls taxonomy_nodeapi() with an $op of 'update index' which calls taxonomy_node_update_index(). The last call is what is generating the error that is the symptom this issue was created around.

The problem (discounting the bad data) is that the query run in node_update_index() always returns the same set of nids because they never get properly indexed due to the node_load failing ... so the cycle repeats every time you try to re-index the site or call cron. The less than useful error message that's generated pointing to the taxonomy module is in no way shape or form useful to the vast majority of users that may be encountering the problem. As such this becomes difficult to debug.

I'd like to suggest that a modification to the _node_index_node() function be made so that if node_load returns false (because of bad data) that a more descriptive error message and watchdog entry get generated to help admins of large sites narrow down what may be the root cause.

I guess it could be argued that there is no need to bother because Drupal core didn't directly create the problem ... However, I'm dealing with imports that bring in thousands of nodes where weird stuff sometimes happens. A more explicit error message like the one in the attached patch would cut my debugging time down from potentially hours to a matter of seconds.

cyberswat’s picture

Just an FYI that this is remarkably similar to #471080: Trigger watchdog when node rebuild permissions fails

cyberswat’s picture

FileSize
850 bytes

After seeing similar issues in the queues I'm going to modify the patch to provide a watchdog entry only in node_load(). This won't stop the errors that are being reported by Drupal but it seems like the best place to address the root problem.

Status: Needs review » Needs work

The last submitted patch, node_load-225864-23.patch, failed testing.

cyberswat’s picture

Status: Needs work » Needs review
FileSize
874 bytes

Let's try that again from the correct directory

Aren Cambre’s picture

Possibly related: #690914: Converting taxonomy vocabulary to required causes PHP errors. Is this really a a band aid fix for an underlying error?

cyberswat’s picture

It's not a band aid fix in that the root problem is that the data Drupal is being told to deal with is incorrect. There are a significant number of places that modules rely on the node being returned from node_load to be a properly structured node. The nodes are not improper because of something Drupal is doing .. they are improper because of actions performed by developers to get their data into place. I'm not convinced that this would ever happen with a pure core installation.

Drupal should not have to adjust it's behavior to account for developer errors. I believe that the assumption that if node_load returns a node that it should be correct is valid. The generation of the nodes your referring to in #690914: Converting taxonomy vocabulary to required causes PHP errors is likely similar to what I posted as output above and is not a valid node.

I believe the best approach to this problem is to provide relevant information in watchdog that identifies the node that was being requested to admins so that the task of debugging the developers bad data is as simple as reading the logs and adjusting accordingly.

Aren Cambre’s picture

At the moment, I don't believe #692764: Converting taxonomy vocabulary to required can cause PHP errors in watchdog is caused by a contributed module.

However, I totally agree with your last comment about providing useful debug info. In the least, we should be told which node is the problem. http://drupal.org/node/259632#comment-1793774 shows how I was able to discover bad nodes through a different method.

dalin’s picture

Version: 7.x-dev » 6.x-dev
Component: taxonomy.module » node.module
FileSize
560 bytes

Sending something to Watchdog when a node fails to load is unrelated to this taxonomy cron issue and should be a new issue. What really needs to happen here is that if node_load returns a FALSE, then remove the node from the search index and return. See this patch.

P.S. I'm a bit unsure on the protocol of what order bugs get fixed in. D6 vs. D7. This patch is for D6.

Aren Cambre’s picture

I think they're generally supposed to be fixed in 7 first then backported?

piersg’s picture

I too had the error "Invalid argument supplied for foreach() in /modules/taxonomy/taxonomy.module on line 1215." A fix suggested in another thread (#692764: Converting taxonomy vocabulary to required can cause PHP errors in watchdog) by dman worked for me but this involves modifying core files.

Like cyberswat in #20 above, a var_dump of $node gave nonsense. A debug_print_backtrace() gave the following:

#0  taxonomy_node_update_index(stdClass Object ([build_mode] => 2,[body] => ,[readmore] => ,[content] => Array ([body] => Array ([#weight] => 0,[#value] => ,[#title] => ,[#description] => ,[#printed] => 1),[#title] => ,[#description] => ,[#printed] => 1))) called at [/public_html/modules/taxonomy/taxonomy.module:1206]
#1  taxonomy_nodeapi(stdClass Object ([build_mode] => 2,[body] => ,[readmore] => ,[content] => Array ([body] => Array ([#weight] => 0,[#value] => ,[#title] => ,[#description] => ,[#printed] => 1),[#title] => ,[#description] => ,[#printed] => 1)), update index, , ) called at [/public_html/modules/node/node.module:673]
#2  node_invoke_nodeapi(stdClass Object ([build_mode] => 2,[body] => ,[readmore] => ,[content] => Array ([body] => Array ([#weight] => 0,[#value] => ,[#title] => ,[#description] => ,[#printed] => 1),[#title] => ,[#description] => ,[#printed] => 1)), update index) called at [/public_html/modules/node/node.module:1825]
#3  _node_index_node(stdClass Object ([nid] => 69)) called at [/public_html/modules/node/node.module:1801]
#4  node_update_index()
#5  call_user_func_array(node_update_index, Array ()) called at [/public_html/includes/module.inc:462]
#6  module_invoke(node, update_index) called at [/public_html/modules/search/search.module:273]
#7  search_cron()
#8  call_user_func_array(search_cron, Array ()) called at [/public_html/includes/module.inc:483]
#9  module_invoke_all(cron) called at [/public_html/includes/common.inc:2692]
#10 drupal_cron_run() called at [/public_html/cron.php:11]

So node 69 is at fault (#3). Browsing to /node/69 gave a page not found error. Looking at node 69 in {node_revision} the only thing that looked wrong was [uid] was set to a non-existant user id, and did not match the [uid] for node 69 in {node}. I changed uid in {node_revision} to user 1 and the node then displayed properly, and the cron error went away too. I hope that this helps someone else to solve this elusive bug.

dalin’s picture

Title: Taxonomy : prevent php error from cron.php » Don't assume that node_load returns a node
Version: 6.x-dev » 7.x-dev
Component: node.module » node system
FileSize
16.51 KB

This patch cleans up all instances of node_load.

dalin’s picture

I should add that by not assuming that node_load returns a node we also fix several hard-to-find bugs such as a race condition where someone else deletes a node before you submit your form that tries to act on said node.

Status: Needs review » Needs work

The last submitted patch, 225864_node_load.diff, failed testing.

dalin’s picture

Status: Needs work » Needs review
FileSize
16.98 KB

Status: Needs review » Needs work

The last submitted patch, 225864_node_load_2.diff, failed testing.

dalin’s picture

Status: Needs work » Needs review

#36: 225864_node_load_2.diff queued for re-testing.

pwolanin’s picture

Looks like a nice cleanup.

janusman’s picture

Status: Needs review » Reviewed & tested by the community

Looks good. RTBC?

catch’s picture

Version: 7.x-dev » 8.x-dev
Status: Reviewed & tested by the community » Needs review

Needs to go into 8 first. With most of these it looks like we will fail silently if the node doesn't get returned, that doesn't help people to identify broken nodes and similar.

janusman’s picture

I agree that we should help identify broken nodes (a more "diagnosable" Drupal?), but should also strive for code that is strict and (makes less assumptions... "stability"?)

On this particular issue we have neither (or what is could be improved on) in core, regarding using whatever is returned by node_load(). This patch just aims for "stability" IMO.

@catch: what do you think about just adding some code to the patch, that trigers exceptions inside all the ifs()? Or else how could we add "diagnosability" of broken nodes? (Watchdog()?)

catch’s picture

OK just read through the patch and there's a couple of different things going on:

1. In some of these cases, it's only dealing with a node not existing. There are valid reasons for a node to not exist (i.e. it was just deleted by another user) - if that's the case, we could either fail silently like we are here, or possibly show an friendly error to say the node doesn't exist any more (since presumably the user is expecting behaviour that depends on it existing) and/or log to watchdog.

2.

+    $type = isset($node->type) ? $node->type : t('Unknown');
+    $type_name = isset($node->type) ? node_type_get_name($node) : t('Unknown');
+    $title = isset($node->title) ? $node->title : t('Unknown');
+    watchdog('content', '@type: deleted %title.', array('@type' => $type, '%title' => $title));
+    drupal_set_message(t('@type %title has been deleted.', array('@type' => $type_name, '%title' => $title)));

In this case it's checking for invalid node types or node types as empty strings. IMO that should throw and/or log an error message - right now the patch makes it fail silently, and in this case I'd prefer the garbage in garbage out approach since at least there's some kind of record that the node was invalid when it was deleted. Looking through though, that may be the only occasion where we really, really need to log an error, I'd need to see what happens to the actual functionality when we just continue; in the other code hunks - but if functionality gets broken by the missing nodes, then a trigger_error() or even drupal_set_message() + watchdog with a friendly message seems best - we have something similar for trying to delete a non-existing comment.

pwolanin’s picture

Status: Needs review » Needs work

So, needs work to throw exceptions in some cases?

EvanDonovan’s picture

Got here from https://drupal.org/node/259632. Is anyone continuing to work on this?

Version: 8.0.x-dev » 8.1.x-dev

Drupal 8.0.6 was released on April 6 and is the final bugfix release for the Drupal 8.0.x series. Drupal 8.0.x will not receive any further development aside from security fixes. Drupal 8.1.0-rc1 is now available and sites should prepare to update to 8.1.0.

Bug reports should be targeted against the 8.1.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.2.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

  • Dries committed 0de72ee on 8.3.x
    - Patch #225864 by pwolanin: prevent PHP errors from cron.php.
    
    
  • Dries committed 212d378 on 8.3.x
    - Patch #225864: rollback based on discussion.
    
    

  • Dries committed 0de72ee on 8.3.x
    - Patch #225864 by pwolanin: prevent PHP errors from cron.php.
    
    
  • Dries committed 212d378 on 8.3.x
    - Patch #225864: rollback based on discussion.
    
    

Version: 8.1.x-dev » 8.2.x-dev

Drupal 8.1.9 was released on September 7 and is the final bugfix release for the Drupal 8.1.x series. Drupal 8.1.x will not receive any further development aside from security fixes. Drupal 8.2.0-rc1 is now available and sites should prepare to upgrade to 8.2.0.

Bug reports should be targeted against the 8.2.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.3.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

  • Dries committed 0de72ee on 8.4.x
    - Patch #225864 by pwolanin: prevent PHP errors from cron.php.
    
    
  • Dries committed 212d378 on 8.4.x
    - Patch #225864: rollback based on discussion.
    
    

  • Dries committed 0de72ee on 8.4.x
    - Patch #225864 by pwolanin: prevent PHP errors from cron.php.
    
    
  • Dries committed 212d378 on 8.4.x
    - Patch #225864: rollback based on discussion.
    
    

Version: 8.2.x-dev » 8.3.x-dev

Drupal 8.2.6 was released on February 1, 2017 and is the final full bugfix release for the Drupal 8.2.x series. Drupal 8.2.x will not receive any further development aside from critical and security fixes. Sites should prepare to update to 8.3.0 on April 5, 2017. (Drupal 8.3.0-alpha1 is available for testing.)

Bug reports should be targeted against the 8.3.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.4.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.3.x-dev » 8.4.x-dev

Drupal 8.3.6 was released on August 2, 2017 and is the final full bugfix release for the Drupal 8.3.x series. Drupal 8.3.x will not receive any further development aside from critical and security fixes. Sites should prepare to update to 8.4.0 on October 4, 2017. (Drupal 8.4.0-alpha1 is available for testing.)

Bug reports should be targeted against the 8.4.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.5.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.4.x-dev » 8.5.x-dev

Drupal 8.4.4 was released on January 3, 2018 and is the final full bugfix release for the Drupal 8.4.x series. Drupal 8.4.x will not receive any further development aside from critical and security fixes. Sites should prepare to update to 8.5.0 on March 7, 2018. (Drupal 8.5.0-alpha1 is available for testing.)

Bug reports should be targeted against the 8.5.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.6.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.5.x-dev » 8.6.x-dev

Drupal 8.5.6 was released on August 1, 2018 and is the final bugfix release for the Drupal 8.5.x series. Drupal 8.5.x will not receive any further development aside from security fixes. Sites should prepare to update to 8.6.0 on September 5, 2018. (Drupal 8.6.0-rc1 is available for testing.)

Bug reports should be targeted against the 8.6.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.7.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.6.x-dev » 8.8.x-dev

Drupal 8.6.x will not receive any further development aside from security fixes. Bug reports should be targeted against the 8.8.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.9.x-dev branch. For more information see the Drupal 8 and 9 minor version schedule and the Allowed changes during the Drupal 8 and 9 release cycles.

  • Dries committed 0de72ee on 9.1.x
    - Patch #225864 by pwolanin: prevent PHP errors from cron.php.
    
    
  • Dries committed 212d378 on 9.1.x
    - Patch #225864: rollback based on discussion.
    
    

Version: 8.8.x-dev » 8.9.x-dev

Drupal 8.8.7 was released on June 3, 2020 and is the final full bugfix release for the Drupal 8.8.x series. Drupal 8.8.x will not receive any further development aside from security fixes. Sites should prepare to update to Drupal 8.9.0 or Drupal 9.0.0 for ongoing support.

Bug reports should be targeted against the 8.9.x-dev branch from now on, and new development or disruptive changes should be targeted against the 9.1.x-dev branch. For more information see the Drupal 8 and 9 minor version schedule and the Allowed changes during the Drupal 8 and 9 release cycles.

catch’s picture

Version: 8.9.x-dev » 7.x-dev
Issue summary: View changes
Status: Needs work » Fixed

Moving this back to fixed against Drupal 7, after ten years a lot of this code has changed.

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.