Support for Drupal 7 is ending on 5 January 2025—it’s time to migrate to Drupal 10! Learn about the many benefits of Drupal 10 and find migration tools in our resource center.
Invalid argument to foreach is causing a watchdog entry for a PHP error. This patch prevents the error.
Comment | File | Size | Author |
---|---|---|---|
#36 | 225864_node_load_2.diff | 16.98 KB | dalin |
#32 | 225864_node_load.diff | 16.51 KB | dalin |
#29 | node_index_node-225864-D6.diff | 560 bytes | dalin |
#25 | node_load-225864-24.patch | 874 bytes | cyberswat |
#23 | node_load-225864-23.patch | 850 bytes | cyberswat |
Comments
Comment #1
lilou CreditAttribution: lilou commentedPatch re-rolled.
Comment #2
lilou CreditAttribution: lilou commentedComment #3
catchShould we not clean up the APIs so we're not constantly casting between ints and arrays? Either way this could use a test case to prevent regressions.
Comment #4
pwolanin CreditAttribution: pwolanin commentedA little simpler patch, with test. Test has two fails in the absence of the patch to taxonomy module.
Comment #5
catchLooks great. We should forget about array/int casting and just get the vocabularies as fields patch in.
Comment #6
Dries CreditAttribution: Dries commentedI'm a bit confused by this -- aren't warnings already reported as exceptions? Is the error handling in this patch strictly necessary? (If it is, it might also be useful to abstract it some more, and to make it available as a service/API to other tests.)
Comment #7
pwolanin CreditAttribution: pwolanin commentedHmm, good question - looking at the drupal_web_test_case.php code, it looks like it should show any warning as an exception.
Indeed, running a simplified version of the test, I see 3 exceptions, such as:
So, if exceptions (vs. fails) are good enough, we can go that route.
Comment #8
pwolanin CreditAttribution: pwolanin commentedhere's such a version of the test
Comment #9
catchLooks great, again.
Comment #10
Dries CreditAttribution: Dries commentedCommitted to CVS HEAD. Thanks!
Comment #11
cwgordon7 CreditAttribution: cwgordon7 commentedThe same problem exists in Drupal 6. The attached patch is a backport of pwolanin's patch in #8 without the test case.
Comment #12
agerson CreditAttribution: agerson commentedThanks cwgordon7. I was unable to run cron with search turned on due to this issue and this patched fixed it for d6.
Comment #13
Anonymous (not verified) CreditAttribution: Anonymous commentedHow does one apply this patch, I too am having trouble with this error.
Gary
Comment #14
Damien Tournoud CreditAttribution: Damien Tournoud commentedCan someone explain what is the exact problem here?
The taxonomy_node_load() loads $node->taxonomy unconditionally, so the taxonomy module can safely assume that $node->taxonomy is set when taxonomy_node_update_index() is called.
The patch, and the test, fall in the "don't babysit broken code" zone. Please revert.
Comment #15
pwolanin CreditAttribution: pwolanin commentedDamien - the error is occurring often in 6.x sites it seems. I'm not sure if the root bug is in core or contrib, so I don't think we should consider reverting unless someone is able to get more insight into that.
Comment #16
Damien Tournoud CreditAttribution: Damien Tournoud commented@pwolanin: I never personally saw that error. I'm asking for a revert based on two facts:
Here is a patch.
Comment #17
catchLooking at this again I agree with Damien. We should revert, then this is active (needs more info) until Earnie can reproduce the error with just core.
Comment #18
Dries CreditAttribution: Dries commentedAfter studying the program flow, I have to agree. Rolled back, but we can discuss this further if desired.
Comment #20
cyberswat CreditAttribution: cyberswat commentedI fully blame the data I'm working with but I'm going to document a little more about how this error is showing up on one of our d6 sites and why I believe Damiens statement about someone unsetting $node->taxonomy is incorrect.
When I get the error I can output the data that is being passed into taxonomy_node_update_index() and verify that it does not remotely resemble the structure of a node:
Doing a debug_backtrace shows me that _node_index_node() received a nid of XYZ ... I can verify that this node is a forum node that exists in the node table. However, doing a node_load(array('nid' => XYZ)) does not return a node object. I'm still researching the root cause and will post again but figured I'd start sharing my findings here.
Comment #21
cyberswat CreditAttribution: cyberswat commentedObviously there is issues with the data that is in the db I am working with. That being said here is what is happening. node_update_index() is running a query that is doing a left join between the node and search_dataset tables for nids that need to be indexed. It then passes each of these nids to _node_index_node() which begins by doing a node_load ultimately calling node_invoke_nodeapi() which calls taxonomy_nodeapi() with an $op of 'update index' which calls taxonomy_node_update_index(). The last call is what is generating the error that is the symptom this issue was created around.
The problem (discounting the bad data) is that the query run in node_update_index() always returns the same set of nids because they never get properly indexed due to the node_load failing ... so the cycle repeats every time you try to re-index the site or call cron. The less than useful error message that's generated pointing to the taxonomy module is in no way shape or form useful to the vast majority of users that may be encountering the problem. As such this becomes difficult to debug.
I'd like to suggest that a modification to the _node_index_node() function be made so that if node_load returns false (because of bad data) that a more descriptive error message and watchdog entry get generated to help admins of large sites narrow down what may be the root cause.
I guess it could be argued that there is no need to bother because Drupal core didn't directly create the problem ... However, I'm dealing with imports that bring in thousands of nodes where weird stuff sometimes happens. A more explicit error message like the one in the attached patch would cut my debugging time down from potentially hours to a matter of seconds.
Comment #22
cyberswat CreditAttribution: cyberswat commentedJust an FYI that this is remarkably similar to #471080: Trigger watchdog when node rebuild permissions fails
Comment #23
cyberswat CreditAttribution: cyberswat commentedAfter seeing similar issues in the queues I'm going to modify the patch to provide a watchdog entry only in node_load(). This won't stop the errors that are being reported by Drupal but it seems like the best place to address the root problem.
Comment #25
cyberswat CreditAttribution: cyberswat commentedLet's try that again from the correct directory
Comment #26
Aren Cambre CreditAttribution: Aren Cambre commentedPossibly related: #690914: Converting taxonomy vocabulary to required causes PHP errors. Is this really a a band aid fix for an underlying error?
Comment #27
cyberswat CreditAttribution: cyberswat commentedIt's not a band aid fix in that the root problem is that the data Drupal is being told to deal with is incorrect. There are a significant number of places that modules rely on the node being returned from node_load to be a properly structured node. The nodes are not improper because of something Drupal is doing .. they are improper because of actions performed by developers to get their data into place. I'm not convinced that this would ever happen with a pure core installation.
Drupal should not have to adjust it's behavior to account for developer errors. I believe that the assumption that if node_load returns a node that it should be correct is valid. The generation of the nodes your referring to in #690914: Converting taxonomy vocabulary to required causes PHP errors is likely similar to what I posted as output above and is not a valid node.
I believe the best approach to this problem is to provide relevant information in watchdog that identifies the node that was being requested to admins so that the task of debugging the developers bad data is as simple as reading the logs and adjusting accordingly.
Comment #28
Aren Cambre CreditAttribution: Aren Cambre commentedAt the moment, I don't believe #692764: Converting taxonomy vocabulary to required can cause PHP errors in watchdog is caused by a contributed module.
However, I totally agree with your last comment about providing useful debug info. In the least, we should be told which node is the problem. http://drupal.org/node/259632#comment-1793774 shows how I was able to discover bad nodes through a different method.
Comment #29
dalinSending something to Watchdog when a node fails to load is unrelated to this taxonomy cron issue and should be a new issue. What really needs to happen here is that if node_load returns a FALSE, then remove the node from the search index and return. See this patch.
P.S. I'm a bit unsure on the protocol of what order bugs get fixed in. D6 vs. D7. This patch is for D6.
Comment #30
Aren Cambre CreditAttribution: Aren Cambre commentedI think they're generally supposed to be fixed in 7 first then backported?
Comment #31
piersg CreditAttribution: piersg commentedI too had the error "Invalid argument supplied for foreach() in /modules/taxonomy/taxonomy.module on line 1215." A fix suggested in another thread (#692764: Converting taxonomy vocabulary to required can cause PHP errors in watchdog) by dman worked for me but this involves modifying core files.
Like cyberswat in #20 above, a var_dump of $node gave nonsense. A debug_print_backtrace() gave the following:
So node 69 is at fault (#3). Browsing to /node/69 gave a page not found error. Looking at node 69 in {node_revision} the only thing that looked wrong was [uid] was set to a non-existant user id, and did not match the [uid] for node 69 in {node}. I changed uid in {node_revision} to user 1 and the node then displayed properly, and the cron error went away too. I hope that this helps someone else to solve this elusive bug.
Comment #32
dalinThis patch cleans up all instances of node_load.
Comment #33
dalinI should add that by not assuming that node_load returns a node we also fix several hard-to-find bugs such as a race condition where someone else deletes a node before you submit your form that tries to act on said node.
Comment #36
dalinComment #38
dalin#36: 225864_node_load_2.diff queued for re-testing.
Comment #39
pwolanin CreditAttribution: pwolanin commentedLooks like a nice cleanup.
Comment #40
janusman CreditAttribution: janusman commentedLooks good. RTBC?
Comment #41
catchNeeds to go into 8 first. With most of these it looks like we will fail silently if the node doesn't get returned, that doesn't help people to identify broken nodes and similar.
Comment #42
janusman CreditAttribution: janusman commentedI agree that we should help identify broken nodes (a more "diagnosable" Drupal?), but should also strive for code that is strict and (makes less assumptions... "stability"?)
On this particular issue we have neither (or what is could be improved on) in core, regarding using whatever is returned by node_load(). This patch just aims for "stability" IMO.
@catch: what do you think about just adding some code to the patch, that trigers exceptions inside all the ifs()? Or else how could we add "diagnosability" of broken nodes? (Watchdog()?)
Comment #43
catchOK just read through the patch and there's a couple of different things going on:
1. In some of these cases, it's only dealing with a node not existing. There are valid reasons for a node to not exist (i.e. it was just deleted by another user) - if that's the case, we could either fail silently like we are here, or possibly show an friendly error to say the node doesn't exist any more (since presumably the user is expecting behaviour that depends on it existing) and/or log to watchdog.
2.
In this case it's checking for invalid node types or node types as empty strings. IMO that should throw and/or log an error message - right now the patch makes it fail silently, and in this case I'd prefer the garbage in garbage out approach since at least there's some kind of record that the node was invalid when it was deleted. Looking through though, that may be the only occasion where we really, really need to log an error, I'd need to see what happens to the actual functionality when we just continue; in the other code hunks - but if functionality gets broken by the missing nodes, then a trigger_error() or even drupal_set_message() + watchdog with a friendly message seems best - we have something similar for trying to delete a non-existing comment.
Comment #44
pwolanin CreditAttribution: pwolanin commentedSo, needs work to throw exceptions in some cases?
Comment #45
EvanDonovan CreditAttribution: EvanDonovan commentedGot here from https://drupal.org/node/259632. Is anyone continuing to work on this?
Comment #59
catchMoving this back to fixed against Drupal 7, after ten years a lot of this code has changed.