First of all: I don't know if I put correctly this bug/feature. It is intended to the developers of drupal, for the newest version, but at this time, I'm testing on 4.7.5 version. I've no time to test this on v5-rc2 (sorry).

I ve seen, that, in some cases (in my case in category nodes) , saving is a bit slower. This is OK, but I've seen a big problem when Drupal is exceeding the PHP max_execution_time, or the Apache max_time (for connection).

Think about this scenario:

- PHP "max_execution_time" set to 2 seconds.
- Try to save new nodes.

In some cases, you can save data, in others not. But, there are some cases that the data will be not 100% saved.

And here's the problem: In MySQL, the "node" table will get the new row about your new node, but the table "node_revisions" don't.

The "save" transaction gets broken, and the data corrupted. In this case, you will get an error when you try to view the node: "Access denied". And the same error raises if you try to see it with de superuser (or god user, user 0).

And, if you install some modules, you can obtain errors derived form this, because some tables got filled, but others not.

When I was developing programs in VB & MS-SQL Server, I had a pair of SQL instructions: BEGIN TRANS and COMMIT TRANS.
I don't know if there's something similar both in MySQL and PostgreSQL.

But if you can, I think you should put one "BEGIN TRANSACTION" at start of node saving. And one "COMMIT TRANSACTION" at its end.

This will ensure the data integrity of our databases. And, when Drupal exceeds the max execution time, MySQL will never save a transaction that it's not completed.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

moshe weitzman’s picture

Status: Active » Closed (works as designed)

drupal tries to remain slim and not handle everything that can possibly go wrong. your example of 2 sec max_execution_time is not realistic. we need real world benefits to justify added complexity. also remember database independance requirement.

deavidsedice’s picture

No, It is not about strange cases.

I've find this bug in a real (and more complex) drupal installation.
I have a max_execution_time of 30 seconds, but, some modules are slower when you have a big site.
Then, when I try to save, it expends from 20 to 40 seconds to complete.

In some cases, Drupal gets slower, and it eats a lot of memory (because the admins has installed some third-party modules, of course). Then, some times PHP cannot end the execution.

Finally, think in other drupal-admins, that aren't programmer, and they doesn't understand how work the databases. One day, he discovered something strange on Drupal: one node that he can't access, and he can't delete. This is a bit frustrating.
I've discovered how to repair them, because I searched the Drupal tables. But others can have the same problem and they can't do anything to solve it.

seanmclucas’s picture

Version: 4.7.5 » 5.6

Yeah, I don't understand why there is no transaction function built into Drupal. It seems they have included it for version 6 if you are using psgsql? I need a solution for MySQL and D5.6.

My issue is about importing data from another database. I could just write the SQL queries by hand, however, I would prefer to use Drupal's node save function. The problem is that I haven't found a way to wrap this function inside a transaction. I have node relationships that need to be maintained while I import the data. If one node can't get created for some reason, Drupal will still insert that node's "relationship" node and the database will have corrupt data. Not good...

moshe weitzman’s picture

Version: 5.6 » 7.x-dev
Component: node system » database system
Category: bug » feature
Priority: Minor » Normal
Status: Closed (works as designed) » Active

Lets see what DB team thinks of this. We actually have transaction support now in the DB layer.

seanmclucas’s picture

Transaction support in version 6 & 7 right? Not D5?

From what I could tell it was only for D6+ with Postgresql databases. Shouldn't I also be able to convert my MySQL tables to innoDB and use some kind of built in Drupal transaction functions?

I guess this isn't a high priority for D5, but it is limitation of the technology stack that I am using.

Crell’s picture

This is blocked on #301049: Transaction nesting not tracked by connection. Once that's in and the transaction API is exposed, however, node_save() is a perfect candidate for where we should add a transaction. (This is D7 only. D6 does not support transactions.)

Crell’s picture

Title: node_save system needs a transaction » Add transactions to key _save routines
Category: feature » task
Status: Active » Needs review
FileSize
36.94 KB

OK, here we go with transactions for node_save(), user_save(), and comment_save(). There are likely other places, but we'll get to those in later patches. :-)

David Strauss’s picture

Status: Needs review » Needs work

I'm all for actually using transactions. The only reason I haven't put forward such a patch is because our testing system simply won't test it. The testing system will, however, happily give the green light. Even if the patch is perfect, it's not a good idea to commit things that (1) aren't tested and (2) could easily be broken by later changes. That said, the value of this patch may exceed the risk of leaving the codepath untested on our main testing system.

Objections to the current patch itself:
* Doesn't ->rollback() throw an exception if you're not actually in a transaction? If not, that needs to change, and this needs to be ready to handle an exception on unsuccessful rollback. We also don't accomplish anything by handling a generic exception, rolling back, and then throwing an exception that won't be handled.
* Generically catching all exceptions is bad. Either this patch should catch specific exceptions or let all exceptions bubble up. If there's an unhandled exception, the transaction will roll back, anyway, by virtue of not being committed.

Crell’s picture

My concern is that most Drupal devs don't know how to deal with exceptions, and until we have a good "exception culture" I'm wary about allowing arbitrary exceptions to just keep on throwing up god-knows how far.

I also, thinking about it, realized another issue. The watchdog() call would also get rolled back, as the transaction doesn't actually close until $transaction goes out of scope. When that happens, the watchdog() record will get rolled back, too. Oops.

David, I'll defer to you on the better way to do this since you've worked with transactions more than I have.

David Strauss’s picture

My concern is that most Drupal devs don't know how to deal with exceptions, and until we have a good "exception culture" I'm wary about allowing arbitrary exceptions to just keep on throwing up god-knows how far.

Having a completely unhandled exception is at least as useful as what we get now with a fatal error. I also don't think we'll foster a good "exception culture" unless we provide incentive to properly handle exceptions.

I also, thinking about it, realized another issue. The watchdog() call would also get rolled back, as the transaction doesn't actually close until $transaction goes out of scope. When that happens, the watchdog() record will get rolled back, too. Oops.

David, I'll defer to you on the better way to do this since you've worked with transactions more than I have.

Well, we have a few options:

* Use a different database connection for logging. We could connect it on the first write to watchdog. Users of syslog wouldn't have this problem. Since watchdog isn't designed to be a very high-volume log, this wouldn't be a big problem.
* Move the guts of *_save to private functions, re-throw any exceptions from the catch block (where we schedule the roll back), and catch+log the exception in the new, outer function.
* Have watchdog() defer its database writes while in a transaction. There are many ways to do this; none are particularly clean.
* Have ->rollback() take watchdog-esque arguments that the transaction API then records until the transaction is over and then sends to watchdog. This is reasonably elegant because developers generally want to log on rollback.

I generally prefer solutions that keep watchdog from being touched in any way by transactional behavior.

David Strauss’s picture

I should add that the "move the guts" option only works if another function hasn't also decided to call *_save while already in a transaction.

David Strauss’s picture

Tagging for work this week.

David Strauss’s picture

Status: Needs work » Needs review
FileSize
42.58 KB
Crell’s picture

Conceptually I like #13. It provides a clean way to say "hey, transaction specific error stuff here" while still blowing away any other watchdog actions that get rolled back. My one problem is that it introduces a dependency in DBTNG to the watchdog function. Right now DBTNG is extremely self-contained, and touches higher level utilities at only 2 levels (one check_plain() call and one drupal_alter() call). Is there no way we can turn that around and make it a listener, like the query logger?

Hm. You know... I wonder if this is somewhere that #601020: Exception Handler Logging and Improvements would come in useful?

David Strauss’s picture

I like the listener pattern, but how could Watchdog register itself as a DB-TNG listener? Every pattern I can think of would either offer a subset of watchdog functionality or make the decoupling from Watchdog superficial (where DB-TNG depends on a service exactly like Watchdog, even if it isn't watchdog).

As for "Watchdog Exception," it appears to merely streamline the process of logging exceptions to Watchdog. Part of the problem here is there isn't a good place for the transaction layer to throw an exception. If it throws an exception during the destructor, PHP breaks. It could throw the exception during the rollback request, but it's unclear how we could efficiently collect and pipe the data to Watchdog post-transaction when dealing with a set of nested transactions.

Crell’s picture

Well, DBTNG could depend on a system similar to watchdog *IF* you want to get transaction logs. If you don't want transaction logs, it just drops the rollback exception information on the floor.

My thinking of watchdog_exception() is to make it do the special "record to memory and save after the transaction is over" stuff. Not sure if that's a good idea, but it came to mind as a maybe.

David Strauss’s picture

FileSize
46.02 KB
Crell’s picture

Status: Needs review » Needs work

We're always retrieving the logging callback properties as a matched set, so let's just make them a single array property of Database rather than 3 separate properties.

The user_save hunk includes a crapload of tabs. :-(

Otherwise I like this approach.

David Strauss’s picture

Status: Needs work » Needs review
FileSize
45.84 KB

Now using arrays for callbacks.

David Strauss’s picture

FileSize
46.17 KB

Minus tabs.

David Strauss’s picture

Here's a patch without whitespace changes. This is only for human review.

Crell’s picture

Status: Needs review » Reviewed & tested by the community

As soon as the bot approves, #2 has my blessing. (Ignore #21, it's a review-only patch.)

Crell’s picture

Erk. #20 has my blessing, I mean. Not #2. :-)

Josh Waihi’s picture

yay!

David Strauss’s picture

Because of the indentation changes, this is a huge pain to re-roll. Can we get this committed, please?

moshe weitzman’s picture

The security team received a report which would be fixed by this issue. Here is an excerpt from node_save().

drupal_write_record('node', $node);
_node_save_revision($node, $user->uid);

module_invoke_all('node_' . $op, $node);

// Update the node access table for this node.
node_access_acquire_grants($node);

If php fails between during hook_node(), we have created a node without its access control. Considering that we now explicitly support external storage for fields, this is perhaps more likely.

I'll add that economist.com is using this patch successfully on D6.

moshe weitzman’s picture

Priority: Normal » Critical
Crell’s picture

Status: Reviewed & tested by the community » Needs work

Somewhere along the line this broke and the testbot didn't notice due to the patch in #21. David, can you reroll?

webchick’s picture

This needs a re-roll. Masked by #21.

chx’s picture

Status: Needs work » Reviewed & tested by the community
FileSize
45.62 KB

This is important so I rerolled.

Status: Reviewed & tested by the community » Needs work

The last submitted patch failed testing.

chx’s picture

Status: Needs work » Needs review

Can't reproduce. Tried with MyISAM tables too.

Status: Needs review » Needs work

The last submitted patch failed testing.

chx’s picture

Status: Needs work » Needs review
FileSize
46.8 KB

Crell points out i left out the test module. That would do it.

Crell’s picture

Status: Needs review » Reviewed & tested by the community

chxsnack!

Note that most of this patch is just playing games with indentation as we add more try-catch blocks, so it's not as big as it seems.

David Strauss’s picture

All also add that I don't believe a single substantive (non-whitespace and non-fuzz-related) line of this patch has changed since before code freeze.

webchick’s picture

Given that this patch was ready to go by 10/15 and didn't make it into UNSTABLE-10 due to a test bot freak out, and given also that the consensus appears to be from two of the database system maintainers that this patch is basically required to make our new shiny transaction support actually do something, and also has thumbs-up from two folks on the security team, I'm inclined to commit this.

Leaving as RTBC for 48 hours in case Dries wants to chime in. #21 shows the changes without the whitespace differences for easier reviewing.

Dries’s picture

This looks good to me. Cody-style is not 100% (documentation wraps too early), but that shouldn't be a showstopper IMO.

webchick’s picture

Status: Reviewed & tested by the community » Fixed

Ok, committed to HEAD! Thanks!

Maybe we can file a novice patch for fixing all of our documentation to consistently wrap at 80 chars. Hm...

Status: Fixed » Closed (fixed)
Issue tags: -Security Advisory follow-up, -chx-and-david-performance-sprint

Automatically closed -- issue fixed for 2 weeks with no activity.