Support from Acquia helps fund testing for Drupal Acquia logo

Comments

catch’s picture

Priority: Normal » Critical

Bumping priority - we can't fully convert core until hook_update_N() are converted - and this is blocking it.

sun’s picture

Status: Active » Needs review
Issue tags: +Needs architectural review
FileSize
642 bytes

I wonder whether it is actually as simple as this.

sun’s picture

FileSize
631 bytes

Of course, we rather want this (and properly intialize $ret btw).

Obviously, $ret['queries'] is not (yet) used, but it demonstrates the killing concept.

Dries’s picture

Issue tags: +Favorite-of-Dries

Tagging!

moshe weitzman’s picture

Damn, thats simple.

sun’s picture

I've tagged this issue with "Needs architectural review", but I don't think anyone does really review all issues with this tag. DrupalWTF? ;)

The difference to update_sql() is that update_sql() could be used specifically. Which means:

Not for SELECT queries that may just run to return relevant data for the update.

However, due to limitations of update_sql(), many module updates did not return _any_ result. For example, a simple update with variable_get() or variable_del() involved did not return anything. With generic database query logging we would see _all_ queries that run during an update.

On top of that, all DBTNG methods currently throw a PDO exception when any query cannot be executed. This means: Site is set into maintenance mode whenever any query results in a warning. I didn't test, but I really wonder what happens if a query during a module update throws an exception...

Anyway, to me, the actual architectural question boils down to: What do we really want to see during updates?

a) All queries.

b) Only insert/update/delete queries.

Please consider thoroughly. I was inclined to vote for b), but on a second thought, a) would produce much more helpful output for debugging purposes. (I hope) we don't really believe that anyone really reads the output of module updates - so in fact, they really only matter when something goes wrong.

Berdir’s picture

Converting data with batch api will (can) result in *many* (as in: thousands) queries. I don't think it's a good idea to display them all.

c) Only display DDL (http://en.wikipedia.org/wiki/Data_Definition_Language) queries.

d) show DDL and auto-sumarized DML (http://en.wikipedia.org/wiki/Data_Manipulation_Language) queries, for example"INSERT INTO node VALUES (....)" (Executed 1264 times

Imho, d) sounds cool, but might be harder to implement as the other options.

Dries’s picture

I'd record and store all queries, but not display them. For the average user, the queries are scary. I think it is OK to store them all, as long we don't put them in everyone's face.

yched’s picture

Everything you put in $ret ends up stored in the batch 'results' (see update_do_one()), which is then fetched/unserialized and serialized/stored along with the whole $batch structure on each subsequent batch request.
As Berdir pointed in #7, a multipass update such as the one in #372743: Body and teaser as fields will potentially log thousands of queries, which might badly slow down the remaining process. There were similar performance issues in the first iterations of the "run tests through batch API" patch, until the tests results got moved out of the batch results and into a dedicated table.

David Strauss’s picture

I have to admit, I don't like this suggested approach. The idea of starting a query log seems very hackish to me, at least as a core (non-debug/dev) system. Perhaps the whole approach of "display the update queries" is flawed, as Dries has suggested.

I do agree that the current update_sql() is a poor system. At the same time, there aren't too many good upgrade options in MySQL, because MySQL doesn't support transactional DDL.

For enterprise projects, I mostly care what upgrade function(s) failed. The queries themselves aren't too useful to me. Could we have an update system that throws/catches exceptions and reports failed upgrade functions?

catch’s picture

Now that we have the descriptions for updates before running them, logging the queries doesn't seem as important. A better pass/fail indicator sounds like a good way around it. If we didn't have $ret any more, then hook_update_N() functions could also return simple strings like 'added foo index to bar table'.

moshe weitzman’s picture

I agree that logging and displaying queries is not needed.

Dries’s picture

Looks like we're all on the same page then. :)

We could do this in two steps: (i) remove the $ret stuff and (ii) come up with alternative, more high-level logging.

Status: Needs review » Needs work

The last submitted patch failed testing.

webchick’s picture

Status: Needs work » Needs review
catch’s picture

Status: Needs review » Active

I'd be pretty happy with showing the descriptions we currently have on update.php prior to updates as confirmation as well "the following updates were run, here's that list again we just showed you". Since there's no patch for that, or removing $ret, marking back to active.

Crell’s picture

Subscribing. With the better descriptions available now I like catch's suggestion in #16. I agree that we don't necessarily need to always show queries.

After some discussion just now in IRC with webchick and sun, what about this:

1) All update hooks may optionally return a string, which in turn will be displayed on the results page. The results page will then show the update description again followed by whatever message was returned. The return message is assumed to be a success message.

2) Wrap every update function in a try-catch block. If something goes wrong in the DB layer, the DB layer throws an exception already and we catch it and display that as an error message. If anything else goes wrong, the update function itself should throw an exception (it is an error condition, after all) that also gets caught and we display that error message. We could even stop and run no further updates from the same module at that point, but could continue to run updates in other modules.

3) A devel-only variable is added to the controlling code that runs the update hooks. If set TRUE by devel module, we start a separate query logger for each update hook, catch all of the queries (DDL and DML) and display those. That's there for development and testing only, so has no effect on normal usage. And since it's devel-only, we can answer the "but what if you have 5000 queries" question with "it's a development tool, if it explodes on you like that it's your problem." (We already do that with things like, say, dpm() inside a long loop.)

That hugely cleans up the update process, lets us then kill update_sql() and the abomination that is $ret, and makes my refactoring of schema API easier because I can get rid of $ret. It's also using exceptions for exactly what they are intended for: Out of band error handling that lets us recover gracefully from a variety of error conditions.

Who wants to code it? :-)

sun’s picture

Status: Active » Needs review
FileSize
1.42 KB

OK. Here comes a patch I wrote based on a one-liner by Crell in IRC, but I don't really understand what I'm doing. :-D

I hope this enables someone who groks this a bit more to complete it. :)

webchick’s picture

Issue tags: +DrupalWTF, +DIE

Tagging. ;)

sun’s picture

Issue tags: -DrupalWTF, -DIE
FileSize
2.35 KB

ok, slightly improved.

sun’s picture

Issue tags: +DrupalWTF, +DIE

.

moshe weitzman’s picture

I propose to keep update.php out of the query logging business entirely. Instead, add a module_invoke_all() before each update. Add one at the beginning and end of all updates for good measure. Devel will use the hooks to do its own query logging just like it does for the whole page query log.

Crell’s picture

Status: Needs review » Needs work

I'm good with a variable or with a double-hook. I have no strong preference there. However, the docblock change at the top of the patch is wrong. In the new API, an update hook should *return* a success message or throw a DrupalUpdateException with an error message. It shouldn't return an error message.

Crell’s picture

Status: Needs work » Needs review
FileSize
2.85 KB

Reroll. This is really just a doc cleanup. I talked to Moshe in IRC and he withdrew the extra hook as pointless. :-) The rest of the exception code is the same as before.

Crell’s picture

And I just realized that is not going to work, because update functions are batch API callsbacks (kinda) and therefore need to be able to return batch API-related information, like #finished. So we still need to support returning an array from the update function. The exception part is fine, just not the normal behavior part. :-(

Any batch API gurus want to weigh in here?

Damien Tournoud’s picture

Erm? It's not really pratical to enable Devel in most kind of major version updates. There is a huge chicken-and-egg issue there.

I would *really* like to log all failed queries. That's very important to debug *why* updates are failing (and they generally do... ask Gabor and the upgrade path of drupal.org from Drupal 5 to 6).

Crell’s picture

FileSize
6.58 KB

After talking with yched, this will NOT break multi-stage updates because those should be using the passed-in $sandbox variable anyway. Attached patch includes the appropriate updates, a note on the one-liner BC layer (which we can remove really quickly after existing updates are updated), and several dummy update hooks in system.install to demonstrate the new code working. I couldn't figure out how else to test the update system with normal simpletest, but this should be sufficient. Naturally, *do not commit anything to system.install from this patch*. :-)

@Damien: If a query fails, it will throw an exception. We're still catching all exceptions and printing their exception message. So we'll still get whatever the exception-causing query was printed to the screen.

sun’s picture

Quite interesting!

Another *follow-up* issue (before converting all existing updates) would then be to _remove_ the $ret argument from all database functions that take it as first argument. Yuck. Definitely a multi-step process.

yched’s picture

Status: Needs review » Needs work
+ * If an update function needs to be re-run as part of a batch process, it
+ * should accept the $context array by reference as its first parameter
+ * and set the #finished property to the percentage completed that it is, as a
+ * decmial fraction of 1.

There's a snafu here. Quick summary:
- update_do_one() is the only batch callback involved, batch API does not see hook_update_N.
- update_do_one() receives a $context array by ref from batch API, which includes a $context['sandbox'] array for storage of multipass business data.
- update_do_one() calls hook_update_N(). We dont want them to mess with the whole $context, we only give them $context['sandbox'], which they take by ref as $sandbox param (if they need multipass processing; they can just overlook the param if not).

So :
- The doc above should mention the $sandbox param, not $context.
- update_do_one() needs to:
take $context['sandbox']['#finished'] (set by hook_update_N), if any,
push it one level up to $context['finished'] (because this is where batch API will look, it doesn't care about what's in the sandbox),
and then unset the one in $context['sandbox'] (because we don't need it in the sandbox on hook_update_N's next pass)

Also, the # in '#finished' was here for compatibility with pre-batch API convention on $ret['#finished'] for updates. If we remove $ret altogether and set up a new convention, we don't *have* to keep the #. Although, since $sandbox is supposed to be 'this is all yours, do absolutely what you need in there', setting a convention on a specific entry might warrant a special prefix.

Of course, this API change will need to be mentioned in the module upgrade guide - not only '$ret is dead', but also 'the way to report "not finished" for multipass updates has changed'.

yched’s picture

Also: I expect the name of the 'query' entry will change once the BC layer is removed and this entry becomes strictly internal to update_do_one() and update_results_page() ? Because 'query' holding a user-facing message is a little off ;-)

Crell’s picture

Status: Needs work » Needs review
FileSize
7.06 KB

Thanks for the clarification, yched! New patch with revised docs and an extra code block to handle both the new and old methods. Again, it's marked for removal later.

yched’s picture

+  else {
+    $sandbox['#finished'] = 1;
+  }

shouldn't be needed. Batch API assumes 'finished' not set means 1 (finished, no need to call again). We should just make sure that the update_do_one() intermediate layer preserves that behavior (no $sandbox['#finished'] means no $context['finished'] and thus batch API assumes 1)

Crell’s picture

FileSize
7.06 KB

Yep, it works that way, too. :-)

Status: Needs review » Needs work

The last submitted patch failed testing.

Crell’s picture

Status: Needs work » Needs review
FileSize
6.88 KB

Chasing head. C'mon, webchick, let's kill it dead now!

Crell’s picture

Issue tags: +API change

Tagging to get more attention in the next few days, I hope...

chx’s picture

+++ includes/update.inc	2009-08-26 05:48:07 +0000
@@ -365,13 +365,28 @@ function update_parse_db_url($db_url) {
+ * will be aborded. The schema version will not be updated in this case, and all

aborted and i think it might be too wide, even.

+++ includes/update.inc	2009-08-26 05:48:07 +0000
@@ -365,13 +365,28 @@ function update_parse_db_url($db_url) {
+ * decmial fraction of 1.

Fraction

+++ includes/update.inc	2009-08-26 05:48:07 +0000
@@ -386,16 +401,58 @@ function update_do_one($module, $number,
+    $context['log'] = (bool) variable_get('update_log_queries', FALSE);

that bool casting is ugly. I can haz clean code please?

+++ includes/update.inc	2009-08-26 05:48:07 +0000
@@ -386,16 +401,58 @@ function update_do_one($module, $number,
+    // @TODO We may want to do different error handling for different exception
+    // types, but for now we'll just print the message.

Then don't write the code twice just mention in the comment that DrupalUpdateException might use different code.

+++ includes/update.inc	2009-08-26 05:48:07 +0000
@@ -416,6 +473,11 @@ function update_do_one($module, $number,
+class DrupalUpdateException extends Exception { }

The } needs to be on a separate line.

+++ modules/system/system.install	2009-08-26 05:50:45 +0000
@@ -2457,7 +2457,70 @@ function system_update_7036() {
+ * DO NOT COMMIT THIS FUNCTION.

Then why it is in the patch and more important why it's in system_update?? If you want to test then use a testing module, hm?

Beer-o-mania starts in 3 days! Don't drink and patch.

Crell’s picture

Attached patch addresses chx's comments, with 2 caveats:

1) All of our other exceptions in core now use { } on the same line. If we want to change that later, this isn't the place to do it. I'm just being consistent.

2) Per brief discussion with webchick, I've split the patch into two pieces. One is the real patch, the other is some extra update hooks for testing purposes. Hopefully the testbot doesn't choke. :-) Only commit the real part.

Status: Needs review » Needs work

The last submitted patch failed testing.

lilou’s picture

Status: Needs work » Needs review
webchick’s picture

Hm. I get this when I try the patch:

Mysterious failed

That looks like a mistake?

moshe weitzman’s picture

Status: Needs review » Needs work

Looks like we are CNW based on webchick pic

Crell’s picture

Looks like the existing update framework is very very fragile. :-) We had to unset a value that didn't need to be unset before. Fixed now.

Crell’s picture

Status: Needs work » Needs review

Looks like the existing update framework is very very fragile. :-) We had to unset a value that didn't need to be unset before. Fixed now.

Crell’s picture

Assigned: Unassigned » Crell
FileSize
1.63 KB
5.39 KB

Narayan said to upload these again to remind the bot...

moshe weitzman’s picture

@Crell et al. - the update framework will be less brittle if we can get this in #557582: Reset statics after each update. It is a one line patch. Feel free to RTBC.

mgifford’s picture

FileSize
56.98 KB

The patches applied well enough. Ran the database though and there was still one error:

Update #9004
* Failed: This is an error message.

Didn't find any more information in watchdog.

yched’s picture

Hm, #abort is messy in current HEAD...

To be consistent with the current #abort conventions ("An update function can force the current and all later updates for this module to abort by returning a $ret array with an element like: $ret['#abort'] = array('success' => FALSE, 'query' => 'What went wrong');" in the PHPdoc for update_do_one() ),

+      $ret['results'] = array('success' => FALSE, 'query' => $e->getMessage());
+      $ret['#abort'] = TRUE;

should be

+      $ret['#abort'] = array('success' => FALSE, 'query' => $e->getMessage());

and

+    unset ($context['results'][$module][$number]['#abort']);

which was added in #43 is not needed.

*Then* the way #aborted updates are reported on the result page is broken (reports an error in the wrong update func, and does not show results for the updates that could run corectly before the #abort), but that's already the case in HEAD, and this patch does not make it better or worse.

After the points above are fixed, we should commit this so that "kill $ret in DBTNG" can happen, and cleanup update_do_one() and update_results_page() after existing updates have been moved to the new return format.

Crell’s picture

@ #47: Yes, 9004 should fail. It's there to confirm that failure handling works. If it didn't fail, that would be a problem. :-)

yched: I don't fully understand. You will need to explain it to me in person tomorrow sometime. :-)

mgifford’s picture

If it's intentional, then could you explain that in the message. Say

Success: The SQL failure test behaved properly.

That error message might not cause panic.

Crell’s picture

FileSize
5.36 KB

Addresses yched's comments in #47. He was over my shoulder as I wrote this version. :-)

@mgifford: Comment #38 says it's just a testing patch, and the inline docblocks say quite clearly what is supposed to happen. Please read the comments and the descriptions of the tests themselves.

webchick, back over to you.

yched’s picture

Status: Needs review » Reviewed & tested by the community

This patch is ready.
Followup work on
a) removing $ret from DBTNG,
b) moving existing update functions to the new convention and remove the BC layer from update_do_one()
can happen in separate tasks.

webchick’s picture

Status: Reviewed & tested by the community » Needs review

I'd really rather not *require* a string to be returned. We should only return a string if there's something worth talking about that the user needs to act on (for example, "The mail variable format has changed, and you will need to manually switch them to XXX.") Otherwise, users end up trying to find important messages in amongst this:

* The update completed successfully.
* The update was successful.
* The update succeeded.
* The update worked okay.
* The update didn't cause any major heart-aches.
* The update, as far as we know, did not cause the world to spin wildly off its axis and destroy the universe as we know it. But we could also be wrong, so hope you took a back up!
etc.

Yuck. And we still retain the DX problem, just pushing it from requiring an empty array to requiring an empty string if you don't want to print anything. So please allow me to not return anything from my function (default), in addition to a string if I want.

webchick’s picture

Status: Needs review » Needs work
Crell’s picture

Status: Needs work » Needs review
FileSize
1.75 KB
7.28 KB

After discussion with webchick and Bohan, this patch now hides an update's messages entirely if the update returns nothing. We also adjust the update results page to not say "these queries ran" anymore, since they didn't, but instead say "hey, the following messages were returned."

Updated test patch includes an update function that will have no display at all on the results page, to show that it all works. There's even a docblock update. :-)

Crell’s picture

FileSize
7.28 KB

Stupid patches...

Also I forgot to mention that it turns out the static cache clearing was breaking things in mysterious ways, so I took that out as well. Sad panda. :-(

yched’s picture

Status: Needs review » Reviewed & tested by the community
FileSize
7.29 KB

batch_get() uses drupal_static(), so the drupal_static_reset() call that's currently in update_do_one() can probably have strange results indeed...

Simple reroll to fix strange indentation in #56's update_results_page(), and back to RTBC.

Dries’s picture

Status: Reviewed & tested by the community » Fixed

I reviewed this patch in Paris, and discussed it with webchick and Crell. Changes look good so committed to CVS HEAD.

Crell’s picture

Woohoo!

Follow-up patch is here if anyone wants to have a go at it: #570900: Destroy remnants of update_sql()

yched’s picture

Thanks Dries.

Followup : #570910: Return meaningful messages from update hooks
I'll let Larry create the followup task about removing $ret from DBTNG (if the issue doesn't exist already ?) [edit: Gah, crosspost]

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

David_Rothstein’s picture

I think there are a number of issues that resulted from this change. Here are some of the ones that might be considered followup bug reports:

Any help following up there is appreciated :)

jhodgdon’s picture

Status: Closed (fixed) » Needs work
Issue tags: +Needs documentation updates

This was tagged as an API change. Was it one that woudl affect 6/7 module updaters, and if so, was it announced on the 6/7 update page? Temporarily tagging and changing status; if nothing needs to be reported there (or if it already was), please revert.

webchick’s picture

Priority: Critical » Normal

This is no longer a critical bug.

Crell’s picture

Priority: Normal » Critical
Status: Needs work » Closed (fixed)
Issue tags: -Favorite-of-Dries, -Needs architectural review, -DrupalWTF, -DIE, -API change, -Needs documentation updates

This was long-since documented: http://drupal.org/update/modules/6/7#update_sql