When a node has a title with long UTF-8 and/or possibly invalid characters in it, the module stops working completely: it silently hits the following error (no dblog notice or anything; on the contrary, dblog reports that the scheduler ran successfully) and consequently publishes nothing.

The call on line 770 of scheduler.module:
watchdog('scheduler', '@type: scheduled publishing of %title.', array('@type' => $n->type, '%title' => $n->title), WATCHDOG_NOTICE, l(t('view'), 'node/'. $n->nid));

The error is only visible when running the scheduler manually at http://domain.com/scheduler/cron
PDOException: SQLSTATE[HY000]: General error: 1366 Incorrect string value: '\xCE' for column 'link' at row 1: INSERT INTO {watchdog} (uid, type, message, variables, severity, link, location, referer, hostname, timestamp) VALUES (:db_insert_placeholder_0, :db_insert_placeholder_1, :db_insert_placeholder_2, :db_insert_placeholder_3, :db_insert_placeholder_4, :db_insert_placeholder_5, :db_insert_placeholder_6, :db_insert_placeholder_7, :db_insert_placeholder_8, :db_insert_placeholder_9); Array ( [:db_insert_placeholder_0] => 1 [:db_insert_placeholder_1] => scheduler [:db_insert_placeholder_2] => @type: scheduled publishing of %title. [:db_insert_placeholder_3] => a:2:{s:5:"@type";s:4:"cars";s:6:"%title";s:98:"Mercedes-Benz S 500 5500cc Πράσινο σκούρο μεταχειρισμένο 32000 ευρώ";} [:db_insert_placeholder_4] => 5 [:db_insert_placeholder_5] => <a href="/vehicle/234324/mercedes-benz-s-500-5500cc-%CF%80%CF%81%CE%AC%CF%83%CE%B9%CE%BD%CE%BF-%CF%83%CE%BA%CE%BF%CF%8D%CF%81%CE%BF-%CE%BC%CE%B5%CF%84%CE%B1%CF%87%CE%B5%CE%B9%CF%81%CE%B9%CF%83%CE%BC%CE%AD%CE%BD%CE%BF-32000-%CE%B5%CF%85%CF%81%CF%8E">προβοΠ[:db_insert_placeholder_6] => http://domain.com/scheduler/cron [:db_insert_placeholder_7] => http://domain.com/admin/reports/event/282565 [:db_insert_placeholder_8] => 178.128.97.192 [:db_insert_placeholder_9] => 1370731434 ) on dblog_watchdog() (line 160 του /modules/dblog/dblog.module).

As a temporary solution I use the unaliased path for the l(t('view'), 'node/'. $n->nid) part of the watchdog call.

By the way, scheduler is one of the most useful modules out there. Thank you.

Comments

jonathan1055’s picture

Hi silios,
Thanks for reporting your problem, and I'm glad you find the scheduler module useful.

I have some questions, please:

  1. In your first paragraph you say 'When a node has title with long utf8 and/or possibly invalid characters'. Do you actually mean the title? It looks like the link is the variable which caused the SQL error.
  2. [:db_insert_placeholder_5] holds the value of the link, and in your example above it is of the format <a href="chars">text, i.e. there is no closing </a> tag. Do you know if this is just a quirk of how your error is displayed above, or is this actually the cause of the problem?
  3. "As a temporary solution i use the unaliased path for the l(t('view'), 'node/'. $n->nid) part of the watchdog call." I do not quite understand what you mean here. Do you mean that you removed the l( ) function from the watchdog call and inserted a raw link instead?

I need your help in answering these so that we can replicate the error. Then we can work on a solution, either to avoid the error or at least make scheduler detect that a problem has occurred and report it.

Jonathan

silios’s picture

Hi Jonathan,

1+2) The link variable is indeed causing the problem. I suspect that there is a twofold problem on the data side: the link column in the database is varchar(255) and <a href="/vehicle/234324/mercedes-benz-s-500-5500cc-%CF%80%CF%81%CE%AC%CF%83%CE%B9%CE%BD%CE%BF-%CF%83%CE%BA%CE%BF%CF%8D%CF%81%CE%BF-%CE%BC%CE%B5%CF%84%CE%B1%CF%87%CE%B5%CE%B9%CF%81%CE%B9%CF%83%CE%BC%CE%AD%CE%BD%CE%BF-32000-%CE%B5%CF%85%CF%81%CF%8E">προβοΠ is already at 260 chars.
I have seen this problem in the past even with core modules such as aggregator, when it couldn't store long links in the appropriate field.

And it just so happens that the General error: 1366 Incorrect string value: '\xCE' for column 'link' kicks in first, because of characters in the title not being UTF-8.

My reasoning for "blaming" the title in our case is that the l() function returns the alias for the href attribute of the link we create in the last part of the watchdog call. That alias contains the title, and so it causes the watchdog call to crash without warning.
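One way the two observations above (a 255-character column and a stray '\xCE' byte) could be connected: if the link string is cut at 255 bytes rather than 255 characters, the cut can land inside a two-byte Greek letter. The sketch below only illustrates that assumption in plain PHP. It is not the dblog code, and the padded href and the variable names are made up for the example.

```php
<?php
// Hypothetical link: an ASCII-only href padded so that the opening tag
// is exactly 244 bytes, followed by the Greek link text from the report.
$open = '<a href="/' . str_repeat('x', 232) . '">';  // 10 + 232 + 2 = 244 bytes
$link = $open . 'προβολή' . '</a>';                  // 244 + 14 + 4 = 262 bytes

// Assumption: the value is shortened by byte count, as substr() does.
$cut = substr($link, 0, 255);

var_dump(strlen($link));              // int(262)
var_dump(bin2hex(substr($cut, -1)));  // string(2) "ce": first byte of "λ", second byte lost
var_dump(preg_match('//u', $cut));    // bool(false): $cut is no longer valid UTF-8
```

If this is what happens, it would also explain why a short /node/nid link avoids the error: the whole link then fits within 255 bytes and nothing gets cut.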

3) The exact call I used to temporarily overcome the error is:

watchdog('scheduler', '@type: scheduled publishing of %title.', array('@type' => $n->type, '%title' => $n->title), WATCHDOG_NOTICE, '<a href="/node/'.$n->nid .'">' . t('view') . '</a>');

So I am just saving the nid in the href instead of the aliased path (I always have clean URLs enabled, hence the formatting). I don't feel good about it (hackish?) but it gets the job done for now, since I'm relying heavily on scheduler.

I think malformed data in the title shouldn't be there in the first place, but since this is usually out of our control, it would be nice for scheduler to be able to overcome the error and continue to work while reporting the error, as you suggested.

jonathan1055’s picture

Thanks for the further explanations. It would seem that the core watchdog() function should detect and avoid some of these errors. But getting that changed would take a huge amount of time and effort.

Looking at https://api.drupal.org/api/drupal/includes%21common.inc/function/l/7 and https://api.drupal.org/api/drupal/includes%21common.inc/function/url/7 a simple fix that might do the trick would be to use an $options array containing ('alias'=>true) when calling the l() function. This would mean that our node/$n->nid would be treated as if it were already an alias and the further conversion would not be done. That, in effect, is what you have hardcoded in your example above.

Given that you have nodes which cause this failure, would you like to test this?

watchdog('scheduler', '@type: scheduled publishing of %title.', array('@type' => $n->type, '%title' => $n->title), WATCHDOG_NOTICE, l(t('view'), 'node/'. $n->nid, array('alias'=>TRUE)));
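To make the effect of the 'alias' option concrete, here is a toy model in plain PHP. It is a sketch only: sketch_url() and sketch_lookup_alias() are hypothetical stand-ins written for this example, not Drupal's url() or its alias lookup, and the alias table is invented.

```php
<?php
// Hypothetical alias table: maps an internal path to its (long) alias.
function sketch_lookup_alias($path) {
  $aliases = array('node/42' => 'vehicle/42/a-very-long-greek-title');
  return isset($aliases[$path]) ? $aliases[$path] : $path;
}

// Toy model of the 'alias' option: when it is TRUE the path is treated
// as already final and the alias lookup is skipped.
function sketch_url($path, array $options = array()) {
  $options += array('alias' => FALSE);
  if (!$options['alias']) {
    $path = sketch_lookup_alias($path);
  }
  return '/' . $path;
}

echo sketch_url('node/42'), "\n";                          // /vehicle/42/a-very-long-greek-title
echo sketch_url('node/42', array('alias' => TRUE)), "\n";  // /node/42
```

With the option set, the link keeps the short node/nid form regardless of how long the alias is.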

Thanks.

silios’s picture

Tested it and it works fine: all nodes (including ones with ridiculously long titles in Greek) get saved and the reporting doesn't crash.

It should indeed be handled by the core watchdog() function, as this is a show stopper for sites in languages such as Russian and Greek, but in the meantime I'm happy with protecting scheduler from the watchdog error.

Thanks.

jonathan1055’s picture

Category: bug » task
Status: Active » Needs review
Attachment: new patch, 1.44 KB

Thanks for testing. Here is a patch which makes the change in both the watchdog messages (publish and unpublish). I have also tested this and the link does remain at /node/nid instead of being converted to the url alias. I do not think there is any real loss of functionality for scheduler in using this shortened url, so I am happy that we should consider this to avoid the core watchdog fault identified above.

jonathan1055’s picture

Title: PDOexception because of watchdog call » Avoid PDOexception with shorter watchdog link
Attachment: new patch, 1.45 KB

Reroll after latest dev 1.1+10

silios’s picture

Status: Needs review » Reviewed & tested by the community

Thanks again!

silios’s picture

Thank you, it works fine.

jonathan1055’s picture

That's good. This will get committed and then be in the 7.x-1.2 release in due course.

rickmanelius’s picture

Status: Reviewed & tested by the community » Fixed

Hi Everyone. This was committed!
http://drupalcode.org/project/scheduler.git/commitdiff/6b9cac3

One nitpick on the patches. The first line in #6 is as follows:

--- scheduler.module.~1~	2013-07-27 18:42:28.000000000 +0100

The "~1~" was causing a little trouble.

jonathan1055’s picture

Ha! Sorry, that .~1~ was my fault. I had been using diff to create the patches, and the 'old' version of the file has the .~1~ extension. I usually remember to edit this out of the patch, but forgot this time.

However, I have now downloaded git so I will be able to make patches in the a/ b/ format. Is that preferable to you?

By the way, thanks for all the recent commits. I won't add a 'thank you' on each one, but seeing I had to respond to this one anyway, I thought I would say it is appreciated.

Jonathan

rickmanelius’s picture

Hi jonathan1055. Well the praise is well deserved because you've been doing a great job!

With respect to your question on git and the a/ b/ formatting, yes that is preferable. I typically just go to the root folder of the module and run "git diff > out.patch" and then rename it to something appropriate to d.o.

-Rick

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.