At the moment no filter is applied to body text, so it loses things like line breaks when it comes back translated from Google. We need to apply the same filter the source node uses to the text returning from Google, so it is formatted using the same rules.

Comments

greg.harvey

Status: Active » Needs work

Ok, I had a look at this. Problem is twofold:

1. Google strips line breaks. We can get around this by converting the string to HTML before we send it - this is around line 160:

      else {
        try {
          $gt = new Gtranslate;
          // Send HTML to Google so formatting is not lost. Run the body
          // through the source node's own input format.
          $body_markup = check_markup($node->body, $node->format, FALSE);
          $tbody = $gt->$func($body_markup);
          $new_node->body = $tbody;
        }
        catch (GTranslateException $ge) {
          // If there was a problem, pass the error to watchdog. The type and
          // message go in untranslated; watchdog() translates them on display.
          watchdog('auto translate', 'From GTranslate: @error', array('@error' => $ge->getMessage()), WATCHDOG_ERROR);
          // Fall back to the original node body.
          $new_node->body = $node->body;
        }

Now the $tbody variable has properly formed *mark-up* in the correct language coming back (so line breaks are there, but in the form of HTML <p> tags). But:

2. node.module then filters out the mark-up from Google again on node_save() and we're back to square one, no line breaks in translations. I've written this, which is a nasty but effective workaround for now:

      else {
        try {
          $gt = new Gtranslate;
          // Send HTML to Google so formatting is not lost. Run the body
          // through the source node's own input format.
          $body_markup = check_markup($node->body, $node->format, FALSE);
          $tbody = $gt->$func($body_markup);
          // Convert paragraph and line-break tags back to plain newlines,
          // because the mark-up does not survive node_save().
          $tbody = str_replace('<p>', '', $tbody);
          $tbody = str_replace('</p>', "\n\n", $tbody);
          $tbody = str_replace('<br>', "\n", $tbody);
          $tbody = str_replace('<br/>', "\n", $tbody);
          $tbody = str_replace('<br />', "\n", $tbody);
          $new_node->body = $tbody;
        }
        catch (GTranslateException $ge) {
          // If there was a problem, pass the error to watchdog. The type and
          // message go in untranslated; watchdog() translates them on display.
          watchdog('auto translate', 'From GTranslate: @error', array('@error' => $ge->getMessage()), WATCHDOG_ERROR);
          // Fall back to the original node body.
          $new_node->body = $node->body;
        }

But I hope someone who's a better coder than I can come up with a better way of doing this. The way I see it, we can either:

1. Make it so node_save() does not strip the mark-up from Google (I think this is safest).
2. Make a smarter way of converting the line breaks back than those ugly str_replace repeats.
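
For option 2, here is a rough sketch of one way it might look. The function name auto_translate_markup_to_newlines() is made up for illustration and is not part of the module or of Drupal. It assumes the only tags that need converting are <p> and <br>, and that Google may change their case or spacing. It also swallows whitespace straight after a converted tag, so a <br /> followed by a newline does not turn into a paragraph break.

```php
<?php
/**
 * Hypothetical helper (not part of the module): turn the paragraph and
 * line-break tags in translated mark-up back into plain newlines.
 */
function auto_translate_markup_to_newlines($markup) {
  // <br>, <br/>, <br /> and <BR> (plus trailing whitespace) become one newline.
  $text = preg_replace('#<br\b[^>]*>\s*#i', "\n", $markup);
  // A closing </p> (plus trailing whitespace) becomes a blank line.
  $text = preg_replace('#</p\s*>\s*#i', "\n\n", $text);
  // Opening <p> tags, with or without attributes, are dropped.
  // The \b keeps tags such as <pre> from matching.
  $text = preg_replace('#<p\b[^>]*>#i', '', $text);
  // Collapse runs of three or more newlines, then trim both ends.
  $text = preg_replace('/\n{3,}/', "\n\n", $text);
  return trim($text);
}
```

If something like this were used, the five str_replace() lines could shrink to a single $tbody = auto_translate_markup_to_newlines($tbody); call. Any other mark-up in the body is left untouched, so it would still be subject to whatever happens on node_save().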

greg.harvey

Project: i18n Auto Draft » i18n auto translate
Version: 6.x-1.x-dev » 6.x-2.0
Assigned: greg.harvey » Unassigned

Moving to new issue queue.

cyberwolf

Subscribing.

greg.harvey

Category: bug » feature
Status: Needs work » Active

Marking this as a feature request. The initial bug report has been dealt with. If someone wants to tidy this up, feel free. Otherwise it will stay how it is.

greg.harvey

Status: Active » Closed (fixed)

About to commit a fix for this today.