Comments

Berdir’s picture

This used to work just fine, we tested this quite a bit when initially developing this.

miro_dietiker’s picture

I have requested an example of what doesn't work from the original reporter

miro_dietiker’s picture

"Is there a way that TMGMT could take care of it by..."
That's what masking is all about.
However, the external service should support masking natively, too.

... and all that will require quite some work. :-)

Berdir’s picture

Instead of trying to do anything ourselves, we should probably send it as content type text/html. Then we just need to check whether we have (valid) HTML and switch the content type accordingly: http://msdn.microsoft.com/en-us/library/ff512421.aspx
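A minimal sketch of that check, for illustration only. The function name tmgmt_example_content_type() is hypothetical and not part of the module, and comparing against strip_tags() only detects the presence of markup; it does not validate the HTML.

```php
<?php

/**
 * Hypothetical helper: chooses the content type for a translation request.
 *
 * Assumption: "contains markup" is approximated by checking whether
 * strip_tags() changes the string. This is not HTML validation.
 */
function tmgmt_example_content_type(string $text): string {
  // If stripping tags alters the text, it contained something tag-like.
  return strip_tags($text) !== $text ? 'text/html' : 'text/plain';
}
```

With this, '<p>Hello</p>' is sent as text/html and 'Hello world' as text/plain.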

I haven't found a way to exclude something from translation, which would still be useful for e.g. locale placeholders.

Berdir’s picture

Ah, also spans with notranslate class, similar (equal?) to google translate: http://social.msdn.microsoft.com/Forums/en-US/41f09c5d-68ae-4d26-ad93-a2...

jantoine’s picture

Status: Active » Needs review
FileSize
1.81 KB

The attached patch changes the content type to text/html as suggested in #5. It does not, however, validate the HTML. This works fine when translating text without HTML, so it shouldn't affect plain text translations.

It also implements escapeStart and escapeEnd for escaping user defined strings.

This patch is working for me, although I had to extend the TMGMTEntitySourcePluginController class in order to define custom strings to be escaped. Would be great if this could be handled via the UI.

Anybody’s picture

The patch works great for me also. If we can get 1-2 more reviews, we can perhaps set it RTBC and get this into the next dev release?

akalam’s picture

Works perfectly. Great patch!

Thanks jantoine

Anybody’s picture

Status: Needs review » Reviewed & tested by the community

gge’s picture

I just tested this patch and it has been working great so far, but I found two minor things that could be improved.

1. I'm using the CKEditor module and added "{ name : 'Do not translate' , element : 'span', attributes: { 'class': 'notranslate' } }" to ckeditor.styles.js. I can select some text and easily wrap it in a span with class "notranslate" by choosing "Do not translate" from the Select dropdown. Everything is perfect except that the space after the closing </span> is missing from the translation.
Original text:
<span class="notranslate">Do not translate this in</span> German
The translated text:
Do not translate this inDeutsch
It should be:
Do not translate this in Deutsch

2. How can "notranslate" be used for the title field?

Thank you!

jacktonkin’s picture

Version: 7.x-1.x-dev » 8.x-1.x-dev
Status: Reviewed & tested by the community » Needs review
FileSize
1.78 KB

Reviving this issue because I'm having similar issues with 8.x-1.0-beta1, with <a> tags occasionally being translated as < un > for English -> Spanish translations.

It was trivial to port the patch from #7, and it appears to work with minimal testing so far. I've additionally removed the Content-Type header as I think it's meaningless for GET requests.

heddn’s picture

Is this still working on your site after 4 years? Or did MS fix things on their side so this patch is no longer needed?

jacktonkin’s picture

I'm still applying a version of this patch and translations work with it applied. The patch above is against an old version of the API. I have a newer patch I'll re-roll against HEAD.

I haven't tested without this since I updated to the V3 API, but looking at the documentation it seems clear to me that any markup should be submitted to the service with 'textType' => 'html'.

https://docs.microsoft.com/en-gb/azure/cognitive-services/translator/ref...
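For illustration, a rough sketch of what such a V3 request could look like. The helper name tmgmt_example_build_request() is hypothetical and this is not the code from the patch; it only assembles the URL and JSON body and leaves out authentication headers and the actual HTTP call.

```php
<?php

/**
 * Hypothetical helper: builds the URL and JSON body for a Translator V3
 * translate call that submits the text as HTML.
 *
 * Assumption: the global endpoint is used; authentication is omitted.
 */
function tmgmt_example_build_request(string $text, string $from, string $to): array {
  $query = http_build_query([
    'api-version' => '3.0',
    'from' => $from,
    'to' => $to,
    // Tells the service to treat the input as markup, not plain text.
    'textType' => 'html',
  ]);
  return [
    'url' => 'https://api.cognitive.microsofttranslator.com/translate?' . $query,
    // The V3 API expects a JSON array of objects with a "Text" property.
    'body' => json_encode([['Text' => $text]]),
  ];
}
```

Calling tmgmt_example_build_request('<p>Hello</p>', 'en', 'es') gives a URL containing textType=html and a one-element JSON array as the body.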

Also, thank you so much for taking the time to prepare a Drupal 9 compatible release of this module!

jacktonkin’s picture

Updated patch.

heddn’s picture

+++ b/src/Plugin/tmgmt/Translator/MicrosoftTranslator.php
@@ -61,6 +61,16 @@ class MicrosoftTranslator extends TranslatorPluginBase implements ContainerFacto
+  protected $escapeStart = '<span class="notranslate">';
...
+  protected $escapeEnd = '</span>';

I don't see where this is used. Or does it come into play from the parent class?

I am really going to have to depend on you to tell me if this is a change that risks breaking anything. I only lightly use this module and don't have sufficient time to thoroughly grok if this is a risky change.

jacktonkin’s picture

Yes, parent::escapeText() and parent::unescapeText() wrap substrings that shouldn't be translated with those <span> tags.

This is so locale placeholders (e.g. @title in t("Created new content @title", ['@title' => $node->label()]);) don't get translated. See #5 above. I believe that escaping like this is only available when translating HTML on Microsoft's service.
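To make the idea concrete, here is a standalone sketch of that wrap and unwrap round trip. It is not the parent class implementation: the function names, the constants and the simplified placeholder pattern (only @ and % prefixes) are assumptions for illustration.

```php
<?php

// Same values as $escapeStart / $escapeEnd in the patch.
const EXAMPLE_ESCAPE_START = '<span class="notranslate">';
const EXAMPLE_ESCAPE_END = '</span>';

/**
 * Hypothetical stand-in for escapeText(): wraps @foo and %foo placeholders.
 */
function example_escape_placeholders(string $text): string {
  // Simplified pattern; it would also match things like "@example" in an
  // e-mail address, which a real implementation has to handle.
  return preg_replace(
    '/[@%][a-zA-Z_][a-zA-Z0-9_-]*/',
    EXAMPLE_ESCAPE_START . '$0' . EXAMPLE_ESCAPE_END,
    $text
  );
}

/**
 * Hypothetical stand-in for unescapeText(): removes the wrapper spans.
 */
function example_unescape_placeholders(string $text): string {
  // Note: this also unwraps notranslate spans that an editor added by hand.
  $pattern = '/' . preg_quote(EXAMPLE_ESCAPE_START, '/') . '(.*?)'
    . preg_quote(EXAMPLE_ESCAPE_END, '/') . '/s';
  return preg_replace($pattern, '$1', $text);
}
```

The escaped string is what gets sent to the service as HTML; the unescape step runs on the translated result so the placeholder comes back unchanged.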

All of the text we send for translation using this service has been entered in CKEditor and is wrapped in paragraph tags at least, so I don't know for sure that sending text that isn't wrapped in HTML tags will cause a problem. I did see problems in the past with HTML tags being translated, which is why I took up this issue. Also, I'd have thought that most of the translation sources in Drupal are markup rather than plain text (in the sense that they are safe to include directly in the rendered page without further escaping) so it's hard to see how this can make things worse?

  • heddn committed d7ebeed on 8.x-1.x authored by jacktonkin
    Issue #2042873 by jacktonkin, jantoine, pmu: Preserve HTML
    
heddn’s picture

Status: Needs review » Fixed

Thanks for explaining your logic. Committed it.

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.