Not sure this is anything that can be handled on the Drupal end of things, but I just received a translation back which had all XML tags stripped from the document. This means that the plugin can no longer assign the translated text pieces to the different fields (and the HTML formatting is gone as well). The translation job simply stays in "In Progress".

Watchdog shows the following entries:

  • Warning: simplexml_load_string(): Entity: line 1: parser error : Start tag expected, '<' not found in TMGMTOhtPluginController->parseTranslationData() (line 287 of [...]/sites/all/modules/tmgmt_oht/tmgmt_oht.plugin.inc).
    
  • Warning: simplexml_load_string(): Clarification of Duties: in TMGMTOhtPluginController->parseTranslationData() (line 287 of [...]/sites/all/modules/tmgmt_oht/tmgmt_oht.plugin.inc).
    
  • Warning: simplexml_load_string(): ^ in TMGMTOhtPluginController->parseTranslationData() (line 287 of [...]/sites/all/modules/tmgmt_oht/tmgmt_oht.plugin.inc).
    
  • Notice: Trying to get property of non-object in TMGMTOhtPluginController->parseTranslationData() (line 290 of [...]/sites/all/modules/tmgmt_oht/tmgmt_oht.plugin.inc).
    
  • Warning: Invalid argument supplied for foreach() in TMGMTOhtPluginController->parseTranslationData() (line 290 of [...]/sites/all/modules/tmgmt_oht/tmgmt_oht.plugin.inc).
    

The user interface doesn't show/reflect any of these problems. If this cannot be fixed, it would be good to at least inform the user that things have gone awry ;-)
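To illustrate what "inform the user" could look like, here is a minimal sketch in Python rather than the module's PHP. It assumes the response is expected to be an `<items>`/`<item>`/`<text>` document with a `key` attribute per item; that layout and all function names are hypothetical, not taken from tmgmt_oht. The idea is simply to check the parse result before iterating over it, and to turn a failure into a visible message.

```python
import xml.etree.ElementTree as ET


class TranslationParseError(Exception):
    """Raised when returned translation data cannot be mapped back to fields."""


def parse_translation_data(data: str) -> dict:
    """Parse returned data and fail loudly instead of looping over a failed parse.

    The <items>/<item>/<text> layout and the "key" attribute are assumptions
    made for illustration, not the module's confirmed format.
    """
    try:
        root = ET.fromstring(data)
    except ET.ParseError as exc:
        raise TranslationParseError(f"Response is not well-formed XML: {exc}") from exc

    if root.tag != "items":
        raise TranslationParseError(f"Unexpected root element <{root.tag}>")

    result = {}
    for item in root.findall("item"):
        key = item.get("key")
        text = item.find("text")
        if key is None or text is None:
            raise TranslationParseError("An <item> is missing its key or <text>")
        result[key] = text.text or ""
    if not result:
        raise TranslationParseError("No <item> elements found")
    return result


def handle_response(data: str) -> str:
    """Return a user-facing message rather than leaving the job silently in progress."""
    try:
        items = parse_translation_data(data)
    except TranslationParseError as exc:
        return f"Translation could not be imported: {exc}"
    return f"Imported {len(items)} item(s)."
```

With a response that has lost its tags, `handle_response("Clarification of Duties: ...")` returns an error message that could be shown on the job page instead of only ending up in the log.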

Attachments:
#3 oht-structure-lost-bug.png (110.11 KB), by bforchhammer

Comments

bforchhammer’s picture

It looks like tmgmt_oht currently submits data in the "text/plain" content type instead of using "text/xml". Could this be the problem?

The API docs for "new project" say that you can provide a "content_type" parameter, which defaults to "text/plain".

miro_dietiker’s picture

Assigned: Unassigned » blueminds

bforchhammer, can you provide us with a snippet of the submitted job and of the response?

We will need to check with OHT how they can guarantee that the structure we rely on is preserved.
In addition, we will need to make sure the error handling is right.

bforchhammer’s picture

Attachment: oht-structure-lost-bug.png (110.11 KB)

> bforchhammer, can you provide us with a snippet of the submitted job and of the response?

Here's a screenshot of what the respective OHT project page looks like:

Hope this helps :-)

miro_dietiker’s picture

Priority: Normal » Critical

Haha, very ugly. That can't be true.
We will check what we can do.

miro_dietiker’s picture

OHT is currently improving their backend implementation.
I will update you as soon as their system is ready to protect the structure of documents and even HTML.

Berdir’s picture

This is not about protecting user-provided HTML.

The items/item/text tags are used by the module itself to identify single data items in a job. They must not be touched or everything falls apart.
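A small sketch of why that is, in Python rather than the module's PHP. The exact wire format is an assumption here (an `<items>` root, one `<item key="...">` per field, a `<text>` child), and the function names are made up for illustration. It builds such a document, reads it back, and then simulates a translator stripping the tags.

```python
import re
import xml.etree.ElementTree as ET


def build_job_xml(fields: dict) -> str:
    """Wrap each field in <item key="..."><text>...</text></item> (assumed layout)."""
    root = ET.Element("items")
    for key, value in fields.items():
        item = ET.SubElement(root, "item", {"key": key})
        ET.SubElement(item, "text").text = value
    return ET.tostring(root, encoding="unicode")


def read_job_xml(xml: str) -> dict:
    """Map each item's key back to its text; only works while the tags survive."""
    root = ET.fromstring(xml)
    return {item.get("key"): item.findtext("text", default="")
            for item in root.findall("item")}


def strip_tags(xml: str) -> str:
    """Simulate what happened here: all markup removed, only text left."""
    return re.sub(r"<[^>]+>", "", xml)


fields = {"title": "Clarification of Duties", "body": "Some body text."}
xml = build_job_xml(fields)

# Round trip works as long as the structure is intact.
restored = read_job_xml(xml)

# After stripping, the field boundaries are gone and the text runs together.
flattened = strip_tags(xml)
```

Once the tags are gone, `flattened` is a single run of text with nothing to say where "title" ends and "body" begins, so there is no reliable way to recover the mapping on the Drupal side.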

I'm not sure if this is caused by a change on their side, or maybe they used a tool to translate it which did not support this correctly, or whether it is some other one-time thing.

@bforchhammer, is this reproducible currently or did it happen just once? The job id is in the screenshot, so we should be able to ask the OHT guys if there's something special about how this job was processed if it's not reproducible.

miro_dietiker’s picture

Both are a problem.

First, OHT needs to cleanly identify the job items and protect them. A translator should never see the item/data boundaries.
Without this, TMGMT will not work.

Additionally, even once that is solved, the translator still gets the item payload and can destroy the HTML inside it, so we risk losing formatting.
Without protection here, the quality of an HTML translation is likely to be unsatisfactory (it needs HTML reformatting...).
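For the second problem, one possible safeguard on the receiving side is to compare the HTML tags in the source with those in the translation and flag anything that went missing. The following is only a sketch in Python with hypothetical names; it is not something TMGMT or OHT is known to do, and it counts tags rather than checking their order or nesting.

```python
from collections import Counter
from html.parser import HTMLParser


class _TagCounter(HTMLParser):
    """Counts opening tags seen while parsing an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.tags = Counter()

    def handle_starttag(self, tag, attrs):
        self.tags[tag] += 1


def count_tags(html: str) -> Counter:
    parser = _TagCounter()
    parser.feed(html)
    parser.close()
    return parser.tags


def missing_tags(source_html: str, translated_html: str) -> Counter:
    """Tags present in the source but absent (or fewer) in the translation."""
    # Counter subtraction keeps only positive counts.
    return count_tags(source_html) - count_tags(translated_html)


source = "<p>Hello <strong>world</strong></p>"
lost = missing_tags(source, "Hallo Welt")
kept = missing_tags(source, "<p>Hallo <strong>Welt</strong></p>")
```

A non-empty result such as `lost` could be used to mark the item as needing review instead of accepting it silently.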

Both issues have been reported to OHT and we will review the new solution and provide you feedback with all that.

bforchhammer’s picture

> The items/item/text tags are used by the module itself to identify single data items in a job. They must not be touched or everything falls apart.

Yes, that's the problem precisely :-)

> @bforchhammer, is this reproducible currently or did it happen just once? The job id is in the screenshot, so we should be able to ask the OHT guys if there's something special about how this job was processed if it's not reproducible.

Well, this only happened because a human translator stripped the tags out; a different translator left all tags intact, and I only tested with two. So I don't think there's a way to reproduce this reliably.

My initial thought was that we submit data in a wrong format, but from what @miro_dietiker says it sounds like it's a problem with the way OHT presents translatable data to their translators.

miro_dietiker’s picture

We'll have a call soon and check for all details and possibilities.

Matroschker’s picture

It seems that the problem still isn't solved, is that right?
I recently sent a job to OHT, so sending a job via Drupal/TMGMT is not the problem. I received the translated job as a Word file via email. The job status didn't change and I can't see any translated text on the right side of the review page.
I contacted OHT support and they sent the job back as an XML file (I received it via email), but nothing changed in the job.

I use the new beta version of TMGMT and the latest dev release (7.x-1.x-dev from 30 Oct. 2013) of the OHT plugin.

Matroschker

miro_dietiker’s picture

OHT promised to fix this in their UI / workflow. We can't do anything until then. We are awaiting feedback from them but haven't heard back about this request.

Lionsharz’s picture

Still having this issue with OHT. I recommend not using them until this is seen to. They stated that the Drupal module was created by a third party and refused to give support for weeks. I still have not found a solution. Many hundreds of pages are more or less trapped in the OHT system :S

Berdir’s picture

Status: Active » Closed (duplicate)

Sorry about the silence here. We're picking this up and improving it.

I think that #1971672: Switch to XLIFF format will fix this. We will then send an XLIFF document (same structure as the tmgmt_file export).

We didn't do much testing yet, but we confirmed that the word count is correct when sending an XLIFF as .xliff, so I think they will enforce the structure then.
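For anyone unfamiliar with the format, here is a rough Python sketch of what a minimal XLIFF 1.2 document with one `trans-unit` per data item looks like, and how translated targets could be read back by id. The helper names, the `original` value, and the choice of unit ids are assumptions for illustration; the actual tmgmt_file export and the patch in #1971672 may differ in detail.

```python
import xml.etree.ElementTree as ET

XLIFF_NS = "urn:oasis:names:tc:xliff:document:1.2"


def build_xliff(fields: dict, source_lang: str, target_lang: str,
                original: str = "job") -> str:
    """Build a minimal XLIFF 1.2 document with one <trans-unit> per field."""
    ET.register_namespace("", XLIFF_NS)
    root = ET.Element(f"{{{XLIFF_NS}}}xliff", {"version": "1.2"})
    file_el = ET.SubElement(root, f"{{{XLIFF_NS}}}file", {
        "original": original,  # hypothetical identifier for the job
        "source-language": source_lang,
        "target-language": target_lang,
        "datatype": "plaintext",
    })
    body = ET.SubElement(file_el, f"{{{XLIFF_NS}}}body")
    for unit_id, text in fields.items():
        unit = ET.SubElement(body, f"{{{XLIFF_NS}}}trans-unit", {"id": unit_id})
        ET.SubElement(unit, f"{{{XLIFF_NS}}}source").text = text
    return ET.tostring(root, encoding="unicode")


def read_targets(xliff: str) -> dict:
    """Return {unit id: target text}, with None where no <target> exists yet."""
    ns = {"x": XLIFF_NS}
    root = ET.fromstring(xliff)
    targets = {}
    for unit in root.findall(".//x:trans-unit", ns):
        target = unit.find("x:target", ns)
        targets[unit.get("id")] = target.text if target is not None else None
    return targets


doc = build_xliff({"title": "Clarification of Duties"}, "en", "de")
```

Because each unit carries its own id and the translation goes into a separate `<target>` element, a tool that understands XLIFF can show translators only the text while keeping the structure intact.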

If anyone here is still using OHT, I'd love to get feedback on the patch there for real world usage. Note that you need the latest tmgmt dev snapshot and that the patch there isn't complete yet at the time of writing this.