Hello,
Submitting any node with more than 1200 characters results in this error:
Translation has been rejected with following error: Unable to connect to Google Translate service due to following error: Request-URI Too Large at https://www.googleapis.com/language/translate/v2/? [... very long url here... ]
The problem should be solved by sending the data via POST.
Useful references:
https://groups.google.com/forum/?fromgroups=#!topic/google-ajax-search-a...
https://developers.google.com/translate/v2/using_rest
Comment | File | Size | Author |
---|---|---|---|
#6 | tmgmt_google-length-limit.patch | 7.78 KB | mikel1 |
#3 | tmgmt_google-http_post-1799502-3.patch | 1.91 KB | zhuber |
#2 | 0001-POSTinsteadOfGET-1799502.patch.patch | 2.17 KB | micwille |
Comments
Comment #1
Sifro
OK, I've tried, but I'm out of ideas. The solution must be simple and close at hand, but I couldn't find it.
I first tried using drupal_http_request, but I get this error:
"Bad Request" only? It seems like one of Microsoft's highly descriptive errors!
This is the code I used:
And this is the $options array:
So I changed approach. I found an example on the internet that uses cURL and decided to try that way.
I could successfully send queries to Google using cURL: Google returned the right translations.
But for some reason the integration with the tmgmt system doesn't work. I see "translation in progress" rather than "ready for review", without any errors.
This is the code I used:
The code above gives me this:
If I remove the die() instruction, run the script and then go to the page node/MY-NODE-ID/translate, I can see that the status is "not translated" and the pending translation is "in progress", while, as far as I understand, it should be in a "needs review" status.
If I go to the page admin/config/regional/tmgmt/jobs/MY-JOB-ID I can see this under progress: 0/2/7
But I don't really understand what it means.
Translating through file export/import works fine.
I'm completely stuck; I hope someone can help me. Please let me know if there's anything else I can add to the discussion.
Comment #2
micwille
I attached a patch that changes the request from GET to POST.
The data is sent in x-www-form-urlencoded format, and the existing options passed to the doRequest function are overwritten with drupal_http_request options that enable a POST request.
The limit is now effectively 5000 characters, as set by the API itself:
https://developers.google.com/translate/v2/faq
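For readers who want to see the shape of such a request, here is a minimal sketch in Python (for illustration only; the module and the attached patch are PHP). It builds, but does not send, a form-encoded POST. The endpoint is the one from the error message above. The parameter names `key`, `source`, `target` and `q` and the `X-HTTP-Method-Override: GET` header are assumptions taken from the v2 REST documentation linked in this issue, and `build_translate_request` is a hypothetical helper name, not something from the patch.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Endpoint as it appears in the error message quoted in this issue.
API_URL = "https://www.googleapis.com/language/translate/v2"


def build_translate_request(text, source, target, api_key):
    """Build (but do not send) a form-encoded POST request.

    The text travels in the request body instead of the URL, so the
    URL itself stays short no matter how long the text is.
    """
    body = urlencode(
        {"key": api_key, "source": source, "target": target, "q": text}
    ).encode("utf-8")
    headers = {
        "Content-Type": "application/x-www-form-urlencoded",
        # Assumption: the v2 API expects this override on POST requests.
        "X-HTTP-Method-Override": "GET",
    }
    return Request(API_URL, data=body, headers=headers, method="POST")


request = build_translate_request("Hello world. " * 300, "en", "de", "MY-API-KEY")
print(request.get_method(), len(request.full_url), len(request.data))
```

The printed URL length stays constant while the body grows with the text, which is the property that avoids the "Request-URI Too Large" error.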
Comment #3
zhuber
I had trouble applying the last patch, although the changes seem to have worked for me.
I recreated the changes, cleaned up some of the syntax formatting, and then recreated the patch. I would also like to patch the tmgmt module to show the total character count in addition to the total word count. The word count can be misleading, since the Google Translate API has a maximum character limit rather than a word limit.
Comment #4
Berdir
Testbot bump; this might need changes in the tests.
Comment #6
mikel1
I had spotty luck getting POST to work, and it only increases the length limit from 1400 to 5000 characters (which is still too small for many pages I want to translate). Attached is a patch that removes the length limit entirely by breaking up large fields. It does not change the method from GET to POST, but it fulfills the spirit of this feature by removing length limits on translated data.
The new code works as follows.
If a field is too big (> $maxCharacters after URL encoding), it looks for HTML paragraph tags and tries to split it into small enough chunks. If a paragraph is still too big it tries to split it on sentence boundaries (delimited by "." "!" or "?"). If a single sentence is still too big it will split it on white space boundaries. If a single word is too big it leaves it untranslated. It makes a reasonable attempt to put as much stuff in each call to google as possible, to minimize latency.
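The steps above can be sketched roughly as follows. This is an illustration in Python, not the code from the attached patch (which is PHP); `MAX_CHARS` stands in for `$maxCharacters`, and the helper names `encoded_len`, `split_keep`, `atoms` and `chunk` are hypothetical.

```python
import re
from urllib.parse import quote

MAX_CHARS = 1400  # Stand-in for $maxCharacters in the patch.

# Split points, tried in order: paragraph ends, sentence ends, white space.
SPLITTERS = [r"(?i)</p\s*>", r"[.!?]+\s*", r"\s+"]


def encoded_len(text):
    """Length of the text after URL encoding."""
    return len(quote(text, safe=""))


def split_keep(text, pattern):
    """Split after each match, keeping the delimiter with the piece before it."""
    pieces, start = [], 0
    for match in re.finditer(pattern, text):
        pieces.append(text[start:match.end()])
        start = match.end()
    if start < len(text):
        pieces.append(text[start:])
    return pieces


def atoms(text, limit, level=0):
    """Break text into pieces that fit, using ever finer split points."""
    if encoded_len(text) <= limit or level == len(SPLITTERS):
        # Either it fits, or it is a single oversized word that cannot
        # be split further; the caller would leave that untranslated.
        return [text]
    result = []
    for piece in split_keep(text, SPLITTERS[level]):
        result.extend(atoms(piece, limit, level + 1))
    return result


def chunk(text, limit=MAX_CHARS):
    """Pack the pieces greedily so each request carries as much as fits."""
    chunks, current = [], ""
    for atom in atoms(text, limit):
        if current and encoded_len(current + atom) > limit:
            chunks.append(current)
            current = atom
        else:
            current += atom
    if current:
        chunks.append(current)
    return chunks
```

Because every delimiter stays attached to the piece before it, joining the chunks back together reproduces the original text exactly, which makes the splitting easy to test.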
I've been using this on my sites for a couple of weeks now, and it appears to work well on nodes with large bodies.
I think this patch is still useful even if the module is modified to use POST, as it will remove the 5000 character limit. To change the limit for splitting large text, simply change $maxCharacters. It is set to 1400 right now because that seems to be close to the maximum size Google Translate will accept.
Hope this helps.
Comment #7
Berdir
Comment #8
Berdir
Thanks for working on this.
tmgmt_word_count() has a list of characters it considers as punctuation, not all of them apply here but maybe some do?
In general, splitting text up is quite tricky and will require a good amount of tests, including for non-English languages.
Additionally, it's something that many translators need, so we should try to extract this into common functions/methods so that e.g. the Microsoft translator can use it. That would then live in the core module.
Might also require some thinking to come up with a good API to reduce code duplication.
Comments should start with an uppercase character, end with a ".", be < 80 characters and consist of complete, actual English sentences.
Comments like this should be removed :)
Only class properties should use camel case; normal variables should be $q_length/$query_length (variable names are usually not shortened).
Drupal has wrappers for this: drupal_strlen(). That uses the mbstring extension when available.
The question is, how does Google count/treat multibyte characters?
Then do it away :)
I don't like Helper method prefixes, just state what it does, starting with a verb: "Splits long text into chunks and translates it."
Opening { should be on the same line as the method.
This is when it can't be split? Can't we just return the source text then?
Debug code should be removed.
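To make the multibyte question from the review above concrete, here is a small Python illustration (not Drupal code; `lengths` is a hypothetical helper). It shows that one string has three different "lengths" depending on whether you count characters, UTF-8 bytes, or URL-encoded characters. Which of these Google applies its limit to is not answered in this thread.

```python
from urllib.parse import quote


def lengths(text):
    """Return the three lengths a limit check could be based on."""
    return {
        "characters": len(text),
        "utf8_bytes": len(text.encode("utf-8")),
        "url_encoded": len(quote(text, safe="")),
    }


print(lengths("Grüße"))
# → {'characters': 5, 'utf8_bytes': 7, 'url_encoded': 15}
```

For a GET request the URL-encoded length is the one that ends up in the request URI, so text with many non-ASCII characters reaches a URI limit much sooner than its character count suggests.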
Comment #9
CarlHinton
This seems to be a pretty major issue with the translator. I would suggest breaking up nodes into short sentences, then posting them one by one. Breaking at a full-stop doesn't work well (I tried it), as the full-stop is used for many, many purposes - not least as a decimal point.
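One possible mitigation for the decimal-point problem, sketched in Python for illustration (this is not taken from any patch in this issue, and `split_sentences` is a hypothetical name): only treat ".", "!" or "?" as a sentence end when white space follows it.

```python
import re


def split_sentences(text):
    """Split only where sentence punctuation is followed by white space."""
    return re.split(r"(?<=[.!?])\s+", text.strip())


print(split_sentences("It costs 3.50 euros. Pay now!"))
# → ['It costs 3.50 euros.', 'Pay now!']
```

This keeps "3.50" intact, but it still breaks after abbreviations such as "e.g." and assumes languages that separate sentences with white space, so it is only a partial answer to the concern.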
Comment #10
Anybody
This problem is severe and stops the module from working in many cases. How can we proceed here? It's really terrible.
Comment #11
Anybody
I tested a lot and was NOT successful using the POST method, although I think there must be a way.
The patch from #6 works well, but also seems to run into limits in some cases. So my suggestion would be to clean it (#6) up as suggested and then commit it to the next dev release. I think many people are having problems with GET's limitations.
Comment #12
carsonw
The patch from #6 worked for me, and I agree with @Anybody's comments in #10 and #11.