Problem/Motivation
In #2813089: Remove newlines from sentences in HU txt files, the HU translation team is investigating a translation memory tool to aid the Hungarian translation and its ongoing maintenance. It sounds like a potentially very useful tool that could also benefit other translation teams once they've figured out how best to leverage it. The tool they are using, and from the sounds of it most translation memory tools, don't cope well with what are essentially random newlines in the middle of sentences.
This is a result of our current standard of wrapping text at 80 characters.
This will be a problem going forward for any team that chooses to use a translation memory.
Proposed resolution
Remove the requirement to wrap at 80 characters and update all English-language files to remove newlines in text. This way the content that is copied as the starting point for all translations is already free of newlines, and translation teams will not need to go through and remove the newlines from every file in their sub-directory.
Remaining tasks
- I would like to get Jennifer's thoughts on this before we proceed.
- Remove newlines from all files in the source/en directory
- Update the contributor guidelines
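As a rough illustration of the "remove newlines" task, here is a minimal Python sketch. It is an assumption about how the job could be done, not the patch the HU team prepared. The function name `unwrap_paragraphs` is hypothetical, and the logic is deliberately naive: it only joins lines inside blank-line-separated paragraphs and knows nothing about AsciiDoc lists, tables, or code blocks, which a real script would have to leave alone.

```python
import re


def unwrap_paragraphs(text):
    """Join hard-wrapped lines inside each blank-line-separated paragraph.

    Naive on purpose: every non-blank run of lines is treated as prose.
    AsciiDoc lists, tables, and listing blocks are NOT handled here.
    """
    # Split on one or more blank (or whitespace-only) lines.
    paragraphs = re.split(r"\n\s*\n", text.strip("\n"))
    # Replace the newlines inside each paragraph with single spaces.
    joined = [
        " ".join(line.strip() for line in para.splitlines())
        for para in paragraphs
    ]
    return "\n\n".join(joined) + "\n"


if __name__ == "__main__":
    sample = "This sentence was wrapped\nat a narrow width.\n\nSecond\nparagraph.\n"
    print(unwrap_paragraphs(sample), end="")
```

Each paragraph ends up on a single line, which is the shape a translation memory tool is said to prefer in the discussion above.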
Comments
Comment #2
jhodgdon
Hm.
Disadvantages of not word-wrapping:
a) If we change one word in one line, or fix a small punctuation error, the diffs are harder to see because the whole paragraph is one line.
b) It is harder for people using some text editors and/or command lines to read the source files. Wrapped text at 80 characters is usually easier to read.
c) Generally the whole project uses 80 character lines for text files.
However, that said, of course I do not want to get in the way of translation memories... I have no idea how many of the translation teams would want to use them? Also AsciiDoc does not care. The only thing it complicates, IMO, is the git/diff/command line process.
So... I could go either way.
Comment #3
eojthebrave
Regarding a): You can probably mostly get around this problem using `git diff --word-diff`, which highlights the words/sentences that have changed a bit better.
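To show what a word-level diff buys you on a one-line paragraph, here is a small Python sketch. It is not how git implements `--word-diff`. It only imitates the `[-removed-]` / `{+added+}` markers of git's plain word-diff output, using the standard `difflib` module, and the function name `word_diff` is hypothetical.

```python
import difflib


def word_diff(old, new):
    """Return a word-level diff using [-removed-] and {+added+} markers."""
    old_words, new_words = old.split(), new.split()
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    parts = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            parts.append(" ".join(old_words[i1:i2]))
            continue
        if i2 > i1:  # words that exist only in the old text
            parts.append("[-" + " ".join(old_words[i1:i2]) + "-]")
        if j2 > j1:  # words that exist only in the new text
            parts.append("{+" + " ".join(new_words[j1:j2]) + "+}")
    return " ".join(parts)


if __name__ == "__main__":
    print(word_diff("the quick brown fox", "the quick red fox"))
```

Only the changed words are marked, so a one-word fix stays visible even when the whole paragraph is a single long line.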
Regarding b): I don't disagree with this, but I also think it's not SUPER important. Almost every text editor or shell available these days can be configured to deal with lines longer than 80 characters. I also think that unless this is likely to be a problem for someone who's pretty actively involved with the user guide, we shouldn't worry about it too much. I do think it's still fairly important in the context of comments that are inline or in code files, mostly because a lot of people like to edit code with soft-wrapping turned off. I feel like that's less of an issue for people working on content/writing tasks, where tools like a good word processor are the standard for editing, not an IDE.
Regarding c): This is probably the biggest hangup for me, and in the other thread where this came up that was my initial reaction. However, after having learned a bit more about what a translation memory would help with, and having people who know way more about this than I do say it's a good idea to use one, I'm feeling like I can be swayed to break with tradition for this use case.
Comment #4
baluertl
Thanks Joe and Jennifer for bringing up the reasons for the 80-character limit; that's new to me.
I'd suggest doing this test only with the /source/hu directory and not touching the original English source yet. We could even set this issue's status to "Postponed" until the results of our TM tests are in.
Together with balagan we have already prepared the removal patches (listed under #2813089: Remove newlines from sentences in HU txt files), but I'll wait for your agreement before pushing them to the repository.
Comment #5
eojthebrave
That works for me. @Balu Ertl I think you should feel free to commit those changes to the HU section for testing.
So current status of this issue is:
Comment #6
baluertl
Thanks Joe. I committed the patches yesterday and they seem okay, but please take a look at them. I have therefore marked #2813089: Remove newlines from sentences in HU txt files as Fixed, and also posted a screenshot comparing before and after to show why this step is a big help.
As balagan mentioned earlier, we can still comply with the 80-character limit, but only after the source files have been fully translated. For example, we could agree on a routine in which maintainers hard-wrap the freshly translated source files before committing them to the repo. This could be a win-win compromise. What do you think?
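The "hard-wrap before committing" routine could look roughly like the Python sketch below. This is only an assumption about one possible workflow, not an agreed tool. The function name `hard_wrap` is hypothetical, and like any naive re-wrapper it treats every blank-line-separated block as prose, so AsciiDoc lists, tables, and code blocks would need special handling.

```python
import textwrap


def hard_wrap(text, width=80):
    """Re-wrap each blank-line-separated paragraph to at most `width` columns."""
    paragraphs = text.strip("\n").split("\n\n")
    wrapped = [
        textwrap.fill(
            para,
            width=width,
            break_long_words=False,  # never split a long word or URL
            break_on_hyphens=False,  # keep hyphenated terms on one line
        )
        for para in paragraphs
    ]
    return "\n\n".join(wrapped) + "\n"


if __name__ == "__main__":
    long_line = "word " * 40
    print(hard_wrap(long_line + "\n\nA short paragraph.\n"), end="")
```

Because `break_long_words` is off, a single word longer than the width (a long URL, say) is left on an over-long line of its own instead of being split.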
Comment #7
jhodgdon
Regarding #3, word diffs don't work well either if the lines are very long. At least in my terminal, I would have to make the window very very very wide to see the diffs, because they don't wrap; they just get truncated if the line is wider than the terminal window.
Anyway... Given all of that, I'm still OK with the idea of removing the line breaks... What do you both think we should do about this?
Comment #8
jhodgdon
On #2813089-17: Remove newlines from sentences in HU txt files, balagan reported that removing the newlines is not actually necessary to use a translation memory, so I think we should close this as "outdated".
Comment #9
eojthebrave
Nice find!