[mainline] there is a need for a en_US translation

Celtic_Minstrel
Developer
Posts: 2158
Joined: August 3rd, 2012, 11:26 pm
Location: Canada

Post by Celtic_Minstrel »

Michal- wrote: April 25th, 2022, 10:04 pm
Pentarctagon wrote: April 24th, 2022, 4:52 pm I think that'd be very helpful.
vim with po.vim ftplugin for editing, git log -S to find out who, when and why, GitHub four-color line/word diff to see what exactly changed, DeepL to verify my translations or find the correct phrasing (try to insert Drogan's By all rights... - there are differences both in Czech and French), perfect Czech/English vocabulary, Czech language institute's precise pages for Czech grammar, Google search with quotes to dive into old English or to measure obscurity of my translation, our GitHub repo to collaborate using Issues, PRs and mainly Suggestions (for authorized corrections), msgmerge to push changes from wesnoth to our repo (yes, with --previous), msgcat to rewrap after vim or GH editing, msgfmt to check changes in the game, msgattrib to remove obsolete messages, vimdiff to quickly sync changes between branches, msync.sh, which uses three msg* utils to sync already translated messages between branches (in case of long-delayed POT update in one branch) and last but not least my bash script w.sh, which glues together mentioned utilities with our repo to quickly manage translations including packing for Nils and making stats (it's tighly tied with our repo, so nothing to share without prior rewriting)
I doubt that workflow is particularly common among translators… vim, seriously? And I doubt most translators actually use the gettext tools directly, either. Or bash scripts to automate tedious parts of the process.

I mean, there's nothing wrong with your workflow, but I doubt it represents the average translator.
Author of The Black Cross of Aleron campaign and Default++ era.
Former maintainer of Steelhive.
Antro
Translator
Posts: 70
Joined: February 11th, 2009, 3:53 pm

Post by Antro »

Sorry for delay, real life issues...

Actually, the active Italian translation team is quite small: a flock of 2, maybe 2 and a half...

Fuzzy: I agree with Michal, they are quite useful for translators. We also use the "fuzzy" state for campaign review: a translator who wants to review a campaign makes his/her changes, marks all of them as fuzzy and submits them to the maintainer (that's me...).

I spend most of my time on the stable branch: I don't have enough resources to check and follow the dynamics of the development branches. There is only one exception to this rule: a new campaign approaching.
In this case, I prefer to start translating the new campaign in the development branch in order to avoid rushing at string freeze... (yes, WoF is at the beginning, but I'm working on it...)

Translation environment:
- an old PC with MxLinux;
- git clone of wesnoth repository (usually stable branch 1.16, master branch if required);
- executable built from scratch, both with gcc and clang, in order to test the translations before submitting them;
- Bash script to sync my translation workbench two ways with the git branches and to run the appropriate msgmerge to update a partial translation;
- poedit for the main editing job and vim for low level editing;
- online dictionaries: google, merriam-webster, wikipedia, whatever helps...
- aspell and other scripts/unix commands for basic grammar and quality check

I would also like to share my personal path: I stumbled upon Wesnoth in 2009 and loved it. Unfortunately, at the time, my favorite campaign (Liberty) was almost untranslated in my native language (Italian). Since I have used free software since the last century, I simply installed poedit, finished the translation of the campaign and submitted it to the former maintainer, as described on the translation wiki page. For me, having English sentences on the screen was a motivation to collaborate towards a better Italian game.

A new en_US.po? I don't think it would really help or smooth the workflow for translators; I would prefer to have more comments and suggestions, as per my previous post.
Also, as a translator, I would have to at least look at the en_US.po in order to get an idea of the mood that a sentence should have.

As a player, is it better to play a campaign with an "inaccurate" translation but fewer "English messages", or one with a lot of English but up to date?
Honestly, I was pushed to the translation side of the force by the lack of completeness, so, if it is not complete... I would probably play the whole campaign in English.

As a matter of fact, as already stated, I think that a translation team should spend at least a few hours weekly on the project in order to keep the house clean, checking the changes daily.
Not easy to achieve, but I suspect that any other solution would be a mere palliative if the translation team isn't committed.

Finally, just a provocation: I can always select all the fuzzy strings of a file, mark them as "unfuzzy", save and submit... 2 seconds for a fully translated campaign... :twisted: :twisted: :twisted:
Celtic_Minstrel
Developer
Posts: 2158
Joined: August 3rd, 2012, 11:26 pm
Location: Canada

Post by Celtic_Minstrel »

Honestly, I wish there were different levels of fuzzy… then we could explicitly mark strings with only very minor (spelling) changes as "minor fuzzy", and the po loader would continue to use the old translation, unlike with regular fuzzies.
Author of The Black Cross of Aleron campaign and Default++ era.
Former maintainer of Steelhive.
egallager
Posts: 568
Joined: November 19th, 2020, 7:27 pm
Location: Concord, New Hampshire

Post by egallager »

Celtic_Minstrel wrote: April 30th, 2022, 4:23 pm Honestly, I wish there were different levels of fuzzy… then we could explicitly mark strings with only very minor (spelling) changes as "minor fuzzy", and the po loader would continue to use the old translation, unlike with regular fuzzies.
Is the po file format extensible at all? I'm wondering if that's something we could implement (say as comments?) just for Wesnoth, or if it'd have to be done in upstream gettext first?
Celtic_Minstrel
Developer
Posts: 2158
Joined: August 3rd, 2012, 11:26 pm
Location: Canada

Post by Celtic_Minstrel »

I'm not sure. In theory you could probably add custom tags, but there are a number of potential problems with that.
  • Is a conforming po parser required to preserve unrecognized tags?
  • Even if we have a custom tag, there's no way to show it to the translator without a custom po editor that understands the tag.
  • Probably more problems too…?
Author of The Black Cross of Aleron campaign and Default++ era.
Former maintainer of Steelhive.
CloudiDust
Translator
Posts: 34
Joined: May 1st, 2010, 10:25 am

Post by CloudiDust »

As the Simplified Chinese translation maintainer, IMHO, having a separate en_US translation is not the way to go.

"Flavored" and "unflavored" versions of the "same" message are ultimately different, as the "flavor" itself, or lack thereof, is part of the information conveyed. Translators are not supposed to add or remove "flavor" when translating; instead they must do their best to retain the original "feel".

It is the developers' responsibility to present the translators with a single authoritative set of untranslated messages, which most accurately matches the developers' intentions and can be however simple or flavored. If the translators could not faithfully retain the "feel", it's the translators' problem, not the developers'. The developers need not, or rather, should not, provide two similar but not identical sets of untranslated messages, no matter how similar the two sets are.

In a typical gettext workflow, this "single authoritative set of untranslated messages" is supposed to be presented in the form of msgids, and translators are not required to consult msgstrs in po files of other target languages. Currently, the Wesnoth translation workflow works this way.

But the "separate en_US translation" proposal doesn't, as it now puts the authoritative "flavored" messages only in en_US.po files as msgstrs, while leaving the unauthoritative "unflavored" en_US messages as msgids. The informal contract between developers and translators would be broken.

This would lead to much confusion and translations with guaranteed information distortion (as there will be translators who unsurprisingly ignore the en_US msgstrs and only translate the "unflavored" msgids).

Also, for developers, keeping the two versions of en_US messages in sync would be extra work.

No, I don't think "a separate en_US translation" is a good idea.

More comments in the po files about the developers' intentions would be great though.
CloudiDust, the Simplified Chinese translation maintainer.
Michal-
Posts: 5
Joined: January 18th, 2021, 10:16 pm
Location: Czechia

Post by Michal- »

demario wrote: April 28th, 2022, 11:20 pm Believing that given you reach that level of performance, it should be the standard for translation for wesnoth would be arrogant.
Please, do not equate the number of my tools with the performance of our team. We have 100% thanks to Septim, our non-technical translator, who simply translates and revises many times faster than me. He uses Poedit and submits his work via GitHub Pull Requests (which we both enjoy, because I see what he did, I can suggest corrections and Septim can accept or reject them).
demario wrote: April 28th, 2022, 11:20 pm The only question that matter: how will the project take care of the work after they left and how it will hold to the responsibility of maintaining the translations it was entrusted with.
The answer is: Life (or translation) finds a way. I started my first submissions when I found out that there were bugs in the Czech translation while the original English text was correct. That was my way.
Celtic_Minstrel wrote: April 29th, 2022, 12:55 pm vim, seriously?
Yes, in most cases, but for some (correction) tasks sed is more efficient.
Antro wrote: April 29th, 2022, 5:28 pm In this case, I prefer to start the translation of this new campaign in develop branch in order to avoid rushing at string freeze... (yes, WoF is at the beginning, but I'm working on it...)
Your whole experience is similar to mine, thanks for sharing. And thanks for asking about the WoF .po files. We started translating it today.
Antro wrote: April 29th, 2022, 5:28 pm As player, it is better to play a campaign with "inaccurate" translation but less "english messages" or one with a lot of english but updated?
As a player I vote for accuracy and English. But there are more severe bugs than fuzzy/untranslated strings: ones that disable both the English and the translated text altogether. I mean mistakes introduced by translators in the Help formatting that confuse the parser, in which case the Help immediately disappears. I know about them, we had these in the Czech translation too. Try to display Italian Struttura di gioco - Vittoria e sconfitta or French Déroulement du jeu. There are many of them. I can fix them for you, if you take me on in your team.
CloudiDust
Translator
Posts: 34
Joined: May 1st, 2010, 10:25 am

Post by CloudiDust »

nemaara wrote: April 9th, 2022, 5:58 pm Yes there's some small issues on the dev side with forgetting to update the WML file when necessary but on the flip side this would be really useful for me. I've actually hesitated to add more "flavor" to the text in some areas like Liberty because I was worried people would be completely unable to translate it. With this, I could go crazy with the en_US translation while leaving a more basic broadly understandable version in the WML file.
Generally, developers need not be too concerned about hard-to-translate messages.

As I said above, retaining the "feel" of untranslated messages in translations, is part of translators' job. Developers should express themselves freely, and please trust that translators could communicate such expressions accurately through translations, or failing that, use the best approximations within their abilities. Translators are expected to do so.

And developers are not expected to do what should be done by translators, that is, deciding whether untranslated messages have too-hard-to-translate "flavor" and what to do in such occasions.

Also, under the "separate en_US translation" proposal, the en_US translation actually would be an intentionally bad translation, as it would intentionally have extra "flavor" that wouldn't exist in the WML files.

Thus, the en_US translation would be considered either a confusing alternative set of untranslated messages, or an intentionally bad translation, and both are undesirable.
CloudiDust, the Simplified Chinese translation maintainer.
Elvish_Hunter
Posts: 1575
Joined: September 4th, 2009, 2:39 pm
Location: Lintanir Forest...

Post by Elvish_Hunter »

Michal- wrote: May 1st, 2022, 9:15 pm But there are more severe bugs than fuzzy/untranslated that disable both English and translated text altogether. I mean mistakes introduced by translators in the Help formating that confuse parser and in such case the Help immediately disappears. I know about them, we had these in Czech translation too. Try to display Italian Struttura di gioco - Vittoria e sconfitta or French Déroulement du jeu. There are many of them. I can fix it for you, if you take me on in your team.
Indeed it's true. I just tried to open Aiuto -> Struttura di gioco -> Vittoria e sconfitta and the help browser immediately closed. I also got this error message in the stderr:

error help: Errore di parsing durante l’analisi del testo della sezione Aiuto: Unterminated element: bolf
This <bolf> tag (which obviously should be <bold>) is at line 2430 of the wesnoth-help/it.po file.
If you're aware of more of these mistakes in the Italian translation, please let me know so I can fix them and send the corrected files to Antro (yes, I'm also a magenta name).
Current maintainer of these add-ons, all on 1.16:
The Sojournings of Grog, Children of Dragons, A Rough Life, Wesnoth Lua Pack, The White Troll (co-author)
Antro
Translator
Posts: 70
Joined: February 11th, 2009, 3:53 pm

Post by Antro »

Elvish_Hunter wrote: May 2nd, 2022, 3:35 pm
Michal- wrote: May 1st, 2022, 9:15 pm But there are more severe bugs than fuzzy/untranslated that disable both English and translated text altogether. I mean mistakes introduced by translators in the Help formating that confuse parser and in such case the Help immediately disappears. I know about them, we had these in Czech translation too. Try to display Italian Struttura di gioco - Vittoria e sconfitta or French Déroulement du jeu. There are many of them. I can fix it for you, if you take me on in your team.
Indeed it's true. I just tried to open Aiuto -> Struttura di gioco -> Vittoria e sconfitta and the help browser immediately closed. I also got this error message in the stderr:

error help: Errore di parsing durante l’analisi del testo della sezione Aiuto: Unterminated element: bolf
This <bolf> tag (which obviously should be <bold>) is at line 2430 of the wesnoth-help/it.po file.
If you're aware of more of these mistakes in the Italian translation, please let me know so I can fix them and send the corrected files to Antro (yes, I'm also a magenta name).
This sounds like something already fixed... but the evidence says that it is still alive and kicking... and that I must review/update my quality-check scripts

Thanks for the hint
Pentarctagon
Project Manager
Posts: 5496
Joined: March 22nd, 2009, 10:50 pm
Location: Earth (occasionally)

Post by Pentarctagon »

Antro wrote: May 2nd, 2022, 5:33 pm This sounds as something already fixed... but evidence says that is still alive and kicking... and that I must review/update my quality-check scripts

Thank for the hint
If you have a script that's able to find these types of issues, that'd be something we could potentially add to Wesnoth's CI and use to check all translations automatically, which would be useful.
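As a rough illustration of what such a check might look like, here is a minimal Python sketch. It assumes help markup always takes the form <name>...</name>, and it takes (msgid, msgstr) pairs that have already been read from a .po file by some other means. The function names are made up for this example and are not part of any existing Wesnoth tool.

```python
import re

# Assumption: help markup tags look like <name> and </name>.
TAG_RE = re.compile(r"<(/?)([A-Za-z_]+)>")


def unbalanced_tags(text):
    """Return a list of tag-nesting problems found in one string."""
    problems = []
    stack = []
    for match in TAG_RE.finditer(text):
        closing, name = match.group(1), match.group(2)
        if not closing:
            stack.append(name)
        elif stack and stack[-1] == name:
            stack.pop()
        else:
            problems.append(f"unexpected </{name}>")
    # Anything still open at the end was never closed.
    problems.extend(f"unterminated <{name}>" for name in stack)
    return problems


def check_entry(msgid, msgstr):
    """Check one translated entry against its untranslated original."""
    problems = unbalanced_tags(msgstr)
    known = {m.group(2) for m in TAG_RE.finditer(msgid)}
    used = {m.group(2) for m in TAG_RE.finditer(msgstr)}
    for name in sorted(used - known):
        problems.append(f"tag <{name}> not present in msgid")
    return problems


if __name__ == "__main__":
    for problem in check_entry("<bold>text='Victory'</bold>",
                               "<bolf>text='Vittoria'</bold>"):
        print(problem)
```

Comparing the tag names against the msgid catches misspelled tags even when they happen to be balanced; a real check would also need to handle any literal "<" characters that appear in the text.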
99 little bugs in the code, 99 little bugs
take one down, patch it around
-2,147,483,648 little bugs in the code
Michal-
Posts: 5
Joined: January 18th, 2021, 10:16 pm
Location: Czechia

Post by Michal- »

Elvish_Hunter wrote: May 2nd, 2022, 3:35 pm If you're aware of more of these mistakes in the Italian translation, please let me know so I can fix them and send the corrected files to Antro (yes, I'm also a magenta name).
Reclutare e richiamare unità, Terreni, Utilizzare le estensioni, Editor Mappe e Scenari, Grande Fiume.
Elvish_Hunter
Posts: 1575
Joined: September 4th, 2009, 2:39 pm
Location: Lintanir Forest...

Post by Elvish_Hunter »

Michal- wrote: May 2nd, 2022, 7:29 pm Reclutare e richiamare unità, Terreni, Utilizzare le estensioni, Editor Mappe e Scenari, Grande Fiume.
Thank you for the report! I fixed all of them.
Antro wrote: May 2nd, 2022, 5:33 pm This sounds as something already fixed... but evidence says that is still alive and kicking... and that I must review/update my quality-check scripts
Here's the fixed file. I'm sending you a PM with all the details.
Attachments
wesnoth-help-it.zip (102.3 KiB)
Current maintainer of these add-ons, all on 1.16:
The Sojournings of Grog, Children of Dragons, A Rough Life, Wesnoth Lua Pack, The White Troll (co-author)
CloudiDust
Translator
Posts: 34
Joined: May 1st, 2010, 10:25 am

Post by CloudiDust »

I think there are three problems that this thread tried/is trying to find a solution for:

1. Flavored messages might be too hard to translate.
2. Minor message updates create unnecessary work for translators working on the project.
3. Minor message updates invalidate perfectly/mostly working translations when there are no translators to update the translations.

In posts above I have discussed why Problem 1 should generally be a non-issue for developers.

About Problem 2, I, as a translator, am okay with the so-called "unnecessary work", as I consider fuzzy translations (no matter what causes the fuzziness) opportunities to recheck and, if necessary, improve previous translations. Even if I do decide to "only remove the fuzzy flag", that is my decision to make, and my responsibility to take. Translators generally (should) have a better grasp of the target languages than developers, so the former should be expected to be better at determining whether the "minor updates" are truly minor regarding their impact on translations.

So, if translators are to be involved anyway, then developers need not, or I would say, should not, prematurely fix translations for them.

If developers were to design some auto-fix tool, then the tool should preferably be run by translators (who get to decide whether auto-fixing is needed in each particular case), not by the developers themselves, unless…

… Problem 3, what if there are no translators to update the translations?

In this situation, auto-fixing run by developers, is (or conservatively speaking, might be) better than invalidating perfectly/mostly working translations. "Developers taking over translators' responsibilities" is/might be better than "doing nothing" then.

Also, I think auto-fixing run by developers, should only work at build time, and not modify the po files in the repository in any way. If those po files were auto-fixed, then (new or returning) translators would not be aware, and might fail to recheck the auto-fixed entries.

----------------

Addendum:

The auto-fix tool, if developed, shouldn't involve any "en_US translation". What the auto-fix tool needs might be mappings between old and new msgids, and those mappings shouldn't disguise themselves as en_US po files. Po files, if not used in a hackish way, are expected to be mappings from msgids to msgstrs.
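To make the idea of "mappings between old and new msgids" concrete, here is a minimal Python sketch of a build-time step. It assumes the catalog has already been loaded into a plain dict of msgid to msgstr, and that the developers maintain a separate dict of renames; both data structures and the function name are hypothetical. It returns a patched copy and never touches the po file itself.

```python
def apply_msgid_renames(catalog, renames):
    """Return a copy of `catalog` with translations carried over to renamed msgids.

    catalog: dict mapping msgid -> msgstr (empty string means untranslated)
    renames: dict mapping old msgid -> new msgid (hypothetical developer-kept list)
    """
    patched = dict(catalog)  # work on a copy; the input stays as loaded
    for old, new in renames.items():
        old_translation = catalog.get(old, "")
        # Only fill the gap; never overwrite a translation made for the new msgid.
        if old_translation and not patched.get(new):
            patched[new] = old_translation
    return patched


if __name__ == "__main__":
    catalog = {"Recieve gold": "Ricevi oro"}
    renames = {"Recieve gold": "Receive gold"}
    print(apply_msgid_renames(catalog, renames))
```

Because the result only exists in memory at build time, a returning translator would still see the fuzzy entries in the repository and could recheck them.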
CloudiDust, the Simplified Chinese translation maintainer.
octalot
General Code Maintainer
Posts: 777
Joined: July 17th, 2010, 7:40 pm
Location: Austria

Post by octalot »

The initial post in this thread was about Problem 2. There seems to be a clear "no" to Problem 1.

Summary:
  • Reducing the translation effort for languages that aren't keeping up is important, however providing the full changes for languages that are at 100% is also important.
  • The dev team have contradictory expectations of the translation teams. The workflows involving emails to Ivanovic assume translators are unable to use Git, yet we're also expecting them to copy the Vim and Git tricks of the Czech team.
  • We should forget about pofix entirely.
  • There are a lot of lessons to learn from this.
  • The current primary contact person for translators has only 1h per week available for the role.

Fuzzies and auto-correction tools

cs, it and zh_CN explicitly want to deal with the fuzzies themselves, and don't want tooling to hide fuzzies, even for changes that are just typo fixes in the source.

For en_GB, fuzzies are desirable because any grammar corrections in the source are almost certain to need to be copied to the translation too. However, for this language there's a possibility that the translated text could be fixed by tools making the same change to both msgid and msgstr.

So the first conclusion from this is that the workflow for those languages shouldn't try to hide fuzzies. Reviving the pofix tool has been discussed in this thread and on IRC/Discord, but contrary to developers' expectations, that tool isn't welcome for those languages.
demario wrote: April 28th, 2022, 11:20 pm But it doesn't solve anything for the translators. The same number of strings still need to be checked at the end. So the same total amount of work is needed to update, making it a problem for translations with low level of activity. If some translator doesn't show up and fix the mess before next release cycle starts, the translation is broken (with most the work being preventable with a en_US translation: grammar, capitalization, typos...).
For fr, de and probably other languages, the fuzzies created by grammar corrections in the source are unwelcome. Having a mechanism for avoiding duplicated effort would be welcome. That was the intention of my #6634 and #6638 - while I'm not going to merge those, for 1.16.3 I intend to use the de.po files and can easily make the fr.po files useable too (as explained in 6634's description, it needs a pot-update so don't take the files directly from 6648).

For future releases, I have some vague ideas about how to clear fuzzy flags from .po files in a sensible way (only for trivial changes, only for flags which were automatically added in a pot-update). I think we should run that automatically just before the release on files that haven't been updated during the translation window, covering what CloudiDust has helpfully enumerated as Problem 3.
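One possible definition of "trivial", sketched in Python: treat a change as trivial when the previous and current msgid are identical after ignoring case, whitespace and punctuation. This assumes the previous msgid is available (for example from the "#| msgid" comments that msgmerge --previous writes). The rule itself is an assumption made for illustration, not an agreed policy, and it does not detect whether a fuzzy flag was added automatically or by a translator.

```python
import re
import unicodedata


def normalise(msgid):
    """Reduce a msgid to lowercase words, dropping punctuation and extra spaces."""
    folded = unicodedata.normalize("NFKC", msgid).casefold()
    return " ".join(re.findall(r"\w+", folded))


def is_trivial_change(previous_msgid, current_msgid):
    """True when the two msgids differ only in case, spacing or punctuation."""
    return normalise(previous_msgid) == normalise(current_msgid)


if __name__ == "__main__":
    print(is_trivial_change("You have been defeated", "You have been defeated!"))
    print(is_trivial_change("Attack the orcs", "Attack the trolls"))
```

Spelling corrections in the source would not pass this test; covering them would need a looser and riskier rule, which is left out here.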

I'd also like a tool for sanity-checking translations against each other, detecting where some languages changed the translated text while others just unset the fuzzy flag.
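A minimal Python sketch of that comparison, assuming each language's catalog before and after the update is available as a dict of msgid to msgstr (the data layout and function name are hypothetical):

```python
def compare_languages(old_id, new_id, before, after):
    """Split languages into those that changed their text and those that kept it.

    before, after: dict mapping language code -> dict of msgid -> msgstr
    Returns (changed, kept) as two lists of language codes.
    """
    changed, kept = [], []
    for lang in sorted(after):
        old_text = before.get(lang, {}).get(old_id)
        new_text = after[lang].get(new_id)
        if not old_text or not new_text:
            continue  # untranslated before or after; nothing to compare
        (kept if old_text == new_text else changed).append(lang)
    return changed, kept


if __name__ == "__main__":
    before = {"it": {"Recieve": "Ricevi"}, "cs": {"Recieve": "Prijmout"}}
    after = {"it": {"Receive": "Ricevi"}, "cs": {"Receive": "Přijmout"}}
    print(compare_languages("Recieve", "Receive", before, after))
```

A msgid change where some languages land in `changed` and others in `kept` would be a candidate for a second look by the teams that only cleared the flag.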
CloudiDust wrote: May 3rd, 2022, 1:43 am About Problem 2, I, as a translator, am Okay with the so-called "unnecessary work", as I consider fuzzy translations (no matter what causes the fuzziness) opportunities to recheck and, if necessary, improve previous translations.
Does this need fuzzies? I'm not understanding why you couldn't decide to review the existing translations even without the fuzzies.


Expected technical levels of the translation teams

At least three teams (cs, it and de) are running gettext tools themselves.

cs and it are using Git repos to manage their translations, but are then mailing it in to Ivanovic. In the case of de, Bitron is committing it directly, but is questioning whether that's the right workflow.

The development team has a confused image of the translation teams. The workflows involve emailing .po files to Ivanovic, a workflow constructed to support translators who are non-technical; however in this thread there are suggestions that the teams that aren't keeping up should adopt the tools being used by the cs team.
Pentarctagon wrote: April 29th, 2022, 1:45 am If there are things we as developers can do to reduce the amount of work for translators, then we should investigate doing those things. And if there are additional tools that translators can use to reduce their own work, then they should use them - or ask other translators who are using them how to do so rather than just saying "it's too technical for me".
When the tool list contains Vim, multiple Git branches, and 3-way or 4-way diffs, then that's quite a learning curve. Sharing the knowledge is good, but making that the minimum bar is a step too far.


po: comments in Git commit messages

In addition to po: comments in source files, we could make po: be a standard text to include in Git commit messages, with the idea that these would be for hints that are useful to translators comparing the differences between two versions. If the hint will be useful in the future then it should still go in the WML files; an example that would go in the Git commit would be:

po: In SoF S09, many strings involving Krawg (a gryphon) assumed that Krawg was
the only flying unit. Similar texts have been added that are used when the player
has gryphon riders too; most of these will be marked as fuzzy versions of the
Krawg-only strings.
The plan is just to pipe the output of git log 1.16.2.. or git log --grep '^po:' 1.16.2..0 into a text editor, the same workflow as my "add missing stuff to the changelog" task. The idea being that it's something that's collated, edited, and sent to the i18n list when the pot-update is done.
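The collation step could be sketched in Python roughly as follows. It assumes the full commit messages have already been collected into a list of strings (however they are obtained from git log) and that a po: hint is a paragraph of the message starting with "po:". The function name is made up for this example.

```python
def po_notes(commit_messages):
    """Extract translator hints (paragraphs starting with 'po:') from commit messages."""
    notes = []
    for message in commit_messages:
        for paragraph in message.split("\n\n"):
            paragraph = paragraph.strip()
            if paragraph.startswith("po:"):
                # Drop the prefix and re-join the wrapped lines into one line.
                notes.append(" ".join(paragraph[3:].split()))
    return notes


if __name__ == "__main__":
    message = (
        "Fix SoF S09 dialogue\n\n"
        "po: In SoF S09, many strings\n"
        "assumed Krawg was the only flier.\n\n"
        "More details."
    )
    for note in po_notes([message]):
        print(note)
```

The output would still need the manual editing pass described above before being sent to the i18n list.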


Other thoughts
Discord wrote: All from discussions on April 29th:
14:27:07 [octalot] How often do translators send in .po files based on old .pot files?
14:46:49 [Ivanovic] they send in old files all the time unless we only update the potfiles once per year
14:47:02 [Ivanovic] they grab the files and work for weeks if not months on the files before sending them in
15:00:17 [Ivanovic] the point here is that they are trying to "complete" a translation for a textdomain and only afterwards check if something changed and fix those things up
15:00:19 [Ivanovic] if you look at some of the larger textdomains like utbs this will take weeks to translate or to redo
15:02:09 [Ivanovic] from many teams I hear nothing for like 4 months and then get a bunch of "here are all the files for the release, should be done now" messages
15:26:47 [octalot] @Ivanovic so what happens when, given the lack of fixes in the current version of pofix, you find that there's a load of fuzzies when merging the files?
15:27:15 [Ivanovic] then the translators have to download the files I uploaded with hundreds of fixes
15:27:37 [Ivanovic] they address them and some weeks later I get back the files, afterwards they hopefully have less fuzzy / new strings
15:27:41 [Ivanovic] rinse and repeat

15:53:59 [Ivanovic] 5) I do not have the time that I am willing to spend on hours of [discussions about pofix] seeing that my Wesnoth time is usually limited to about 1h on saturday to commit updates I received during the week (yeah, I am mostly inactive, just doing that part)
I think we need to give the translation teams better support than that - one of the things that I learnt here is that the primary contact for translators has very limited time available for the role.
Pentarctagon wrote: May 2nd, 2022, 6:39 pm
Elvish_Hunter wrote: May 2nd, 2022, 3:35 pm This <bolf> tag (which obviously should be <bold>).
If you have a script that's able to find these types of issues, that'd be something we could potentially add to Wesnoth's CI and use to check all translations automatically, which would be useful.
I logged a feature request for such a tool as #6035.

Antro wrote: April 29th, 2022, 5:28 pm As matter of fact, as already stated, I think that a translation team should spent on the project at least few hour weekly in order to keep the house clean, and checking daily the changes.
Getting WML authors to write more po: comments, and C++ authors to write more TRANSLATORS: comments, seems like something that we should promote in code reviews. Would you feel comfortable commenting directly on PRs with the "Prose" label (and apologies if you already do)?