[mainline] there is a need for a en_US translation

CloudiDust
Translator
Posts: 34
Joined: May 1st, 2010, 10:25 am

Re: [mainline] there is a need for a en_US translation

Post by CloudiDust »

octalot wrote: May 3rd, 2022, 4:30 am cs, it and zh_CN explicitly want to deal with the fuzzies themselves, and don't want tooling to hide fuzzies, even for changes that are just typo fixes in the source.
Yes, I think zh_CN doesn't need the auto-fixing, and while I said above that "the auto-fix tool should preferably be run by translators", I realize now that translators can also choose to delegate this task to the developers, instead of running the tool themselves.

And I see that there are translators for other languages that much prefer such delegation.
octalot wrote: May 3rd, 2022, 4:30 am
CloudiDust wrote: May 3rd, 2022, 1:43 am About Problem 2, I, as a translator, am Okay with the so-called "unnecessary work", as I consider fuzzy translations (no matter what causes the fuzziness) opportunities to recheck and, if necessary, improve previous translations.
Does this need fuzzies? I'm not understanding why you couldn't decide to review the existing translations even without the fuzzies.
Fuzzies are not needed of course, but if there are fuzzies, then I would be more motivated to review.
octalot wrote: May 3rd, 2022, 4:30 am At least three teams (cs, it and de) are running gettext tools themselves.
zh_CN translations are hosted in my wesnoth-cn GitHub repository, which contains translations for Wesnoth the game, release notes, and Steam metadata. The game translation part is managed by a rough Python script that invokes the gettext tools (and we don't actually use the official po files, only merge the official pot templates with the existing po files in our repository). The translation updates are then packed and emailed to Ivanovic.

I would make sure that our po files are merged with the most recent pots before sending them out.
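A rough sketch of what such a merge script might look like. This is not the actual wesnoth-cn script; the directory layout (`<pot_dir>/<domain>.pot`, `<po_dir>/<domain>/<lang>.po`) and the function names are assumptions for illustration. Only `msgmerge --update` and `--backup=none` are real gettext options.

```python
import subprocess
from pathlib import Path


def msgmerge_command(po_file, pot_file):
    """Build the msgmerge call that updates po_file in place from pot_file."""
    return ["msgmerge", "--update", "--backup=none", str(po_file), str(pot_file)]


def plan_merges(pot_dir, po_dir, lang):
    """List one msgmerge command per template that already has a po file.

    Assumed (hypothetical) layout: pot_dir/<domain>.pot and
    po_dir/<domain>/<lang>.po.
    """
    commands = []
    for pot_file in sorted(Path(pot_dir).glob("*.pot")):
        po_file = Path(po_dir) / pot_file.stem / f"{lang}.po"
        if po_file.exists():
            commands.append(msgmerge_command(po_file, pot_file))
    return commands


def run_merges(commands):
    """Run the planned commands; requires gettext's msgmerge on PATH."""
    for command in commands:
        subprocess.run(command, check=True)
```

Planning and running are kept separate so the list of commands can be inspected before any po file is touched.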
CloudiDust, the Simplified Chinese translation maintainer.
Celtic_Minstrel
Developer
Posts: 2158
Joined: August 3rd, 2012, 11:26 pm
Location: Canada

Re: [mainline] there is a need for a en_US translation

Post by Celtic_Minstrel »

CloudiDust wrote: May 3rd, 2022, 6:39 am we don't actually use the official po files, only merge the official pot templates with the existing po files in our repository
This is standard for gettext as I understand – the "official po files" are basically whatever you sent us last.
Author of The Black Cross of Aleron campaign and Default++ era.
Former maintainer of Steelhive.
CloudiDust
Translator
Posts: 34
Joined: May 1st, 2010, 10:25 am

Re: [mainline] there is a need for a en_US translation

Post by CloudiDust »

Celtic_Minstrel wrote: May 3rd, 2022, 2:00 pm
CloudiDust wrote: May 3rd, 2022, 6:39 am we don't actually use the official po files, only merge the official pot templates with the existing po files in our repository
This is standard for gettext as I understand – the "official po files" are basically whatever you sent us last.
I think there are two ways for translators to manage translations if they use git:
  1. Use a translation branch (or branches) of a cloned official repository. (Then the official po files, in theory, could be used for "git style merging" into translators' po files, as git considers them different branches of the same files.)
  2. Use a separate translation repository. (The official po files would not be used, other than maybe used during the initialization of the translation repo.)
Also, "official po files" would get modified by developers to stay up-to-date, so they will diverge from "whatever we sent you last".
  1. I think the po files would be processed by Ivanovic before being committed into the official repo. We target 1.16 but our translation updates would also go into master, where the pot templates are not identical to 1.16, so the actual po files committed to master would also be different. (I suppose the files committed to 1.16 are technically also not the same files we send, but their contents would generally be identical.)
  2. pot-updates run across all official pot and po files. Updates introduced by such operations wouldn't be reflected automatically in our po files. I will invoke msgmerge to update our po files then.
One of the reasons our repo didn't start as a clone of the official one is that our workflow was established back when Wesnoth was still hosted with SVN.
CloudiDust, the Simplified Chinese translation maintainer.
demario
Posts: 130
Joined: July 3rd, 2019, 1:05 pm

Re: [mainline] there is a need for a en_US translation

Post by demario »

octalot wrote: May 3rd, 2022, 4:30 am For fr, de and probably other languages, the fuzzies created by grammar corrections in the source are unwelcome. Having a mechanism for avoiding duplicated effort would be welcome.
Thank you octalot for your hard work on this.
I fully agree that new fuzzies caused by grammar corrections are unwelcome, but unfortunately I think an automatic clearing of these fuzzies is too scary to be run by someone who doesn't understand the target language. The best solution is that they are not created in the first place.

Generally speaking, if a writer can't reach the conclusion that his change to the original string doesn't change the meaning, no computer solution could reliably reach the conclusion that the corresponding translation doesn't need an update.
Pentarctagon wrote: April 29th, 2022, 1:45 am Also it's worth pointing out that there aren't any languages that Wesnoth targets for keeping up to date.
Not sure how much this statement is worth. I guess English doesn't count as "any language" in there. :roll:
As long as this stands as a project policy, we can all stop pretending that anyone cares enough to make any change.
...and no additional communication is needed really. :annoyed:
Pentarctagon
Project Manager
Posts: 5496
Joined: March 22nd, 2009, 10:50 pm
Location: Earth (occasionally)

Re: [mainline] there is a need for a en_US translation

Post by Pentarctagon »

demario wrote: May 6th, 2022, 9:39 am
Pentarctagon wrote: April 29th, 2022, 1:45 am Also it's worth pointing out that there aren't any languages that Wesnoth targets for keeping up to date.
Not sure how much this statement is worth. I guess English doesn't count as "any language" in there. :roll:
As long as this stands as a project policy, we can all stop pretending that anyone cares enough to make any change.
...and no additional communication is needed really. :annoyed:
language -> translation :roll:

And this has always been the effective policy - as far as I know, there has never been a release delayed because of translation incompleteness. Sometimes longer string freezes are done, but never an otherwise planned and ready release delayed.

And at the risk of repeating myself, doing so would require more active communication (or at least being easier to contact, such as being on IRC/Discord), than is generally the case currently. Actively trying to keep any translation at 100% would necessarily need more coordination than simply having translators send updates to ivanovic whenever they happen to be finished, for starters.

If all I do is declare that we're not releasing any update where some list of translations I arbitrarily chose aren't 100% complete and nothing changes beyond that, it will not work.
99 little bugs in the code, 99 little bugs
take one down, patch it around
-2,147,483,648 little bugs in the code
Jarom
Posts: 110
Joined: January 4th, 2015, 8:23 pm
Location: Green Isle, Irdya or Poland, Earth - I'm not quite sure

Re: [mainline] there is a need for a en_US translation

Post by Jarom »

Just a small update: as a person who advocated using some sort of tool to automatically unfuzzy changes with minor grammar fixes, I want to note that adding a comment as #po: would suffice for me. The tedious scanning of every single character in poedit, or looking up those phrases one by one in diff tools, is the problem, not having fuzzies in itself. In other words, the long search for "why did this five-sentence message become fuzzy and should I be concerned about it?" is what should be fixed, not the fuzziness itself. If that is addressed, even one translator could quickly fix most fuzzies in about 1-2 hours once per major version, e.g. during string freeze.
Celtic_Minstrel
Developer
Posts: 2158
Joined: August 3rd, 2012, 11:26 pm
Location: Canada

Re: [mainline] there is a need for a en_US translation

Post by Celtic_Minstrel »

Sounds like the "po comments on commits" idea may be able to satisfy that requirement? And then distributing a separate translator's changelog. Maybe offering a quick and easy way to merge said changelog into the po file itself so the changes show up in poedit or whatever.
Author of The Black Cross of Aleron campaign and Default++ era.
Former maintainer of Steelhive.
Pentarctagon
Project Manager
Posts: 5496
Joined: March 22nd, 2009, 10:50 pm
Location: Earth (occasionally)

Re: [mainline] there is a need for a en_US translation

Post by Pentarctagon »

So for the original idea of having an en_US translation, the majority of translators who have replied have been against doing this, so at this point I don't think it's something we'll end up doing. In a similar way, some (most?) of the translators who have spoken up are also opposed to using pofix to mass-update translations for minor changes, so it seems like that tool can effectively be considered to be retired.

There have been some additional ideas brought up during this thread as well though that could be considered and potentially implemented:
  • Adding po comments to commit messages when making changes that aren't consequential for the string's meaning so that translators don't need to spend a bunch of time scrutinizing a string trying to find what minor thing was changed. This would be generated via a new script to create a separate "translators changelog" per release. This would also accomplish much the same as pofix does, with the exception of letting translators choose whether they want to just clear the fuzzy status or take a closer look.
  • Be stricter about changing strings in the stable series. This would mean a permanent string freeze on everything except UI related strings.
  • Communicate to the translators via the mailing list which campaigns are likely to have major overhauls during the current developer cycle.
  • Use Weblate. This would be required to be self-hosted, since based on their pricing criteria for their cloud offering we'd fall into the Enterprise tier (there are more than 10,000 source strings), which is over $3,000/year. This could be considered, but there would need to be at least a few translation teams that would definitely use it - I don't want us to go to the trouble of getting it set up only to have it sit there unused.
  • Have certain languages that are targeted to be 100% translated before a release is tagged. This would also require the translators for those languages be more actively engaged with the rest of the team - releases can't be stalled indefinitely waiting for a translation that has no estimated completion date and no good way to contact its translators.
99 little bugs in the code, 99 little bugs
take one down, patch it around
-2,147,483,648 little bugs in the code
Celtic_Minstrel
Developer
Posts: 2158
Joined: August 3rd, 2012, 11:26 pm
Location: Canada

Re: [mainline] there is a need for a en_US translation

Post by Celtic_Minstrel »

Pentarctagon wrote: May 15th, 2022, 1:59 am I'm not clear on why it would be beneficial to put po comments in commit messages rather than the cfg/lua/cpp file though.
Because they're comments on how the string has changed, so they're useless clutter to someone starting the translation from scratch. These po comments would not replace the translator's changelog idea – said changelog would be automatically generated by scraping commits for po comments.
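A minimal sketch of how such scraping might work, in Python. The `#po:` line prefix inside commit messages is an assumed convention (nothing in this thread fixes the exact syntax), and the function names are hypothetical. `read_commits` relies only on standard `git log` format placeholders (`%H`, `%B`, `%x00`, `%x01`).

```python
import subprocess

PO_PREFIX = "#po:"  # assumed marker for translator notes in commit messages


def extract_po_comments(message):
    """Return the translator notes found in one commit message."""
    notes = []
    for line in message.splitlines():
        stripped = line.strip()
        if stripped.startswith(PO_PREFIX):
            notes.append(stripped[len(PO_PREFIX):].strip())
    return notes


def build_changelog(commits):
    """Turn (hash, message) pairs into one changelog line per note."""
    lines = []
    for commit_hash, message in commits:
        for note in extract_po_comments(message):
            lines.append(f"{commit_hash[:10]}: {note}")
    return "\n".join(lines)


def read_commits(rev_range):
    """Read (hash, message) pairs from git for a range such as 'A..B'."""
    output = subprocess.run(
        ["git", "log", "--format=%H%x00%B%x01", rev_range],
        capture_output=True, text=True, check=True,
    ).stdout
    commits = []
    for record in output.split("\x01"):
        record = record.strip()
        if record:
            commit_hash, _, message = record.partition("\x00")
            commits.append((commit_hash, message))
    return commits
```

Commits without a note contribute nothing, so the resulting changelog only lists changes whose author explicitly described them for translators.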
Author of The Black Cross of Aleron campaign and Default++ era.
Former maintainer of Steelhive.
oooo
Posts: 19
Joined: May 14th, 2014, 12:58 pm
Location: Japan

Re: [mainline] there is a need for a en_US translation

Post by oooo »

I heard weblate.org hosted FOSS projects for free. (https://github.com/naev/naev/issues/158 ... -748379631)
I use weblate.org as a translator in Naev which has more than 10,000 source strings.
Pentarctagon
Project Manager
Posts: 5496
Joined: March 22nd, 2009, 10:50 pm
Location: Earth (occasionally)

Re: [mainline] there is a need for a en_US translation

Post by Pentarctagon »

Celtic_Minstrel wrote: May 15th, 2022, 11:34 am
Pentarctagon wrote: May 15th, 2022, 1:59 am I'm not clear on why it would be beneficial to put po comments in commit messages rather than the cfg/lua/cpp file though.
Because they're comments on how the string has changed, so they're useless clutter to someone starting the translation from scratch. These po comments would not replace the translator's changelog idea – said changelog would be automatically generated by scraping commits for po comments.
Alright, I've updated my previous post to reflect that.
oooo wrote: May 15th, 2022, 12:15 pm I heard weblate.org hosted FOSS projects for free. (https://github.com/naev/naev/issues/158 ... -748379631)
I use weblate.org as a translator in Naev which has more than 10,000 source strings.
Huh, alright. It sounds like that's something we'd need to contact Weblate about and see if they'd allow Wesnoth's translations to be hosted there for free?
99 little bugs in the code, 99 little bugs
take one down, patch it around
-2,147,483,648 little bugs in the code
octalot
General Code Maintainer
Posts: 777
Joined: July 17th, 2010, 7:40 pm
Location: Austria

Re: [mainline] there is a need for a en_US translation

Post by octalot »

What are the benefits of using Weblate, please? A couple of people have mentioned it, but I'm not sure if they're recommending it for its benefits, or just discussing it because it's already been mentioned.
IceSandslash
Developer
Posts: 17
Joined: February 12th, 2023, 1:13 pm

Re: [mainline] there is a need for a en_US translation

Post by IceSandslash »

Pentarctagon wrote: May 15th, 2022, 1:59 am So for the original idea of having an en_US translation, the majority of translators who have replied have been against doing this, so at this point I don't think it's something we'll end up doing. In a similar way, some (most?) of the translators who have spoken up are also opposed to using pofix to mass-update translations for minor changes, so it seems like that tool can effectively be considered to be retired.
There is a sort of bias here that the people we are more likely to talk to are the most active, and therefore the highest achieving translators. Since their translations are already mostly working in quantity, their main focus should be quality. For them, unfuzzying strings wouldn't help in their high-quality effort.

However, the proposal in this thread would help the middle-to-low end of the spectrum of translation groups by alleviating their workload. Honestly, this should be the focus. Helping them in quantity should be very impactful. Still, there should be ways to help translators focus on the smallest details if they so wish.

For that purpose, I intend to add four tools.
1. A tool that pulls the en_US versions of every msgid into the .po file of another language. But they would be pulled as comments, in the form

#.updated(en_US):"Paragraph that may be edited every other week."
Tentatively, the command (also available from a GUI), would be:

wmlxgettext --pull-variants en_US en_US.po es.po
2. A tool that removes the comments added by (1) (i.e. #.updated(en_US) etc). Tentatively, the command (also available from a GUI), would be:

wmlxgettext --prune-variants en_US es.po
3. A tool that adds comments on behalf of the translator, signaling that all the comments added by (1) (i.e. #.updated(en_US)) have already been reviewed and the translations adjusted accordingly (or that the translator/translator group doesn't care about such minor changes). Tentatively, the command (also available from a GUI), would be:

wmlxgettext --set-variants-done en_US es.po
Note. Such a comment may also be added by the translators in POEdit or their favorite editors. They would have the form:

# updated: "Paragraph that may be edited every other week."
4. A tool that checks for any mismatches between the #.updated(en_US): TEXT automatic comments, and the # updated: TEXT translator-authored comments. Every string where there is a mismatch is then marked as fuzzy, and translators may then use their favorite editors to review the latest updates. Tentatively, the command (also available from a GUI), would be:

wmlxgettext --set-variants-fuzzy en_US es.po
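The mismatch check in (4) could be sketched roughly as follows in Python. This is only an illustration of the idea, not wmlxgettext code: it assumes entries are separated by blank lines, that each comment holds its text on a single line, and that the two comment forms look exactly like the examples above. All function names are hypothetical.

```python
import re

# Comment forms taken from the examples above; the exact syntax is tentative.
AUTO_RE = re.compile(r'^#\.\s*updated\(en_US\):\s*(".*")\s*$')
DONE_RE = re.compile(r'^#\s+updated:\s*(".*")\s*$')


def needs_review(entry_lines):
    """True if an automatic comment exists and the translator's copy differs."""
    auto = [m.group(1) for line in entry_lines if (m := AUTO_RE.match(line))]
    done = [m.group(1) for line in entry_lines if (m := DONE_RE.match(line))]
    return bool(auto) and auto != done


def mark_fuzzy(entry_lines):
    """Return the entry with the gettext 'fuzzy' flag set."""
    for i, line in enumerate(entry_lines):
        if line.startswith("#,"):  # an existing flag line: extend it
            flags = [flag.strip() for flag in line[2:].split(",") if flag.strip()]
            if "fuzzy" not in flags:
                flags.insert(0, "fuzzy")
            return entry_lines[:i] + ["#, " + ", ".join(flags)] + entry_lines[i + 1:]
    for i, line in enumerate(entry_lines):
        if line.startswith(("msgctxt", "msgid")):  # no flag line yet: add one
            return entry_lines[:i] + ["#, fuzzy"] + entry_lines[i:]
    return entry_lines


def set_variants_fuzzy(po_text):
    """Mark every entry whose two 'updated' comments disagree as fuzzy."""
    result = []
    for entry in po_text.strip("\n").split("\n\n"):
        lines = entry.split("\n")
        if needs_review(lines):
            lines = mark_fuzzy(lines)
        result.append("\n".join(lines))
    return "\n\n".join(result) + "\n"
```

An entry with an automatic comment but no translator comment also counts as a mismatch here, so strings that were never reviewed get flagged too; whether that is the desired behaviour is a design choice.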
This implies that there would be two main steps in translation workflows:
1. Update the new or updated msgids as per the simplified English/source version.
2A. Ignore the updates in the fully localized English version.
2B. Don't ignore the updates. Instead, pull them, mark strings as fuzzy, and review them.
Pentarctagon
Project Manager
Posts: 5496
Joined: March 22nd, 2009, 10:50 pm
Location: Earth (occasionally)

Re: [mainline] there is a need for a en_US translation

Post by Pentarctagon »

Some initial questions that come to mind, mostly regarding the workflow for using these tools:
  1. Would this mean anything for the po update I run for each release? Specifically that being scons pot-update update-po4a manual.
  2. Is this something that each individual translator/translation team would need to run themselves?
  3. What, specifically, is meant by en_US here?
99 little bugs in the code, 99 little bugs
take one down, patch it around
-2,147,483,648 little bugs in the code
IceSandslash
Developer
Posts: 17
Joined: February 12th, 2023, 1:13 pm

Re: [mainline] there is a need for a en_US translation

Post by IceSandslash »

3. What, specifically, is meant by en_US here?
At the moment of its creation, for each campaign/domain, it would be a file of the form:

msgid "Same string in both lines."
msgstr "Same string in both lines."
After that, this file is to contain the canonical version of Wesnoth texts, including minor orthography, punctuation, and grammar fixes, but also potentially uncommon archaisms, and fad English-specific phrases/memes, etc., all according to the relevant campaign author's opinion (the SP lead would have veto rights regarding this judgment call. I suppose that's not new.)
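For illustration, generating that initial identity file from a list of source strings might look like this in Python. It is a sketch under assumptions: the helper names are made up, the strings are assumed to be already extracted, and header entries, contexts and plural forms are ignored.

```python
def po_escape(text):
    """Escape a string for use inside a double-quoted PO string."""
    return (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\t", "\\t")
    )


def identity_entry(msgid):
    """One entry whose msgstr simply repeats its msgid."""
    quoted = po_escape(msgid)
    return f'msgid "{quoted}"\nmsgstr "{quoted}"\n'


def identity_po(msgids):
    """All entries, separated by blank lines as in a normal po file."""
    return "\n".join(identity_entry(msgid) for msgid in msgids)
```

From that starting point, only the msgstr lines would be edited when a minor English fix is made, leaving the msgid (and therefore every other translation) untouched.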

This follows the fact that Wesnoth mainly follows US English, as opposed to British, Australian, or other dialects. (I have seen a few instances of British spelling, though, which will have to be fixed.) Someone might come up with the idea of adding support for other regional variants of English, but that would be a separate proposal.

Major changes regarding intention, personality, or whatever else the campaign author deems important would have to go into the campaign source, and therefore modify the `msgid`, with impact across all languages.

These judgment calls regarding importance affect the string freezes. Some minor English text fixes may be added, to the .po file only, during string freeze, again according to the judgment of the author. Translators may follow the judgment calls through process 2A, or override them through process 2B.
1. Would this mean anything for the po update I run for each release? Specifically that being scons pot-update update-po4a manual.
--pull-variants may be run either in the Wesnoth-wide build process, or by translators that intend to follow 2B.
If it's done in the build process, it probably should be done only for a few languages which are interested in it. Doing it in the build process would mean that translators don't need to look for the most up-to-date en_US.po file themselves.

On the other hand, translators that are the most interested in quality may want to re-run this all along the string freeze process. It's possible this may introduce the need for two string freezes: one for the source, and another one for both source and en_US.po, but that requires first-hand experience.
2. Is this something that each individual translator/translation team would need to run themselves?
--set-variants-fuzzy and --set-variants-done are to be run by translators that intend to follow 2B.
--prune-variants would always be run either by each translator or by ivanovic. This would remove the #.updated(en_US) comments, but NOT the translator-authored # updated comments.

Translator teams not interested in closely matching the en_US variants don't need to run anything.
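To make the pruning rule concrete, here is a small Python sketch of it, assuming the comment syntax from the earlier examples (the function name and the tolerance for a space after `#.` are my own assumptions). It drops the automatic comments and keeps the translator-authored ones.

```python
import re


def prune_variants(po_text, variant="en_US"):
    """Drop '#.updated(<variant>)' lines; keep '# updated:' lines as they are."""
    auto_comment = re.compile(r"^#\.\s*updated\(" + re.escape(variant) + r"\)")
    kept = [line for line in po_text.splitlines() if not auto_comment.match(line)]
    return "\n".join(kept) + "\n"
```

Because the translator comments survive, a later --pull-variants followed by --set-variants-fuzzy would still be able to tell which strings have already been reviewed.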